ChatTTS on Ubuntu, Study Notes 6: Fixing the Speaker Timbre

ModelScope community

The site offers speaker timbres indexed by random seed; each one can be auditioned and downloaded.

spk = torch.load(<PT-FILE-PATH>)
params_infer_code = {'spk_emb': spk,
}
(rest omitted)

Test procedure:

1. Create a folder, download two speaker files from the site above, and put them in the folder for testing.
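
To see which downloaded speaker files are actually in the folder, a small helper along these lines may be handy. This is only a sketch and not part of ChatTTS: the name `list_speaker_files` is hypothetical, and the `seed_<number>.pt` naming is assumed from the file used in the test script (`speaker/seed_2279.pt`). The demo uses empty stand-in files in a temporary folder, and `seed_11.pt` is a made-up name for illustration.

```python
import re
import tempfile
from pathlib import Path

# Assumed naming scheme: seed_<number>.pt, as in "speaker/seed_2279.pt".
SEED_FILE = re.compile(r"seed_(\d+)\.pt")


def list_speaker_files(folder):
    """Return {seed: path} for every seed_<n>.pt file directly inside folder."""
    found = {}
    for path in sorted(Path(folder).glob("*.pt")):
        match = SEED_FILE.fullmatch(path.name)
        if match:
            found[int(match.group(1))] = path
    return found


# Demo with empty stand-in files; real files would come from the download step.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("seed_2279.pt", "seed_11.pt", "notes.txt"):
        (Path(tmp) / name).touch()
    print(sorted(list_speaker_files(tmp)))  # [11, 2279]
```

Calling `list_speaker_files("speaker")` on the real folder would then give a seed-to-path mapping to pick from.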

2. Test code

import ChatTTS
import torch
import torchaudio

chat = ChatTTS.Chat()
chat.load(compile=False)  # Set to True for better performance

###################################
# Sample a speaker from Gaussian.
# rand_spk = chat.sample_random_speaker()
# print(rand_spk)  # save it for later timbre recovery
guding_spk = torch.load("speaker/seed_2279.pt")  # "guding" is pinyin for "fixed"

params_infer_code = ChatTTS.Chat.InferCodeParams(
    spk_emb=guding_spk,  # use the fixed speaker loaded above
    temperature=0.3,     # using custom temperature
    top_P=0.7,           # top P decode
    top_K=20,            # top K decode
)

###################################
# For sentence level manual control.
# use oral_(0-9), laugh_(0-2), break_(0-7)
# to generate special token in text to synthesize.
params_refine_text = ChatTTS.Chat.RefineTextParams(
    prompt='[oral_2][laugh_0][break_6]',
)

# wavs = chat.infer(
#     texts,
#     params_refine_text=params_refine_text,
#     params_infer_code=params_infer_code,
# )

###################################
# For word level manual control.
text = '故事中國網是《故事會》雜志社主辦的網站。網站內容豐富，與雜志期刊《故事會》同步更新'
# text = 'What is [uv_break]your favorite english food?[laugh][lbreak]'
# skip_refine_text=True bypasses the refine step, so params_refine_text is not applied here.
wavs = chat.infer(
    text,
    skip_refine_text=True,
    params_refine_text=params_refine_text,
    params_infer_code=params_infer_code,
)

# Depending on the torchaudio version, either the first call or the second one works.
try:
    torchaudio.save("word_level_output.wav", torch.from_numpy(wavs[0]).unsqueeze(0), 24000)
except Exception:
    torchaudio.save("word_level_output.wav", torch.from_numpy(wavs[0]), 24000)

3. Console output:

python guding.py
/home/duyicheng/gitee/ChatTTS/guding.py:12: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  guding_spk=torch.load("speaker/seed_2279.pt")
found invalid characters: {'《', '》'}
code:   0%|          | 1/2048(max) [00:00,  2.80it/s]We detected that you are passing `past_key_values` as a tuple of tuples. This is deprecated and will be removed in v4.47. Please convert your cache or use an appropriate `Cache` class (https://huggingface.co/docs/transformers/kv_cache#legacy-cache-format)
code:  16%|█████████████████████████████▏          | 334/2048(max) [00:10, 31.73it/s]
(chattts) duyicheng@duyicheng-computer:~/gitee/ChatTTS$

4. Fixes

Based on the information you provided, a few issues deserve attention:

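1. **FutureWarning about `torch.load`**:
   - The warning appears because `torch.load` currently defaults to `weights_only=False` and unpickles the file with the `pickle` module, which can execute arbitrary code from a malicious file. A future PyTorch release will switch the default to `weights_only=True`, which restricts loading to tensors and other allowlisted types instead of arbitrary objects.
   - Set `weights_only=True` explicitly to avoid the potential security risk.
   - Change the code as follows:
     ```python
     guding_spk = torch.load("speaker/seed_2279.pt", weights_only=True)
     ```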

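2. **Invalid-character warning**:
   - The message reports the invalid characters `{'《', '》'}` in the text. These book-title marks are probably not part of the character set the model accepts; the rest of the Chinese text is not flagged.
   - Remove or replace these characters before synthesis.
   - Change the text as follows:
     ```python
     text = '故事中國網是《故事會》雜志社主辦的網站。網站內容豐富，與雜志期刊《故事會》同步更新'.replace('《', '').replace('》', '')
     ```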

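3. **Deprecation warning about `past_key_values`**:
   - This warning comes from the Hugging Face Transformers library: passing `past_key_values` as a tuple of tuples is deprecated and will be removed in v4.47.
   - The test script never passes `past_key_values` itself, so the call most likely originates inside ChatTTS. If so, the fix belongs in the library rather than in this script.
   - For the new cache format, see the Hugging Face documentation: [Legacy Cache Format](https://huggingface.co/docs/transformers/kv_cache#legacy-cache-format)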

Summary:
- Call `torch.load` with `weights_only=True`.
- Remove or replace the invalid characters in the text.
- Check how `past_key_values` is passed and move to the cache format the library now expects.
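
The character removal from item 2 can be wrapped in a small reusable function, so that other flagged characters can be dropped the same way. This is a minimal sketch: `strip_chars` and `INVALID` are hypothetical names, and the set holds only the two characters reported in the log above.

```python
def strip_chars(text, unwanted):
    """Return text with every character in `unwanted` removed."""
    table = str.maketrans("", "", "".join(unwanted))
    return text.translate(table)


# The two characters reported by "found invalid characters" in the log.
INVALID = {"《", "》"}

sample = "雜志期刊《故事會》同步更新"
print(strip_chars(sample, INVALID))  # 雜志期刊故事會同步更新
```

If a later run reports more invalid characters, adding them to the set is enough; the text passed to `chat.infer` would then be `strip_chars(text, INVALID)`.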

5. Revised code

import ChatTTS
import torch
import torchaudio

chat = ChatTTS.Chat()
chat.load(compile=False)  # Set to True for better performance

###################################
# Sample a speaker from Gaussian.
# rand_spk = chat.sample_random_speaker()
# print(rand_spk)  # save it for later timbre recovery
guding_spk = torch.load("speaker/seed_2279.pt", weights_only=True)

params_infer_code = ChatTTS.Chat.InferCodeParams(
    spk_emb=guding_spk,  # use the fixed speaker loaded above
    temperature=0.3,     # using custom temperature
    top_P=0.7,           # top P decode
    top_K=20,            # top K decode
)

###################################
# For sentence level manual control.
# use oral_(0-9), laugh_(0-2), break_(0-7)
# to generate special token in text to synthesize.
params_refine_text = ChatTTS.Chat.RefineTextParams(
    prompt='[oral_2][laugh_0][break_6]',
)

# wavs = chat.infer(
#     texts,
#     params_refine_text=params_refine_text,
#     params_infer_code=params_infer_code,
# )

###################################
# For word level manual control.
# The sample input is kept in Chinese because it is the text being synthesized.
text = """
根據你提供的信息，有幾個問題需要注意：
FutureWarning 關于 torch.load:
這個警告是因為你在使用 torch.load 加載模型權重時，默認情況下使用了 pickle 模塊。未來版本的 PyTorch 將默認設置 weights_only=True，這意味著只有權重會被加載，而不是整個對象。
建議你顯式地設置 weights_only=True 來避免潛在的安全風險。
修改代碼如下："""
# text = 'What is [uv_break]your favorite english food?[laugh][lbreak]'
# skip_refine_text=True bypasses the refine step, so params_refine_text is not applied here.
wavs = chat.infer(
    text,
    skip_refine_text=True,
    params_refine_text=params_refine_text,
    params_infer_code=params_infer_code,
)

# Depending on the torchaudio version, either the first call or the second one works.
try:
    torchaudio.save("word_level_output.wav", torch.from_numpy(wavs[0]).unsqueeze(0), 24000)
except Exception:
    torchaudio.save("word_level_output.wav", torch.from_numpy(wavs[0]), 24000)

6. Saving a randomly sampled speaker so it can be reused in your own tests


rand_spk = chat.sample_random_speaker()
torch.save(rand_spk, "speaker/XXXX.pt")
# print(rand_spk) # save it for later timbre recovery
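
`torch.save` does not create missing folders, and saving under an existing name silently replaces the earlier speaker. A small path helper can guard against both. This is a sketch under assumptions: `new_speaker_path` and the name `my_voice` are hypothetical, and the default folder `speaker` is taken from the paths used in this post.

```python
import tempfile
from pathlib import Path


def new_speaker_path(name, folder="speaker"):
    """Create folder if needed and return folder/<name>.pt, refusing to overwrite."""
    target_dir = Path(folder)
    target_dir.mkdir(parents=True, exist_ok=True)
    path = target_dir / f"{name}.pt"
    if path.exists():
        raise FileExistsError(f"{path} already exists; pick another name")
    return path


# Intended use with the snippet above (not executed here):
#   torch.save(rand_spk, new_speaker_path("my_voice"))

# Demo in a temporary folder so nothing is written to the working directory.
with tempfile.TemporaryDirectory() as tmp:
    path = new_speaker_path("my_voice", folder=Path(tmp) / "speaker")
    print(path.name)  # my_voice.pt
```

Raising an error instead of overwriting is a deliberate choice here, since a sampled timbre that has been overwritten cannot be recovered.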

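All of the tests above passed. On the 6 GB card the audio quality is noticeably better than on the 4 GB one. My old, worn-out machine produced a lot of crackling audio, so the problem may really be hardware-related.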
