Deploying NetEase's open-source TTS on a server | EmotiVoice deployment tutorial

I. Environment

Ubuntu 20.04
Python 3.8
CUDA 11.8

II. Deployment

1. Deploying with Docker

1.1 Install Docker

For how to install Docker, see this article.

1.2 Pull and run the image

docker run -dp 127.0.0.1:8501:8501 syq163/emoti-voice:latest
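
The command above publishes container port 8501 on 127.0.0.1 only, so the demo page is reachable from the server itself but not from other machines. As a rough way to confirm the container is listening, a minimal sketch along these lines could be used (the helper name is_port_open is made up for illustration and is not part of EmotiVoice):

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling is_port_open("127.0.0.1", 8501) on the server should return True once the container has finished starting. It only shows that something accepts connections on that port, not that the page renders correctly.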

2. Full installation

Install the Python dependencies

conda create -n EmotiVoice python=3.8 -y
conda activate EmotiVoice
pip install torch torchaudio
pip install numpy numba scipy transformers==4.26.1 soundfile yacs g2p_en jieba pypinyin
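
Before moving on, it can help to confirm that every package installed above is importable in the active conda environment. The following is a small sketch, assuming the import names match the pip names listed above; missing_packages is a hypothetical helper, not something shipped with the project:

```python
import importlib.util

# Import names for the packages installed with pip above.
REQUIRED = [
    "torch", "torchaudio", "numpy", "numba", "scipy", "transformers",
    "soundfile", "yacs", "g2p_en", "jieba", "pypinyin",
]


def missing_packages(names):
    """Return the names that cannot be found in the current Python environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Running print(missing_packages(REQUIRED)) inside the EmotiVoice environment should print an empty list. To check that PyTorch can see the GPU as well, torch.cuda.is_available() can be called once torch imports successfully.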

Install Git LFS and download the model

curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
git lfs install
git lfs clone https://huggingface.co/WangZeJun/simbert-base-chinese WangZeJun/simbert-base-chinese

Download the pretrained models

https://drive.google.com/drive/folders/1y6Xwj_GG9ulsAonca_unSGbJ4lxbNymM

Location of the pretrained model inside the source tree:

WangZeJun/simbert-base-chinese

Download the source code

git clone https://github.com/lukeewin/EmotiVoice.git

Inside the source tree, create directories to hold the pretrained models

mkdir -p outputs/style_encoder/ckpt
mkdir -p outputs/prompt_tts_open_source_joint/ckpt

Put the g_* and do_* files in outputs/prompt_tts_open_source_joint/ckpt, and put the checkpoint_* files in outputs/style_encoder/ckpt.
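
The file placement described above can also be scripted. This is only a sketch under stated assumptions: the downloaded files sit somewhere under a separate download directory, and the function name place_checkpoints and both of its arguments are invented for this example:

```python
import shutil
from pathlib import Path

# Filename pattern -> target directory (relative to the source tree), as described above.
RULES = {
    "g_*": "outputs/prompt_tts_open_source_joint/ckpt",
    "do_*": "outputs/prompt_tts_open_source_joint/ckpt",
    "checkpoint_*": "outputs/style_encoder/ckpt",
}


def place_checkpoints(download_dir, repo_dir):
    """Copy downloaded checkpoint files into the directories the repo expects."""
    copied = []
    for pattern, target in RULES.items():
        dest = Path(repo_dir) / target
        dest.mkdir(parents=True, exist_ok=True)  # same effect as mkdir -p
        # Search the download directory recursively for matching files.
        for src in sorted(Path(download_dir).rglob(pattern)):
            if src.is_file():
                shutil.copy2(src, dest / src.name)
                copied.append(dest / src.name)
    return copied
```

For example, place_checkpoints("/path/to/downloads", "EmotiVoice") copies the matching files and returns the list of destination paths. The download directory should not be the source tree itself, otherwise files already under outputs/ would be matched again.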

The input text for inference has the format: <speaker>|<style_prompt/emotion_prompt/content>|<phoneme>|<content>

For example: 8051|非常開心|<sos/eos> uo3 sp1 l ai2 sp0 d ao4 sp1 b ei3 sp0 j ing1 sp3 q ing1 sp0 h ua2 sp0 d a4 sp0 x ve2 <sos/eos>|我來到北京，清華大學
The phonemes can be generated with: python frontend.py data/my_text.txt > data/my_text_for_tts.txt
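
To make the four-field layout concrete, here is a minimal sketch that builds and parses one input line. The helpers build_line and parse_line are hypothetical and not part of the EmotiVoice code base. The sketch assumes the fields are separated by a plain "|" and that phonemes are separated by spaces, as in the example above:

```python
def build_line(speaker, prompt, phonemes, content):
    """Join the four fields into one inference input line."""
    fields = [str(speaker), prompt, phonemes, content]
    if any("|" in field for field in fields):
        raise ValueError("fields must not contain the '|' separator")
    return "|".join(fields)


def parse_line(line):
    """Split one inference input line back into its four fields."""
    parts = line.rstrip("\n").split("|")
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}")
    speaker, prompt, phonemes, content = parts
    return {
        "speaker": speaker,
        "prompt": prompt,
        "phonemes": phonemes.split(),  # whitespace-separated phoneme tokens
        "content": content,
    }
```

Running parse_line over each line of a prepared text file is a quick way to catch lines with a missing or extra field before starting inference.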

Run inference on a test file:

TEXT=data/inference/text
python inference_am_vocoder_joint.py \
--logdir prompt_tts_open_source_joint \
--config_folder config/joint \
--checkpoint g_00140000 \
--test_file $TEXT

The synthesized audio is written to: outputs/prompt_tts_open_source_joint/test_audio
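
As a quick sanity check on the results, the generated files can be listed together with their durations. This is a sketch that assumes the output is uncompressed PCM .wav, which Python's standard wave module can read. The exact subdirectory layout under test_audio is not specified here, so the search is recursive. The name wav_durations is made up for this example:

```python
import wave
from pathlib import Path


def wav_durations(audio_dir):
    """Map each .wav file under audio_dir to its duration in seconds."""
    durations = {}
    for path in sorted(Path(audio_dir).rglob("*.wav")):
        with wave.open(str(path), "rb") as wav_file:
            # duration = number of frames / frames per second
            durations[str(path)] = wav_file.getnframes() / wav_file.getframerate()
    return durations
```

Calling wav_durations("outputs/prompt_tts_open_source_joint/test_audio") from the source tree returns a dictionary of file paths and lengths. An empty dictionary suggests that inference did not write any audio.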

To try the web demo page, install Streamlit and launch it:

pip install streamlit
streamlit run demo_page.py
