Python + 在線 + 文生音，音轉文（中文文本轉為英文語音，語音轉為中文文本）

開源模型

平臺：https://huggingface.co/
ars-語言轉文本: pipeline("automatic-speech-recognition", model="openai/whisper-large-v3", device=0 )
- hf:?https://huggingface.co/openai/whisper-large-v3?
- github:?https://github.com/openai/whisper
tts-文本轉語音：pipeline("text-to-speech", "microsoft/speecht5_tts", device=0)
- hf:?https://huggingface.co/microsoft/speecht5_tts?
- github:?https://github.com/microsoft/SpeechT5?
文本語言識別：pipeline("text-classification", model="papluca/xlm-roberta-base-language-detection", device=0)
- hf:?https://huggingface.co/papluca/xlm-roberta-base-language-detection
- github:? https://github.com/saffsd/langid.py
文本翻譯--zh-en:
- pipeline("translation", model="Helsinki-NLP/opus-mt-en-zh", device=0, torch_dtype=torch_dtype)
- pipeline("translation", model="Helsinki-NLP/opus-mt-zh-en", device=0, torch_dtype=torch_dtype)
- hf:?https://huggingface.co/Helsinki-NLP/opus-mt-zh-en?
- github:?https://github.com/Helsinki-NLP/OPUS-MT-train?
流程
- 網頁端提交：當前批次，文本，音頻base64數據，通過給python-flask后端產生一個處理任務(mongo)
- 后端循環處理要處理的任務
- 網頁端查詢已處理好的任務--批次

代碼

## 接口from flask import Flask, request
from flask_cors import CORSimport time
import json
from datetime import datetimeimport mongo_util
import audio_message_util as amutil
import audio_util as autilapp = Flask(__name__)
CORS(app)client = mongo_util.get_client()
db = mongo_util.get_db(client, "")class DateTimeEncoder(json.JSONEncoder):def default(self, obj):if isinstance(obj, datetime):return obj.isoformat()return json.JSONEncoder.default(self, obj)@app.route('/audio/totxt', methods=['POST'])
def totxt():base64data = request.json['data']batch = request.json['batch']filename = 'webm/' + batch + '/' + str(time.time()).replace('.', '_')+'.webm'out_file = filename.replace('webm', 'mp3')autil.base64_tomp3(base64data, filename, out_file)print(datetime.now(), batch, filename, out_file)c = {'batch': batch,'original_path': filename,'audio_path': out_file,'type': 1,'status': 1,}amutil.save(db, c)return 'ok'@app.route('/txt/toaudio', methods=['POST'])
def toaudio():text = request.json['data']batch = request.json['batch']print(datetime.now(), batch, text)c = {'batch': batch,'original_text': text,'type': 2,'status': 1,}amutil.save(db, c)return 'ok'@app.route('/audio/gettxt', methods=['POST'])
def gettxt():batch = request.json['batch']cs = amutil.get(db, {'status':2, 'batch':batch})csj = []for c in cs:# print(c)if c['type'] == 2 and c['audiofile'] != None:c['audiourl'] = 'data:audio/webm;codecs=opus;base64,' + autil.get_audio_base64(c['audiofile'])if c['type'] == 1 and 'audio_path' in c and c['audio_path'] != None:c['audiourl'] = 'data:audio/webm;codecs=opus;base64,' + autil.get_audio_base64(c['audio_path'])csj.append(c)return json.dumps(csj, cls=DateTimeEncoder)if __name__ == '__main__':app.run(debug=True, port="8080")

結果

本文來自互聯網用戶投稿，該文觀點僅代表作者本人，不代表本站立場。本站僅提供信息存儲空間服務，不擁有所有權，不承擔相關法律責任。
如若轉載，請注明出處：http://www.pswp.cn/web/40689.shtml
繁體地址，請注明出處：http://hk.pswp.cn/web/40689.shtml
英文地址，請注明出處：http://en.pswp.cn/web/40689.shtml

如若內容造成侵權/違法違規/事實不符，請聯系多彩編程網進行投訴反饋email:809451989@qq.com，一經查實，立即刪除！