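Application scenarios
- Crawl product reviews from product detail pages and run sentiment analysis on them;
- For telephone customer service, analyze the call audio for sentiment;
Both can be productionized further by exposing them through service interfaces.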
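The service interface mentioned above could take many forms. The following is only a minimal sketch using Python's standard library, assuming a JSON `POST` body with a `text` field. The names `handle_request`, `make_handler`, `serve`, and `stub_classifier` are hypothetical, and the stub would need to be swapped for a real model call.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Dict, Tuple

# Hypothetical classifier signature: review text in, JSON-serializable dict out.
Classifier = Callable[[str], Dict[str, object]]


def handle_request(body: bytes, classify: Classifier) -> Tuple[int, bytes]:
    """Turn a raw request body into an (HTTP status, JSON response body) pair."""
    try:
        payload = json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return 400, json.dumps({"error": "body must be UTF-8 JSON"}).encode("utf-8")
    text = payload.get("text") if isinstance(payload, dict) else None
    if not isinstance(text, str) or not text.strip():
        return 400, json.dumps({"error": "missing 'text' field"}).encode("utf-8")
    result = classify(text)
    return 200, json.dumps(result, ensure_ascii=False).encode("utf-8")


def make_handler(classify: Classifier):
    """Build a request handler class bound to the given classifier."""

    class SentimentHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            status, response = handle_request(self.rfile.read(length), classify)
            self.send_response(status)
            self.send_header("Content-Type", "application/json; charset=utf-8")
            self.send_header("Content-Length", str(len(response)))
            self.end_headers()
            self.wfile.write(response)

    return SentimentHandler


def stub_classifier(text: str) -> Dict[str, object]:
    # Placeholder only: a real deployment would call the model pipeline here.
    return {"label": "unknown", "score": 0.0}


def serve(classify: Classifier, host: str = "127.0.0.1", port: int = 8000) -> None:
    """Run the service until interrupted; host and port are arbitrary defaults."""
    HTTPServer((host, port), make_handler(classify)).serve_forever()
```

Keeping `handle_request` free of any socket code means the request logic can be exercised without starting a server; calling `serve(stub_classifier)` would start one locally.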
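Model selection
- Text: a StructBERT model fine-tuned by Tongyi Lab and trained on Dianping review data. The pre-trained model is used directly for inference and runs on CPU. Further fine-tuning is supported but mostly unnecessary, because the training data already comes from the e-commerce domain and is generally sufficient. The model can be deployed locally;
Reference paper: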
title: Incorporating language structures into pre-training for deep language understanding
author: Wang, Wei and Bi, Bin and Yan, Ming and Wu, Chen and Bao, Zuyi and Xia, Jiangnan and Peng, Liwei and Si, Luo
journal: arXiv preprint arXiv:1908.04577
year: 2019
Version dependencies:
modelscope-lib, latest version
Inference code:
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# Load the StructBERT sentiment classifier; it returns two labels with scores.
semantic_cls = pipeline(Tasks.text_classification, 'damo/nlp_structbert_sentiment-classification_chinese-base')

# Sample reviews, kept in Chinese because that is what the model takes as input.
comments = [
    # Rice: good product, attentive seller, fast delivery.
    '非常厚實的一包大米,來自遙遠的東北,盤錦大米,應該不錯的,密封性很好。賣家的服務真是貼心周到!他們提供了專業的建議,幫助我選擇了合適的商品。物流速度也很快,讓我順利收到了商品。',
    # Food tastes fine, but the staff's attitude could improve.
    '食物的口感還不錯,不過店員的服務態度可以進一步改善一下。',
    # Clothes fit; colour could be brighter; customer service response is average.
    '衣服尺碼合適,色彩可以再鮮艷一些,客服響應速度一般。',
    # Slow delivery, poor after-sales service, poor product quality.
    '物流慢,售后不好,貨品質量差。',
    # Packaging damaged in transit, but customer service handled it quickly and compensated fairly.
    '物流包裝順壞,不過客服處理速度比較快,也給了比較滿意的賠償。',
    # Fridge is noisy and cools slowly.
    '冰箱制冷噪聲較大,制冷慢。',
    # Sarcastic: bought the same shoes as Andy Lau; on me they look like a street sweeper's.
    '買了一件劉德華同款鞋,穿在自己腳上不像劉德華,像掃大街的。',
]

for comment in comments:
    result = semantic_cls(input=comment)
    # Keep the label whose score is higher.
    if result['scores'][0] > result['scores'][1]:
        label = result['labels'][0]
    else:
        label = result['labels'][1]
    print("'" + comment + "' is classified as " + label)
Output (正面 = positive, 負面 = negative):
'非常厚實的一包大米,來自遙遠的東北,盤錦大米,應該不錯的,密封性很好。賣家的服務真是貼心周到!他們提供了專業的建議,幫助我選擇了合適的商品。物流速度也很快,讓我順利收到了商品。' is classified as 正面
'食物的口感還不錯,不過店員的服務態度可以進一步改善一下。' is classified as 正面
'衣服尺碼合適,色彩可以再鮮艷一些,客服響應速度一般。' is classified as 正面
'物流慢,售后不好,貨品質量差。' is classified as 負面
'物流包裝順壞,不過客服處理速度比較快,也給了比較滿意的賠償。' is classified as 正面
'冰箱制冷噪聲較大,制冷慢。' is classified as 負面
'買了一件劉德華同款鞋,穿在自己腳上不像劉德華,像掃大街的。' is classified as 負面
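Several of the sample reviews mix praise with complaints yet still receive a single positive label. One possible way to surface such cases is to look at the gap between the two scores. This is only a sketch: it assumes the result dict carries the `labels` and `scores` lists used in the inference code, `min_margin` is a hypothetical threshold that would need tuning, and the scores in the example are made up for illustration. The source does not show the real scores for those reviews.

```python
from typing import Dict, List, Tuple


def top_label(result: Dict[str, List]) -> Tuple[str, float]:
    """Return the (label, score) pair with the highest score."""
    return max(zip(result["labels"], result["scores"]), key=lambda pair: pair[1])


def label_with_margin(result: Dict[str, List], min_margin: float = 0.2) -> str:
    """Return the winning label, or 'mixed' when the top two scores are close.

    min_margin is a hypothetical threshold, not a value from the model card.
    """
    ranked = sorted(result["scores"], reverse=True)
    label, _ = top_label(result)
    if len(ranked) > 1 and ranked[0] - ranked[1] < min_margin:
        return "mixed"
    return label


# Illustrative inputs only; these scores are invented, not model output.
clear_case = {"labels": ["負面", "正面"], "scores": [0.1, 0.9]}
close_case = {"labels": ["負面", "正面"], "scores": [0.45, 0.55]}
print(label_with_margin(clear_case))
print(label_with_margin(close_case))
```

Reviews flagged as `mixed` could then be routed to a human or to a finer-grained analysis instead of being counted as plainly positive or negative.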
- Audio: a fine-tuned emotion2vec model from Tongyi Lab. It runs on CPU and can be deployed locally;
Reference paper:
title: emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
author: Ma, Ziyang and Zheng, Zhisheng and Ye, Jiaxin and Li, Jinchao and Gao, Zhifu and Zhang, Shiliang and Chen, Xie
journal: arXiv preprint arXiv:2312.15185
year: 2023
Open-source repository:
Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
Version dependencies:
modelscope >= 1.11.1
funasr >= 1.0.5
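Before running the inference code it may help to confirm that the installed packages meet these minimums. The sketch below uses only the standard library; the helper names are hypothetical, and the version parser is deliberately simple (it compares leading numeric components only), so a dedicated version-parsing library would be a safer choice for pre-release tags.

```python
from importlib import metadata
from typing import Optional, Tuple


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version: str) -> Tuple[int, ...]:
    """Parse leading numeric components, e.g. '1.11.1' -> (1, 11, 1)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: Optional[str], minimum: str) -> bool:
    """True when a version is installed and is at least the required minimum."""
    return installed is not None and version_tuple(installed) >= version_tuple(minimum)


# Minimum versions taken from the dependency list above.
for name, minimum in (("modelscope", "1.11.1"), ("funasr", "1.0.5")):
    found = installed_version(name)
    status = "ok" if meets_minimum(found, minimum) else "missing or too old"
    print(f"{name}: installed={found}, required>={minimum} -> {status}")
```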
Inference code:
from funasr import AutoModel

# Load the fine-tuned emotion2vec model.
model = AutoModel(model="iic/emotion2vec_base_finetuned", model_revision="v2.0.4")

# Use the example clip shipped with the model.
wav_file = f"{model.model_path}/example/test.wav"
res = model.generate(wav_file, output_dir="./outputs", granularity="utterance", extract_embedding=False)
print(res)

# Report the label with the highest score.
scores = res[0]["scores"]
max_index = scores.index(max(scores))
print("Emotion detected in the audio: " + res[0]["labels"][max_index])
Output:
rtf_avg: 0.263: 100%|██████████| 1/1 [00:02<00:00,  2.64s/it]
[{'key': 'rand_key_2yW4Acq9GFz6Y', 'labels': ['生氣/angry', '厭惡/disgusted', '恐懼/fearful', '開心/happy', '中立/neutral', '其他/other', '難過/sad', '吃驚/surprised', '<unk>'], 'scores': [0.06824027001857758, 0.030794354155659676, 0.20301730930805206, 0.09666425734758377, 0.12219445407390594, 0.06753909587860107, 0.13648174703121185, 0.11873088777065277, 0.1563376784324646]}]
Emotion detected in the audio: 恐懼/fearful
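In the result printed above, the winning score is only about 0.20, and `<unk>` comes second. Looking at the ranked labels rather than the single maximum may therefore give a better picture. The sketch below assumes the result dict has the `labels` and `scores` lists shown in that output; `rank_emotions` and its `ignore` parameter are hypothetical helpers, not part of funasr.

```python
from typing import Dict, List, Sequence, Tuple


def rank_emotions(
    result: Dict[str, List],
    top_k: int = 3,
    ignore: Sequence[str] = ("<unk>",),
) -> List[Tuple[str, float]]:
    """Return the top_k (label, score) pairs, highest first, skipping ignored labels."""
    pairs = [
        (label, score)
        for label, score in zip(result["labels"], result["scores"])
        if label not in ignore
    ]
    pairs.sort(key=lambda pair: pair[1], reverse=True)
    return pairs[:top_k]


# Labels and scores copied from the printed result above.
sample = {
    "labels": ['生氣/angry', '厭惡/disgusted', '恐懼/fearful', '開心/happy', '中立/neutral',
               '其他/other', '難過/sad', '吃驚/surprised', '<unk>'],
    "scores": [0.06824027001857758, 0.030794354155659676, 0.20301730930805206,
               0.09666425734758377, 0.12219445407390594, 0.06753909587860107,
               0.13648174703121185, 0.11873088777065277, 0.1563376784324646],
}

for label, score in rank_emotions(sample):
    print(f"{label}: {score:.3f}")
```

Skipping `<unk>` is a design choice for readability; passing `ignore=()` keeps every label in the ranking.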