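Sentiment Classification with an RNN
Overview
Sentiment classification is a classic task in natural language processing and a typical classification problem. This section uses MindSpore to implement an RNN-based sentiment classification model that produces results like the following: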
Input: This film is terrible
True label: Negative
Predicted label: Negative

Input: This film is great
True label: Positive
Predicted label: Positive
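Data Preparation
This section uses the IMDB movie review dataset, a classic dataset for sentiment classification. It contains two classes, Positive and Negative. Sample entries are shown below: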
Review | Label |
---|---|
"Quitting" may be as much about exiting a pre-ordained identity as about drug withdrawal. As a rural guy coming to Beijing, class and success must have struck this young artist face on as an appeal to separate from his roots and far surpass his peasant parents' acting success. Troubles arise, however, when the new man is too new, when it demands too big a departure from family, history, nature, and personal identity. The ensuing splits, and confusion between the imaginary and the real and the dissonance between the ordinary and the heroic are the stuff of a gut check on the one hand or a complete escape from self on the other. | Negative |
This movie is amazing because the fact that the real people portray themselves and their real life experience and do such a good job it's like they're almost living the past over again. Jia Hongsheng plays himself an actor who quit everything except music and drugs struggling with depression and searching for the meaning of life while being angry at everyone especially the people who care for him most. | Positive |
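In addition, pre-trained word vectors are needed to encode natural-language words and capture the semantic features of the text. This section uses GloVe word vectors as the embedding.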
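As a rough illustration of how pre-trained vectors turn words into model input, the sketch below parses lines in the plain-text GloVe layout (a word followed by its space-separated vector components) into a vocabulary and an embedding table, then maps a sentence to token ids. The helper names `load_glove` and `encode`, the toy 3-dimensional vectors, and the zero vector used for unknown words are all assumptions made for this example, not part of the original tutorial.

```python
from typing import Dict, Iterable, List, Tuple


def load_glove(lines: Iterable[str]) -> Tuple[Dict[str, int], List[List[float]]]:
    """Parse GloVe-style text lines ("word v1 v2 ... vd") into a vocab and a vector table."""
    vocab: Dict[str, int] = {}
    vectors: List[List[float]] = []
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, values = parts[0], parts[1:]
        vocab[word] = len(vectors)  # row index of this word's vector
        vectors.append([float(v) for v in values])
    return vocab, vectors


def encode(tokens: List[str], vocab: Dict[str, int], unk_id: int) -> List[int]:
    """Map lowercased tokens to vocabulary ids, falling back to unk_id."""
    return [vocab.get(tok.lower(), unk_id) for tok in tokens]


# Toy vectors invented for illustration; these are not real GloVe values.
sample = [
    "this 0.1 0.2 0.3",
    "film 0.4 0.5 0.6",
    "great 0.7 0.8 0.9",
]
vocab, vectors = load_glove(sample)

# Assumption: out-of-vocabulary words share one extra all-zero row.
unk_id = len(vectors)
vectors.append([0.0] * len(vectors[0]))

ids = encode("This film is great".split(), vocab, unk_id)
print(ids)  # [0, 1, 3, 2]
```

In a real pipeline the resulting table would typically be used to initialize the embedding layer of the network, with the ids serving as its input.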
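Data Download Module
To make it easier to download the dataset and the pre-trained word vectors, we first build a data download module that shows download progress and saves files to a specified path. The module uses the requests library to issue HTTP requests and the tqdm library to display the download percentage. For download safety, the data is first written to a temporary file through an IO stream, then saved to the specified path, which is returned.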
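The download flow described above can be sketched roughly as follows. This is a minimal sketch rather than the tutorial's actual module: the function names `write_chunks` and `download`, the 1024-byte chunk size, the 30-second timeout, and the choice to reuse an existing file are assumptions made for illustration. It relies only on `requests.get(..., stream=True)`, `Response.iter_content`, and `tqdm` progress bars.

```python
import shutil
import tempfile
from pathlib import Path
from typing import IO, Iterable, Optional

import requests
from tqdm import tqdm


def write_chunks(chunks: Iterable[bytes], out: IO[bytes], total: Optional[int] = None) -> int:
    """Write byte chunks to `out` while updating a tqdm bar; return bytes written."""
    written = 0
    with tqdm(total=total, unit="B", unit_scale=True) as progress:
        for chunk in chunks:
            if chunk:  # ignore empty keep-alive chunks
                out.write(chunk)
                written += len(chunk)
                progress.update(len(chunk))
    return written


def download(url: str, dest: Path) -> Path:
    """Download `url` to `dest` through a temporary file and return `dest`."""
    dest = Path(dest)
    if dest.exists():  # assumption: an existing file is reused as-is
        return dest
    dest.parent.mkdir(parents=True, exist_ok=True)

    tmp_path = None
    try:
        with requests.get(url, stream=True, timeout=30) as resp:
            resp.raise_for_status()
            length = resp.headers.get("Content-Length")
            total = int(length) if length is not None else None
            # Write to a temporary file first so that an interrupted
            # download never leaves a partial file at `dest`.
            with tempfile.NamedTemporaryFile(delete=False) as tmp:
                tmp_path = Path(tmp.name)
                write_chunks(resp.iter_content(chunk_size=1024), tmp, total)
        shutil.move(str(tmp_path), str(dest))
    finally:
        if tmp_path is not None and tmp_path.exists():
            tmp_path.unlink()  # remove a leftover partial download
    return dest
```

Separating `write_chunks` from `download` keeps the progress-reporting loop testable without network access; `download` only adds the HTTP request and the move from the temporary file to the final path.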
------------------------
The above is a brief summary of today's study material. The results of running the training code follow.