Text decoding means repeatedly predicting the next word from everything entered so far. The model is first trained on large amounts of text; once trained, it keeps predicting the next word based on what it has learned. It works a bit like a parrot: if you keep saying "hello, handsome" to it, then one day when you say "hello", it will naturally follow up with "handsome". Text decoding works the same way.
The training data is far larger, of course. Besides "hello, handsome", the model has also seen "hello, beauty". So how does the AI know which one to say? It looks at the preceding context: if, in the articles we fed it, the word "woman" tends to co-occur with "beauty", then when "woman" appears earlier in the text and "hello" follows, the model knows "beauty" is more probable than "handsome", and "beauty" comes out first.
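The intuition above, picking the more likely continuation given the context, can be sketched as a toy bigram model that simply counts which word follows which. This is a minimal illustration with a made-up corpus, not how GPT-2 actually works internally (GPT-2 uses a neural network over long contexts, not raw counts):

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, prev):
    """Return the most frequent follower of `prev` (greedy decoding)."""
    followers = counts.get(prev)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical training data: "handsome" follows "hello" more often,
# so greedy decoding picks it.
corpus = [
    "hello handsome",
    "hello handsome",
    "hello beauty",
]
model = train_bigram(corpus)
print(predict_next(model, "hello"))  # -> handsome
```

A real language model does the same kind of thing with probabilities computed by a neural network, conditioned on the whole preceding context rather than just the previous word.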
import mindspore
from mindnlp.transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("iiBcai/gpt2", mirror='modelscope')

# add the EOS token as PAD token to avoid warnings
model = GPT2LMHeadModel.from_pretrained("iiBcai/gpt2", pad_token_id=tokenizer.eos_token_id, mirror='modelscope')

# encode the context the generation is conditioned on
input_ids = tokenizer.encode('I enjoy walking with my cute dog', return_tensors='ms')

mindspore.set_seed(0)

# sample with top_k = 5, top_p = 0.95, and num_return_sequences = 3
sample_outputs = model.generate(
    input_ids,
    do_sample=True,
    max_length=50,
    top_k=5,
    top_p=0.95,
    num_return_sequences=3,
)

print("Output:\n" + 100 * '-')
for i, sample_output in enumerate(sample_outputs):
    print("{}: {}".format(i, tokenizer.decode(sample_output, skip_special_tokens=True)))
Output:
----------------------------------------------------------------------------------------------------
0: I enjoy walking with my cute dog."My dog loves the smell of the dog. I'm so happy that she's happy with me."I love to walk with my dog. I'm so happy that she's happy
1: I enjoy walking with my cute dog. I'm a big fan of my cat and her dog, but I don't have the same enthusiasm for her. It's hard not to like her because it is my dog.My husband, who
2: I enjoy walking with my cute dog, but I'm also not sure I would want my dog to walk alone with me."She also told The Daily Beast that the dog is very protective."I think she's very protective of
As this example shows, given the input "I enjoy walking with my cute dog", the AI keeps writing a continuation, and overall the results look quite good.
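The `top_k` and `top_p` arguments passed to `generate()` above control which candidate tokens are allowed at each step before sampling. The filtering logic can be sketched over a toy probability distribution; this is a simplified re-implementation for illustration (the token names and probabilities are made up), not mindnlp's internal code:

```python
def top_k_top_p_filter(probs, top_k=5, top_p=0.95):
    """Keep the top_k most likely tokens, then keep the smallest set of
    those whose cumulative probability reaches top_p, and renormalize.
    `probs` maps token -> probability."""
    # Sort tokens by probability, descending.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    # top-k: truncate to the k most likely tokens.
    ranked = ranked[:top_k]
    # top-p (nucleus): keep tokens until cumulative mass reaches top_p.
    kept, cum = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cum += p
        if cum >= top_p:
            break
    # Renormalize so the surviving probabilities sum to 1.
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}

# Hypothetical next-token distribution after some context.
probs = {"beauty": 0.5, "handsome": 0.3, "dog": 0.1,
         "cat": 0.06, "hi": 0.03, "!": 0.01}
filtered = top_k_top_p_filter(probs, top_k=5, top_p=0.9)
print(filtered)  # only "beauty", "handsome", "dog" survive
```

The model then samples the next token from this filtered, renormalized distribution, which is why each of the three returned sequences can differ while still staying fluent.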