Topic: Using a text embedding model locally with Semantic Kernel
Question:
I've been reading Stephen Toub's blog post about building a simple console-based .NET chat application from the ground up with Semantic Kernel. I'm following the examples, but instead of OpenAI I want to use Microsoft Phi 3 and the nomic embedding model. I could recreate the first examples in the blog post using the Semantic Kernel Hugging Face plugin, but I can't get the text embedding example to run.
I've downloaded Phi and nomic embed text and am running them on a local server with LM Studio.
Here's the code I came up with that uses the Hugging Face plugin:
using System.Net;
using System.Text;
using System.Text.RegularExpressions;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Embeddings;
using Microsoft.SemanticKernel.Memory;
using System.Numerics.Tensors;
using Microsoft.Extensions.Logging;
using Microsoft.SemanticKernel.ChatCompletion;

#pragma warning disable SKEXP0070, SKEXP0003, SKEXP0001, SKEXP0011, SKEXP0052, SKEXP0055, SKEXP0050 // Type is for evaluation purposes only and is subject to change or removal in future updates.

internal class Program
{
    private static async Task Main(string[] args)
    {
        // Suppress this diagnostic to proceed.
        // Initialize the Semantic Kernel
        IKernelBuilder kernelBuilder = Kernel.CreateBuilder();
        kernelBuilder.Services.ConfigureHttpClientDefaults(c => c.AddStandardResilienceHandler());
        var kernel = kernelBuilder
            .AddHuggingFaceTextEmbeddingGeneration(
                "nomic-ai/nomic-embed-text-v1.5-GGUF/nomic-embed-text-v1.5.Q8_0.gguf",
                new Uri("http://localhost:1234/v1"),
                apiKey: "lm-studio",
                serviceId: null)
            .Build();

        var embeddingGenerator = kernel.GetRequiredService<ITextEmbeddingGenerationService>();
        var memoryBuilder = new MemoryBuilder();
        memoryBuilder.WithTextEmbeddingGeneration(embeddingGenerator);
        memoryBuilder.WithMemoryStore(new VolatileMemoryStore());
        var memory = memoryBuilder.Build();

        // Download a document and create embeddings for it
        string input = "What is an amphibian?";
        string[] examples =
        [
            "What is an amphibian?",
            "Cos'è un anfibio?",
            "A frog is an amphibian.",
            "Frogs, toads, and salamanders are all examples.",
            "Amphibians are four-limbed and ectothermic vertebrates of the class Amphibia.",
            "They are four-limbed and ectothermic vertebrates.",
            "A frog is green.",
            "A tree is green.",
            "It's not easy bein' green.",
            "A dog is a mammal.",
            "A dog is a man's best friend.",
            "You ain't never had a friend like me.",
            "Rachel, Monica, Phoebe, Joey, Chandler, Ross"
        ];

        for (int i = 0; i < examples.Length; i++)
            await memory.SaveInformationAsync("net7perf", examples[i], $"paragraph{i}");

        var embed = await embeddingGenerator.GenerateEmbeddingsAsync([input]);
        ReadOnlyMemory<float> inputEmbedding = (embed)[0];

        // Generate embeddings for each chunk.
        IList<ReadOnlyMemory<float>> embeddings = await embeddingGenerator.GenerateEmbeddingsAsync(examples);

        // Print the cosine similarity between the input and each example
        float[] similarity = embeddings.Select(e => TensorPrimitives.CosineSimilarity(e.Span, inputEmbedding.Span)).ToArray();
        similarity.AsSpan().Sort(examples.AsSpan(), (f1, f2) => f2.CompareTo(f1));
        Console.WriteLine("Similarity Example");
        for (int i = 0; i < similarity.Length; i++)
            Console.WriteLine($"{similarity[i]:F6} {examples[i]}");
    }
}
At this line:
for (int i = 0; i < examples.Length; i++)
    await memory.SaveInformationAsync("net7perf", examples[i], $"paragraph{i}");
I get the following exception:
JsonException: The JSON value could not be converted to Microsoft.SemanticKernel.Connectors.HuggingFace.Core.TextEmbeddingResponse
Does anybody know what I'm doing wrong?
I've installed the following NuGet packages in the project:
| Id | Version | ProjectName |
|---|---|---|
| Microsoft.SemanticKernel.Core | 1.15.0 | LocalLlmApp |
| Microsoft.SemanticKernel.Plugins.Memory | 1.15.0-alpha | LocalLlmApp |
| Microsoft.Extensions.Http.Resilience | 8.6.0 | LocalLlmApp |
| Microsoft.Extensions.Logging | 8.0.0 | LocalLlmApp |
| Microsoft.SemanticKernel.Connectors.HuggingFace | 1.15.0-preview | LocalLlmApp |
| Newtonsoft.Json | 13.0.3 | LocalLlmApp |
| Microsoft.Extensions.Logging.Console | 8.0.0 | LocalLlmApp |
Answer:
I think you cannot use AddHuggingFaceTextEmbeddingGeneration with an embedding model from LM Studio out of the box. The reason is that the HuggingFaceClient internally changes the URL and appends:
pipeline/feature-extraction/
private Uri GetEmbeddingGenerationEndpoint(string modelId)
    => new($"{this.Endpoint}{this.Separator}pipeline/feature-extraction/{modelId}");
That matches the error message I get in the LM Studio console:
[2024-07-03 22:18:19.898] [ERROR] Unexpected endpoint or method. (POST /v1/embedding/pipeline/feature-extraction/nomic-ai/nomic-embed-text-v1.5-GGUF/nomic-embed-text-v1.5.Q5_K_M.gguf). Returning 200 anyway
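To see why LM Studio reports an unexpected endpoint, here is a small standalone sketch that reproduces the string interpolation quoted above, using the endpoint and model id from the question. The `EndpointSketch` class and `BuildHuggingFaceStyleEndpoint` method are hypothetical names, and the separator rule (insert "/" only when the endpoint path does not already end with one) is my assumption about how `Separator` behaves, not something confirmed here.

```csharp
using System;

public static class EndpointSketch
{
    // Mirrors the interpolation in GetEmbeddingGenerationEndpoint.
    // ASSUMPTION: the separator is "/" unless the endpoint path already ends with "/".
    public static Uri BuildHuggingFaceStyleEndpoint(Uri endpoint, string modelId)
    {
        string separator = endpoint.AbsolutePath.EndsWith("/", StringComparison.Ordinal) ? "" : "/";
        return new Uri($"{endpoint}{separator}pipeline/feature-extraction/{modelId}");
    }

    public static void Main()
    {
        // Endpoint and model id as passed to AddHuggingFaceTextEmbeddingGeneration in the question.
        var endpoint = new Uri("http://localhost:1234/v1");
        string modelId = "nomic-ai/nomic-embed-text-v1.5-GGUF/nomic-embed-text-v1.5.Q8_0.gguf";

        // The resulting path contains pipeline/feature-extraction/, which is the part
        // the LM Studio log above flags as an unexpected endpoint.
        Console.WriteLine(BuildHuggingFaceStyleEndpoint(endpoint, modelId).AbsolutePath);
    }
}
```

Whatever base URL is passed in, the connector appends the Hugging Face style path, so LM Studio never receives the request at an embeddings route it serves.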
To get this working, the URL the connector calls would have to be changed.
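One way to control the URL yourself is to bypass the Hugging Face connector for embeddings and post to the local server directly. The following is a minimal sketch, not a Semantic Kernel API: it assumes LM Studio serves an OpenAI-compatible `POST /v1/embeddings` route that accepts `{"model", "input"}` and returns the vectors under `data[].embedding`. The `LmStudioEmbeddings` class and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public static class LmStudioEmbeddings
{
    // Builds the request body in the assumed OpenAI-style shape: {"model": ..., "input": [...]}.
    public static string BuildRequestJson(string model, IEnumerable<string> inputs) =>
        JsonSerializer.Serialize(new { model, input = inputs.ToArray() });

    // Reads one float vector per element of the "data" array in the response.
    public static List<float[]> ParseEmbeddings(string responseJson)
    {
        using JsonDocument doc = JsonDocument.Parse(responseJson);
        return doc.RootElement.GetProperty("data")
            .EnumerateArray()
            .Select(item => item.GetProperty("embedding")
                .EnumerateArray()
                .Select(value => value.GetSingle())
                .ToArray())
            .ToList();
    }

    // Posts the inputs to the embeddings URL and returns the parsed vectors.
    public static async Task<List<float[]>> EmbedAsync(
        HttpClient http, Uri embeddingsUrl, string model, IEnumerable<string> inputs)
    {
        using var content = new StringContent(
            BuildRequestJson(model, inputs), Encoding.UTF8, "application/json");
        using HttpResponseMessage response = await http.PostAsync(embeddingsUrl, content);
        response.EnsureSuccessStatusCode();
        return ParseEmbeddings(await response.Content.ReadAsStringAsync());
    }

    public static async Task Main()
    {
        using var http = new HttpClient();

        // ASSUMPTION: LM Studio is listening on port 1234, as in the question.
        List<float[]> vectors = await EmbedAsync(
            http,
            new Uri("http://localhost:1234/v1/embeddings"),
            "nomic-ai/nomic-embed-text-v1.5-GGUF/nomic-embed-text-v1.5.Q8_0.gguf",
            new[] { "What is an amphibian?", "A frog is an amphibian." });

        Console.WriteLine($"{vectors.Count} vectors of dimension {vectors[0].Length}");
    }
}
```

Each `float[]` can be wrapped in a `ReadOnlyMemory<float>` and passed to `TensorPrimitives.CosineSimilarity` as in the question's code. Plugging the vectors into `MemoryBuilder` would additionally need a custom `ITextEmbeddingGenerationService` implementation, which this sketch does not cover.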