LLM Application Development 2: Spring AI in Practice

Spring AI integrates most mainstream large models and provides solid encapsulation and support for the three common architectures of LLM application development, which makes development very convenient. Different models do not necessarily accept the same input types or produce the same output types, so Spring AI classifies models by their input and output types:

Most LLM application development is based on chat models, i.e. models whose output is natural language or code. Among the models Spring AI supports, the OpenAI and Ollama integrations are the most complete.

1. A First Spring AI Example

1.1 Creating the Project

Create a Spring Boot project and add the basic Spring AI dependencies:

Spring AI fully supports Spring Boot auto-configuration and provides a dedicated starter for each model, for example:

Model / Platform: starter dependency

Anthropic:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-anthropic-spring-boot-starter</artifactId>
</dependency>

Azure OpenAI:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-azure-openai-spring-boot-starter</artifactId>
</dependency>

DeepSeek (its API is OpenAI-compatible, so it reuses the OpenAI starter):

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>

Hugging Face:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-huggingface-spring-boot-starter</artifactId>
</dependency>

Ollama:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
</dependency>

OpenAI:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>

We can choose which dependency to add based on the platform we picked. Here we use Ollama as the example.

First, add the Spring AI version property to the project's pom.xml:

<spring-ai.version>1.0.0-M6</spring-ai.version>

Then, add the Spring AI dependency-management entry:

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>${spring-ai.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

Finally, add the spring-ai-ollama starter:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
</dependency>

To make later development easier, also add the Lombok dependency manually:

<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <version>1.18.22</version>
</dependency>

Note: do not use the Lombok dependency generated by start.spring.io; it is buggy!

1.2 Configuring the Model

Configure the model parameters in application.yml:

spring:
  application:
    name: ai-demo
  ai:
    ollama:
      base-url: http://localhost:11434 # Ollama service address (this is the default)
      chat:
        model: deepseek-r1:7b # model name
        options:
          temperature: 0.8 # model temperature; controls randomness of the output, lower is more stable

1.3 Wrapping a ChatClient

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.SimpleLoggerAdvisor;
import org.springframework.ai.ollama.OllamaChatModel;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class CommonConfiguration {

    // The model parameter is the chat model in use. Here it is Ollama's;
    // OpenAiChatModel could be injected instead.
    @Bean
    public ChatClient chatClient(OllamaChatModel model) {
        return ChatClient.builder(model) // create a ChatClient builder
                .build(); // build the ChatClient instance
    }
}

  • ChatClient.builder: returns a ChatClient.Builder factory object through which you can freely choose the model and add all kinds of custom configuration.

  • OllamaChatModel: if you added the Ollama starter, an OllamaChatModel bean is auto-configured and can be injected here. The OpenAI starter works the same way.

  • Spring uses the method name as the bean name by default, so this bean is named chatClient.

1.4 Synchronous Calls

import lombok.RequiredArgsConstructor;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RequiredArgsConstructor
@RestController
@RequestMapping("/ai")
public class ChatController {

    private final ChatClient chatClient;

    // Do not change the request method or path; the frontend will rely on them.
    @RequestMapping("/chat")
    public String chat(@RequestParam(defaultValue = "Tell me a joke") String prompt) {
        return chatClient.prompt(prompt) // pass the user prompt
                .call() // synchronous call; waits until the AI has produced the full output
                .content(); // return the response content
    }
}

Note that call() is a synchronous invocation: nothing can be returned to the frontend until the complete response has arrived.

1.5 Streaming Calls

With a synchronous call the page waits a long time before showing any result, which makes for a poor user experience. To fix this, we can switch to streaming. Spring AI implements streaming on top of WebFlux.

// Note the return type: Flux<String>, a streamed result. Also set the response
// content type and charset, otherwise the frontend may display garbled text.
@RequestMapping(value = "/chat", produces = "text/html;charset=UTF-8")
public Flux<String> chat(@RequestParam(defaultValue = "Tell me a joke") String prompt) {
    return chatClient.prompt(prompt)
            .stream() // streaming call
            .content();
}

1.6 Setting the System Prompt

You may notice that when we ask the AI who it is, it answers that it is DeepSeek-R1; that identity is baked into the model. If we want the AI to work under a new persona, we need to give it System background information.

Setting the System message in Spring AI is very convenient: there is no need to wrap it into a Message on every request; just specify it once when creating the ChatClient.

Modify the configuration class to give the ChatClient a default System message:

@Bean
public ChatClient chatClient(OllamaChatModel model) {
    return ChatClient.builder(model) // create a ChatClient builder
            .defaultSystem("You are an experienced software development engineer. Please answer students' questions in a friendly, helpful, and cheerful way.")
            .defaultAdvisors(new SimpleLoggerAdvisor())
            .build(); // build the ChatClient instance
}

1.7 Logging

By default, interactions with the AI are not logged, so we cannot see what prompt Spring AI actually assembled or whether anything is wrong with it. That makes debugging inconvenient.

1.7.1 Advisor

Spring AI uses an AOP-based mechanism to enhance, intercept, and modify the conversation with the model. Every such enhancement must implement the Advisor interface.

Spring provides several default Advisor implementations covering basic enhancements:

  • SimpleLoggerAdvisor: logging

  • MessageChatMemoryAdvisor: conversation memory

  • QuestionAnswerAdvisor: RAG (retrieval-augmented generation)

1.7.2 Adding the Logging Advisor

First, modify the configuration to add a logging Advisor to the ChatClient:

@Bean
public ChatClient chatClient(OllamaChatModel model) {
    return ChatClient.builder(model) // create a ChatClient builder
            .defaultSystem("You are a warm-hearted, adorable assistant named 小團團. Please answer questions as 小團團, in her voice.")
            .defaultAdvisors(new SimpleLoggerAdvisor()) // default Advisor that logs requests and responses
            .build(); // build the ChatClient instance
}

1.7.3 Adjusting the Log Level

Next, add logging configuration to application.yml to update the log levels:

logging:
  level:
    org.springframework.ai: debug # log level for AI conversations
    com.lgh.ai: debug # log level for this project

1.8 Conversation Memory

Spring AI ships with conversation-memory support: it saves the conversation history for us and automatically prepends it to the next request to the AI.

1.8.1 ChatMemory

Conversation memory is likewise implemented via AOP. Spring provides a MessageChatMemoryAdvisor that we can add to the ChatClient just like the logging advisor. Note, however, that MessageChatMemoryAdvisor must be given a ChatMemory instance, which determines how the conversation history is stored.

The ChatMemory interface (shipped with Spring AI) is declared as follows:

public interface ChatMemory {

    // TODO: consider a non-blocking interface for streaming usages
    default void add(String conversationId, Message message) {
        this.add(conversationId, List.of(message));
    }

    // append messages to the history of the given conversationId
    void add(String conversationId, List<Message> messages);

    // fetch the last N historical messages of the given conversationId
    List<Message> get(String conversationId, int lastN);

    // clear the history of the given conversationId
    void clear(String conversationId);
}

All conversation memory is keyed by conversationId, the conversation id; memories belonging to different conversation ids are naturally managed separately.

Spring AI currently provides two ChatMemory implementations:

  • InMemoryChatMemory: history is kept in memory

  • CassandraChatMemory: history is kept in a Cassandra database (requires extra dependencies and is coupled to the vector store, so it is not very flexible)

The in-memory ChatMemory (shipped with Spring AI):

public class InMemoryChatMemory implements ChatMemory {

    Map<String, List<Message>> conversationHistory = new ConcurrentHashMap<>();

    public InMemoryChatMemory() {
    }

    public void add(String conversationId, List<Message> messages) {
        this.conversationHistory.putIfAbsent(conversationId, new ArrayList<>());
        this.conversationHistory.get(conversationId).addAll(messages);
    }

    public List<Message> get(String conversationId, int lastN) {
        List<Message> all = this.conversationHistory.get(conversationId);
        return all != null
                ? all.stream().skip(Math.max(0, all.size() - lastN)).toList()
                : List.of();
    }

    public void clear(String conversationId) {
        this.conversationHistory.remove(conversationId);
    }
}

A Redis-based ChatMemory implementation:

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.lgh.web.manager.springai.model.Msg;
import lombok.RequiredArgsConstructor;
import org.springframework.ai.chat.memory.ChatMemory;
import org.springframework.ai.chat.messages.Message;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.stereotype.Component;

import java.util.List;

/**
 * Redis-backed ChatMemory implementation
 * @Author GuihaoLv
 */
@RequiredArgsConstructor
@Component
public class RedisChatMemory implements ChatMemory {

    private final StringRedisTemplate redisTemplate;
    private final ObjectMapper objectMapper;

    private final static String PREFIX = "chat:";

    @Override
    public void add(String conversationId, List<Message> messages) {
        if (messages == null || messages.isEmpty()) {
            return;
        }
        List<String> list = messages.stream()
                .map(Msg::new)
                .map(msg -> {
                    try {
                        return objectMapper.writeValueAsString(msg);
                    } catch (JsonProcessingException e) {
                        throw new RuntimeException(e);
                    }
                })
                .toList();
        redisTemplate.opsForList().leftPushAll(PREFIX + conversationId, list);
    }

    @Override
    public List<Message> get(String conversationId, int lastN) {
        List<String> list = redisTemplate.opsForList().range(PREFIX + conversationId, 0, lastN);
        if (list == null || list.isEmpty()) {
            return List.of();
        }
        return list.stream()
                .map(s -> {
                    try {
                        return objectMapper.readValue(s, Msg.class);
                    } catch (JsonProcessingException e) {
                        throw new RuntimeException(e);
                    }
                })
                .map(Msg::toMessage)
                .toList();
    }

    @Override
    public void clear(String conversationId) {
        redisTemplate.delete(PREFIX + conversationId);
    }
}

The Msg wrapper class:

/**
 * Message wrapper class
 * @Author GuihaoLv
 */
@NoArgsConstructor
@AllArgsConstructor
@Data
public class Msg {

    MessageType messageType;                   // message type (enum)
    String text;                               // message text content
    Map<String, Object> metadata;              // message metadata (extra info)
    List<AssistantMessage.ToolCall> toolCalls; // tool calls (only assistant messages may have them)

    public Msg(Message message) {
        this.messageType = message.getMessageType();
        this.text = message.getText();
        this.metadata = message.getMetadata();
        // copy toolCalls only when the original message is an assistant message
        if (message instanceof AssistantMessage am) {
            this.toolCalls = am.getToolCalls();
        }
    }

    public Message toMessage() {
        return switch (messageType) {
            case SYSTEM -> new SystemMessage(text);
            case USER -> new UserMessage(text, List.of(), metadata);
            case ASSISTANT -> new AssistantMessage(text, metadata, toolCalls, List.of());
            default -> throw new IllegalArgumentException("Unsupported message type: " + messageType);
        };
    }
}

1.8.2 Adding the Conversation-Memory Advisor

Register a ChatMemory bean:

@Bean
public ChatMemory chatMemory() {
    return new InMemoryChatMemory();
}

Alternatively, register the Redis-based implementation together with a customized ObjectMapper:

@Autowired
private StringRedisTemplate stringRedisTemplate;

@Bean
public ObjectMapper objectMapper() {
    ObjectMapper objectMapper = new ObjectMapper();
    objectMapper.configure(com.fasterxml.jackson.databind.DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);
    // register the Java 8 date/time module
    objectMapper.registerModule(new JavaTimeModule());
    return objectMapper;
}

@Bean
public ChatMemory chatMemory() {
    return new RedisChatMemory(stringRedisTemplate, objectMapper());
}

Add the MessageChatMemoryAdvisor to the ChatClient:

@Bean
public ChatClient chatClient(OllamaChatModel model, ChatMemory chatMemory) {
    return ChatClient.builder(model) // create a ChatClient builder
            .defaultSystem("Your name is 小黑. Please answer students' questions in a friendly, helpful, and cheerful way.")
            .defaultAdvisors(new SimpleLoggerAdvisor()) // default Advisor for logging
            .defaultAdvisors(new MessageChatMemoryAdvisor(chatMemory)) // conversation memory
            .build(); // build the ChatClient instance
}

The chat now has conversation memory.

1.9 Conversation History

Conversation history and conversation memory are two different things:

Conversation memory: making the model remember the content of each round of dialogue, so that it does not forget a question the moment after it was asked.

Conversation history: keeping track of how many distinct conversations there are.

ChatMemory records all messages within one conversation, using the conversationId as the key and a List<Message> as the value. With those historical messages the model can keep answering in context; that is conversation memory.

Conversation history, by contrast, is the conversationId of each conversation; given a conversationId we can then look up its List<Message>.
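The two concepts can be sketched side by side as two plain data structures (ChatStore and its method names are illustrative only, not Spring AI classes):

```java
import java.util.*;

// Minimal sketch: conversation history is the set of known conversation ids;
// conversation memory is the list of messages stored per id.
class ChatStore {
    // conversation history: which conversations exist, in insertion order
    private final Set<String> chatIds = new LinkedHashSet<>();
    // conversation memory: the messages of each conversation
    private final Map<String, List<String>> memory = new HashMap<>();

    void record(String chatId, String message) {
        chatIds.add(chatId); // history: remember the id
        memory.computeIfAbsent(chatId, k -> new ArrayList<>())
              .add(message);  // memory: remember the message
    }

    List<String> history() {
        return List.copyOf(chatIds); // all known conversation ids
    }

    List<String> messages(String chatId) {
        return memory.getOrDefault(chatId, List.of());
    }
}
```

Querying the history list gives you the ids; each id then leads to its own message list, mirroring the ChatHistoryRepository/ChatMemory split below.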

1.9.1 Managing Conversation Memory

Conversation memory is managed by conversationId (hereafter abbreviated chatId), so querying the conversation history really means querying which chatIds exist. To make that possible we must record every chatId, so we define a standard interface for managing conversation history.

/**
 * Conversation-history repository
 * @Author GuihaoLv
 */
public interface ChatHistoryRepository {

    /**
     * Save a conversation record
     * @param type business type, e.g. chat, service, pdf
     * @param chatId conversation id
     */
    void save(String type, String chatId);

    /**
     * List conversation ids
     * @param type business type, e.g. chat, service, pdf
     * @return list of conversation ids
     */
    List<String> getChatIds(String type);
}

An in-memory implementation of conversation-history management:

/**
 * In-memory conversation-history management
 * @Author GuihaoLv
 */
@Slf4j
//@Component
@RequiredArgsConstructor
public class InMemoryChatHistoryRepository implements ChatHistoryRepository {

    private Map<String, List<String>> chatHistory;
    private final ObjectMapper objectMapper;
    private final ChatMemory chatMemory;

    @Override
    public void save(String type, String chatId) {
        List<String> chatIds = chatHistory.computeIfAbsent(type, k -> new ArrayList<>());
        if (chatIds.contains(chatId)) {
            return;
        }
        chatIds.add(chatId);
    }

    @Override
    public List<String> getChatIds(String type) {
        return chatHistory.getOrDefault(type, List.of());
    }

    @PostConstruct
    private void init() {
        // 1. initialize the history map
        this.chatHistory = new HashMap<>();
        // 2. read the locally persisted history and memory
        FileSystemResource historyResource = new FileSystemResource("chat-history.json");
        FileSystemResource memoryResource = new FileSystemResource("chat-memory.json");
        if (!historyResource.exists()) {
            return;
        }
        try {
            // conversation history
            Map<String, List<String>> chatIds = this.objectMapper.readValue(
                    historyResource.getInputStream(), new TypeReference<>() {});
            if (chatIds != null) {
                this.chatHistory = chatIds;
            }
            // conversation memory
            Map<String, List<Msg>> memory = this.objectMapper.readValue(
                    memoryResource.getInputStream(), new TypeReference<>() {});
            if (memory != null) {
                memory.forEach(this::convertMsgToMessage);
            }
        } catch (IOException ex) {
            throw new RuntimeException(ex);
        }
    }

    private void convertMsgToMessage(String chatId, List<Msg> messages) {
        this.chatMemory.add(chatId, messages.stream().map(Msg::toMessage).toList());
    }

    @PreDestroy
    private void persistent() {
        String history = toJsonString(this.chatHistory);
        String memory = getMemoryJsonString();
        FileSystemResource historyResource = new FileSystemResource("chat-history.json");
        FileSystemResource memoryResource = new FileSystemResource("chat-memory.json");
        try (PrintWriter historyWriter = new PrintWriter(historyResource.getOutputStream(), true, StandardCharsets.UTF_8);
             PrintWriter memoryWriter = new PrintWriter(memoryResource.getOutputStream(), true, StandardCharsets.UTF_8)) {
            historyWriter.write(history);
            memoryWriter.write(memory);
        } catch (IOException ex) {
            log.error("IOException occurred while saving vector store file.", ex);
            throw new RuntimeException(ex);
        } catch (SecurityException ex) {
            log.error("SecurityException occurred while saving vector store file.", ex);
            throw new RuntimeException(ex);
        } catch (NullPointerException ex) {
            log.error("NullPointerException occurred while saving vector store file.", ex);
            throw new RuntimeException(ex);
        }
    }

    private String getMemoryJsonString() {
        Class<InMemoryChatMemory> clazz = InMemoryChatMemory.class;
        try {
            Field field = clazz.getDeclaredField("conversationHistory");
            field.setAccessible(true);
            Map<String, List<Message>> memory = (Map<String, List<Message>>) field.get(chatMemory);
            Map<String, List<Msg>> memoryToSave = new HashMap<>();
            memory.forEach((chatId, messages) ->
                    memoryToSave.put(chatId, messages.stream().map(Msg::new).toList()));
            return toJsonString(memoryToSave);
        } catch (NoSuchFieldException | IllegalAccessException e) {
            throw new RuntimeException(e);
        }
    }

    private String toJsonString(Object object) {
        ObjectWriter objectWriter = this.objectMapper.writerWithDefaultPrettyPrinter();
        try {
            return objectWriter.writeValueAsString(object);
        } catch (JsonProcessingException e) {
            throw new RuntimeException("Error serializing documentMap to JSON.", e);
        }
    }
}

A Redis-based implementation of conversation-history management:

/**
 * Redis-backed ChatHistoryRepository implementation
 * @Author GuihaoLv
 */
@RequiredArgsConstructor
@Component
public class RedisChatHistory implements ChatHistoryRepository {

    private final StringRedisTemplate redisTemplate;

    private final static String CHAT_HISTORY_KEY_PREFIX = "chat:history:";

    @Override
    public void save(String type, String chatId) {
        redisTemplate.opsForSet().add(CHAT_HISTORY_KEY_PREFIX + type, chatId);
    }

    @Override
    public List<String> getChatIds(String type) {
        Set<String> chatIds = redisTemplate.opsForSet().members(CHAT_HISTORY_KEY_PREFIX + type);
        if (chatIds == null || chatIds.isEmpty()) {
            return Collections.emptyList();
        }
        return chatIds.stream().sorted(String::compareTo).toList();
    }
}

1.9.2 Saving the Conversation Id

Next, modify the chat method in ChatController to do three things:

  • Add a chatId request parameter; the frontend must pass it on every AI request.

  • On every request, store the chatId in the ChatHistoryRepository.

  • On every request to the model, pass along our own chatId.

@CrossOrigin("*")
@RequiredArgsConstructor
@RestController
@RequestMapping("/ai")
public class ChatController {

    private final ChatClient chatClient;
    private final ChatMemory chatMemory;
    private final ChatHistoryRepository chatHistoryRepository;

    @RequestMapping(value = "/chat", produces = "text/html;charset=UTF-8")
    public Flux<String> chat(@RequestParam(defaultValue = "Tell me a joke") String prompt, String chatId) {
        // record the chatId under the "chat" business type
        chatHistoryRepository.save("chat", chatId);
        return chatClient.prompt(prompt)
                .advisors(a -> a.param(CHAT_MEMORY_CONVERSATION_ID_KEY, chatId))
                .stream()
                .content();
    }
}

The chatId is handed to the advisors through the advisor context, i.e. stored into the context as a key-value pair:

chatClient.advisors(a -> a.param(CHAT_MEMORY_CONVERSATION_ID_KEY, chatId))

CHAT_MEMORY_CONVERSATION_ID_KEY is a constant key defined in AbstractChatMemoryAdvisor; when MessageChatMemoryAdvisor runs, it can read the chatId from the context under that key.
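The context mechanism itself is just a string-keyed map consulted with a fallback default. A minimal sketch of that lookup (plain Java with illustrative names, not the actual Spring AI classes):

```java
import java.util.Map;

// Sketch of how a memory advisor resolves the conversation id from a
// key-value context, falling back to a default id when the caller set none.
class AdvisorContextSketch {
    static final String CONVERSATION_ID_KEY = "chat_memory_conversation_id";
    static final String DEFAULT_CONVERSATION_ID = "default";

    static String resolveConversationId(Map<String, Object> context) {
        Object id = context.get(CONVERSATION_ID_KEY);
        return id != null ? id.toString() : DEFAULT_CONVERSATION_ID;
    }
}
```

This is why omitting the chatId parameter still works: all requests then share one default conversation, and their memories get mixed together.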

1.9.3 Querying Conversation History

We define a new controller dedicated to querying conversation history, with two endpoints:

  • List the conversation history by business type (we will have three different businesses whose histories must be recorded separately; your own business might record by userId and query by userId).

  • Query the historical messages of a given conversation by chatId.

Querying a conversation's history yields a collection of Message objects, but Message does not match what the page needs, so we define a VO of our own.

/**
 * Message query result
 * @Author GuihaoLv
 */
@NoArgsConstructor
@Data
public class MessageVO {

    private String role;
    private String content;

    public MessageVO(Message message) {
        switch (message.getMessageType()) {
            case USER:
                role = "user";
                break;
            case ASSISTANT:
                role = "assistant";
                break;
            default:
                role = "";
                break;
        }
        this.content = message.getText();
    }
}
/**
 * AI conversation history
 * @author GuihaoLv
 */
@RestController
@RequestMapping("/web/aiHistory")
@Tag(name = "AI conversation history")
@Slf4j
public class ChatHistoryController {

    @Autowired
    private RedisChatHistory chatHistoryRepository;

    @Autowired
    private RedisChatMemory chatMemory;

    /**
     * List conversation ids
     */
    @GetMapping("/{type}")
    @Operation(summary = "List conversation ids")
    public List<String> getChatIds(@PathVariable("type") String type) {
        return chatHistoryRepository.getChatIds(type);
    }

    /**
     * Get the messages of a conversation
     */
    @GetMapping("/{type}/{chatId}")
    @Operation(summary = "Get the messages of a conversation")
    public List<MessageVO> getChatHistory(@PathVariable("type") String type,
                                          @PathVariable("chatId") String chatId) {
        List<Message> messages = chatMemory.get(chatId, Integer.MAX_VALUE);
        if (messages == null) {
            return List.of();
        }
        return messages.stream().map(MessageVO::new).toList();
    }
}

Overall design of the conversation-memory logic:

2 Function Calling

2.1 Introduction to Function Calling

AI excels at analyzing unstructured data. When a requirement involves strict logical validation or business logic such as reading and writing a database, we can give the model the ability to execute business rules. We define the database operations and other business logic as Functions, also called Tools. Then, in the prompt, we tell the model which tool to call in which situation, and while users converse with the model it can invoke the tools at the appropriate moments.

Flow:

  1. Define these operations as Functions in advance (Spring AI calls them Tools).

  2. Package each Function's name, purpose, and required parameters into the prompt and send it together with the user's question to the model.

  3. While conversing with the user, the model judges from the conversation whether a Function call is needed.

  4. If it is, the model returns the Function name, arguments, and so on.

  5. The Java side parses that result, decides which function to run, executes the Function, and packages the result into a new prompt sent back to the AI.

  6. The AI keeps interacting with the user until the task is complete.
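Steps 4 and 5 above boil down to a simple dispatch: look the returned function name up in a registry and invoke it with the returned arguments. A minimal sketch (plain Java; ToolRegistry is a hypothetical name, and in practice Spring AI performs this loop for us):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch of the tool-dispatch step in a function-calling loop: the model
// returns a tool name plus arguments; the application looks the tool up,
// executes it, and would then feed the result back to the model.
class ToolRegistry {
    private final Map<String, Function<Map<String, String>, String>> tools = new HashMap<>();

    void register(String name, Function<Map<String, String>, String> tool) {
        tools.put(name, tool);
    }

    // dispatch the model's tool-call request to the matching Java function
    String dispatch(String toolName, Map<String, String> args) {
        Function<Map<String, String>, String> tool = tools.get(toolName);
        if (tool == null) {
            throw new IllegalArgumentException("Unknown tool: " + toolName);
        }
        return tool.apply(args);
    }
}
```

The fixed, mechanical nature of this loop is exactly why Spring AI can automate it, as described next.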

Spring AI provides Function Calling support out of the box. Parsing the model's response, extracting the function name and arguments, and invoking the function are all fixed steps, so Spring AI once again uses its AOP-style machinery to perform the intermediate function-call handling automatically.

Our job is therefore reduced to:

  • Writing the base prompt (without the Tool definitions)

  • Writing the Tool (Function)

  • Configuring the Advisor (Spring AI splices the Tool definitions into the prompt and performs the Tool calls for us)

2.2 Function Calling in Practice

Goal: have the model automatically summarize the knowledge points of the current conversation and save them.

2.2.1 Business Layer

@Mapper
public interface AINoteMapper {

    /**
     * Insert a record
     */
    @Insert("insert into tb_ai_note (user_id, chat_id, title, content) values (#{userId}, #{chatId}, #{title}, #{content})")
    Boolean insert(AINote aINote);

    /**
     * Delete a record
     */
    @Delete("delete from tb_ai_note where id = #{aiNoteId}")
    Boolean deleteById(Long aiNoteId);

    /**
     * List all records
     */
    @Select("SELECT * FROM tb_ai_note")
    List<AINote> selectList();

    /**
     * Find a record by id
     */
    @Select("SELECT * FROM tb_ai_note where id = #{aiNoteId}")
    AINote selectById(Long aiNoteId);

    /**
     * Insert an AI text-generation record
     */
    @Insert("insert into tb_generate_text (user_id, prompt_words, generated_text, translated_text) values (#{userId}, #{promptWords}, #{generatedText}, #{translatedText})")
    Boolean addGT(GenerateText generateText);

    /**
     * List AI text-generation records
     */
    @Select("SELECT * FROM tb_generate_text where user_id = #{userId} order by create_time desc")
    List<GenerateText> getGTList(Long userId);

    /**
     * Delete an AI text-generation record
     */
    @Delete("delete from tb_generate_text where id = #{generateTextId}")
    Boolean deleteGT(Long generateTextId);
}
/**
 * AI note table
 * @Author GuihaoLv
 */
@Data
@AllArgsConstructor
@NoArgsConstructor
@Builder
public class AINote extends BaseEntity implements Serializable {
    private Long userId;    // user id
    private String chatId;  // conversation id
    private String title;   // title
    private String content; // content
}

2.2.2 Defining the Function

/**
 * AI note tools
 * @Author GuihaoLv
 */
@Component
public class AINoteTools {

    @Autowired
    private AINoteMapper aiNoteMapper;

    /**
     * Save the knowledge points of a conversation as an AI note
     * @param chatId  conversation id (links the note to its conversation)
     * @param title   note title (summarizes the key knowledge point)
     * @param content note content (knowledge-point details)
     * @return whether the save succeeded (true/false)
     */
    @Tool(description = "Create an AI note from the knowledge points of a conversation. Requires the conversation id, a title, and the content; the current user is attached automatically.")
    public Boolean createAINote(
            @ToolParam(required = false, description = "Unique conversation id, used to link the note to its conversation") String chatId,
            @ToolParam(required = true, description = "Note title, a concise summary of the knowledge point (at most 20 characters)") String title,
            @ToolParam(required = true, description = "Detailed note content recording the knowledge points of the conversation") String content) {
        // build the note object (the current user id is filled in automatically)
        AINote aiNote = new AINote();
        aiNote.setUserId(UserUtil.getUserId());
        aiNote.setChatId(chatId);
        aiNote.setTitle(title);
        aiNote.setContent(content);
        // persist to the database
        return aiNoteMapper.insert(aiNote);
    }
}

The @ToolParam annotation is provided by Spring AI to describe Function parameters; all of that information is sent to the model as part of the prompt.

2.2.3 The System Prompt

public static final String CHAT_ROLE = """
        You are an assistant that helps users take notes on their conversations.
        When the user gives any of the following instructions, you must call the createAINote tool:
        - "Save what we just discussed as a note"
        - "Record this knowledge point"
        - "Save the current conversation"
        - any similar request to save conversation content
        When calling the tool you must supply 3 parameters:
        1. chatId: the id of the current conversation (taken from the conversation context)
        2. title: a title distilled from the conversation content (at most 20 characters)
        3. content: the knowledge-point details to record (extract the relevant content in full)
        On success, reply "I have saved the note for you: [title]";
        on failure, reply "Saving the note failed, please try again."
        """;

2.2.4 Configuring the Tool on the ChatClient

@Bean
public ChatClient chatCommonClient(OpenAiChatModel model, ChatMemory chatMemory,
                                   VectorStore vectorStore, AINoteTools aiNoteTools) {
    return ChatClient.builder(model)
            .defaultOptions(ChatOptions.builder().model("qwen-omni-turbo").build())
            .defaultSystem(AIChatConstant.CHAT_ROLE)
            .defaultAdvisors(
                    new SimpleLoggerAdvisor(),
                    new MessageChatMemoryAdvisor(chatMemory),
                    new QuestionAnswerAdvisor(vectorStore,
                            SearchRequest.builder().similarityThreshold(0.6).topK(2).build()))
            .defaultTools(List.of(aiNoteTools))
            .build();
}

Spring AI's OpenAI client currently has compatibility problems with Alibaba Cloud's Bailian platform, so Function Calling cannot be used in stream mode. To stay compatible with Bailian we need to make some adjustments.

3 Compatibility with Alibaba Cloud Bailian

As of Spring AI version 1.0.0-M6, Spring AI's OpenAiChatModel has compatibility issues with some Bailian endpoints, including but not limited to the following two:

  • In Function Calling's stream mode, Bailian returns the tool arguments in incomplete fragments that must be concatenated, while OpenAI returns them complete, so no concatenation is needed.

  • For audio recognition, Bailian's qwen-omni models require the media data in the form data:;base64,${media-data}, whereas OpenAI takes the raw ${media-data} directly.
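The second difference is purely a string-formatting one. A sketch of producing both shapes from raw media bytes (the toBailianFormat/toOpenAiFormat helper names are illustrative, not part of either API):

```java
import java.util.Base64;

// Sketch of the two media-data encodings mentioned above: Bailian's
// qwen-omni expects a data-URI wrapper, OpenAI expects the bare base64 string.
class MediaFormat {

    static String toOpenAiFormat(byte[] media) {
        return Base64.getEncoder().encodeToString(media); // bare ${media-data}
    }

    static String toBailianFormat(byte[] media) {
        // data:;base64,${media-data}
        return "data:;base64," + toOpenAiFormat(media);
    }
}
```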

Because Spring AI's OpenAI module follows the OpenAI specification, it will not add Alibaba-specific workarounds even in future versions, unless Spring AI ships a dedicated starter for Alibaba Cloud. That leaves two options:

  • Wait for Alibaba's official spring-ai-alibaba project to be upgraded to the latest version.

  • Rewrite the OpenAiChatModel implementation ourselves.

Below we take the second approach and rewrite OpenAiChatModel to solve both problems.

First, we write our own ChatModel that follows the Alibaba Bailian API conventions. Most of the code is taken from Spring AI's OpenAiChatModel; only the places where the protocols diverge need to be rewritten.

Create a new AlibabaOpenAiChatModel class:

package com.itheima.ai.model;

import io.micrometer.observation.Observation;
import io.micrometer.observation.ObservationRegistry;
import io.micrometer.observation.contextpropagation.ObservationThreadLocalAccessor;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.chat.messages.AssistantMessage;
import org.springframework.ai.chat.messages.MessageType;
import org.springframework.ai.chat.messages.ToolResponseMessage;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.metadata.*;
import org.springframework.ai.chat.model.*;
import org.springframework.ai.chat.observation.ChatModelObservationContext;
import org.springframework.ai.chat.observation.ChatModelObservationConvention;
import org.springframework.ai.chat.observation.ChatModelObservationDocumentation;
import org.springframework.ai.chat.observation.DefaultChatModelObservationConvention;
import org.springframework.ai.chat.prompt.ChatOptions;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.model.Media;
import org.springframework.ai.model.ModelOptionsUtils;
import org.springframework.ai.model.function.FunctionCallback;
import org.springframework.ai.model.function.FunctionCallbackResolver;
import org.springframework.ai.model.function.FunctionCallingOptions;
import org.springframework.ai.model.tool.LegacyToolCallingManager;
import org.springframework.ai.model.tool.ToolCallingChatOptions;
import org.springframework.ai.model.tool.ToolCallingManager;
import org.springframework.ai.model.tool.ToolExecutionResult;
import org.springframework.ai.openai.OpenAiChatOptions;
import org.springframework.ai.openai.api.OpenAiApi;
import org.springframework.ai.openai.api.common.OpenAiApiConstants;
import org.springframework.ai.openai.metadata.support.OpenAiResponseHeaderExtractor;
import org.springframework.ai.retry.RetryUtils;
import org.springframework.ai.tool.definition.ToolDefinition;
import org.springframework.core.io.ByteArrayResource;
import org.springframework.core.io.Resource;
import org.springframework.http.ResponseEntity;
import org.springframework.lang.Nullable;
import org.springframework.retry.support.RetryTemplate;
import org.springframework.util.*;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

import java.util.*;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

public class AlibabaOpenAiChatModel extends AbstractToolCallSupport implements ChatModel {

    private static final Logger logger = LoggerFactory.getLogger(AlibabaOpenAiChatModel.class);

    private static final ChatModelObservationConvention DEFAULT_OBSERVATION_CONVENTION = new DefaultChatModelObservationConvention();

    private static final ToolCallingManager DEFAULT_TOOL_CALLING_MANAGER = ToolCallingManager.builder().build();

    /**
     * The default options used for the chat completion requests.
     */
    private final OpenAiChatOptions defaultOptions;

    /**
     * The retry template used to retry the OpenAI API calls.
     */
    private final RetryTemplate retryTemplate;

    /**
     * Low-level access to the OpenAI API.
     */
    private final OpenAiApi openAiApi;

    /**
     * Observation registry used for instrumentation.
     */
    private final ObservationRegistry observationRegistry;

    private final ToolCallingManager toolCallingManager;

    /**
     * Conventions to use for generating observations.
     */
    private ChatModelObservationConvention observationConvention = DEFAULT_OBSERVATION_CONVENTION;

    /**
     * Creates an instance of the AlibabaOpenAiChatModel.
     * @param openAiApi The OpenAiApi instance to be used for interacting with the OpenAI Chat API.
     * @throws IllegalArgumentException if openAiApi is null
     * @deprecated Use AlibabaOpenAiChatModel.Builder.
     */
    @Deprecated
    public AlibabaOpenAiChatModel(OpenAiApi openAiApi) {
        this(openAiApi, OpenAiChatOptions.builder().model(OpenAiApi.DEFAULT_CHAT_MODEL).temperature(0.7).build());
    }

    /**
     * Initializes an instance of the AlibabaOpenAiChatModel.
     * @param openAiApi The OpenAiApi instance to be used for interacting with the OpenAI Chat API.
     * @param options The OpenAiChatOptions to configure the chat model.
     * @deprecated Use AlibabaOpenAiChatModel.Builder.
     */
    @Deprecated
    public AlibabaOpenAiChatModel(OpenAiApi openAiApi, OpenAiChatOptions options) {
        this(openAiApi, options, null, RetryUtils.DEFAULT_RETRY_TEMPLATE);
    }

    /**
     * Initializes a new instance of the AlibabaOpenAiChatModel.
     * @param openAiApi The OpenAiApi instance to be used for interacting with the OpenAI Chat API.
     * @param options The OpenAiChatOptions to configure the chat model.
     * @param functionCallbackResolver The function callback resolver.
     * @param retryTemplate The retry template.
     * @deprecated Use AlibabaOpenAiChatModel.Builder.
     */
    @Deprecated
    public AlibabaOpenAiChatModel(OpenAiApi openAiApi, OpenAiChatOptions options,
            @Nullable FunctionCallbackResolver functionCallbackResolver, RetryTemplate retryTemplate) {
        this(openAiApi, options, functionCallbackResolver, List.of(), retryTemplate);
    }

    /**
     * Initializes a new instance of the AlibabaOpenAiChatModel.
     * @param openAiApi The OpenAiApi instance to be used for interacting with the OpenAI Chat API.
     * @param options The OpenAiChatOptions to configure the chat model.
     * @param functionCallbackResolver The function callback resolver.
     * @param toolFunctionCallbacks The tool function callbacks.
     * @param retryTemplate The retry template.
     * @deprecated Use AlibabaOpenAiChatModel.Builder.
     */
    @Deprecated
    public AlibabaOpenAiChatModel(OpenAiApi openAiApi, OpenAiChatOptions options,
            @Nullable FunctionCallbackResolver functionCallbackResolver,
            @Nullable List<FunctionCallback> toolFunctionCallbacks, RetryTemplate retryTemplate) {
        this(openAiApi, options, functionCallbackResolver, toolFunctionCallbacks, retryTemplate,
                ObservationRegistry.NOOP);
    }

    /**
     * Initializes a new instance of the AlibabaOpenAiChatModel.
     * @param openAiApi The OpenAiApi instance to be used for interacting with the OpenAI Chat API.
     * @param options The OpenAiChatOptions to configure the chat model.
     * @param functionCallbackResolver The function callback resolver.
     * @param toolFunctionCallbacks The tool function callbacks.
     * @param retryTemplate The retry template.
     * @param observationRegistry The ObservationRegistry used for instrumentation.
     * @deprecated Use AlibabaOpenAiChatModel.Builder or AlibabaOpenAiChatModel(OpenAiApi,
     * OpenAiChatOptions, ToolCallingManager, RetryTemplate, ObservationRegistry).
     */
    @Deprecated
    public AlibabaOpenAiChatModel(OpenAiApi openAiApi, OpenAiChatOptions options,
            @Nullable FunctionCallbackResolver functionCallbackResolver,
            @Nullable List<FunctionCallback> toolFunctionCallbacks, RetryTemplate retryTemplate,
            ObservationRegistry observationRegistry) {
        this(openAiApi, options,
                LegacyToolCallingManager.builder()
                    .functionCallbackResolver(functionCallbackResolver)
                    .functionCallbacks(toolFunctionCallbacks)
                    .build(),
                retryTemplate, observationRegistry);
        logger.warn("This constructor is deprecated and will be removed in the next milestone. "
                + "Please use the AlibabaOpenAiChatModel.Builder or the new constructor accepting ToolCallingManager instead.");
    }

    public AlibabaOpenAiChatModel(OpenAiApi openAiApi, OpenAiChatOptions defaultOptions,
            ToolCallingManager toolCallingManager, RetryTemplate retryTemplate,
            ObservationRegistry observationRegistry) {
        // We do not pass the 'defaultOptions' to the AbstractToolSupport,
        // because it modifies them. We are using ToolCallingManager instead,
        // so we just pass empty options here.
        super(null, OpenAiChatOptions.builder().build(), List.of());
        Assert.notNull(openAiApi, "openAiApi cannot be null");
        Assert.notNull(defaultOptions, "defaultOptions cannot be null");
        Assert.notNull(toolCallingManager, "toolCallingManager cannot be null");
        Assert.notNull(retryTemplate, "retryTemplate cannot be null");
        Assert.notNull(observationRegistry, "observationRegistry cannot be null");
        this.openAiApi = openAiApi;
        this.defaultOptions = defaultOptions;
        this.toolCallingManager = toolCallingManager;
        this.retryTemplate = retryTemplate;
        this.observationRegistry = observationRegistry;
    }

    @Override
    public ChatResponse call(Prompt prompt) {
        // Before moving any further, build the final request Prompt,
        // merging runtime and default options.
        Prompt requestPrompt = buildRequestPrompt(prompt);
        return this.internalCall(requestPrompt, null);
    }

    public ChatResponse internalCall(Prompt prompt, ChatResponse previousChatResponse) {
        OpenAiApi.ChatCompletionRequest request = createRequest(prompt, false);

        ChatModelObservationContext observationContext = ChatModelObservationContext.builder()
            .prompt(prompt)
            .provider(OpenAiApiConstants.PROVIDER_NAME)
            .requestOptions(prompt.getOptions())
            .build();

        ChatResponse response = ChatModelObservationDocumentation.CHAT_MODEL_OPERATION
            .observation(this.observationConvention, DEFAULT_OBSERVATION_CONVENTION, () -> observationContext,
                    this.observationRegistry)
            .observe(() -> {
                ResponseEntity<OpenAiApi.ChatCompletion> completionEntity = this.retryTemplate
                    .execute(ctx -> this.openAiApi.chatCompletionEntity(request, getAdditionalHttpHeaders(prompt)));

                var chatCompletion = completionEntity.getBody();
                if (chatCompletion == null) {
                    logger.warn("No chat completion returned for prompt: {}", prompt);
                    return new ChatResponse(List.of());
                }

                List<OpenAiApi.ChatCompletion.Choice> choices = chatCompletion.choices();
                if (choices == null) {
                    logger.warn("No choices returned for prompt: {}", prompt);
                    return new ChatResponse(List.of());
                }

                List<Generation> generations = choices.stream().map(choice -> {
                    // @formatter:off
                    Map<String, Object> metadata = Map.of(
                            "id", chatCompletion.id() != null ? chatCompletion.id() : "",
                            "role", choice.message().role() != null ? choice.message().role().name() : "",
                            "index", choice.index(),
                            "finishReason", choice.finishReason() != null ? choice.finishReason().name() : "",
                            "refusal", StringUtils.hasText(choice.message().refusal()) ? choice.message().refusal() : "");
                    // @formatter:on
                    return buildGeneration(choice, metadata, request);
                }).toList();

                RateLimit rateLimit = OpenAiResponseHeaderExtractor.extractAiResponseHeaders(completionEntity);

                // Current usage
                OpenAiApi.Usage usage = completionEntity.getBody().usage();
                Usage currentChatResponseUsage = usage != null ? getDefaultUsage(usage) : new EmptyUsage();
                Usage accumulatedUsage = UsageUtils.getCumulativeUsage(currentChatResponseUsage, previousChatResponse);

                ChatResponse chatResponse = new ChatResponse(generations,
                        from(completionEntity.getBody(), rateLimit, accumulatedUsage));
                observationContext.setResponse(chatResponse);
                return chatResponse;
            });

        if (ToolCallingChatOptions.isInternalToolExecutionEnabled(prompt.getOptions()) && response != null
                && response.hasToolCalls()) {
            var toolExecutionResult = this.toolCallingManager.executeToolCalls(prompt, response);
            if (toolExecutionResult.returnDirect()) {
                // Return tool execution result directly to the client.
                return ChatResponse.builder()
                    .from(response)
                    .generations(ToolExecutionResult.buildGenerations(toolExecutionResult))
                    .build();
            }
            else {
                // Send the tool execution result back to the model.
                return this.internalCall(new Prompt(toolExecutionResult.conversationHistory(), prompt.getOptions()),
                        response);
            }
        }

        return response;
    }

    @Override
    public Flux<ChatResponse> stream(Prompt prompt) {
        // Before moving any further, build the final request Prompt,
        // merging runtime and default options.
        Prompt requestPrompt = buildRequestPrompt(prompt);
        return internalStream(requestPrompt, null);
    }

    public Flux<ChatResponse> internalStream(Prompt prompt, ChatResponse previousChatResponse) {
        return Flux.deferContextual(contextView -> {
            OpenAiApi.ChatCompletionRequest request = createRequest(prompt, true);

            if (request.outputModalities() != null) {
                if (request.outputModalities().stream().anyMatch(m -> m.equals("audio"))) {
                    logger.warn("Audio output is not supported for streaming requests. Removing audio output.");
                    throw new IllegalArgumentException("Audio output is not supported for streaming requests.");
                }
            }
            if (request.audioParameters() != null) {
                logger.warn("Audio parameters are not supported for streaming requests. Removing audio parameters.");
                throw new IllegalArgumentException("Audio parameters are not supported for streaming requests.");
            }

            Flux<OpenAiApi.ChatCompletionChunk> completionChunks = this.openAiApi.chatCompletionStream(request,
                    getAdditionalHttpHeaders(prompt));

            // For chunked responses, only the first chunk contains the choice role.
            // The rest of the chunks with same ID share the same role.
            ConcurrentHashMap<String, String> roleMap = new ConcurrentHashMap<>();

            final ChatModelObservationContext observationContext = ChatModelObservationContext.builder()
                .prompt(prompt)
                .provider(OpenAiApiConstants.PROVIDER_NAME)
                .requestOptions(prompt.getOptions())
                .build();

            Observation observation = ChatModelObservationDocumentation.CHAT_MODEL_OPERATION.observation(
                    this.observationConvention, DEFAULT_OBSERVATION_CONVENTION, () -> observationContext,
                    this.observationRegistry);

            observation.parentObservation(contextView.getOrDefault(ObservationThreadLocalAccessor.KEY, null)).start();

            // Convert the ChatCompletionChunk into a ChatCompletion to be able to reuse
            // the function call handling logic.
            Flux<ChatResponse> chatResponse = completionChunks.map(this::chunkToChatCompletion)
                .switchMap(chatCompletion -> Mono.just(chatCompletion).map(chatCompletion2 -> {
                    try {
                        @SuppressWarnings("null")
                        String id = chatCompletion2.id();

                        List<Generation> generations = chatCompletion2.choices().stream().map(choice -> {
                            // @formatter:off
                            if (choice.message().role() != null) {
                                roleMap.putIfAbsent(id, choice.message().role().name());
                            }
                            Map<String, Object> metadata = Map.of(
                                    "id", chatCompletion2.id(),
                                    "role", roleMap.getOrDefault(id, ""),
                                    "index", choice.index(),
                                    "finishReason", choice.finishReason() != null ? choice.finishReason().name() : "",
                                    "refusal", StringUtils.hasText(choice.message().refusal()) ? choice.message().refusal() : "");
                            return buildGeneration(choice, metadata, request);
                        }).toList();
                        // @formatter:on
                        OpenAiApi.Usage usage = chatCompletion2.usage();
                        Usage currentChatResponseUsage = usage != null ?
getDefaultUsage(usage) : new EmptyUsage();Usage accumulatedUsage = UsageUtils.getCumulativeUsage(currentChatResponseUsage,previousChatResponse);return new ChatResponse(generations, from(chatCompletion2, null, accumulatedUsage));}catch (Exception e) {logger.error("Error processing chat completion", e);return new ChatResponse(List.of());}// When in stream mode and enabled to include the usage, the OpenAI// Chat completion response would have the usage set only in its// final response. Hence, the following overlapping buffer is// created to store both the current and the subsequent response// to accumulate the usage from the subsequent response.})).buffer(2, 1).map(bufferList -> {ChatResponse firstResponse = bufferList.get(0);if (request.streamOptions() != null && request.streamOptions().includeUsage()) {if (bufferList.size() == 2) {ChatResponse secondResponse = bufferList.get(1);if (secondResponse != null && secondResponse.getMetadata() != null) {// This is the usage from the final Chat response for a// given Chat request.Usage usage = secondResponse.getMetadata().getUsage();if (!UsageUtils.isEmpty(usage)) {// Store the usage from the final response to the// penultimate response for accumulation.return new ChatResponse(firstResponse.getResults(),from(firstResponse.getMetadata(), usage));}}}}return firstResponse;});// @formatter:offFlux<ChatResponse> flux = chatResponse.flatMap(response -> {if (ToolCallingChatOptions.isInternalToolExecutionEnabled(prompt.getOptions()) && response.hasToolCalls()) {var toolExecutionResult = this.toolCallingManager.executeToolCalls(prompt, response);if (toolExecutionResult.returnDirect()) {// Return tool execution result directly to the client.return Flux.just(ChatResponse.builder().from(response).generations(ToolExecutionResult.buildGenerations(toolExecutionResult)).build());} else {// Send the tool execution result back to the model.return this.internalStream(new Prompt(toolExecutionResult.conversationHistory(), 
prompt.getOptions()),response);}}else {return Flux.just(response);}}).doOnError(observation::error).doFinally(s -> observation.stop()).contextWrite(ctx -> ctx.put(ObservationThreadLocalAccessor.KEY, observation));// @formatter:onreturn new MessageAggregator().aggregate(flux, observationContext::setResponse);});}private MultiValueMap<String, String> getAdditionalHttpHeaders(Prompt prompt) {Map<String, String> headers = new HashMap<>(this.defaultOptions.getHttpHeaders());if (prompt.getOptions() != null && prompt.getOptions() instanceof OpenAiChatOptions chatOptions) {headers.putAll(chatOptions.getHttpHeaders());}return CollectionUtils.toMultiValueMap(headers.entrySet().stream().collect(Collectors.toMap(Map.Entry::getKey, e -> List.of(e.getValue()))));}private Generation buildGeneration(OpenAiApi.ChatCompletion.Choice choice, Map<String, Object> metadata, OpenAiApi.ChatCompletionRequest request) {List<AssistantMessage.ToolCall> toolCalls = choice.message().toolCalls() == null ? List.of(): choice.message().toolCalls().stream().map(toolCall -> new AssistantMessage.ToolCall(toolCall.id(), "function",toolCall.function().name(), toolCall.function().arguments())).reduce((tc1, tc2) -> new AssistantMessage.ToolCall(tc1.id(), "function", tc1.name(), tc1.arguments() + tc2.arguments())).stream().toList();String finishReason = (choice.finishReason() != null ? 
choice.finishReason().name() : "");var generationMetadataBuilder = ChatGenerationMetadata.builder().finishReason(finishReason);List<Media> media = new ArrayList<>();String textContent = choice.message().content();var audioOutput = choice.message().audioOutput();if (audioOutput != null) {String mimeType = String.format("audio/%s", request.audioParameters().format().name().toLowerCase());byte[] audioData = Base64.getDecoder().decode(audioOutput.data());Resource resource = new ByteArrayResource(audioData);Media.builder().mimeType(MimeTypeUtils.parseMimeType(mimeType)).data(resource).id(audioOutput.id()).build();media.add(Media.builder().mimeType(MimeTypeUtils.parseMimeType(mimeType)).data(resource).id(audioOutput.id()).build());if (!StringUtils.hasText(textContent)) {textContent = audioOutput.transcript();}generationMetadataBuilder.metadata("audioId", audioOutput.id());generationMetadataBuilder.metadata("audioExpiresAt", audioOutput.expiresAt());}var assistantMessage = new AssistantMessage(textContent, metadata, toolCalls, media);return new Generation(assistantMessage, generationMetadataBuilder.build());}private ChatResponseMetadata from(OpenAiApi.ChatCompletion result, RateLimit rateLimit, Usage usage) {Assert.notNull(result, "OpenAI ChatCompletionResult must not be null");var builder = ChatResponseMetadata.builder().id(result.id() != null ? result.id() : "").usage(usage).model(result.model() != null ? result.model() : "").keyValue("created", result.created() != null ? result.created() : 0L).keyValue("system-fingerprint", result.systemFingerprint() != null ? result.systemFingerprint() : "");if (rateLimit != null) {builder.rateLimit(rateLimit);}return builder.build();}private ChatResponseMetadata from(ChatResponseMetadata chatResponseMetadata, Usage usage) {Assert.notNull(chatResponseMetadata, "OpenAI ChatResponseMetadata must not be null");var builder = ChatResponseMetadata.builder().id(chatResponseMetadata.getId() != null ? 
chatResponseMetadata.getId() : "").usage(usage).model(chatResponseMetadata.getModel() != null ? chatResponseMetadata.getModel() : "");if (chatResponseMetadata.getRateLimit() != null) {builder.rateLimit(chatResponseMetadata.getRateLimit());}return builder.build();}/*** Convert the ChatCompletionChunk into a ChatCompletion. The Usage is set to null.* @param chunk the ChatCompletionChunk to convert* @return the ChatCompletion*/private OpenAiApi.ChatCompletion chunkToChatCompletion(OpenAiApi.ChatCompletionChunk chunk) {List<OpenAiApi.ChatCompletion.Choice> choices = chunk.choices().stream().map(chunkChoice -> new OpenAiApi.ChatCompletion.Choice(chunkChoice.finishReason(), chunkChoice.index(), chunkChoice.delta(),chunkChoice.logprobs())).toList();return new OpenAiApi.ChatCompletion(chunk.id(), choices, chunk.created(), chunk.model(), chunk.serviceTier(),chunk.systemFingerprint(), "chat.completion", chunk.usage());}private DefaultUsage getDefaultUsage(OpenAiApi.Usage usage) {return new DefaultUsage(usage.promptTokens(), usage.completionTokens(), usage.totalTokens(), usage);}Prompt buildRequestPrompt(Prompt prompt) {// Process runtime optionsOpenAiChatOptions runtimeOptions = null;if (prompt.getOptions() != null) {if (prompt.getOptions() instanceof ToolCallingChatOptions toolCallingChatOptions) {runtimeOptions = ModelOptionsUtils.copyToTarget(toolCallingChatOptions, ToolCallingChatOptions.class,OpenAiChatOptions.class);}else if (prompt.getOptions() instanceof FunctionCallingOptions functionCallingOptions) {runtimeOptions = ModelOptionsUtils.copyToTarget(functionCallingOptions, FunctionCallingOptions.class,OpenAiChatOptions.class);}else {runtimeOptions = ModelOptionsUtils.copyToTarget(prompt.getOptions(), ChatOptions.class,OpenAiChatOptions.class);}}// Define request options by merging runtime options and default optionsOpenAiChatOptions requestOptions = ModelOptionsUtils.merge(runtimeOptions, this.defaultOptions,OpenAiChatOptions.class);// Merge @JsonIgnore-annotated 
options explicitly since they are ignored by// Jackson, used by ModelOptionsUtils.if (runtimeOptions != null) {requestOptions.setHttpHeaders(mergeHttpHeaders(runtimeOptions.getHttpHeaders(), this.defaultOptions.getHttpHeaders()));requestOptions.setInternalToolExecutionEnabled(ModelOptionsUtils.mergeOption(runtimeOptions.isInternalToolExecutionEnabled(),this.defaultOptions.isInternalToolExecutionEnabled()));requestOptions.setToolNames(ToolCallingChatOptions.mergeToolNames(runtimeOptions.getToolNames(),this.defaultOptions.getToolNames()));requestOptions.setToolCallbacks(ToolCallingChatOptions.mergeToolCallbacks(runtimeOptions.getToolCallbacks(),this.defaultOptions.getToolCallbacks()));requestOptions.setToolContext(ToolCallingChatOptions.mergeToolContext(runtimeOptions.getToolContext(),this.defaultOptions.getToolContext()));}else {requestOptions.setHttpHeaders(this.defaultOptions.getHttpHeaders());requestOptions.setInternalToolExecutionEnabled(this.defaultOptions.isInternalToolExecutionEnabled());requestOptions.setToolNames(this.defaultOptions.getToolNames());requestOptions.setToolCallbacks(this.defaultOptions.getToolCallbacks());requestOptions.setToolContext(this.defaultOptions.getToolContext());}ToolCallingChatOptions.validateToolCallbacks(requestOptions.getToolCallbacks());return new Prompt(prompt.getInstructions(), requestOptions);}private Map<String, String> mergeHttpHeaders(Map<String, String> runtimeHttpHeaders,Map<String, String> defaultHttpHeaders) {var mergedHttpHeaders = new HashMap<>(defaultHttpHeaders);mergedHttpHeaders.putAll(runtimeHttpHeaders);return mergedHttpHeaders;}/*** Accessible for testing.*/OpenAiApi.ChatCompletionRequest createRequest(Prompt prompt, boolean stream) {List<OpenAiApi.ChatCompletionMessage> chatCompletionMessages = prompt.getInstructions().stream().map(message -> {if (message.getMessageType() == MessageType.USER || message.getMessageType() == MessageType.SYSTEM) {Object content = message.getText();if (message instanceof 
UserMessage userMessage) {if (!CollectionUtils.isEmpty(userMessage.getMedia())) {List<OpenAiApi.ChatCompletionMessage.MediaContent> contentList = new ArrayList<>(List.of(new OpenAiApi.ChatCompletionMessage.MediaContent(message.getText())));contentList.addAll(userMessage.getMedia().stream().map(this::mapToMediaContent).toList());content = contentList;}}return List.of(new OpenAiApi.ChatCompletionMessage(content,OpenAiApi.ChatCompletionMessage.Role.valueOf(message.getMessageType().name())));}else if (message.getMessageType() == MessageType.ASSISTANT) {var assistantMessage = (AssistantMessage) message;List<OpenAiApi.ChatCompletionMessage.ToolCall> toolCalls = null;if (!CollectionUtils.isEmpty(assistantMessage.getToolCalls())) {toolCalls = assistantMessage.getToolCalls().stream().map(toolCall -> {var function = new OpenAiApi.ChatCompletionMessage.ChatCompletionFunction(toolCall.name(), toolCall.arguments());return new OpenAiApi.ChatCompletionMessage.ToolCall(toolCall.id(), toolCall.type(), function);}).toList();}OpenAiApi.ChatCompletionMessage.AudioOutput audioOutput = null;if (!CollectionUtils.isEmpty(assistantMessage.getMedia())) {Assert.isTrue(assistantMessage.getMedia().size() == 1,"Only one media content is supported for assistant messages");audioOutput = new OpenAiApi.ChatCompletionMessage.AudioOutput(assistantMessage.getMedia().get(0).getId(), null, null, null);}return List.of(new OpenAiApi.ChatCompletionMessage(assistantMessage.getText(),OpenAiApi.ChatCompletionMessage.Role.ASSISTANT, null, null, toolCalls, null, audioOutput));}else if (message.getMessageType() == MessageType.TOOL) {ToolResponseMessage toolMessage = (ToolResponseMessage) message;toolMessage.getResponses().forEach(response -> Assert.isTrue(response.id() != null, "ToolResponseMessage must have an id"));return toolMessage.getResponses().stream().map(tr -> new OpenAiApi.ChatCompletionMessage(tr.responseData(), OpenAiApi.ChatCompletionMessage.Role.TOOL, tr.name(),tr.id(), null, null, 
null)).toList();}else {throw new IllegalArgumentException("Unsupported message type: " + message.getMessageType());}}).flatMap(List::stream).toList();OpenAiApi.ChatCompletionRequest request = new OpenAiApi.ChatCompletionRequest(chatCompletionMessages, stream);OpenAiChatOptions requestOptions = (OpenAiChatOptions) prompt.getOptions();request = ModelOptionsUtils.merge(requestOptions, request, OpenAiApi.ChatCompletionRequest.class);// Add the tool definitions to the request's tools parameter.List<ToolDefinition> toolDefinitions = this.toolCallingManager.resolveToolDefinitions(requestOptions);if (!CollectionUtils.isEmpty(toolDefinitions)) {request = ModelOptionsUtils.merge(OpenAiChatOptions.builder().tools(this.getFunctionTools(toolDefinitions)).build(), request,OpenAiApi.ChatCompletionRequest.class);}// Remove `streamOptions` from the request if it is not a streaming requestif (request.streamOptions() != null && !stream) {logger.warn("Removing streamOptions from the request as it is not a streaming request!");request = request.streamOptions(null);}return request;}private OpenAiApi.ChatCompletionMessage.MediaContent mapToMediaContent(Media media) {var mimeType = media.getMimeType();if (MimeTypeUtils.parseMimeType("audio/mp3").equals(mimeType) || MimeTypeUtils.parseMimeType("audio/mpeg").equals(mimeType)) {return new OpenAiApi.ChatCompletionMessage.MediaContent(new OpenAiApi.ChatCompletionMessage.MediaContent.InputAudio(fromAudioData(media.getData()), OpenAiApi.ChatCompletionMessage.MediaContent.InputAudio.Format.MP3));}if (MimeTypeUtils.parseMimeType("audio/wav").equals(mimeType)) {return new OpenAiApi.ChatCompletionMessage.MediaContent(new OpenAiApi.ChatCompletionMessage.MediaContent.InputAudio(fromAudioData(media.getData()), OpenAiApi.ChatCompletionMessage.MediaContent.InputAudio.Format.WAV));}else {return new OpenAiApi.ChatCompletionMessage.MediaContent(new OpenAiApi.ChatCompletionMessage.MediaContent.ImageUrl(this.fromMediaData(media.getMimeType(), 
media.getData())));}}private String fromAudioData(Object audioData) {if (audioData instanceof byte[] bytes) {return String.format("data:;base64,%s", Base64.getEncoder().encodeToString(bytes));}throw new IllegalArgumentException("Unsupported audio data type: " + audioData.getClass().getSimpleName());}private String fromMediaData(MimeType mimeType, Object mediaContentData) {if (mediaContentData instanceof byte[] bytes) {// Assume the bytes are an image. So, convert the bytes to a base64 encoded// following the prefix pattern.return String.format("data:%s;base64,%s", mimeType.toString(), Base64.getEncoder().encodeToString(bytes));}else if (mediaContentData instanceof String text) {// Assume the text is a URLs or a base64 encoded image prefixed by the user.return text;}else {throw new IllegalArgumentException("Unsupported media data type: " + mediaContentData.getClass().getSimpleName());}}private List<OpenAiApi.FunctionTool> getFunctionTools(List<ToolDefinition> toolDefinitions) {return toolDefinitions.stream().map(toolDefinition -> {var function = new OpenAiApi.FunctionTool.Function(toolDefinition.description(), toolDefinition.name(),toolDefinition.inputSchema());return new OpenAiApi.FunctionTool(function);}).toList();}@Overridepublic ChatOptions getDefaultOptions() {return OpenAiChatOptions.fromOptions(this.defaultOptions);}@Overridepublic String toString() {return "AlibabaOpenAiChatModel [defaultOptions=" + this.defaultOptions + "]";}/*** Use the provided convention for reporting observation data* @param observationConvention The provided convention*/public void setObservationConvention(ChatModelObservationConvention observationConvention) {Assert.notNull(observationConvention, "observationConvention cannot be null");this.observationConvention = observationConvention;}public static AlibabaOpenAiChatModel.Builder builder() {return new AlibabaOpenAiChatModel.Builder();}public static final class Builder {private OpenAiApi openAiApi;private OpenAiChatOptions 
defaultOptions = OpenAiChatOptions.builder().model(OpenAiApi.DEFAULT_CHAT_MODEL).temperature(0.7).build();private ToolCallingManager toolCallingManager;private FunctionCallbackResolver functionCallbackResolver;private List<FunctionCallback> toolFunctionCallbacks;private RetryTemplate retryTemplate = RetryUtils.DEFAULT_RETRY_TEMPLATE;private ObservationRegistry observationRegistry = ObservationRegistry.NOOP;private Builder() {}public AlibabaOpenAiChatModel.Builder openAiApi(OpenAiApi openAiApi) {this.openAiApi = openAiApi;return this;}public AlibabaOpenAiChatModel.Builder defaultOptions(OpenAiChatOptions defaultOptions) {this.defaultOptions = defaultOptions;return this;}public AlibabaOpenAiChatModel.Builder toolCallingManager(ToolCallingManager toolCallingManager) {this.toolCallingManager = toolCallingManager;return this;}@Deprecatedpublic AlibabaOpenAiChatModel.Builder functionCallbackResolver(FunctionCallbackResolver functionCallbackResolver) {this.functionCallbackResolver = functionCallbackResolver;return this;}@Deprecatedpublic AlibabaOpenAiChatModel.Builder toolFunctionCallbacks(List<FunctionCallback> toolFunctionCallbacks) {this.toolFunctionCallbacks = toolFunctionCallbacks;return this;}public AlibabaOpenAiChatModel.Builder retryTemplate(RetryTemplate retryTemplate) {this.retryTemplate = retryTemplate;return this;}public AlibabaOpenAiChatModel.Builder observationRegistry(ObservationRegistry observationRegistry) {this.observationRegistry = observationRegistry;return this;}public AlibabaOpenAiChatModel build() {if (toolCallingManager != null) {Assert.isNull(functionCallbackResolver,"functionCallbackResolver cannot be set when toolCallingManager is set");Assert.isNull(toolFunctionCallbacks,"toolFunctionCallbacks cannot be set when toolCallingManager is set");return new AlibabaOpenAiChatModel(openAiApi, defaultOptions, toolCallingManager, retryTemplate,observationRegistry);}if (functionCallbackResolver != null) {Assert.isNull(toolCallingManager,"toolCallingManager 
cannot be set when functionCallbackResolver is set");List<FunctionCallback> toolCallbacks = this.toolFunctionCallbacks != null ? this.toolFunctionCallbacks: List.of();return new AlibabaOpenAiChatModel(openAiApi, defaultOptions, functionCallbackResolver, toolCallbacks,retryTemplate, observationRegistry);}return new AlibabaOpenAiChatModel(openAiApi, defaultOptions, DEFAULT_TOOL_CALLING_MANAGER, retryTemplate,observationRegistry);}}}

Next, we need to register AlibabaOpenAiChatModel as a bean in the Spring container.

Modify CommonConfiguration and add the following bean definition:

@Bean
public AlibabaOpenAiChatModel alibabaOpenAiChatModel(OpenAiConnectionProperties commonProperties,
        OpenAiChatProperties chatProperties,
        ObjectProvider<RestClient.Builder> restClientBuilderProvider,
        ObjectProvider<WebClient.Builder> webClientBuilderProvider,
        ToolCallingManager toolCallingManager,
        RetryTemplate retryTemplate,
        ResponseErrorHandler responseErrorHandler,
        ObjectProvider<ObservationRegistry> observationRegistry,
        ObjectProvider<ChatModelObservationConvention> observationConvention) {
    String baseUrl = StringUtils.hasText(chatProperties.getBaseUrl())
            ? chatProperties.getBaseUrl() : commonProperties.getBaseUrl();
    String apiKey = StringUtils.hasText(chatProperties.getApiKey())
            ? chatProperties.getApiKey() : commonProperties.getApiKey();
    String projectId = StringUtils.hasText(chatProperties.getProjectId())
            ? chatProperties.getProjectId() : commonProperties.getProjectId();
    String organizationId = StringUtils.hasText(chatProperties.getOrganizationId())
            ? chatProperties.getOrganizationId() : commonProperties.getOrganizationId();

    Map<String, List<String>> connectionHeaders = new HashMap<>();
    if (StringUtils.hasText(projectId)) {
        connectionHeaders.put("OpenAI-Project", List.of(projectId));
    }
    if (StringUtils.hasText(organizationId)) {
        connectionHeaders.put("OpenAI-Organization", List.of(organizationId));
    }

    RestClient.Builder restClientBuilder = restClientBuilderProvider.getIfAvailable(RestClient::builder);
    WebClient.Builder webClientBuilder = webClientBuilderProvider.getIfAvailable(WebClient::builder);

    OpenAiApi openAiApi = OpenAiApi.builder()
        .baseUrl(baseUrl)
        .apiKey(new SimpleApiKey(apiKey))
        .headers(CollectionUtils.toMultiValueMap(connectionHeaders))
        .completionsPath(chatProperties.getCompletionsPath())
        .embeddingsPath("/v1/embeddings")
        .restClientBuilder(restClientBuilder)
        .webClientBuilder(webClientBuilder)
        .responseErrorHandler(responseErrorHandler)
        .build();

    AlibabaOpenAiChatModel chatModel = AlibabaOpenAiChatModel.builder()
        .openAiApi(openAiApi)
        .defaultOptions(chatProperties.getOptions())
        .toolCallingManager(toolCallingManager)
        .retryTemplate(retryTemplate)
        .observationRegistry(observationRegistry.getIfUnique(() -> ObservationRegistry.NOOP))
        .build();
    observationConvention.ifAvailable(chatModel::setObservationConvention);
    return chatModel;
}

Finally, update the existing ChatClient beans to use the custom AlibabaOpenAiChatModel.

Modify the ChatClient configuration in CommonConfiguration:

@Bean
public ChatClient chatClient(AlibabaOpenAiChatModel model, ChatMemory chatMemory) {
    return ChatClient.builder(model) // create a ChatClient builder
            .defaultOptions(ChatOptions.builder().model("qwen-omni-turbo").build())
            .defaultSystem("請以友好、樂于助人和愉快的方式解答用戶的各種問題。")
            .defaultAdvisors(new SimpleLoggerAdvisor()) // default advisor for request/response logging
            .defaultAdvisors(new MessageChatMemoryAdvisor(chatMemory))
            .build(); // build the ChatClient instance
}

@Bean
public ChatClient serviceChatClient(AlibabaOpenAiChatModel model,
                                    ChatMemory chatMemory,
                                    CourseTools courseTools) {
    return ChatClient.builder(model)
            .defaultSystem(CUSTOMER_SERVICE_SYSTEM)
            .defaultAdvisors(
                    new MessageChatMemoryAdvisor(chatMemory), // chat memory
                    new SimpleLoggerAdvisor())
            .defaultTools(courseTools)
            .build();
}

4.RAG

Because training a large model is extremely time-consuming, and its training corpus inevitably lags behind, large models have inherent knowledge limitations:

  • Their knowledge is often several months out of date

  • They do not cover highly specialized domains or an enterprise's private data

To solve these problems, we need RAG (Retrieval-Augmented Generation).

4.1 How RAG Works

The idea is to attach an external knowledge base to the model, containing either specialized domain knowledge or an enterprise's private data. A knowledge base is usually very large, while a model's context window is limited: early GPT models could not exceed 2,000 tokens, and even today's models are generally capped below 200k tokens, so the knowledge base cannot be written directly into the prompt. Instead, we need to find the small portion of the huge knowledge base that is relevant to the user's question, assemble it into the prompt, and send that to the model.

4.1.1 Embedding Models

A vector is a quantity in space with both direction and magnitude; the space can be two-dimensional or higher. Since vectors live in a space, a distance can always be computed between any two of them. Taking 2D vectors as an example, there are two common ways to measure the distance between vectors:

Usually, the smaller the Euclidean distance between two vectors, the more similar we consider them. (Cosine distance works the other way around: the larger the value, the higher the similarity.) So if we can convert text into vectors, we can judge text similarity by vector distance. There are now quite a few dedicated embedding models that can vectorize text. A good embedding model maps texts with similar meanings to vectors that lie close together in the space.
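The two distance measures can be sketched in plain Java as follows; the example vectors are made up purely for illustration.

```java
public class VectorDistance {

    // Euclidean distance: smaller means more similar
    public static double euclidean(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity: larger (closer to 1) means more similar
    public static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Pretend these are embeddings of three words
        double[] cat = {0.9, 0.1};
        double[] kitten = {0.85, 0.2};
        double[] car = {0.1, 0.95};
        System.out.println(cosine(cat, kitten) > cosine(cat, car));       // true: "kitten" is closer to "cat"
        System.out.println(euclidean(cat, kitten) < euclidean(cat, car)); // true: same conclusion by distance
    }
}
```

A real embedding model produces vectors with hundreds or thousands of dimensions, but the distance math is exactly the same.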

Next, let's prepare an embedding model for vectorizing text.

Alibaba Cloud's Bailian platform provides exactly such models:

Here we choose 通用文本向量-v3 (text-embedding-v3). Since this model is OpenAI-compatible, we can keep using the OpenAI configuration.

Modify the configuration file to add the embedding model:

spring:
  ai:
    openai:
      base-url: ${spring.ai.openai.base-url}
      api-key: ${spring.ai.openai.api-key}
      chat:
        options:
          model: qwen-max-latest
      embedding:
        options:
          model: text-embedding-v3
          dimensions: 1024
    vectorstore:
      redis:
        initialize-schema: true
        index: 0
        prefix: "doc:" # key prefix for vector store entries

4.1.2 Vector Databases

A vector database serves two main purposes:

  • Storing vector data

  • Retrieving data by similarity

This is exactly what we need.
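As a minimal sketch of these two responsibilities (storage plus similarity search), here is a toy in-memory store in plain Java. All names here are illustrative, not a real vector database API; production systems add persistence, approximate-nearest-neighbor indexes, and metadata filtering on top of the same idea.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TinyVectorStore {

    private final Map<String, double[]> store = new LinkedHashMap<>();

    // Purpose 1: store vector data under an id
    public void add(String id, double[] vector) {
        store.put(id, vector);
    }

    // Purpose 2: retrieve the topK ids most similar to the query vector (by cosine similarity)
    public List<String> similaritySearch(double[] query, int topK) {
        return store.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, double[]> e) -> -cosine(e.getValue(), query)))
                .limit(topK)
                .map(Map.Entry::getKey)
                .toList();
    }

    private static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb) + 1e-12);
    }

    public static void main(String[] args) {
        TinyVectorStore store = new TinyVectorStore();
        store.add("doc:1", new double[]{1.0, 0.0});
        store.add("doc:2", new double[]{0.0, 1.0});
        store.add("doc:3", new double[]{0.9, 0.1});
        System.out.println(store.similaritySearch(new double[]{1.0, 0.0}, 2)); // prints [doc:1, doc:3]
    }
}
```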

SpringAI supports many vector databases, all wrapped behind a unified API:

  • Azure Vector Search - The Azure vector store.

  • Apache Cassandra - The Apache Cassandra vector store.

  • Chroma Vector Store - The Chroma vector store.

  • Elasticsearch Vector Store - The Elasticsearch vector store.

  • GemFire Vector Store - The GemFire vector store.

  • MariaDB Vector Store - The MariaDB vector store.

  • Milvus Vector Store - The Milvus vector store.

  • MongoDB Atlas Vector Store - The MongoDB Atlas vector store.

  • Neo4j Vector Store - The Neo4j vector store.

  • OpenSearch Vector Store - The OpenSearch vector store.

  • Oracle Vector Store - The Oracle Database vector store.

  • PgVector Store - The PostgreSQL/PGVector vector store.

  • Pinecone Vector Store - PineCone vector store.

  • Qdrant Vector Store - Qdrant vector store.

  • Redis Vector Store - The Redis vector store.

  • SAP Hana Vector Store - The SAP HANA vector store.

  • Typesense Vector Store - The Typesense vector store.

  • Weaviate Vector Store - The Weaviate vector store.

  • SimpleVectorStore - A simple implementation of persistent vector storage, good for educational purposes.

All of these stores implement the same interface, VectorStore, so they are operated in exactly the same way: learn any one of them, and the rest will be no problem.

Note, however, that except for the last one (SimpleVectorStore), all of these vector databases require installation and deployment, and every company may use a different one.

4.2 VectorStore

The VectorStore interface:

public interface VectorStore extends DocumentWriter {

    default String getName() {
        return this.getClass().getSimpleName();
    }

    // Save documents to the vector store
    void add(List<Document> documents);

    // Delete documents by id
    void delete(List<String> idList);

    void delete(Filter.Expression filterExpression);

    default void delete(String filterExpression) { ... }

    // Retrieve documents by similarity to a query string
    List<Document> similaritySearch(String query);

    // Retrieve documents by a search request (query, topK, filters, ...)
    List<Document> similaritySearch(SearchRequest request);

    default <T> Optional<T> getNativeClient() {
        return Optional.empty();
    }
}

The basic unit that VectorStore works with is the Document: to use it, we split our knowledge base into individual Documents and then write them into the VectorStore.

Building a vector store backed by memory or Redis Stack:

// In-memory implementation (good for learning and testing; use either this bean or the Redis one below)
@Bean
public VectorStore vectorStore(OpenAiEmbeddingModel embeddingModel) {
    return SimpleVectorStore.builder(embeddingModel).build();
}

/**
 * Create a Redis Stack vector store
 *
 * @param embeddingModel the embedding model
 * @param properties     the redis-stack configuration
 * @return vectorStore the vector store
 */
@Bean
public VectorStore vectorStore(EmbeddingModel embeddingModel,
                               RedisVectorStoreProperties properties,
                               RedisConnectionDetails redisConnectionDetails) {
    JedisPooled jedisPooled = new JedisPooled(
            redisConnectionDetails.getStandalone().getHost(),
            redisConnectionDetails.getStandalone().getPort(),
            redisConnectionDetails.getUsername(),
            redisConnectionDetails.getPassword());
    return RedisVectorStore.builder(jedisPooled, embeddingModel)
            .indexName(properties.getIndex())
            .prefix(properties.getPrefix())
            .initializeSchema(properties.isInitializeSchema())
            .build();
}

Reading and converting files:

Since a knowledge base is large, it has to be split into document fragments before vectorization. Moreover, the vector store in SpringAI accepts documents of type Document, which means we must first convert our files into Documents before writing them in.

We don't need to implement document reading, splitting, and conversion ourselves. SpringAI ships with a variety of document readers; see the official docs: https://docs.spring.io/spring-ai/reference/api/etl-pipeline.html#_pdf_paragraph

For example, for reading and splitting PDF documents, SpringAI provides two default strategies:

  • PagePdfDocumentReader: splits by page; recommended

  • ParagraphPdfDocumentReader: splits by the PDF's table of contents; not recommended, because many PDFs are poorly structured and lack chapter tags

Of course, you can also implement your own PDF reading and splitting logic.

Here we use PagePdfDocumentReader.

First, add the dependency to pom.xml:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-pdf-document-reader</artifactId>
</dependency>

Then we can use this tool to read a PDF file and turn it into Documents.

Let's write a unit test (don't forget to configure the API key):

@Test
public void testVectorStore() {
    Resource resource = new FileSystemResource("中二知識筆記.pdf");
    // 1. Create the PDF reader
    PagePdfDocumentReader reader = new PagePdfDocumentReader(
            resource, // file source
            PdfDocumentReaderConfig.builder()
                    .withPageExtractedTextFormatter(ExtractedTextFormatter.defaults())
                    .withPagesPerDocument(1) // each PDF page becomes one Document
                    .build());
    // 2. Read the PDF and split it into Documents
    List<Document> documents = reader.read();
    // 3. Write them to the vector store
    vectorStore.add(documents);
    // 4. Search
    SearchRequest request = SearchRequest.builder()
            .query("論語中教育的目的是什么")
            .topK(1)
            .similarityThreshold(0.6)
            .filterExpression("file_name == '中二知識筆記.pdf'")
            .build();
    List<Document> docs = vectorStore.similaritySearch(request);
    if (docs == null) {
        System.out.println("Nothing found");
        return;
    }
    for (Document doc : docs) {
        System.out.println(doc.getId());
        System.out.println(doc.getScore());
        System.out.println(doc.getText());
    }
}

4.3 RAG Recap

OK, we now have these tools:

  • A PDF reader: reads a document and splits it into fragments

  • An embedding model: vectorizes text fragments

  • A vector database: stores and retrieves vectors

Let's recap the problems to solve and the approach:

  • To overcome the model's knowledge limitations, we attach an external knowledge base

  • Because of the model's context limit, the knowledge base cannot simply be concatenated into the prompt

  • We need to find the small subset of the huge knowledge base that is relevant to the user's question and assemble it into the prompt

  • All of this can be done with a document reader, an embedding model, and a vector database.

So what RAG does is split the knowledge base, vectorize it with the embedding model, store it in the vector database, and then retrieve from it at query time:

Phase 1 (indexing the knowledge base)

  • Split the knowledge base content into fragments

  • Vectorize each fragment with the embedding model

  • Write all the vectorized fragments into the vector database

Phase 2 (retrieval)

  • Whenever the user asks the AI a question, vectorize the question

  • Use the question vector to retrieve the most relevant fragments from the vector database

Phase 3 (chatting with the model)

  • Concatenate the retrieved fragments and the user's question into a prompt

  • Send the prompt to the model and receive the response
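The three phases above can be sketched end to end in plain Java. This is a minimal, self-contained illustration, not the SpringAI API: the toy `embed` method (a character-frequency vector) is a hypothetical stand-in for a real embedding model such as text-embedding-v3, and the in-memory retrieval stands in for a real vector store.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class MiniRag {

    // Phase 1: split the knowledge base into fixed-size fragments with some overlap
    public static List<String> split(String text, int size, int overlap) {
        List<String> chunks = new ArrayList<>();
        for (int start = 0; start < text.length(); start += size - overlap) {
            chunks.add(text.substring(start, Math.min(start + size, text.length())));
            if (start + size >= text.length()) break;
        }
        return chunks;
    }

    // Toy stand-in for a real embedding model: a character-frequency vector.
    public static double[] embed(String text) {
        double[] v = new double[64];
        for (char c : text.toCharArray()) {
            v[c % 64] += 1;
        }
        return v;
    }

    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb) + 1e-12);
    }

    // Phase 2: retrieve the chunk most similar to the question
    public static String retrieve(List<String> chunks, String question) {
        return chunks.stream()
                .max(Comparator.comparingDouble(c -> cosine(embed(c), embed(question))))
                .orElseThrow();
    }

    // Phase 3: assemble the retrieved context and the question into one prompt
    public static String buildPrompt(String context, String question) {
        return "Answer based on the following context:\n" + context + "\n\nQuestion: " + question;
    }

    public static void main(String[] args) {
        String kb = "Spring AI wraps chat models behind the ChatModel interface. "
                + "Vector stores implement the VectorStore interface and support similarity search. "
                + "RAG retrieves relevant fragments and adds them to the prompt.";
        List<String> chunks = split(kb, 80, 20);
        String question = "What interface do vector stores implement?";
        String prompt = buildPrompt(retrieve(chunks, question), question);
        System.out.println(prompt); // the final prompt sent to the chat model
    }
}
```

In a real application, `split` is done by a document reader, `embed` by an embedding model, `retrieve` by `VectorStore.similaritySearch`, and the final prompt is sent through a ChatClient.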

4.4 Example: An AI Literature-Reading Assistant

Let's implement an AI literature-reading assistant based on RAG.

Overall architecture:

4.4.1 PDF File Management

The file management interface:

public interface FileRepository {

    /**
     * Save a file and record the mapping between chatId and the file
     * @param chatId   the conversation id
     * @param resource the file
     * @return true if the upload succeeded, false otherwise
     */
    boolean save(String chatId, Resource resource);

    /**
     * Get the file for a conversation
     * @param chatId the conversation id
     * @return the file found
     */
    Resource getFile(String chatId);
}

@Slf4j
@Slf4j
@Component
public class LocalPdfFileRepository implements FileRepository {

    @Autowired
    private CommonFileServiceImpl commonFileService;
    @Autowired
    private PdfFileMappingMapper fileMappingMapper;
    @Autowired
    private FileUtil fileUtil;
    @Autowired
    private VectorStore vectorStore;

    /**
     * Save the file to MinIO and record the mapping in MySQL
     */
    @Override
    public boolean save(String chatId, Resource resource) {
        try {
            // Convert the Resource to a MultipartFile
            MultipartFile file = convertResourceToMultipartFile(resource);
            if (file == null) {
                log.error("File conversion failed, chatId: {}", chatId);
                return false;
            }
            // Upload to MinIO
            String fileUrl = commonFileService.upload(file);
            // Insert a new mapping record into the database
            FileMapping mapping = FileMapping.builder()
                    .chatId(chatId)
                    .fileName(file.getOriginalFilename())
                    .filePath(fileUrl)
                    .contentType(file.getContentType())
                    .build();
            int rows = fileMappingMapper.insert(mapping);
            return rows > 0;
        } catch (Exception e) {
            log.error("Failed to save file mapping, chatId: {}", chatId, e);
            return false;
        }
    }

    /**
     * Fetch the file from MinIO
     */
    @Override
    public Resource getFile(String chatId) {
        try {
            // Look up the file info in the database
            FileMapping mapping = fileMappingMapper.selectByChatId(chatId);
            if (mapping == null) {
                log.warn("File mapping not found, chatId: {}", chatId);
                return null;
            }
            // Download the file from MinIO
            String fileName = fileUtil.extractFileNameFromUrl(mapping.getFilePath());
            byte[] fileBytes = commonFileService.download(fileName);
            if (fileBytes == null || fileBytes.length == 0) {
                log.error("File content is empty, filePath: {}", mapping.getFilePath());
                return null;
            }
            // Wrap the bytes in a Resource and return it
            return new ByteArrayResource(fileBytes) {
                @Override
                public String getFilename() {
                    return mapping.getFileName();
                }

                @Override
                public long contentLength() {
                    return fileBytes.length;
                }
            };
        } catch (Exception e) {
            log.error("Failed to get file, chatId: {}", chatId, e);
            return null;
        }
    }

    /**
     * Convert a Resource to a MultipartFile (works around the private
     * implementation class and the fact that Resource has no getContentType())
     */
    private MultipartFile convertResourceToMultipartFile(Resource resource) throws IOException {
        // Determine the file name
        String filename = Optional.ofNullable(resource.getFilename())
                .orElse("temp-" + UUID.randomUUID() + ".pdf");
        // Determine the content type (Resource has no getContentType())
        String contentType = null;
        if (resource.exists()) {
            // Try to probe the type from the file path
            try {
                contentType = Files.probeContentType(resource.getFile().toPath());
            } catch (IOException e) {
                log.warn("Failed to probe content type from file path", e);
            }
            // Fall back to the default PDF type
            if (contentType == null) {
                contentType = MediaType.APPLICATION_PDF_VALUE;
            }
        } else {
            contentType = MediaType.APPLICATION_OCTET_STREAM_VALUE;
        }
        // Read the file content into a byte array
        byte[] content = FileCopyUtils.copyToByteArray(resource.getInputStream());
        // Custom MultipartFile implementation (avoids the private inner class)
        String finalContentType = contentType;
        return new MultipartFile() {
            @Override
            public String getName() {
                return "file"; // parameter name, can be customized
            }

            @Override
            public String getOriginalFilename() {
                return filename;
            }

            @Override
            public String getContentType() {
                return finalContentType;
            }

            @Override
            public boolean isEmpty() {
                return content.length == 0;
            }

            @Override
            public long getSize() {
                return content.length;
            }

            @Override
            public byte[] getBytes() throws IOException {
                return content;
            }

            @Override
            public InputStream getInputStream() throws IOException {
                return new ByteArrayInputStream(content);
            }

            @Override
            public void transferTo(File dest) throws IOException, IllegalStateException {
                FileCopyUtils.copy(content, dest);
            }
        };
    }

    /**
     * On startup: load the vector store from disk
     */
    @PostConstruct
    private void init() {
        try {
            File vectorFile = new File("chat-pdf.json");
            if (vectorFile.exists() && vectorStore instanceof SimpleVectorStore) {
                ((SimpleVectorStore) vectorStore).load(vectorFile);
                log.info("Vector store loaded");
            }
        } catch (Exception e) {
            log.error("Failed to initialize vector store", e);
        }
    }

    /**
     * Before shutdown: persist the vector store to disk
     */
    @PreDestroy
    private void persistent() {
        try {
            if (vectorStore instanceof SimpleVectorStore) {
                ((SimpleVectorStore) vectorStore).save(new File("chat-pdf.json"));
                log.info("Vector store saved");
            }
        } catch (Exception e) {
            log.error("Failed to persist vector store", e);
        }
    }
}
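The repository above delegates to `fileUtil.extractFileNameFromUrl(...)` to recover the MinIO object name from the stored file URL. That utility's implementation is not shown in this article, so here is a minimal, hypothetical sketch of what it might do (only the method name comes from the code above; the logic is an assumption):

```java
// Hypothetical sketch of FileUtil.extractFileNameFromUrl: take the path
// segment after the last '/', dropping any query string. The real FileUtil
// in the project is not shown; this is only an illustration.
public class FileNameUtilSketch {

    public static String extractFileNameFromUrl(String url) {
        // Drop the query string, if any (e.g. pre-signed URL parameters)
        int q = url.indexOf('?');
        String path = q >= 0 ? url.substring(0, q) : url;
        // Keep everything after the last '/'
        int slash = path.lastIndexOf('/');
        return slash >= 0 ? path.substring(slash + 1) : path;
    }

    public static void main(String[] args) {
        System.out.println(extractFileNameFromUrl(
                "http://minio.local/bucket/chat-123/report.pdf?X-Amz-Expires=3600"));
        // prints: report.pdf
    }
}
```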

4.4.2 Document Reading Assistant

    /**
     * Document reading assistant chat
     */
    @PostMapping(value = "/chat", produces = "text/html;charset=utf-8")
    @Operation(summary = "Document reading assistant")
    public Flux<String> chat(@RequestBody PDFDto pdfDto) {
        String prompt = pdfDto.getPrompt();
        String chatId = pdfDto.getChatId();
        // 1. Find the file for this conversation
        Resource file = fileRepository.getFile(chatId);
        if (file == null || !file.exists()) {
            // No file, refuse to answer
            throw new RuntimeException("Conversation file does not exist!");
        }
        // 2. Save the conversation id
        chatHistoryRepository.save("pdf", chatId);
        // 3. Call the model
        return pdfChatClient.prompt()
                .user(prompt)
                .advisors(a -> a.param(CHAT_MEMORY_CONVERSATION_ID_KEY, chatId))
                .advisors(a -> a.param(FILTER_EXPRESSION, "file_name == '" + file.getFilename() + "'"))
                .stream()
                .content();
    }

    /**
     * File upload for the reading assistant
     */
    @PostMapping("/upload/{chatId}")
    @Operation(summary = "Reading assistant file upload")
    public Result uploadPdf(@PathVariable String chatId, @RequestParam("file") MultipartFile file) {
        try {
            // 1. Verify that the file is a PDF
            if (!Objects.equals(file.getContentType(), "application/pdf")) {
                return Result.fail("Only PDF files can be uploaded!");
            }
            // 2. Save the file
            boolean success = fileRepository.save(chatId, file.getResource());
            if (!success) {
                return Result.fail("Failed to save the file!");
            }
            // 3. Write it to the vector store
            this.writeToVectorStore(file.getResource());
            return Result.success();
        } catch (Exception e) {
            log.error("Failed to upload PDF.", e);
            return Result.fail("Failed to upload the file!");
        }
    }

    /**
     * File download for the reading assistant
     */
    @GetMapping("/file/{chatId}")
    @Operation(summary = "Reading assistant file download")
    public ResponseEntity<Resource> download(@PathVariable("chatId") String chatId) throws IOException {
        // 1. Read the file
        Resource resource = fileRepository.getFile(chatId);
        if (resource == null || !resource.exists()) {
            return ResponseEntity.notFound().build();
        }
        // 2. Encode the file name and write it to the response header
        String filename = URLEncoder.encode(Objects.requireNonNull(resource.getFilename()), StandardCharsets.UTF_8);
        // 3. Return the file
        return ResponseEntity.ok()
                .contentType(MediaType.APPLICATION_OCTET_STREAM)
                .header("Content-Disposition", "attachment; filename=\"" + filename + "\"")
                .body(resource);
    }

    // The original implementation read the PDF with Spring AI's PagePdfDocumentReader:
    // private void writeToVectorStore(Resource resource) {
    //     // 1. Create the PDF reader
    //     PagePdfDocumentReader reader = new PagePdfDocumentReader(
    //             resource, // file source
    //             PdfDocumentReaderConfig.builder()
    //                     .withPageExtractedTextFormatter(ExtractedTextFormatter.defaults())
    //                     .withPagesPerDocument(1) // each PDF page becomes one Document
    //                     .build()
    //     );
    //     // 2. Read the PDF and split it into Documents
    //     List<Document> documents = reader.read();
    //     // 3. Write them to the vector store
    //     vectorStore.add(documents);
    // }

    private void writeToVectorStore(Resource resource) {
        try {
            // Parse the PDF content with Tika
            ContentHandler handler = new BodyContentHandler(-1); // no content-length limit
            Metadata metadata = new Metadata();
            ParseContext context = new ParseContext();
            PDFParser parser = new PDFParser();
            // Parse the PDF and extract the text
            parser.parse(resource.getInputStream(), handler, metadata, context);
            String content = handler.toString();
            // Key fix: convert the Tika Metadata into a Map
            Map<String, Object> metadataMap = new HashMap<>();
            for (String name : metadata.names()) {
                metadataMap.put(name, metadata.get(name));
            }
            // Add the file name to the metadata (optional)
            metadataMap.put("file_name", resource.getFilename());
            // Create a Document and write it to the vector store
            Document document = new Document(
                    resource.getFilename(), // document id
                    content,                // extracted text content
                    metadataMap             // converted metadata map
            );
            vectorStore.add(List.of(document));
        } catch (Exception e) {
            log.error("Failed to parse PDF with Tika", e);
            throw new RuntimeException("Failed to parse PDF", e);
        }
    }
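The `/chat` endpoint above builds the vector-store filter by string concatenation: `"file_name == '" + file.getFilename() + "'"`. A file name that itself contains a single quote would break that expression. A small defensive sketch (illustrative only, not Spring AI API) could escape the value first:

```java
// Sketch of building the metadata filter expression defensively.
// The "file_name == '...'" shape comes from the controller above; the
// quote-escaping rule here is an assumption for illustration.
public class FilterExpressionSketch {

    public static String fileNameFilter(String fileName) {
        // Double any single quotes so the value stays a valid string literal
        String escaped = fileName.replace("'", "''");
        return "file_name == '" + escaped + "'";
    }

    public static void main(String[] args) {
        System.out.println(fileNameFilter("O'Reilly-report.pdf"));
        // prints: file_name == 'O''Reilly-report.pdf'
    }
}
```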

ChatClient configuration:

    /**
     * AI document reading assistant client
     */
    @Bean
    public ChatClient pdfChatClient(OpenAiChatModel model, ChatMemory chatMemory, VectorStore vectorStore) {
        return ChatClient.builder(model)
                .defaultSystem("Answer questions based on the provided context. If the context does not cover a question, do not make up an answer.")
                .defaultAdvisors(
                        new SimpleLoggerAdvisor(),
                        new MessageChatMemoryAdvisor(chatMemory),
                        new QuestionAnswerAdvisor(
                                vectorStore,
                                SearchRequest.builder()
                                        .similarityThreshold(0.6)
                                        .topK(2)
                                        .build()))
                .build();
    }
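The `QuestionAnswerAdvisor` above only injects retrieved chunks whose similarity score is at least 0.6, keeping at most the top 2. For intuition about what that threshold is compared against, here is a minimal sketch of the cosine similarity commonly used by vector stores (illustrative only, not Spring AI code; whether a given store uses cosine distance depends on its configuration):

```java
// Minimal cosine-similarity sketch: a score near 1.0 means the chunk
// embedding points in nearly the same direction as the query embedding,
// so a 0.6 threshold discards weakly related chunks.
public class CosineSimilaritySketch {

    public static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    public static void main(String[] args) {
        double[] query = {1.0, 0.0, 1.0};
        double[] chunk = {1.0, 0.2, 0.9};
        // A closely related chunk scores well above the 0.6 threshold
        System.out.printf("%.3f%n", cosine(query, chunk));
    }
}
```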

5. Multimodality

Multimodality refers to input of different data types, such as text, images, audio, and video. So far, all of our interactions with large models have been plain text, which comes down to the models we chose: deepseek, qwen-max, and similar models are text-only. On both Ollama and Alibaba Cloud's Bailian platform, however, there are many multimodal models to choose from.

Take Ollama as an example: filtering the model search by the "vision" tag lists the models that support image recognition:

The same applies on the Alibaba Cloud Bailian platform:

Alibaba Cloud's qwen-omni is a fully multimodal model that accepts text, image, audio, and video input, and it even supports speech synthesis, which makes it very powerful.

Note

In the current SpringAI version (1.0.0-M6), qwen-omni has compatibility problems with SpringAI's OpenAI module: only the text and image modalities work. Audio fails with a data-format error, and video is not supported at all.

There are currently two workarounds:

  • Use spring-ai-alibaba instead.

  • Rewrite the OpenAiModel implementation yourself.

Multimodal agent example:

    /**
     * Multimodal agent chat client
     */
    @Bean
    public ChatClient chatCommonClient(AlibabaOpenAiChatModel model, ChatMemory chatMemory,
                                       VectorStore vectorStore, AINoteTools aiNoteTools) {
        return ChatClient.builder(model)
                .defaultOptions(ChatOptions.builder()
                        .model("qwen-omni-turbo")
                        .build())
                .defaultSystem(AIChatConstant.CHAT_ROLE)
                .defaultAdvisors(
                        new SimpleLoggerAdvisor(),
                        new MessageChatMemoryAdvisor(chatMemory),
                        new QuestionAnswerAdvisor(
                                vectorStore,
                                SearchRequest.builder()
                                        .similarityThreshold(0.6)
                                        .topK(2)
                                        .build()))
                .defaultTools(List.of(aiNoteTools))
                .build();
    }

    /**
     * Agent chat endpoint
     */
    @PostMapping(value = "/commonChat", produces = "text/html;charset=utf-8")
    @Operation(summary = "Agent chat")
    public Flux<String> chat(@RequestParam("prompt") String prompt,
                             @RequestParam(required = false) String chatId,
                             @RequestParam(value = "files", required = false) List<MultipartFile> files) {
        // 1. Save the conversation id
        chatHistoryRepository.save("chat", chatId);
        // 2. Call the model
        if (files == null || files.isEmpty()) {
            // No attachments: plain text chat
            return textChat(prompt, chatId);
        } else {
            // With attachments: multimodal chat
            return multiModalChat(prompt, chatId, files);
        }
    }

    private Flux<String> multiModalChat(String prompt, String chatId, List<MultipartFile> files) {
        // 1. Parse the media attachments
        List<Media> medias = files.stream()
                .map(file -> new Media(
                        MimeType.valueOf(Objects.requireNonNull(file.getContentType())),
                        file.getResource()))
                .toList();
        // 2. Call the model
        return chatCommonClient.prompt()
                .user(p -> p.text(prompt).media(medias.toArray(Media[]::new)))
                .advisors(a -> a.param(CHAT_MEMORY_CONVERSATION_ID_KEY, chatId))
                .stream()
                .content();
    }

    private Flux<String> textChat(String prompt, String chatId) {
        return chatCommonClient.prompt()
                .user(prompt)
                .advisors(a -> a.param(CHAT_MEMORY_CONVERSATION_ID_KEY, chatId))
                .stream()
                .content();
    }
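`multiModalChat` above calls `MimeType.valueOf(Objects.requireNonNull(file.getContentType()))`, which throws as soon as a client uploads a file without a Content-Type header. A hedged sketch of a fallback that guesses the type from the file name using only the JDK (the method and its defaults are an illustration, not part of the project's code):

```java
// Sketch of a null-safe MIME type resolution step that could run before
// constructing the Media objects. Uses only the JDK's built-in
// extension-to-type table; falls back to application/octet-stream.
import java.net.URLConnection;

public class MimeFallbackSketch {

    public static String resolveMimeType(String contentType, String fileName) {
        if (contentType != null && !contentType.isBlank()) {
            return contentType;
        }
        // Fall back to a guess based on the file-name extension
        String guessed = URLConnection.guessContentTypeFromName(fileName);
        return guessed != null ? guessed : "application/octet-stream";
    }

    public static void main(String[] args) {
        System.out.println(resolveMimeType(null, "photo.png"));
        System.out.println(resolveMimeType("image/jpeg", "photo.jpg"));
    }
}
```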

