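A DeepSeek-Style AI Q&A System, Full Version (RAG Local Knowledge Base + Web Search + Deep Thinking) with Spring Boot and Vue 3

Today I will show you how to design an enterprise-grade Q&A system that works like DeepSeek, built on today's mainstream technologies: Vue 3 on the frontend and Spring Boot on the backend. A deployment tutorial for the project is included as well.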

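Core features of the system

1. Supports uploading local documents as a knowledge base (RAG). Supported document types are txt, doc, docx, and pdf.

2. Supports web search.

3. Supports deep thinking.

4. Supports conversation history as context.

5. Supports streaming responses over WebSocket.

6. Supports user login and registration.

7. Supports session management.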

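Components the system needs

Elasticsearch 8: stores the vectors of the knowledge base documents.

Redis: stores the message cache used by the system.

MySQL 8: stores the relational tables.

Tech stack

JDK 11 + Spring Boot + Vue 3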

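System implementation

RAG implementation

RAG (Retrieval-Augmented Generation) is an AI technique that combines information retrieval with text generation. It is mainly used to improve the accuracy and timeliness of the content a large language model (LLM) generates.

The core steps of the RAG implementation are described below.

Text extraction and chunking

When a user uploads a document, it first has to be parsed into many text chunks. The system does this through the textChunks method of the DocmentChunkParser interface, which parses the incoming document (txt, docx, pdf) into text chunks. Code: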

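import java.util.List;

public interface DocmentChunkParser {

    // Maximum number of characters in each text chunk
    public static final int CHUNK_SIZE = 1000;

    // Number of characters that adjacent text chunks share (overlap)
    public static final int CHUNK_OVERLAP = 200;

    // Parses the document into a list of text chunks
    List<String> textChunks();
}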

The project provides a separate implementation of this interface for each format: TXT, Word, and PDF.
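To make the role of CHUNK_SIZE and CHUNK_OVERLAP concrete, here is a minimal sketch of fixed-size chunking with overlap. It is not the project's actual parser: the class name OverlapChunker and the split method are hypothetical, and it assumes the document text has already been extracted into a single String.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original project.
final class OverlapChunker {

    // Splits text into chunks of at most chunkSize characters, where each
    // chunk starts (chunkSize - overlap) characters after the previous one.
    static List<String> split(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("require 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        if (text == null || text.isEmpty()) {
            return chunks;
        }
        int step = chunkSize - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last chunk reached the end of the text
            }
        }
        return chunks;
    }
}
```

With the values from the interface (1000 and 200), a 2,000-character document yields three chunks that start at offsets 0, 800, and 1,600. The overlap means a sentence cut at a chunk boundary still appears whole in one of the two neighbouring chunks.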

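Embeddings stage: turning text into vectors

Once the text chunks are available, the text has to be converted into vectors. This is the embeddings stage. The system uses an embedding model from the Alibaba Cloud Bailian (阿里云百煉) platform.

You need to apply for your own API key, which you can do on the Alibaba Cloud Bailian platform.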
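As a rough illustration of what an embedding call looks like, the sketch below only builds the HTTP request for one text chunk and does not send it. The endpoint URL and the model name are assumptions on my part (an OpenAI-compatible embeddings endpoint), not taken from the project, so check both against the Bailian documentation. The class EmbeddingRequestSketch and its methods are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

// Hypothetical helper, not part of the original project.
final class EmbeddingRequestSketch {

    // ASSUMPTION: endpoint and model name; verify them in the platform docs.
    static final String EMBEDDING_URL =
            "https://dashscope.aliyuncs.com/compatible-mode/v1/embeddings";
    static final String MODEL = "text-embedding-v3";

    // Escapes a string so it can be placed inside a JSON string literal.
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Builds the JSON body: one chunk of text in, one vector expected back.
    static String buildBody(String chunk) {
        return "{\"model\":\"" + MODEL + "\",\"input\":\"" + jsonEscape(chunk) + "\"}";
    }

    // Builds (but does not send) the POST request carrying the API key.
    static HttpRequest buildRequest(String apiKey, String chunk) {
        return HttpRequest.newBuilder()
                .uri(URI.create(EMBEDDING_URL))
                .header("Authorization", "Bearer " + apiKey)
                .header("Content-Type", "application/json")
                .timeout(Duration.ofSeconds(30))
                .POST(HttpRequest.BodyPublishers.ofString(buildBody(chunk), StandardCharsets.UTF_8))
                .build();
    }
}
```

In a real service you would send the request with java.net.http.HttpClient and read the vector out of the JSON response with a library such as Jackson, which the project already uses for the web search client.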

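Vector database: storage and retrieval

After the text has been converted into vectors, the vectors need to be stored in a vector database. Here I chose Elasticsearch 8.

Why Elasticsearch? It handles large data volumes, can be sharded horizontally, and has a dense_vector field type, so vector storage and similarity search work out of the box without plugins. It supports cosine (cosine similarity), dot_product (dot product), and l2_norm (Euclidean distance) as similarity measures.

It also supports approximate nearest neighbor (ANN) search, which makes it a reasonable choice for enterprise use as well.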
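The three similarity measures named above are simple to compute by hand. The following sketch shows what each one measures. It is an illustration only: Elasticsearch computes these internally, and the class VectorSimilarity is hypothetical.

```java
// Hypothetical helper that mirrors the three measures by name.
final class VectorSimilarity {

    private static void requireSameLength(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
    }

    // dot_product: sum of element-wise products.
    static double dotProduct(float[] a, float[] b) {
        requireSameLength(a, b);
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += (double) a[i] * b[i];
        }
        return sum;
    }

    // l2_norm: Euclidean distance between the two vectors (smaller is closer).
    static double l2Distance(float[] a, float[] b) {
        requireSameLength(a, b);
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = (double) a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // cosine: dot product divided by the product of the vector lengths.
    static double cosine(float[] a, float[] b) {
        double normA = Math.sqrt(dotProduct(a, a));
        double normB = Math.sqrt(dotProduct(b, b));
        if (normA == 0 || normB == 0) {
            throw new IllegalArgumentException("cosine is undefined for a zero vector");
        }
        return dotProduct(a, b) / (normA * normB);
    }
}
```

Cosine ignores vector length and only compares direction, which is why it is a common default for text embeddings. In Elasticsearch, dot_product on float vectors requires the vectors to be normalized to unit length, in which case it ranks results the same way cosine does.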

All vector data operations are implemented behind the DocumentChunkRepository interface:

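import java.util.List;

public interface DocumentChunkRepository {

    /**
     * Finds the text chunks that belong to a document.
     *
     * @param documentId ID of the document
     * @return the text chunks of that document
     */
    List<DocumentChunk> findByDocumentId(String documentId);

    /**
     * Deletes a document (all of its text chunks).
     *
     * @param documentId ID of the document
     */
    void deleteByDocumentId(String documentId);

    /**
     * Stores a text chunk together with its vector.
     *
     * @param documentChunk the chunk to store
     * @return the stored chunk
     */
    DocumentChunk save(DocumentChunk documentChunk);

    /**
     * Vector similarity search based on the kNN algorithm.
     *
     * @param documentId  ID of the document to search in
     * @param queryVector vector of the query text
     * @param k           number of most similar chunks to return
     * @return the k most similar chunks
     */
    List<DocumentChunk> findTopKSimilarChunks(String documentId, List<Float> queryVector, int k);
}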
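The original post does not show how findTopKSimilarChunks is implemented. One plausible way, sketched here, is to send a kNN search to the Elasticsearch _search endpoint with a filter on the document ID. The field names vector and documentId, the numCandidates parameter, and the class KnnQueryBuilder are assumptions for illustration, and the sketch assumes a recent 8.x release where the top-level knn search option with a filter is available.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper that builds the JSON body of a kNN search request.
final class KnnQueryBuilder {

    static String build(String documentId, List<Float> queryVector, int k, int numCandidates) {
        // Restrict the ID to safe characters so it can be embedded without escaping.
        if (documentId == null || !documentId.matches("[A-Za-z0-9_-]+")) {
            throw new IllegalArgumentException("unexpected documentId format");
        }
        if (k <= 0 || numCandidates < k) {
            throw new IllegalArgumentException("require 0 < k <= numCandidates");
        }
        for (Float f : queryVector) {
            if (f == null || !Float.isFinite(f)) {
                throw new IllegalArgumentException("vector values must be finite numbers");
            }
        }
        String vector = queryVector.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(","));
        return "{"
                + "\"knn\":{"
                + "\"field\":\"vector\","                 // assumed dense_vector field name
                + "\"query_vector\":[" + vector + "],"
                + "\"k\":" + k + ","
                + "\"num_candidates\":" + numCandidates + ","
                + "\"filter\":{\"term\":{\"documentId\":\"" + documentId + "\"}}" // assumed keyword field
                + "}}";
    }
}
```

The filter keeps the search inside one uploaded document, which matches the documentId parameter of the repository method. num_candidates controls how many candidates each shard considers before the top k are picked, so a larger value trades speed for recall.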

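Web search implementation

As we know, the DeepSeek model on its own cannot look up information such as today's weather or the latest news. If you need that, web search has to be enabled.

The first step is to find a web search plugin or API. There are many of these, for example:

searchapi (overseas), DuckDuckGo (overseas), the Bing Search API, and so on.

For a provider based in China, I picked one called Bocha Search (博查搜索), which offers a search API.

Below is the code that integrates with Bocha Search: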

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

/**
 * Bocha AI search service implementation.
 * Based on the Web Search API of the Bocha AI open platform.
 * Supports real-time web search for AI applications.
 */
@Slf4j
@Service
public class BochaWebSearchService implements WebSearchService {

    // Bocha API configuration
    private static final String BOCHA_API_URL = "https://api.bochaai.com/v1/web-search";
    private static final String BACKUP_API_URL = "https://api.bochaai.com/v1/search";

    @Value("${bocha.api.key}")
    private String apiKey;

    private final HttpClient httpClient;
    private final ObjectMapper objectMapper;
    private final Random random = new Random();

    // Pool of User-Agent strings
    private static final String[] USER_AGENTS = {
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:121.0) Gecko/20100101 Firefox/121.0"
    };

    public BochaWebSearchService() {
        this.httpClient = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(10))
                .build();
        this.objectMapper = new ObjectMapper();
    }

    @Override
    public List<String> search(String query, int maxResults) {
        if (query == null || query.trim().isEmpty()) {
            log.warn("Search query is empty");
            return Collections.emptyList();
        }
        try {
            // Try the main API first
            List<String> results = performSearch(BOCHA_API_URL, query, maxResults);
            if (!results.isEmpty()) {
                return results;
            }
            // The main API returned nothing, so try the backup API
            log.warn("Main API returned no results, trying the backup API");
            results = performSearch(BACKUP_API_URL, query, maxResults);
            if (!results.isEmpty()) {
                return results;
            }
            log.warn("All APIs returned no results, returning mock results");
            return getMockResults(query, maxResults);
        } catch (Exception e) {
            // Any request error (including a non-200 status) ends up here
            log.error("Bocha search service error: {}", e.getMessage(), e);
            return getMockResults(query, maxResults);
        }
    }

    /**
     * Executes the search request.
     */
    private List<String> performSearch(String apiUrl, String query, int maxResults) throws Exception {
        String requestBody = buildRequestBody(query, maxResults);
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(apiUrl))
                .header("Authorization", "Bearer " + apiKey)
                .header("Content-Type", "application/json")
                .header("Accept", "application/json")
                .header("User-Agent", getRandomUserAgent())
                .POST(HttpRequest.BodyPublishers.ofString(requestBody, StandardCharsets.UTF_8))
                .timeout(Duration.ofSeconds(30))
                .build();
        log.info("Sending Bocha search request: {}", query);
        CompletableFuture<HttpResponse<String>> future =
                httpClient.sendAsync(request, HttpResponse.BodyHandlers.ofString(StandardCharsets.UTF_8));
        HttpResponse<String> response = future.get(30, TimeUnit.SECONDS);
        if (response.statusCode() == 200) {
            return parseSearchResults(response.body(), maxResults);
        } else {
            log.warn("Bocha API request failed, status code: {}, response: {}", response.statusCode(), response.body());
            throw new RuntimeException("API request failed: " + response.statusCode());
        }
    }

    /**
     * Builds the request body.
     */
    private String buildRequestBody(String query, int maxResults) {
        try {
            Map<String, Object> requestData = new HashMap<>();
            requestData.put("query", query);
            requestData.put("count", Math.min(maxResults, 20)); // the Bocha API returns at most 20 results
            requestData.put("freshness", "oneYear");            // search content from the past year
            requestData.put("summary", false);                  // no summary needed
            requestData.put("safeSearch", "moderate");          // moderate safe search
            return objectMapper.writeValueAsString(requestData);
        } catch (Exception e) {
            log.error("Failed to build request body: {}", e.getMessage());
            throw new RuntimeException("Failed to build request body", e);
        }
    }

    /**
     * Parses the search results.
     */
    private List<String> parseSearchResults(String responseBody, int maxResults) {
        try {
            JsonNode root = objectMapper.readTree(responseBody);
            List<String> results = new ArrayList<>();
            // Walk the Bocha API response format: data.webPages.value[]
            JsonNode data = root.path("data");
            JsonNode webPages = data.path("webPages");
            JsonNode valueArray = webPages.path("value");
            if (valueArray.isArray()) {
                for (JsonNode item : valueArray) {
                    if (results.size() >= maxResults) {
                        break;
                    }
                    String title = getJsonValue(item, "name");
                    String url = getJsonValue(item, "url");
                    String snippet = getJsonValue(item, "snippet");
                    String siteName = getJsonValue(item, "siteName");
                    if (!title.isEmpty() && !url.isEmpty()) {
                        StringBuilder result = new StringBuilder();
                        result.append("Title: ").append(title);
                        if (!siteName.isEmpty()) {
                            result.append(" (Source: ").append(siteName).append(")");
                        }
                        result.append("\nLink: ").append(url);
                        if (!snippet.isEmpty()) {
                            result.append("\nSnippet: ").append(snippet);
                        }
                        results.add(result.toString());
                    }
                }
            }
            log.info("Bocha search succeeded, returning {} results", results.size());
            return results;
        } catch (Exception e) {
            log.error("Failed to parse Bocha search results: {}", e.getMessage(), e);
            return Collections.emptyList();
        }
    }

    /**
     * Safely reads a JSON value; returns an empty string if the field is missing.
     */
    private String getJsonValue(JsonNode node, String fieldName) {
        JsonNode field = node.path(fieldName);
        return field.isMissingNode() ? "" : field.asText("").trim();
    }

    /**
     * Picks a random User-Agent.
     */
    private String getRandomUserAgent() {
        return USER_AGENTS[random.nextInt(USER_AGENTS.length)];
    }

    /**
     * Returns mock search results (used when the API is unavailable).
     */
    private List<String> getMockResults(String query, int maxResults) {
        List<String> mockResults = new ArrayList<>();
        int count = Math.min(maxResults, 3);
        for (int i = 1; i <= count; i++) {
            mockResults.add(String.format(
                    "Title: Search result %2$d for '%1$s' (mock data)\n"
                            + "Link: https://example.com/search-result-%2$d\n"
                            + "Snippet: This is a mock search result for '%1$s'. Configure the Bocha API key for real use.",
                    query, i));
        }
        log.info("Returning {} mock search results", mockResults.size());
        return mockResults;
    }
}

The web search results are converted into a List<String> and passed to DeepSeek, which then summarizes them and answers based on what the search returned.
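The post does not show how the List<String> is handed to the model. A common approach, sketched below, is to fold the results into the prompt text ahead of the user's question. The class SearchPromptBuilder and the prompt wording are hypothetical, not the project's actual code.

```java
import java.util.List;

// Hypothetical helper that turns search results into a single prompt string.
final class SearchPromptBuilder {

    static String build(String question, List<String> searchResults) {
        // Without results, send the question unchanged.
        if (searchResults == null || searchResults.isEmpty()) {
            return question;
        }
        StringBuilder sb = new StringBuilder();
        sb.append("Answer the question using the web search results below. ")
          .append("Refer to a result by its number when you rely on it.\n\n");
        // Number each result so the model can refer back to it.
        for (int i = 0; i < searchResults.size(); i++) {
            sb.append("[").append(i + 1).append("]\n")
              .append(searchResults.get(i))
              .append("\n\n");
        }
        sb.append("Question: ").append(question);
        return sb.toString();
    }
}
```

Each entry returned by the search service already carries the title, link, and snippet, so the model sees the source of every fact it is asked to summarize. The same pattern works for the RAG path, with the retrieved text chunks taking the place of the search results.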