Elasticsearch: Building AI agents with AI SDK and Elastic

Author: Carly Richmond, Elastic

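Do you keep hearing about AI agents but aren't quite sure what they are, or how to build one in TypeScript (or JavaScript)? Join me as I dive into the concept of AI agents, their possible use cases, and an example travel planning agent built with AI SDK and Elasticsearch.

Do you keep hearing about AI agents but aren't quite sure what they are or how they connect to Elastic? Here I dive into AI agents, specifically covering:

What is an AI agent?
What problems can AI agents solve?
An example travel planning agent built with AI SDK, TypeScript and Elasticsearch, with the code available on GitHub.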

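What is an AI agent?

An AI agent is software that uses artificial intelligence to carry out tasks autonomously and take actions on behalf of a human. It does so by combining one or more large language models (LLMs) with user-defined tools (or functions) that perform specific actions. For example, these tools can:

Extract information from databases, sensors, APIs or search engines such as Elasticsearch.
Perform complex calculations whose results the LLM can then summarize.
Make key decisions quickly based on a variety of data inputs.
Raise necessary alerts and feedback based on the response.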
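
To make the combination of a model and tools concrete, here is a minimal sketch of the loop an agent runs. It uses no library at all: fakeModel is a stand-in for a real LLM, and getTemperature with its hard-coded values is a hypothetical tool invented for illustration. AI SDK, used later in this article, handles this loop for you.

```typescript
// A tool pairs a description (read by the model) with a function (run by our code).
type Tool = {
  description: string;
  execute: (args: Record<string, string>) => Promise<unknown>;
};

// Hypothetical tool with made-up values; a real one would call an API or search engine.
const tools: Record<string, Tool> = {
  getTemperature: {
    description: 'Get the temperature in Celsius for a city',
    execute: async ({ city }) => ({ city, temperature: city === 'Oslo' ? 4 : 21 }),
  },
};

// What the model asks for: either a tool call or a final answer.
type ModelDecision =
  | { type: 'tool-call'; toolName: string; args: Record<string, string> }
  | { type: 'answer'; text: string };

// Stand-in for an LLM. A real model decides this from the prompt and tool descriptions;
// here we naively treat the last word of the prompt as the city.
function fakeModel(prompt: string, toolResults: unknown[]): ModelDecision {
  if (toolResults.length === 0) {
    const city = prompt.split(' ').pop() ?? '';
    return { type: 'tool-call', toolName: 'getTemperature', args: { city } };
  }
  return { type: 'answer', text: `Weather data: ${JSON.stringify(toolResults[0])}` };
}

// The agent loop: ask the model, run any tool it requests, feed the result back.
async function runAgent(prompt: string, maxSteps = 2): Promise<string> {
  const toolResults: unknown[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const decision = fakeModel(prompt, toolResults);
    if (decision.type === 'answer') return decision.text;
    const tool = tools[decision.toolName];
    if (!tool) throw new Error(`Unknown tool: ${decision.toolName}`);
    toolResults.push(await tool.execute(decision.args));
  }
  return 'No answer within the step limit';
}
```

Calling runAgent('Plan a trip to Oslo') takes two steps: the first runs the tool, the second produces the answer from its result. The step limit plays the same role as a cap on model round trips in a real agent.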

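What can AI agents do?

AI agents can be applied across many domains depending on their type. Possible examples include:

Utility-based agents: evaluate actions and make recommendations that maximize the gain, such as suggesting films and series based on a user's viewing history.
Model-based agents: make real-time decisions from sensor input, such as self-driving cars or smart vacuum cleaners.
Learning agents: combine data and machine learning to identify patterns and exceptions, for example in fraud detection.
Investment advice agents: give investment recommendations based on a user's risk appetite and existing portfolio to maximize their return. If accuracy, reputational risk and regulatory factors can be weighed appropriately, this could speed up decision making.
Simple chatbots: as seen today, these can access a user's account information and answer basic questions in natural language.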

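Example: travel planner

To better understand what AI agents can do, and how to build one using familiar web technologies, let's walk through a simple travel planner written with AI SDK, TypeScript and Elasticsearch.

Architecture

Our example is made up of 5 distinct elements:

A tool named weatherTool that pulls weather data for the location specified by the questioner from Weather API.
A tool named fcdoTool that provides the current travel status of the destination from the GOV.UK content API.
A tool named flightTool that pulls flight information from Elasticsearch using a simple query.
All of the above information is then passed to the LLM, GPT-4 Turbo.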

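Model choice

When building your first AI agent, working out which model to use can be difficult. Resources such as the Hugging Face Open LLM Leaderboard are a good starting point. For guidance on tool usage, you can also consult the Berkeley Function-Calling Leaderboard.

In our case, AI SDK specifically recommends models with strong tool calling capabilities, such as gpt-4 or gpt-4-turbo, as detailed in its Prompt Engineering documentation. Choosing the wrong model can lead to the LLM not calling multiple tools as expected, or even to compatibility errors such as the ones below: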

# Llama3 lack of tooling support (3.1 or higher)
llama3 does not support tools

# Unsupported toolChoice option to configure tool usage
AI_UnsupportedFunctionalityError: 'Unsupported tool choice type: required' functionality not supported.

Prerequisites

To run this example, make sure you follow the prerequisites in the repository README.

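Basic chat assistant

The simplest AI agent you can create with AI SDK generates a response from the LLM without any additional context. AI SDK supports numerous JavaScript frameworks, as outlined in its documentation. However, the AI SDK UI library documentation lists varying levels of support for React, Svelte, Vue.js and SolidJS, and many of the tutorials target Next.js. For that reason, our example is written with Next.js.

The basic structure of any AI SDK chatbot uses the useChat hook to handle requests to the backend route, which by default is /api/chat/:

The page.tsx file contains our client-side implementation in the Chat component, including the submission, loading and error handling capabilities exposed by the useChat hook. The loading and error handling are optional, but giving an indication of the request state is recommended. Compared with a simple REST call, an agent can take a considerable time to respond, so it is important to keep the user informed of progress and to prevent rapid repeated clicks and duplicate calls.

Because this component involves client-side interactivity, I use the use client directive to make sure it is treated as part of the client bundle: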

'use client';

import { useChat } from '@ai-sdk/react';
import showdown from 'showdown';

import Spinner from './components/spinner';

// Showdown converter used below to turn the LLM's Markdown output into HTML
const markdownConverter = new showdown.Converter();

export default function Chat() {
  /* useChat hook helps us handle the input, resulting messages, and also handle the loading and error states for a better user experience */
  const { messages, input, handleInputChange, handleSubmit, isLoading, stop, error, reload } = useChat();

  return (
    <div className="chat__form">
      <div className="chat__messages">
        {
          /* Display all user messages and assistant responses */
          messages.map(m => (
            <div key={m.id} className="message">
              <div>
                { /* Messages with the role of *assistant* denote responses from the LLM */ }
                <div className="role">{m.role === "assistant" ? "Sorley" : "Me"}</div>
                { /* User or LLM generated content */ }
                <div className="itinerary__div" dangerouslySetInnerHTML={{ __html: markdownConverter.makeHtml(m.content) }}></div>
              </div>
            </div>
          ))
        }
      </div>

      {
        /* Spinner shows when awaiting a response */
        isLoading && (
          <div className="spinner__container">
            <Spinner />
            <button id="stop__button" type="button" onClick={() => stop()}>
              Stop
            </button>
          </div>
        )
      }

      {
        /* Show error message and retry button when something goes wrong */
        error && (
          <>
            <div className="error__container">Unable to generate a plan. Please try again later!</div>
            <button id="retry__button" type="button" onClick={() => reload()}>
              Retry
            </button>
          </>
        )
      }

      { /* Form using default input and submission handler from the useChat hook */ }
      <form onSubmit={handleSubmit}>
        <input
          className="search-box__input"
          value={input}
          placeholder="Where would you like to go?"
          onChange={handleInputChange}
          disabled={error != null}
        />
      </form>
    </div>
  );
}

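The Chat component keeps the user input in the input property exposed by the hook and sends the request to the appropriate route on submission. I have used the default handleSubmit method, which calls the POST route /api/chat/.

The handler for this route, located in /api/chat/route.ts, initializes the connection to the gpt-4-turbo LLM using the OpenAI provider: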

import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';
import { NextResponse } from 'next/server';

// Allow streaming responses up to 30 seconds to address typically longer responses from LLMs
export const maxDuration = 30;

// Post request handler
export async function POST(req: Request) {
  const { messages } = await req.json();

  try {
    // Generate response from the LLM using the provided model, system prompt and messages
    const result = streamText({
      model: openai('gpt-4-turbo'),
      system: 'You are a helpful assistant that returns travel itineraries',
      messages
    });

    // Return data stream to allow the useChat hook to handle the results as they are streamed through for a better user experience
    return result.toDataStreamResponse();
  } catch(e) {
    console.error(e);
    return new NextResponse("Unable to generate a plan. Please try again later!");
  }
}

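Note that the above implementation picks up the API key from the environment variable OPENAI_API_KEY by default. If you need to customize the configuration of the OpenAI provider, use the createOpenAI method to override the provider settings.

With the above route, a little help from Showdown to format the GPT Markdown output as HTML, and a bit of CSS magic (in the globals.css file), we end up with a simple responsive UI that generates an itinerary based on the user's prompt:

Video: basic LLM itinerary

Adding tools

Adding tools to an AI agent essentially means creating custom functions that the LLM can use to enhance the response it generates. At this stage I'll add 3 new tools that the LLM can choose to use when generating the itinerary, as shown in the diagram below: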

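Weather tool

While the generated itinerary is a great start, we may want to add information that the LLM was not trained on, such as the weather. This leads us to write our first tool, which not only serves as input to the LLM but also provides additional data that lets us adapt the UI.

The weather tool, shown in full below, takes a single parameter, location, which the LLM extracts from the user input. The parameters property uses Zod, a TypeScript schema validation library, to define a schema and check that arguments of the correct type are passed in. The description property lets you describe what the tool does, helping the LLM decide whether to call it.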

import { tool as createTool } from 'ai';
import { z } from 'zod';

import { WeatherResponse } from '../model/weather.model';

export const weatherTool = createTool({
  description: 'Display the weather for a holiday location',
  parameters: z.object({
    location: z.string().describe('The location to get the weather for')
  }),
  execute: async function ({ location }) {
    // While a historical forecast may be better, this example gets the next 3 days
    // The location is URL-encoded so that names containing spaces or symbols are sent safely
    const url = `https://api.weatherapi.com/v1/forecast.json?q=${encodeURIComponent(location)}&days=3&key=${process.env.WEATHER_API_KEY}`;

    try {
      const response = await fetch(url);
      const weather: WeatherResponse = await response.json();

      return {
        location: location,
        condition: weather.current.condition.text,
        condition_image: weather.current.condition.icon,
        temperature: Math.round(weather.current.temp_c),
        feels_like_temperature: Math.round(weather.current.feelslike_c),
        humidity: weather.current.humidity
      };
    } catch(e) {
      console.error(e);
      return {
        message: 'Unable to obtain weather information',
        location: location
      };
    }
  }
});

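As you may have guessed, the execute property is where we define an asynchronous function containing the tool logic. Specifically, the location to send to the weather API is passed to our tool function. The response is then transformed into a single JSON object that can be displayed in the UI and is also used to generate the itinerary.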
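
The transformation step can be looked at in isolation. The sketch below assumes a cut-down WeatherResponse containing only the fields the tool reads; the real type lives in weather.model.ts, which is not shown in this article, and the helper name toWeatherSummary and the sample values are hypothetical.

```typescript
// Assumed subset of the Weather API response: only the fields the tool uses.
interface WeatherResponse {
  current: {
    condition: { text: string; icon: string };
    temp_c: number;
    feelslike_c: number;
    humidity: number;
  };
}

// Flat object handed back to both the LLM and the UI.
interface WeatherSummary {
  location: string;
  condition: string;
  condition_image: string;
  temperature: number;
  feels_like_temperature: number;
  humidity: number;
}

// Hypothetical helper: flattens the nested API response and rounds temperatures.
function toWeatherSummary(location: string, weather: WeatherResponse): WeatherSummary {
  return {
    location,
    condition: weather.current.condition.text,
    condition_image: weather.current.condition.icon,
    temperature: Math.round(weather.current.temp_c),
    feels_like_temperature: Math.round(weather.current.feelslike_c),
    humidity: weather.current.humidity,
  };
}

// Made-up sample data, for illustration only.
const sample: WeatherResponse = {
  current: {
    condition: { text: 'Partly cloudy', icon: '//example.invalid/cloudy.png' },
    temp_c: 18.6,
    feelslike_c: 17.4,
    humidity: 62,
  },
};

const summary = toWeatherSummary('Lisbon', sample);
```

Keeping the mapping in a pure function like this makes it straightforward to unit test without calling the weather API.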

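Given that we are only running a single tool at this stage, we don't need to consider sequential or parallel flows. It is simply a case of adding the tools property to the streamText method that handles the LLM output in the original api/chat route: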

import { weatherTool } from '@/app/ai/weather.tool';

// Other imports omitted

export const tools = {
  displayWeather: weatherTool,
};

// Post request handler
export async function POST(req: Request) {
  const { messages } = await req.json();

  // Generate response from the LLM using the provided model, system prompt and messages (try catch block omitted)
  const result = streamText({
    model: openai('gpt-4-turbo'),
    system: 'You are a helpful assistant that returns travel itineraries based on the specified location.',
    messages,
    maxSteps: 2,
    tools
  });

  // Return data stream to allow the useChat hook to handle the results as they are streamed through for a better user experience
  return result.toDataStreamResponse();
}

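The tool output is provided alongside the messages, which lets us give the user a more complete experience. Each message contains a parts property. Every part has a type, and tool invocation parts also carry a state. When these have the values tool-invocation and result respectively, we can pull the returned result out of the toolInvocation property and display it as we wish.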
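
That check can be sketched as a small helper outside of any component. The types below are simplified, hypothetical stand-ins modeled on the description above, not the types exported by AI SDK, and completedToolResults is an illustrative name.

```typescript
// Simplified stand-ins for the message part shapes described above.
type ToolInvocation =
  | { state: 'call' | 'partial-call'; toolCallId: string; toolName: string }
  | { state: 'result'; toolCallId: string; toolName: string; result: unknown };

type MessagePart =
  | { type: 'text'; text: string }
  | { type: 'tool-invocation'; toolInvocation: ToolInvocation };

// Collect the results of every finished invocation of the named tool.
function completedToolResults(parts: MessagePart[], toolName: string): unknown[] {
  const results: unknown[] = [];
  for (const part of parts) {
    if (part.type !== 'tool-invocation') continue;
    const invocation = part.toolInvocation;
    // Only invocations in the 'result' state carry a result property
    if (invocation.state === 'result' && invocation.toolName === toolName) {
      results.push(invocation.result);
    }
  }
  return results;
}

// Made-up parts: one text part, one finished weather call, one call still in flight.
const exampleParts: MessagePart[] = [
  { type: 'text', text: 'Here is your itinerary' },
  {
    type: 'tool-invocation',
    toolInvocation: { state: 'result', toolCallId: 'a1', toolName: 'displayWeather', result: { temperature: 19 } },
  },
  {
    type: 'tool-invocation',
    toolInvocation: { state: 'call', toolCallId: 'b2', toolName: 'displayWeather' },
  },
];

const weatherResults = completedToolResults(exampleParts, 'displayWeather');
```

Invocations that have not reached the result state are skipped, which is the point at which a UI would typically show a loading placeholder instead.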

The updated page.tsx source code shows the weather summary alongside the generated itinerary:

'use client';

import { useChat } from '@ai-sdk/react';
import Image from 'next/image';
import showdown from 'showdown';

import { Weather } from './components/weather';

import pending from '../../public/multi-cloud.svg';

// Showdown converter used below to turn the LLM's Markdown output into HTML
const markdownConverter = new showdown.Converter();

export default function Chat() {
  /* useChat hook helps us handle the input, resulting messages, and also handle the loading and error states for a better user experience */
  const { messages, input, handleInputChange, handleSubmit, isLoading, stop, error, reload } = useChat();

  return (
    <div className="chat__form">
      <div className="chat__messages">
        {
          /* Display all user messages and assistant responses */
          messages.map(m => (
            <div key={m.id} className="message">
              <div>
                { /* Messages with the role of *assistant* denote responses from the LLM */ }
                <div className="role">{m.role === "assistant" ? "Sorley" : "Me"}</div>
                { /* Tool handling */ }
                <div className="tools__summary">
                  {
                    m.parts.map(part => {
                      if (part.type === 'tool-invocation') {
                        const { toolName, toolCallId, state } = part.toolInvocation;
                        if (state === 'result') {
                          // Show weather results
                          if (toolName === 'displayWeather') {
                            const { result } = part.toolInvocation;
                            return (
                              <div key={toolCallId}>
                                <Weather {...result} />
                              </div>
                            );
                          }
                        } else {
                          return (
                            <div key={toolCallId}>
                              {toolName === 'displayWeather' ? (
                                <div className="weather__tool">
                                  <Image src={pending} width={80} height={80} alt="Placeholder Weather"/>
                                  <p className="loading__weather__message">Loading weather...</p>
                                </div>
                              ) : null}
                            </div>
                          );
                        }
                      }
                      // Other part types and tools render nothing here
                      return null;
                    })
                  }
                </div>
                { /* User or LLM generated content */ }
                <div className="itinerary__div" dangerouslySetInnerHTML={{ __html: markdownConverter.makeHtml(m.content) }}></div>
              </div>
            </div>
          ))
        }
      </div>

      { /* Spinner and loading handling omitted */ }

      { /* Form using default input and submission handler from the useChat hook */ }
      <form onSubmit={handleSubmit}>
        <input
          className="search-box__input"
          value={input}
          placeholder="Where would you like to go?"
          onChange={handleInputChange}
          disabled={error != null}
        />
      </form>
    </div>
  );
}

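The above code gives the user the following output:

FCDO tool

The power of AI agents is that the LLM can choose to trigger multiple tools to source relevant information when generating the response. Let's say we want to check the travel guidance for the destination country. The code below shows how to create a new tool, fcdoGuidance, that triggers an API call to the GOV.UK Content API: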

import { tool as createTool } from 'ai';
import { z } from 'zod';

import { FCDOResponse } from '../model/fco.model';

export const fcdoTool = createTool({
  description: 'Display the FCDO guidance for a destination',
  parameters: z.object({
    country: z.string().describe('The country of the location to get the guidance for')
  }),
  execute: async function ({ country }) {
    const url = `https://www.gov.uk/api/content/foreign-travel-advice/${country.toLowerCase()}`;

    try {
      const response = await fetch(url, { headers: { 'Content-Type': 'application/json' } });
      const fcoResponse: FCDOResponse = await response.json();

      // Use the first alert status if one exists, replacing underscores with spaces for display
      const alertStatus: string = fcoResponse.details.alert_status.length == 0 ? 'Unknown' :
        fcoResponse.details.alert_status[0].replaceAll('_', ' ');

      return {
        status: alertStatus,
        url: fcoResponse.details?.document?.url
      };
    } catch(e) {
      console.error(e);
      // Report the country we were asked about (this scope has no location variable)
      return {
        message: 'Unable to obtain FCDO information',
        location: country
      };
    }
  }
});

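You'll notice that the format is very similar to the weather tool discussed previously. Indeed, to include the tool in the LLM output, it's just a case of adding it to the tools property and amending the prompt in the /api/chat route: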

// Imports omitted

export const tools = {
  fcdoGuidance: fcdoTool,
  displayWeather: weatherTool,
};

// Post request handler
export async function POST(req: Request) {
  const { messages } = await req.json();

  // Generate response from the LLM using the provided model, system prompt and messages (try/catch block omitted)
  const result = streamText({
    model: openai('gpt-4-turbo'),
    // Each sentence ends with a full stop and a space so the concatenated prompt reads correctly
    system:
      "You are a helpful assistant that returns travel itineraries based on a location. " +
      "Use the current weather from the displayWeather tool to adjust the itinerary and give packing suggestions. " +
      "If the FCDO tool warns against travel DO NOT generate an itinerary.",
    messages,
    maxSteps: 2,
    tools
  });

  // Return data stream to allow the useChat hook to handle the results as they are streamed through for a better user experience
  return result.toDataStreamResponse();
}

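Once the components showing the tool output are added to the page, the output for a country where travel is not advised should look something like this:

LLMs that support tool calling can choose not to call a tool unless they consider it necessary. With gpt-4-turbo, both of our tools are called in parallel. However, in earlier attempts with llama3.1, only one tool would be called, depending on the input.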
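
One detail worth hedging against: the fcdoTool lowercases the country name before placing it in the URL, which works for single-word names but not for names containing spaces or accents. The sketch below assumes that GOV.UK foreign travel advice paths are lowercase, hyphen-separated slugs (for example south-africa); the helper names toCountrySlug and fcdoUrl are hypothetical, and the LLM may still supply a country name that does not match the one GOV.UK uses.

```typescript
// Hypothetical helper: turn a country name into a lowercase, hyphen-separated slug.
function toCountrySlug(country: string): string {
  return country
    .trim()
    .normalize('NFD')                 // split accented letters into base letter + accent
    .replace(/[\u0300-\u036f]/g, '')  // drop the accent marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')      // spaces and punctuation become single hyphens
    .replace(/^-+|-+$/g, '');         // no leading or trailing hyphens
}

// Build the Content API URL used by the tool from the slug.
function fcdoUrl(country: string): string {
  return `https://www.gov.uk/api/content/foreign-travel-advice/${toCountrySlug(country)}`;
}
```

Because the tool already falls back to an error message when the request fails, a name that produces no matching page degrades gracefully instead of breaking the itinerary.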

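Flight information tool

RAG, or Retrieval Augmented Generation, refers to a software architecture in which documents retrieved from a search engine or database are passed to the LLM as context, so that the response is grounded in the provided set of documents. This architecture allows the LLM to generate more accurate responses based on data it was not trained on. While Agentic RAG processes documents using a defined set of tools, or alongside vector or hybrid search, it is also possible to use RAG as part of a complex flow with traditional lexical search, as we do here.

To pass flight information to the LLM alongside the other tools, the final tool, flightTool, uses the Elasticsearch JavaScript client to pull outbound and inbound flights from Elasticsearch for the provided origin and destination: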

import { tool as createTool } from 'ai';
import { z } from 'zod';

import { Client } from '@elastic/elasticsearch';
import { SearchResponseBody } from '@elastic/elasticsearch/lib/api/types';

import { Flight } from '../model/flight.model';

const index: string = "upcoming-flight-data";
const client: Client = new Client({
  node: process.env.ELASTIC_ENDPOINT,
  auth: {
    apiKey: process.env.ELASTIC_API_KEY || "",
  },
});

// Pull the flight documents out of a search response
function extractFlights(response: SearchResponseBody<Flight>): (Flight | undefined)[] {
  return response.hits.hits.map(hit => { return hit._source });
}

export const flightTool = createTool({
  description:
    "Get flight information for a given destination from Elasticsearch, both outbound and return journeys",
  parameters: z.object({
    destination: z.string().describe("The destination we are flying to"),
    origin: z.string().describe("The origin we are flying from (defaults to London if not specified)"),
  }),
  execute: async function ({ destination, origin }) {
    try {
      const responses = await client.msearch({
        searches: [
          // Outbound leg
          { index: index },
          {
            query: {
              bool: {
                must: [
                  { match: { origin: origin } },
                  { match: { destination: destination } },
                ],
              },
            },
          },

          // Return leg
          { index: index },
          {
            query: {
              bool: {
                must: [
                  { match: { origin: destination } },
                  { match: { destination: origin } },
                ],
              },
            },
          },
        ],
      });

      if (responses.responses.length < 2) {
        throw new Error("Unable to obtain flight data");
      }

      return {
        outbound: extractFlights(responses.responses[0] as SearchResponseBody<Flight>),
        inbound: extractFlights(responses.responses[1] as SearchResponseBody<Flight>)
      };
    } catch (e) {
      console.error(e);
      // Report the destination we were asked about (this scope has no location variable)
      return {
        message: "Unable to obtain flight information",
        location: destination,
      };
    }
  },
});

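This example uses the Multi search API to pull the outbound and inbound flights in separate searches, before extracting the documents with the extractFlights utility method.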
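
The Multi search request body alternates a header object (naming the index) with a query object, one pair per search. As a sketch of that shape, the two legs could be built by a pair of pure functions; legQuery, roundTripSearches and the simplified types are hypothetical names introduced for illustration and are not part of the Elasticsearch client.

```typescript
// Simplified types covering only the query shape used by the flight tool.
type MatchClause = { match: Record<string, string> };
type LegQuery = { query: { bool: { must: MatchClause[] } } };
type MsearchItem = { index: string } | LegQuery;

// One leg: flights that match both the origin and the destination.
function legQuery(origin: string, destination: string): LegQuery {
  return {
    query: {
      bool: {
        must: [{ match: { origin } }, { match: { destination } }],
      },
    },
  };
}

// Header/query pairs for the outbound leg followed by the return leg.
function roundTripSearches(index: string, origin: string, destination: string): MsearchItem[] {
  return [
    { index }, legQuery(origin, destination),
    { index }, legQuery(destination, origin), // same query with the fields swapped
  ];
}

const searches = roundTripSearches('upcoming-flight-data', 'London', 'Lisbon');
```

An array like this could be supplied as the searches option of client.msearch. The responses come back in the same order as the searches, which is why the tool reads the outbound flights from the first response and the inbound flights from the second.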

To use the tool output, we need to amend our prompt and tool collection once more, updating the /api/chat/route.ts file:

// Imports omitted

// Allow streaming responses up to 30 seconds to address typically longer responses from LLMs
export const maxDuration = 30;

export const tools = {
  getFlights: flightTool,
  displayWeather: weatherTool,
  fcdoGuidance: fcdoTool
};

// Post request handler
export async function POST(req: Request) {
  const { messages } = await req.json();

  // Generate response from the LLM using the provided model, system prompt and messages (try/catch block omitted)
  const result = streamText({
    model: openai('gpt-4-turbo'),
    // Each sentence ends with a space so the concatenated prompt reads correctly
    system:
      "You are a helpful assistant that returns travel itineraries based on location, the FCDO guidance from the specified tool, and the weather captured from the displayWeather tool. " +
      "Use the flight information from tool getFlights only to recommend possible flights in the itinerary. " +
      "Return an itinerary of sites to see and things to do based on the weather. " +
      "If the FCDO tool warns against travel DO NOT generate an itinerary.",
    messages,
    maxSteps: 2,
    tools
  });

  // Return data stream to allow the useChat hook to handle the results as they are streamed through for a better user experience
  return result.toDataStreamResponse();
}

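With the final prompt, all 3 tools are called to generate an itinerary that includes flight options:

Summary

If you weren't fully comfortable with the concept of AI agents before, you should be now! We've covered them using a simple travel planner example built with AI SDK, TypeScript and Elasticsearch. We could extend our planner with other data sources, allow users to book the trip along with tours, or even generate image banners based on the location (support for which in AI SDK is currently experimental).

If you haven't yet dug into the code, you can check it out on GitHub!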