使用langgraph 構建RAG 智能問答代理

RAG 智能問答代理：

? 支持用戶持續提問

? 根據模型判斷是否需要查資料

? 自動調用 PDF 檢索工具查找內容

? 自動引用內容回答

? 可以輸入 exit / quit 退出

下載需要的library

pip install langchain-google-genai
pip install langgraph
pip install langchain-community
pip install langchain-chroma
pip install pypdf

from typing import TypedDict, Annotated, Sequence
from langchain_core.messages import BaseMessage, ToolMessage, SystemMessage, HumanMessage, AIMessage
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_core.tools import tool
from operator import add as add_messages
from langgraph.graph import StateGraph, START, END
from langgraph.prebuilt import ToolNode
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma
import os

🔹 from typing import TypedDict, Annotated, Sequence
這些是 Python 的標準類型提示工具（來自 typing 模塊）：

TypedDict：允許你定義帶類型的字典。例如，用于聲明狀態數據結構。

Annotated：用于在類型提示中附加額外的元信息。常用于 Pydantic 或運行時類型校驗。

Sequence：表示有序的元素集合（比如 list 或 tuple），用于函數簽名中明確參數是一個有序結構。

🔹 from langchain_core.messages import BaseMessage, ToolMessage, SystemMessage, HumanMessage, AIMessage
這些是 LangChain Core 中的消息類型，用于構建對話流。常見于聊天模型的輸入/輸出：

BaseMessage：所有消息的基類。

HumanMessage：用戶輸入的信息（人類說的話）。

AIMessage：AI 生成的回復。

SystemMessage：系統預設信息，例如系統提示或指令。

ToolMessage：來自外部工具的消息（比如函數調用返回的結果）。

這些類型可幫助聊天模型更準確地理解不同來源的消息。

🔹 from langchain_google_genai import ChatGoogleGenerativeAI
這是用于接入 Google Gemini 模型（以前稱 PaLM）的 LangChain 接口：

ChatGoogleGenerativeAI：允許你使用 Google 的生成式 AI（例如 Gemini）進行聊天任務，支持多輪對話、函數調用、上下文管理等。

🔹 from langchain_google_genai import GoogleGenerativeAIEmbeddings
此類用于生成文本嵌入向量（Embeddings）：

GoogleGenerativeAIEmbeddings：調用 Google Gemini 的嵌入接口，把文字轉為向量（常用于檢索、語義搜索、向量數據庫等）。

🔹 from langchain_core.tools import tool
tool 是一個裝飾器，用于將一個普通函數轉換為可供語言模型調用的工具（tool）。

常用于實現帶函數調用能力的聊天模型（Function Calling 或 Tool Use）。

🔹 from operator import add as add_messages
Python 標準庫的 operator.add 函數用于執行 a + b 操作。

這里使用了 as add_messages 給它起了別名，暗示這個函數可能會被用于合并兩個消息列表

🔹 from langgraph.graph import StateGraph, START, END
這是 LangGraph 的核心模塊，用于構建有狀態的對話圖（Stateful Computation Graph）：

StateGraph：LangGraph 的狀態圖結構，用于構建可追蹤的多步驟對話流程。

START, END：圖中的特殊節點，表示開始和結束點。通常用于定義流程的入口和出口。

LangGraph 是 LangChain 的實驗性擴展，可構建類似狀態機的復雜應用，例如代理、多工具工作流。

🔹 from langgraph.prebuilt import ToolNode
ToolNode 是一個預構建的節點類，表示圖中執行工具調用的節點。

它可自動處理工具輸入/輸出，在 LangGraph 中連接工具和對話狀態。

🔹 from langchain_community.document_loaders import PyPDFLoader
PyPDFLoader 是一個文檔加載器，用于從 PDF 文件中提取文本內容。

常用于文檔問答（RAG）場景，比如加載報告、合同、書籍等。

🔹 from langchain.text_splitter import RecursiveCharacterTextSplitter
RecursiveCharacterTextSplitter 是一種文本切分器，用于將文檔按段落或句子切成小塊（chunks），便于索引和嵌入。

支持遞歸回退策略，根據標點符號逐級切分，保持語義連貫。

🔹 from langchain_chroma import Chroma
Chroma 是向量數據庫的接口，支持將嵌入存入本地或遠程數據庫，并進行語義搜索。

它用于構建帶記憶或文檔檢索能力的聊天系統。

llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", google_api_key=GOOGLE_API_KEY, temperature=0)

創建一個低溫度設定的 Gemini 2.5 Flash 模型客戶端，用于多輪對話、函數調用（Tool Use）、文檔問答等任務。
temperature 控制模型生成的隨機性/創造性。
模型在生成時將非常穩定、可預測、偏向選擇概率最高的輸出。非常適合用于確定性應用，如問答系統、數據處理、文檔提取等。

embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001",google_api_key=GOOGLE_API_KEY
)

創建一個 embeddings 實例，它使用 Google 的 Gemini 嵌入模型（embedding-001）將文本轉化為高維向量，用于語義檢索、向量數據庫存儲或文本相似度比較等下游任務。

pdf_path = "Stock_Market_Performance_2024.pdf"if not os.path.exists(pdf_path):raise FileNotFoundError(f"PDF file not found: {pdf_path}")

os.path.exists(pdf_path) 會檢查這個路徑下是否存在對應的文件或文件夾。
raise FileNotFoundError(…) 是 Python 的一種錯誤拋出機制。
當找不到文件時，主動拋出 FileNotFoundError 異常。

pdf_loader = PyPDFLoader(pdf_path)  # This loads the PDF# Checks if the PDF is there
try:pages = pdf_loader.load()print(f"PDF has been loaded and has {len(pages)} pages")
except Exception as e:print(f"Error loading PDF: {e}")raise

PyPDFLoader(pdf_path) 是 LangChain 提供的一個 PDF 文檔加載器類。
它會讀取 PDF 文件，為后續的問答系統或嵌入向量處理做準備。
但注意：這行代碼只是初始化了一個加載器對象，并沒有真正加載內容。真正加載發生在 .load() 方法中。

.load() 方法才是真正執行讀取 PDF 文件內容的操作。
它會返回一個按頁拆分的文本內容列表（List[Document]），每一頁是一個 LangChain Document 對象，包含內容和元信息（如頁碼）。

raise：再次拋出這個異常，終止程序運行（防止之后的代碼繼續運行在錯誤狀態下）。

創建一個「遞歸字符級文本切分器」，把文檔切成一段段長度不超過 1000 字符的小塊文本（chunk），且相鄰的文本塊之間有 200 字符的重疊（overlap）。

# Chunking Process
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000,chunk_overlap=200
)

相鄰兩個 chunk 之間會有 200 個字符的重疊內容。
目的是：保持語義連續性，避免重要信息被截斷在邊界上。

把 PDF 文檔內容切分成小段（chunks），轉化為向量，然后存入本地的 Chroma 向量數據庫，用于后續的語義搜索或問答（RAG）應用。

pages_split = text_splitter.split_documents(pages)  # We now apply this to our pagespersist_directory = r"vec_db"
collection_name = "stock_market"# If our collection does not exist in the directory, we create using the os command
if not os.path.exists(persist_directory):os.makedirs(persist_directory)try:# Here, we actually create the chroma database using our embeddigns modelvectorstore = Chroma.from_documents(documents=pages_split,embedding=embeddings,persist_directory=persist_directory,collection_name=collection_name)print(f"Created ChromaDB vector store!")except Exception as e:print(f"Error setting up ChromaDB: {str(e)}")raise

pages 是前面通過 PyPDFLoader().load() 得到的文檔頁（每頁是一個 Document 對象）。

text_splitter.split_documents(pages) 會對每頁文本進行切分（chunking），生成多個小段文本，每段仍然是一個 Document 對象。

切分規則由 text_splitter 定義（如：1000 字，重疊 200 字）。

檢查本地磁盤上是否已經存在名為 “vec_db” 的文件夾。

如果不存在，則使用 os.makedirs() 創建它，防止寫入數據庫時報錯。

documents=pages_split：要寫入向量數據庫的 Document 列表（每個含文本和元數據）。

embedding=embeddings：使用哪個嵌入模型把文本轉成向量。你前面定義的是 Google Gemini 的嵌入模型。

persist_directory=persist_directory：Chroma 向量數據庫在磁盤上保存的文件夾路徑。

collection_name=collection_name：給這組向量一個邏輯名稱（一個“集合”）。

從 Chroma 向量數據庫中創建一個 Retriever（檢索器），用于后續問答（RAG）或聊天任務中從文檔中“找出最相關的內容”。

# Now we create our retriever
retriever = vectorstore.as_retriever(search_type="similarity",search_kwargs={"k": 5}  # K is the amount of chunks to return
)

創建一個“文檔檢索器（Retriever）”，每次輸入一個查詢問題時，它會返回與問題最相關的 5 個文檔塊（chunk），基于語義相似度。
vectorstore 是你剛剛用 Chroma 建好的本地向量數據庫，包含了你 PDF 文檔的所有嵌入向量。
.as_retriever() 是 LangChain 的方法，它會將向量數據庫對象封裝為一個可檢索的接口對象。
指定檢索方式為 “similarity”，即使用向量相似度搜索。
k=5 表示每次檢索返回最相關的前 5 個文檔塊（chunk）。

將你之前創建的 retriever 封裝成一個可以被 LLM 代理（如 Gemini 或 LangChain Agent）調用的工具。工具的作用是：從 PDF 文檔中查找問題相關的段落內容，并返回結果給語言模型。

@tool
def retriever_tool(query: str) -> str:"""This tool searches and returns the information from the Stock Market Performance 2024 document."""docs = retriever.invoke(query)if not docs:return "I found no relevant information in the Stock Market Performance 2024 document."results = []for i, doc in enumerate(docs):results.append(f"Document {i+1}:\n{doc.page_content}")return "\n\n".join(results)

創建了一個名為 retriever_tool 的函數工具，用于在向量數據庫中檢索有關用戶查詢的問題的文檔段落（chunks），然后格式化成文本返回。

定義一個名為 retriever_tool 的函數

它接收一個字符串 query，表示用戶的查詢問題

返回一個字符串（即結果文本）

函數的文檔注釋（docstring）
這段文字會作為工具的說明文檔，提供給代理模型了解工具用途

retriever.invoke(query) 會返回與你的問題語義最相近的文檔段落列表（通常是 5 個）

如果找不到相關段落（即檢索結果為空），直接返回一段提示信息。
這樣可以避免模型誤解為“有內容但沒展示”。

將所有結果用兩個換行符拼接成一個字符串

返回給 LLM，作為檢索器的最終輸出

將你定義的外部工具（retriever_tool）綁定到語言模型（llm）上，讓它具有調用工具的能力。

tools = [retriever_tool]llm = llm.bind_tools(tools)

創建一個名為 tools 的列表，里面包含一個你自定義的工具函數 retriever_tool。

retriever_tool 是你前面定義過并用 @tool 裝飾過的檢索工具，它可以從 PDF 文檔中找出與用戶問題相關的段落。

把你定義的工具 tools 綁定到語言模型對象 llm 上，使得它在生成回答時可以調用這些工具。

這使得模型可以在需要外部知識時自動調用 retriever_tool 來輔助回答。

llm 是你使用的 ChatGoogleGenerativeAI(…)（Gemini 模型）對象。

.bind_tools(tools) 返回的是一個新模型對象，這個對象的能力比原始 llm 更強：

它能識別工具說明（docstring）

它知道哪些工具可以使用、怎么使用（參數是什么）

它會在遇到問題無法回答時自動決定是否調用工具

你可以把這個看作是“增強模型”：它除了聊天之外，還能主動調用外部函數。

構建一個LangGraph 智能代理（Agent）系統中的狀態管理與控制邏輯，這是 LangChain 中創建多步驟對話流程（如工具調用、LLM響應、條件跳轉等）時的關鍵組成部分。

class AgentState(TypedDict):messages: Annotated[Sequence[BaseMessage], add_messages]def should_continue(state: AgentState):"""Check if the last message contains tool calls."""result = state['messages'][-1]return hasattr(result, 'tool_calls') and len(result.tool_calls) > 0

AgentState 是什么？
這是一個使用 TypedDict 定義的自定義狀態類型，用來在 LangGraph 工作流中傳遞和記錄會話狀態。

messages 是 AgentState 中的唯一字段，表示當前會話的消息歷史。

它是一個消息對象的列表（Sequence[BaseMessage]），每個對象可以是：

HumanMessage：用戶輸入

AIMessage：模型響應

ToolMessage：工具返回結果

SystemMessage：系統指令

LangChain 中統一繼承自 BaseMessage。

這部分是 LangGraph 的特有寫法，表示：

這個 messages 字段在每一步節點運行時應該追加消息（而不是替換），由 add_messages 函數控制。

🧠 作用：
LangGraph 會根據 add_messages 邏輯自動將新的消息（如 LLM 的回答、工具調用結果）加入到 messages 這個狀態列表中，實現完整會話追蹤。

should_continue() 是做什么的？
它是一個用于流程控制（Conditional Edge）的函數，告訴 LangGraph：

“根據當前的 AgentState，我該不該繼續調用工具（tool）？”

獲取當前狀態中最后一條消息（即剛生成的 LLM 響應）

判斷：

最后一條消息是否有 tool_calls 屬性（即模型有沒有提出要調用某個工具）

并且這些工具調用的列表不為空

如果滿足這兩個條件，就返回 True，表示應該“繼續”，即跳轉到下一步工具調用節點。

否則就返回 False，表示無需再調用工具，可以終止或輸出結果。

什么是 system prompt？

在大多數對話式 AI 系統中（包括 GPT、Gemini、Claude 等），system prompt 是用來定義模型角色與行為規則的提示語。

它在每次對話開始時就被送入模型，作為“隱藏的第一句話”，影響模型后續所有行為。

在 LangChain 或 LangGraph 中，常通過 SystemMessage(content=system_prompt) 明確注入。

system_prompt = """
You are an intelligent AI assistant who answers questions about Stock Market Performance in 2024 based on the PDF document loaded into your knowledge base.
Use the retriever tool available to answer questions about the stock market performance data. You can make multiple calls if needed.
If you need to look up some information before asking a follow up question, you are allowed to do that!
Please always cite the specific parts of the documents you use in your answers.
"""

“你現在是一個股市問答專家，但不能靠記憶胡說八道。你只能通過一個叫 retriever_tool 的工具從我們給你的 PDF 文檔中查資料，才能回答問題。你可以多次查資料，而且請在回答時引用文檔的內容。”

構建一個具備“工具調用能力”的語言模型代理（LLM Agent）的關鍵組件，用于在 LangGraph 中調用語言模型并處理對話狀態。

tools_dict = {our_tool.name: our_tool for our_tool in tools}  # Creating a dictionary of our tools# LLM Agent
def call_llm(state: AgentState) -> AgentState:"""Function to call the LLM with the current state."""messages = list(state['messages'])messages = [SystemMessage(content=system_prompt)] + messagesmessage = llm.invoke(messages)return {'messages': [message]}

創建一個以工具名稱為 key、工具本身為 value 的字典，用于在后續流程中快速根據名稱定位工具。

定義了一個 LangGraph 中的節點函數（node），在流程圖運行時，它會根據當前 state 調用語言模型并返回新結果。

從當前的對話狀態 AgentState 中提取消息列表

state[‘messages’] 是之前記錄的用戶提問 / 工具回復 / 模型回答等內容

把你之前定義的 system_prompt 加到消息列表的最前面

這一步至關重要：它確保每次模型調用都記住“我是誰、我該怎么答、要用 retriever_tool 查資料”

把你之前定義的 system_prompt 加到消息列表的最前面

這一步至關重要：它確保每次模型調用都記住“我是誰、我該怎么答、要用 retriever_tool 查資料”

把剛才模型生成的新 AIMessage 包裝成一個新的 AgentState 字典

LangGraph 中，所有節點函數都要返回類似這種結構（新狀態），以便下一步繼續推進

🧠 配合你之前的 add_messages 注釋，這個新消息會自動合并進原有消息列表中。

LangGraph 智能代理流程中的一個核心步驟：執行工具調用的動作節點。

# Retriever Agent
def take_action(state: AgentState) -> AgentState:"""Execute tool calls from the LLM's response."""tool_calls = state['messages'][-1].tool_calls# 取出對話歷史中的最后一條消息（即模型剛剛生成的 AIMessage）
.	# tool_calls 是該消息中模型請求調用的工具清單results = []for t in tool_calls:# 可能有多個工具請求（例如一個問句查兩段內容），所以逐一處理# t 表示一個單獨的工具調用print(f"Calling Tool: {t['name']} with query: {t['args'].get('query', 'No query provided')}")# 實時打印工具名稱和參數（便于調試）if not t['name'] in tools_dict:  # Checks if a valid tool is presentprint(f"\nTool: {t['name']} does not exist.")result = "Incorrect Tool Name, Please Retry and Select tool from List of Available tools."# 如果模型給出的工具名不合法（拼錯或沒注冊），返回錯誤提示# 避免運行未知或惡意工具名else:result = tools_dict[t['name']].invoke(t['args'].get('query', ''))print(f"Result length: {len(str(result))}")# 從 tools_dict 里取出對應的工具函數（如 retriever_tool）# 調用 .invoke() 方法執行工具（工具的實際邏輯如從 PDF 中查找內容）# 獲取結果（通常是字符串），打印結果長度# Appends the Tool Messageresults.append(ToolMessage(tool_call_id=t['id'], name=t['name'], content=str(result)))# LangChain 使用 ToolMessage 來表示“工具的返回結果”# 它需要：# tool_call_id: 和模型請求時的 ID 對應（用于追蹤）# name: 工具名# content: 工具運行的結果（必須是字符串）print("Tools Execution Complete. Back to the model!")return {'messages': results}

這個函數叫 take_action，它的作用是：

從上一步模型生成的響應中提取所有工具調用（tool_calls），逐個執行對應工具函數，并把結果封裝成 ToolMessage 發送回模型。

用戶提問 → LLM 判斷是否需要工具 → 如果需要 → 調用工具 → 再回到 LLM → 最終回答。

graph = StateGraph(AgentState)
graph.add_node("llm", call_llm)
graph.add_node("retriever_agent", take_action)graph.add_conditional_edges("llm",should_continue,{True: "retriever_agent", False: END}
)
graph.add_edge("retriever_agent", "llm")
graph.set_entry_point("llm")
rag_agent = graph.compile()

創建一個新的 LangGraph 對話流程圖，名為 graph
注冊兩個節點函數（處理流程的步驟）：

“llm”：使用語言模型（如 Gemini 2.5）生成回復或提出工具調用請求 → 對應函數 call_llm

“retriever_agent”：真正執行工具調用 → 對應函數 take_action

執行完 “llm” 節點后

使用 should_continue() 判斷模型返回是否包含工具調用（tool_calls）

如果是：

跳轉到 “retriever_agent” 節點 → 實際調用工具

如果否：

直接終止（END）→ 模型回答完畢

一旦工具調用完畢，工具節點 retriever_agent 會輸出 ToolMessage 到狀態中

然后再次執行 “llm” 節點，讓模型根據工具返回結果繼續回答或再發起下一輪調用

告訴 LangGraph：流程從 “llm” 節點開始（即用戶提問后，先調用語言模型）

     ┌────────────┐│   llm      │ ?─────────────┐└────┬───────┘               ││                       │should_continue()              ││         │                 │Yes       No                 │↓         ↓                ▲
┌────────────┐  └───────? END   │
│retriever_agent│               │
└──────┬───────┘                ││                        │└────────?───────────────┘

RAG 智能問答代理的主程序入口，以命令行（終端）方式運行，實現一個可以連續提問、基于 PDF 自動查資料、調用工具回答的智能問答助手。

print("\n=== RAG AGENT===")while True:user_input = input("\nWhat is your question: ")if user_input.lower() in ['exit', 'quit']:breakmessages = [HumanMessage(content=user_input)]  # converts back to a HumanMessage typeresult = rag_agent.invoke({"messages": messages})print("\n=== ANSWER ===")print(result['messages'][-1].content)

進入一個無限循環，等待用戶在命令行輸入問題

input() 是 Python 的標準輸入函數

每次輸入的內容都會被作為提問交給智能代理

如果用戶輸入了 exit 或 quit（不區分大小寫），就退出循環，結束程序

是一種常見的 CLI 退出機制

將用戶輸入內容包裝成 HumanMessage 對象

這是 LangChain 中對話格式的標準寫法，便于與模型/工具交互

messages 是一個 list，因為智能代理是多輪對話系統（可以支持多條消息歷史）

你在前面已經構建并 compile() 出了一個圖形化代理：rag_agent

.invoke() 方法會觸發整個 LangGraph 流程，包括：

調用語言模型（llm 節點）

判斷是否需要工具（should_continue）

如果需要，調用工具（retriever_agent 節點）

工具返回結果后，再次讓模型生成最終回答

輸出 LangGraph 最終執行結果中的最后一條消息（通常是 AI 的回答）

result[‘messages’] 是整個狀態流中的歷史消息列表

[-1] 表示最后一條，通常是模型的回答 AIMessage

User input (What happened in Q2 2024?)↓
[HumanMessage(content=user_input)]↓
rag_agent.invoke(...)  → 調用 LangGraph 圖↓
[llm] → (需要工具？) → [retriever_agent] → [llm]↓
模型綜合結果回答↓
輸出 ANSWER

=== RAG AGENT===What is your question: how was the SMP500 performing in 2024?
Calling Tool: retriever_tool with query: S&P 500 performance in 2024
Result length: 4385
Tools Execution Complete. Back to the model!=== ANSWER ===
The S&P 500 index had a remarkably strong performance in 2024, delivering roughly a 25% total return (around +23% in price terms). This marked the second consecutive year of over 20% returns for the S&P 500, a feat not seen since the late 1990s. The gains were disproportionately driven by mega-cap technology stocks, particularly the "Magnificent 7" companies, which accounted for over half (about 54%) of the S&P 500's total return for the year.(Source: Document 1, Document 4, Document 5)What is your question: How did OpenAI perform in 2024?
Calling Tool: retriever_tool with query: OpenAI performance 2024
Result length: 4078
Tools Execution Complete. Back to the model!=== ANSWER ===
I am sorry, but I cannot provide information on OpenAI's performance in 2024. The provided documents discuss the overall U.S. stock market performance, highlighting the strong performance of mega-cap technology stocks and companies benefiting from AI adoption, such as Nvidia, Netflix, and Alphabet (Google). However, OpenAI is not mentioned in the provided text.What is your question: exit