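Local LLM Programming in Practice (29): Querying the Neo4j Graph Database (2)

The previous article, "Querying the Neo4j Graph Database with an LLM (1)", showed how to query Neo4j with GraphCypherQAChain. That approach is quick and simple to implement, but it is hard to customize and may run into difficulties in production.

This article uses the LangGraph framework to query the Neo4j graph database with an LLM (large language model). LangGraph lets you define a clear, multi-step workflow, so it can handle more complex scenarios.

Below is a visualization of the LangGraph workflow we are about to build:
[Figure: LangGraph workflow for querying the Neo4j graph database with an LLM]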

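Contents

    • Defining the state
    • First node: guardrails
    • Node: generate_cypher (generating the Neo4j query)
      • Enhancing the prompt with few-shot examples
      • Generating Cypher from the prompt
    • Node: executing the Cypher query
    • Generating the final answer
    • Building the workflow
    • Seeing it in action
    • Summary
    • Code
    • References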

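Defining the state

We first define the input, output, and overall state of the LangGraph application.
The state can be thought of as the data format that nodes use to exchange data. All three state classes inherit from TypedDict.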

from operator import add
from typing import Annotated, List

from typing_extensions import TypedDict


class InputState(TypedDict):
    """Input"""
    question: str


class OverallState(TypedDict):
    """Overall state"""
    question: str
    next_action: str
    cypher_statement: str
    cypher_errors: List[str]
    database_records: List[dict]
    # Each node's "steps" update is appended to the list instead of replacing it.
    steps: Annotated[List[str], add]


class OutputState(TypedDict):
    """Output"""
    answer: str
    steps: List[str]
    cypher_statement: str
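
The `Annotated[List[str], add]` annotation on `steps` is what lets each node return only its own step while the state keeps the full trail: LangGraph treats the second argument as a reducer and uses it to merge a node's update into the existing value, whereas keys without a reducer are simply overwritten. The sketch below imitates that merge in plain Python. `merge_update` and `REDUCERS` are hypothetical names written for illustration and are not part of LangGraph.

```python
from operator import add
from typing import Any, Callable, Dict

# Hypothetical reducer table mirroring OverallState: only "steps" has a reducer.
REDUCERS: Dict[str, Callable[[Any, Any], Any]] = {"steps": add}


def merge_update(state: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Imitate how a node's partial update is folded into the state."""
    merged = dict(state)
    for key, value in update.items():
        if key in REDUCERS and key in merged:
            # Reducer present: combine old and new (list concatenation here).
            merged[key] = REDUCERS[key](merged[key], value)
        else:
            # No reducer: the new value replaces the old one.
            merged[key] = value
    return merged


state = {"question": "What was the cast of the Casino?", "steps": []}
state = merge_update(state, {"next_action": "movie", "steps": ["guardrail"]})
state = merge_update(
    state,
    {
        "cypher_statement": "MATCH (m:Movie {title: 'Casino'})<-[:ACTED_IN]-(a) RETURN a.name",
        "steps": ["generate_cypher"],
    },
)
print(state["steps"])  # ['guardrail', 'generate_cypher']
```

This is why the nodes in the rest of the article return `"steps": ["..."]` with a single element.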

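First node: guardrails

The first node, guardrails, is a simple "guardrail" step: we check whether the question is about movies or their cast. If it is not, we tell the user that we cannot answer any other questions; otherwise, we move on to the Cypher generation node.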

from typing import Literal

from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import ChatOllama
from pydantic import BaseModel, Field

guardrails_system = """
As an intelligent assistant, your primary objective is to decide whether a given question is related to movies or not. 
If the question is related to movies, output "movie". Otherwise, output "end".
To make this decision, assess the content of the question and determine if it refers to any movie, actor, director, film industry, 
or related topics. Provide only the specified output: "movie" or "end".
"""

guardrails_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", guardrails_system),
        ("human", "{question}"),
    ]
)


class GuardrailsOutput(BaseModel):
    decision: Literal["movie", "end"] = Field(
        description="Decision on whether the question is related to movies"
    )


llm_llama = ChatOllama(model="llama3.1", temperature=0, verbose=True)

# The structured output constrains the model's reply to "movie" or "end".
guardrails_chain = guardrails_prompt | llm_llama.with_structured_output(GuardrailsOutput)


def guardrails(state: InputState) -> OverallState:
    """Decides if the question is related to movies or not."""
    guardrails_output = guardrails_chain.invoke({"question": state.get("question")})
    database_records = None
    if guardrails_output.decision == "end":
        # This message is later passed to the final-answer node as the "results".
        database_records = "This question is not about movies or their cast. Therefore I cannot answer this question."
    return {
        "next_action": guardrails_output.decision,
        "database_records": database_records,
        "steps": ["guardrail"],
    }

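This node uses llama3.1 and a prompt to decide whether the input question is related to movies. If it is, the node returns movie, and the following nodes generate Cypher and query the Neo4j graph database. If it is not, the node returns end, and the question is handed straight to the LLM, which produces the final reply.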

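Node: generate_cypher (generating the Neo4j query)

Enhancing the prompt with few-shot examples

Translating natural language into an accurate Cypher query is very challenging. One way to improve the process is to provide relevant few-shot examples that guide the LLM as it generates the query. To do this, we use SemanticSimilarityExampleSelector to dynamically select the most relevant examples.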

# Few-shot prompting
from langchain_core.example_selectors import SemanticSimilarityExampleSelector
from langchain_neo4j import Neo4jVector
from langchain_ollama import OllamaEmbeddings

examples = [
    {
        "question": "How many artists are there?",
        "query": "MATCH (a:Person)-[:ACTED_IN]->(:Movie) RETURN count(DISTINCT a)",
    },
    {
        "question": "Which actors played in the movie Casino?",
        "query": "MATCH (m:Movie {title: 'Casino'})<-[:ACTED_IN]-(a) RETURN a.name",
    },
    {
        "question": "How many movies has Tom Hanks acted in?",
        "query": "MATCH (a:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(m:Movie) RETURN count(m)",
    },
    {
        "question": "List all the genres of the movie Schindler's List",
        # The apostrophe must be escaped inside the single-quoted Cypher string.
        "query": "MATCH (m:Movie {title: 'Schindler\\'s List'})-[:IN_GENRE]->(g:Genre) RETURN g.name",
    },
    {
        "question": "Which actors have worked in movies from both the comedy and action genres?",
        "query": "MATCH (a:Person)-[:ACTED_IN]->(:Movie)-[:IN_GENRE]->(g1:Genre), (a)-[:ACTED_IN]->(:Movie)-[:IN_GENRE]->(g2:Genre) WHERE g1.name = 'Comedy' AND g2.name = 'Action' RETURN DISTINCT a.name",
    },
    {
        "question": "Which directors have made movies with at least three different actors named 'John'?",
        "query": "MATCH (d:Person)-[:DIRECTED]->(m:Movie)<-[:ACTED_IN]-(a:Person) WHERE a.name STARTS WITH 'John' WITH d, COUNT(DISTINCT a) AS JohnsCount WHERE JohnsCount >= 3 RETURN d.name",
    },
    {
        "question": "Identify movies where directors also played a role in the film.",
        "query": "MATCH (p:Person)-[:DIRECTED]->(m:Movie), (p)-[:ACTED_IN]->(m) RETURN m.title, p.name",
    },
    {
        "question": "Find the actor with the highest number of movies in the database.",
        "query": "MATCH (a:Actor)-[:ACTED_IN]->(m:Movie) RETURN a.name, COUNT(m) AS movieCount ORDER BY movieCount DESC LIMIT 1",
    },
]

embeddings = OllamaEmbeddings(model="nomic-embed-text")

# Neo4jVector stores the example embeddings in Neo4j, so the NEO4J_* connection
# settings (set in create_enhanced_graph below) must be available before this runs.
example_selector = SemanticSimilarityExampleSelector.from_examples(
    examples, embeddings, Neo4jVector, k=5, input_keys=["question"]
)
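
Conceptually, the selector embeds every example question, embeds the incoming question, and returns the k examples whose vectors are closest. The sketch below shows only that ranking step in plain Python. The bag-of-words `embed` function is a toy stand-in for `nomic-embed-text`, and `select_examples` here is a hypothetical helper rather than the LangChain method; the real selector stores the vectors in Neo4j and delegates the search to it.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def embed(text: str) -> Counter:
    """Toy embedding: lower-cased word counts (an assumption for illustration)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def select_examples(question: str, examples: List[Dict[str, str]], k: int) -> List[Dict[str, str]]:
    """Return the k examples whose questions are most similar to `question`."""
    q_vec = embed(question)
    ranked = sorted(examples, key=lambda ex: cosine(q_vec, embed(ex["question"])), reverse=True)
    return ranked[:k]


toy_examples = [
    {
        "question": "How many artists are there?",
        "query": "MATCH (a:Person)-[:ACTED_IN]->(:Movie) RETURN count(DISTINCT a)",
    },
    {
        "question": "Which actors played in the movie Casino?",
        "query": "MATCH (m:Movie {title: 'Casino'})<-[:ACTED_IN]-(a) RETURN a.name",
    },
    {
        "question": "How many movies has Tom Hanks acted in?",
        "query": "MATCH (a:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(m:Movie) RETURN count(m)",
    },
]

best = select_examples("Which actors played in Casino?", toy_examples, k=1)
print(best[0]["question"])  # Which actors played in the movie Casino?
```

With a real embedding model the ranking reflects meaning rather than shared words, which is why the article uses `OllamaEmbeddings` instead of anything like this toy function.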

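Generating Cypher from the prompt

Next we implement the Cypher generation chain. The prompt contains the schema of the graph data, the dynamically selected few-shot examples, and the user's question. Together these let the model generate a Cypher query that retrieves the relevant information from the graph database.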

import os

from langchain_core.output_parsers import StrOutputParser
from langchain_neo4j import Neo4jGraph


def create_enhanced_graph():
    """Create the Neo4j graph object."""
    os.environ["NEO4J_URI"] = "bolt://localhost:7687"
    os.environ["NEO4J_USERNAME"] = "neo4j"
    os.environ["NEO4J_PASSWORD"] = "neo4j"
    enhanced_graph = Neo4jGraph(enhanced_schema=True)
    # print(enhanced_graph.schema)
    return enhanced_graph


enhanced_graph = create_enhanced_graph()

text2cypher_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            (
                "Given an input question, convert it to a Cypher query. No pre-amble. "
                "Do not wrap the response in any backticks or anything else. Respond with a Cypher statement only!"
            ),
        ),
        (
            "human",
            (
                """You are a Neo4j expert. Given an input question, create a syntactically correct Cypher query to run.
Do not wrap the response in any backticks or anything else. Respond with a Cypher statement only!
Here is the schema information
{schema}

Below are a number of examples of questions and their corresponding Cypher queries.

{fewshot_examples}

User input: {question}
Cypher query:"""
            ),
        ),
    ]
)

llm_qwen = ChatOllama(model="qwen2.5", temperature=0, verbose=True)
text2cypher_chain = text2cypher_prompt | llm_qwen | StrOutputParser()


def generate_cypher(state: OverallState) -> OverallState:
    """Generates a cypher statement based on the provided schema and user input"""
    NL = "\n"
    # Format the selected examples as "Question: ... / Cypher: ..." pairs.
    fewshot_examples = (NL * 2).join(
        [
            f"Question: {el['question']}{NL}Cypher:{el['query']}"
            for el in example_selector.select_examples(
                {"question": state.get("question")}
            )
        ]
    )
    generated_cypher = text2cypher_chain.invoke(
        {
            "question": state.get("question"),
            "fewshot_examples": fewshot_examples,
            "schema": enhanced_graph.schema,
        }
    )
    return {"cypher_statement": generated_cypher, "steps": ["generate_cypher"]}
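
Even with the instruction not to wrap the response in backticks, a local model may occasionally return the statement inside a Markdown code fence, and the fenced text would then fail when executed. A small post-processing step can guard against that. The sketch below is one possible approach and is not in the original code: `strip_code_fences` is a hypothetical helper, and the set of language tags it recognizes is an assumption.

```python
FENCE = "`" * 3  # three backticks, built this way to keep the listing readable

# Language tags a model might put on the opening fence (an assumption).
KNOWN_TAGS = {"cypher", "cql", "sql"}


def strip_code_fences(text: str) -> str:
    """Return the statement without a surrounding Markdown code fence, if any."""
    cleaned = text.strip()
    if (
        len(cleaned) >= 2 * len(FENCE)
        and cleaned.startswith(FENCE)
        and cleaned.endswith(FENCE)
    ):
        inner = cleaned[len(FENCE):-len(FENCE)]
        # Drop a language tag such as "cypher" on the opening fence line.
        first_line, sep, rest = inner.partition("\n")
        if sep and first_line.strip().lower() in KNOWN_TAGS:
            inner = rest
        cleaned = inner.strip()
    return cleaned


wrapped = f"{FENCE}cypher\nMATCH (m:Movie) RETURN count(m)\n{FENCE}"
print(strip_code_fences(wrapped))  # MATCH (m:Movie) RETURN count(m)
print(strip_code_fences("MATCH (m:Movie) RETURN count(m)"))  # unchanged
```

If adopted, it could be applied to `generated_cypher` before `generate_cypher` returns it; unfenced output passes through untouched.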

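Node: executing the Cypher query

Now we add a node that executes the generated Cypher statement. If the graph database returns no results, we should tell the LLM so explicitly, because leaving the context empty sometimes leads the LLM to hallucinate.

You can add nodes such as query validation and query correction before this node to improve the accuracy of the results. Adding such nodes does not guarantee the expected improvement, however, because they can make mistakes themselves, so treat them with care.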

no_results = "I couldn't find any relevant information in the database"


def execute_cypher(state: OverallState) -> OverallState:
    """Executes the given Cypher statement."""
    records = enhanced_graph.query(state.get("cypher_statement"))
    return {
        # Pass an explicit "nothing found" message rather than an empty context.
        "database_records": records if records else no_results,
        "next_action": "end",
        "steps": ["execute_cypher"],
    }
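
One simple form the validation step mentioned above could take is a guard that refuses to run anything other than a read query, since the statement comes from an LLM and is executed as-is. The sketch below is an illustration under stated assumptions, not part of the original code: `is_read_only` is a hypothetical helper, the keyword list is not exhaustive, and matching keywords in the raw text can produce false positives (for example when a keyword appears inside a string literal). In a real deployment, connecting with a database user that has read-only privileges is the more robust safeguard.

```python
import re

# Cypher clauses that modify data or schema (illustrative, not exhaustive).
WRITE_KEYWORDS = ("CREATE", "MERGE", "DELETE", "DETACH", "SET", "REMOVE", "DROP")
_WRITE_RE = re.compile(r"\b(" + "|".join(WRITE_KEYWORDS) + r")\b", re.IGNORECASE)


def is_read_only(cypher: str) -> bool:
    """Return True if no write-clause keyword appears in the statement."""
    return _WRITE_RE.search(cypher) is None


print(is_read_only("MATCH (m:Movie {title: 'Casino'})<-[:ACTED_IN]-(a) RETURN a.name"))  # True
print(is_read_only("MATCH (n) DETACH DELETE n"))  # False
```

`execute_cypher` could call such a check before `enhanced_graph.query` and return an explanatory message instead of running the statement when the check fails.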

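Generating the final answer

The last step is to generate the answer. This combines the original question with the output from the graph database to produce a relevant answer.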

generate_final_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant"),
        (
            "human",
            (
                """Use the following results retrieved from a database to provide
a succinct, definitive answer to the user's question.

Respond as if you are answering the question directly.

Results: {results}
Question: {question}"""
            ),
        ),
    ]
)

generate_final_chain = generate_final_prompt | llm_llama | StrOutputParser()


def generate_final_answer(state: OverallState) -> OutputState:
    """Generates the final answer from the question and the database results."""
    final_answer = generate_final_chain.invoke(
        {"question": state.get("question"), "results": state.get("database_records")}
    )
    return {"answer": final_answer, "steps": ["generate_final_answer"]}

Building the workflow

Now we implement the LangGraph workflow.

First, define the conditional edge function:

def guardrails_condition(
    state: OverallState,
) -> Literal["generate_cypher", "generate_final_answer"]:
    if state.get("next_action") == "end":
        return "generate_final_answer"
    elif state.get("next_action") == "movie":
        return "generate_cypher"

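This function is attached after the guardrails node. It uses the decision that guardrails stored in next_action to choose the next node: "movie" routes to generate_cypher, and "end" routes directly to generate_final_answer.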
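
Because the routing function only reads `next_action` from the state, its behavior can be checked with plain dictionaries, without running a model or a database. The snippet below repeats the function so that it is self-contained, uses `dict` as a stand-in for `OverallState`, and makes the implicit `None` return explicit. The function returns `None` for any other value; the guardrails node avoids that case by constraining the decision to "movie" or "end".

```python
from typing import Literal, Optional


def guardrails_condition(
    state: dict,
) -> Optional[Literal["generate_cypher", "generate_final_answer"]]:
    """Same routing logic as above, with a plain dict standing in for OverallState."""
    if state.get("next_action") == "end":
        return "generate_final_answer"
    elif state.get("next_action") == "movie":
        return "generate_cypher"
    return None


print(guardrails_condition({"next_action": "movie"}))  # generate_cypher
print(guardrails_condition({"next_action": "end"}))    # generate_final_answer
```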

The following code connects the nodes and edges above into a complete workflow:

from langgraph.graph import END, START, StateGraph

langgraph = StateGraph(OverallState, input=InputState, output=OutputState)

# Nodes are registered under their function names.
langgraph.add_node(guardrails)
langgraph.add_node(generate_cypher)
langgraph.add_node(execute_cypher)
langgraph.add_node(generate_final_answer)

langgraph.add_edge(START, "guardrails")
langgraph.add_conditional_edges(
    "guardrails",
    guardrails_condition,
)
langgraph.add_edge("generate_cypher", "execute_cypher")
langgraph.add_edge("execute_cypher", "generate_final_answer")
langgraph.add_edge("generate_final_answer", END)

langgraph = langgraph.compile()
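
To make the wiring explicit, the order in which the compiled graph visits its nodes can be traced in plain Python: guardrails runs first, the conditional edge picks the next node, and both paths converge on `generate_final_answer`. The sketch below uses stubbed nodes (no LLM, no Neo4j) and a hypothetical `run` function. The keyword test inside the stub guardrails is an assumption made only so the example runs; the sketch imitates the order of execution and is not how LangGraph is implemented.

```python
from typing import Any, Callable, Dict, List, Optional

State = Dict[str, Any]


# Stub nodes: each returns a partial update, like the real nodes do.
def guardrails(state: State) -> State:
    decision = "movie" if "cast" in state["question"].lower() else "end"
    return {"next_action": decision, "steps": ["guardrail"]}


def generate_cypher(state: State) -> State:
    return {"cypher_statement": "<stub cypher>", "steps": ["generate_cypher"]}


def execute_cypher(state: State) -> State:
    return {"database_records": [], "next_action": "end", "steps": ["execute_cypher"]}


def generate_final_answer(state: State) -> State:
    return {"answer": "<stub answer>", "steps": ["generate_final_answer"]}


NODES: Dict[str, Callable[[State], State]] = {
    "guardrails": guardrails,
    "generate_cypher": generate_cypher,
    "execute_cypher": execute_cypher,
    "generate_final_answer": generate_final_answer,
}

# Fixed edges; the edge leaving "guardrails" is conditional and handled in run().
EDGES = {"generate_cypher": "execute_cypher", "execute_cypher": "generate_final_answer"}


def run(question: str) -> List[str]:
    """Walk the graph from guardrails to the end and return the visited steps."""
    state: State = {"question": question, "steps": []}
    node: Optional[str] = "guardrails"
    while node is not None:
        update = NODES[node](state)
        steps = state["steps"] + update.pop("steps")  # "steps" accumulates
        state.update(update)
        state["steps"] = steps
        if node == "guardrails":
            node = "generate_cypher" if state["next_action"] == "movie" else "generate_final_answer"
        else:
            node = EDGES.get(node)  # None after generate_final_answer
    return state["steps"]


print(run("What was the cast of the Casino?"))
print(run("What's the weather in Spain?"))
```

The first call visits all four nodes; the second skips Cypher generation and execution and goes from the guardrail straight to the final answer.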

Seeing it in action

Everything is ready. Let's ask the langgraph workflow two questions and see how it does:

def ask(question: str):
    response = langgraph.invoke({"question": question})
    print(f'response:\n{response["answer"]}')


ask("What's the weather in Spain?")
ask("What was the cast of the Casino?")

The first question has nothing to do with movies, so Neo4j was not queried and the LLM answered directly:

I'm happy to help with that! Unfortunately, I don't have access to real-time weather information for specific locations like Spain. However, I can suggest checking a reliable weather website or app, such as AccuWeather or Weather.com, for the most up-to-date forecast.

Would you like me to provide some general information about Spain's climate instead?

The second question took longer to run. The final answer was:

The cast of the movie "Casino" included James Woods, Joe Pesci, Robert De Niro, and Sharon Stone.

Nice!

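Summary

This article showed how to use LangGraph, which takes somewhat more setup, to build a graph-based workflow that handles queries against graph data.
In my view, the drawback of this approach is that it takes more work; the benefit is that the logic is clear and easy to customize, which makes it better suited to building more complex AI applications or agents in production.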


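Code

All the code and related resources for this article have been shared; see:

  • github
  • gitee

To make the code easy to find, the number at the start of each program file name matches the article number in this series.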

References

  • Build a Question Answering application over a Graph Database

🪐 Thanks for reading, and good luck 🪐
