Routing Mechanisms and Query Construction Strategies
- Introduction
- Routing
- Logical Routing
- ChatOpenAI
- Structured
- Routing Datasource
- Conclusion
- Semantic Routing
- Embedding & LLM
- Prompt
- Routing Prompt
- Conclusion
- Query Construction
- Grab Youtube video information
- Structured
- Prompt
- Github
- References
Introduction
This article follows the setup from the previous post: start a proxy server locally, then use OpenAI's ChatOpenAI as the chat client, pointing its requests at our local server.
Routing
In a traditional RAG architecture, every query goes through the same Retriever and Prompt template. In multi-source or multi-task systems this leads to irrelevant retrieval results, imprecise content, and poorly disambiguated user intent. A Routing mechanism addresses this by directing each user question to the most relevant knowledge source or processing pipeline, improving both the accuracy and the efficiency of the answer.
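Before the LLM-based approach below, the core idea can be sketched with a plain keyword-based router. This is a hypothetical illustration only; the keyword lists are made up and not part of the original code:

```python
# Minimal keyword-based routing sketch: map a question to a knowledge source.
# The sources match the later example; the keywords are illustrative guesses.
def route(question: str) -> str:
    rules = {
        "python_docs": ["python", "pip", "pandas"],
        "js_docs": ["javascript", "npm", "node"],
        "golang_docs": ["golang", "go module", "goroutine"],
    }
    q = question.lower()
    for source, keywords in rules.items():
        if any(k in q for k in keywords):
            return source
    return "python_docs"  # fallback source

print(route("How do I install a package with npm?"))  # js_docs
```

An LLM-based router replaces the brittle keyword table with semantic understanding, but the interface is the same: question in, data-source name out.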
Logical Routing
ChatOpenAI
As described in the introduction, we use the ChatOpenAI chat client, but not a GPT model.
os.environ['VLLM_USE_MODELSCOPE'] = 'True'
chat = ChatOpenAI(
    model='Qwen/Qwen3-0.6B',
    openai_api_key="EMPTY",
    openai_api_base='http://localhost:8000/v1',
    stop=['<|im_end|>'],
    temperature=0,
)
Fig. 1 Logical Routing framework diagram
Structured
The Prompt instructs the LLM to choose the most relevant data source based on the programming language the question refers to. The prompt is then piped (the | operator) into with_structured_output, which constrains the model to generate output matching the RouteQuery schema, i.e., structured data.
class RouteQuery(BaseModel):
    """Route a user query to the most relevant datasource."""

    datasource: Literal["python_docs", "js_docs", "golang_docs"] = Field(
        ...,
        description="Given a user question choose which datasource would be most relevant for answering their question",
    )

# with_structured_output makes the LLM strictly follow the RouteQuery schema:
# it emits JSON in that shape, which is then parsed back into a Python object.
structured_llm = chat.with_structured_output(RouteQuery)

# Prompt
system = """You are an expert at routing a user question to the appropriate data source.

Based on the programming language the question is referring to, route it to the relevant data source."""

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "{question}"),
    ]
)

# Define router
router = prompt | structured_llm
After running this, the LLM will only ever generate one of ["python_docs", "js_docs", "golang_docs"], because datasource: Literal["python_docs", "js_docs", "golang_docs"] restricts the field to those three candidate values. The output looks like this:
"""
數據結構:
RouteQuery(datasource='python_docs' # 或 'js_docs' 或 'golang_docs'
)
JSON格式:
{"datasource": "python_docs"
}
"""
Fig. 2 Data structuring flow chart
Routing Datasource
The Question names the programming language to be identified. After the structured-output step from the previous Structured section, we get a structured result, and choose_route simply matches it to the corresponding data source.
question = """Why doesn't the following code work:from langchain_core.prompts import ChatPromptTemplateprompt = ChatPromptTemplate.from_messages(["human", "speak in {language}"])
prompt.invoke("french")
"""# result = router.invoke({"question": question})
def choose_route(result):if "python_docs" in result.datasource.lower():### Logic herereturn "chain for python_docs"elif "js_docs" in result.datasource.lower():### Logic herereturn "chain for js_docs"else:### Logic herereturn "golang_docs"full_chain = router | RunnableLambda(choose_route)full_chain.invoke({"question": question})
Conclusion
Using a programming-language identification example, the section above showed how to produce structured output and how to use it to select the matching data source. In practice, this lets us route a user's question to the most relevant data source or vector database, which can greatly improve recall.
Semantic Routing
Embedding & LLM
Define an open-source Embedding model and a Text Generation model from the ModelScope community.
embedding = ModelScopeEmbeddings(model_id='iic/nlp_corom_sentence-embedding_english-base')

# Deploy an OpenAI-compatible server with vLLM, then use ChatOpenAI
os.environ['VLLM_USE_MODELSCOPE'] = 'True'
chat = ChatOpenAI(
    model='Qwen/Qwen3-0.6B',
    openai_api_key="EMPTY",
    openai_api_base='http://localhost:8000/v1',
    stop=['<|im_end|>'],
    temperature=0,
)
Prompt
Define two prompt templates and embed them, so that the appropriate prompt can later be chosen based on the user's question.
physics_template = """You are a very smart physics professor. \
You are great at answering questions about physics in a concise and easy to understand manner. \
When you don't know the answer to a question you admit that you don't know.Here is a question:
{query}"""math_template = """You are a very good mathematician. You are great at answering math questions. \
You are so good because you are able to break down hard problems into their component parts, \
answer the component parts, and then put them together to answer the broader question.Here is a question:
{query}"""prompt_templates = [physics_template, math_template]
prompt_embeddings = embedding.embed_documents(prompt_templates)
Fig. 3 Semantic Routing framework diagram
Routing Prompt
Embed the user query so that the cosine similarity between the query and each prompt template can be computed; the index with the highest similarity selects the corresponding prompt, which is then handed to the LLM.
# Pick the template with the highest cosine similarity to the input query
def prompt_router(input):
    # Embed the query
    query_embedding = embedding.embed_query(input["query"])
    # Cosine similarity between the query and each prompt template
    similarity = cosine_similarity([query_embedding], prompt_embeddings)[0]
    # Take the template at the index with the highest similarity
    most_similar = prompt_templates[similarity.argmax()]
    # Chosen prompt
    print("Using MATH" if most_similar == math_template else "Using PHYSICS")
    return PromptTemplate.from_template(most_similar)

# RunnablePassthrough returns its input unchanged
chain = (
    {"query": RunnablePassthrough()}
    | RunnableLambda(prompt_router)
    | chat
    | StrOutputParser()
)

answer = chain.invoke("What's a black hole")
print(answer)
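The routing decision itself is just an argmax over cosine similarities. With toy vectors the mechanics look like this; this is a pure-stdlib sketch with made-up 3-d "embeddings", whereas the real code gets its vectors from the embedding model above:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy 3-d "embeddings" for the two templates and a query (hypothetical values)
physics_vec = [0.9, 0.1, 0.0]
math_vec = [0.1, 0.9, 0.0]
query_vec = [0.8, 0.2, 0.1]

sims = [cosine(query_vec, v) for v in (physics_vec, math_vec)]
best = max(range(len(sims)), key=sims.__getitem__)
print("PHYSICS" if best == 0 else "MATH")  # PHYSICS
```

Because the query vector points mostly in the same direction as the physics vector, the argmax selects index 0, exactly as similarity.argmax() does over the real embeddings.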
Conclusion
The section above described a strategy for dynamically matching a Prompt to the user's query.
Query Construction
Grab Youtube video information
Running the code below from mainland China may raise an urllib.error.HTTPError: HTTP Error 400: Bad Request exception. To work around this, we construct the datasource another way.
docs = YoutubeLoader.from_youtube_url(
    "https://www.youtube.com/watch?v=pbAd8O1Lvm4", add_video_info=False
).load()
print(docs[0].metadata)
Use subprocess.run to invoke yt-dlp, which downloads the video information and dumps it as JSON; from that information we then build the datasource we need.
result = subprocess.run(
    ["yt-dlp", "--dump-json", "https://www.youtube.com/watch?v=pbAd8O1Lvm4"],
    capture_output=True, text=True,
)
video_info = json.loads(result.stdout)

metadata = {
    "source": 'pbAd8O1Lvm4',
    "title": video_info.get("title", "Unknown"),
    "description": video_info.get("description", "Unknown"),
    "view_count": video_info.get("view_count", 0),
    "thumbnail_url": video_info.get("thumbnail", ""),
    "publish_date": datetime.strptime(
        video_info.get("upload_date", "19700101"), "%Y%m%d"
    ).strftime("%Y-%m-%d 00:00:00"),
    "length": video_info.get("duration", 0),
    "author": video_info.get("uploader", "Unknown"),
}
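The date handling above can be checked in isolation: yt-dlp reports upload_date as a compact YYYYMMDD string, and the strptime/strftime pair expands it into the timestamp format stored in the metadata. A small stdlib check with a sample value:

```python
from datetime import datetime

def to_publish_date(upload_date: str) -> str:
    """Convert yt-dlp's YYYYMMDD upload_date into 'YYYY-MM-DD 00:00:00'."""
    return datetime.strptime(upload_date, "%Y%m%d").strftime("%Y-%m-%d 00:00:00")

print(to_publish_date("20240131"))  # 2024-01-31 00:00:00
```

The "19700101" default in the metadata code above serves the same purpose: if yt-dlp returns no upload_date, the pipeline still produces a well-formed (epoch) timestamp instead of crashing.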
Structured
The following defines a structured search-query schema, just like Structured in the Routing section. Its purpose is to convert natural language into a structured search query.
class TutorialSearch(BaseModel):
    """Search over a database of tutorial videos about a software library."""

    content_search: str = Field(
        ...,
        description="Similarity search query applied to video transcripts.",
    )
    title_search: str = Field(
        ...,
        description=(
            "Alternate version of the content search query to apply to video titles. "
            "Should be succinct and only include key words that could be in a video "
            "title."
        ),
    )
    min_view_count: Optional[int] = Field(
        None,
        description="Minimum view count filter, inclusive. Only use if explicitly specified.",
    )
    max_view_count: Optional[int] = Field(
        None,
        description="Maximum view count filter, exclusive. Only use if explicitly specified.",
    )
    earliest_publish_date: Optional[date] = Field(
        None,
        description="Earliest publish date filter, inclusive. Only use if explicitly specified.",
    )
    latest_publish_date: Optional[date] = Field(
        None,
        description="Latest publish date filter, exclusive. Only use if explicitly specified.",
    )
    min_length_sec: Optional[int] = Field(
        None,
        description="Minimum video length in seconds, inclusive. Only use if explicitly specified.",
    )
    max_length_sec: Optional[int] = Field(
        None,
        description="Maximum video length in seconds, exclusive. Only use if explicitly specified.",
    )

    def pretty_print(self) -> None:
        for field in self.__fields__:
            if getattr(self, field) is not None and getattr(self, field) != getattr(
                self.__fields__[field], "default", None
            ):
                print(f"{field}: {getattr(self, field)}")
Fig. 4 Data structuring flow chart
Prompt
The Prompt below guides the model to convert the user's natural-language question into a structured database query.
system = """You are an expert at converting user questions into database queries. \
You have access to a database of tutorial videos about a software library for building LLM-powered applications. \
Given a question, return a database query optimized to retrieve the most relevant results.If there are acronyms or words you are not familiar with, do not try to rephrase them."""
prompt = ChatPromptTemplate.from_messages([("system", system),("human", "{question}"),]
)# 根據問題語義,將問題中涉及的內容映射到 metadata 的結構化字段中。
structured_llm = llm.with_structured_output(TutorialSearch)
query_analyzer = prompt | structured_llm
For a user question, the Question is structured according to this Prompt; the goal is to map the words in the question onto the appropriate fields of the datasource.
query_analyzer.invoke({"question": "rag from scratch"}).pretty_print()
After running the code above, the LLM automatically constructs appropriate structured data based on the semantics:
content_search: rag from scratch
title_search: rag from scratch
As another example, if the Question is "videos published on Chat Langchain in 2023", the date should clearly map to the date-related fields of the datasource, namely earliest_publish_date and latest_publish_date.
query_analyzer.invoke(
    {"question": "videos on chat langchain published in 2023"}
).pretty_print()
content_search: chat langchain
title_search: 2023
earliest_publish_date: 2023-01-01
latest_publish_date: 2024-01-01
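The year-to-range mapping the LLM performed above ("published in 2023" becomes earliest 2023-01-01 inclusive and latest 2024-01-01 exclusive) can be expressed directly. A stdlib sketch of that convention, matching the inclusive/exclusive semantics stated in the field descriptions:

```python
from datetime import date

def year_to_range(year: int) -> tuple:
    """Inclusive earliest date and exclusive latest date covering a whole year."""
    return date(year, 1, 1), date(year + 1, 1, 1)

earliest, latest = year_to_range(2023)
print(earliest, latest)  # 2023-01-01 2024-01-01
```

Using a half-open interval [earliest, latest) avoids off-by-one ambiguity at the year boundary: December 31st 23:59 falls inside the range, while January 1st of the next year does not.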
Github
https://github.com/FranzLiszt-1847/LLM
References
[1] https://github.com/langchain-ai/rag-from-scratch/blob/main/rag_from_scratch_5_to_9.ipynb