prompts.py 中的提示詞模板詳解
文件中定義了兩個核心提示詞模板:REASON_PROMPT
和 RELEVANT_EXTRACTION_PROMPT
。這兩個模板在 DeepResearcher 的推理過程中扮演著關鍵角色。下面我將詳細解析這兩個模板的結構和功能。
REASON_PROMPT 詳解
REASON_PROMPT
是用于指導語言模型進行推理和搜索的主要提示詞。它的設計非常精巧,包含多個關鍵部分:
1. 角色與能力定義
"You are a reasoning assistant with the ability to perform dataset searches to help "
"you answer the user's question accurately. You have special tools:\n\n"
這部分明確定義了語言模型的角色(推理助手)和核心能力(執行數據集搜索)。通過明確的角色定位,幫助模型理解其應該如何行動。
2. 工具使用說明
f"- To perform a search: write {BEGIN_SEARCH_QUERY} your query here {END_SEARCH_QUERY}.\n"
f"Then, the system will search and analyze relevant content, then provide you with helpful information in the format {BEGIN_SEARCH_RESULT} ...search results... {END_SEARCH_RESULT}.\n\n"
f"You can repeat the search process multiple times if necessary. The maximum number of search attempts is limited to {MAX_SEARCH_LIMIT}.\n\n"
這部分詳細說明了如何使用搜索工具:
- 使用特定標記包裝搜索查詢
- 系統將返回的結果格式
- 可以多次搜索的規則
- 搜索次數的限制
通過明確的工具使用說明,模型知道如何正確格式化其輸出以觸發搜索功能。
3. 示例學習 - 示例1
這是一個復雜問題的完整示例,展示了如何通過多輪搜索解決比較類問題:
"-- Example 1 --\n"
"Question: \"Are both the directors of Jaws and Casino Royale from the same country?\"\n"
"Assistant:\n"
f" {BEGIN_SEARCH_QUERY}Who is the director of Jaws?{END_SEARCH_QUERY}\n\n"
"User:\n"
f" {BEGIN_SEARCH_RESULT}\nThe director of Jaws is Steven Spielberg...\n{END_SEARCH_RESULT}\n\n"
"Continues reasoning with the new information.\n"
"Assistant:\n"
f" {BEGIN_SEARCH_QUERY}Where is Steven Spielberg from?{END_SEARCH_QUERY}\n\n"
"User:\n"
f" {BEGIN_SEARCH_RESULT}\nSteven Allan Spielberg is an American filmmaker...\n{END_SEARCH_RESULT}\n\n"
"Continues reasoning with the new information...\n\n"
"Assistant:\n"
f" {BEGIN_SEARCH_QUERY}Who is the director of Casino Royale?{END_SEARCH_QUERY}\n\n"
"User:\n"
f" {BEGIN_SEARCH_RESULT}\nCasino Royale is a 2006 spy film directed by Martin Campbell...\n{END_SEARCH_RESULT}\n\n"
"Continues reasoning with the new information...\n\n"
"Assistant:\n"
f" {BEGIN_SEARCH_QUERY}Where is Martin Campbell from?{END_SEARCH_QUERY}\n\n"
"User:\n"
f" {BEGIN_SEARCH_RESULT}\nMartin Campbell (born 24 October 1943) is a New Zealand film and television director...\n{END_SEARCH_RESULT}\n\n"
"Continues reasoning with the new information...\n\n"
"Assistant:\nIt's enough to answer the question\n"
這個示例展示了:
- 如何將復雜問題分解為多個簡單查詢
- 如何基于前一步的結果構建下一步的查詢
- 如何在獲取足夠信息后停止搜索
- 正確的標記使用方式
4. 示例學習 - 示例2
這是一個較簡單問題的示例,展示了如何通過兩輪搜索解決事實查詢問題:
"-- Example 2 --\n"
"Question: \"When was the founder of craigslist born?\"\n"
"Assistant:\n"
f" {BEGIN_SEARCH_QUERY}Who was the founder of craigslist?{END_SEARCH_QUERY}\n\n"
"User:\n"
f" {BEGIN_SEARCH_RESULT}\nCraigslist was founded by Craig Newmark...\n{END_SEARCH_RESULT}\n\n"
"Continues reasoning with the new information.\n"
"Assistant:\n"
f" {BEGIN_SEARCH_QUERY} When was Craig Newmark born?{END_SEARCH_QUERY}\n\n"
"User:\n"
f" {BEGIN_SEARCH_RESULT}\nCraig Newmark was born on December 6, 1952...\n{END_SEARCH_RESULT}\n\n"
"Continues reasoning with the new information...\n\n"
"Assistant:\nIt's enough to answer the question\n"
這個示例展示了:
- 如何處理需要兩步推理的簡單問題
- 如何在第一步獲取實體信息后,在第二步查詢該實體的具體屬性
5. 注意事項與提醒
"**Remember**:\n"
f"- You have a dataset to search, so you just provide a proper search query.\n"
f"- Use {BEGIN_SEARCH_QUERY} to request a dataset search and end with {END_SEARCH_QUERY}.\n"
"- The language of query MUST be as the same as 'Question' or 'search result'.\n"
"- If no helpful information can be found, rewrite the search query to be less and precise keywords.\n"
"- When done searching, continue your reasoning.\n\n"
'Please answer the following question. You should think step by step to solve it.\n\n'
這部分提供了重要的使用提醒:
- 強調模型的角色是提供搜索查詢,而不是直接回答
- 重申標記的正確使用方式
- 強調查詢語言需要與問題或搜索結果語言一致
- 提供查詢優化策略
- 指導在完成搜索后繼續推理
- 鼓勵步驟化思考
RELEVANT_EXTRACTION_PROMPT 詳解
RELEVANT_EXTRACTION_PROMPT
是用于從檢索到的文檔中提取相關信息的提示詞。它的結構更加正式,采用了任務指導的形式:
1. 任務說明
"""Task Instruction:You are tasked with reading and analyzing web pages based on the following inputs: **Previous Reasoning Steps**, **Current Search Query**, and **Searched Web Pages**. Your objective is to extract relevant and helpful information for Current Search Query from the Searched Web Pages and seamlessly integrate this information into the Previous Reasoning Steps to continue reasoning for the original question.
這部分明確定義了任務的性質(分析網頁)、輸入(之前的推理步驟、當前搜索查詢、搜索到的網頁)和目標(提取相關信息并整合到推理過程中)。
2. 詳細指南
Guidelines:1. Analyze the Searched Web Pages:- Carefully review the content of each searched web page.- Identify factual information that is relevant to the Current Search Query and can aid in the reasoning process for the original question.2. Extract Relevant Information:- Select the information from the Searched Web Pages that directly contributes to advancing the **Previous Reasoning Steps**.- Ensure that the extracted information is accurate and relevant.
這部分提供了兩個主要步驟的詳細指南:
- 分析網頁內容,找出與當前查詢相關的事實信息
- 提取能夠推進推理過程的相關信息,確保準確性和相關性
3. 輸出格式規范
3. Output Format:- If the web pages provide helpful information for current search query: Present the information beginning with `**Final Information**` as shown below.- The language of query MUST BE as the same as 'Search Query' or 'Web Pages'.\n"Final Information[Helpful information]- If the web pages do not provide any helpful information for current search query: Output the following text.Final InformationNo helpful information found.
這部分詳細規定了輸出格式:
- 有用信息的格式:以"Final Information"開頭,后跟有用信息
- 無用信息的格式:固定文本"No helpful information found."
- 強調語言一致性要求
4. 輸入參數占位符
Inputs:- Previous Reasoning Steps: {prev_reasoning}- Current Search Query: {search_query}- Searched Web Pages: {document}
這部分定義了三個關鍵輸入參數的占位符:
{prev_reasoning}
:之前的推理步驟,提供上下文{search_query}
:當前的搜索查詢,指明信息提取的焦點{document}
:搜索到的網頁內容,是信息提取的源
兩個提示詞的協同工作
這兩個提示詞在 DeepResearcher 的工作流程中協同工作:
REASON_PROMPT
指導語言模型生成推理步驟和搜索查詢,形成思考過程的骨架RELEVANT_EXTRACTION_PROMPT
指導語言模型從檢索到的信息中提取相關內容,填充思考過程的細節
通過這種分工,系統能夠實現:
- 清晰的推理鏈條
- 精準的信息檢索
- 相關信息的有效提取
- 連貫的思考過程
這兩個提示詞的精心設計是 DeepResearcher 能夠模擬人類思考過程、解決復雜問題的關鍵所在。