什么是Agent Reflection
通常指 “智能體反思”,即讓 AI 系統通過自我反思機制優化決策或任務處理過程,類似人類通過復盤改進策略。
創建一個 SQL Agent
- 導入相關的庫
import openai
import pandas as pd
import sqlite3
from util.get_schema import get_schema
from util.make_openai_message import make_openai_message#Genimi_API_Key = "加上你的Gemini APIKey"client = openai.OpenAI(api_key=Genimi_API_Key,base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)
- 定義get_schema 來指定對應的sql表的數據類型等,以方便LLM生成更準確的sql
def get_schema():return """\
0|Team|TEXT eg. "Toronto Raptors"
1|NAME|TEXT eg. "Otto Porter Jr."
2|Jersey|TEXT eg. "0" and when null has a value "NA"
3|POS|TEXT eg. "PF"
4|AGE|INT eg. "22" in years
5|HT|TEXT eg. `6' 7"` or `6' 10"`
6|WT|TEXT eg. "232 lbs"
7|COLLEGE|TEXT eg. "Michigan" and when null has a value "--"
8|SALARY|TEXT eg. "$9,945,830" and when null has a value "--"
"""
- 定一個prompt模版方便調用
def make_openai_message(user, system=""):system_prompt = ""if system != "":system_prompt = {"role": "system","content": system,}message = [system_prompt, {"role": "user", "content": user}]return message
- 準備好LLM的prompt message
question = "What is the median weight in the NBA?"
system = f"""You are an NBA analyst with 15 years of experience writing complex SQL queries.
Consider the nba_roster table with the following schema:
{get_schema()}
Write a sqlite query to answer the following question.
Follow instructions exactly"""message = make_openai_message(question, system)
- 發起LL M提問, 提取出sqlite的SQL,以方便后面執行驗證
def get_llm_response(prompt):client_response = client.chat.completions.create(model="gemini-2.0-flash",messages=message,max_tokens=100,)generated_query = client_response.choices[0].message.contentif "sqlite" in generated_query:sql = generated_query.replace("```sqlite\n", "").replace("\n```", "")return sqlelse:return generated_querysql = get_llm_response(message)
print(sql)
- 連接sqlite,驗證sql執行情況
engine = sqlite3.connect("./nba_roster.db")
try:df = pd.read_sql(sql, con=engine)print(df)
except Exception as e:print(e)
結果驗證,驗證是否有幻覺
- 提問: What is the median weight in the NBA?
- LLM 生成的sql:
執行時候雖然沒有報錯,但是結果不對
SELECT WT FROM nba_roster WHERE WT != '' ORDER BY REPLACE(WT, ' lbs', '') LIMIT 1 OFFSET (SELECT COUNT(*) FROM nba_roster WHERE WT != '') / 2
- 正確的sql:
select
CAST(SUBSTR(WT, 1, INSTR(WT,' ')) as INTEGER) as percentile
from nba_roster
order by percentile limit 1 offset (select count(*) from nba_roster)/2;
增加反思步驟,讓LLM再生成
- 把錯誤的sql和執行結果給到大模型做反思,再生成
reflection = f"""Question: {question}.
Query: {sql} This query is invalid, {df} it cannot answer the question. Write a corrected sqlite query.""""
refection_message = make_openai_message(reflection, system)refection_message = make_openai_message(reflection, system)
sql_updated = get_llm_response(refection_message)
print(f'update query: {sql_updated}')
try:df = pd.read_sql(sql_updated, con=engine)print(df)
except Exception as e:print(e)
- 在加了反思功能之后,生成的sql執行出了正確的結果
參考
- https://learn.deeplearning.ai/courses/improving-accuracy-of-llm-applications/lesson/xf44y/create-an-evaluation 教程中用的meta-llama/Meta-Llama-3-8B-Instruct和Lamimi,我用的Gemini和openai
- nba_roster.db 文件已上傳