[Text2SQL Paper] How to prompt LLMs for Text2SQL

Paper: How to Prompt LLMs for Text-to-SQL: A Study in Zero-shot, Single-domain, and Cross-domain Settings

arXiv:2305.11853, NeurIPS 2023

Code: GitHub

1. Paper at a Glance

The paper evaluates different prompt construction strategies in three common Text2SQL in-context learning (ICL) settings.

2. Text2SQL ICL settings

The paper runs its evaluation under the following three Text2SQL settings:

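Zero-shot Text2SQL: the input is a task instruction, a test question, and the corresponding DB. The LLM infers the SQL directly, without any demonstrations.
Single-domain Few-shot Text2SQL: the ICL demonstrations are constructed from the same database as the test question. This setting evaluates how well an LLM performs Text2SQL with minimal in-domain training data.
Cross-domain Few-shot Text2SQL: the ICL demonstrations are constructed from databases different from the one the test question targets. This setting evaluates how well an LLM generalizes by learning from out-of-domain demonstrations.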
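
The three settings differ only in which demonstrations end up in the prompt. As a minimal sketch (the class names, the `build_prompt` function, and the `-- Question:` layout are all hypothetical and not taken from the paper), the assembly could look like this:

```python
from dataclasses import dataclass, field


@dataclass
class Demo:
    """One question-SQL demonstration pair."""
    question: str
    sql: str


@dataclass
class DemoDatabase:
    """A database prompt plus the demonstrations built from that database."""
    db_prompt: str
    demos: list = field(default_factory=list)


def build_prompt(instruction, test_db_prompt, test_question,
                 in_domain_demos=(), cross_domain_dbs=()):
    """Assemble a Text2SQL prompt (the layout is an assumption, not the paper's).

    - zero-shot: leave both demo arguments empty
    - single-domain few-shot: pass in_domain_demos (same DB as the test question)
    - cross-domain few-shot: pass cross_domain_dbs (other DBs with their own demos)
    """
    parts = [instruction]
    # Out-of-domain demonstrations come first, each group led by its own DB prompt.
    for db in cross_domain_dbs:
        parts.append(db.db_prompt)
        for demo in db.demos:
            parts.append(f"-- Question: {demo.question}\n{demo.sql}")
    # The test database, followed by any in-domain demonstrations.
    parts.append(test_db_prompt)
    for demo in in_domain_demos:
        parts.append(f"-- Question: {demo.question}\n{demo.sql}")
    # The test question goes last so the model continues with the SQL.
    parts.append(f"-- Question: {test_question}")
    return "\n\n".join(parts)
```

Calling `build_prompt` with no demonstrations gives the zero-shot prompt; the two few-shot settings reuse the same function with different arguments.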

3. Prompt Construction

The paper tests the effect of different prompt constructions in each Text2SQL setting.

A prompt consists of a Database Prompt and, in the few-shot settings, a Demonstration Prompt.

3.1 Database Prompt

A relational DB consists of a database schema and database content:

The database schema is made up of the table headers and the relationships between tables.
The database content is the data stored in the tables.

3.1.1 Prompt structure for the database schema

The figure below shows the various prompt structures for the database schema used in previous work:

[Figure: prompt structures for the database schema]

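To keep the text consistent, the paper also normalizes the db schema and the SQL: every word in the SQL other than database content is converted to lowercase, and spaces and line breaks are unified. The figure below shows an example before and after normalization:

[Figure: db schema and SQL before and after normalization]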
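
A minimal sketch of this kind of normalization follows. It assumes that database content appears only inside quoted literals, so everything outside quotes is lowercased and runs of whitespace are collapsed to one space. The function names are hypothetical, and the paper's own normalization may differ in detail (for example, in how it places line breaks in a schema).

```python
import re

# Single- or double-quoted literals, allowing doubled quotes ('' or "") inside.
# Assumption: anything inside quotes is database content and must stay as written.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\"(?:[^\"]|\"\")*\"")


def _clean(fragment: str) -> str:
    """Collapse whitespace runs to one space and lowercase the fragment."""
    return re.sub(r"\s+", " ", fragment).lower()


def normalize_sql(sql: str) -> str:
    """Lowercase everything outside quoted literals and unify whitespace."""
    out = []
    pos = 0
    for match in _LITERAL.finditer(sql):
        out.append(_clean(sql[pos:match.start()]))  # keywords, identifiers
        out.append(match.group(0))                  # literal kept untouched
        pos = match.end()
    out.append(_clean(sql[pos:]))
    return "".join(out).strip()


print(normalize_sql("SELECT  Name\nFROM Singer WHERE Country = 'France'"))
# select name from singer where country = 'France'
```

Note that double-quoted identifiers in a schema would also be left unchanged by this sketch, since it cannot tell them apart from double-quoted string values.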

3.1.2 Prompt structure for the database content

Previous work has also shown that seeing examples of the database content can improve model performance.

The figure below shows the prompt styles for the database content:

[Figure: prompt styles for the database content]

InsertRow: shows a few rows of each table as INSERT INTO statements.
SelectRow: shows the result of the query SELECT * FROM T LIMIT X.
SelectCol: shows multiple rows of data in a column-wise format.

The paper proposes using SELECT DISTINCT [Column] FROM [Table] LIMIT R to list R values of a column, which avoids showing duplicates.
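
The three content styles can be sketched with Python's standard `sqlite3` module. The function names and the exact text layout of each block are assumptions made for illustration; only the underlying queries come from the description above.

```python
import sqlite3


def _sql_literal(value):
    """Render a Python value as a SQL literal."""
    if value is None:
        return "null"
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def insert_row(conn, table, n=3):
    """InsertRow style: n rows rendered as INSERT INTO statements."""
    rows = conn.execute(f'SELECT * FROM "{table}" LIMIT {int(n)}').fetchall()
    return "\n".join(
        f"insert into {table} values ({', '.join(_sql_literal(v) for v in row)});"
        for row in rows
    )


def select_row(conn, table, n=3):
    """SelectRow style: the query, a header line, and the returned rows."""
    cur = conn.execute(f'SELECT * FROM "{table}" LIMIT {int(n)}')
    header = " ".join(col[0] for col in cur.description)
    lines = [" ".join(str(v) for v in row) for row in cur.fetchall()]
    return "\n".join([f"select * from {table} limit {int(n)};", header, *lines])


def select_col(conn, table, r=3):
    """SelectCol style: up to r distinct values per column, one line per column."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    lines = []
    for col in columns:
        values = conn.execute(
            f'SELECT DISTINCT "{col}" FROM "{table}" LIMIT {int(r)}'
        ).fetchall()
        lines.append(f"{col}: " + ", ".join(_sql_literal(v[0]) for v in values))
    return "\n".join(lines)
```

With a table whose `country` column holds the same value in several rows, `insert_row` and `select_row` repeat that value once per row, while `select_col` lists it only once because of `DISTINCT`.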

3.2 Demonstration Prompt

In the few-shot settings, demonstrations are placed in the prompt text that is fed to the LLM.

In the single-domain few-shot setting, a number of question-SQL pairs are included as demonstrations.

In the cross-domain few-shot setting, previous work did one of the following:

either all N examples come from the same db,
or each of the N examples comes from a different db.

The paper considers a more general scenario: the N examples are drawn from M dbs, each contributing K question-SQL pairs, so that $M \times K = N$.
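
One possible way to draw such a demonstration set is sketched below. The function name, the `pairs_by_db` structure, and the uniform random sampling are assumptions; the pool is expected to contain only demonstration databases, not the one the test question targets.

```python
import random


def sample_demonstrations(pairs_by_db, m, k, seed=0):
    """Pick M databases and K question-SQL pairs from each, so N = M * K.

    pairs_by_db maps a db id to a list of (question, sql) tuples.
    """
    rng = random.Random(seed)
    # Only databases with at least K pairs can contribute.
    eligible = sorted(db for db, pairs in pairs_by_db.items() if len(pairs) >= k)
    if len(eligible) < m:
        raise ValueError(
            f"need {m} databases with at least {k} pairs, found {len(eligible)}"
        )
    chosen = rng.sample(eligible, m)
    return {db: rng.sample(pairs_by_db[db], k) for db in chosen}
```

Setting `m=1` reproduces the case where all N examples share one db, and `k=1` the case where every example comes from a different db.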

4. Experiments

The paper runs its experiments on the dev split of the Spider dataset and uses execution accuracy (EX) to compare the predicted SQL against the gold SQL.
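
The idea behind execution accuracy can be sketched as follows: run both queries against the database and compare what they return. This is a simplified illustration with hypothetical function names, not the official Spider evaluation script. In particular, it compares results as unordered multisets of rows and assumes the gold SQL always executes.

```python
import sqlite3
from collections import Counter


def execution_match(conn, predicted_sql, gold_sql):
    """True if both queries return the same multiset of rows on this database."""
    try:
        predicted = conn.execute(predicted_sql).fetchall()
    except (sqlite3.Error, sqlite3.Warning):
        return False  # a prediction that fails to run counts as wrong
    gold = conn.execute(gold_sql).fetchall()
    return Counter(predicted) == Counter(gold)


def execution_accuracy(examples):
    """examples: iterable of (conn, predicted_sql, gold_sql) triples."""
    results = [execution_match(conn, pred, gold) for conn, pred, gold in examples]
    return sum(results) / len(results) if results else 0.0
```

Because the comparison is on query results rather than on the SQL text, two differently written queries count as a match whenever they return the same rows.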

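The paper notes that a few dbs have long schemas, so selecting them as few-shot demonstrations can push the prompt past the LLM's token limit. For this reason, when constructing the CreateTable prompt, only dbs with fewer than 1000 tokens are used.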
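
A filter of this kind might look like the sketch below. The function name and the `count_tokens` parameter are hypothetical. The default counter simply splits on whitespace, which is a crude stand-in: a real run would pass in the tokenizer of the target LLM.

```python
def filter_dbs_by_prompt_length(create_table_prompts, max_tokens=1000,
                                count_tokens=None):
    """Keep the db ids whose CreateTable prompt has fewer than max_tokens tokens.

    create_table_prompts maps a db id to its CreateTable prompt text.
    """
    if count_tokens is None:
        # Assumption: whitespace-separated pieces, NOT a real LLM tokenizer.
        def count_tokens(text):
            return len(text.split())
    return [db for db, prompt in create_table_prompts.items()
            if count_tokens(prompt) < max_tokens]
```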

See the original paper for the detailed experimental setup.

5. Experimental Results

This section presents the empirical findings for Text2SQL in the zero-shot, single-domain, and cross-domain settings.

5.1 Zero-shot Text2SQL

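The zero-shot setting focuses on comparing different database prompt constructions. The figure below shows how Codex and ChatGPT perform with each database prompt:

[Figure: zero-shot results of Codex and ChatGPT with different database prompts]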

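Findings:

A normalized db schema and SQL lead to better performance.
The relationships between db tables and the table content both matter; including them effectively improves LLM performance.
Codex consistently outperforms ChatGPT on zero-shot Text2SQL.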

Based on these findings, the paper recommends combining Codex with the normalized CreateTable-SelectCol prompt construction for zero-shot Text2SQL.

5.2 Single-domain Text2SQL

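The figure below shows the execution accuracy of Codex and ChatGPT on single-domain Text2SQL with different numbers of in-domain examples:

[Figure: single-domain execution accuracy of Codex and ChatGPT by number of in-domain examples]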

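Conclusions:

In-domain demonstrations effectively improve LLM performance, and performance keeps improving as the number of examples grows.
LLMs can quickly learn table relationships from in-domain demonstrations but struggle to learn table content from them, so the table content prompt remains important.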

5.3 Cross-domain Text2SQL

The ICL demonstrations use M demonstration databases, each contributing K NLQ-SQL pairs.

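The heatmap below shows how the values of M and K affect accuracy (horizontal axis: M; vertical axis: K; the darker the color, the higher the accuracy):

[Figure: heatmap of accuracy over M (horizontal axis) and K (vertical axis)]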

See the original paper for the analysis of this experiment.

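In short, out-of-domain demonstrations strengthen the LLM's Text2SQL ability, but they do not provide DB-specific knowledge. Carefully constructing the Database Prompt therefore remains essential, which is consistent with the observations in the zero-shot setting.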

6. Summary

Overall, the paper compares the effect of various prompt constructions across three Text2SQL ICL settings and offers guidance for future research.
