使用 COPY 加速 PostgreSQL 批量插入

文章目錄

1.copy命令介紹
2.copy vs insert的優勢
3.測量性能
4.結論

1.copy命令介紹

PostgreSQL 中的命令COPY是執行批量插入和數據遷移的強大工具。它允許快速有效地將大量數據插入表中。

COPY命令為批量插入和數據遷移提供了更簡單且更具成本效益的解決方案。
可以避免使用諸如：分布式處理工具、為數據庫添加更多的 CPU 和 RAM的方案或者其他的加速方案
因此，如果有一個任務需要在短時間內插入大量行，可以考慮使用COPY 命令。它可以顯著加快數據遷移和載入過程。
據說PostgreSQL 16 已將 COPY 的性能提高了 300% 以上
詳盡的有關copy命令的語法可參考官網

2.copy vs insert的優勢

Three-Column Table

	COPY	INSERT (multi-line)
Logging	One log for the entire load	One log for each line/entry
Network	No latency, data is streamed	Latency between inserts
Parsing	Only one parsing operation	Parsing overhead
Transaction	Single transaction	Each insert statement is a separate transaction
Query Plan	Simpler query execution plan	Lots of different query execution plans

總而言之，COPY 速度更快，因為與多行 INSERT 語句相比，它減少了日志記錄、網絡延遲、解析和事務管理的開銷。它允許更簡單的查詢執行計劃，從而實現更快、更高效的批量插入和數據遷移。一個權衡是它需要直接訪問文件系統，因此它可能并不適合所有需要插入數據的場景。另一個權衡是持久性，COPY 生成很少的日志，并在單個事務中執行所有日志，這使得它的風險更大。

3.測量性能

創建3個測試表

test=# create table t1 (id1 bigint,id2 bigint);
CREATE TABLE
Time: 7.744 ms
test=# create table t2 (id1 bigint,id2 bigint);
CREATE TABLE
Time: 8.680 ms
test=# create table t3 (id1 bigint,id2 bigint);
CREATE TABLE
Time: 0.924 ms

向t1插入1千萬筆測試資料，產生size 422MB的測試表

test=# insert into t1 select generate_series(1,10000000),generate_series(10000000,1,-1);
INSERT 0 10000000
Time: 11933.658 ms (00:11.934)
test=# select count(1),pg_size_pretty(pg_relation_size('t1')) from t1;count   | pg_size_pretty 
----------+----------------10000000 | 422 MB(1 row)Time: 377.028 ms

匯出成csv文件備用

test=# \copy t2 from '/var/lib/postgresql/t1.csv';
COPY 10000000
Time: 5997.302 ms (00:05.997)

驗證匯出的csv文件的數據行數與大小

postgres@pgd-prod01:~$ cat t1.csv|wc -l
10000000
postgres@pgd-prod01:~$ ls -alh|grep t1;
-rw-rw-r--  1 postgres postgres  151M Nov 18 11:26 t1.csv

test=# insert into t3 select * from t1;
INSERT 0 10000000
Time: 9811.316 ms (00:09.811)

4.結論

最后測試結果表明，COPY 命令與 INSERT 命令相比具有更高的效率，速度上的差異是相當顯著的，當插入同樣的1仟萬筆數據時，copy費時5997.302 ms,而insert費時9811.316 ms,相較insert而言，節約40%的時間，這是在postgresql 10版本的測試，postgresql 16據說提升更多

本文來自互聯網用戶投稿，該文觀點僅代表作者本人，不代表本站立場。本站僅提供信息存儲空間服務，不擁有所有權，不承擔相關法律責任。
如若轉載，請注明出處：http://www.pswp.cn/news/161145.shtml
繁體地址，請注明出處：http://hk.pswp.cn/news/161145.shtml
英文地址，請注明出處：http://en.pswp.cn/news/161145.shtml

如若內容造成侵權/違法違規/事實不符，請聯系多彩編程網進行投訴反饋email:809451989@qq.com，一經查實，立即刪除！