【GaussDB】使用gdb定位GaussDB編譯package報錯

【GaussDB】使用gdb定位GaussDB編譯package報錯

背景

在某次遷移Oracle到GaussDB時,應用開發人員將改好的package在GaussDB里進行創建,沒有ERROR也沒有WARNING,但是編譯無效對象的時候報錯了。雖然已經找到了是哪個包編譯報錯,但是這個包有上萬行,而且里面也有好幾十個procedure,而報錯信息僅僅只有 ERROR: Failed to query the 323 type in the cache.,沒有上下文信息,連行號都沒有,根本無從判斷是哪里出了問題。

基本排查

  • 嘗試drop這個package,然后重建,再編譯,現象一樣
  • 嘗試重啟數據庫以清空全局PLSQL緩存,再編譯,現象一樣

這意味著這個問題與緩存無關,大概率也與其他依賴對象無關,所以暫時針對這個package進行排查。

對于不會gdb調試的人來說,要排查這個問題只能對著這個package,使用二分法來刪除里面的procedure,直到刪到某個procedure前后報錯發生變化,但要注意里面的procedure的依賴。當時就這么一路刪,最后的確發現了原因,但是耗費的時間非常久,解決依賴關系時還要手動改代碼。

那么有沒有一種方式能迅速定位是哪個procedure的問題么?
當然有,那就是使用gdb直接進行內核級別調試,因為Gauss系數據庫編譯package時,都是會逐個對里面的每一個procedure和function進行編譯。

gdb調試前置準備

由于已經找到了觸發這個報錯的package特征,因此下面就用最小化模擬用例來進行演示:

測試用例

create package pkg_test_4 is
procedure p1(i1 in varchar2,i2 out varchar2,i3 out varchar2);
end;  
/
create package body pkg_test_4 is
procedure p1(i1 in varchar2,i2 in varchar2,i3 out varchar2) isbeginnull;end;
end;  
/
alter package pkg_test_4 compile;

執行效果

gaussdb=# alter package pkg_test_4 compile;
gaussdb=# create or replace package pkg_test_4 is
gaussdb$# procedure p1(i1 in varchar2,
gaussdb$#              i2 out varchar2,
gaussdb$#              i3 out varchar2);
gaussdb$# end;
gaussdb$# /
CREATE PACKAGE
gaussdb=# create or replace package body pkg_test_4 is
gaussdb$# procedure p1(i1 in varchar2,
gaussdb$#              i2 in varchar2,
gaussdb$#              i3 out varchar2) is
gaussdb$#              begin
gaussdb$#                null;
gaussdb$#              end;
gaussdb$# end;
gaussdb$# /
CREATE PACKAGE BODY
gaussdb=# alter package pkg_test_4 compile;
ERROR:  Failed to query the 323 type in the cache.
gaussdb=#

使用gdb調試找問題有個關鍵,就是這個問題最好是能穩定復現的,否則gdb抓不到報錯現場也很難分析問題。

另外,開始gdb調試前,一定要先把對應版本的符號表準備好,比較簡單的方式就是直接把符號表里的bin和lib解壓到GaussDB的bin和lib目錄。

在之前分析MogDB的問題時,我們內核研發有教過我可以使用 b errstart if elevel>19設置斷點來斷住所有 ERROR以上級別的報錯,但是這招在GaussDB似乎不行了

(gdb) b errstart if elevel>19
No symbol "elevel" in current context.
(gdb)

如果直接 b errstart,是可以斷,但是會老是斷,根本沒法跑起來,因為這里就算沒報錯也會調用進來,幾乎所有線程都在頻繁走到這里,參考openGauss源碼中的錯誤級別代碼,里面連INFO/NOTICE都有

/* Error level codes */
#define DEBUG5                                 \10 /* Debugging messages, in categories of \* decreasing detail. */
#define DEBUG4 11
#define DEBUG3 12
#define DEBUG2 13
#define DEBUG1 14 /* used by GUC debug_* variables */
#define LOG                                         \15 /* Server operational messages; sent only to \* server log by default. */
#define COMMERROR                                    \16 /* Client communication problems; same as LOG \* for server reporting, but never sent to    \* client. */
#define INFO                                          \17 /* Messages specifically requested by user (eg \* VACUUM VERBOSE output); always sent to      \* client regardless of client_min_messages,   \* but by default not sent to server log. */
#define NOTICE                                        \18 /* Helpful messages to users about query       \* operation; sent to client and server log by \* default. */
#define WARNING                                      \19 /* Warnings.  NOTICE is for expected messages \* like implicit sequence creation by SERIAL. \* WARNING is for unexpected messages. */
#define ERROR                                       \20 /* user error - abort transaction; return to \* known state */
#define VERBOSEMESSAGE                                  \9 /* indicates to show verbose info for CN and DNs; \* for DNs means to send info back to CN */
/* Save ERROR value in PGERROR so it can be restored when Win32 includes* modify it.  We have to use a constant rather than ERROR because macros* are expanded only when referenced outside macros.*/
#ifdef WIN32
#define PGERROR 20
#endif
#define FATAL 21 /* fatal error - abort process */
#define PANIC 22 /* take down the other backends with me *//* MAKE_SQLSTATE('P', '1', '0' , '0', '0')=96 */
#define CUSTOM_ERRCODE_P1 96

看一下 b errstart 會斷到哪里

(gdb) b errstart
Breakpoint 1 at 0x564904871b20: errstart. (3 locations)
(gdb) info b
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   <MULTIPLE>
1.1                         y   0x0000564904871b20 in errstart(int, char const*, int, char const*, char const*)at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/error/elog.cpp:4108
1.2                         y   0x00007f6b9cd5e5b0 <errstart(int, char const*, int, char const*, char const*)@plt>
1.3                         y   0x00007f6b9d03eb30 <errstart(int, char const*, int, char const*, char const*)@plt>
(gdb)

從這個斷點信息里來看,errstartelog.cpp的4108行,這很是可疑,因為無論是openGauss還是MogDB,這個errstart函數應該在更前面的位置,大概是第兩百多行的地方。
由于沒有源碼,就只能反匯編看下有沒有能參考的信息了

(gdb) disassemble /m errstart
Dump of assembler code for function errstart(int, char const*, int, char const*, char const*):
238     /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/error/elog.cpp: No such file or directory.0x0000564904871b25 <+5>:     push   %rbp0x0000564904871b26 <+6>:     mov    %rsp,%rbp0x0000564904871b29 <+9>:     push   %r150x0000564904871b2b <+11>:    push   %r140x0000564904871b2d <+13>:    push   %r130x0000564904871b2f <+15>:    push   %r120x0000564904871b31 <+17>:    mov    %rsi,%r140x0000564904871b34 <+20>:    push   %rbx0x0000564904871b35 <+21>:    mov    %edi,%ebx0x0000564904871b37 <+23>:    sub    $0x58,%rsp0x0000564904871b3e <+30>:    mov    %edx,-0x64(%rbp)0x0000564904871b41 <+33>:    mov    %rcx,-0x70(%rbp)0x0000564904871b45 <+37>:    mov    %r8,-0x78(%rbp)239     in /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/error/elog.cpp
240     in /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/error/elog.cpp

可以看到這個函數第一次出現其實是在第238行,從這幾個寄存器的操作來看,應該就是函數的入口,也就是說,實際上這個函數的定義應該在第238行,而不是前面的4108行。我觀察了下,4108行的elevel一直是被優化掉的,看不到里面的值,只有238行的elevel能觀測到值。
所以在GaussDB里要斷ERROR及以上級別的錯誤,斷點應該設置為

b elog.cpp:238 if (elevel > 19)

可以提前先準備好下面的命令,gdb進去后直接復制粘貼,減少進程中斷時間

b elog.cpp:238 if (elevel > 19) 
handle SIGUSR1 nostop noprint
handle SIGUSR2 nostop noprint
handle SIGPIPE nostop
set pagi off
set print elements 300
continue

正式開始gdb調試

先用 ps -ef |grep gaussdb找到進程號
然后gdb -p 進程號

[gaussdb506@ky10-sp3 ~]$ ps -ef |grep gaussdb
root      426694  426551  0 09:19 pts/0    00:00:00 su - gaussdb506
gaussdb+  426699  426694  0 09:19 pts/0    00:00:00 -bash
gaussdb+  427027  426699  0 09:19 pts/0    00:00:00 ps -ef
gaussdb+  427028  426699  0 09:19 pts/0    00:00:00 grep gaussdb
og_last+ 3231792       1  1 Jul22 ?        05:58:52 /opt/og_lastest/openGauss-server/dest/bin/gaussdb
og700rc1 3508544       1  1 Aug04 ?        00:33:58 /opt/og700rc1/app/bin/gaussdb -D /opt/og700rc1/data -M primary
gaussdb+ 3864702       1 29 Aug05 ?        06:33:50 /data/gaussdb506/app/bin/gaussdb
[gaussdb506@ky10-sp3 ~]$ gdb -p 3864702
GNU gdb (GDB) KylinOS 9.2-3.p01.ky10
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-kylin-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:<http://www.gnu.org/software/gdb/documentation/>.For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 3864702
[New LWP 3864703]
[New LWP 3864749]
...#省略
[New LWP 4081944]warning: File "/usr/lib64/libthread_db-1.0.so" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
--Type <RET> for more, q to quit, c to continue without paging--
To enable execution of this file addadd-auto-load-safe-path /usr/lib64/libthread_db-1.0.so
line to your configuration file "/home/gaussdb506/.gdbinit".
To completely disable this security protection addset auto-load safe-path /
line to your configuration file "/home/gaussdb506/.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:info "(gdb)Auto-loading safe path"warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.warning: File "/usr/lib64/libthread_db-1.0.so" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.
0x00007f6bbc53c849 in poll () from /usr/lib64/libc.so.6
(gdb) b elog.cpp:238 if (elevel > 19)
Breakpoint 1 at 0x564904871b25: file /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/error/elog.cpp, line 238.
(gdb) handle SIGUSR1 nostop noprint
Signal        Stop      Print   Pass to program Description
SIGUSR1       No        No      Yes             User defined signal 1
(gdb) handle SIGUSR2 nostop noprint
Signal        Stop      Print   Pass to program Description
SIGUSR2       No        No      Yes             User defined signal 2
(gdb) handle SIGPIPE nostop
Signal        Stop      Print   Pass to program Description
SIGPIPE       No        Yes     Yes             Broken pipe
(gdb) set pagi off
(gdb) set print elements 300
(gdb) continue
Continuing.
[New LWP 427599]
[New LWP 427600]
[LWP 427599 exited]
[New LWP 427601]
[LWP 427601 exited]
[New LWP 427602]
[LWP 427602 exited]
[LWP 427600 exited]

當后面不斷有輸出 [New LWP xxxxxx]時,gaussdb就是正常運行中了。
接下來可以開一個客戶端連接,執行上面用于模擬測試的sql,會卡在 alter package pkg_test_4 compile; 這個語句上,同時gdb的窗口不再連續打印 [New LWP xxxxxx],而是命中了斷點

[Switching to LWP 3864766]Thread 16 "TPLworker" hit Breakpoint 1, errstart (elevel=20, filename=0x564908abfcc8 "format_type.cpp", lineno=216, funcname=0x564908abfdd0 <format_type_internal(unsigned int, int, bool, bool, bool)::__func__> "format_type_internal", domain=0x5649087e1004 "plpgsql-9.2") at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/error/elog.cpp:238
238     in /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/error/elog.cpp
(gdb)

接著輸入bt查看堆棧

(gdb) bt
#0  errstart (elevel=20, filename=0x564908abfcc8 "format_type.cpp", lineno=216, funcname=0x564908abfdd0 <format_type_internal(unsigned int, int, bool, bool, bool)::__func__> "format_type_internal", domain=0x5649087e1004 "plpgsql-9.2") at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/error/elog.cpp:238
#1  0x00005649043cc8ae in format_type_internal (type_oid=323, typemod=-1, typemod_given=<optimized out>, allow_invalid=<optimized out>, include_nspname=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/adt/format_type.cpp:212
#2  0x00005649045b82e8 in format_procedure (procedure_oid=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/adt/regproc.cpp:473
#3  0x0000564904a2d53d in do_compile (fcinfo=0x7f6a520460c0, proc_tup=0x7f69b7c375a0, func=0x7f6a39c64050, compile_func_head_info=0x7f6a52046740, for_validator=true, hashkey=0x7f6a52045d50) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_comp/pl_comp_func_main.cpp:921
#4  0x0000564904a35f3b in gsplsql_compile (fcinfo=0x7f6a520460c0, compile_func_head_info=0x7f6a52046740, for_validator=true, isRecompile=false, func_runtime_state=0x0) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_comp/pl_comp_func_main.cpp:3106
#5  0x0000564906c64eeb in plpgsql_validator (fcinfo=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/pl/plpgsql/src/pl_handler.cpp:1481
#6  0x00005649048a56cb in OidFunctionCall4Coll (function_id=10790, collation=0, arg1=97664, arg2=0, arg3=0, arg4=140094619281216, is_null=0x0) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/share/fmgr/fmgr.cpp:2512
#7  0x0000564904a55363 in gsplsql_func_in_pkg_compile (pkg=0x7f6a455ec050) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_comp/pl_comp_pkg_main.cpp:1210
#8  0x0000564904a571fc in gsplsql_pkg_init (pkg=0x7f6a455ec050, isCreate=false, isSpec=false, ret_pkg_runtime=0x7f6a52046a18, is_need_compile_func=true, pkg_debug_query_string=<optimized out>, old_pkg_runtime=0x7f6a39a48050) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_comp/pl_comp_pkg_main.cpp:1545
#9  0x0000564904a584a3 in gsplsql_pkg_compile (pkg_oid=97663, for_validator=true, is_spec=false, is_create=false, is_recompile=true, pkg_runtime_state=0x0) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_comp/pl_comp_pkg_main.cpp:956
#10 0x0000564905100dc6 in recompile_single_package (package_oid=97663, is_spec=false) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/commands/packagecmds.cpp:329
#11 0x0000564905101212 in recompile_package_by_oid (pkg_oid=97663, recompile_invalid_pkg=false) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/commands/packagecmds.cpp:416
#12 0x0000564905101262 in recompile_package (stmt=0x7f6a3f4954c0) at /usr1/GaussDBKernel/server/opengauss/src/compatibility/sql_adaptor/commands/packagecmds.cpp:437
#13 0x0000564905431abe in sqlcmd_standard_process_utility (parse_tree=0x7f6a3f4954c0, query_string=0x7f6a3f496050 "alter package pkg_test_4 compile", params=0x0, is_top_level=<optimized out>, dest=0x56490a6e0720 <donothingDR>, sent_to_remote=<optimized out>, completion_tag=0x7f6a5204a430 "", isCTAS=false) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/utility.cpp:6813
#14 0x00007f6b9cd9b759 in gsaudit_ProcessUtility_hook (parsetree=0x7f6a3f4954c0, queryString=0x7f6a3f496050 "alter package pkg_test_4 compile", params=0x0, isTopLevel=<optimized out>, dest=0x56490a6e0720 <donothingDR>, sentToRemote=<optimized out>, completionTag=0x7f6a5204a430 "", isCTAS=false) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/security/security_plugin/security_policy_plugin.cpp:856
#15 0x00005649059a0f52 in audit_process_utility (parsetree=0x7f6a3f4954c0, query_string=0x7f6a3f496050 "alter package pkg_test_4 compile", params=<optimized out>, is_top_level=<optimized out>, dest=<optimized out>, sent_to_remote=<optimized out>, completion_tag=0x7f6a5204a430 "", is_ctas=false) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/security/audit/security_auditfuncs.cpp:1512
#16 0x000056490543c71d in sqlcmd_process_utility (parse_tree=0x7f6a3f4954c0, query_string=0x7f6a3f496050 "alter package pkg_test_4 compile", params=0x0, is_top_level=<optimized out>, dest=<optimized out>, sent_to_remote=<optimized out>, completion_tag=0x7f6a5204a430 "", isCTAS=false) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/utility.cpp:1974
#17 0x000056490541d83f in PortalRunUtility (portal=0x7f6a48878050, utilityStmt=0x7f6a3f4954c0, isTopLevel=true, dest=0x56490a6e0720 <donothingDR>, completionTag=0x7f6a5204a430 "") at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/pquery.cpp:2140
#18 0x000056490541f0be in PortalRunMulti (portal=0x7f6a48878050, isTopLevel=true, dest=0x56490a6e0720 <donothingDR>, altdest=0x56490a6e0720 <donothingDR>, completionTag=0x7f6a5204a430 "", snapshot=0x0, bii_state=0x0) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/pquery.cpp:2326
#19 0x00005649054232dc in PortalRun (portal=0x7f6a48878050, count=200, isTopLevel=true, dest=0x7f6a3f4a84d0, altdest=0x7f6a3f4a84d0, completionTag=0x7f6a5204a430 "", snapshot=0x0, bii_state=0x0) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/pquery.cpp:1501
#20 0x00005649054158af in exec_execute_message (max_rows=200, portal_name=0x7f6a3f4a8050 "") at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/postgres.cpp:7071
#21 gs_process_command (firstchar=<optimized out>, input_message=<optimized out>, send_ready_for_query=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/postgres.cpp:12314
#22 0x000056490541b9c0 in PostgresMain (argc=<optimized out>, argv=0x7f6a49e45b20, dbname=<optimized out>, username=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/tcop/postgres.cpp:11313
#23 0x000056490539f2df in backend_run (port=0x7f6a5204a890) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/postmaster/postmaster.cpp:12482
#24 0x00005649053de1b0 in gauss_db_worker_thread_main<(knl_thread_role)2> (arg=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/postmaster/postmaster.cpp:19086
#25 0x000056490539f39a in internal_thread_func (args=<optimized out>) at /usr1/GaussDBKernel/server/opengauss/src/auxiliary/proc/postmaster/postmaster.cpp:20196
#26 0x00007f6bbc60ff1b in ?? () from /usr/lib64/libpthread.so.0
#27 0x00007f6bbc547320 in clone () from /usr/lib64/libc.so.6

可以看到在報錯的 format_type_internal里,出現了type_oid=323,的確是在報錯中出現的數字,但是323這么小的數字明顯不可能是用戶自定義類型,因為小數字的oid都是被系統保留的。這里肯定是有bug的,但沒有源碼不方便找bug原因,本次調試的主要目的是找到出錯的procedure。

format_type_internalformat_procedure里,有用的參數都顯示成 <optimized out>了,這表示內核把這些變量優化掉了,不給看。于是繼續看下一行 do_compile,打印幾個參數看看

(gdb) f 3
#3  0x0000564904a2d53d in do_compile (fcinfo=0x7f6a520460c0, proc_tup=0x7f69b7c375a0, func=0x7f6a39c64050, compile_func_head_info=0x7f6a52046740, for_validator=true, hashkey=0x7f6a52045d50) at /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_comp/pl_comp_func_main.cpp:921
921     /usr1/GaussDBKernel/server/opengauss/src/gausskernel/pl/plsql/pl_comp/pl_comp_func_main.cpp: No such file or directory.
(gdb) p *fcinfo
$1 = {flinfo = 0x7f6a52046020, context = 0x0, resultinfo = 0x0, fncollation = 0, isnull = false, nargs = 0, arg = 0x7f6a39a4ca50, argnull = 0x0, argTypes = 0x0, prealloc_arg = {0 <repeats 20 times>}, prealloc_argnull = {false <repeats 20 times>}, prealloc_argTypes = {0 <repeats 20 times>}, argVector = 0x0, refcursor_data = {argCursor = 0x0, returnCursor = 0x0, return_number = 0}, out_tmtype = 0 '\000', out_decimals = 0 '\000', udfInfo = {UDFArgsHandlerPtr = 0x0, UDFResultHandlerPtr = 0x0, udfMsgBuf = 0x0, msgReadPtr = 0x0, argBatchRows = 0, allocRows = 0, arg = 0x0, null = 0x0, result = 0x0, resultIsNull = 0x0, valid_UDFArgsHandlerPtr = false}, swinfo = {sw_econtext = 0x0, sw_exprstate = 0x0, sw_is_flt_frame = false}, out_typmode = 0x0, fn_typmode = 0, plfunc_exec_mode = 0, plfunc_exec_state = 0x0, args_done = 0x0, prealloc_args_done = {0 <repeats 20 times>}, arginfo = {{in_tmtype = 0 '\000', in_decimals = 0 '\000', argTypModes = 0, set_enum_typeoid = 0}}}
(gdb) p *proc_tup
Attempt to dereference a generic pointer.
(gdb) p *func
$2 = {type = T_PLpgSQL_FUNCTION, fn_oid = 97664, pkg_oid = 97663, namespaceOid = 2200, fn_owner = 16728, fn_input_collation = 0, fn_signature = 0x0, fn_searchpath = 0x0, namespace_searchpath = 0x0, fn_hashkey = 0x0, fn_cxt = 0x7f6a6a9599d0, fn_rettype = 0, fn_rettyplen = 0, glc_func_life = 1, fn_rettypioparam = 0, fn_retbyval = false, fn_retistuple = false, fn_retset = false, fn_readonly = false, out_param_varno = -1, found_varno = 0, fn_nallargs = 0, argmods = 0x0, argtypes = 0x0, sql_cursor_found_varno = 0, sql_notfound_varno = 0, sql_isopen_varno = 0, sql_rowcount_varno = 0, sql_bulk_exceptions_varno = 0, sqlcode_varno = 0, sqlstate_varno = 0, sqlerrm_varno = 0, new_varno = 0, old_varno = 0, tg_name_varno = 0, tg_when_varno = 0, tg_level_varno = 0, tg_op_varno = 0, tg_relid_varno = 0, tg_relname_varno = 0, tg_table_name_varno = 0, tg_table_schema_varno = 0, tg_nargs_varno = 0, tg_argv_varno = 0, retvarno = 0, guc_stat = 5, use_count = 0, resolve_option = GSPLSQL_RESOLVE_COLUMN, ndatums = 0, datums = 0x0, datum_need_free = 0x0, action = 0x0, goto_labels = 0x0, invalItems = 0x0, saved_unique_id = 4294967295, nPlaceHolders = 0, placeholders = 0x0, cur_estate = 0x0, tg_relation = 0x0, debug = 0x0, ns_top = 0x0, is_private = false, fn_is_trigger = false, pre_parse_trig = false, is_autonomous = false, is_inline_handler = false, is_valid = true, is_plpgsql_func_with_outparam = false, need_skip_process_autonm_pkg = false, remembered_by_resowner = false, typeList = 0x0, namespace_name = 0x0, expr_list = 0x0, fn_retinput = {fn_addr = 0x0, fn_oid = 0, fn_nargs = 0, fn_strict = false, fn_retset = false, fn_extra = 0x0, fn_mcxt = 0x0, fn_expr = 0x0, fn_rettype = 0, fn_rettypemod = 0, fnName = '\000' <repeats 63 times>, fnLibPath = 0x0, vec_fn_addr = 0x0, vec_fn_cache = 0x0, genericRuntime = 0x0, max_length = 0, fn_languageId = 0, fn_stats = 0 '\000', fn_fenced = false, fn_volatile = 0 '\000', decimals = 0 '\000'}, glc_status = {m_type = GLC_FUNCTION_OBJ, m_location = GLC_OBJECT_IN_SESSION_WAIT_REMOVE, m_glc_object_state = GLC_OBJECT_IS_VALID, m_refcount = 1}, expired_cell = {dle_next = 0x0, dle_prev = 0x0, dle_val = 0x7f6a39c64050, dle_list = 0x7f6a48876ea0}, compiled_dlist_elem = {dle_next = 0x0, dle_prev = 0x0, dle_val = 0x7f6a39c64050, dle_list = 0x0}, parent_pro_ndatum = 0, subparam = 0x0, fn_nargs = 0, copiable_size = 0, deep_datums = 0x0, deep_ndatums = 0, cursor_datums = 0x0, cursor_ndatums = 0, placeholder_datums = 0x0, placeholder_ndatums = 0, fn_argvarnos = 0x0, depend_info_list = 0x0, plan_total_mem_size = 0, block_level = 0x0}
(gdb)

可以在 func里看到,fn_oid=97664 ,這意味著是在編譯pg_proc里oid為97664的對象。于是我們輸入q退出gdb,然后回到客戶端查詢

gaussdb=# select proname,g.pkgname from pg_proc p,gs_package g where p.oid=97664 and g.oid=p.propackageid;proname |  pkgname
---------+------------p1      | pkg_test_4
(1 row)

可以看到這個oid對應的就是pkg_test_4這個包里的p1,于是就知道了一定是編譯p1的時候出了問題。
到此,出問題的procedure就直接找出來了,肉眼一看包頭和包體的定義,發現有個參數的in/out方向沒匹配,但GaussDB在創建這個package時竟然沒有報錯…

其他Gauss系數據庫的情況

同樣的這個代碼,在openGauss 7.0.0 RC1 是不會報錯的,這個package還能正常調用,查了下數據字典,出入參方向是按包體生效的,這同樣也是個BUG,沒有做嚴格判斷。

openGauss=# create package pkg_test_4 is
openGauss$# procedure p1(i1 in varchar2,
openGauss$#              i2 out varchar2,
openGauss$#              i3 out varchar2);
openGauss$# end pkg_test_4;
openGauss$# /
;end;
end pkg_test_4;
/CREATE PACKAGE
openGauss=# create package body pkg_test_4 is
openGauss$# procedure p1(i1 in varchar2,
openGauss$#              i2 in varchar2,
openGauss$#              i3 out varchar2) is
openGauss$#              begin
openGauss$#                  null;
openGauss$#              end;
openGauss$# end pkg_test_4;
openGauss$# /
CREATE PACKAGE BODY
openGauss=# alter package pkg_test_4 compile;
ALTER PACKAGE
openGauss=# call pkg_test_4.p1(null,null,null);i3
----(1 row)openGauss=#

在MogDB 5.2.0里則是在創建package body時就報錯了,能正確檢查到包頭里的procedure在包體里沒定義

MogDB=# create package pkg_test_4 is
MogDB$# procedure p1(i1 in varchar2,
MogDB$#              i2 out varchar2,
MogDB$#              i3 out varchar2);
MogDB$# end pkg_test_4;
MogDB$# /
CREATE PACKAGE
MogDB=# create package body pkg_test_4 is
MogDB$# procedure p1(i1 in varchar2,
MogDB$#              i2 in varchar2,
MogDB$#              i3 out varchar2) is
MogDB$#              begin
MogDB$#                  null;
MogDB$#              end;
MogDB$# end pkg_test_4;
MogDB$# /
ERROR:  Function definition not found: p1
MogDB=#

總結

本次觸發 ERROR: Failed to query the 323 type in the cache.這個報錯的直接原因是創建的package和package body中,有個procedure的參數in/out方向不匹配導致。雖然客戶代碼的確有問題,但根本原因還是數據庫有BUG,未將這種異常場景檢查出來。

想要深入排查國產數據庫使用中的問題,學會使用gdb是必不可少的。我曾參與過不少國產數據庫PoC,親眼看到各個數據庫廠家的技術人員在客戶現場都曾用過gdb調試來定位問題。雖然大部分排行靠前的國產數據庫都基本已經穩定應用在各行各業了,但是仍要注意一些不起眼的小角落是否還有蟲子。

  • 本文作者: DarkAthena
  • 本文鏈接: https://www.darkathena.top/archives/Debugging-GaussDB-Locating-Package-Compilation-Errors-with-GDB
  • 版權聲明: 本博客所有文章除特別聲明外,均采用CC BY-NC-SA 3.0 許可協議。轉載請注明出處

本文來自互聯網用戶投稿,該文觀點僅代表作者本人,不代表本站立場。本站僅提供信息存儲空間服務,不擁有所有權,不承擔相關法律責任。
如若轉載,請注明出處:http://www.pswp.cn/bicheng/94092.shtml
繁體地址,請注明出處:http://hk.pswp.cn/bicheng/94092.shtml
英文地址,請注明出處:http://en.pswp.cn/bicheng/94092.shtml

如若內容造成侵權/違法違規/事實不符,請聯系多彩編程網進行投訴反饋email:809451989@qq.com,一經查實,立即刪除!

相關文章

One Commander:強大的Windows文件管理器

在日常使用電腦的過程中&#xff0c;文件管理和瀏覽是必不可少的任務。One Commander作為一款功能強大的Windows文件管理器&#xff0c;提供了豐富的功能和便捷的操作方式&#xff0c;幫助用戶更高效地管理和瀏覽文件。它不僅支持多種文件操作&#xff0c;還提供了豐富的自定義…

SPUpDate Application 程序卸載

我安裝了 EzvizStudioSetups.exe 軟件&#xff0c;卸載后會在電腦遺留 SPUpDate Application 程序&#xff1b;在某一時刻會占用 CPU 資源&#xff1b;應用卸載方法一&#xff1a;在任務管理器搜索 SPUpDate Application&#xff1b;定位到文件位置&#xff1b;我的路徑如下C:\…

算法題(187):程序自動分析

審題&#xff1a; 本題需要我們判斷是否可以同時滿足題目給定的若干等式或不等式&#xff0c;判斷出后根據結果輸出YES或NO 思路&#xff1a; 方法一&#xff1a;離散化并查集 使用并查集&#xff1a;其實題目中只存在兩者相等或不等兩種情況&#xff0c;而等于具有傳遞性&…

strcasecmp函數詳解

strcasecmp 是 C 語言中用于不區分大小寫比較兩個字符串的函數&#xff0c;主要用于忽略字符大小寫差異的場景&#xff08;如用戶輸入驗證、不區分大小寫的字符串匹配等&#xff09;。它屬于 POSIX 標準庫&#xff0c;定義在 <string.h> 頭文件中。 一、函數原型與參數 函…

Voronoi圖

本文將詳細解釋 Voronoi 圖&#xff0c;它在空間分析和插值中非常常用。1. 概念 Voronoi 圖是一種空間劃分方法&#xff0c;它把平面&#xff08;或空間&#xff09;劃分成若干個區域&#xff0c;使得每個區域內的任意一點都比該區域外的任何一點更靠近該區域的“生成點”&…

BioScientist Agent:用于藥物重定位和作用機制解析的知識圖譜增強型 LLM 生物醫學代理技術報告

BioScientist Agent:用于藥物重定位和作用機制解析的知識圖譜增強型 LLM 生物醫學代理技術報告 一、項目概述 藥物研發是一個周期長、成本高的過程,平均需要超過 10 年時間和 20 億美元才能將一種新藥推向市場,且 90% 以上的候選藥物最終失敗(1)。這種低成功率主要歸因于對…

5G視頻終端詳解 無人機圖傳 無線圖傳 便攜式5G單兵圖傳

前言單兵圖傳設備&#xff0c;是一種集視頻采集、編碼壓縮、無線傳輸等多種功能于一體的便攜式通信終端。它以嵌入式系統為基礎&#xff0c;搭載高性能 H.265 編解碼處理器&#xff0c;能夠將現場的音視頻信息進行高效處理后&#xff0c;通過無線網絡快速穩定地傳輸至后端指揮中…

【蘋果軟件】Prism Mac 9.4蘋果系統免費安裝包英文版 Graphpad Prism for Mac 9.4軟件免費下載與詳細圖文教程!!

軟件下載與系統要求 軟件&#xff1a;Prism9.4 語言&#xff1a;英文 大小&#xff1a;103.41M 安裝環境&#xff1a;MacOS12.0&#xff08;或更高&#xff0c;支持IntelM芯片&#xff09; MacOS蘋果系統GraphPad Prism&#xff08;科學數據分析與圖形繪制&#xff09;&am…

Redis 奇葩問題

先貼錯誤碼Unexpected exception while processing command這個奇葩的問題查了很久&#xff0c;后面突然頓悟&#xff0c;應該是Redis記住了第一次的數據類型&#xff0c;后面即使換了數據類型也不會改變之前的數據類型。跟代碼發現是codec變成了默認的了后續public RedissonBa…

C ++代碼學習筆記(一)

1、GetStringUTFChars用于將 Java 字符串&#xff08;jstring&#xff09;轉換為 UTF-8 編碼的 C 風格字符串&#xff08;const char*&#xff09;。必須在使用完后調用 ReleaseStringUTFChars 釋放內存&#xff0c;否則可能導致內存泄漏。std::string data_converter::convert…

【學習嵌入式day-29-網絡】

進程和線程的區別&#xff1a;都是系統執行的任務進程是資源分配的基本單位線程是調度執行的最小單位進程的創建和切換的開銷大&#xff0c;速度慢&#xff0c;效率低空間獨立、----- 安全&#xff0c;穩定進程間通信不方便線程創建和切換的開銷小&#xff0c;速度快&#xff0…

Eino 框架組件協作指南 - 以“智能圖書館建設手冊”方式理解

Eino 框架組件關系 - 形象比喻指南 &#x1f3d7;? 項目概覽&#xff1a;構建一個智能圖書館 想象一下&#xff0c;你要建設一個現代化的智能圖書館&#xff0c;能夠幫助用戶快速找到所需信息并提供智能問答服務。Eino 框架就像是這個智能圖書館的建設工具包&#xff0c;每個組…

網絡打印機自動化部署腳本

下面是一個全面的、交互式的PowerShell腳本&#xff0c;用于自動化網絡打印機部署過程。這個腳本提供了圖形化界面&#xff0c;讓用戶可以輕松地搜索、選擇和安裝網絡打印機。 備注&#xff1a;這個腳本未在生產環境測試過&#xff0c;請大家測試一下&#xff0c;有問題或優化&…

探索工業自動化核心:ZMC 系列 EtherCAT 主站控制器

ZLG致遠電子的ZMC系列EtherCAT主站控制器&#xff0c;憑借多元內核、豐富接口、卓越通信能力及開放開發環境&#xff0c;為工業自動化提供全方位解決方案&#xff0c;助力企業智能化升級。 前言在工業自動化領域不斷演進的今天&#xff0c;可靠且高效的控制解決方案成為企業提…

rt-thread使用sfud掛載qspi flash的trace分析

說明 trace log先貼在這里&#xff0c;待分析完成后&#xff0c;完善文章。 [0m[D/drv.sdram] sdram init success, mapped at 0xC0000000, size is 33554432 bytes, data width is 16[0m\ | / - RT - Thread Operating System/ | \ 5.2.0 build Aug 21 2025 14:44:332…

服務發現與負載均衡:Kubernetes Service核心機制深度解析

目錄 專欄介紹 作者與平臺 您將學到什么&#xff1f; 學習特色 一、 服務發現與負載均衡&#xff1a;云原生應用的核心支柱 1.1 Kubernetes Service的設計哲學 1.2 服務發現的核心組件 二、 Service核心類型深度解析&#xff1a;從ClusterIP到LoadBalancer 2.1 ClusterI…

【基礎排序】CF - 賭場游戲Playing in a Casino

題目描述 在整個太陽系都很有名的賭場 Galaxy Luck 推出了一種新的紙牌游戲。 在這個游戲中&#xff0c;有一副由 nnn 張牌組成的牌堆。每張牌上寫有 mmm 個整數。nnn 位玩家各自從牌堆中獲得一張牌。 然后所有玩家兩兩對局&#xff0c;每一對玩家恰好對局一次。 例如&#…

Jenkins啟動端口修改失敗查找日志

# 查看Jenkins服務啟動時的環境變量sudo systemctl show jenkins | grep -i port從systemd服務信息可以看到&#xff0c;Jenkins的環境變量中 JENKINS_PORT8080&#xff0c;這說明systemd服務配置覆蓋了 /etc/default/jenkins 文件中的設置1. 查找Jenkins的systemd服務文件# 查…

Rancher部署的K8S集群服務節點上執行 kubectl 命令

文章目錄1、Rancher UI 和執行 kubectl 命令之間的關系1.1、Rancher 的架構和 kubectl1.2、Rancher 內置 kubectl 的位置1.3、執行權限和安全2、Rancher UI 的使用操作2.1、UI 界面內置的 Kubectl 命令工具2.2、在服務節點執行 kubectl 命令的方法2.3、創建一個集群上下文文件 …

基于Nodejs作為服務端,React作為前端框架,axios作為通訊框架,實現滑塊驗證

文章目錄基于Nodejs作為服務端&#xff0c;React作為前端框架&#xff0c;axios作為通訊框架&#xff0c;實現滑塊驗證1. 為什么要自己寫滑塊驗證2. 滑塊驗證的整體思路3. 具體實現3.1 服務端3.2 前端4. 總結基于Nodejs作為服務端&#xff0c;React作為前端框架&#xff0c;axi…