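An earlier post, "Pretty-Printing Data Structures Debugged in GDB", described how to pretty-print data structures while debugging in GDB. This post again uses the mupdf library as the example and shows how to pretty-print data structures while debugging in LLDB.
Here is what the pretty-printed result looks like:
[screenshot]
I. Loading the custom script
As with GDB, you need to add a ~/.lldbinit file, which can contain LLDB commands. Mine contains: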
settings set target.x86-disassembly-flavor intel
command script import C:/Users/admin/lldb/mupdf_printer.py
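The first line sets LLDB's disassembly flavor to Intel. The second line imports our custom pretty-printing script, mupdf_printer.py. If you use the CodeLLDB extension in VSCode, you can omit the second line and instead load the script through the initCommands setting of the debug configuration in .vscode/launch.json: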
{
    "type": "lldb",
    "request": "launch",
    "name": "(lldb) Launch",
    "program": "${workspaceFolder}/build/t.exe",
    "args": [],
    "cwd": "${workspaceFolder}",
    "initCommands": [
        "command script import ${workspaceFolder}/lldbscripts/mupdf_printer.py"
    ]
},
II. Writing the test code
The test code is the same as in the post "Pretty-Printing Data Structures Debugged in GDB".
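III. Writing the LLDB Python script
1. Registering the pdf_obj type with LLDB
LLDB requires type registration to be done in a function with the signature __lldb_init_module(debugger : lldb.SBDebugger, internal_dict : dict), which it calls when the script is imported.
There are two kinds of registration: summary and synthetic. A summary provides the information that is shown without expanding the value. A synthetic provider supplies the children shown when the value is expanded, for example the elements of an array or the entries of a dictionary.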
def __lldb_init_module(debugger: lldb.SBDebugger, internal_dict: dict):
    debugger.HandleCommand(r'type summary add -x "^pdf_obj.*\*" --python-function mupdf_printer.PDFObjAPISummary')
    debugger.HandleCommand(r'type synthetic add -x "^pdf_obj.*\*" --python-class mupdf_printer.PDFObjAPIPrinter')
    print("MuPDF pdf_obj summary and synthetic provider (via API) loaded.")
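The -x flag tells LLDB to treat the type name as a regular expression. As a rough check of which type names that pattern covers, the sketch below runs the same pattern through Python's re module. This is only an approximation: LLDB uses its own regex engine rather than Python's, and the sample type names are hypothetical, chosen only to exercise the pattern.

```python
import re

# Same pattern as the one passed to `type summary add -x` above.
PATTERN = re.compile(r"^pdf_obj.*\*")


def is_covered(type_name: str) -> bool:
    """Return True if the pattern matches somewhere in the type name."""
    return PATTERN.search(type_name) is not None


# Hypothetical type names, not taken from a real debugging session.
samples = ["pdf_obj *", "pdf_obj **", "pdf_obj", "const pdf_obj *", "fz_context *"]
for name in samples:
    print(f"{name!r}: {is_covered(name)}")
```

Under this approximation, the ^ anchor means a type name that starts with a qualifier such as const would not be matched, which may be worth checking if some variables are not pretty-printed.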
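2. Writing the pretty-printing code
Under LLDB's Python protocol, a summary provider is written as a function, while a synthetic provider is written as a class.
The summary function for pdf_obj is therefore as follows: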
def PDFObjAPISummary(val: lldb.SBValue, internal_dict: dict):
    try:
        addr = val.GetValueAsAddress()
        if not addr:
            return "<null>"
        ref = ""
        if call_pdf_api("pdf_is_indirect", val):
            num = call_pdf_api("pdf_to_num", val)
            # gen = call_pdf_api("pdf_to_gen", val)
            ref = f"<Ref {num}> => "
        kind = detect_pdf_obj_kind(val)
        if kind == "null":
            return f"{ref}<null>"
        elif kind == "int":
            # Single quotes inside the f-string keep this valid before Python 3.12
            return f"{ref}{call_pdf_api('pdf_to_int', val)}"
        elif kind == "real":
            return f"{ref}{call_pdf_api('pdf_to_real', val, float)}"
        elif kind == "bool":
            v = call_pdf_api("pdf_to_bool", val)
            return f"{ref}{'true' if v else 'false'}"
        elif kind == "string":
            return f'{ref}{call_pdf_api("pdf_to_text_string", val, str)}'
        elif kind == "name":
            v = call_pdf_api("pdf_to_name", val, str)
            # The summary string comes back wrapped in double quotes; strip them
            name = v.strip('"')
            return f"{ref}/{name}"
        elif kind == "array":
            length = call_pdf_api("pdf_array_len", val)
            return f"{ref}[size]={length}"
        elif kind == "dict":
            length = call_pdf_api("pdf_dict_len", val)
            return f"{ref}[pairs]={length}"
        return f"{ref}{addr}"
    except Exception as e:
        return f"<error: {e}>"
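To make the string layout of the summary easier to see without a debugger, here is a minimal sketch that mirrors only the formatting branches of the function above. The name format_summary and its parameters are hypothetical, and the values are passed in directly instead of being read from the debuggee through the MuPDF API.

```python
from typing import Optional


def format_summary(kind: str, value, ref_num: Optional[int] = None) -> str:
    """Mirror the string layout of the summary function, without LLDB."""
    # Indirect references get a "<Ref N> => " prefix.
    ref = f"<Ref {ref_num}> => " if ref_num is not None else ""
    if kind == "null":
        return f"{ref}<null>"
    if kind == "bool":
        return f"{ref}{'true' if value else 'false'}"
    if kind == "name":
        # Strip surrounding double quotes, as the summary function does.
        text = str(value).strip('"')
        return f"{ref}/{text}"
    if kind == "array":
        return f"{ref}[size]={value}"
    if kind == "dict":
        return f"{ref}[pairs]={value}"
    # int, real and string are shown as-is.
    return f"{ref}{value}"


print(format_summary("int", 42))
print(format_summary("name", '"Type"'))
print(format_summary("array", 3, ref_num=5))
```

Keeping the formatting separate from the calls into the debuggee like this is one way to test the output layout with an ordinary Python interpreter.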
The synthetic provider class is as follows:
class PDFObjAPIPrinter:
    def __init__(self, val: lldb.SBValue, internal_dict: dict):
        self.val = val
        self.kind = detect_pdf_obj_kind(val)
        self.size = self.num_children()

    def has_children(self):
        # Only array and dict values may be expanded
        return self.kind in ["array", "dict"]

    def num_children(self):
        if self.kind == "array":
            length = call_pdf_api("pdf_array_len", self.val)
            return int(length) if length else 0
        elif self.kind == "dict":
            length = call_pdf_api("pdf_dict_len", self.val)
            return int(length) if length else 0
        return 0

    def get_child_at_index(self, index):
        try:
            if index < 0 or index >= self.size:
                return None
            if self.kind == "array":
                v = call_pdf_api_1("pdf_array_get", self.val, index, object)
                # We now have the pdf_obj at this index; get its address
                addr = v.GetValueAsAddress()
                # Build an expression that casts the address to a pdf_obj pointer
                expr = f"(pdf_obj *){addr}"
                # Create a new value from the expression; LLDB applies the
                # registered rules to it again when displaying it
                return self.val.CreateValueFromExpression(f"[{index}]", expr)
            elif self.kind == "dict":
                key = call_pdf_api_1("pdf_dict_get_key", self.val, index, object)
                val = call_pdf_api_1("pdf_dict_get_val", self.val, index, object)
                # A dict key in a pdf_obj is always a name; get the name's value
                key_str = call_pdf_api("pdf_to_name", key, str).strip('"')
                # Take the address of the dict value and build a new expression
                addr = val.GetValueAsAddress()
                expr = f"(pdf_obj *){addr}"
                # Create a new value from the expression; LLDB applies the
                # registered rules to it again when displaying it
                return self.val.CreateValueFromExpression(f"[/{key_str}]", expr)
        except Exception as e:
            print(f"Error in get_child_at_index: {e}")
            return None
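The three methods of the class form a small protocol: the debugger asks whether a value can be expanded, how many children it has, and then requests each child by index. The sketch below imitates that sequence over a plain Python list so it can run without LLDB. ListChildren and expand are hypothetical names, and the exact order in which LLDB calls these methods is simplified here.

```python
class ListChildren:
    """Stand-in provider over a plain Python list (no LLDB involved)."""

    def __init__(self, items):
        self.items = list(items)

    def has_children(self):
        return len(self.items) > 0

    def num_children(self):
        return len(self.items)

    def get_child_at_index(self, index):
        # Out-of-range requests return None, as in the class above.
        if index < 0 or index >= len(self.items):
            return None
        # A real provider returns an SBValue; a (name, value) pair stands in here.
        return (f"[{index}]", self.items[index])


def expand(provider):
    """Imitate how a debugger UI might walk a provider when a node is expanded."""
    if not provider.has_children():
        return []
    return [provider.get_child_at_index(i) for i in range(provider.num_children())]


print(expand(ListChildren([10, 20, 30])))
```

A stand-in like this can be handy for checking index handling and child naming before wiring the logic to real SBValue objects.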
The code also needs a few helper functions. Rather than listing them one by one, here is the complete code.
3. Complete code
mupdf_printer.py:
from typing import Optional

import lldb

# Address of the fz_context created in the debuggee, cached after first use
pdf_ctx: Optional[int] = None


def get_mupdf_version_from_symbol(target: lldb.SBTarget):
    try:
        version = target.EvaluateExpression('(const char*)mupdf_version')
        return version.GetSummary()
    except Exception as e:
        return f"<symbol not found: {e}>"


def call_mupdf_api(func_name: str, val: lldb.SBValue, retType: type, *args):
    global pdf_ctx
    try:
        target = val.GetTarget()
        addr = val.GetValueAsAddress()  # use GetValueAsAddress to get the pointer value
        cast = {
            int: "(int)",
            float: "(float)",
            str: "(const char*)",  # for functions returning const char*
            object: "(pdf_obj *)",  # for pdf_obj pointers
        }.get(retType, "(void)")
        if pdf_ctx is None:
            ver = get_mupdf_version_from_symbol(target)
            print(f"[LLDB] MuPDF version: {ver}")
            ctx: lldb.SBValue = target.EvaluateExpression(f"(fz_context*)fz_new_context_imp(0,0,0,{ver})")
            pdf_ctx = ctx.GetValueAsAddress()  # store as an integer address
        if args:
            args_str = ', '.join([str(arg) for arg in args])
            # Cast the address explicitly, as in the no-argument branch
            expr = f"{cast}{func_name}((fz_context*){pdf_ctx},(pdf_obj*){addr}, {args_str})"
        else:
            expr = f"{cast}{func_name}((fz_context*){pdf_ctx},(pdf_obj*){addr})"
        result: lldb.SBValue = target.EvaluateExpression(expr)
        if retType == int:
            return int(result.GetValue())
        elif retType == float:
            return float(result.GetValue())
        elif retType == str:
            return result.GetSummary()
        else:
            return result
    except Exception as e:
        print(f"<error calling {func_name}: {e}>")
        return f"<error calling {func_name}: {e}>"


def call_pdf_api(func_name: str, val: lldb.SBValue, rettype=int):
    return call_mupdf_api(func_name, val, rettype)


def call_pdf_api_1(func_name: str, val: lldb.SBValue, arg, rettype=int):
    return call_mupdf_api(func_name, val, rettype, arg)


# Detect the data type, not counting indirect references
def detect_pdf_obj_kind(val: lldb.SBValue):
    try:
        if call_pdf_api("pdf_is_null", val):
            return "null"
        elif call_pdf_api("pdf_is_int", val):
            return "int"
        elif call_pdf_api("pdf_is_real", val):
            return "real"
        elif call_pdf_api("pdf_is_bool", val):
            return "bool"
        elif call_pdf_api("pdf_is_string", val):
            return "string"
        elif call_pdf_api("pdf_is_name", val):
            return "name"
        elif call_pdf_api("pdf_is_array", val):
            return "array"
        elif call_pdf_api("pdf_is_dict", val):
            return "dict"
        return "unknown"
    except Exception as e:
        print(f"<error detecting pdf_obj kind: {e}>")
        return "<error>"


def PDFObjAPISummary(val: lldb.SBValue, internal_dict: dict):
    try:
        addr = val.GetValueAsAddress()
        if not addr:
            return "<null>"
        ref = ""
        if call_pdf_api("pdf_is_indirect", val):
            num = call_pdf_api("pdf_to_num", val)
            # gen = call_pdf_api("pdf_to_gen", val)
            ref = f"<Ref {num}> => "
        kind = detect_pdf_obj_kind(val)
        if kind == "null":
            return f"{ref}<null>"
        elif kind == "int":
            # Single quotes inside the f-string keep this valid before Python 3.12
            return f"{ref}{call_pdf_api('pdf_to_int', val)}"
        elif kind == "real":
            return f"{ref}{call_pdf_api('pdf_to_real', val, float)}"
        elif kind == "bool":
            v = call_pdf_api("pdf_to_bool", val)
            return f"{ref}{'true' if v else 'false'}"
        elif kind == "string":
            return f'{ref}{call_pdf_api("pdf_to_text_string", val, str)}'
        elif kind == "name":
            v = call_pdf_api("pdf_to_name", val, str)
            # The summary string comes back wrapped in double quotes; strip them
            name = v.strip('"')
            return f"{ref}/{name}"
        elif kind == "array":
            length = call_pdf_api("pdf_array_len", val)
            return f"{ref}[size]={length}"
        elif kind == "dict":
            length = call_pdf_api("pdf_dict_len", val)
            return f"{ref}[pairs]={length}"
        return f"{ref}{addr}"
    except Exception as e:
        return f"<error: {e}>"


class PDFObjAPIPrinter:
    def __init__(self, val: lldb.SBValue, internal_dict: dict):
        self.val = val
        self.kind = detect_pdf_obj_kind(val)
        self.size = self.num_children()

    def has_children(self):
        # Only array and dict values may be expanded
        return self.kind in ["array", "dict"]

    def num_children(self):
        if self.kind == "array":
            length = call_pdf_api("pdf_array_len", self.val)
            return int(length) if length else 0
        elif self.kind == "dict":
            length = call_pdf_api("pdf_dict_len", self.val)
            return int(length) if length else 0
        return 0

    def get_child_at_index(self, index):
        try:
            if index < 0 or index >= self.size:
                return None
            if self.kind == "array":
                v = call_pdf_api_1("pdf_array_get", self.val, index, object)
                # We now have the pdf_obj at this index; get its address
                addr = v.GetValueAsAddress()
                # Build an expression that casts the address to a pdf_obj pointer
                expr = f"(pdf_obj *){addr}"
                # Create a new value from the expression; LLDB applies the
                # registered rules to it again when displaying it
                return self.val.CreateValueFromExpression(f"[{index}]", expr)
            elif self.kind == "dict":
                key = call_pdf_api_1("pdf_dict_get_key", self.val, index, object)
                val = call_pdf_api_1("pdf_dict_get_val", self.val, index, object)
                # A dict key in a pdf_obj is always a name; get the name's value
                key_str = call_pdf_api("pdf_to_name", key, str).strip('"')
                # Take the address of the dict value and build a new expression
                addr = val.GetValueAsAddress()
                expr = f"(pdf_obj *){addr}"
                # Create a new value from the expression; LLDB applies the
                # registered rules to it again when displaying it
                return self.val.CreateValueFromExpression(f"[/{key_str}]", expr)
        except Exception as e:
            print(f"Error in get_child_at_index: {e}")
            return None


def __lldb_init_module(debugger: lldb.SBDebugger, internal_dict: dict):
    debugger.HandleCommand(r'type summary add -x "^pdf_obj.*\*" --python-function mupdf_printer.PDFObjAPISummary')
    debugger.HandleCommand(r'type synthetic add -x "^pdf_obj.*\*" --python-class mupdf_printer.PDFObjAPIPrinter')
    print("MuPDF pdf_obj summary and synthetic provider (via API) loaded.")
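The core of call_mupdf_api is string building: it assembles a C expression that calls a MuPDF function inside the debuggee and hands it to EvaluateExpression. The sketch below isolates just that step so the resulting expressions can be inspected without a debugger. build_call_expr is a hypothetical helper, not part of the script, and the addresses are made-up values.

```python
def build_call_expr(func_name, ctx_addr, obj_addr, ret_cast="(int)", *args):
    """Build the C expression string for one MuPDF API call."""
    # The context and the object are passed as integer addresses cast to pointers.
    params = [f"(fz_context*){ctx_addr}", f"(pdf_obj*){obj_addr}"]
    # Extra arguments (for example an array index) are appended as-is.
    params += [str(a) for a in args]
    return f"{ret_cast}{func_name}({', '.join(params)})"


# Made-up addresses, for illustration only.
print(build_call_expr("pdf_array_len", 0x1000, 0x2000))
print(build_call_expr("pdf_array_get", 0x1000, 0x2000, "(pdf_obj *)", 1))
```

The first call prints (int)pdf_array_len((fz_context*)4096, (pdf_obj*)8192). Printing the expression like this before evaluating it can help when a call in the debuggee unexpectedly fails.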
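IV. Debugging the LLDB Python code
Python pretty-printing scripts for GDB cannot be debugged directly with an ordinary Python debugger, because the gdb module exists only inside GDB's embedded Python interpreter. A regular Python interpreter cannot import the gdb package:
[screenshot]
A regular Python interpreter can, however, import the lldb package:
[screenshot]
Python scripts for LLDB can therefore be debugged with an ordinary Python debugger. The following shows how to do this in VSCode.
First, add both an lldb debug configuration and a python debug configuration to .vscode/launch.json. Note that the Python configuration must be of the attach-to-process type: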
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python Debugger: Attach using Process ID",
            "type": "debugpy",
            "request": "attach",
            "processId": "${command:pickProcess}"
        },
        {
            "type": "lldb",
            "request": "launch",
            "name": "(lldb) Launch",
            "program": "${workspaceFolder}/build/t.exe",
            "args": [],
            "cwd": "${workspaceFolder}",
            "initCommands": [
                "command script import ${workspaceFolder}/lldbscripts/mupdf_printer.py"
            ]
        }
    ]
}
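First set a breakpoint in the C/C++ code:
[screenshot]
Then start the LLDB debugger; it stops at the breakpoint.
Next, start the Python debug configuration and attach it to the codelldb process:
[screenshot]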
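If several candidate processes show up in the process picker, it can help to know the exact process ID. One possible approach, sketched below, is to print the ID of the process that is executing the script, for example from __lldb_init_module. attach_hint is a hypothetical helper, and the sketch assumes the script's print output is visible in the debug console.

```python
import os


def attach_hint() -> str:
    """Describe the process that is executing this script."""
    return f"[LLDB] Python script running in PID {os.getpid()}"


# Could be called once when the script is loaded.
print(attach_hint())
```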
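Once attached, breakpoints set in the LLDB Python script take effect:
[screenshot]
Now, in the LLDB debugger, expand a variable whose value has not been fetched yet, such as an array or dictionary that has never been expanded. Expanded values are cached, so expanding the same variable again does not run the Python script. To trigger it again, print the variable with the p command in the debug console, as described below.
Switch the active debugger to the Python debugger:
[screenshot]
The Python breakpoint has been hit, and you can debug the Python code:
[screenshot]
Because execution was paused in the Python code, LLDB's request may time out, and the remaining values may fail to display:
[screenshot]
In that case, type an LLDB command such as p ar directly into VSCode's debug console. The display refreshes and the values appear; this also triggers the Python script again.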
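V. Summary
Attentive readers may have noticed that every pdf_obj variable has an expand arrow in front of it, whether it holds a basic data type or an array or dictionary, and that expanding it always shows a [raw] item. This is because a synthetic provider is registered for pdf_obj.
Without the synthetic provider there would be no expand arrow, but arrays and dictionaries could not be expanded to view their contents, and the bool value of a pdf_obj would also be displayed incorrectly:
[screenshot]
Readers who have a solution are welcome to discuss it in the comments!
I may keep improving and extending this script. For later versions, see:
https://github.com/WittonBell/demo/tree/main/mupdf/lldbscripts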