最近在看 C++
的虛方法調用實現原理,大概就是說在 class 的首位置存放著一個指向 vtable array
指針數組 的指針,而 vtable array
中的每一個指針元素指向的就是各自的 虛方法
,實現方式很有意思,哈哈,現在我很好奇 C# 中如何實現的。
一:C# 中的多態玩法
1. 一個簡單的 C# 例子
為了方便說明,我就定義一個 Person 類和一個 Chinese 類,詳細代碼如下:
internal?class?Program{static?void?Main(string[]?args){Person?person?=?new?Chinese();person.SayHello();Console.ReadLine();}}public?class?Person{public?virtual?void?SayHello(){Console.WriteLine("sayhello");}}public?class?Chinese:?Person{public?override?void?SayHello(){Console.WriteLine("chinese");}}
}
2. 匯編代碼分析
接下來用 windbg 在 person.SayHello()
處下一個斷點,觀察一下它的反匯編代碼:
D:\net6\ConsoleApplication2\ConsoleApp1\Program.cs?@?9:
05cf21b3?b93c5dce05??????mov?????ecx,5CE5D3Ch?(MT:?ConsoleApp1.Chinese)
05cf21b8?e8030f89fa??????call????005830c0?(JitHelp:?CORINFO_HELP_NEWSFAST)
05cf21bd?8945f4??????????mov?????dword?ptr?[ebp-0Ch],eax
05cf21c0?8b4df4??????????mov?????ecx,dword?ptr?[ebp-0Ch]
05cf21c3?e820fbffff??????call????05cf1ce8?(ConsoleApp1.Chinese..ctor(),?mdToken:?0600000A)
05cf21c8?8b4df4??????????mov?????ecx,dword?ptr?[ebp-0Ch]
05cf21cb?894df8??????????mov?????dword?ptr?[ebp-8],ecxD:\net6\ConsoleApplication2\ConsoleApp1\Program.cs?@?11:
>>>?05cf21ce?8b4df8??????????mov?????ecx,dword?ptr?[ebp-8]
05cf21d1?8b45f8??????????mov?????eax,dword?ptr?[ebp-8]
05cf21d4?8b00????????????mov?????eax,dword?ptr?[eax]
05cf21d6?8b4028??????????mov?????eax,dword?ptr?[eax+28h]
05cf21d9?ff5010??????????call????dword?ptr?[eax+10h]
05cf21dc?90??????????????nop
從匯編代碼看,邏輯非常清晰,大體步驟如下:
eax,dword ptr [ebp-8]
從棧上(ebp-8)處獲取 person 在堆上的首地址,如果不相信的話,可以用 !do 027ea88c
試試看。
0:000>?dp?ebp-8?L1
0057f300??027ea88c0:000>?!do?027ea88c
Name:????????ConsoleApp1.Chinese
MethodTable:?05ce5d3c
EEClass:?????05cd3380
Size:????????12(0xc)?bytes
File:????????D:\net6\ConsoleApplication2\ConsoleApp1\bin\x86\Debug\net6.0\ConsoleApp1.dll
Fields:
None
eax,dword ptr [eax]
如果大家了解 實例
在堆上的內存布局的話,應該知道,這個首地址存放的就是 methodtable
指針,我們可以用 !dumpmt 05ce5d3c
來驗證下。
0:000>?dp?027ea88c?L1
027ea88c??05ce5d3c0:000>?!dumpmt?05ce5d3c
EEClass:?????????05cd3380
Module:??????????05addb14
Name:????????????ConsoleApp1.Chinese
mdToken:?????????02000007
File:????????????D:\net6\ConsoleApplication2\ConsoleApp1\bin\x86\Debug\net6.0\ConsoleApp1.dll
BaseSize:????????0xc
ComponentSize:???0x0
DynamicStatics:??false
ContainsPointers?false
Slots?in?VTable:?6
Number?of?IFaces?in?IFaceMap:?0
eax,dword ptr [eax+28h]
那這句話是什么意思呢?如果你了解 CoreCLR 的話,你應該知道 methedtable 是由一個 class MethodTable
類來承載的,所以它取了 methodtable 偏移 0x28
?位置的一個字段,那這個偏移字段是什么呢?我們先用 dt
把 methodtable 結構給導出來。
0:000>?dt?05ce5d3c?MethodTable
coreclr!MethodTable=7ad96bc8?s_pMethodDataCache?:?0x00639ec8?MethodDataCache=7ad96bc4?s_fUseParentMethodData?:?0n1=7ad96bcc?s_fUseMethodDataCache?:?0n1+0x000?m_dwFlags????????:?0xc+0x004?m_BaseSize???????:?0x74088+0x008?m_wFlags2????????:?5+0x00a?m_wToken?????????:?0+0x00c?m_wNumVirtuals???:?0x5ccc+0x00e?m_wNumInterfaces?:?0x5ce+0x010?m_pParentMethodTable?:?IndirectPointer<MethodTable?*>+0x014?m_pLoaderModule??:?PlainPointer<Module?*>+0x018?m_pWriteableData?:?PlainPointer<MethodTableWriteableData?*>+0x01c?m_pEEClass???????:?PlainPointer<EEClass?*>+0x01c?m_pCanonMT???????:?PlainPointer<unsigned?long>+0x020?m_pPerInstInfo???:?PlainPointer<PlainPointer<Dictionary?*>?*>+0x020?m_ElementTypeHnd?:?0+0x020?m_pMultipurposeSlot1?:?0+0x024?m_pInterfaceMap??:?PlainPointer<InterfaceInfo_t?*>+0x024?m_pMultipurposeSlot2?:?0x5ce5d68=7ad04c78?c_DispatchMapSlotOffsets?:?[0]??"?$?(System.Private.CoreLib.dll"=7ad04c70?c_NonVirtualSlotsOffsets?:?[0]??"?$?($((,?$?(System.Private.CoreLib.dll"=7ad04c60?c_ModuleOverrideOffsets?:?[0]??"?$?($((,$((,(,,0?$?($((,?$?(System.Private.CoreLib.dll"=7ad12838?c_OptionalMembersStartOffsets?:?[0]??"(((((((,(((,(,,0(((,(,,0(,,0,004"
從 methodtable 的布局圖來看, eax+28h
是 m_pMultipurposeSlot2
結構的第二個字段了,因為第一個字段是 虛方法表指針
,如果要驗證的話,也很簡單,用 !dumpmt -md 05ce5d3c
把所有的方法給導出來,然后結合 dp 05ce5d3c
看下 0x5ce5d68 之后是不是許多的方法。
0:000>?!dumpmt?-md?05ce5d3c
EEClass:?????????05cd3380
Module:??????????05addb14
Name:????????????ConsoleApp1.Chinese
mdToken:?????????02000007
File:????????????D:\net6\ConsoleApplication2\ConsoleApp1\bin\x86\Debug\net6.0\ConsoleApp1.dll
BaseSize:????????0xc
ComponentSize:???0x0
DynamicStatics:??false
ContainsPointers?false
Slots?in?VTable:?6
Number?of?IFaces?in?IFaceMap:?0
--------------------------------------
MethodDesc?TableEntry?MethodDe????JIT?Name
02610028?02605568???NONE?System.Object.Finalize()
02610030?02605574???NONE?System.Object.ToString()
02610038?02605580???NONE?System.Object.Equals(System.Object)
02610050?026055ac???NONE?System.Object.GetHashCode()
05CF1CE0?05ce5d24???NONE?ConsoleApp1.Chinese.SayHello()
05CF1CE8?05ce5d30????JIT?ConsoleApp1.Chinese..ctor()
0:000>?dp?05ce5d3c?L10
05ce5d3c??00000200?0000000c?00074088?00000005
05ce5d4c??05ce5ccc?05addb14?05ce5d7c?05cd3380
05ce5d5c??05cf1ce8?00000000?05ce5d68?02610028
05ce5d6c??02610030?02610038?02610050?05cf1ce0
仔細看輸出,上面的 05ce5d68
后面的 02610028
就是 System.Object.Finalize()
方法,02610030
對應著 System.Object.ToString()
方法。
call dword ptr [eax+10h]
有了前面的基礎,這句話就好理解了,它是從 m_pMultipurposeSlot2
結構中找 SayHello
所在的單元指針位置,然后做 call 調用。
0:000>?!U?05cf1ce0
Unmanaged?code
05cf1ce0?e88f9dde74??????call????coreclr!PrecodeFixupThunk?(7aadba74)
05cf1ce5?5e??????????????pop?????esi
05cf1ce6?0001????????????add?????byte?ptr?[ecx],al
05cf1ce8?e913050000??????jmp?????05cf2200
05cf1ced?5f??????????????pop?????edi
05cf1cee?0300????????????add?????eax,dword?ptr?[eax]
05cf1cf0?245d????????????and?????al,5Dh
05cf1cf2?ce??????????????into
05cf1cf3?0500000000??????add?????eax,0
05cf1cf8?0000????????????add?????byte?ptr?[eax],al
從匯編看,它還是一段 樁代碼
,言外之意就是該方法沒有被 JIT 編譯,如果編譯完了,這里的 05CF1CE0 05ce5d24 NONE ConsoleApp1.Chinese.SayHello()
?的 Entry (05CF1CE0) 也會被同步修改,驗證一下很簡單,我們繼續 go 代碼讓其編譯完成,然后再 dumpmt 。
0:008>?!dumpmt?-md?05ce5d3c
EEClass:?????????05cd3380
Module:??????????05addb14
Name:????????????ConsoleApp1.Chinese
mdToken:?????????02000007
File:????????????D:\net6\ConsoleApplication2\ConsoleApp1\bin\x86\Debug\net6.0\ConsoleApp1.dll
BaseSize:????????0xc
ComponentSize:???0x0
DynamicStatics:??false
ContainsPointers?false
Slots?in?VTable:?6
Number?of?IFaces?in?IFaceMap:?0
--------------------------------------
MethodDesc?TableEntry?MethodDe????JIT?Name
02610028?02605568???NONE?System.Object.Finalize()
02610030?02605574???NONE?System.Object.ToString()
02610038?02605580???NONE?System.Object.Equals(System.Object)
02610050?026055ac???NONE?System.Object.GetHashCode()
05CF2270?05ce5d24????JIT?ConsoleApp1.Chinese.SayHello()
05CF1CE8?05ce5d30????JIT?ConsoleApp1.Chinese..ctor()0:008>?dp?05ce5d3c?L10
05ce5d3c??00000200?0000000c?00074088?00000005
05ce5d4c??05ce5ccc?05addb14?05ce5d7c?05cd3380
05ce5d5c??05cf1ce8?00000000?05ce5d68?02610028
05ce5d6c??02610030?02610038?02610050?05cf2270
此時可以看到它由 05cf1ce0
變成了 05cf2270
, 這個就是 JIT 編譯后的方法代碼,我們用 !U 反編譯下。
0:008>?!U?05cf2270
Normal?JIT?generated?code
ConsoleApp1.Chinese.SayHello()
ilAddr?is?05E720D5?pImport?is?008F6E88
Begin?05CF2270,?size?27D:\net6\ConsoleApplication2\ConsoleApp1\Program.cs?@?28:
>>>?05cf2270?55??????????????push????ebp
05cf2271?8bec????????????mov?????ebp,esp
05cf2273?50??????????????push????eax
05cf2274?894dfc??????????mov?????dword?ptr?[ebp-4],ecx
05cf2277?833d74dcad0500??cmp?????dword?ptr?ds:[5ADDC74h],0
05cf227e?7405????????????je??????05cf2285
05cf2280?e8cb2bf174??????call????coreclr!JIT_DbgIsJustMyCode?(7ac04e50)
05cf2285?90??????????????nopD:\net6\ConsoleApplication2\ConsoleApp1\Program.cs?@?29:
05cf2286?8b0d74207e04????mov?????ecx,dword?ptr?ds:[47E2074h]?("chinese")
05cf228c?e8dffbffff??????call????05cf1e70
05cf2291?90??????????????nopD:\net6\ConsoleApplication2\ConsoleApp1\Program.cs?@?30:
05cf2292?90??????????????nop
05cf2293?8be5????????????mov?????esp,ebp
05cf2295?5d??????????????pop?????ebp
05cf2296?c3??????????????ret
終于這就是多態下的 ConsoleApp1.Chinese.SayHello
方法啦。
3. 總結
本質上來說,CoreCLR 也是 C++ 寫的,所以也逃不過用 虛表
來實現多態的玩法, 不過玩法也稍微復雜了一些,希望本篇對大家有幫助。