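Abstract: While handling production crashes at work, I came across one related to cxa_pure_virtual. From the stack trace alone it was fairly easy to confirm that an asynchronous call had triggered the ABI error handler. As for why cxa_pure_virtual was triggered, I had a rough guess but no direct evidence, so this article describes the mechanism behind this type of crash.
Keywords: cxxabi, llvm, cxa_pure_virtual, vptr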
First, let's look at the symptom. The production crash stack looked roughly like this:
0x********* abort()
0x********* std::terminate()
0x********* cxxabi::__cxa_pure_virtual()
0x********* ******::*******
Looking at the actual code behind this crash, we can largely conclude that the object had already been destructed while something still tried to call one of its virtual functions, and that this led to cxa_pure_virtual. Fixing the problem only requires tracking down where the asynchronous call comes from. To understand it more deeply, though, I looked up some material, summarized below.
__cxa_pure_virtual is described as follows:
The __cxa_pure_virtual function is an error handler that is invoked when a pure virtual function is called.
If you are writing a C++ application that has pure virtual functions you must supply your own __cxa_pure_virtual error handler function.
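So it is invoked when a pure virtual function is called, and the cxxabi implementation in LLVM shows that it aborts right away. That is odd: a class with an unimplemented pure virtual function cannot even be instantiated, so such a call should never get past the compiler, and the code shows that the function in question is in fact overridden. One thing to suspect, then, is how the vtable of a polymorphic base class is set up and torn down. Perhaps, while the derived object is being destroyed, its vtable pointer is switched back to the base class's vtable, and the corresponding slot in that table is the compiler-provided cxa_pure_virtual.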
_LIBCXXABI_FUNC_VIS _LIBCXXABI_NORETURN void __cxa_pure_virtual(void) {
    abort_message("Pure virtual function called!");
}
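With that suspicion in mind, I went looking for references. As far as I remember, the standard does not cover this kind of detail, so it is most likely defined in the ABI, and that is where I looked. The ABI specification contains the following description: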
An implementation shall provide a standard entry point that a compiler may reference in virtual tables to indicate a pure virtual function. Its interface is:
extern "C" void __cxa_pure_virtual ();
This routine will only be called if the user calls a non-overridden pure virtual function, which has undefined behavior according to the C++ Standard. Therefore, this ABI does not specify its behavior, but it is expected that it will terminate the program, possibly with an error message.
if C::f is a pure virtual function, no specific requirement is made for the corresponding virtual table entry. It may point to __cxa_pure_virtual (see 3.2.6 Pure Virtual Function API) or to a wrapper function for __cxa_pure_virtual (e.g., to adapt the calling convention). It may also simply be null in such cases.
The passage above explains what cxa_pure_virtual is actually for. Next, let's look at how the C++ ABI describes the construction of an object and its vtables:
// Sub-VTT for D (embedded in VTT for its derived class X):
static vtable *__VTT__1D [1+n+m] =
    { D primary vtable,
      // The sub-VTT for B-in-D in X may have further structure:
      B-in-D sub-VTT (n elements),
      // The secondary virtual pointers for D's bases have elements
      // corresponding to those in the B-in-D sub-VTT,
      // and possibly others for virtual bases of D:
      D secondary virtual pointer for B and bases (m elements) };

D ( D *this, vtable **ctorvtbls )
{
    // (The following will be unwound, not a real loop):
    for ( each base A of D ) {
        // A "boring" base is one that does not need a ctorvtbl:
        if ( ! boring(A) ) {
            // Call subobject constructors with sub-VTT index
            // if the base needs it -- only B in our example:
            A ( (A*)this, ctorvtbls + sub-VTT-index(A) );
        } else {
            // Otherwise, just invoke the complete-object constructor:
            A ( (A*)this );
        }
    }

    // Initialize virtual pointer with primary ctorvtbls address
    // (first element):
    this->vptr = ctorvtbls+0;  // primary virtual pointer

    // (The following will be unwound, not a real loop):
    for ( each subobject A of D ) {
        // Initialize virtual pointers of subobjects with ctorvtbls
        // addresses for the bases
        if ( ! boring(A) ) {
            ((A*)this)->vptr = ctorvtbls + 1+n + secondary-vptr-index(A);
            // where n is the number of elements in the sub-VTTs
        } else {
            // Otherwise, just use the complete-object vtable:
            ((A *)this)->vptr = &(A-in-D vtable);
        }
    }

    // Code for D constructor.
    ...
}
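From the description above we can see that:
- the vtable pointer of the current class is set before the body of its constructor runs;
- before the current class is constructed, its inheritance graph is walked and the base classes are constructed first, in the order of a depth-first, left-to-right traversal of that graph;
- once the base classes are constructed, the current class's own constructor code runs.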
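Destruction runs in the reverse order. For two polymorphic classes A and B with a direct inheritance relationship (B derives from A), construction proceeds as follows: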
class A {
public:
    virtual void func() = 0;
};

class B : public A {
public:
    virtual void func() {}
};
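- B's constructor B::B is called;
- it walks B's base classes and calls their constructors, which here means A::A();
- on entering A::A, the vptr is first pointed at A's vtable, whose entries include the base offset, the typeinfo pointer, and __cxa_pure_virtual (func is pure virtual, so its slot is filled with this entry);
- A::A's user code runs; there is none here, so nothing is called;
- after A's constructor finishes, the vptr is set to B's vtable, and B's own constructor code then runs.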
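Destruction order:
- the destructor B::~B is called;
- the vptr is set to B's vtable;
- the user code of B's destructor runs;
- the base destructor A::~A() is called, which first sets the vptr to A's vtable and then runs A's user code.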
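From the sequence above we can roughly see when cxa_pure_virtual may be called. Suppose an object is being destructed and the base class destructor takes a while. If a second thread calls an overridden pure virtual function on that object in the meantime, the vptr already points at the base class's vtable, where the slot for the pure virtual function is cxa_pure_virtual, so the call aborts immediately. Let's reproduce it: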
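#include <chrono>
#include <cstdio>
#include <iostream>
#include <thread>

class ClassA {
public:
    ClassA() { printf("Class A \n"); }
    // Sleep so that destruction stalls here, with the vptr already
    // pointing at ClassA's vtable.
    virtual ~ClassA() { std::this_thread::sleep_for(std::chrono::seconds(5)); }
    virtual void func() = 0;
};

class ClassB : public ClassA {
public:
    virtual ~ClassB() { printf("Class B \n"); }
    virtual void func() override { printf("Class B func\n"); }
};

// Second thread: keeps calling the virtual function through the base pointer.
void func(ClassA *p) {
    while (1) {
        p->func();
    }
}

int main() {
    std::cout << "Hello World!\n";
    ClassA* p = new ClassB;
    auto t = std::thread(func, p);
    std::this_thread::sleep_for(std::chrono::seconds(1));
    delete p;  // deliberately races with the worker thread
    t.join();
}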
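The code above adds a sleep to the base class destructor so that destruction stalls inside it while the second thread keeps trying to call the pure virtual function.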
On clang/gcc compilers this triggers __cxa_pure_virtual, while on MSVC it triggers _purecall:
extern "C" int __cdecl _purecall()
{
    _purecall_handler const purecall_handler = _get_purecall_handler();
    if (purecall_handler)
    {
        purecall_handler();

        // The user-registered purecall handler should not return, but if it does,
        // continue with the default termination behavior.
    }

    abort();
}