How the Python 3 virtual machine switches threads

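Python provides its own multithreading. To keep the interpreter thread-safe, it introduced the global interpreter lock (GIL): only the thread that holds the GIL may execute. As a result, at any moment only one thread is running Python bytecode, and Python threads cannot exploit multi-core processors for CPU-bound work. The book 《Python源碼剖析》 (Python Source Code Analysis) describes the historical reasons for the GIL in detail. In short, for now the GIL may well be the best available design for Python's multithreading.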

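Python 3.13 made an experimental attempt to remove the GIL. Running without the GIL requires a specially compiled (free-threaded) build. The GIL provides mutual exclusion over resources at global scope; once it is removed, mutual exclusion has to be done at a finer granularity, which can make the GIL-free build slower than the GIL build. Python is continuing to optimize this.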
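To check which kind of interpreter is running, the following minimal sketch may help. The helper name gil_status is hypothetical. It relies on the Py_GIL_DISABLED build variable and on sys._is_gil_enabled(), which only exists from Python 3.13 onward; on older versions the sketch assumes the GIL is always on.

```python
import sys
import sysconfig


def gil_status():
    """Return (free_threaded_build, gil_enabled) for the running interpreter."""
    # Py_GIL_DISABLED is 1 on free-threaded builds; 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() was added in Python 3.13. If it is missing,
    # assume a classic interpreter where the GIL is always enabled.
    checker = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = bool(checker()) if checker is not None else True
    return free_threaded_build, gil_enabled


if __name__ == "__main__":
    print(gil_status())
```

On a regular build the first element is False and the second is True. A free-threaded build may still report the GIL as enabled if it was re-enabled at runtime.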

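When several threads are running, the thread that holds the GIL has to release it after a while so that other threads also get a chance to execute. How long should each thread "enjoy" the GIL? Python 2 defined the time slice as a number of instructions: after executing 100 bytecode instructions a thread released the GIL. In Python 3 the slice is no longer a fixed instruction count, but an interval can still be specified through sys.setswitchinterval (the default is 5 milliseconds). If some thread currently holds the GIL and is running, a thread waiting for the GIL waits for that interval. Once the wait times out, the waiting thread "nudges" the running thread by setting a flag bit in its eval_breaker, which signals the running thread to release the GIL. The bytecode instructions are designed with many opportunities to check eval_breaker. When the running thread sees the GIL drop flag in eval_breaker, it releases the GIL and goes back to "queueing" so that other threads have a chance to acquire it. (The queue is only a metaphor; there is no real queue, and the waiting threads compete for the GIL on equal terms.)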
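As a small illustration of the interval setting mentioned above, the sketch below temporarily changes the switch interval and restores it afterwards. The helper name with_switch_interval is hypothetical; sys.getswitchinterval and sys.setswitchinterval are the standard library calls.

```python
import sys


def with_switch_interval(seconds, func):
    """Call func() with the switch interval set to `seconds`, then restore it."""
    old = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        return func()
    finally:
        # Always restore the previous interval, even if func() raises.
        sys.setswitchinterval(old)


if __name__ == "__main__":
    print("default interval:", sys.getswitchinterval())
    print("inside helper:", with_switch_interval(0.001, sys.getswitchinterval))
    print("restored:", sys.getswitchinterval())
```

A smaller interval makes waiting threads request the GIL sooner, at the cost of more frequent switching. The value is a request timeout, not a guarantee that the holder stops exactly on time.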

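Python's threading machinery involves system calls and compatibility across platforms, so it is fairly complex. This article focuses on thread switching only and attempts to analyze how Python 3 switches between threads.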

Thread execution

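The low-level threading module in Python 3 is _thread, implemented in Modules/_threadmodule.c. The entry point for creating a thread is the function thread_PyThread_start_new_thread. It first creates the thread handle object ThreadHandle and the thread state object PyThreadState, then enters the platform-specific function PyThread_start_joinable_thread, which builds the arguments for the system call and asks the operating system to create the thread. What is handed to the system call is a single common entry function, thread_run, which runs as soon as the native thread has been created. thread_run first obtains its own thread id and binds the thread state object to the global runtime object _PyRuntimeState, then calls PyEval_AcquireThread to try to acquire the GIL before executing the thread's code. The core of how a new thread acquires the GIL is inside PyEval_AcquireThread.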

Acquiring the GIL

A new thread enters PyEval_AcquireThread to try to acquire the GIL. The call chain is PyEval_AcquireThread->_PyThreadState_Attach->_PyEval_AcquireLock->take_gil. The GIL is actually acquired in take_gil, which lives in Python/ceval_gil.c. Its source is as follows:

/* Take the GIL.

   The function saves errno at entry and restores its value at exit.

   tstate must be non-NULL.

   Returns 1 if the GIL was acquired, or 0 if not. */
static void
take_gil(PyThreadState *tstate)
{
    int err = errno;

    assert(tstate != NULL);
    /* We shouldn't be using a thread state that isn't viable any more. */
    // XXX It may be more correct to check tstate->_status.finalizing.
    // XXX assert(!tstate->_status.cleared);

    if (_PyThreadState_MustExit(tstate)) {
        /* bpo-39877: If Py_Finalize() has been called and tstate is not the
           thread which called Py_Finalize(), exit immediately the thread.

           This code path can be reached by a daemon thread after Py_Finalize()
           completes. In this case, tstate is a dangling pointer: points to
           PyThreadState freed memory. */
        PyThread_exit_thread();
    }

    assert(_PyThreadState_CheckConsistency(tstate));
    PyInterpreterState *interp = tstate->interp;
    struct _gil_runtime_state *gil = interp->ceval.gil;
#ifdef Py_GIL_DISABLED
    if (!_Py_atomic_load_int_relaxed(&gil->enabled)) {
        return;
    }
#endif

    /* Check that _PyEval_InitThreads() was called to create the lock */
    assert(gil_created(gil));

    MUTEX_LOCK(gil->mutex);

    int drop_requested = 0;
    while (_Py_atomic_load_int_relaxed(&gil->locked)) {
        unsigned long saved_switchnum = gil->switch_number;

        unsigned long interval = (gil->interval >= 1 ? gil->interval : 1);
        int timed_out = 0;
        COND_TIMED_WAIT(gil->cond, gil->mutex, interval, timed_out);

        /* If we timed out and no switch occurred in the meantime, it is time
           to ask the GIL-holding thread to drop it. */
        if (timed_out &&
            _Py_atomic_load_int_relaxed(&gil->locked) &&
            gil->switch_number == saved_switchnum)
        {
            PyThreadState *holder_tstate =
                (PyThreadState*)_Py_atomic_load_ptr_relaxed(&gil->last_holder);
            if (_PyThreadState_MustExit(tstate)) {
                MUTEX_UNLOCK(gil->mutex);
                // gh-96387: If the loop requested a drop request in a previous
                // iteration, reset the request. Otherwise, drop_gil() can
                // block forever waiting for the thread which exited. Drop
                // requests made by other threads are also reset: these threads
                // may have to request again a drop request (iterate one more
                // time).
                if (drop_requested) {
                    _Py_unset_eval_breaker_bit(holder_tstate, _PY_GIL_DROP_REQUEST_BIT);
                }
                PyThread_exit_thread();
            }
            assert(_PyThreadState_CheckConsistency(tstate));

            _Py_set_eval_breaker_bit(holder_tstate, _PY_GIL_DROP_REQUEST_BIT);
            drop_requested = 1;
        }
    }

#ifdef Py_GIL_DISABLED
    if (!_Py_atomic_load_int_relaxed(&gil->enabled)) {
        // Another thread disabled the GIL between our check above and
        // now. Don't take the GIL, signal any other waiting threads, and
        // return.
        COND_SIGNAL(gil->cond);
        MUTEX_UNLOCK(gil->mutex);
        return;
    }
#endif

#ifdef FORCE_SWITCHING
    /* This mutex must be taken before modifying gil->last_holder:
       see drop_gil(). */
    MUTEX_LOCK(gil->switch_mutex);
#endif
    /* We now hold the GIL */
    _Py_atomic_store_int_relaxed(&gil->locked, 1);
    _Py_ANNOTATE_RWLOCK_ACQUIRED(&gil->locked, /*is_write=*/1);

    if (tstate != (PyThreadState*)_Py_atomic_load_ptr_relaxed(&gil->last_holder)) {
        _Py_atomic_store_ptr_relaxed(&gil->last_holder, tstate);
        ++gil->switch_number;
    }

#ifdef FORCE_SWITCHING
    COND_SIGNAL(gil->switch_cond);
    MUTEX_UNLOCK(gil->switch_mutex);
#endif

    if (_PyThreadState_MustExit(tstate)) {
        /* bpo-36475: If Py_Finalize() has been called and tstate is not
           the thread which called Py_Finalize(), exit immediately the
           thread.

           This code path can be reached by a daemon thread which was waiting
           in take_gil() while the main thread called
           wait_for_thread_shutdown() from Py_Finalize(). */
        MUTEX_UNLOCK(gil->mutex);
        /* tstate could be a dangling pointer, so don't pass it to
           drop_gil(). */
        drop_gil(interp, NULL, 1);
        PyThread_exit_thread();
    }
    assert(_PyThreadState_CheckConsistency(tstate));

    tstate->_status.holds_gil = 1;
    _Py_unset_eval_breaker_bit(tstate, _PY_GIL_DROP_REQUEST_BIT);
    update_eval_breaker_for_thread(interp, tstate);

    MUTEX_UNLOCK(gil->mutex);

    errno = err;
    return;
}

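_PyThreadState_MustExit checks whether the current thread has to exit; if so, the thread no longer takes part in competing for the GIL. The function then enters the while loop and reads gil->locked. If locked is 1, the GIL is currently held by another thread. Inside the loop it first saves the current switch_number, then calls COND_TIMED_WAIT to wait for up to interval. After the wait it checks three conditions: timed_out is 1, gil->locked is 1, and gil->switch_number == saved_switchnum. If all three hold, the thread that held the GIL before the wait is still running after a full interval and has not released it, so the code enters the if block and signals the running thread to release the GIL, indicating that a thread is waiting. If gil->switch_number != saved_switchnum, the GIL changed hands during the wait and another thread grabbed it, so the wait was in vain: a new iteration of the while loop begins, saved_switchnum is set again, and the thread waits once more for the GIL to be released.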

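Inside the if block, the code obtains the state object of the running thread, holder_tstate, and calls _Py_set_eval_breaker_bit to set the _PY_GIL_DROP_REQUEST_BIT flag in its eval_breaker, asking the running thread to release the GIL at its next checkpoint. Once the running thread has received the signal and released the GIL, the waiting threads can compete for it.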
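The wait-then-request handshake can be modelled in a few lines of Python. This is a toy sketch under simplifying assumptions, not CPython's implementation: the class ToyGIL, its drop_requests set (standing in for the per-thread eval_breaker bit), and the demo function are all hypothetical, and threads are identified by plain strings.

```python
import threading
import time


class ToyGIL:
    """Toy model of the take_gil()/drop handshake; not CPython's code."""

    def __init__(self, interval=0.005):
        self.cond = threading.Condition()  # plays gil->mutex plus gil->cond
        self.locked = False                # gil->locked
        self.switch_number = 0             # gil->switch_number
        self.last_holder = None            # gil->last_holder
        self.interval = interval           # gil->interval, in seconds here
        self.drop_requests = set()         # stand-in for the eval_breaker bit

    def take(self, me):
        with self.cond:
            while self.locked:
                saved = self.switch_number
                # wait() returns False when the timeout expired.
                timed_out = not self.cond.wait(timeout=self.interval)
                if timed_out and self.locked and self.switch_number == saved:
                    # Same holder for a whole interval: ask it to drop.
                    self.drop_requests.add(self.last_holder)
            self.locked = True
            if self.last_holder != me:
                self.last_holder = me
                self.switch_number += 1
            self.drop_requests.discard(me)

    def drop_requested(self, me):
        with self.cond:
            return me in self.drop_requests

    def drop(self):
        with self.cond:
            self.locked = False
            self.cond.notify()


def demo():
    gil = ToyGIL(interval=0.01)
    gil.take("A")
    order = []

    def waiter():
        gil.take("B")
        order.append("B")
        gil.drop()

    t = threading.Thread(target=waiter)
    t.start()
    # "A" keeps running until it notices the request at a checkpoint.
    deadline = time.monotonic() + 5
    while not gil.drop_requested("A") and time.monotonic() < deadline:
        time.sleep(0.001)
    order.append("A saw request")
    gil.drop()
    t.join(5)
    return order, gil.switch_number


if __name__ == "__main__":
    print(demo())
```

In the demo, thread "B" times out once while "A" holds the toy lock, files a drop request, and only proceeds after "A" polls the request and drops the lock, so the lock changes hands twice in total.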

Releasing the GIL

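How does the running thread receive the request to release the GIL? Python 3 bytecode contains many places that check the current thread's eval_breaker, implemented through the CHECK_EVAL_BREAKER macro. For example, the RESUME instruction at the start of a code object contains CHECK_EVAL_BREAKER, and so do jump instructions. These checkpoints ensure that the running thread will see the release signal, so waiting threads are not locked out of the GIL forever.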

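The CHECK_EVAL_BREAKER macro checks whether any bit is set in the current thread state's eval_breaker and, if so, calls _Py_HandlePending to process the flags. _Py_HandlePending is also in Python/ceval_gil.c. Its source is as follows: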

int
_Py_HandlePending(PyThreadState *tstate)
{
    uintptr_t breaker = _Py_atomic_load_uintptr_relaxed(&tstate->eval_breaker);

    /* Stop-the-world */
    if ((breaker & _PY_EVAL_PLEASE_STOP_BIT) != 0) {
        _Py_unset_eval_breaker_bit(tstate, _PY_EVAL_PLEASE_STOP_BIT);
        _PyThreadState_Suspend(tstate);

        /* The attach blocks until the stop-the-world event is complete. */
        _PyThreadState_Attach(tstate);
    }

    /* Pending signals */
    if ((breaker & _PY_SIGNALS_PENDING_BIT) != 0) {
        if (handle_signals(tstate) != 0) {
            return -1;
        }
    }

    /* Pending calls */
    if ((breaker & _PY_CALLS_TO_DO_BIT) != 0) {
        if (make_pending_calls(tstate) != 0) {
            return -1;
        }
    }

#ifdef Py_GIL_DISABLED
    /* Objects with refcounts to merge */
    if ((breaker & _PY_EVAL_EXPLICIT_MERGE_BIT) != 0) {
        _Py_unset_eval_breaker_bit(tstate, _PY_EVAL_EXPLICIT_MERGE_BIT);
        _Py_brc_merge_refcounts(tstate);
    }
#endif

    /* GC scheduled to run */
    if ((breaker & _PY_GC_SCHEDULED_BIT) != 0) {
        _Py_unset_eval_breaker_bit(tstate, _PY_GC_SCHEDULED_BIT);
        _Py_RunGC(tstate);
    }

    /* GIL drop request */
    if ((breaker & _PY_GIL_DROP_REQUEST_BIT) != 0) {
        /* Give another thread a chance */
        _PyThreadState_Detach(tstate);

        /* Other threads may run now */

        _PyThreadState_Attach(tstate);
    }

    /* Check for asynchronous exception. */
    if ((breaker & _PY_ASYNC_EXCEPTION_BIT) != 0) {
        _Py_unset_eval_breaker_bit(tstate, _PY_ASYNC_EXCEPTION_BIT);
        PyObject *exc = _Py_atomic_exchange_ptr(&tstate->async_exc, NULL);
        if (exc != NULL) {
            _PyErr_SetNone(tstate, exc);
            Py_DECREF(exc);
            return -1;
        }
    }
    return 0;
}

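_Py_HandlePending takes a different branch for each flag bit set in eval_breaker. If _PY_GIL_DROP_REQUEST_BIT is set, it calls _PyThreadState_Detach to release the GIL. The call chain _PyThreadState_Detach->detach_thread->_PyEval_ReleaseLock->drop_gil->drop_gil_impl finally releases it, which essentially just sets gil->locked to 0. At its core the GIL is simply a boolean variable, guarded by the gil->mutex and gil->cond seen in take_gil.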
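The flag dispatch can be illustrated with a small Python sketch. Everything here is hypothetical: the Breaker enum, its bit values, and handle_pending are illustrative stand-ins (CPython defines its own bit values in its headers), and the sketch only reports which branches would run, in the same order as the C function, without modelling the stop-the-world or refcount-merge bits.

```python
from enum import IntFlag


class Breaker(IntFlag):
    # Illustrative bit values only; not the values CPython uses.
    GIL_DROP_REQUEST = 1
    SIGNALS_PENDING = 2
    CALLS_TO_DO = 4
    ASYNC_EXCEPTION = 8
    GC_SCHEDULED = 16


# Order in which the C function shown above tests these bits.
CHECK_ORDER = [
    Breaker.SIGNALS_PENDING,
    Breaker.CALLS_TO_DO,
    Breaker.GC_SCHEDULED,
    Breaker.GIL_DROP_REQUEST,
    Breaker.ASYNC_EXCEPTION,
]


def handle_pending(breaker):
    """Return the names of the branches that would run for this bitmask."""
    return [bit.name for bit in CHECK_ORDER if breaker & bit]


if __name__ == "__main__":
    mask = Breaker.GIL_DROP_REQUEST | Breaker.SIGNALS_PENDING
    print(handle_pending(mask))
```

Because each condition is a separate bit, several requests can be pending at once and a single checkpoint handles all of them in one pass.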

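Right after releasing the GIL, the thread calls _PyThreadState_Attach to rejoin the competition for it. In the gap between release and reacquisition, another thread may already have grabbed the GIL and started running, in which case the current thread competes for the GIL again together with the other waiting threads. This notify-and-release mechanism is how Python threads take turns executing.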