Reposted from: http://blog.csdn.net/absurd/article/details/1402433
=====================================================
When reposting, please credit the source and include the author's contact information: http://blog.csdn.net/absurd
Author contact: Li XianJing <xianjimli at hotmail dot com>
Last updated: 2006-12-19
=====================================================
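Today I helped a colleague track down a multithreading bug: one thread was stuck in g_cond_wait and never woke up. Nothing looked wrong in the code; every g_cond_wait was strictly paired with a g_cond_signal. After two hours of digging, the log output showed that the two calls were happening in the wrong order: one thread called g_cond_signal first, and only afterwards did the other thread call g_cond_wait.

g_cond_signal is a glib wrapper. On Linux it is implemented on top of pthread_cond_signal; on Win32 it is implemented on top of SetEvent. On Win32, the order in which two threads call SetEvent and WaitForSingleObject does not matter. Odd: could the call order really make a difference on Linux?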
I read the pthread code, and indeed it does: when pthread_cond_signal finds that no other thread is waiting, it simply returns (see the cond->cn_waiters > 0 check in pth_cond_notify).

    int pthread_cond_signal(pthread_cond_t *cond)
    {
        if (cond == NULL)
            return pth_error(EINVAL, EINVAL);
        if (*cond == PTHREAD_COND_INITIALIZER)
            if (pthread_cond_init(cond, NULL) != OK)
                return errno;
        if (!pth_cond_notify((pth_cond_t *)(*cond), FALSE))
            return errno;
        return OK;
    }

    int pth_cond_notify(pth_cond_t *cond, int broadcast)
    {
        /* consistency checks */
        if (cond == NULL)
            return pth_error(FALSE, EINVAL);
        if (!(cond->cn_state & PTH_COND_INITIALIZED))
            return pth_error(FALSE, EDEADLK);

        /* do something only if there is at least one waiters (POSIX semantics) */
        if (cond->cn_waiters > 0) {
            /* signal the condition */
            cond->cn_state |= PTH_COND_SIGNALED;
            if (broadcast)
                cond->cn_state |= PTH_COND_BROADCAST;
            else
                cond->cn_state &= ~(PTH_COND_BROADCAST);
            cond->cn_state &= ~(PTH_COND_HANDLED);

            /* and give other threads a chance to awake */
            pth_yield(NULL);
        }

        /* return to caller */
        return TRUE;
    }
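The effect is easy to reproduce in a single thread. The pth_ prefix suggests the listing above comes from GNU Pth's pthread layer, but its comment attributes the behavior to POSIX semantics, so the same result should be expected from other pthread implementations. The following is a minimal sketch, not code from the original bug: signal_then_timedwait is a hypothetical helper that signals a condition variable nobody is waiting on and then waits on it with a timeout.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper: signal a condition variable that has no waiters,
 * then wait on it for up to timeout_ms milliseconds.
 * Returns whatever pthread_cond_timedwait returns. */
int signal_then_timedwait(long timeout_ms)
{
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
    struct timespec deadline;
    int rc;

    assert(timeout_ms >= 0);

    /* Nobody is waiting yet, so this signal is not remembered anywhere. */
    pthread_cond_signal(&cond);

    /* Build an absolute deadline, as pthread_cond_timedwait requires. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    /* The earlier signal is gone, so this wait can only end by timeout
     * (or, in theory, by a spurious wakeup). */
    pthread_mutex_lock(&lock);
    rc = pthread_cond_timedwait(&cond, &lock, &deadline);
    pthread_mutex_unlock(&lock);

    pthread_cond_destroy(&cond);
    pthread_mutex_destroy(&lock);
    return rc;
}
```

Calling signal_then_timedwait(50) would normally return ETIMEDOUT rather than 0. With a plain pthread_cond_wait in place of the timed wait, the thread would block forever, which is the hang described above.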
After getting home that evening, I also read ReactOS's implementation of SetEvent. The result was as expected: even when no thread is waiting on the Event, it still sets SignalState (see the IsListEmpty branch, where SignalState is set to 1).

    LONG STDCALL KeSetEvent(PKEVENT Event,
                            KPRIORITY Increment,
                            BOOLEAN Wait)
    {
        KIRQL OldIrql;
        LONG PreviousState;
        PKWAIT_BLOCK WaitBlock;

        DPRINT("KeSetEvent(Event %x, Wait %x)\n", Event, Wait);

        /* Lock the Dispatcher Database */
        OldIrql = KeAcquireDispatcherDatabaseLock();

        /* Save the Previous State */
        PreviousState = Event->Header.SignalState;

        /* Check if we have stuff in the Wait Queue */
        if (IsListEmpty(&Event->Header.WaitListHead)) {

            /* Set the Event to Signaled */
            DPRINT("Empty Wait Queue, Signal the Event\n");
            Event->Header.SignalState = 1;
        } else {

            /* Get the Wait Block */
            WaitBlock = CONTAINING_RECORD(Event->Header.WaitListHead.Flink,
                                          KWAIT_BLOCK,
                                          WaitListEntry);

            /* Check the type of event */
            if (Event->Header.Type == NotificationEvent || WaitBlock->WaitType == WaitAll) {

                if (PreviousState == 0) {

                    /* We must do a full wait satisfaction */
                    DPRINT("Notification Event or WaitAll, Wait on the Event and Signal\n");
                    Event->Header.SignalState = 1;
                    KiWaitTest(&Event->Header, Increment);
                }

            } else {

                /* We can satisfy wait simply by waking the thread, since our signal state is 0 now */
                DPRINT("WaitAny or Sync Event, just unwait the thread\n");
                KiAbortWaitThread(WaitBlock->Thread, WaitBlock->WaitKey, Increment);
            }
        }

        /* Check what wait state was requested */
        if (Wait == FALSE) {

            /* Wait not requested, release Dispatcher Database and return */
            KeReleaseDispatcherDatabaseLock(OldIrql);

        } else {

            /* Return Locked and with a Wait */
            KTHREAD *Thread = KeGetCurrentThread();
            Thread->WaitNext = TRUE;
            Thread->WaitIrql = OldIrql;
        }

        /* Return the previous State */
        DPRINT("Done: %d\n", PreviousState);
        return PreviousState;
    }
KeWaitForSingleObject, in turn, completes the wait successfully as soon as it finds SignalState greater than 0 (see the SignalState > 0 check).

    NTSTATUS STDCALL KeWaitForSingleObject(PVOID Object,
                                           KWAIT_REASON WaitReason,
                                           KPROCESSOR_MODE WaitMode,
                                           BOOLEAN Alertable,
                                           PLARGE_INTEGER Timeout)
    {
        ...
        if (CurrentObject->Header.SignalState > 0)
        {
            /* Another satisfied object */
            KiSatisfyNonMutantWait(CurrentObject, CurrentThread);
            WaitStatus = STATUS_WAIT_0;
            goto DontWait;
        }
        ...
    }
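So the g_cond_signal/g_cond_wait wrappers in glib do not behave exactly the same on Win32 and on Linux. Even if you skip glib and write your own wrapper or call the native APIs directly, watch out for this subtle trap.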
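One common way to avoid the trap with pthreads is to pair the condition variable with a flag protected by the same mutex, so that a signal sent before the wait is still visible to the waiter. The sketch below is one possible wrapper, not the fix that was applied to the original bug. The latch_event type and its functions are hypothetical names, and it assumes auto-reset behavior, where each successful wait consumes the signal.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical event wrapper: the flag remembers a signal that arrives
 * before anyone waits, which a bare condition variable does not. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool signaled;
} latch_event;

void latch_event_init(latch_event *ev)
{
    assert(ev != NULL);
    pthread_mutex_init(&ev->lock, NULL);
    pthread_cond_init(&ev->cond, NULL);
    ev->signaled = false;
}

void latch_event_destroy(latch_event *ev)
{
    pthread_cond_destroy(&ev->cond);
    pthread_mutex_destroy(&ev->lock);
}

/* Record the signal in the flag first, then wake a waiter if there is one. */
void latch_event_set(latch_event *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->signaled = true;
    pthread_cond_signal(&ev->cond);
    pthread_mutex_unlock(&ev->lock);
}

/* Wait until the flag is set. The loop also guards against spurious
 * wakeups. Returns immediately if the signal was sent earlier. */
void latch_event_wait(latch_event *ev)
{
    pthread_mutex_lock(&ev->lock);
    while (!ev->signaled)
        pthread_cond_wait(&ev->cond, &ev->lock);
    ev->signaled = false; /* auto-reset: consume the signal */
    pthread_mutex_unlock(&ev->lock);
}
```

With this wrapper, calling latch_event_set before latch_event_wait no longer hangs: the waiter sees the flag and returns without blocking. The same idea applies to glib code: guard a shared flag with the GMutex and re-check it in a loop around g_cond_wait.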
~~end~~