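0. Common methods for analyzing Linux kernel exceptions
- Is the faulting address near 0? If so, suspect a NULL pointer dereference.
- Is the faulting address inside an iomem-mapped region? If so, suspect a bus error on device access, for example an address access fault caused by a PCI error.
- Is the faulting address near the stack? If it is adjacent, consider whether the stack has been overwritten.
- Compare the stack traces printed by different mechanisms such as delay reset and the NMI watchdog, and check whether the PC is still moving, to determine whether the system is deadlocked.
- Use SysRq to tell a real hang from an apparent one.
- Disassemble to find the C code and function where the exception occurred, then check whether the open-source community already has a patch that fixes it.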
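The first three address checks can be sketched as a small classifier. This is only an illustration: the function name, the enum, the 4 KiB "near zero" threshold and the caller-supplied I/O and stack ranges are all assumptions made for this example, not kernel APIs.

```c
#include <stdint.h>

/* Hypothetical classification results; the names are illustrative only. */
enum fault_class {
    FAULT_NULL_DEREF,   /* within the first page: likely NULL pointer plus offset */
    FAULT_IOMEM,        /* inside a mapped device (I/O memory) window */
    FAULT_NEAR_STACK,   /* inside or right next to the kernel stack */
    FAULT_UNKNOWN
};

/* Address range supplied by the person doing the analysis. */
struct addr_range {
    uint64_t start;     /* inclusive */
    uint64_t end;       /* exclusive */
};

#define ASSUMED_PAGE_SIZE 4096u   /* assumption: 4 KiB pages */

static int in_range(uint64_t addr, struct addr_range r)
{
    return addr >= r.start && addr < r.end;
}

enum fault_class classify_fault_addr(uint64_t addr,
                                     struct addr_range iomem,
                                     struct addr_range stack)
{
    if (addr < ASSUMED_PAGE_SIZE)
        return FAULT_NULL_DEREF;
    if (in_range(addr, iomem))
        return FAULT_IOMEM;

    /* Widen the stack by one page on each side to catch overruns. */
    struct addr_range widened = {
        stack.start >= ASSUMED_PAGE_SIZE ? stack.start - ASSUMED_PAGE_SIZE : 0,
        stack.end + ASSUMED_PAGE_SIZE
    };
    if (in_range(addr, widened))
        return FAULT_NEAR_STACK;
    return FAULT_UNKNOWN;
}
```

With the stack of the PowerPC example below (THREAD ce282000, assuming an 8 KiB stack), the faulting address 0x36fef31e falls into none of the three classes, while the BadVA of 0x8 from the MIPS example is classified as a likely NULL pointer dereference.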
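The sections below walk through the analysis in detail using two exception examples, one on PowerPC and one on MIPS64.
1. Kernel exception analysis on a small PowerPC system
1.1 Exception output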
Unable to handle kernel paging request for data at address 0x36fef31e
Faulting instruction address: 0xc0088b8c
Oops: Kernel access of bad area, sig: 11 [#1]
PREEMPT SMP NR_CPUS=2
Modules linked in: ossmod tipc ohci_hcd ehci_hcd cmm uart1655x bcm334 bootflash mtdchar bsp_flash_init boardctrl 85xx_debug util
NIP: C0088B8C LR: C0088CF8 CTR: 00000000
REGS: ce283e20 TRAP: 0300 Not tainted (2.6.21.7-EMBSYS-CGEL-3.04.10.P6.F5)
MSR: 00021000 <ME> CR: 22004222 XER: 00000000
DAR: 36FEF31E, DSISR: 00800000
TASK = cffdf180[26] 'events/1' THREAD: ce282000 CPU: 1
GPR00: 00100100 CE283ED0 CFFDF180 CF528000 C09EA500 EFFEAD20 CF5188A0 00000000
GPR08: CF5188BC 00200200 36FEF31E D1FD7F9E 22004222 1010DA44 00000290 00000000
GPR16: 1011C858 100147F4 BF9BC9C4 10100000 00000001 C0460000 C06454CC 00000000
GPR24: C0640000 CE282000 C0640000 00000005 00000000 00000000 EFFE8EC0 CFFED958
NIP [C0088B8C] free_block+0xc4/0x16c
LR [C0088CF8] drain_array+0xc4/0x100
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Call Trace:
[CE283ED0] [C06ABEC0] 0xc06abec0(unreliable)
[CE283EF0] [C0088CF8] drain_array+0xc4/0x100
[CE283F10] [C008A70C] cache_reap+0x94/0x13c
[CE283F30] [C003DA2C] run_workqueue+0xc4/0x198
[CE283F60] [C003E6D4] worker_thread+0x130/0x154
[CE283FB0] [C0042E80] kthread+0xd4/0x110
[CE283FF0] [C0011A70] original_kernel_thread+0x44/0x60
Instruction dump:
5400cffe 0f000000 80c4001c 7d1cf214 3c000010 3d200020 80a8001c 60000100
81660000 61290200 81460004 3906001c <916a0000> 914b0004 90060000 91260004
------------[ cut here ]------------
Badness at c0011e4c [verbose debug info unavailable]
Call Trace:
[CE283C50] [C00080BC] show_stack+0x3c/0x1a0 (unreliable)
[CE283C80] [C018EA28] report_bug+0xb0/0xb8
[CE283C90] [C000EC94] program_check_exception+0xcc/0x4f8
[CE283CD0] [C0010BE4] ret_from_except_full+0x0/0x4c
[CE283D90] [C0640000] 0xc0640000
[CE283DD0] [C000E61C] die+0x1f0/0x27c
[CE283E00] [C0014B18] bad_page_fault+0x98/0xe8
[CE283E10] [C0010A88] handle_page_fault+0x7c/0x80
[CE283ED0] [C06ABEC0] 0xc06abec0
[CE283EF0] [C0088CF8] drain_array+0xc4/0x100
[CE283F10] [C008A70C] cache_reap+0x94/0x13c
[CE283F30] [C003DA2C] run_workqueue+0xc4/0x198
[CE283F60] [C003E6D4] worker_thread+0x130/0x154
[CE283FB0] [C0042E80] kthread+0xd4/0x110
[CE283FF0] [C0011A70] original_kernel_thread+0x44/0x60
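1.2 Oops analysis
Oops: Kernel access of bad area, sig: 11 [#1]
Exception categories
Oops: an instruction executed in kernel mode raised an exception;
BUG: the kernel detected a logic error (similar to an assert) that affects its further operation;
WARNING: similar to BUG, but does not affect the kernel's further operation;
PANIC: similar to BUG, but the system cannot continue running and hangs or reboots;
SOFTLOCK (soft lockup): tasks have not been scheduled for a long time.

Exception signals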
Signal | Code | Default Action | Description |
SIGABRT | 6 | A | Process abort signal. |
SIGALRM | 14 | T | Alarm clock. |
SIGBUS | 10 | A | Access to an undefined portion of a memory object. |
SIGCHLD | 18 | I | Child process terminated, stopped, or continued. |
SIGCONT | 25 | C | Continue executing, if stopped. |
SIGFPE | 8 | A | Erroneous arithmetic operation. |
SIGHUP | 1 | T | Hangup. |
SIGILL | 4 | A | Illegal instruction. |
SIGINT | 2 | T | Terminal interrupt signal. |
SIGKILL | 9 | T | Kill (cannot be caught or ignored). |
SIGPIPE | 13 | T | Write on a pipe with no one to read it. |
SIGQUIT | 3 | A | Terminal quit signal. |
SIGSEGV | 11 | A | Invalid memory reference. |
SIGSTOP | 23 | S | Stop executing (cannot be caught or ignored). |
SIGTERM | 15 | T | Termination signal. |
SIGTSTP | 24 | S | Terminal stop signal. |
SIGTTIN | 26 | S | Background process attempting read. |
SIGTTOU | 27 | S | Background process attempting write. |
SIGUSR1 | 16 | T | User-defined signal 1. |
SIGUSR2 | 17 | T | User-defined signal 2. |
SIGPOLL | 22 | T | Pollable event. |
SIGPROF | 29 | T | Profiling timer expired. |
SIGSYS | 12 | A | Bad system call. |
SIGTRAP | 5 | A | Trace/breakpoint trap. |
SIGURG | 21 | I | High bandwidth data is available at a socket. |
SIGVTALRM | 28 | T | Virtual timer expired. |
SIGXCPU | 30 | A | CPU time limit exceeded. |
SIGXFSZ | 31 | A | File size limit exceeded. |
Default Actions:
T - Abnormal termination of the process. The process is terminated with all the consequences of _exit() except that the status made available to wait() and waitpid() indicates abnormal termination by the specified signal.
A - Abnormal termination of the process. Additionally, implementation-defined abnormal termination actions, such as creation of a core file, may occur.
I - Ignore the signal.
S - Stop the process.
C - Continue the process, if it is stopped; otherwise, ignore the signal.

For the PowerPC e500 core specifically, the mapping between exceptions and signals is as follows:
[exception-to-signal mapping table not preserved in the source]
So some task accessed an address outside its valid virtual address space, and the kernel reported signal 11, SIGSEGV (segmentation fault).
Which process was it?
Other
#1 is die_counter, the number of Oopses that have occurred so far. If there are several Oopses, look at the first one, because the later ones may be caused by errors propagating from the first.

1.3 Register analysis
NIP: C0088B8C LR: C0088CF8 CTR: 00000000
This line lists three registers.
NIP is the next instruction pointer; in an Oops it holds the address of the faulting instruction.
LR is the link register; it holds the return address stored by the most recent branch-and-link instruction, not the address of the previous instruction.
CTR is the count register; it is used as the counter for loop instructions.
REGS: ce283e20 TRAP: 0300   Not tainted  (2.6.21.7-EMBSYS-CGEL-3.04.10.P6.F5)
TRAP: the exception vector, i.e. the entry offset of the exception handler, which identifies the type of exception. REGS: the address of the pt_regs structure saved on the kernel stack. pt_regs holds the minimal state that must be saved on kernel entry, for example on every system call, interrupt, trap or fault.
The TRAP values are:
    0x100:   "(System Reset)"
    0x200:   "(Machine Check)"
    0x300:   "(Data Access)"
    0x380:   "(Data SLB Access)"
    0x400:   "(Instruction Access)"
    0x480:   "(Instruction SLB Access)"
    0x500:   "(Hardware Interrupt)"
    0x600:   "(Alignment)"
    0x700:   "(Program Check)"
    0x800:   "(FPU Unavailable)"
    0x900:   "(Decrementer)"
    0xc00:   "(System Call)"
    0xd00:   "(Single Step)"
    0xf00:   "(Performance Monitor)"
    0xf20:   "(Altivec Unavailable)"
    0x1300:  "(Instruction Breakpoint)"
For details see "5.7 Interrupt Definitions" in the PowerPC e500 Core Family Reference Manual.
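The TRAP table above can be turned into a simple lookup. A minimal sketch: the struct and function names are made up for this example, and the table contains only the values listed above.

```c
#include <stddef.h>

/* One row of the TRAP table: vector value and its description. */
struct trap_name {
    unsigned int vector;
    const char *name;
};

static const struct trap_name trap_names[] = {
    {0x100,  "System Reset"},
    {0x200,  "Machine Check"},
    {0x300,  "Data Access"},
    {0x380,  "Data SLB Access"},
    {0x400,  "Instruction Access"},
    {0x480,  "Instruction SLB Access"},
    {0x500,  "Hardware Interrupt"},
    {0x600,  "Alignment"},
    {0x700,  "Program Check"},
    {0x800,  "FPU Unavailable"},
    {0x900,  "Decrementer"},
    {0xc00,  "System Call"},
    {0xd00,  "Single Step"},
    {0xf00,  "Performance Monitor"},
    {0xf20,  "Altivec Unavailable"},
    {0x1300, "Instruction Breakpoint"},
};

/* Returns the description for a TRAP value, or "Unknown" if not listed. */
const char *trap_to_name(unsigned int trap)
{
    size_t i;

    for (i = 0; i < sizeof(trap_names) / sizeof(trap_names[0]); i++) {
        if (trap_names[i].vector == trap)
            return trap_names[i].name;
    }
    return "Unknown";
}
```

For the Oops in this example, TRAP: 0300 maps to "Data Access", which is consistent with the "Unable to handle kernel paging request for data" message.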

tainted: kernel taint information, set by add_taint. The flags mean:
 *  'P' - Proprietary module has been loaded.
 *  'F' - Module has been forcibly loaded.
 *  'S' - SMP with CPUs not designed for SMP.
 *  'R' - User forced a module unload.
 *  'M' - System experienced a machine check exception.
 *  'B' - System has hit bad_page.
 *  'U' - Userspace-defined naughtiness.
 *  'D' - Kernel has oopsed before
 *  'A' - ACPI table overridden.
 *  'W' - Taint on warning.
 *  'C' - modules from drivers/staging are loaded.

MSR: 00021000 <ME>  CR: 22004222  XER: 00000000
DAR: 36FEF31E, DSISR: 00800000
MSR is the machine state register.
CR is the condition register.
XER is the Integer Exception Register.
DAR is the data address register; it holds the address whose access caused the memory fault. On the e500 it is the Data Exception Address Register (DEAR).
DSISR is the Data Storage Interrupt Status Register, which records the cause of the memory access fault. On the e500 it is the Exception Syndrome Register (ESR). 0x00800000 is the store-operation bit: it indicates that the alignment, data storage or data TLB error exception was caused by a store.

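TASK = cffdf180[26] 'events/1' THREAD: ce282000 CPU: 1
cffdf180: address of the process's task_struct;
26: process ID;
events/1: process name;
THREAD: start address of the process's kernel stack;
CPU: the current CPU.
So the current process, 'events/1', is the one that took the SIGSEGV exception.
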
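GPR00: 00100100 CE283ED0 CFFDF180 CF528000 C09EA500 EFFEAD20 CF5188A0 00000000
GPR08: CF5188BC 00200200 36FEF31E D1FD7F9E 22004222 1010DA44 00000290 00000000
GPR16: 1011C858 100147F4 BF9BC9C4 10100000 00000001 C0460000 C06454CC 00000000
GPR24: C0640000 CE282000 C0640000 00000005 00000000 00000000 EFFE8EC0 CFFED958
    The register usage rules defined by the PowerPC ABI are as follows:
  (1) GPR0: a volatile register. The ABI specifies that ordinary users must not use it. GCC uses it to save the LR register, and Linux PowerPC uses it to pass the system call number.
  (2) GPR1: a dedicated register. The ABI specifies that it holds the stack pointer (top of the stack).
  (3) GPR2: a dedicated register. The ABI specifies that ordinary users do not use it. Linux PowerPC uses it to hold the address of the current process descriptor (in this dump GPR2 is CFFDF180, the same value as TASK).
  (4) GPR3-GPR4: volatile registers. The ABI uses these two registers to hold function return values or to pass arguments.
  (5) GPR5-GPR10: also volatile registers. Together with GPR3 and GPR4 they form the eight registers used to pass function arguments. When a function has more than eight arguments, the stack is used to pass the rest.
  (6) GPR11-GPR12: volatile registers. The ABI specifies that ordinary users do not use them. Linux PowerPC sometimes uses them for temporary variables, but GCC does not use them.
  (7) GPR13: a dedicated register. The ABI specifies it as the base address pointer of the sdata section. Linux PowerPC uses it for temporary variables during system initialization. GCC sometimes places frequently used data in the sdata or sbss sections according to certain rules. Applications access data in sdata or sbss through a different mechanism from data and bss, and access to sdata is faster.
  (8) GPR14-GPR31: nonvolatile registers. The ABI uses them to hold temporary variables, and applications can use them freely.
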
1.4 Call stack analysis
Call chain

NIP [C0088B8C] free_block+0xc4/0x16c
LR [C0088CF8] drain_array+0xc4/0x100
Call Trace:
[CE283ED0] [C06ABEC0] 0xc06abec0(unreliable)
[CE283EF0] [C0088CF8] drain_array+0xc4/0x100
[CE283F10] [C008A70C] cache_reap+0x94/0x13c
[CE283F30] [C003DA2C] run_workqueue+0xc4/0x198
[CE283F60] [C003E6D4] worker_thread+0x130/0x154
[CE283FB0] [C0042E80] kthread+0xd4/0x110
[CE283FF0] [C0011A70] original_kernel_thread+0x44/0x60
Instruction dump:
5400cffe 0f000000 80c4001c 7d1cf214 3c000010 3d200020 80a8001c 60000100
81660000 61290200 81460004 3906001c <916a0000> 914b0004 90060000 91260004
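[CE283FB0] [C0042E80] kthread+0xd4/0x110
CE283FB0: stack address of the frame;
C0042E80: the LR value saved on the stack, i.e. the function return address;
kthread: function name;
0xd4/0x110: offset of that address within the function / total size of the function.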
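The symbol+offset/size notation is easy to parse mechanically. A minimal sketch using sscanf follows; the struct name, the helper names and the 63-character limit on the symbol name are assumptions for this example.

```c
#include <stdio.h>

/* Parsed form of a "symbol+0xOFFSET/0xSIZE" call trace entry. */
struct sym_loc {
    char name[64];          /* function name */
    unsigned long offset;   /* offset of the address within the function */
    unsigned long size;     /* total size of the function */
};

/* Returns 0 on success, -1 if the text does not match the expected form. */
int parse_sym_loc(const char *text, struct sym_loc *out)
{
    /* %63[^+] reads the symbol name up to (not including) the '+'. */
    if (sscanf(text, "%63[^+]+%lx/%lx",
               out->name, &out->offset, &out->size) != 3)
        return -1;
    return 0;
}

/* Start address of the function, given an address inside it. */
unsigned long func_start(unsigned long addr, const struct sym_loc *loc)
{
    return addr - loc->offset;
}
```

For the entry above, the saved LR C0042E80 minus the offset 0xd4 gives C0042DAC as the start of kthread, which can be checked against the symbol table of the matching vmlinux.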
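
static void free_block(struct kmem_cache *cachep, void **objpp, int nr_objects, int node)

According to the call stack, the kernel faulted in free_block, which was called from drain_array. Comparing the prototype of free_block with the argument registers GPR3-GPR6 (CF528000 C09EA500 EFFEAD20 CF5188A0), the values for int nr_objects and int node look clearly abnormal, which suggests that the call stack may have been overwritten. Note that these are the register values at the time of the fault, so they may no longer hold the original arguments.

Instruction codes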
Instruction dump:
5400cffe 0f000000 80c4001c 7d1cf214 3c000010 3d200020 80a8001c 60000100
81660000 61290200 81460004 3906001c <916a0000> 914b0004 90060000 91260004
Instruction dump prints the instruction words around NIP. <916a0000> is the instruction at NIP.
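The faulting word can also be decoded by hand before disassembling the whole kernel. The sketch below assumes the standard PowerPC D-form layout for loads and stores (6-bit primary opcode, two 5-bit register fields, 16-bit signed displacement); the struct and function names are illustrative, not taken from any tool.

```c
#include <stdint.h>

#define PPC_OP_STW 36u   /* primary opcode of "stw" (store word) */

/* Fields of a D-form instruction such as lwz/stw. */
struct ppc_dform {
    unsigned int opcode;   /* primary opcode, top 6 bits */
    unsigned int rs;       /* source (store) or target (load) register */
    unsigned int ra;       /* base address register */
    int32_t d;             /* sign-extended 16-bit displacement */
};

struct ppc_dform ppc_decode_dform(uint32_t insn)
{
    struct ppc_dform f;

    f.opcode = insn >> 26;
    f.rs = (insn >> 21) & 0x1f;
    f.ra = (insn >> 16) & 0x1f;
    f.d = (int16_t)(insn & 0xffff);
    return f;
}

/* Effective address of the access, given the GPR values from the Oops. */
uint32_t ppc_dform_ea(struct ppc_dform f, const uint32_t gpr[32])
{
    /* For D-form loads and stores, RA = 0 means a base of 0, not GPR0. */
    uint32_t base = (f.ra == 0) ? 0 : gpr[f.ra];

    return base + (uint32_t)f.d;
}
```

Applied to 0x916a0000 this gives opcode 36 (stw), RS = 11, RA = 10 and displacement 0, i.e. stw r11,0(r10). GPR10 in the register dump is 36FEF31E, the same value as DAR, and a store agrees with the store bit set in DSISR.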
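Locating the code by disassembly
objdump -dS vmlinux > /tmp/kernel.s
Find the instruction 916a0000 at the faulting address c0088b8c in the output to determine exactly which C statement raised the exception.
vmlinux must be a kernel image built with debug information, from the same version as the faulting kernel.

2. Kernel exception analysis on a small MIPS system

2.1 Exception output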
  0:Oops[#1]:
  0:Cpu 0
  0:Show thread info from vcpu 0
  0: VCPU   Stack bottom        Task                 Ti at
  0:  0    c000000595057fe0    swapper              c000000595054000
  0:Thread info( c000000595054000 ):
  0:    Process swapper (pid: 1)
  0:  exec_domain ffffffffc0f299b0
  0:  flags 100000
  0:  tp_value 0
  0:  cpu 0
  0:  preempt_count 2
  0:  regs (null)
  0:STACK_END_MAGIC at va( c000000595054068 ): 57AC6E9D( =  57AC6E9D)
  0:
  0:$ 0   :  0: 0000000000000000  0: 0000000000000000  0: 0000000000000000  0: 0000000000000001  0:
  0:$ 4   :  0: 0000000000000000  0: 0000000000000000  0: ffffffffffffffff  0: 0000000000002976  0:
  0:$ 8   :  0: 0000000000007fff  0: 000000000000000a  0: 5f73746172747570  0: 000000000000006c  0:
  0:$12   :  0: 0000000000000068  0: 000000000000004c  0: ffffffffc10bc384  0: c000000593338000  0:
  0:$16   :  0: 0000000000000000  0: ffffffffc10e42b8  0: ffffffffc10e0000  0: ffffffffc10e0000  0:
  0:$20   :  0: 0000000000000000  0: 0000000000000080  0: 0000000000000080  0: 0000000000000000  0:
  0:$24   :  0: 0000000000000006  0: ffffffffc06501a8  0:                   0:                   0:
  0:$28   :  0: c000000595054000  0: c000000595057c88  0: 0000000000000000  0: ffffffffc087bf40  0:
  0:Hi    : 0000000000000000
  0:Lo    : 0000000000000000
  0:epc   : ffffffffc087c4b4 _bcore_cleanup+0x34/0x190
  0:    Not tainted
  0:ra    : ffffffffc087bf40 _init+0x3e8/0x480
  0:Status: 5400ffe3      0:KX   0:SX   0:UX   0:KERNEL   0:EXL   0:IE   0:
  0:Cause : 00800008
  0:BadVA : 0000000000000008
  0:PrId  : 000c1102 (XLP316   A2  )
  0:<d>Modules linked in:  0:
  0:Process swapper (pid: 1, threadinfo=c000000595054000, task=c000000595053898, tls=0000000000000000)
  0:Stack :  0: ffffffffffffffff  0: ffffffffc10e0000  0: c000000595193240  0: 0000000000000001  0:
         0: ffffffffc104365c  0: ffffffffc087bf40  0: 000001fac104365c  0: ffffffffc087cb30  0:
         0: ffffffffc087c3a8  0: 0000000000000000  0: ffffffffc0f4a778  0: c000000595193000  0:
         0: c000000595193240  0: 0000000000000001  0: ffffffffc10e0000  0: c000000595193240  0:
         0: 0000000000000001  0: ffffffffc104365c  0: 0000000000000000  0: 0000000000000080  0:
         0: 0000000000000080  0: ffffffffc1043c44  0: 00008a17bc300000  0: ffffffffc10e0000  0:
         0: c00000059333dd40  0: 0000000000000000  0: 3800000000000000  0: 0000000000000000  0:
         0: 000000009333dd40  0: ffffffffc1043638  0: 000000005400ffe0  0: ffffffffbfff00fe  0:
         0: ffffffffc1070000  0: ffffffffc1063200  0: 0000000000000001  0: ffffffffc104365c  0:
         0: 0000000000000000  0: 0000000000000080  0: 0000000000000080  0: 0000000000000000  0:
         0: ...  0:
  0:Call Trace: [jiffies: 0xfffff79f]
  0:[<ffffffffc087c4b4>] _bcore_cleanup+0x34/0x190
  0:[<ffffffffc087bf40>] _init+0x3e8/0x480
  0:[<ffffffffc1043c44>] bcmxgs_init_module+0x5e8/0xc00
  0:[<ffffffffc060eebc>] do_one_initcall+0x3c/0x1a0
  0:[<ffffffffc102cc04>] kernel_init+0x220/0x2b8
  0:[<ffffffffc062c730>] kernel_thread_helper+0x10/0x20
  0:
  0:
Code:  0: ffbf0028   0: 0000802d   0: 663142b8   0:<dc420008>  0: 0040f809   0: 00000000   0: 0202102a   0: 1040001d   0: 00000000
  0:
  0:<4>Disabling lock debugging due to kernel taint
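2.2 Exception signals
Relationship between exceptions and signals:
[exception-to-signal mapping table not preserved in the source]
2.3 Thread information analysis
  0:Cpu 0
Both 0s are the ID of the current CPU core.
  0:Show thread info from vcpu 0
  0: VCPU   Stack bottom        Task                 Ti at
  0:  0    c000000595057fe0    swapper              c000000595054000
VCPU: the CPU core;
Stack bottom: pointer to the bottom of the stack;
Task: thread name;
Ti at: pointer to the thread's thread_info structure.
  0:Thread info( c000000595054000 ):
  0:    Process swapper (pid: 1)
  0:  exec_domain ffffffffc0f299b0
  0:  flags 100000
  0:  tp_value 0
  0:  cpu 0
  0:  preempt_count 2
  0:  regs (null)
  0:STACK_END_MAGIC at va( c000000595054068 ): 57AC6E9D( =  57AC6E9D)
Thread info( c000000595054000 ): information about the thread that raised the exception. The fields below it are members of the thread_info structure. Among them:
flags: the thread flag bits, defined as listed below. The value here, 0x100000, is bit 20, TIF_FIXADE, which means that address errors for this thread are fixed up in software.
preempt_count: the preemption count. When it is 0, the kernel can safely preempt this thread. A nonzero value means preemption is disabled, for example because the thread holds a lock and must not give up the CPU.
STACK_END_MAGIC: the magic number at the end of the stack, which helps determine whether the stack has been overwritten. Here the stored value 57AC6E9D matches the expected value.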
#define TIF_SIGPENDING 1 /* signal pending */
#define TIF_NEED_RESCHED 2 /* rescheduling necessary */
#define TIF_SYSCALL_AUDIT 3 /* syscall auditing active */
#define TIF_SECCOMP 4 /* secure computing */
#define TIF_NOTIFY_RESUME 5 /* callback before returning to user */
#define TIF_RESTORE_SIGMASK 9 /* restore signal mask in do_signal() */
#define TIF_USEDFPU 16 /* FPU was used by this task this quantum (SMP) */
#define TIF_POLLING_NRFLAG 17 /* true if poll_idle() is polling TIF_NEED_RESCHED */
#define TIF_MEMDIE 18
#define TIF_FREEZE 19
#define TIF_FIXADE 20 /* Fix address errors in software */
#define TIF_LOGADE 21 /* Log address errors to syslog */
#define TIF_32BIT_REGS 22 /* also implies 16/32 fprs */
#define TIF_32BIT_ADDR 23 /* 32-bit address space (o32/n32) */
#define TIF_FPUBOUND 24 /* thread bound to FPU-full CPU set */
#define TIF_LOAD_WATCH 25 /* If set, load watch registers */
#define TIF_XKPHYS_MEM_EN 26
#define TIF_XKPHYS_IO_EN 27
#define TIF_SYSCALL_TRACE 31 /* syscall trace active */
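The flags value printed in the log is a bit mask, so each set bit maps to one TIF_* number from the list above. A minimal sketch of that conversion; the function name and the output array are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define TIF_FIXADE 20   /* same value as in the list above */

/*
 * Stores the numbers of all set bits of `flags` in `bits`
 * (lowest bit first) and returns how many were found.
 */
size_t tif_set_bits(uint32_t flags, unsigned int bits[32])
{
    size_t n = 0;
    unsigned int bit;

    for (bit = 0; bit < 32; bit++) {
        if (flags & (UINT32_C(1) << bit))
            bits[n++] = bit;
    }
    return n;
}
```

For the value in this log, 0x100000, the only set bit is bit 20, which is TIF_FIXADE.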

2.4 Register analysis
  0:$ 0   :  0: 0000000000000000  0: 0000000000000000  0: 0000000000000000  0: 0000000000000001  0:
  0:$ 4   :  0: 0000000000000000  0: 0000000000000000  0: ffffffffffffffff  0: 0000000000002976  0:
  0:$ 8   :  0: 0000000000007fff  0: 000000000000000a  0: 5f73746172747570  0: 000000000000006c  0:
  0:$12   :  0: 0000000000000068  0: 000000000000004c  0: ffffffffc10bc384  0: c000000593338000  0:
  0:$16   :  0: 0000000000000000  0: ffffffffc10e42b8  0: ffffffffc10e0000  0: ffffffffc10e0000  0:
  0:$20   :  0: 0000000000000000  0: 0000000000000080  0: 0000000000000080  0: 0000000000000000  0:
  0:$24   :  0: 0000000000000006  0: ffffffffc06501a8  0:                   0:                   0:
  0:$28   :  0: c000000595054000  0: c000000595057c88  0: 0000000000000000  0: ffffffffc087bf40  0:
  0:Hi    : 0000000000000000
  0:Lo    : 0000000000000000
  0:epc   : ffffffffc087c4b4 _bcore_cleanup+0x34/0x190
  0:    Not tainted
  0:ra    : ffffffffc087bf40 _init+0x3e8/0x480
  0:Status: 5400ffe3      0:KX   0:SX   0:UX   0:KERNEL   0:EXL   0:IE   0:
  0:Cause : 00800008
  0:BadVA : 0000000000000008
  0:PrId  : 000c1102 (XLP316   A2  )

The MIPS core has four register groups: GP, COP0, COP1 and COP2.
Several important COP0 registers are explained below:
Status: the CP0 status register (cp0_status). EXL indicates that the CPU is at exception level. For details see reference 6.7, page 193.
Cause: 00800008. The ExcCode field (bits 6:2) is 2, which indicates a TLB exception (load or instruction fetch).
BadVA: the virtual address that caused the exception, for address errors, TLB invalid, TLB modified and similar exceptions.
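The Cause and Status values can be decoded with a few shifts and masks. The sketch below uses the standard MIPS bit positions (ExcCode in Cause bits 6:2; IE, EXL, ERL and KSU in the low bits of Status); the function names are made up for this example, and only the first 14 exception codes are named.

```c
#include <stdint.h>

#define ST0_IE   0x00000001u   /* interrupts enabled */
#define ST0_EXL  0x00000002u   /* exception level */
#define ST0_ERL  0x00000004u   /* error level */

/* ExcCode field of the Cause register, bits 6:2. */
unsigned int mips_cause_exccode(uint32_t cause)
{
    return (cause >> 2) & 0x1f;
}

/* Mnemonic for the first 14 exception codes, "other" beyond that. */
const char *mips_exccode_name(unsigned int code)
{
    static const char *const names[] = {
        "Int", "Mod", "TLBL", "TLBS", "AdEL", "AdES", "IBE",
        "DBE", "Sys", "Bp", "RI", "CpU", "Ov", "Tr"
    };

    if (code < sizeof(names) / sizeof(names[0]))
        return names[code];
    return "other";
}

/* KSU field of Status, bits 4:3: 0 = kernel, 1 = supervisor, 2 = user. */
unsigned int mips_status_ksu(uint32_t status)
{
    return (status >> 3) & 0x3;
}

/* Nonzero if the EXL bit is set, i.e. the CPU is at exception level. */
int mips_status_exl(uint32_t status)
{
    return (status & ST0_EXL) != 0;
}
```

For this log, Cause 00800008 decodes to ExcCode 2 (TLBL, TLB exception on load or instruction fetch), and Status 5400ffe3 has KSU = 0 with EXL and IE set, matching the KERNEL, EXL and IE markers printed next to it.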
2.5 Call stack analysis
  0:Process swapper (pid: 1, threadinfo=c000000595054000, task=c000000595053898, tls=0000000000000000)
  0:Stack :  0: ffffffffffffffff  0: ffffffffc10e0000  0: c000000595193240  0: 0000000000000001  0:
         0: ffffffffc104365c  0: ffffffffc087bf40  0: 000001fac104365c  0: ffffffffc087cb30  0:
         0: ffffffffc087c3a8  0: 0000000000000000  0: ffffffffc0f4a778  0: c000000595193000  0:
         0: c000000595193240  0: 0000000000000001  0: ffffffffc10e0000  0: c000000595193240  0:
         0: 0000000000000001  0: ffffffffc104365c  0: 0000000000000000  0: 0000000000000080  0:
         0: 0000000000000080  0: ffffffffc1043c44  0: 00008a17bc300000  0: ffffffffc10e0000  0:
         0: c00000059333dd40  0: 0000000000000000  0: 3800000000000000  0: 0000000000000000  0:
         0: 000000009333dd40  0: ffffffffc1043638  0: 000000005400ffe0  0: ffffffffbfff00fe  0:
         0: ffffffffc1070000  0: ffffffffc1063200  0: 0000000000000001  0: ffffffffc104365c  0:
         0: 0000000000000000  0: 0000000000000080  0: 0000000000000080  0: 0000000000000000  0:
         0: ...  0:
  0:Call Trace: [jiffies: 0xfffff79f]
  0:[<ffffffffc087c4b4>] _bcore_cleanup+0x34/0x190
  0:[<ffffffffc087bf40>] _init+0x3e8/0x480
  0:[<ffffffffc1043c44>] bcmxgs_init_module+0x5e8/0xc00
  0:[<ffffffffc060eebc>] do_one_initcall+0x3c/0x1a0
  0:[<ffffffffc102cc04>] kernel_init+0x220/0x2b8
  0:[<ffffffffc062c730>] kernel_thread_helper+0x10/0x20
  0:
  0:
Code:  0: ffbf0028   0: 0000802d   0: 663142b8   0:<dc420008>  0: 0040f809   0: 00000000   0: 0202102a   0: 1040001d   0: 00000000
  0:
Call Trace: the call stack of the thread that raised the exception. Stack: the raw stack contents of that thread.
Code: the instruction words around the exception. 0:<dc420008> is the instruction at epc, which corresponds to the code location (epc   : ffffffffc087c4b4 _bcore_cleanup+0x34/0x190). The exact source line has to be located by disassembly.
The disassembly procedure is the same as for PowerPC.
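As on PowerPC, the word at epc can be decoded by hand as a quick check. The sketch below assumes the standard MIPS I-type layout (6-bit opcode, base register rs, target register rt, 16-bit signed offset); the struct and function names are illustrative only.

```c
#include <stdint.h>

#define MIPS_OP_LD 55u   /* opcode of "ld" (load doubleword) */

/* Fields of an I-type load/store instruction. */
struct mips_itype {
    unsigned int opcode;   /* top 6 bits */
    unsigned int rs;       /* base register */
    unsigned int rt;       /* target register */
    int32_t imm;           /* sign-extended 16-bit offset */
};

struct mips_itype mips_decode_itype(uint32_t insn)
{
    struct mips_itype f;

    f.opcode = insn >> 26;
    f.rs = (insn >> 21) & 0x1f;
    f.rt = (insn >> 16) & 0x1f;
    f.imm = (int16_t)(insn & 0xffff);
    return f;
}

/* Effective address of the access, given the register values from the log. */
uint64_t mips_itype_ea(struct mips_itype f, const uint64_t gpr[32])
{
    return gpr[f.rs] + (uint64_t)(int64_t)f.imm;
}
```

Applied to 0xdc420008 this gives opcode 55 (ld), rs = 2, rt = 2 and offset 8, i.e. ld $2,8($2). Register $2 is 0000000000000000 in the dump, so the load address is 0 + 8 = 8, which is exactly BadVA.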
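
Analysis of the code shows that the exception was caused by an access to the illegal address BadVA : 0000000000000008. Looking at the _bcore_cleanup code, the bde pointer had not been initialized at this point and was a NULL pointer, so the address of bde->num_devices is exactly 0000000000000008, which triggers the exception.
The faulting code is:
_bcore_cleanup(void)
{
    for (unit = 0; unit < bde->num_devices(BDE_ALL_DEVICES); unit++)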
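The link between a NULL structure pointer and a small nonzero fault address can be shown with offsetof. The structure below is a hypothetical stand-in, not the real definition behind bde: it only assumes that num_devices is the second pointer-sized member, which on a 64-bit kernel places it at offset 8.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the structure that bde points to. */
struct example_bde {
    const char *(*name)(void);      /* assumed first member, offset 0 */
    int (*num_devices)(int type);   /* assumed second member */
};

/*
 * Address the CPU loads from when evaluating base->num_devices:
 * the base address plus the member's offset within the structure.
 */
uintptr_t num_devices_load_addr(uintptr_t base)
{
    return base + offsetof(struct example_bde, num_devices);
}
```

With a NULL base the load address equals the member offset, so on a 64-bit system num_devices_load_addr(0) is 8 under this assumed layout. This is why a small BadVA usually points to a member access through a NULL pointer rather than to a wild pointer.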

6. References
6.1  http://en.wikipedia.org/wiki/Unix_signal
6.2  http://www.powerlinuxchina.net/club/viewthread.php?tid=981
6.3  "PowerPC e500 Application Binary Interface User's Guide"
6.4  "PowerPC e500 Core Family Reference Manual"
6.5  "MPC8572E PowerQUICC III Integrated Host Processor Family Reference Manual"
6.6  "SYSTEM V APPLICATION BINARY INTERFACE – MIPS RISC Processor Supplement"
6.7  "XLP 300-/300-Lite-Series-Processor Programmer's Register Reference Guide"
6.8  http://blog.chinaunix.net/uid-16459552-id-3459993.html
6.9  http://blog.chinaunix.net/uid-16459552-id-3257539.html
6.10 http://www.linuxspy.info/2249/tainted-kernel/

--EOF--