Purpose: This is a short post in which we will see how to look-up a symbol name (function or a variable) from the address that it outputted in the kernel dmesg logs and vice versa.
Note: Please note that this is not a HOWTO on how to debug Linux Kernel when it does a OOPS or Kernel Panic. May be I will cover those topics at some later date.
Let’s say you upgrade your kernel and you notice the following anomaly in your dmesg logs even though your system appears to boot fine at least to the naked eye:
[ 24.155180] irq 17: nobody cared (try booting with the "irqpoll" option)
[ 24.155284] Pid: 0, comm: swapper Not tainted 2.6.32-bpo.5-686 #1
[ 24.155373] Call Trace:
[ 24.155463] [<c106cf75>] ? __report_bad_irq+0x24/0x69
[ 24.155554] [<c106cf7c>] ? __report_bad_irq+0x2b/0x69
[ 24.155644] [<c106d0a1>] ? note_interrupt+0xe7/0x13e
[ 24.155734] [<c106d5cf>] ? handle_fasteoi_irq+0x7a/0x97
[ 24.155827] [<c1004dd7>] ? handle_irq+0x17/0x1b
[ 24.155915] [<c1004659>] ? do_IRQ+0x38/0x89
[ 24.156002] [<c10037f0>] ? common_interrupt+0x30/0x38
[ 24.156096] [<c10400d8>] ? ftrace_raw_output_workqueue_insertion+0x80/0x8c
[ 24.156192] [<c104f06e>] ? tick_nohz_stop_sched_tick+0x34a/0x36e
[ 24.156285] [<c1002367>] ? cpu_idle+0x67/0xa4
[ 24.156375] [<c13bf7fc>] ? start_kernel+0x318/0x31d
[ 24.156461] handlers:
[ 24.156539] [<f7cc3855>] (usb_hcd_irq+0x0/0x71 [usbcore])
[ 24.156755] Disabling IRQ #17
Now let’s say you want to understand what does the above highly technical output means. Let’s pickup a line from the above out say the one in the red color and we assume that this line might give us useful information regarding the above anomaly, although that might not be the case because in an actual oops, it is generally the first or the last line in the backtrace that provided the most useful debugging information. Let’s understand one-by-one what each of those columns in the following line means:
[24.155827] [<c1004dd7>] ? handle_irq+0x17/0x1b
[24.155827] is the timestamps information printed by the Linux Kernel. This simply means that this line was output approximately 24 seconds later after the Linux Kernel started booting.
[<c1004dd7>] is the address (let’s just say for now) of the function, handle_irq, in the System Map table.
handle_irq is name of the “offending” function in the Linux Kernel code.
0x17 is the offset (and not the Line Number) which basically points out to the piece of the code in the function that was involved in the above action.
0x1b is size of the function, handle_irq. The offset (0x17) should always be less than the size of the function (0x1b).
Now let’s see how this information related to the System Map table file that is generated when you install your Linux Kernel.
Give the following command:
# cat /boot/System.map-2.6.32-bpo.5-686 | grep handle_irq
You will need to substitute the name of your Kernel System Map table file accordingly.
c1004dc0 T handle_irq
c100cffc t intel_pmu_handle_irq
c100d2a4 t p6_pmu_handle_irq
c100d3bd t amd_pmu_handle_irq
We are interested in the first line since that is what we think our offending function is. However, the address c1004dc0 does not matches with the address (c1004dd7) that we found in the output above. So how do we reconcile these numbers?
Let’s try to subtract the hex number c1004dc0 from c1004dd7 i.e.
c1004dd7 - c1004dc0 = 17
Bingo! The number 17 matches with the output, 0x17, that we got in the anomaly. So to summarize:
c1004dc0 is the starting address of the function handle_irq.
c1004dd7 is the address (after adding 17 to the above starting address) of the piece of the code in the function.
Similarly, you can do these exercise for other functions like cpu_idle and start_kernel from the above dmesg output.