Decode segfault errors in dmesg
You are writing a C program, and the time has come to run it. You are pretty confident that it will work on the first try.
    $ ./foo
    Segmentation fault
The machine harshly reminds you that you were over-confident. But
before rushing to re-compile your program with debugging symbols or
adding printf()
calls here and there, have a look at the output of the
Linux kernel:
    $ dmesg
    foo[1234]: segfault at 2a ip 0000000000400511 sp 00007fffe00a3260 error 4 in foo[400000+1000]
These are some hints in dmesg output:
- foo is the executable name
- 1234 is the process ID
- 2a is the faulty address in hexadecimal
- the value after ip is the instruction pointer
- the value after sp is the stack pointer
- error 4 is an error code
- the string at the end is the name of the virtual memory area (VMA)
The error code is a combination of several error bits defined in fault.c in the Linux kernel:
    /*
     * Page fault error code bits:
     *
     *   bit 0 ==  0: no page found       1: protection fault
     *   bit 1 ==  0: read access         1: write access
     *   bit 2 ==  0: kernel-mode access  1: user-mode access
     *   bit 3 ==                         1: use of reserved bit detected
     *   bit 4 ==                         1: fault was an instruction fetch
     *   bit 5 ==                         1: protection keys block access
     *   bit 15 =                         1: SGX MMU page-fault
     */
    enum x86_pf_error_code {
        X86_PF_PROT  = 1 << 0,
        X86_PF_WRITE = 1 << 1,
        X86_PF_USER  = 1 << 2,
        X86_PF_RSVD  = 1 << 3,
        X86_PF_INSTR = 1 << 4,
        X86_PF_PK    = 1 << 5,
        X86_PF_SGX   = 1 << 15,
    };
Since you are executing a user-mode program, X86_PF_USER
is set and the
error code is at least 4. If the invalid memory access is a write,
then X86_PF_WRITE
is set. Thus:
- if the error code is 4, then the faulty memory access is a read from userland
- if the error code is 6, then the faulty memory access is a write from userland
Moreover, the faulty memory address in dmesg
can help you identify
the bug. For instance, if the memory address is 0, the root cause is
probably a NULL pointer dereference.
The name of the VMA may give you an indication of the location of the error:
    #include <stdlib.h>

    int main(void)
    {
        free((void *) 42);
        return 0;
    }
When executed, the program above triggers a segfault and the VMA name is that of the libc, which suggests that a libc function was called with an invalid pointer:
    progname[1234]: segfault at 22 ip 00007f6b2531473c sp 00007ffc7b2c5c30 error 4 in libc-2.31.so[7f6b252af000+14b000]
The fault handler is architecture dependent, so you will not observe
the same messages in dmesg on architectures other than x86. For
instance, on ARM no message is displayed unless the Linux kernel has
been built with CONFIG_DEBUG_USER.
