This document discusses kernel oops and kernel panics in Linux. A kernel oops occurs when there is an illegal instruction or illegal memory access in kernel space, and will kill the process to keep the system running. A kernel panic means the system stops running immediately. A kernel oops is caused by things like illegal instructions, unrecognized system calls, or undefined CPU instructions, while a kernel panic results from more severe issues like an illegal instruction in interrupt vectors.

Understanding a Kernel Oops and a Kernel Panic


Linux Kernel Version 4.9.8

Joseph Lu
[email protected]
Understanding a Kernel Oops and a Kernel panic
• Illegal instruction (SIGILL), illegal memory access (SIGSEGV)
– In user space: segmentation fault.
– In kernel space: kernel oops.
• A "kernel oops" kills the offending process to keep the system running, unless the oops is bad enough to cause a "kernel panic".
• A "kernel panic" means the system decides to stop running immediately.

Flow: Oops → moderate: do_exit(); severe: Panic.

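The user-space half of this distinction can be observed directly. The following is a minimal sketch, assuming a Linux/POSIX system; the helper names null_write and signal_that_killed are made up for illustration. A child process writes through a NULL pointer, and the parent reports which signal terminated it.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read at run time so the compiler cannot fold the NULL dereference away. */
static int *volatile bad_ptr = NULL;

/* Hypothetical helper: performs an illegal memory access in user space. */
static void null_write(void)
{
    *bad_ptr = 1;
}

/*
 * Hypothetical helper: run fn() in a child process and return the signal
 * that terminated it, 0 if it exited normally, or -1 on fork/wait failure.
 */
static int signal_that_killed(void (*fn)(void))
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        fn();
        _exit(0);   /* only reached if fn() did not fault */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On a typical Linux system, signal_that_killed(null_write) is expected to return SIGSEGV. Only the child dies and the parent keeps running, which is the user-space counterpart of the kernel killing the offending process on an oops.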
Linux kernel oops
• Illegal instruction (SIGILL), illegal memory access (SIGSEGV)
– In kernel: cause of an oops.

An illegal instruction that causes a panic:
• bad_mode() handles the impossible case in the interrupt vectors:
→ die("Oops - bad mode", …);
→ panic("bad mode");

Flow: illegal instruction → die() → Oops → panic().

Linux kernel oops
• Illegal instruction (SIGILL), illegal memory access (SIGSEGV)
– In kernel: cause of an oops.

An illegal instruction that causes a panic: it handles the impossible case in the interrupt vectors.

An illegal instruction that causes an oops:
1. Call BUG()
2. Unrecognised system calls:
• arm_syscall() → arm_notify_die("Oops - bad syscall(2)", …)
3. Undefined CPU instruction:
• do_undefinstr() → arm_notify_die("Oops - undefined instruction", …)
4. Unknown data abort: data accesses (load or store)
• Alignment faults, translation faults, access bit faults, domain faults, permission faults.
• E.g. unaligned memory access, memory access to reserved areas, a write to ROM (flash) space.
• E.g. the instruction translation lookaside buffer (ITLB) and the data translation lookaside buffer (DTLB) aren't seeing the same picture.
• E.g. baddataabort() → arm_notify_die("unknown data abort code", …)
• E.g. do_DataAbort() → arm_notify_die("", …);
5. Prefetch abort:
• do_PrefetchAbort() → arm_notify_die("", …);

Flow: illegal instruction → user_mode? Yes: force_sig_info(). No: arm_notify_die() → Oops.

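The user_mode branch in this flow can be modelled in user space. The sketch below is not kernel code: struct regs_model, struct result, enum outcome and notify_die_model are hypothetical names, and the only logic encoded is the branch drawn on the slide (a fault taken from user mode becomes a signal for the task, a fault taken from kernel mode becomes an oops). In the 4.9 ARM tree this check appears to live inside arm_notify_die() itself, so the model folds both into one function.

```c
#include <signal.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct pt_regs: only records the CPU mode. */
struct regs_model {
    bool user_mode;
};

enum outcome {
    OUTCOME_SIGNAL, /* force_sig_info(): the user task receives a signal */
    OUTCOME_OOPS    /* die(): the kernel prints an oops */
};

struct result {
    enum outcome outcome;
    int signo;      /* meaningful only for OUTCOME_SIGNAL */
};

/* Model of the branch: user mode gives a signal, kernel mode gives an oops. */
static struct result notify_die_model(const struct regs_model *regs, int signo)
{
    struct result r;

    if (regs->user_mode) {
        r.outcome = OUTCOME_SIGNAL;
        r.signo = signo;
    } else {
        r.outcome = OUTCOME_OOPS;
        r.signo = 0;
    }
    return r;
}
```

For example, an undefined instruction hit by a user task maps to OUTCOME_SIGNAL with SIGILL, while the same instruction executed in kernel mode maps to OUTCOME_OOPS.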
Linux kernel oops
• Illegal instruction (SIGILL), illegal memory access (SIGSEGV)
– In kernel: cause of an oops.

Flow: illegal instruction → user_mode? Yes: force_sig_info(). No: arm_notify_die() → Oops.
Flow: __do_kernel_fault() → Oops.

Linux kernel oops
• __do_kernel_fault(): oops
• __do_user_fault(): SIGSEGV

Code flow of data abort (1/2)
[Figure: code listing with the start and end of a macro marked]

Code flow of data abort (2/2)
[Figure: not recoverable from the extracted text]

Linux kernel oops
• Illegal instruction (SIGILL), illegal memory access (SIGSEGV)
– In kernel: cause of an oops.

An illegal instruction that causes a panic: it handles the impossible case in the interrupt vectors.

An illegal instruction that causes an oops:
1. Call BUG()
2. Unrecognised system calls
3. Undefined CPU instruction
4. Unknown data abort
5. Prefetch abort

Flow: illegal memory access → do_translation_fault (e.g. a NULL pointer dereference) → fixup the exception? No: do_bad_area() → where does it happen?
• In kernel space: __do_kernel_fault() → Oops
• In user space: __do_user_fault() → segmentation fault

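The illegal-memory-access flow can be sketched as a small decision function. This is a user-space model under stated assumptions, not the kernel implementation: struct fault_model, enum fault_result and bad_area_model are hypothetical names, the fixup step is reduced to a boolean, and the checks follow the order drawn on the slide.

```c
#include <stdbool.h>

enum fault_result {
    FAULT_FIXED_UP,   /* the exception was fixed up; execution continues */
    FAULT_SEGV,       /* __do_user_fault(): segmentation fault */
    FAULT_OOPS        /* __do_kernel_fault(): kernel oops */
};

/* Hypothetical description of one faulting access. */
struct fault_model {
    bool from_user_space; /* where did the fault happen? */
    bool has_fixup;       /* assumption: a fixup handler exists for it */
};

/* Model of the slide's flow: fixup first, then dispatch on the origin. */
static enum fault_result bad_area_model(const struct fault_model *f)
{
    if (f->has_fixup)
        return FAULT_FIXED_UP;
    if (f->from_user_space)
        return FAULT_SEGV;
    return FAULT_OOPS;
}
```

A NULL pointer dereference in a driver, with no fixup registered, corresponds to { .from_user_space = false, .has_fixup = false } and yields FAULT_OOPS; the same access made by a user program yields FAULT_SEGV.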
Linux Kernel panic

Flow: __do_kernel_fault() → Oops → oops_end()
• Severe (in interrupt context, or panic_on_oops is set): panic()
• Moderate (in process context): do_exit()

Inside panic():
… → console_verbose() → dump_stack() → panic_smp_self_stop() & smp_send_stop() (shut down other CPUs)
• panic_timeout == 0? Yes: while(1). No: delay timeout seconds → emergency_restart()

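The two decisions on this slide can be written down as a pair of small functions. This is a user-space sketch that only mirrors the branches drawn above; the enum and function names (oops_end_model, panic_tail_model) are hypothetical, and the real panic() performs many more steps than are modelled here.

```c
#include <stdbool.h>

enum oops_action { ACTION_DO_EXIT, ACTION_PANIC };
enum panic_action { PANIC_LOOP_FOREVER, PANIC_DELAY_THEN_RESTART };

/* After an oops: panic in interrupt context or when panic_on_oops is set. */
static enum oops_action oops_end_model(bool in_interrupt, bool panic_on_oops)
{
    if (in_interrupt || panic_on_oops)
        return ACTION_PANIC;
    return ACTION_DO_EXIT;  /* process context: kill only the current task */
}

/* Final step of panic() as drawn on the slide. */
static enum panic_action panic_tail_model(int panic_timeout)
{
    if (panic_timeout == 0)
        return PANIC_LOOP_FOREVER;       /* while(1) */
    return PANIC_DELAY_THEN_RESTART;     /* wait, then emergency_restart() */
}
```

On a running system the two inputs are usually exposed as the sysctls kernel.panic_on_oops and kernel.panic, so an oops in process context with panic_on_oops = 0 ends in do_exit(), while setting panic_on_oops = 1 turns the same oops into a panic.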
Crash Dump Mechanisms in Linux
1. Kdump: build a separate, custom dump-capture kernel for capturing the kernel core dump
2. Ramoops: log data into persistent RAM storage
3. Mtdoops: log data into an MTD partition
4. Reserved-memory

Thank you for listening
