x86: fix NULL pointer deref in __switch_to

Patrick McHardy reported a crash: > > I get this oops once a day, its apparently triggered by something > > run by cron, but the process is a different one each time. > > > > Kernel is -git from yesterday shortly before the -rc6 release > > (last commit is the usb-2.6 merge, the x86 patches are missing), > > .config is attached. > > > > I'll retry with current -git, but the patches that have gone in > > since I last updated don't look related. > > > > [62060.043009] BUG: unable to handle kernel NULL pointer dereference at > > 000001ff > > [62060.043009] IP: [<c0102a9b>] __switch_to+0x2f/0x118 > > [62060.043009] *pde = 00000000 > > [62060.043009] Oops: 0002 [#1] PREEMPT Vegard Nossum analyzed it: > This decodes to > > 0: 0f ae 00 fxsave (%eax) > > so it's related to the floating-point context. This is the exact > location of the crash: > > $ addr2line -e arch/x86/kernel/process_32.o -i ab0 > include/asm/i387.h:232 > include/asm/i387.h:262 > arch/x86/kernel/process_32.c:595 > > ...so it looks like prev_task->thread.xstate->fxsave has become NULL. > Or maybe it never had any other value. Somehow (as described below) TS_USEDFPU is set but the fpu is not allocated or freed. Another possible FPU pre-emption issue with the sleazy FPU optimization which was benign before but not so anymore, with the dynamic FPU allocation patch. New task is getting exec'd and it is prempted at the below point. flush_thread() { ... /* * Forget coprocessor state.. */ clear_fpu(tsk); <----- Preemption point clear_used_math(); ... } Now when it context switches in again, as the used_math() is still set and fpu_counter can be > 5, we will do a math_state_restore() which sets the task's TS_USEDFPU. After it continues from the above preemption point it does clear_used_math() and much later free_thread_xstate(). Now, at the next context switch, it is quite possible that xstate is null, used_math() is not set and TS_USEDFPU is still set. This will trigger unlazy_fpu() causing kernel oops. Fix this by clearing tsk's fpu_counter before clearing task's fpu. Reported-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
author: Suresh Siddha <suresh.b.siddha@intel.com> 2008-06-14 00:47:12 +0200
committer: Ingo Molnar <mingo@elte.hu> 2008-06-19 10:08:45 +0200
commit: 75118a82e21cafb4a82b53bb85d1c7689787e046 (patch)
tree: f0de9e9058f6396230bb308fae8c167f33da0701 /arch
parent: x86: set PAE PHYSICAL_MASK_SHIFT to 44 bits. (diff)
download: linux-75118a82e21cafb4a82b53bb85d1c7689787e046.tar.xz
linux-75118a82e21cafb4a82b53bb85d1c7689787e046.zip
2 files changed, 2 insertions, 0 deletions
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 6d5483356e74..e2db9ac5c61c 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -333,6 +333,7 @@ void flush_thread(void)
 	/*
 	 * Forget coprocessor state..
 	 */
+	tsk->fpu_counter = 0;
 	clear_fpu(tsk);
 	clear_used_math();
 }
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index ac54ff56df80..c6eb5c91e5f6 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -294,6 +294,7 @@ void flush_thread(void)
 	/*
 	 * Forget coprocessor state..
 	 */
+	tsk->fpu_counter = 0;
 	clear_fpu(tsk);
 	clear_used_math();
 }
author	Suresh Siddha <suresh.b.siddha@intel.com>	2008-06-14 00:47:12 +0200
committer	Ingo Molnar <mingo@elte.hu>	2008-06-19 10:08:45 +0200
commit	75118a82e21cafb4a82b53bb85d1c7689787e046 (patch)
tree	f0de9e9058f6396230bb308fae8c167f33da0701 /arch
parent	x86: set PAE PHYSICAL_MASK_SHIFT to 44 bits. (diff)
download	linux-75118a82e21cafb4a82b53bb85d1c7689787e046.tar.xz linux-75118a82e21cafb4a82b53bb85d1c7689787e046.zip