summaryrefslogtreecommitdiffstats
path: root/include (follow)
Commit message (Collapse)AuthorAgeFilesLines
* [POWERPC] qe: add function qe_clock_source()Timur Tabi2007-12-141-0/+1
| | | | | | | | | | | | | Add function qe_clock_source() which takes a string containing the name of a QE clock source (as is typically found in device trees) and returns the matching enum qe_clock value. Update booting-without-of.txt to indicate that the UCC properties rx-clock and tx-clock are deprecated and replaced with rx-clock-name and tx-clock-name, which use strings instead of numbers to indicate QE clock sources. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* [POWERPC] Move CPM command handling into the cpm driversJochen Friedrich2007-12-141-0/+1
| | | | | | | | | | | This patch moves the CPM command handling into commproc.c for CPM1 and cpm2_common.c. This is yet another preparation to get rid of drivers accessing the CPM via the global cpmp. Signed-off-by: Jochen Friedrich <jochen@scram.de> Acked-by: Scott Wood <scottwood@freescale.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vitaly Bordug <vitb@kernel.crashing.org>
* [POWERPC] QE: change qe_setbrg() to take an enum qe_clock instead of an integerTimur Tabi2007-12-111-47/+47
| | | | | | | | | qe_setbrg() currently takes an integer to indicate the BRG number. Change that to take an enum qe_clock instead, since this enum is intended to represent clock sources. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* [POWERPC] 86xx: fix guts_set_dmacr() and add guts_set_pmuxcr_dma() to ↵Timur Tabi2007-12-111-2/+23
| | | | | | | | | | | | immap_86xx.h Updated guts_set_dmacr() to enumerate the DMA controllers at 0, instead of 1, so that it now matches other related functions. Added function guts_set_pmuxcr_dma() to set the external DMA control bits in the PMUXCR register of the global utilities structure. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* [POWERPC] ipic: add new interrupts introduced by new chipLi Yang2007-12-111-5/+7
| | | | | | | | These interrupts are introduced by the latest Freescale SoC such as MPC837x. Signed-off-by: Li Yang <leoli@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* [POWERPC] Add SPRN for Embedded registers specified in PowerISA 2.04Kumar Gala2007-12-111-1/+12
| | | | | | | | | | | | | | * Added SPRN for new architectural features added for embedded: - Alternate Time Base (ATB, ATBL, ATBU) - Doorbell Interrupts (IVOR36, IVOR37) - SPRG8/9 - External Proxy (EPR) - External PID load/store (EPLC, EPSC) * Added BUCSR for Freescale Embedded Processors * Moved around MAS7 so its in numeric order Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* [POWERPC] Add of_translate_dma_addressBenjamin Herrenschmidt2007-12-111-0/+4
| | | | | | | | | This adds a variant of of_translate_address that uses the dma-ranges property instead of "ranges", it's to be used by PCI code in parsing the dma-ranges property. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* [POWERPC] Merge pci_process_bridge_OF_ranges()Benjamin Herrenschmidt2007-12-111-0/+3
| | | | | | | | | | | | | | | | | | | This merges the 32-bit and 64-bit implementations of pci_process_bridge_OF_ranges(). The new function is cleaner than both the old ones, and supports 64 bits ranges on ppc32 which is necessary for the 4xx port. It also adds some better (hopefully) output to the kernel log which should help diagnose problems and makes better use of existing OF parsing helpers (avoiding a few bugs of both implementations along the way). There are still a few unfortunate ifdef's but there is no way around these for now at least not until some other bits of the PCI code are made common. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* [POWERPC] Make isa_mem_base common to 32 and 64 bitsBenjamin Herrenschmidt2007-12-111-2/+3
| | | | | | | | | This defines isa_mem_base on both 32 and 64 bits (it used to be 32 bits only). This avoids a few ifdef's in later patches and potentially can allow support for VGA text mode on 64 bits powerpc. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* Merge branch 'linux-2.6' into for-2.6.25Paul Mackerras2007-12-111-2/+0
|\
| * [IA64] iosapic cleanupSimon Horman2007-12-081-2/+0
| | | | | | | | | | | | | | Make some IOSAPIC functions static and remove one that is unused. Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Tony Luck <tony.luck@intel.com>
* | [POWERPC] Update smu command definitionsMichael Hanselmann2007-12-111-4/+128
| | | | | | | | | | | | | | | | | | This updates smu.h with several new commands, and adds parameter descriptions for existing commands. Signed-off-by: Michael Hanselmann <linux-kernel@hansmi.ch> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [POWERPC] Use SLB size from the device treeMichael Neuling2007-12-112-6/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently we hardwire the number of SLBs to 64, but PAPR says we should use the ibm,slb-size property to obtain the number of SLB entries. This uses this property instead of assuming 64. If no property is found, we assume 64 entries as before. This soft patches the SLB handler, so it shouldn't change performance at all. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [POWERPC] pci_controller->arch_data really is a struct device_node *Stephen Rothwell2007-12-111-2/+3
| | | | | | | | | | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [POWERPC] iSeries: Call iSeries_pcibios_init from setup_archStephen Rothwell2007-12-111-3/+0
| | | | | | | | | | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [POWERPC] Consolidate pci_controllerStephen Rothwell2007-12-111-43/+22
| | | | | | | | | | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [POWERPC] Clean up pci-bridge.hStephen Rothwell2007-12-111-53/+42
| | | | | | | | | | | | | | No semantic changes. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [POWERPC] iommu_free_table doesn't need the device_nodeStephen Rothwell2007-12-111-2/+1
| | | | | | | | | | | | | | | | | | It only needs the iommu_table address. It also makes use of the node name to print error messages. So just pass it the things it needs. This reduces the places that know about the pci_dn by one. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [POWERPC] celleb: Add support for native CBEIshizaki Kou2007-12-111-1/+1
| | | | | | | | | | | | | | | | | | This adds support for native CBE on Celleb, that is, without the BEAT hypervisor. Many codes in platforms/cell/ are used in native CBE environment. Signed-off-by: Kou Ishizaki <Kou.Ishizaki@toshiba.co.jp> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [POWERPC] Add for_each_child_of_node() helper for iterating over child nodesMichael Ellerman2007-12-111-0/+4
| | | | | | | | | | | | | | | | | | Add for_each_child_of_node() to encapsulate the common idiom of iterating over the children of a device_node. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Acked-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | [POWERPC] Use of_register_driver to implement of_register_platform_driverStephen Rothwell2007-12-111-2/+8
| | | | | | | | | | | | | | Also use of_unregister_driver to implement of_unregister_platform_driver. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | Merge branch 'linux-2.6'Paul Mackerras2007-12-1028-117/+669
|\|
| * bonding: Add new layer2+3 hash for xor/802.3ad modesJay Vosburgh2007-12-071-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add new hash for balance-xor and 802.3ad modes. Originally submitted by "Glenn Griffin" <ggriffin.kernel@gmail.com>; modified by Jay Vosburgh to move setting of hash policy out of line, tweak the documentation update and add version update to 3.2.2. Glenn's original comment follows: Included is a patch for a new xmit_hash_policy for the bonding driver that selects slaves based on MAC and IP information. This is a middle ground between what currently exists in the layer2 only policy and the layer3+4 policy. This policy strives to be fully 802.3ad compliant by transmitting every packet of any particular flow over the same link. As documented the layer3+4 policy is not fully compliant for extreme cases such as ip fragmentation, so this policy is a nice compromise for environments that require full compliance but desire more than the layer2 only policy. Signed-off-by: "Glenn Griffin" <ggriffin.kernel@gmail.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
| * Merge branch 'for-linus' of ↵Linus Torvalds2007-12-077-88/+563
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6: [AVR32] Fix wrong pt_regs in critical exception handler [AVR32] Fix copy_to_user_page() breakage [AVR32] Follow the rules when dealing with the OCD system [AVR32] Clean up OCD register usage [AVR32] Implement irqflags trace and lockdep support [AVR32] Implement stacktrace support [AVR32] Kconfig: Use def_bool instead of bool + default [AVR32] Fix invalid status register bit definitions in asm/ptrace.h [AVR32] Add TIF_RESTORE_SIGMASK to the work masks
| | * [AVR32] Fix copy_to_user_page() breakageHaavard Skinnemoen2007-12-071-9/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current implementation of copy_to_user_page() gives "vaddr" to the cache instruction when trying to sync the icache with the dcache. If vaddr does not exist in the TLB, the CPU will silently abort the operation, which may result in the caches staying out of sync. To fix this, pass the "dst" parameter to flush_icache_range() instead -- we know this is valid because we just wrote to it. Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
| | * [AVR32] Follow the rules when dealing with the OCD systemHaavard Skinnemoen2007-12-072-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current debug trap handling code does a number of things that are illegal according to the AVR32 Architecture manual. Most importantly, it may try to schedule from Debug Mode, thus clearing the D bit, which can lead to "undefined behaviour". It seems like this works in most cases, but several people have observed somewhat unstable behaviour when debugging programs, including soft lockups. So there's definitely something which is not right with the existing code. The new code will never schedule from Debug mode, it will always exit Debug mode with a "retd" instruction, and if something not running in Debug mode needs to do something debug-related (like doing a single step), it will enter debug mode through a "breakpoint" instruction. The monitor code will then return directly to user space, bypassing its own saved registers if necessary (since we don't actually care about the trapped context, only the one that came before.) This adds three instructions to the common exception handling code, including one branch. It does not touch super-hot paths like the TLB miss handler. Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
| | * [AVR32] Clean up OCD register usageHaavard Skinnemoen2007-12-072-68/+528
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Generate a new set of OCD register definitions in asm/ocd.h and rename __mfdr() and __mtdr() to ocd_read() and ocd_write() respectively. The bitfield definitions are a lot more complete now, and they are entirely based on bit numbers, not masks. This is because OCD registers are frequently accessed from assembly code, where bit numbers are a lot more useful (can be fed directly to sbr, bfins, etc.) Bitfields that consist of more than one bit have two definitions: _START, which indicates the number of the first bit, and _SIZE, which indicates the number of bits. These directly correspond to the parameters taken by the bfextu, bfexts and bfins instructions. Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
| | * [AVR32] Implement irqflags trace and lockdep supportHaavard Skinnemoen2007-12-071-0/+2
| | | | | | | | | | | | Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
| | * [AVR32] Fix invalid status register bit definitions in asm/ptrace.hHaavard Skinnemoen2007-12-071-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | The 'H' bit is bit 29, while the 'R' bit doesn't exist. Luckily, we don't actually use any of the bits in question. Also update show_regs() to show the Debug Mask and Debug state bits. Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
| | * [AVR32] Add TIF_RESTORE_SIGMASK to the work masksHaavard Skinnemoen2007-12-071-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We really need to check TIF_RESTORE_SIGMASK before returning to userspace. The existing code does not necessarily do this. Define the work masks as a bitwise OR of the respective flags instead of a hardcoded hex value to make it easier to spot errors like this in the future. Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
| * | Merge branch 'master' of ↵Linus Torvalds2007-12-072-1/+3
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [AF_RXRPC]: Add a missing goto [VLAN]: Lost rtnl_unlock() in vlan_ioctl() [SCTP]: Fix the bind_addr info during migration. [SCTP]: Add bind hash locking to the migrate code [IPV4]: Remove prototype of ip_rt_advice [IPv4]: Reply net unreachable ICMP message [IPv6] SNMP: Increment OutNoRoutes when connecting to unreachable network [BRIDGE]: Section fix. [NIU]: Fix link LED handling.
| | * | [SCTP]: Fix the bind_addr info during migration.Vlad Yasevich2007-12-071-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During accept/migrate the code attempts to copy the addresses from the parent endpoint to the new endpoint. However, if the parent was bound to a wildcard address, then we end up pointlessly copying all of the current addresses on the system. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | [IPV4]: Remove prototype of ip_rt_adviceDenis V. Lunev2007-12-071-1/+0
| | |/ | | | | | | | | | | | | | | | | | | ip_rt_advice has been gone, so no need to keep prototype and debug message. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Merge branch 'for-linus' of git://git.o-hand.com/linux-rpurdie-ledsLinus Torvalds2007-12-071-1/+2
| |\ \ | | | | | | | | | | | | | | | | * 'for-linus' of git://git.o-hand.com/linux-rpurdie-leds: leds: Fix led trigger locking bugs
| | * | leds: Fix led trigger locking bugsRichard Purdie2007-12-071-1/+2
| | |/ | | | | | | | | | | | | | | | | | | | | | Convert part of the led trigger core from rw spinlocks to rw semaphores. We're calling functions which can sleep from invalid contexts otherwise. Fixes bug #9264. Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
| * | Merge branch 'merge' of ↵Linus Torvalds2007-12-071-6/+2
| |\ \ | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/paulus/powerpc * 'merge' of master.kernel.org:/pub/scm/linux/kernel/git/paulus/powerpc: [POWERPC] virtex bug fix: Use canonical value for AC97 interrupt xparams [POWERPC] Update defconfigs [POWERPC] PS3: Update ps3_defconfig [POWERPC] Update iseries_defconfig [POWERPC] Fix hardware IRQ time accounting problem.
| | * [POWERPC] Fix hardware IRQ time accounting problem.Tony Breeds2007-12-061-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit fa13a5a1f25f671d084d8884be96fc48d9b68275 (sched: restore deterministic CPU accounting on powerpc), unconditionally calls update_process_tick() in system context. In the deterministic accounting case this is the correct thing to do. However, in the non-deterministic accounting case we need to not do this, since doing this results in the time accounted as hardware irq time being artificially elevated. Also this collapses 2 consecutive '#ifdef CONFIG_VIRT_CPU_ACCOUNTING' checks in time.h into one for neatness. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| * | Merge branch 'for-2.6.24' of ↵Linus Torvalds2007-12-061-0/+5
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc * 'for-2.6.24' of git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc: [POWERPC] Fix swapper_pg_dir size when CONFIG_PTE_64BIT=y on FSL_BOOKE
| | * | [POWERPC] Fix swapper_pg_dir size when CONFIG_PTE_64BIT=y on FSL_BOOKEKumar Gala2007-12-061-0/+5
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | The size of swapper_pg_dir is 8k instead of 4k when using 64-bit PTEs (CONFIG_PTE_64BIT). This was reported by Cedric Hombourger <chombourger@gmail.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
| * | Merge git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6Linus Torvalds2007-12-061-2/+1
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6: [PARISC] lba_pci: pci_claim_resources disabled expansion roms [PARISC] print more than one character at a time for pdc console [PARISC] Update parisc-linux MAINTAINERS entries [PARISC] timer interrupt should not be IRQ_DISABLED Revert "[PARISC] import necessary bits of libgcc.a"
| | * | [PARISC] print more than one character at a time for pdc consoleKyle McMartin2007-12-061-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's really no reason not to print more than one character at a time to the PDC console... Booting is measurably speedier, and now I don't have to watch individual characters get drawn. Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
| * | | [MIPS] Alchemy: fix IRQ basesSergei Shtylyov2007-12-061-10/+11
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | Do what the commits commits f3e8d1da389fe2e514e31f6e93c690c8e1243849 and 9d360ab4a7568a8d177280f651a8a772ae52b9b9 failed to achieve -- actually convert the Alchemy code to irq_cpu. Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | Merge git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-schedLinus Torvalds2007-12-051-2/+15
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched: futex: correctly return -EFAULT not -EINVAL lockdep: in_range() fix lockdep: fix debug_show_all_locks() sched: style cleanups futex: fix for futex_wait signal stack corruption
| | * | futex: fix for futex_wait signal stack corruptionSteven Rostedt2007-12-051-2/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | David Holmes found a bug in the -rt tree with respect to pthread_cond_timedwait. After trying his test program on the latest git from mainline, I found the bug was there too. The bug he was seeing that his test program showed, was that if one were to do a "Ctrl-Z" on a process that was in the pthread_cond_timedwait, and then did a "bg" on that process, it would return with a "-ETIMEDOUT" but early. That is, the timer would go off early. Looking into this, I found the source of the problem. And it is a rather nasty bug at that. Here's the relevant code from kernel/futex.c: (not in order in the file) [...] smlinkage long sys_futex(u32 __user *uaddr, int op, u32 val, struct timespec __user *utime, u32 __user *uaddr2, u32 val3) { struct timespec ts; ktime_t t, *tp = NULL; u32 val2 = 0; int cmd = op & FUTEX_CMD_MASK; if (utime && (cmd == FUTEX_WAIT || cmd == FUTEX_LOCK_PI)) { if (copy_from_user(&ts, utime, sizeof(ts)) != 0) return -EFAULT; if (!timespec_valid(&ts)) return -EINVAL; t = timespec_to_ktime(ts); if (cmd == FUTEX_WAIT) t = ktime_add(ktime_get(), t); tp = &t; } [...] return do_futex(uaddr, op, val, tp, uaddr2, val2, val3); } [...] long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout, u32 __user *uaddr2, u32 val2, u32 val3) { int ret; int cmd = op & FUTEX_CMD_MASK; struct rw_semaphore *fshared = NULL; if (!(op & FUTEX_PRIVATE_FLAG)) fshared = &current->mm->mmap_sem; switch (cmd) { case FUTEX_WAIT: ret = futex_wait(uaddr, fshared, val, timeout); [...] static int futex_wait(u32 __user *uaddr, struct rw_semaphore *fshared, u32 val, ktime_t *abs_time) { [...] struct restart_block *restart; restart = &current_thread_info()->restart_block; restart->fn = futex_wait_restart; restart->arg0 = (unsigned long)uaddr; restart->arg1 = (unsigned long)val; restart->arg2 = (unsigned long)abs_time; restart->arg3 = 0; if (fshared) restart->arg3 |= ARG3_SHARED; return -ERESTART_RESTARTBLOCK; [...] static long futex_wait_restart(struct restart_block *restart) { u32 __user *uaddr = (u32 __user *)restart->arg0; u32 val = (u32)restart->arg1; ktime_t *abs_time = (ktime_t *)restart->arg2; struct rw_semaphore *fshared = NULL; restart->fn = do_no_restart_syscall; if (restart->arg3 & ARG3_SHARED) fshared = &current->mm->mmap_sem; return (long)futex_wait(uaddr, fshared, val, abs_time); } So when the futex_wait is interrupt by a signal we break out of the hrtimer code and set up or return from signal. This code does not return back to userspace, so we set up a RESTARTBLOCK. The bug here is that we save the "abs_time" which is a pointer to the stack variable "ktime_t t" from sys_futex. This returns and unwinds the stack before we get to call our signal. On return from the signal we go to futex_wait_restart, where we update all the parameters for futex_wait and call it. But here we have a problem where abs_time is no longer valid. I verified this with print statements, and sure enough, what abs_time was set to ends up being garbage when we get to futex_wait_restart. The solution I did to solve this (with input from Linus Torvalds) was to add unions to the restart_block to allow system calls to use the restart with specific parameters. This way the futex code now saves the time in a 64bit value in the restart block instead of storing it on the stack. Note: I'm a bit nervious to add "linux/types.h" and use u32 and u64 in thread_info.h, when there's a #ifdef __KERNEL__ just below that. Not sure what that is there for. If this turns out to be a problem, I've tested this with using "unsigned int" for u32 and "unsigned long long" for u64 and it worked just the same. I'm using u32 and u64 just to be consistent with what the futex code uses. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | Merge branch 'for-linus' of ↵Linus Torvalds2007-12-051-0/+16
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/selinux-2.6: VM/Security: add security hook to do_brk Security: round mmap hint address above mmap_min_addr security: protect from stack expantion into low vm addresses Security: allow capable check to permit mmap or low vm space SELinux: detect dead booleans SELinux: do not clear f_op when removing entries
| | * | | Security: round mmap hint address above mmap_min_addrEric Paris2007-12-051-0/+16
| | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If mmap_min_addr is set and a process attempts to mmap (not fixed) with a non-null hint address less than mmap_min_addr the mapping will fail the security checks. Since this is just a hint address this patch will round such a hint address above mmap_min_addr. gcj was found to try to be very frugal with vm usage and give hint addresses in the 8k-32k range. Without this patch all such programs failed and with the patch they happily get a higher address. This patch is wrappad in CONFIG_SECURITY since mmap_min_addr doesn't exist without it and there would be no security check possible no matter what. So we should not bother compiling in this rounding if it is just a waste of time. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
| * | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2007-12-051-0/+3
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: [LRO]: fix lro_gen_skb() alignment [TCP]: NAGLE_PUSH seems to be a wrong way around [TCP]: Move prior_in_flight collect to more robust place [TCP] FRTO: Use of existing funcs make code more obvious & robust [IRDA]: Move ircomm_tty_line_info() under #ifdef CONFIG_PROC_FS [ROSE]: Trivial compilation CONFIG_INET=n case [IPVS]: Fix sched registration race when checking for name collision. [IPVS]: Don't leak sysctl tables if the scheduler registration fails.
| | * | | [LRO]: fix lro_gen_skb() alignmentAndrew Gallatin2007-12-051-0/+3
| | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a field to the lro_mgr struct so that drivers can specify how much padding is required to align layer 3 headers when a packet is copied into a freshly allocated skb by inet_lro.c:lro_gen_skb(). Without padding, skbs generated by LRO will cause alignment warnings on architectures which require strict alignment (seen on sparc64). Myri10GE is updated to use this field. Signed-off-by: Andrew Gallatin <gallatin@myri.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | proc: fix proc_dir_entry refcountingAlexey Dobriyan2007-12-051-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Creating PDEs with refcount 0 and "deleted" flag has problems (see below). Switch to usual scheme: * PDE is created with refcount 1 * every de_get does +1 * every de_put() and remove_proc_entry() do -1 * once refcount reaches 0, PDE is freed. This elegantly fixes at least two following races (both observed) without introducing new locks, without abusing old locks, without spreading lock_kernel(): 1) PDE leak remove_proc_entry de_put ----------------- ------ [refcnt = 1] if (atomic_read(&de->count) == 0) if (atomic_dec_and_test(&de->count)) if (de->deleted) /* also not taken! */ free_proc_entry(de); else de->deleted = 1; [refcount=0, deleted=1] 2) use after free remove_proc_entry de_put ----------------- ------ [refcnt = 1] if (atomic_dec_and_test(&de->count)) if (atomic_read(&de->count) == 0) free_proc_entry(de); /* boom! */ if (de->deleted) free_proc_entry(de); BUG: unable to handle kernel paging request at virtual address 6b6b6b6b printing eip: c10acdda *pdpt = 00000000338f8001 *pde = 0000000000000000 Oops: 0000 [#1] PREEMPT SMP Modules linked in: af_packet ipv6 cpufreq_ondemand loop serio_raw psmouse k8temp hwmon sr_mod cdrom Pid: 23161, comm: cat Not tainted (2.6.24-rc2-8c0863403f109a43d7000b4646da4818220d501f #4) EIP: 0060:[<c10acdda>] EFLAGS: 00210097 CPU: 1 EIP is at strnlen+0x6/0x18 EAX: 6b6b6b6b EBX: 6b6b6b6b ECX: 6b6b6b6b EDX: fffffffe ESI: c128fa3b EDI: f380bf34 EBP: ffffffff ESP: f380be44 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process cat (pid: 23161, ti=f380b000 task=f38f2570 task.ti=f380b000) Stack: c10ac4f0 00000278 c12ce000 f43cd2a8 00000163 00000000 7da86067 00000400 c128fa20 00896b18 f38325a8 c128fe20 ffffffff 00000000 c11f291e 00000400 f75be300 c128fa20 f769c9a0 c10ac779 f380bf34 f7bfee70 c1018e6b f380bf34 Call Trace: [<c10ac4f0>] vsnprintf+0x2ad/0x49b [<c10ac779>] vscnprintf+0x14/0x1f [<c1018e6b>] vprintk+0xc5/0x2f9 [<c10379f1>] handle_fasteoi_irq+0x0/0xab [<c1004f44>] do_IRQ+0x9f/0xb7 [<c117db3b>] preempt_schedule_irq+0x3f/0x5b [<c100264e>] need_resched+0x1f/0x21 [<c10190ba>] printk+0x1b/0x1f [<c107c8ad>] de_put+0x3d/0x50 [<c107c8f8>] proc_delete_inode+0x38/0x41 [<c107c8c0>] proc_delete_inode+0x0/0x41 [<c1066298>] generic_delete_inode+0x5e/0xc6 [<c1065aa9>] iput+0x60/0x62 [<c1063c8e>] d_kill+0x2d/0x46 [<c1063fa9>] dput+0xdc/0xe4 [<c10571a1>] __fput+0xb0/0xcd [<c1054e49>] filp_close+0x48/0x4f [<c1055ee9>] sys_close+0x67/0xa5 [<c10026b6>] sysenter_past_esp+0x5f/0x85 ======================= Code: c9 74 0c f2 ae 74 05 bf 01 00 00 00 4f 89 fa 5f 89 d0 c3 85 c9 57 89 c7 89 d0 74 05 f2 ae 75 01 4f 89 f8 5f c3 89 c1 89 c8 eb 06 <80> 38 00 74 07 40 4a 83 fa ff 75 f4 29 c8 c3 90 90 90 57 83 c9 EIP: [<c10acdda>] strnlen+0x6/0x18 SS:ESP 0068:f380be44 Also, remove broken usage of ->deleted from reiserfs: if sget() succeeds, module is already pinned and remove_proc_entry() can't happen => nobody can mark PDE deleted. Dummy proc root in netns code is not marked with refcount 1. AFAICS, we never get it, it's just for proper /proc/net removal. I double checked CLONE_NETNS continues to work. Patch survives many hours of modprobe/rmmod/cat loops without new bugs which can be attributed to refcounting. Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | jbd: Fix assertion failure in fs/jbd/checkpoint.cJan Kara2007-12-051-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before we start committing a transaction, we call __journal_clean_checkpoint_list() to cleanup transaction's written-back buffers. If this call happens to remove all of them (and there were already some buffers), __journal_remove_checkpoint() will decide to free the transaction because it isn't (yet) a committing transaction and soon we fail some assertion - the transaction really isn't ready to be freed :). We change the check in __journal_remove_checkpoint() to free only a transaction in T_FINISHED state. The locking there is subtle though (as everywhere in JBD ;(). We use j_list_lock to protect the check and a subsequent call to __journal_drop_transaction() and do the same in the end of journal_commit_transaction() which is the only place where a transaction can get to T_FINISHED state. Probably I'm too paranoid here and such locking is not really necessary - checkpoint lists are processed only from log_do_checkpoint() where a transaction must be already committed to be processed or from __journal_clean_checkpoint_list() where kjournald itself calls it and thus transaction cannot change state either. Better be safe if something changes in future... Signed-off-by: Jan Kara <jack@suse.cz> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>