summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* [S390] ETR support.Martin Schwidefsky2007-02-0513-109/+1488
| | | | | | | | | | | | | This patch adds support for clock synchronization to an external time reference (ETR). The external time reference sends an oscillator signal and a synchronization signal every 2^20 microseconds to keep the TOD clocks of all connected servers in sync. For availability two ETR units can be connected to a machine. If the clock deviates for more than the sync-check tolerance all cpus get a machine check that indicates that the clock is out of sync. For the lovely details how to get the clock back in sync see the code below. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] noexec protectionGerald Schaefer2007-02-0528-115/+913
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This provides a noexec protection on s390 hardware. Our hardware does not have any bits left in the pte for a hw noexec bit, so this is a different approach using shadow page tables and a special addressing mode that allows separate address spaces for code and data. As a special feature of our "secondary-space" addressing mode, separate page tables can be specified for the translation of data addresses (storage operands) and instruction addresses. The shadow page table is used for the instruction addresses and the standard page table for the data addresses. The shadow page table is linked to the standard page table by a pointer in page->lru.next of the struct page corresponding to the page that contains the standard page table (since page->private is not really private with the pte_lock and the page table pages are not in the LRU list). Depending on the software bits of a pte, it is either inserted into both page tables or just into the standard (data) page table. Pages of a vma that does not have the VM_EXEC bit set get mapped only in the data address space. Any try to execute code on such a page will cause a page translation exception. The standard reaction to this is a SIGSEGV with two exceptions: the two system call opcodes 0x0a77 (sys_sigreturn) and 0x0aad (sys_rt_sigreturn) are allowed. They are stored by the kernel to the signal stack frame. Unfortunately, the signal return mechanism cannot be modified to use an SA_RESTORER because the exception unwinding code depends on the system call opcode stored behind the signal stack frame. This feature requires that user space is executed in secondary-space mode and the kernel in home-space mode, which means that the addressing modes need to be switched and that the noexec protection only works for user space. After switching the addressing modes, we cannot use the mvcp/mvcs instructions anymore to copy between kernel and user space. A new mvcos instruction has been added to the z9 EC/BC hardware which allows to copy between arbitrary address spaces, but on older hardware the page tables need to be walked manually. Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] move crypto options and some cleanup.Jan Glauber2007-02-0511-410/+272
| | | | | | | | | | | | This patch moves the config options for the s390 crypto instructions to the standard "Hardware crypto devices" menu. In addition some cleanup has been done: use a flag for supported keylengths, add a warning about machien limitation, return ENOTSUPP in case the hardware has no support, remove superfluous printks and update email addresses. Signed-off-by: Jan Glauber <jan.glauber@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] cio: Don't spam debug feature.Cornelia Huck2007-02-051-1/+1
| | | | | | | | Lower priority of "Blacklisted device detected" messages so we don't overwrite more useful messages. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Cleanup of CHSC event handling.Peter Oberparleiter2007-02-051-120/+112
| | | | | | | Change CHSC event handling to be more easily extensible. Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] cio: declare hardware structures packed.Peter Oberparleiter2007-02-052-12/+12
| | | | | Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Add set_fs(USER_DS) to start_thread().Heiko Carstens2007-02-051-0/+3
| | | | | | | | | Currently works anyway since search_binary_handler has a set_fs(USER_DS). But start_thread() is the place where this should be done. Following all other architectures... Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] cio: Catch operand exceptions on stsch.Cornelia Huck2007-02-052-2/+2
| | | | | | | | | | If we have a subchannel id which has been generated via for_each_subchannel(), it might contain an invalid subchannel set id. We need to catch the ensuing operand exception by using stsch_err() instead of stsch() in all possible cases. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Fix register usage description.Heiko Carstens2007-02-051-1/+1
| | | | | | | | | Fix description of register usage as pointed out by Andreas Krebbel. Since this document is completely outdated and would need a lot of fixing, it might be worth considering to get rid of it... Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] kretprobe_trampoline_holder() in wrong section.Heiko Carstens2007-02-051-1/+1
| | | | | | | | | kretprobe_trampoline_holder() is in kprobes section but used to register a kprobe in arch_init_kprobes(). Hence register_kprobe() and therefore arch_init_kprobes() will fail. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Fix kprobes breakpoint handling.Heiko Carstens2007-02-051-2/+9
| | | | | | | | In case of an illegal op the die notifier gets called with DIE_TRAP instead of DIE_BPT first. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Update maintainers file.Martin Schwidefsky2007-02-051-3/+3
| | | | | | | Use the new linux-s390@vger.kernel.org mailing list instead of linux-390@vm.marist.edu. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] dasd: fix unconditional reserve handling.Horst Hummel2007-02-052-6/+6
| | | | | | | | | | The reserve/release IOCTLs sometimes do not work. If second system does a 'steal lock' the pending unit check (Format 3 Msg F) is delivered. Since ERP is disabled for reserve/release, the IOCTL call fails. We have to allow basic ERP (retries) for reserve/release IOCTLs. Signed-off-by: Horst Hummel <horst.hummel@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Remove dasd_ccw_log function.Horst Hummel2007-02-053-67/+0
| | | | | | | | Logging of relevant information is already done by disciplines dump_sense function. Signed-off-by: Horst Hummel <horst.hummel@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Small barrier() and cpu_relax() cleanup.Heiko Carstens2007-02-052-4/+2
| | | | | | | | | cpu_relax() has barrier() semantics hence there is no need to use both of them in conjunction in sclp_sync_wait(). Also change cpu_relax() so it's more obvious that it has barrier semantics. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] cio: Use device_{create,remove}_bin_file.Cornelia Huck2007-02-051-7/+5
| | | | | | | | Create/remove the channel measurement binary files with device_{create,remove}_bin_file instead of sysfs_{create,remove}_bin_file. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] sclp: don't call local_bh_disable/_local_bh_enable if in_interrupt()Heiko Carstens2007-02-051-2/+6
| | | | | | | | | local_bh_disable/_local_bh_enable must not be called if in_irq() is true. Besides that if in_interrupt() is true bottom halves are disabled anyway. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Show loaded DCSS segments under /proc/iomem.Gerald Schaefer2007-02-052-12/+56
| | | | | | | | Currently loaded DCSS segments are now listed in /proc/iomem with their name followed by a trailing "(DCSS)". Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] cio: Restart path verification after unsolicited interrupt.Cornelia Huck2007-02-051-0/+2
| | | | | | | | | | If we try to start path verification when an unsolicited interrupt is already pending, stctl shows status pending and we delay path verification again. We need to check for the doverify bit when the unsolicited interrupt comes in and then do path verification. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Fix FCP dump feature detection.Heiko Carstens2007-02-051-1/+5
| | | | | | | | | | FCP dump feature detection works only if the sclp command in head.S was succesful. Since the sclp command is skipped if diag260 works, we don't have any dump feature detection anymore. Bug was introduced with d57de5a36791cb1b7285649c62f183b0d3505f7d. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] dasd: fix bug in dasd initialization cleanupStefan Weinhuber2007-02-051-7/+17
| | | | | | | | | | | | | | | | The initialization of the dasd_eer code is one of the last steps of the dasd driver initialization. When initialization fails in one of the earlier steps, the dasd_exit function is called to clean up what has been done so far. So the dasd_eer_exit function may be called, although the dasd_eer_init function wasn't called before and dasd_eer_exit tries to unregister a misc device that wasn't registered, which results in a BUG. Make sure that dasd_eer_exit can be called without initialization. Use a dynamically allocated struct miscdevice instead of a static one, so we only try to unregister the device if it exists and was actually registered. Signed-off-by: Stefan Weinhuber <wein@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] sclp: invalid handling of temporary 'not operational' statusPeter Oberparleiter2007-02-051-21/+49
| | | | | | | | | | | Requests are aborted when the sclp interface reports 'not operational' even though they may still be active at the sclp, leading to concurrent writes to request memory by both the kernel and the sclp interface. Do not abort requests for which the sclp interface reports not operational status during request retry. Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>5A Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Simplify virt_to_phys.Heiko Carstens2007-02-051-4/+0
| | | | | | | | No need to use lrag in 64 bit addressing mode since lra will do the same. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] cio: Remove check for ssd in chpids_show().Cornelia Huck2007-02-051-5/+2
| | | | | | | | Since ssd_info is now available before the subchannel is registered, we don't need to check whether it is available. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] cpcmd with vmalloc addresses.Christian Borntraeger2007-02-051-7/+7
| | | | | | | | | | | | | | | | Change the bounce buffer logic of cpcmd. diag8 needs _real_ memory below 2GB. Therefore vmalloced data does not work. As the data might cross a page boundary, we cannot use virt_to_page either. The solution is to use virt_to_page only in the check for a bounce buffer. There was a redundant check for response==NULL. response < 2GB contains this check as well. I also removed the rlen==0 check, since rlen=0 and response!=NULL would be a caller bug and response==NULL is already checked. Signed-off-by: Christian Borntraeger <cborntra@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Remove pointless/unreliable kernel messages.Heiko Carstens2007-02-052-5/+0
| | | | | Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Check the return value of kthread_run().Akinobu Mita2007-02-051-1/+5
| | | | | Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Get rid of a lot of sparse warnings.Heiko Carstens2007-02-0569-189/+210
| | | | | Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* [S390] Move init_irq_proc to the other irq related functions.Heiko Carstens2007-02-053-24/+13
| | | | | Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
* Linux 2.6.20v2.6.20Linus Torvalds2007-02-041-1/+1
|
* [PATCH] EFI x86: pass firmware call parameters on the stackFrédéric Riss2007-02-041-16/+73
| | | | | | | | | | When calling into the EFI firmware, the parameters need to be passed on the stack. The recent change to use -mregparm=3 breaks x86 EFI support. This patch is needed to allow the new Intel-based Macs to suspend to ram (efi.get_time is called during the suspend phase). Signed-off-by: Frederic Riss <frederic.riss@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] fix rtl8150Al Viro2007-02-041-1/+2
| | | | | | | That code doesn't do what its author apparently thought it would do... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6Linus Torvalds2007-02-0310-70/+100
|\ | | | | | | | | | | | | | | * master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: [SCSI] sd: udev accessing an uninitialized scsi_disk field results in a crash [SCSI] st: A MTIOCTOP/MTWEOF within the early warning will cause the file number to be incorrect [SCSI] qla4xxx: bug fixes [SCSI] Fix scsi_add_device() for async scanning
| * [SCSI] sd: udev accessing an uninitialized scsi_disk field results in a crashNagendra Singh Tomar2007-02-031-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sd_probe() calls class_device_add() even before initializing the sdkp->device variable. class_device_add() eventually results in the user mode udev program to be called. udev program can read the the allow_restart attribute of the newly created scsi device. This is resulting in a crash as the show function for allow_restart (i.e sd_show_allow_restart) returns the attribute value by reading the sdkp->device->allow_restart variable. As the sdkp->device is not initialized before calling the user mode hotplug helper, this results in a crash. The patch below solves it by calling class_device_add() only after the necessary fields in the scsi_disk structure are initialized properly. Signed-off-by: Nagendra Singh Tomar <nagendra_tomar@adaptec.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
| * [SCSI] st: A MTIOCTOP/MTWEOF within the early warning will cause the file ↵Kai Makisara2007-01-271-8/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | number to be incorrect On Wed, 24 Jan 2007, Andrew Morton wrote: > On Mon, 22 Jan 2007 13:07:20 -0800 > bugme-daemon@bugzilla.kernel.org wrote: > > > http://bugzilla.kernel.org/show_bug.cgi?id=7864 > > > > Summary: A MTIOCTOP/MTWEOF within the early warning will cause > > the file number to be incorrect > > Kernel Version: 2.6.19.2 > > Status: NEW > > Severity: low > > Owner: io_scsi@kernel-bugs.osdl.org > > Submitter: ce_reisinger@yahoo.com > > > > > > Write records to a SCSI tape until a write fails with a ENOSPC (you have reached > > early warning. > > Now perform a: > > struct mtget before, after; > > ioctl(fd, MTIOCGET, &before); > > struct mtop mtop = { MTWEOF, 1 }; > > ioctl(fd, MTIOCTOP, &mtop); > > ioctl(fd, MTIOCGET, &after); > > > > Check the value of mt_fileno in the before and after structures. Notice the > > after is 2 greater then the before. > > > > The problem appears to be in the block of code starting at line 2817 in st.c. > > This block is entered because the drive did return a CHECK CONDITION with NO > > SENSE and the SENSE_EOM bit set. At lines 2824/5 the fileno is incremented. But > > it has already been increased by the number of filemarks requested by the > > MTIOCTOP. I believe that the residue count in the sense data should be > > subtracted from fileno, not a increment as is done. > > > > Thanks. Could you please send us a tested patch to fix these things, as > per http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt ? > The analysis is basically correct and explains the bug. According to the SCSI standards, the sense code is NO SENSE or RECOVERED ERROR in case writing filemark(s) succeeds. If it fails (partly or completely) the sense code is VOLUME OVERFLOW. The patch below is tested to fix the case when one filemark is successfully written after the EOM early warning. It should also fix the case at real EOM but this has not been tested. Carl, thanks for reporting the bug and providing the analysis for the fix. Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
| * [SCSI] qla4xxx: bug fixesDavid C Somayajulu2007-01-277-52/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The included patch fixes the following issues: 1. qla3xxx/qla4xxx co-existence issue which can result in a lockup when qla3xxx driver is unloaded, or when ifdown; ifup is performed on one of the interfaces correponding to qla3xxx. This is because qla4xxx HBA supports one ethernet and iscsi interfaces per port. Both iscsi and ethernet interfaces share the same state machine. The problem has to do with synchronizing access to the state machine in the event of a reset 2. mutex_lock() is sometimes not followed by mutex_unlock() prior to invoking a msleep() in qla4xxx_mailbox_command() Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
| * [SCSI] Fix scsi_add_device() for async scanningMatthew Wilcox2007-01-271-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | I had thought that all drivers which didn't call scsi_scan_host() called scsi_scan_target(). Some, such as sbp2, mptsas and libata-scsi, call scsi_add_device() or __scsi_add_device(). We just need to wait for the currently executing async scans to complete first. This is the same code that's in scsi_scan_target(), except that we have to return an error instead of void when we're declining to scan at all. Signed-off-by: Matthew Wilcox <matthew@wil.cx> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
* | [PATCH] x86-64: define dma noncoherent API functionsJeff Garzik2007-02-031-0/+3
| | | | | | | | | | | | | | | | x86-64 is missing these: Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | [PATCH] Altix: more ACPI PRT supportJohn Keller2007-02-031-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The SN Altix platform does not conform to the IOSAPIC IRQ routing model. Add code in acpi_unregister_gsi() to check if (acpi_irq_model == ACPI_IRQ_MODEL_PLATFORM) and return. Due to an oversight, this code was not added previously when similar code was added to acpi_register_gsi(). http://marc.theaimsgroup.com/?l=linux-acpi&m=116680983430121&w=2 Signed-off-by: John Keller <jpk@sgi.com> Acked-by: Len Brown <lenb@kernel.org> Cc: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | [PATCH] revert blockdev direct io back to 2.6.19 versionAndrew Morton2007-02-031-0/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Andrew Vasquez is reporting as-iosched oopses and a 65% throughput slowdown due to the recent special-casing of direct-io against blockdevs. We don't know why either of these things are occurring. The patch minimally reverts us back to the 2.6.19 code for a 2.6.20 release. Cc: Andrew Vasquez <andrew.vasquez@qlogic.com> Cc: Ken Chen <kenchen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | [PATCH] alpha: fix epoll syscall enumerationsMike Frysinger2007-02-031-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | We went and named them __NR_sys_foo instead of __NR_foo. It may be too late to change this, but we can at least add the proper names now. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | [PATCH] net/smc911x: match up spin lock/unlockPeter Korsgaard2007-02-031-2/+3
| | | | | | | | | | | | | | | | | | | | | | smc911x_phy_configure's error handling unconditionally unlocks the spinlock even if it wasn't locked. Patch fixes it. Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk> Cc: Jeff Garzik <jeff@garzik.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | [PATCH] kexec: Avoid migration of already disabled irqs (ia64)Magnus Damm2007-02-031-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes up ia64 kexec support for HP rx2620 hardware. It does this by skipping migration of already disabled irqs. This is most likely a problem on other ia64 platforms as well, but I've only been able to reproduce it on one machine so far. The full story is that handle_bad_irq() gets invoked before starting the new kernel without this patch. This seems to happen when fixup_irqs() calls generic_handle_irq() on already migrated (and disabled) irqs. So by avoiding migration of disabled irqs we stay away of handle_bad_irq(). The code has been tested on three different ia64 machines, all with good results. It is possible to trigger the same bug by offlining a processor using echo 0 > /sys/devices/system/cpu/cpuX/online. More detailed information is available in the following mail thread: http://lists.osdl.org/pipermail/fastboot/2007-January/thread.html#5774 Signed-off-by: Magnus Damm <magnus@valinux.co.jp> Acked-by: Simon Horman <horms@verge.net.au> Acked-by: Zou, Nanhai <nanhai.zou@intel.com> Acked-by: Jay Lan <jlan@sgi.com> Acked-by: "Luck, Tony" <tony.luck@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | [PATCH] aio: fix buggy put_ioctx call in aio_complete - v2Ken Chen2007-02-031-11/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | An AIO bug was reported that sleeping function is being called in softirq context: BUG: warning at kernel/mutex.c:132/__mutex_lock_common() Call Trace: [<a000000100577b00>] __mutex_lock_slowpath+0x640/0x6c0 [<a000000100577ba0>] mutex_lock+0x20/0x40 [<a0000001000a25b0>] flush_workqueue+0xb0/0x1a0 [<a00000010018c0c0>] __put_ioctx+0xc0/0x240 [<a00000010018d470>] aio_complete+0x2f0/0x420 [<a00000010019cc80>] finished_one_bio+0x200/0x2a0 [<a00000010019d1c0>] dio_bio_complete+0x1c0/0x200 [<a00000010019d260>] dio_bio_end_aio+0x60/0x80 [<a00000010014acd0>] bio_endio+0x110/0x1c0 [<a0000001002770e0>] __end_that_request_first+0x180/0xba0 [<a000000100277b90>] end_that_request_chunk+0x30/0x60 [<a0000002073c0c70>] scsi_end_request+0x50/0x300 [scsi_mod] [<a0000002073c1240>] scsi_io_completion+0x200/0x8a0 [scsi_mod] [<a0000002074729b0>] sd_rw_intr+0x330/0x860 [sd_mod] [<a0000002073b3ac0>] scsi_finish_command+0x100/0x1c0 [scsi_mod] [<a0000002073c2910>] scsi_softirq_done+0x230/0x300 [scsi_mod] [<a000000100277d20>] blk_done_softirq+0x160/0x1c0 [<a000000100083e00>] __do_softirq+0x200/0x240 [<a000000100083eb0>] do_softirq+0x70/0xc0 See report: http://marc.theaimsgroup.com/?l=linux-kernel&m=116599593200888&w=2 flush_workqueue() is not allowed to be called in the softirq context. However, aio_complete() called from I/O interrupt can potentially call put_ioctx with last ref count on ioctx and triggers bug. It is simply incorrect to perform ioctx freeing from aio_complete. The bug is trigger-able from a race between io_destroy() and aio_complete(). A possible scenario: cpu0 cpu1 io_destroy aio_complete wait_for_all_aios { __aio_put_req ... ctx->reqs_active--; if (!ctx->reqs_active) return; } ... put_ioctx(ioctx) put_ioctx(ctx); __put_ioctx bam! Bug trigger! The real problem is that the condition check of ctx->reqs_active in wait_for_all_aios() is incorrect that access to reqs_active is not being properly protected by spin lock. This patch adds that protective spin lock, and at the same time removes all duplicate ref counting for each kiocb as reqs_active is already used as a ref count for each active ioctx. This also ensures that buggy call to flush_workqueue() in softirq context is eliminated. Signed-off-by: "Ken Chen" <kenchen@google.com> Cc: Zach Brown <zach.brown@oracle.com> Cc: Suparna Bhattacharya <suparna@in.ibm.com> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: <stable@kernel.org> Acked-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | [NETFILTER]: nf_conntrack_h323: fix compile error with CONFIG_IPV6=m, ↵Adrian Bunk2007-02-031-1/+1
| | | | | | | | | | | | | | | | | | CONFIG_NF_CONNTRACK_H323=y Fix this by letting NF_CONNTRACK_H323 depend on (IPV6 || IPV6=n). Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [NETFILTER]: ctnetlink: fix compile failure with NF_CONNTRACK_MARK=nPatrick McHardy2007-02-032-0/+4
| | | | | | | | | | | | | | | | | | | | CC net/netfilter/nf_conntrack_netlink.o net/netfilter/nf_conntrack_netlink.c: In function 'ctnetlink_conntrack_event': net/netfilter/nf_conntrack_netlink.c:392: error: 'struct nf_conn' has no member named 'mark' make[3]: *** [net/netfilter/nf_conntrack_netlink.o] Error 1 Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'upstream-linus' of ↵Linus Torvalds2007-02-026-18/+21
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev: libata: Initialize nbytes for internal sg commands libata: Fix ata_busy_wait() kernel docs pata_via: Correct missing comments pata_atiixp: propogate cable detection hack from drivers/ide to the new driver ahci/pata_jmicron: fix JMicron quirk
| * | libata: Initialize nbytes for internal sg commandsBrian King2007-02-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Some LLDDs, like ipr, use nbytes and pad_len to determine the total data transfer length of a command. Make sure nbytes gets initialized for internally generated commands. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
| * | libata: Fix ata_busy_wait() kernel docsAlan2007-02-021-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | > Looks like you should use ata_busy_wait() here, rather than reproducing > the same code again. It waits in 10uS chunks while 1uS chunks were used in the workaround. Could indeed do that once I know the fix is right. While I'm at it the ata_busy_wait kerneldoc is borked so here's a fix Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
| * | pata_via: Correct missing commentsAlan2007-02-021-1/+2
| | | | | | | | | | | | | | | | | | | | | The 8237S was added to the chipsets but not to the comments. Fix this Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>