Commit message  [Author, Age, Files, Lines]
* sparc: kernel: pmc: make of_device_ids const.  [Arvind Yadav, 2017-07-03, files: 1, lines: -1/+1]
|
|   of_device_ids are not supposed to change at runtime. All functions
|   working with of_device_ids provided by <linux/of.h> work with const
|   of_device_ids. So mark the non-const structs as const.
|
|   Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
|   Signed-off-by: David S. Miller <davem@davemloft.net>
* sparc64: fix typo in property  [Pavel Tatashin, 2017-06-26, files: 2, lines: -3/+3]
|
|   There is a typo in a comment that propagated into code:
|
|   upa-portis instead of upa-portid
|
|   This problem was detected by code inspection.
|
|   Fixes: eea9833453bd ("sparc64: broken %tick frequency on spitfire cpus")
|   Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
|   Reported-by: Steven Sistare <steven.sistare@oracle.com>
|   Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
|   Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'sparc64-add-MDESC-and-VIO-support-for-VCC'  [David S. Miller, 2017-06-25, files: 6, lines: -167/+479]
|\
| |   Jag Raman says:
| |
| |   ====================
| |   sparc64: Add MDESC & VIO support for VCC
| |
| |   This series of patches is part of an effort to add VCC (Virtual
| |   Console Concentrator) support to Linux. VCC enables the virtualization
| |   of serial console on SPARC processors. VCC provides access to the
| |   guest domain's serial console.
| |
| |   VCC depends on some core functionalities in the linux kernel for SPARC.
| |   The functionalities include LDC (Logical Domain Channels), MDESC
| |   (Machine Descriptor) and VIO (Virtual IO protocol). In order for VCC
| |   to be enabled, it requires that these core functionalities support
| |   them.
| |
| |   This series of patches adds MDESC & VIO support to enable VCC on Linux.
| |   It is the second batch of changes to enable VCC.
| |
| |   This version of the series addresses the following changes suggested
| |   by Dave Miller
| |
| |   Patch 4/5:
| |     - "name" field in vdev_port md_node_info is declared as
| |       "const char *"
| |     - Code has been modified to dynamically allocate & free "name"
| |       in vdev_port md_node_info
| |     - Parameters to mdesc_get_node(), mdesc_get_node_info() &
| |       mdesc_get_node_ops() have been updated to use "const char *"
| |
| |   Patch 6/5:
| |     - "node_name" parameter in vio_create_one() has been changed to
| |       "const char *" type from "char *"
| |     - Typecasts in vio_create_one() invocations to convert
| |       "const char *" to "char *" have been removed
| |
| |   Patch 11/5:
| |     - Invocations of mdesc_node_get() & mdesc_get_node_info() have been
| |       updated to use the prototypes defined in patch 4/5
| |   ====================
| |
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: add port_id to VIO device metadata  [Jag Raman, 2017-06-25, files: 2, lines: -0/+3]
| |
| |   Add port_id field to VIO device metadata to identify the port of
| |   VIO device.
| |
| |   Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
| |   Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
| |   Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: Enhance search for VIO device in MDESC  [Jag Raman, 2017-06-25, files: 3, lines: -63/+64]
| |
| |   Enhances search for VIO device in MDESC by leveraging already existing
| |   MDESC APIs. Enhances changes in earlier patch, "sparc: Machine
| |   description indices can vary", by using existing MD search functions.
| |   It also specifies a match function, thereby enabling
| |   device_find_child() to use it for the purpose of matching device
| |   nodes in MDESC.
| |
| |   An API to find VDEV node in MDESC based on its md_node_info is also
| |   added. It is planned to be used by VIO device clients in the future.
| |
| |   Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
| |   Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
| |   Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: enhance VIO device probing  [Jag Raman, 2017-06-25, files: 2, lines: -16/+42]
| |
| |   - Allocate IRQs for VIO devices during probing.
| |   - Allow clients to specify if IRQs would be allocated for a given
| |     VIO device.
| |   - Cache the device handle of the root node of channel-devices
| |     sub-tree in Machine Description (MDESC).
| |
| |   Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
| |   Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
| |   Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: check if a client is allowed to register for MDESC notifications  [Jag Raman, 2017-06-25, files: 1, lines: -0/+17]
| |
| |   Check whether a client is supported, by comparing it against a
| |   whitelist, before letting it register for notifications from
| |   Machine Description (MDESC).
| |
| |   Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
| |   Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
| |   Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: remove restriction on VIO device name size  [Jag Raman, 2017-06-25, files: 1, lines: -19/+4]
| |
| |   Removes the size limit on VIO device names. Since KOBJ_NAME_LEN has
| |   been dropped from kobject, there doesn't seem to be a restriction on
| |   the device name anymore. This limit therefore doesn't make sense.
| |
| |   Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
| |   Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
| |   Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: refactor code to obtain cfg_handle property from MDESC  [Jag Raman, 2017-06-25, files: 1, lines: -11/+19]
| |
| |   Refactors code to get the cfg_handle property of a node from Machine
| |   Description (MDESC)
| |
| |   Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
| |   Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
| |   Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: add MDESC node name property to VIO device metadata  [Jag Raman, 2017-06-25, files: 4, lines: -51/+68]
| |
| |   Add the MDESC node name of MDESC client to VIO device metadata. It is
| |   later used to uniquely identify a node in the MDESC. VIO & MDESC APIs
| |   are updated to handle this node name.
| |
| |   Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
| |   Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
| |   Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: mdesc: use __GFP_REPEAT action modifier for VM allocation  [Jag Raman, 2017-06-25, files: 1, lines: -5/+3]
| |
| |   During MDESC handle allocation, use the __GFP_REPEAT flag instead
| |   of __GFP_NOFAIL. If memory is not available, the caller expects a
| |   NULL pointer instead of waiting until memory is allocated.
| |
| |   Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
| |   Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
| |   Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: expand MDESC interface  [Jag Raman, 2017-06-25, files: 2, lines: -0/+233]
| |
| |   Add the following two APIs to Machine Description (MDESC)
| |   - mdesc_get_node: Searches for a node in the Machine Description
| |     tree based on given information about that node.
| |   - mdesc_get_node_info: Retrieves information about a given node.
| |
| |   Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
| |   Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
| |   Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: skip handshake for LDC channels in RAW mode  [Jag Raman, 2017-06-25, files: 2, lines: -1/+15]
| |
| |   LDC channels in RAW mode do not provide any session management. No
| |   handshake protocol is defined for LDC channels in RAW mode. It's
| |   therefore skipped.
| |
| |   Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
| |   Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
| |   Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: specify the device class in VIO version info. packet  [Jag Raman, 2017-06-25, files: 1, lines: -0/+1]
| |
| |   Specify the class of VIO device in the version info. packet. The
| |   device's class identifies the type of VIO device, whether it's DISK,
| |   CONSOLE, NETWORK, etc... This packet is used in the handshake between
| |   the client and server for this device.
| |
| |   Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
| |   Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
| |   Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: ensure VIO operations are defined while being used  [Jag Raman, 2017-06-25, files: 1, lines: -4/+13]
| |
| |   It's possible that VIO operations are not defined for some VIO
| |   clients. In that case, VIO ops pointer should be checked for NULL
| |   before being used
| |
| |   Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
| |   Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
| |   Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
|/
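The NULL check described in the commit above can be illustrated with a minimal standalone sketch. It is not the kernel's code: the struct names, the `send` callback, and the `-EINVAL` return value are all hypothetical and only show the pattern of testing an ops pointer before dereferencing it.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical ops table; the real VIO ops structure is not shown in the log. */
struct vio_ops_sketch {
    int (*send)(int value);
};

/* Hypothetical client; ops may legitimately be NULL for some clients. */
struct vio_client_sketch {
    const struct vio_ops_sketch *ops;
};

/* Check both the table and the callback before calling through them. */
int vio_send_sketch(const struct vio_client_sketch *client, int value)
{
    if (!client->ops || !client->ops->send)
        return -EINVAL;
    return client->ops->send(value);
}

/* Sample callback and clients used to exercise both paths. */
static int echo_send(int value) { return value; }

const struct vio_ops_sketch echo_ops = { .send = echo_send };
const struct vio_client_sketch client_with_ops = { .ops = &echo_ops };
const struct vio_client_sketch client_without_ops = { .ops = NULL };
```

A client without an ops table gets an error back instead of triggering a NULL dereference, while a client with one is dispatched normally.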
* sparc: kernel: apc: make of_device_ids const  [Arvind Yadav, 2017-06-25, files: 1, lines: -1/+1]
|
|   of_device_ids are not supposed to change at runtime. All functions
|   working with of_device_ids provided by <linux/of.h> work with const
|   of_device_ids. So mark the non-const structs as const.
|
|   Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
|   Signed-off-by: David S. Miller <davem@davemloft.net>
* sparc/time: make of_device_ids const  [Arvind Yadav, 2017-06-15, files: 1, lines: -1/+1]
|
|   of_device_ids are not supposed to change at runtime. All functions
|   working with of_device_ids provided by <linux/of.h> work with const
|   of_device_ids. So mark the non-const structs as const.
|
|   Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
|   Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'sparc64-early-boot-timestamp-fixes'  [David S. Miller, 2017-06-15, files: 1, lines: -6/+28]
|\
| |   Pavel Tatashin says:
| |
| |   ====================
| |   sparc64: Early boot timestamp fixes
| |
| |   Guenter Roeck reported a problem that was introduced by early boot
| |   timestamp changes. Where: tick_get_frequency() returns 0.
| |   ====================
| |
| |   Tested-by: Guenter Roeck <linux@roeck-us.net>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: broken %tick frequency on spitfire cpus  [Pavel Tatashin, 2017-06-15, files: 1, lines: -1/+26]
| |
| |   After the early boot time stamps project, the %tick frequency is
| |   detected incorrectly on spitfire cpus.
| |
| |   We must use the cpuid of the boot cpu to find the corresponding cpu
| |   node in OpenBoot, and extract the clock-frequency property from there.
| |
| |   Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: use prom interface to get %stick frequency  [Pavel Tatashin, 2017-06-15, files: 1, lines: -5/+2]
| |
| |   Because we initialize time early, we must use the prom interface
| |   instead of the open firmware driver, which is not yet initialized.
| |
| |   Also, use prom_getintdefault() instead of prom_getint() to be
| |   compatible with the code before the early boot timestamps project.
| |
| |   Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
|/
* Merge branch 'sparc64-early-boot-timestamp'  [David S. Miller, 2017-06-13, files: 6, lines: -58/+181]
|\
| |   Pavel Tatashin says:
| |
| |   ====================
| |   sparc64: Early boot timestamp
| |
| |   Changelog:
| |   v2 - v3:
| |     - __aligned(64) -> __cacheline_aligned
| |     - Replaced in sched_clock() wmb() with barrier()
| |   v1 - v2:
| |     - Early boot timestamps are now available on all 64-bit sparc
| |       processors
| |     - New hot-patched get_tick() function.
| |
| |   This patch set:
| |   - enables early boot timestamps on SPARC,
| |   - adds offset so we count time from zero, the same as it is done on
| |     other platforms
| |   - improves the performance by inlining, hot patching, and combining
| |     loads into the same cacheline (and a few other optimizations).
| |
| |   So, the final sched_clock() is faster than the current one: it does
| |   fewer loads, and all of them come from the same cacheline. Loads can
| |   run while we are reading the tick value, and we do not do a function
| |   call.
| |
| |   Current sched_clock():
| |       sethi  %hi(0xb9b400), %g1
| |       ldx    [ %g1 + 0x250 ], %g1
| |       ldx    [ %g1 ], %g1
| |       call   %g1
| |       nop
| |       sethi  %hi(0xb9b400), %g1
| |       ldx    [ %g1 + 0x300 ], %g1
| |       mulx   %o0, %g1, %g1
| |       rett   %i7 + 8
| |       srlx   %g1, 0xa, %o0
| |
| |   Final sched_clock():
| |       sethi  %hi(0xb9b400), %g1
| |       ldx    [ %g1 + 0x2c8 ], %g2
| |       ldx    [ %g1 + 0x2c0 ], %g1
| |       b      42f638 <sched_clock+0x44>
| |       rd     %asr24, %i0
| |       ...
| |       sllx   %i0, 1, %i0
| |       srlx   %i0, 1, %i0
| |       mulx   %i0, %g1, %i0
| |       srlx   %i0, 0xa, %i0
| |       rett   %i7 + 8
| |       sub    %o0, %g2, %o0
| |   ====================
| |
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: optimize functions that access tick  [Pavel Tatashin, 2017-06-13, files: 1, lines: -9/+13]
| |
| |   Replace read tick function pointers with the new hot-patched
| |   get_tick(). This optimizes the performance of functions such as:
| |   sched_clock()
| |
| |   Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
| |   Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: add hot-patched and inlined get_tick()  [Pavel Tatashin, 2017-06-13, files: 3, lines: -6/+87]
| |
| |   Add the new get_tick() function that is hot-patched during boot
| |   based on the processor we are booting on.
| |
| |   Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
| |   Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: initialize time early  [Pavel Tatashin, 2017-06-13, files: 3, lines: -13/+12]
| |
| |   In Linux it is possible to configure printk() to output a timestamp
| |   next to every line. This is very useful to determine the slow parts
| |   of the boot process, and also to avoid regressions, as boot time is
| |   visible to everyone.
| |
| |   Also, there are scripts that change these time stamps to intervals.
| |
| |   However, on larger machines these timestamps start appearing many
| |   seconds, and even minutes, into the boot process. This patch gets the
| |   stick-frequency property early from OpenBoot, and uses its value to
| |   initialize time stamps before the first printk() messages are printed.
| |
| |   Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
| |   Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
| |   Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: improve modularity tick options  [Pavel Tatashin, 2017-06-13, files: 2, lines: -28/+57]
| |
| |   This patch prepares the code for early boot time stamps by making it
| |   more modular.
| |
| |   - init_tick_ops() to initialize struct sparc64_tick_ops
| |   - new sparc64_tick_ops operation get_frequency() which returns a
| |     frequency
| |
| |   Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
| |   Reviewed-by: Bob Picco <bob.picco@oracle.com>
| |   Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: optimize loads in clock_sched()  [Pavel Tatashin, 2017-06-13, files: 2, lines: -10/+12]
| |
| |   In sched_clock() we now have three loads:
| |   - Function pointer
| |   - quotient for multiplication
| |   - offset
| |
| |   However, it is possible to improve performance substantially, by
| |   guaranteeing that all three loads are from the same cacheline.
| |
| |   By moving these three values first in sparc64_tick_ops, and by having
| |   tick_operations 64-byte aligned we guarantee this.
| |
| |   Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
| |   Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
| |   Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
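The layout argument in the commit above can be checked at compile time. The following is a sketch only: the struct and field names are hypothetical stand-ins for sparc64_tick_ops (whose real definition is not shown in this log), and a 64-byte cacheline is assumed, matching the "64-byte aligned" wording of the message.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHELINE 64 /* assumed cacheline size */

/* Hypothetical layout: the three hot values come first, cold data after. */
struct tick_ops_layout {
    uint64_t (*get_tick)(void); /* hot: function pointer */
    uint64_t quotient;          /* hot: multiplier for ticks -> ns */
    uint64_t offset;            /* hot: value subtracted to start at zero */
    const char *name;           /* cold: rarely read */
    uint64_t frequency;         /* cold: rarely read */
};

/* Aligning the object to the cacheline size pins the first 64 bytes of the
 * struct, and therefore the three hot fields, to a single cacheline. */
alignas(SKETCH_CACHELINE) struct tick_ops_layout tick_operations_sketch;

/* Fails the build if the last hot field spills past the first cacheline. */
static_assert(offsetof(struct tick_ops_layout, offset) + sizeof(uint64_t)
                  <= SKETCH_CACHELINE,
              "hot fields must fit in one cacheline");
```

Both conditions matter: field order alone does not help if the object straddles a cacheline boundary, and alignment alone does not help if the hot fields are spread across the struct.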
| * sparc64: show time stamps from zero  [Pavel Tatashin, 2017-06-13, files: 1, lines: -2/+8]
| |
| |   On most platforms, time is shown from the beginning of boot. This
| |   patch adds an offset to sched_clock() for SPARC, to also show time
| |   from 0.
| |
| |   This means we will have one more load, but we saved one in an earlier
| |   patch.
| |
| |   Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
| |   Reviewed-by: Bob Picco <bob.picco@oracle.com>
| |   Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
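The arithmetic behind "time from 0" can be sketched in plain C. This is an illustration rather than the kernel source: the function and parameter names are hypothetical, and the shift of 10 is taken from the `srlx ..., 0xa` instruction in the sched_clock() listings quoted in this series.

```c
#include <stdint.h>

#define SKETCH_SHIFT 10 /* from "srlx ..., 0xa" in the quoted listings */

/* Convert raw ticks to nanoseconds using a precomputed fixed-point quotient. */
uint64_t ticks_to_ns_sketch(uint64_t ticks, uint64_t quotient)
{
    return (ticks * quotient) >> SKETCH_SHIFT;
}

/* Same conversion, minus an offset captured at boot, so the clock starts
 * at zero instead of at whatever the tick register held at power-on. */
uint64_t sched_clock_sketch(uint64_t ticks, uint64_t quotient, uint64_t offset)
{
    return ticks_to_ns_sketch(ticks, quotient) - offset;
}
```

With a quotient of 1024 each tick is exactly one nanosecond, so an offset equal to the boot-time reading makes the first timestamp zero. The offset is the one extra load the commit message mentions.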
| * sparc64: access tick function from variable  [Pavel Tatashin, 2017-06-13, files: 1, lines: -14/+16]
| |
| |   In timer_64.c tick functions are accessed via a pointer (tick_ops),
| |   so every time the clock is read, there is one extra load to get to
| |   the function.
| |
| |   This patch optimizes it by accessing the function pointer from the
| |   structure value instead.
| |
| |   Current sched_clock():
| |       sethi  %hi(0xb9b400), %g1
| |       ldx    [ %g1 + 0x250 ], %g1   ! <tick_ops>
| |       ldx    [ %g1 ], %g1
| |       call   %g1
| |       nop
| |       sethi  %hi(0xb9b400), %g1
| |       ldx    [ %g1 + 0x300 ], %g1   ! <timer_ticks_per_nsec_quotient>
| |       mulx   %o0, %g1, %g1
| |       rett   %i7 + 8
| |       srlx   %g1, 0xa, %o0
| |
| |   New sched_clock():
| |       sethi  %hi(0xb9b400), %g1
| |       ldx    [ %g1 + 0x340 ], %g1
| |       call   %g1
| |       nop
| |       sethi  %hi(0xb9b400), %g1
| |       ldx    [ %g1 + 0x378 ], %g1
| |       mulx   %o0, %g1, %g1
| |       rett   %i7 + 8
| |       srlx   %g1, 0xa, %o0
| |
| |   Before: three loads. Now: two loads.
| |
| |   Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
| |   Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
| |   Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
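The difference between the two listings comes down to whether the global is a pointer to the ops structure or the structure itself. A rough C sketch, with hypothetical names throughout (the real tick_ops definition is not reproduced in this log):

```c
#include <stdint.h>

struct tick_ops_sketch {
    uint64_t (*get_tick)(void);
};

/* Stand-in tick source so the sketch is self-contained. */
static uint64_t fake_tick(void) { return 1024; }

static struct tick_ops_sketch ops_storage = { .get_tick = fake_tick };

/* Before: the global is a pointer. Reading the clock loads the pointer,
 * then loads the function pointer through it. */
struct tick_ops_sketch *tick_ops_ptr_sketch = &ops_storage;

/* After: the global is the structure itself. The function pointer sits at
 * a fixed address, so one load is enough. */
struct tick_ops_sketch tick_ops_val_sketch = { .get_tick = fake_tick };

uint64_t read_via_pointer(void) { return tick_ops_ptr_sketch->get_tick(); }
uint64_t read_via_value(void)   { return tick_ops_val_sketch.get_tick(); }
```

Both functions return the same value; only the number of dependent loads before the call differs, which is the saving the commit message counts.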
| * sparc64: remove trailing white spaces  [Pavel Tatashin, 2017-06-13, files: 2, lines: -4/+4]
| |
| |   A few changes that were reported by checkpatch, removed all trailing
| |   white spaces in these two files.
| |
| |   Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
| |   Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
|/
* Merge branch 'sparc64-LDC-changes-for-porting-VCC-driver-into-upstream-kernel'  [David S. Miller, 2017-06-10, files: 2, lines: -43/+106]
|\
| |   Jag Raman says:
| |
| |   ====================
| |   sparc64: LDC changes for porting VCC driver into upstream kernel
| |
| |   This series of patches is part of an effort to add VCC (Virtual
| |   Console Concentrator) support to Linux. VCC enables the virtualization
| |   of serial console on SPARC processors. VCC provides access to the
| |   guest domain's serial console.
| |
| |   VCC depends on some core functionalities in the linux kernel for SPARC.
| |   The functionalities include LDC (Logical Domain Channels), MDESC
| |   (Machine Descriptor) and VIO (Virtual IO protocol). In order for VCC
| |   to be enabled, it requires that these core functionalities support
| |   them.
| |
| |   This series of patches adds LDC support to enable VCC on Linux. It is
| |   the first batch of changes to enable VCC.
| |
| |   This version 4 of the series addresses the following changes suggested
| |   by Dave Miller
| |
| |   Patch 1/5: Modifies ldc_print/__ldc_print to print caller name. Fixes
| |   indentation of wrapped lines.
| |   ====================
| |
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: print debug messages when reading from LDC channel  [Jag Raman, 2017-06-10, files: 1, lines: -0/+5]
| |
| |   Print debug messages when reading from given LDC channel.
| |
| |   Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
| |   Reviewed-by: Aaron Young <aaron.young@oracle.com>
| |   Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
| |   Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
| |   Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: ldc abort during vds iso boot  [Jag Raman, 2017-06-10, files: 1, lines: -1/+6]
| |
| |   Orabug: 20902628
| |
| |   When an ldc control-only packet is received during data exchange in
| |   read_nonraw(), a new rx head is calculated but the rx queue head is
| |   not actually advanced (rx_set_head() is not called) and a branch is
| |   taken to 'no_data' at which point two things can happen depending on
| |   the value of the newly calculated rx head and the current rx tail:
| |
| |   - If the rx queue is determined to be not empty, then the wrong packet
| |     is picked up.
| |
| |   - If the rx queue is determined to be empty, then a read error (EAGAIN)
| |     is eventually returned since it is falsely assumed that more data
| |     was expected.
| |
| |   The fix is to update the rx head and return in case of a control only
| |   packet during data exchange.
| |
| |   Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
| |   Reviewed-by: Aaron Young <aaron.young@oracle.com>
| |   Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
| |   Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
| |   Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: ensure LDC channel is ready before communication  [Jag Raman, 2017-06-10, files: 1, lines: -5/+25]
| |
| |   Ensure that LDC channel is up before writing to it, in RAW mode.
| |   Generate event to bring the LDC channel up, if it's not up already.
| |
| |   Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
| |   Reviewed-by: Aaron Young <aaron.young@oracle.com>
| |   Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
| |   Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
| |   Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: enhance ldc_abort to print message  [Jag Raman, 2017-06-10, files: 1, lines: -25/+28]
| |
| |   Enhance ldc_abort to accept a message to be printed when it is called.
| |   Add a macro, LDC_ABORT, to print info. about the function that calls
| |   ldc_abort.
| |
| |   Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
| |   Reviewed-by: Aaron Young <aaron.young@oracle.com>
| |   Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
| |   Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
| |   Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
| * sparc64: expand LDC interface  [Jag Raman, 2017-06-10, files: 2, lines: -12/+42]
| |
| |   Add the following LDC APIs which are planned to be used by LDC clients
| |   in the future:
| |   - ldc_set_state: Sets given LDC channel to given state
| |   - ldc_mode: Returns the mode of given LDC channel
| |   - ldc_print: Prints info about given LDC channel
| |   - ldc_rx_reset: Reset the RX queue of given LDC channel
| |
| |   Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
| |   Reviewed-by: Aaron Young <aaron.young@oracle.com>
| |   Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
| |   Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
| |   Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
| |   Signed-off-by: David S. Miller <davem@davemloft.net>
|/
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc  [David S. Miller, 2017-06-10, files: 697, lines: -3701/+6647]
|\
| * Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs  [Linus Torvalds, 2017-06-10, files: 5, lines: -20/+63]
| |\
| | |   Pull UFS fixes from Al Viro:
| | |    "This is just the obvious backport fodder; I'm pretty sure that
| | |     there will be more - definitely so wrt performance and quite
| | |     possibly correctness as well"
| | |
| | |   * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
| | |     ufs: we need to sync inode before freeing it
| | |     excessive checks in ufs_write_failed() and ufs_evict_inode()
| | |     ufs_getfrag_block(): we only grab ->truncate_mutex on block creation path
| | |     ufs_extend_tail(): fix the braino in calling conventions of ufs_new_fragments()
| | |     ufs: set correct ->s_maxsize
| | |     ufs: restore maintaining ->i_blocks
| | |     fix ufs_isblockset()
| | |     ufs: restore proper tail allocation
| | * ufs: we need to sync inode before freeing it  [Al Viro, 2017-06-10, files: 1, lines: -0/+1]
| | |
| | |   Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * excessive checks in ufs_write_failed() and ufs_evict_inode()  [Al Viro, 2017-06-09, files: 1, lines: -13/+5]
| | |
| | |   As it is, a short copy in write() to an append-only file will fail
| | |   to truncate the excessive allocated blocks. As a matter of fact,
| | |   all checks in ufs_truncate_blocks() are either redundant or wrong
| | |   for that caller. As for the only other caller (ufs_evict_inode()),
| | |   we only need the file type checks there.
| | |
| | |   Cc: stable@vger.kernel.org
| | |   Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * ufs_getfrag_block(): we only grab ->truncate_mutex on block creation path  [Al Viro, 2017-06-09, files: 1, lines: -1/+3]
| | |
| | |   Cc: stable@vger.kernel.org
| | |   Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * ufs_extend_tail(): fix the braino in calling conventions of ufs_new_fragments()  [Al Viro, 2017-06-09, files: 1, lines: -1/+2]
| | |
| | |   ... and it really needs splitting into "new" and "extend" cases,
| | |   but that's for later
| | |
| | |   Cc: stable@vger.kernel.org
| | |   Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * ufs: set correct ->s_maxsize  [Al Viro, 2017-06-09, files: 1, lines: -0/+18]
| | |
| | |   Cc: stable@vger.kernel.org
| | |   Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * ufs: restore maintaining ->i_blocks  [Al Viro, 2017-06-09, files: 2, lines: -1/+26]
| | |
| | |   Cc: stable@vger.kernel.org
| | |   Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * fix ufs_isblockset()  [Al Viro, 2017-06-09, files: 1, lines: -3/+7]
| | |
| | |   Cc: stable@vger.kernel.org
| | |   Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| | * ufs: restore proper tail allocation  [Al Viro, 2017-06-09, files: 1, lines: -1/+1]
| | |
| | |   Cc: stable@vger.kernel.org
| | |   Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | Merge branch 'for-linus-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs  [Linus Torvalds, 2017-06-10, files: 6, lines: -16/+139]
| |\ \
| | | |   Pull btrfs fixes from Chris Mason:
| | | |    "Some fixes that Dave Sterba collected.
| | | |
| | | |     We've been hitting an early enospc problem on production machines
| | | |     that Omar tracked down to an old int->u64 mistake. I waited a bit
| | | |     on this pull to make sure it was really the problem from
| | | |     production, but it's on ~2100 hosts now and I think we're good.
| | | |
| | | |     Omar also noticed a commit in the queue would make new early
| | | |     ENOSPC problems. I pulled that out for now, which is why the top
| | | |     three commits are younger than the rest.
| | | |
| | | |     Otherwise these are all fixes, some explaining very old bugs that
| | | |     we've been poking at for a while"
| | | |
| | | |   * 'for-linus-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
| | | |     Btrfs: fix delalloc accounting leak caused by u32 overflow
| | | |     Btrfs: clear EXTENT_DEFRAG bits in finish_ordered_io
| | | |     btrfs: tree-log.c: Wrong printk information about namelen
| | | |     btrfs: fix race with relocation recovery and fs_root setup
| | | |     btrfs: fix memory leak in update_space_info failure path
| | | |     btrfs: use correct types for page indices in btrfs_page_exists_in_range
| | | |     btrfs: fix incorrect error return ret being passed to mapping_set_error
| | | |     btrfs: Make flush bios explicitely sync
| | | |     btrfs: fiemap: Cache and merge fiemap extent before submit it to user
| | * | Btrfs: fix delalloc accounting leak caused by u32 overflow  [Omar Sandoval, 2017-06-09, files: 1, lines: -2/+2]
| | | |
| | | |   btrfs_calc_trans_metadata_size() does an unsigned 32-bit
| | | |   multiplication, which can overflow if num_items >= 4 GB / (nodesize
| | | |   * BTRFS_MAX_LEVEL * 2). For a nodesize of 16kB, this overflow
| | | |   happens at 16k items. Usually, num_items is a small constant passed
| | | |   to btrfs_start_transaction(), but we also use
| | | |   btrfs_calc_trans_metadata_size() for metadata reservations for
| | | |   extent items in btrfs_delalloc_{reserve,release}_metadata().
| | | |
| | | |   In drop_outstanding_extents(), num_items is calculated as
| | | |   inode->reserved_extents - inode->outstanding_extents. The difference
| | | |   between these two counters is usually small, but if many delalloc
| | | |   extents are reserved and then the outstanding extents are merged in
| | | |   btrfs_merge_extent_hook(), the difference can become large enough
| | | |   to overflow in btrfs_calc_trans_metadata_size().
| | | |
| | | |   The overflow manifests itself as a leak of a multiple of 4 GB in
| | | |   delalloc_block_rsv and the metadata bytes_may_use counter. This in
| | | |   turn can cause early ENOSPC errors. Additionally, these WARN_ONs in
| | | |   extent-tree.c will be hit when unmounting:
| | | |
| | | |       WARN_ON(fs_info->delalloc_block_rsv.size > 0);
| | | |       WARN_ON(fs_info->delalloc_block_rsv.reserved > 0);
| | | |       WARN_ON(space_info->bytes_pinned > 0 ||
| | | |               space_info->bytes_reserved > 0 ||
| | | |               space_info->bytes_may_use > 0);
| | | |
| | | |   Fix it by casting nodesize to a u64 so that
| | | |   btrfs_calc_trans_metadata_size() does a full 64-bit multiplication.
| | | |   While we're here, do the same in btrfs_calc_trunc_metadata_size();
| | | |   this can't overflow with any existing uses, but it's better to be
| | | |   safe here than have another hard-to-debug problem later on.
| | | |
| | | |   Cc: stable@vger.kernel.org
| | | |   Signed-off-by: Omar Sandoval <osandov@fb.com>
| | | |   Reviewed-by: David Sterba <dsterba@suse.com>
| | | |   Signed-off-by: Chris Mason <clm@fb.com>
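The overflow described in the commit above is easy to reproduce outside the kernel. The sketch below is not the btrfs source: the function names are hypothetical, and the level constant is an assumption chosen so that a 16 kB nodesize overflows at 16k items, as the message states.

```c
#include <stdint.h>

#define SKETCH_MAX_LEVEL 8u /* assumed value standing in for BTRFS_MAX_LEVEL */

/* Buggy form: every operand is 32 bits wide, so the product wraps modulo
 * 2^32 before it is widened to the 64-bit return type. */
uint64_t calc_size_u32_sketch(uint32_t nodesize, unsigned int num_items)
{
    return nodesize * SKETCH_MAX_LEVEL * 2 * num_items;
}

/* Fixed form: casting the first operand to 64 bits promotes the whole
 * multiplication chain to 64-bit arithmetic. */
uint64_t calc_size_u64_sketch(uint32_t nodesize, unsigned int num_items)
{
    return (uint64_t)nodesize * SKETCH_MAX_LEVEL * 2 * num_items;
}
```

With nodesize = 16384 and num_items = 16384, the product is exactly 2^32: the 32-bit version wraps to 0 while the 64-bit version returns 4294967296. One item fewer and both versions agree, which is why the bug only appears once the counter difference grows large.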
| | * | Btrfs: clear EXTENT_DEFRAG bits in finish_ordered_io  [Liu Bo, 2017-06-09, files: 1, lines: -1/+1]
| | | |
| | | |   Before this, we used 'filled' mode here, i.e. the bits were cleared
| | | |   only if the whole range had been filled with EXTENT_DEFRAG bits. But
| | | |   if the defrag range joins the adjacent delalloc range, then we'll
| | | |   have EXTENT_DEFRAG bits in extent_state until releasing this
| | | |   inode's pages, and that prevents extent_data from being freed.
| | | |
| | | |   This clears the bit if any was found within the ordered extent.
| | | |
| | | |   Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
| | | |   Reviewed-by: David Sterba <dsterba@suse.com>
| | | |   Signed-off-by: David Sterba <dsterba@suse.com>
| | | |   Signed-off-by: Chris Mason <clm@fb.com>
| | * | btrfs: tree-log.c: Wrong printk information about namelen  [Su Yue, 2017-06-09, files: 1, lines: -1/+1]
| | | |
| | | |   In verify_dir_item, it wants to printk name_len of dir_item but
| | | |   printk data_len actually. Fix it by calling btrfs_dir_name_len
| | | |   instead of btrfs_dir_data_len.
| | | |
| | | |   Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
| | | |   Reviewed-by: David Sterba <dsterba@suse.com>
| | | |   Signed-off-by: David Sterba <dsterba@suse.com>
| | | |   Signed-off-by: Chris Mason <clm@fb.com>
| | * | btrfs: fix race with relocation recovery and fs_root setup  [Jeff Mahoney, 2017-06-01, files: 1, lines: -3/+3]
| | | |
| | | |   If we have to recover relocation during mount, we'll ultimately
| | | |   have to evict the orphan inode. That goes through the reservation
| | | |   dance, where priority_reclaim_metadata_space and flush_space expect
| | | |   fs_info->fs_root to be valid. That's the next thing to be set up
| | | |   during mount, so we crash, almost always in flush_space trying to
| | | |   join the transaction but priority_reclaim_metadata_space is
| | | |   possible as well. This call path has been problematic in the past
| | | |   WRT whether ->fs_root is valid yet. Commit 957780eb278 (Btrfs:
| | | |   introduce ticketed enospc infrastructure) added new users that are
| | | |   called in the direct path instead of the async path that had
| | | |   already been worked around.
| | | |
| | | |   The thing is that we don't actually need the fs_root, specifically,
| | | |   for anything. We either use it to determine whether the root is the
| | | |   chunk_root for use in choosing an allocation profile or as a root
| | | |   to pass btrfs_join_transaction before immediately committing it.
| | | |   Anything that isn't the chunk root works in the former case and any
| | | |   root works in the latter.
| | | |
| | | |   A simple fix is to use a root we know will always be there: the
| | | |   extent_root.
| | | |
| | | |   Cc: <stable@vger.kernel.org> # v4.8+
| | | |   Fixes: 957780eb278 (Btrfs: introduce ticketed enospc infrastructure)
| | | |   Signed-off-by: Jeff Mahoney <jeffm@suse.com>
| | | |   Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
| | | |   Signed-off-by: David Sterba <dsterba@suse.com>