From a4ac73a701288f633fbf2369ffaae4841aa295b3 Mon Sep 17 00:00:00 2001 From: Carlos Chinea Date: Thu, 29 Apr 2010 13:19:06 +0300 Subject: HSI: Add HSI API documentation Add an entry for HSI in the device-drivers section of the kernel documentation. Signed-off-by: Carlos Chinea --- Documentation/DocBook/device-drivers.tmpl | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) (limited to 'Documentation') diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl index b638e50cf8f6..5f70f734e8b4 100644 --- a/Documentation/DocBook/device-drivers.tmpl +++ b/Documentation/DocBook/device-drivers.tmpl @@ -437,4 +437,21 @@ X!Idrivers/video/console/fonts.c !Edrivers/i2c/i2c-core.c + + High Speed Synchronous Serial Interface (HSI) + + + High Speed Synchronous Serial Interface (HSI) is a + serial interface mainly used for connecting application + engines (APE) with cellular modem engines (CMT) in cellular + handsets. + + HSI provides multiplexing for up to 16 logical channels, + low-latency and full duplex communication. + + +!Iinclude/linux/hsi/hsi.h +!Edrivers/hsi/hsi.c + + -- cgit v1.2.3 From 43139a61fc68f4b0af7327a0e63f340a7c81c69a Mon Sep 17 00:00:00 2001 From: Andras Domokos Date: Thu, 6 May 2010 15:10:47 +0300 Subject: HSI: hsi_char: Update ioctl-number.txt Added ioctl range for HSI char devices to the documentation Signed-off-by: Andras Domokos Signed-off-by: Carlos Chinea --- Documentation/ioctl/ioctl-number.txt | 1 + 1 file changed, 1 insertion(+) (limited to 'Documentation') diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt index 54078ed96b37..af76fdef6046 100644 --- a/Documentation/ioctl/ioctl-number.txt +++ b/Documentation/ioctl/ioctl-number.txt @@ -223,6 +223,7 @@ Code Seq#(hex) Include File Comments 'j' 00-3F linux/joystick.h 'k' 00-0F linux/spi/spidev.h conflict! 'k' 00-05 video/kyro.h conflict! +'k' 10-17 linux/hsi/hsi_char.h HSI character device 'l' 00-3F linux/tcfs_fs.h transparent cryptographic file system 'l' 40-7F linux/udf_fs_i.h in development: -- cgit v1.2.3 From b43ab901d671e3e3cad425ea5e9a3c74e266dcdd Mon Sep 17 00:00:00 2001 From: Sebastian Andrzej Siewior Date: Mon, 27 Jun 2011 09:26:23 +0200 Subject: gpio: Add a driver for Sodaville GPIO controller Sodaville has GPIO controller behind the PCI bus. To my suprissed it is not the same as on PXA. The interrupt & gpio chip can be referenced from the device tree like from any other driver. Unfortunately the driver which uses the gpio interrupt has to use irq_of_parse_and_map() instead of platform_get_irq(). The problem is that the platform device (which is created from the device tree) is most likely created before the interrupt chip is registered and therefore irq_of_parse_and_map() fails. In theory the driver works as module. In reality most of the irq functions are not exported to modules and it is possible that _this_ module is unloaded while the provided irqs are still in use. Signed-off-by: Hans J. Koch [torbenh@linutronix.de: make it work after the irq namespace cleanup, add some device tree entries.] Signed-off-by: Torben Hohn [bigeasy@linutronix.de: convert to generic irq & gpio chip] Signed-off-by: Sebastian Andrzej Siewior [grant.likely@secretlab.ca: depend on x86 to avoid irq_domain breakage] Signed-off-by: Grant Likely --- .../devicetree/bindings/gpio/sodaville.txt | 48 ++++ arch/x86/platform/ce4100/falconfalls.dts | 7 +- drivers/gpio/Kconfig | 8 + drivers/gpio/Makefile | 1 + drivers/gpio/gpio-sodaville.c | 302 +++++++++++++++++++++ 5 files changed, 364 insertions(+), 2 deletions(-) create mode 100644 Documentation/devicetree/bindings/gpio/sodaville.txt create mode 100644 drivers/gpio/gpio-sodaville.c (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/gpio/sodaville.txt b/Documentation/devicetree/bindings/gpio/sodaville.txt new file mode 100644 index 000000000000..563eff22b975 --- /dev/null +++ b/Documentation/devicetree/bindings/gpio/sodaville.txt @@ -0,0 +1,48 @@ +GPIO controller on CE4100 / Sodaville SoCs +========================================== + +The bindings for CE4100's GPIO controller match the generic description +which is covered by the gpio.txt file in this folder. + +The only additional property is the intel,muxctl property which holds the +value which is written into the MUXCNTL register. + +There is no compatible property for now because the driver is probed via +PCI id (vendor 0x8086 device 0x2e67). + +The interrupt specifier consists of two cells encoded as follows: + - <1st cell>: The interrupt-number that identifies the interrupt source. + - <2nd cell>: The level-sense information, encoded as follows: + 4 - active high level-sensitive + 8 - active low level-sensitive + +Example of the GPIO device and one user: + + pcigpio: gpio@b,1 { + /* two cells for GPIO and interrupt */ + #gpio-cells = <2>; + #interrupt-cells = <2>; + compatible = "pci8086,2e67.2", + "pci8086,2e67", + "pciclassff0000", + "pciclassff00"; + + reg = <0x15900 0x0 0x0 0x0 0x0>; + /* Interrupt line of the gpio device */ + interrupts = <15 1>; + /* It is an interrupt and GPIO controller itself */ + interrupt-controller; + gpio-controller; + intel,muxctl = <0>; + }; + + testuser@20 { + compatible = "example,testuser"; + /* User the 11th GPIO line as an active high triggered + * level interrupt + */ + interrupts = <11 8>; + interrupt-parent = <&pcigpio>; + /* Use this GPIO also with the gpio functions */ + gpios = <&pcigpio 11 0>; + }; diff --git a/arch/x86/platform/ce4100/falconfalls.dts b/arch/x86/platform/ce4100/falconfalls.dts index e70be38ce039..ce874f872cc6 100644 --- a/arch/x86/platform/ce4100/falconfalls.dts +++ b/arch/x86/platform/ce4100/falconfalls.dts @@ -208,16 +208,19 @@ interrupts = <14 1>; }; - gpio@b,1 { + pcigpio: gpio@b,1 { + #gpio-cells = <2>; + #interrupt-cells = <2>; compatible = "pci8086,2e67.2", "pci8086,2e67", "pciclassff0000", "pciclassff00"; - #gpio-cells = <2>; reg = <0x15900 0x0 0x0 0x0 0x0>; interrupts = <15 1>; + interrupt-controller; gpio-controller; + intel,muxctl = <0>; }; i2c-controller@b,2 { diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig index eaa7d3828e70..dbb1909ca0a2 100644 --- a/drivers/gpio/Kconfig +++ b/drivers/gpio/Kconfig @@ -417,6 +417,14 @@ config GPIO_ML_IOH Hub) which is for IVI(In-Vehicle Infotainment) use. This driver can access the IOH's GPIO device. +config GPIO_SODAVILLE + bool "Intel Sodaville GPIO support" + depends on X86 && PCI && OF + select GPIO_GENERIC + select GENERIC_IRQ_CHIP + help + Say Y here to support Intel Sodaville GPIO. + config GPIO_TIMBERDALE bool "Support for timberdale GPIO IP" depends on MFD_TIMBERDALE && HAS_IOMEM diff --git a/drivers/gpio/Makefile b/drivers/gpio/Makefile index 8863a7f2dece..593bdcd1976e 100644 --- a/drivers/gpio/Makefile +++ b/drivers/gpio/Makefile @@ -46,6 +46,7 @@ obj-$(CONFIG_GPIO_RDC321X) += gpio-rdc321x.o obj-$(CONFIG_PLAT_SAMSUNG) += gpio-samsung.o obj-$(CONFIG_ARCH_SA1100) += gpio-sa1100.o obj-$(CONFIG_GPIO_SCH) += gpio-sch.o +obj-$(CONFIG_GPIO_SODAVILLE) += gpio-sodaville.o obj-$(CONFIG_GPIO_STMPE) += gpio-stmpe.o obj-$(CONFIG_GPIO_SX150X) += gpio-sx150x.o obj-$(CONFIG_GPIO_TC3589X) += gpio-tc3589x.o diff --git a/drivers/gpio/gpio-sodaville.c b/drivers/gpio/gpio-sodaville.c new file mode 100644 index 000000000000..9ba15d31d242 --- /dev/null +++ b/drivers/gpio/gpio-sodaville.c @@ -0,0 +1,302 @@ +/* + * GPIO interface for Intel Sodaville SoCs. + * + * Copyright (c) 2010, 2011 Intel Corporation + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License 2 as published + * by the Free Software Foundation. + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define DRV_NAME "sdv_gpio" +#define SDV_NUM_PUB_GPIOS 12 +#define PCI_DEVICE_ID_SDV_GPIO 0x2e67 +#define GPIO_BAR 0 + +#define GPOUTR 0x00 +#define GPOER 0x04 +#define GPINR 0x08 + +#define GPSTR 0x0c +#define GPIT1R0 0x10 +#define GPIO_INT 0x14 +#define GPIT1R1 0x18 + +#define GPMUXCTL 0x1c + +struct sdv_gpio_chip_data { + int irq_base; + void __iomem *gpio_pub_base; + struct irq_domain id; + struct irq_chip_generic *gc; + struct bgpio_chip bgpio; +}; + +static int sdv_gpio_pub_set_type(struct irq_data *d, unsigned int type) +{ + struct irq_chip_generic *gc = irq_data_get_irq_chip_data(d); + struct sdv_gpio_chip_data *sd = gc->private; + void __iomem *type_reg; + u32 irq_offs = d->irq - sd->irq_base; + u32 reg; + + if (irq_offs < 8) + type_reg = sd->gpio_pub_base + GPIT1R0; + else + type_reg = sd->gpio_pub_base + GPIT1R1; + + reg = readl(type_reg); + + switch (type) { + case IRQ_TYPE_LEVEL_HIGH: + reg &= ~BIT(4 * (irq_offs % 8)); + break; + + case IRQ_TYPE_LEVEL_LOW: + reg |= BIT(4 * (irq_offs % 8)); + break; + + default: + return -EINVAL; + } + + writel(reg, type_reg); + return 0; +} + +static irqreturn_t sdv_gpio_pub_irq_handler(int irq, void *data) +{ + struct sdv_gpio_chip_data *sd = data; + u32 irq_stat = readl(sd->gpio_pub_base + GPSTR); + + irq_stat &= readl(sd->gpio_pub_base + GPIO_INT); + if (!irq_stat) + return IRQ_NONE; + + while (irq_stat) { + u32 irq_bit = __fls(irq_stat); + + irq_stat &= ~BIT(irq_bit); + generic_handle_irq(sd->irq_base + irq_bit); + } + + return IRQ_HANDLED; +} + +static int sdv_xlate(struct irq_domain *h, struct device_node *node, + const u32 *intspec, u32 intsize, irq_hw_number_t *out_hwirq, + u32 *out_type) +{ + u32 line, type; + + if (node != h->of_node) + return -EINVAL; + + if (intsize < 2) + return -EINVAL; + + line = *intspec; + *out_hwirq = line; + + intspec++; + type = *intspec; + + switch (type) { + case IRQ_TYPE_LEVEL_LOW: + case IRQ_TYPE_LEVEL_HIGH: + *out_type = type; + break; + default: + return -EINVAL; + } + return 0; +} + +static struct irq_domain_ops irq_domain_sdv_ops = { + .dt_translate = sdv_xlate, +}; + +static __devinit int sdv_register_irqsupport(struct sdv_gpio_chip_data *sd, + struct pci_dev *pdev) +{ + struct irq_chip_type *ct; + int ret; + + sd->irq_base = irq_alloc_descs(-1, 0, SDV_NUM_PUB_GPIOS, -1); + if (sd->irq_base < 0) + return sd->irq_base; + + /* mask + ACK all interrupt sources */ + writel(0, sd->gpio_pub_base + GPIO_INT); + writel((1 << 11) - 1, sd->gpio_pub_base + GPSTR); + + ret = request_irq(pdev->irq, sdv_gpio_pub_irq_handler, IRQF_SHARED, + "sdv_gpio", sd); + if (ret) + goto out_free_desc; + + sd->id.irq_base = sd->irq_base; + sd->id.of_node = of_node_get(pdev->dev.of_node); + sd->id.ops = &irq_domain_sdv_ops; + + /* + * This gpio irq controller latches level irqs. Testing shows that if + * we unmask & ACK the IRQ before the source of the interrupt is gone + * then the interrupt is active again. + */ + sd->gc = irq_alloc_generic_chip("sdv-gpio", 1, sd->irq_base, + sd->gpio_pub_base, handle_fasteoi_irq); + if (!sd->gc) { + ret = -ENOMEM; + goto out_free_irq; + } + + sd->gc->private = sd; + ct = sd->gc->chip_types; + ct->type = IRQ_TYPE_LEVEL_HIGH | IRQ_TYPE_LEVEL_LOW; + ct->regs.eoi = GPSTR; + ct->regs.mask = GPIO_INT; + ct->chip.irq_mask = irq_gc_mask_clr_bit; + ct->chip.irq_unmask = irq_gc_mask_set_bit; + ct->chip.irq_eoi = irq_gc_eoi; + ct->chip.irq_set_type = sdv_gpio_pub_set_type; + + irq_setup_generic_chip(sd->gc, IRQ_MSK(SDV_NUM_PUB_GPIOS), + IRQ_GC_INIT_MASK_CACHE, IRQ_NOREQUEST, + IRQ_LEVEL | IRQ_NOPROBE); + + irq_domain_add(&sd->id); + return 0; +out_free_irq: + free_irq(pdev->irq, sd); +out_free_desc: + irq_free_descs(sd->irq_base, SDV_NUM_PUB_GPIOS); + return ret; +} + +static int __devinit sdv_gpio_probe(struct pci_dev *pdev, + const struct pci_device_id *pci_id) +{ + struct sdv_gpio_chip_data *sd; + unsigned long addr; + const void *prop; + int len; + int ret; + u32 mux_val; + + sd = kzalloc(sizeof(struct sdv_gpio_chip_data), GFP_KERNEL); + if (!sd) + return -ENOMEM; + ret = pci_enable_device(pdev); + if (ret) { + dev_err(&pdev->dev, "can't enable device.\n"); + goto done; + } + + ret = pci_request_region(pdev, GPIO_BAR, DRV_NAME); + if (ret) { + dev_err(&pdev->dev, "can't alloc PCI BAR #%d\n", GPIO_BAR); + goto disable_pci; + } + + addr = pci_resource_start(pdev, GPIO_BAR); + if (!addr) + goto release_reg; + sd->gpio_pub_base = ioremap(addr, pci_resource_len(pdev, GPIO_BAR)); + + prop = of_get_property(pdev->dev.of_node, "intel,muxctl", &len); + if (prop && len == 4) { + mux_val = of_read_number(prop, 1); + writel(mux_val, sd->gpio_pub_base + GPMUXCTL); + } + + ret = bgpio_init(&sd->bgpio, &pdev->dev, 4, + sd->gpio_pub_base + GPINR, sd->gpio_pub_base + GPOUTR, + NULL, sd->gpio_pub_base + GPOER, NULL, false); + if (ret) + goto unmap; + sd->bgpio.gc.ngpio = SDV_NUM_PUB_GPIOS; + + ret = gpiochip_add(&sd->bgpio.gc); + if (ret < 0) { + dev_err(&pdev->dev, "gpiochip_add() failed.\n"); + goto unmap; + } + + ret = sdv_register_irqsupport(sd, pdev); + if (ret) + goto unmap; + + pci_set_drvdata(pdev, sd); + dev_info(&pdev->dev, "Sodaville GPIO driver registered.\n"); + return 0; + +unmap: + iounmap(sd->gpio_pub_base); +release_reg: + pci_release_region(pdev, GPIO_BAR); +disable_pci: + pci_disable_device(pdev); +done: + kfree(sd); + return ret; +} + +static void sdv_gpio_remove(struct pci_dev *pdev) +{ + struct sdv_gpio_chip_data *sd = pci_get_drvdata(pdev); + + irq_domain_del(&sd->id); + free_irq(pdev->irq, sd); + irq_free_descs(sd->irq_base, SDV_NUM_PUB_GPIOS); + + if (gpiochip_remove(&sd->bgpio.gc)) + dev_err(&pdev->dev, "gpiochip_remove() failed.\n"); + + pci_release_region(pdev, GPIO_BAR); + iounmap(sd->gpio_pub_base); + pci_disable_device(pdev); + kfree(sd); +} + +static struct pci_device_id sdv_gpio_pci_ids[] __devinitdata = { + { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_SDV_GPIO) }, + { 0, }, +}; + +static struct pci_driver sdv_gpio_driver = { + .name = DRV_NAME, + .id_table = sdv_gpio_pci_ids, + .probe = sdv_gpio_probe, + .remove = sdv_gpio_remove, +}; + +static int __init sdv_gpio_init(void) +{ + return pci_register_driver(&sdv_gpio_driver); +} +module_init(sdv_gpio_init); + +static void __exit sdv_gpio_exit(void) +{ + pci_unregister_driver(&sdv_gpio_driver); +} +module_exit(sdv_gpio_exit); + +MODULE_AUTHOR("Hans J. Koch "); +MODULE_DESCRIPTION("GPIO interface for Intel Sodaville SoCs"); +MODULE_LICENSE("GPL v2"); -- cgit v1.2.3 From 1dce27c5aa6770e9d195f2bb7db1db3d4dde5591 Mon Sep 17 00:00:00 2001 From: David Howells Date: Thu, 16 Feb 2012 17:49:42 +0000 Subject: Wrap accesses to the fd_sets in struct fdtable Wrap accesses to the fd_sets in struct fdtable (for recording open files and close-on-exec flags) so that we can move away from using fd_sets since we abuse the fd_set structs by not allocating the full-sized structure under normal circumstances and by non-core code looking at the internals of the fd_sets. The first abuse means that use of FD_ZERO() on these fd_sets is not permitted, since that cannot be told about their abnormal lengths. This introduces six wrapper functions for setting, clearing and testing close-on-exec flags and fd-is-open flags: void __set_close_on_exec(int fd, struct fdtable *fdt); void __clear_close_on_exec(int fd, struct fdtable *fdt); bool close_on_exec(int fd, const struct fdtable *fdt); void __set_open_fd(int fd, struct fdtable *fdt); void __clear_open_fd(int fd, struct fdtable *fdt); bool fd_is_open(int fd, const struct fdtable *fdt); Note that I've prepended '__' to the names of the set/clear functions because they require the caller to hold a lock to use them. Note also that I haven't added wrappers for looking behind the scenes at the the array. Possibly that should exist too. Signed-off-by: David Howells Link: http://lkml.kernel.org/r/20120216174942.23314.1364.stgit@warthog.procyon.org.uk Signed-off-by: H. Peter Anvin Cc: Al Viro --- Documentation/filesystems/files.txt | 4 ++-- arch/powerpc/platforms/cell/spufs/coredump.c | 2 +- drivers/staging/android/binder.c | 10 +++++----- fs/autofs4/dev-ioctl.c | 2 +- fs/exec.c | 4 ++-- fs/fcntl.c | 18 ++++++++--------- fs/file.c | 8 ++++---- fs/open.c | 4 ++-- fs/proc/base.c | 2 +- include/linux/fdtable.h | 30 ++++++++++++++++++++++++++++ 10 files changed, 57 insertions(+), 27 deletions(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/files.txt b/Documentation/filesystems/files.txt index ac2facc50d2a..46dfc6b038c3 100644 --- a/Documentation/filesystems/files.txt +++ b/Documentation/filesystems/files.txt @@ -113,8 +113,8 @@ the fdtable structure - if (fd >= 0) { /* locate_fd() may have expanded fdtable, load the ptr */ fdt = files_fdtable(files); - FD_SET(fd, fdt->open_fds); - FD_CLR(fd, fdt->close_on_exec); + __set_open_fd(fd, fdt); + __clear_close_on_exec(fd, fdt); spin_unlock(&files->file_lock); ..... diff --git a/arch/powerpc/platforms/cell/spufs/coredump.c b/arch/powerpc/platforms/cell/spufs/coredump.c index 03c5fce2a5b3..c2c5b078ba80 100644 --- a/arch/powerpc/platforms/cell/spufs/coredump.c +++ b/arch/powerpc/platforms/cell/spufs/coredump.c @@ -122,7 +122,7 @@ static struct spu_context *coredump_next_context(int *fd) struct spu_context *ctx = NULL; for (; *fd < fdt->max_fds; (*fd)++) { - if (!FD_ISSET(*fd, fdt->open_fds)) + if (!fd_is_open(*fd, fdt)) continue; file = fcheck(*fd); diff --git a/drivers/staging/android/binder.c b/drivers/staging/android/binder.c index 7491801a661c..35dd9c370e55 100644 --- a/drivers/staging/android/binder.c +++ b/drivers/staging/android/binder.c @@ -408,11 +408,11 @@ repeat: goto repeat; } - FD_SET(fd, fdt->open_fds); + __set_open_fd(fd, fdt); if (flags & O_CLOEXEC) - FD_SET(fd, fdt->close_on_exec); + __set_close_on_exec(fd, fdt); else - FD_CLR(fd, fdt->close_on_exec); + __clear_close_on_exec(fd, fdt); files->next_fd = fd + 1; #if 1 /* Sanity check */ @@ -453,7 +453,7 @@ static void task_fd_install( static void __put_unused_fd(struct files_struct *files, unsigned int fd) { struct fdtable *fdt = files_fdtable(files); - __FD_CLR(fd, fdt->open_fds); + __clear_open_fd(fd, fdt); if (fd < files->next_fd) files->next_fd = fd; } @@ -479,7 +479,7 @@ static long task_close_fd(struct binder_proc *proc, unsigned int fd) if (!filp) goto out_unlock; rcu_assign_pointer(fdt->fd[fd], NULL); - FD_CLR(fd, fdt->close_on_exec); + __clear_close_on_exec(fd, fdt); __put_unused_fd(files, fd); spin_unlock(&files->file_lock); retval = filp_close(filp, files); diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c index 76741d8d7786..3dfd615afb6b 100644 --- a/fs/autofs4/dev-ioctl.c +++ b/fs/autofs4/dev-ioctl.c @@ -230,7 +230,7 @@ static void autofs_dev_ioctl_fd_install(unsigned int fd, struct file *file) fdt = files_fdtable(files); BUG_ON(fdt->fd[fd] != NULL); rcu_assign_pointer(fdt->fd[fd], file); - FD_SET(fd, fdt->close_on_exec); + __set_close_on_exec(fd, fdt); spin_unlock(&files->file_lock); } diff --git a/fs/exec.c b/fs/exec.c index 92ce83a11e90..22cc38d9e79f 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -2078,8 +2078,8 @@ static int umh_pipe_setup(struct subprocess_info *info, struct cred *new) fd_install(0, rp); spin_lock(&cf->file_lock); fdt = files_fdtable(cf); - FD_SET(0, fdt->open_fds); - FD_CLR(0, fdt->close_on_exec); + __set_open_fd(0, fdt); + __clear_close_on_exec(0, fdt); spin_unlock(&cf->file_lock); /* and disallow core files too */ diff --git a/fs/fcntl.c b/fs/fcntl.c index 22764c7c8382..75e7c1f3a080 100644 --- a/fs/fcntl.c +++ b/fs/fcntl.c @@ -32,20 +32,20 @@ void set_close_on_exec(unsigned int fd, int flag) spin_lock(&files->file_lock); fdt = files_fdtable(files); if (flag) - FD_SET(fd, fdt->close_on_exec); + __set_close_on_exec(fd, fdt); else - FD_CLR(fd, fdt->close_on_exec); + __clear_close_on_exec(fd, fdt); spin_unlock(&files->file_lock); } -static int get_close_on_exec(unsigned int fd) +static bool get_close_on_exec(unsigned int fd) { struct files_struct *files = current->files; struct fdtable *fdt; - int res; + bool res; rcu_read_lock(); fdt = files_fdtable(files); - res = FD_ISSET(fd, fdt->close_on_exec); + res = close_on_exec(fd, fdt); rcu_read_unlock(); return res; } @@ -90,15 +90,15 @@ SYSCALL_DEFINE3(dup3, unsigned int, oldfd, unsigned int, newfd, int, flags) err = -EBUSY; fdt = files_fdtable(files); tofree = fdt->fd[newfd]; - if (!tofree && FD_ISSET(newfd, fdt->open_fds)) + if (!tofree && fd_is_open(newfd, fdt)) goto out_unlock; get_file(file); rcu_assign_pointer(fdt->fd[newfd], file); - FD_SET(newfd, fdt->open_fds); + __set_open_fd(newfd, fdt); if (flags & O_CLOEXEC) - FD_SET(newfd, fdt->close_on_exec); + __set_close_on_exec(newfd, fdt); else - FD_CLR(newfd, fdt->close_on_exec); + __clear_close_on_exec(newfd, fdt); spin_unlock(&files->file_lock); if (tofree) diff --git a/fs/file.c b/fs/file.c index 4c6992d8f3ba..114fea0a2cec 100644 --- a/fs/file.c +++ b/fs/file.c @@ -366,7 +366,7 @@ struct files_struct *dup_fd(struct files_struct *oldf, int *errorp) * is partway through open(). So make sure that this * fd is available to the new process. */ - FD_CLR(open_files - i, new_fdt->open_fds); + __clear_open_fd(open_files - i, new_fdt); } rcu_assign_pointer(*new_fds++, f); } @@ -460,11 +460,11 @@ repeat: if (start <= files->next_fd) files->next_fd = fd + 1; - FD_SET(fd, fdt->open_fds); + __set_open_fd(fd, fdt); if (flags & O_CLOEXEC) - FD_SET(fd, fdt->close_on_exec); + __set_close_on_exec(fd, fdt); else - FD_CLR(fd, fdt->close_on_exec); + __clear_close_on_exec(fd, fdt); error = fd; #if 1 /* Sanity check */ diff --git a/fs/open.c b/fs/open.c index 77becc041149..5720854156db 100644 --- a/fs/open.c +++ b/fs/open.c @@ -836,7 +836,7 @@ EXPORT_SYMBOL(dentry_open); static void __put_unused_fd(struct files_struct *files, unsigned int fd) { struct fdtable *fdt = files_fdtable(files); - __FD_CLR(fd, fdt->open_fds); + __clear_open_fd(fd, fdt); if (fd < files->next_fd) files->next_fd = fd; } @@ -1080,7 +1080,7 @@ SYSCALL_DEFINE1(close, unsigned int, fd) if (!filp) goto out_unlock; rcu_assign_pointer(fdt->fd[fd], NULL); - FD_CLR(fd, fdt->close_on_exec); + __clear_close_on_exec(fd, fdt); __put_unused_fd(files, fd); spin_unlock(&files->file_lock); retval = filp_close(filp, files); diff --git a/fs/proc/base.c b/fs/proc/base.c index d4548dd49b02..db6ab4b36a0b 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -1754,7 +1754,7 @@ static int proc_fd_info(struct inode *inode, struct path *path, char *info) fdt = files_fdtable(files); f_flags = file->f_flags & ~O_CLOEXEC; - if (FD_ISSET(fd, fdt->close_on_exec)) + if (close_on_exec(fd, fdt)) f_flags |= O_CLOEXEC; if (path) { diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h index 82163c4b32c9..7675da2c18f7 100644 --- a/include/linux/fdtable.h +++ b/include/linux/fdtable.h @@ -38,6 +38,36 @@ struct fdtable { struct fdtable *next; }; +static inline void __set_close_on_exec(int fd, struct fdtable *fdt) +{ + FD_SET(fd, fdt->close_on_exec); +} + +static inline void __clear_close_on_exec(int fd, struct fdtable *fdt) +{ + FD_CLR(fd, fdt->close_on_exec); +} + +static inline bool close_on_exec(int fd, const struct fdtable *fdt) +{ + return FD_ISSET(fd, fdt->close_on_exec); +} + +static inline void __set_open_fd(int fd, struct fdtable *fdt) +{ + FD_SET(fd, fdt->open_fds); +} + +static inline void __clear_open_fd(int fd, struct fdtable *fdt) +{ + FD_CLR(fd, fdt->open_fds); +} + +static inline bool fd_is_open(int fd, const struct fdtable *fdt) +{ + return FD_ISSET(fd, fdt->open_fds); +} + /* * Open file table structure */ -- cgit v1.2.3 From 43e625d84fa7daca0ad46f1dbc965b04fd204afe Mon Sep 17 00:00:00 2001 From: Eric Sandeen Date: Mon, 20 Feb 2012 17:53:04 -0500 Subject: ext4: remove the journal=update mount option The V2 journal format was introduced around ten years ago, for ext3. It seems highly unlikely that anyone will need this migration option for ext4. Signed-off-by: Eric Sandeen Signed-off-by: "Theodore Ts'o" --- Documentation/filesystems/ext4.txt | 3 -- fs/ext4/super.c | 26 +---------------- fs/jbd2/journal.c | 57 -------------------------------------- 3 files changed, 1 insertion(+), 85 deletions(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt index 10ec4639f152..09d5f32ea156 100644 --- a/Documentation/filesystems/ext4.txt +++ b/Documentation/filesystems/ext4.txt @@ -144,9 +144,6 @@ journal_async_commit Commit block can be written to disk without waiting mount the device. This will enable 'journal_checksum' internally. -journal=update Update the ext4 file system's journal to the current - format. - journal_dev=devnum When the external journal device's major/minor numbers have changed, this option allows the user to specify the new journal location. The journal device is diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 3e697ec7feca..9420c50c5c6a 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -1336,8 +1336,7 @@ enum { Opt_user_xattr, Opt_nouser_xattr, Opt_acl, Opt_noacl, Opt_auto_da_alloc, Opt_noauto_da_alloc, Opt_noload, Opt_nobh, Opt_bh, Opt_commit, Opt_min_batch_time, Opt_max_batch_time, - Opt_journal_update, Opt_journal_dev, - Opt_journal_checksum, Opt_journal_async_commit, + Opt_journal_dev, Opt_journal_checksum, Opt_journal_async_commit, Opt_abort, Opt_data_journal, Opt_data_ordered, Opt_data_writeback, Opt_data_err_abort, Opt_data_err_ignore, Opt_usrjquota, Opt_grpjquota, Opt_offusrjquota, Opt_offgrpjquota, @@ -1379,7 +1378,6 @@ static const match_table_t tokens = { {Opt_commit, "commit=%u"}, {Opt_min_batch_time, "min_batch_time=%u"}, {Opt_max_batch_time, "max_batch_time=%u"}, - {Opt_journal_update, "journal=update"}, {Opt_journal_dev, "journal_dev=%u"}, {Opt_journal_checksum, "journal_checksum"}, {Opt_journal_async_commit, "journal_async_commit"}, @@ -1629,19 +1627,6 @@ static int parse_options(char *options, struct super_block *sb, ext4_msg(sb, KERN_ERR, "(no)acl options not supported"); break; #endif - case Opt_journal_update: - /* @@@ FIXME */ - /* Eventually we will want to be able to create - a journal file here. For now, only allow the - user to specify an existing inode to be the - journal file. */ - if (is_remount) { - ext4_msg(sb, KERN_ERR, - "Cannot specify journal on remount"); - return 0; - } - set_opt(sb, UPDATE_JOURNAL); - break; case Opt_journal_dev: if (is_remount) { ext4_msg(sb, KERN_ERR, @@ -4109,15 +4094,6 @@ static int ext4_load_journal(struct super_block *sb, if (!(journal->j_flags & JBD2_BARRIER)) ext4_msg(sb, KERN_INFO, "barriers disabled"); - if (!really_read_only && test_opt(sb, UPDATE_JOURNAL)) { - err = jbd2_journal_update_format(journal); - if (err) { - ext4_msg(sb, KERN_ERR, "error updating journal"); - jbd2_journal_destroy(journal); - return err; - } - } - if (!EXT4_HAS_INCOMPAT_FEATURE(sb, EXT4_FEATURE_INCOMPAT_RECOVER)) err = jbd2_journal_wipe(journal, !really_read_only); if (!err) { diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c index 47e341100c2c..cfb36d99f7a4 100644 --- a/fs/jbd2/journal.c +++ b/fs/jbd2/journal.c @@ -71,7 +71,6 @@ EXPORT_SYMBOL(jbd2_journal_revoke); EXPORT_SYMBOL(jbd2_journal_init_dev); EXPORT_SYMBOL(jbd2_journal_init_inode); -EXPORT_SYMBOL(jbd2_journal_update_format); EXPORT_SYMBOL(jbd2_journal_check_used_features); EXPORT_SYMBOL(jbd2_journal_check_available_features); EXPORT_SYMBOL(jbd2_journal_set_features); @@ -96,7 +95,6 @@ EXPORT_SYMBOL(jbd2_journal_release_jbd_inode); EXPORT_SYMBOL(jbd2_journal_begin_ordered_truncate); EXPORT_SYMBOL(jbd2_inode_cache); -static int journal_convert_superblock_v1(journal_t *, journal_superblock_t *); static void __journal_abort_soft (journal_t *journal, int errno); static int jbd2_journal_create_slab(size_t slab_size); @@ -1551,61 +1549,6 @@ void jbd2_journal_clear_features(journal_t *journal, unsigned long compat, } EXPORT_SYMBOL(jbd2_journal_clear_features); -/** - * int jbd2_journal_update_format () - Update on-disk journal structure. - * @journal: Journal to act on. - * - * Given an initialised but unloaded journal struct, poke about in the - * on-disk structure to update it to the most recent supported version. - */ -int jbd2_journal_update_format (journal_t *journal) -{ - journal_superblock_t *sb; - int err; - - err = journal_get_superblock(journal); - if (err) - return err; - - sb = journal->j_superblock; - - switch (be32_to_cpu(sb->s_header.h_blocktype)) { - case JBD2_SUPERBLOCK_V2: - return 0; - case JBD2_SUPERBLOCK_V1: - return journal_convert_superblock_v1(journal, sb); - default: - break; - } - return -EINVAL; -} - -static int journal_convert_superblock_v1(journal_t *journal, - journal_superblock_t *sb) -{ - int offset, blocksize; - struct buffer_head *bh; - - printk(KERN_WARNING - "JBD2: Converting superblock from version 1 to 2.\n"); - - /* Pre-initialise new fields to zero */ - offset = ((char *) &(sb->s_feature_compat)) - ((char *) sb); - blocksize = be32_to_cpu(sb->s_blocksize); - memset(&sb->s_feature_compat, 0, blocksize-offset); - - sb->s_nr_users = cpu_to_be32(1); - sb->s_header.h_blocktype = cpu_to_be32(JBD2_SUPERBLOCK_V2); - journal->j_format_version = 2; - - bh = journal->j_sb_buffer; - BUFFER_TRACE(bh, "marking dirty"); - mark_buffer_dirty(bh); - sync_dirty_buffer(bh); - return 0; -} - - /** * int jbd2_journal_flush () - Flush journal * @journal: Journal to act on. -- cgit v1.2.3 From 661aa520577046192e50467b28c9c5726a8a9fb1 Mon Sep 17 00:00:00 2001 From: Eric Sandeen Date: Mon, 20 Feb 2012 17:53:04 -0500 Subject: ext4: remove the resize mount option The resize mount option seems to be of limited value, especially in the age of online resize2fs. Nuke it. Signed-off-by: Eric Sandeen Signed-off-by: "Theodore Ts'o" --- Documentation/filesystems/ext4.txt | 5 ----- fs/ext4/super.c | 29 ++++++----------------------- 2 files changed, 6 insertions(+), 28 deletions(-) (limited to 'Documentation') diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt index 09d5f32ea156..990219c58b1a 100644 --- a/Documentation/filesystems/ext4.txt +++ b/Documentation/filesystems/ext4.txt @@ -353,11 +353,6 @@ nouid32 Disables 32-bit UIDs and GIDs. This is for interoperability with older kernels which only store and expect 16-bit values. -resize Allows to resize filesystem to the end of the last - existing block group, further resize has to be done - with resize2fs either online, or offline. It can be - used only with conjunction with remount. - block_validity This options allows to enables/disables the in-kernel noblock_validity facility for tracking filesystem metadata blocks within internal data structures. This allows multi- diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 9420c50c5c6a..deb6eb35ff3d 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -1342,7 +1342,7 @@ enum { Opt_usrjquota, Opt_grpjquota, Opt_offusrjquota, Opt_offgrpjquota, Opt_jqfmt_vfsold, Opt_jqfmt_vfsv0, Opt_jqfmt_vfsv1, Opt_quota, Opt_noquota, Opt_ignore, Opt_barrier, Opt_nobarrier, Opt_err, - Opt_resize, Opt_usrquota, Opt_grpquota, Opt_i_version, + Opt_usrquota, Opt_grpquota, Opt_i_version, Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_mblk_io_submit, Opt_nomblk_io_submit, Opt_block_validity, Opt_noblock_validity, Opt_inode_readahead_blks, Opt_journal_ioprio, @@ -1403,7 +1403,6 @@ static const match_table_t tokens = { {Opt_nobarrier, "nobarrier"}, {Opt_i_version, "i_version"}, {Opt_stripe, "stripe=%u"}, - {Opt_resize, "resize"}, {Opt_delalloc, "delalloc"}, {Opt_nodelalloc, "nodelalloc"}, {Opt_mblk_io_submit, "mblk_io_submit"}, @@ -1513,7 +1512,7 @@ static int clear_qf_name(struct super_block *sb, int qtype) static int parse_options(char *options, struct super_block *sb, unsigned long *journal_devnum, unsigned int *journal_ioprio, - ext4_fsblk_t *n_blocks_count, int is_remount) + int is_remount) { struct ext4_sb_info *sbi = EXT4_SB(sb); char *p; @@ -1794,17 +1793,6 @@ set_qf_format: break; case Opt_ignore: break; - case Opt_resize: - if (!is_remount) { - ext4_msg(sb, KERN_ERR, - "resize option only available " - "for remount"); - return 0; - } - if (match_int(&args[0], &option) != 0) - return 0; - *n_blocks_count = option; - break; case Opt_nobh: ext4_msg(sb, KERN_WARNING, "Ignoring deprecated nobh option"); @@ -3241,13 +3229,13 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) sbi->s_li_wait_mult = EXT4_DEF_LI_WAIT_MULT; if (!parse_options((char *) sbi->s_es->s_mount_opts, sb, - &journal_devnum, &journal_ioprio, NULL, 0)) { + &journal_devnum, &journal_ioprio, 0)) { ext4_msg(sb, KERN_WARNING, "failed to parse options in superblock: %s", sbi->s_es->s_mount_opts); } if (!parse_options((char *) data, sb, &journal_devnum, - &journal_ioprio, NULL, 0)) + &journal_ioprio, 0)) goto failed_mount; if (test_opt(sb, DATA_FLAGS) == EXT4_MOUNT_JOURNAL_DATA) { @@ -4380,7 +4368,6 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data) { struct ext4_super_block *es; struct ext4_sb_info *sbi = EXT4_SB(sb); - ext4_fsblk_t n_blocks_count = 0; unsigned long old_sb_flags; struct ext4_mount_options old_opts; int enable_quota = 0; @@ -4413,8 +4400,7 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data) /* * Allow the "check" option to be passed as a remount option. */ - if (!parse_options(data, sb, NULL, &journal_ioprio, - &n_blocks_count, 1)) { + if (!parse_options(data, sb, NULL, &journal_ioprio, 1)) { err = -EINVAL; goto restore_opts; } @@ -4432,8 +4418,7 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data) set_task_ioprio(sbi->s_journal->j_task, journal_ioprio); } - if ((*flags & MS_RDONLY) != (sb->s_flags & MS_RDONLY) || - n_blocks_count > ext4_blocks_count(es)) { + if ((*flags & MS_RDONLY) != (sb->s_flags & MS_RDONLY)) { if (sbi->s_mount_flags & EXT4_MF_FS_ABORTED) { err = -EROFS; goto restore_opts; @@ -4508,8 +4493,6 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data) if (sbi->s_journal) ext4_clear_journal_err(sb, es); sbi->s_mount_state = le16_to_cpu(es->s_state); - if ((err = ext4_group_extend(sb, es, n_blocks_count))) - goto restore_opts; if (!ext4_setup_super(sb, es, 0)) sb->s_flags &= ~MS_RDONLY; if (EXT4_HAS_INCOMPAT_FEATURE(sb, -- cgit v1.2.3 From e08b96371625aaa84cb03f51acc4c8e0be27403a Mon Sep 17 00:00:00 2001 From: Carsten Otte Date: Wed, 4 Jan 2012 10:25:20 +0100 Subject: KVM: s390: add parameter for KVM_CREATE_VM This patch introduces a new config option for user controlled kernel virtual machines. It introduces a parameter to KVM_CREATE_VM that allows to set bits that alter the capabilities of the newly created virtual machine. The parameter is passed to kvm_arch_init_vm for all architectures. The only valid modifier bit for now is KVM_VM_S390_UCONTROL. This requires CAP_SYS_ADMIN privileges and creates a user controlled virtual machine on s390 architectures. Signed-off-by: Carsten Otte Signed-off-by: Marcelo Tosatti Signed-off-by: Avi Kivity --- Documentation/virtual/kvm/api.txt | 7 ++++++- arch/ia64/kvm/kvm-ia64.c | 5 ++++- arch/powerpc/kvm/powerpc.c | 5 ++++- arch/s390/kvm/Kconfig | 9 +++++++++ arch/s390/kvm/kvm-s390.c | 24 +++++++++++++++++++----- arch/s390/kvm/kvm-s390.h | 10 ++++++++++ arch/x86/kvm/x86.c | 5 ++++- include/linux/kvm.h | 3 +++ include/linux/kvm_host.h | 2 +- virt/kvm/kvm_main.c | 13 +++++-------- 10 files changed, 65 insertions(+), 18 deletions(-) (limited to 'Documentation') diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index e1d94bf4056e..579d40b26a5a 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -95,7 +95,7 @@ described as 'basic' will be available. Capability: basic Architectures: all Type: system ioctl -Parameters: none +Parameters: machine type identifier (KVM_VM_*) Returns: a VM fd that can be used to control the new virtual machine. The new VM has no virtual cpus and no memory. An mmap() of a VM fd @@ -103,6 +103,11 @@ will access the virtual machine's physical address space; offset zero corresponds to guest physical address zero. Use of mmap() on a VM fd is discouraged if userspace memory allocation (KVM_CAP_USER_MEMORY) is available. +You most certainly want to use 0 as machine type. + +In order to create user controlled virtual machines on S390, check +KVM_CAP_S390_UCONTROL and use the flag KVM_VM_S390_UCONTROL as +privileged user (CAP_SYS_ADMIN). 4.3 KVM_GET_MSR_INDEX_LIST diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c index 405052002493..df6b14194051 100644 --- a/arch/ia64/kvm/kvm-ia64.c +++ b/arch/ia64/kvm/kvm-ia64.c @@ -809,10 +809,13 @@ static void kvm_build_io_pmt(struct kvm *kvm) #define GUEST_PHYSICAL_RR4 0x2739 #define VMM_INIT_RR 0x1660 -int kvm_arch_init_vm(struct kvm *kvm) +int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) { BUG_ON(!kvm); + if (type) + return -EINVAL; + kvm->arch.is_sn2 = ia64_platform_is("sn2"); kvm->arch.metaphysical_rr0 = GUEST_PHYSICAL_RR0; diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index 607fbdf24b84..83f244569874 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -171,8 +171,11 @@ void kvm_arch_check_processor_compat(void *rtn) *(int *)rtn = kvmppc_core_check_processor_compat(); } -int kvm_arch_init_vm(struct kvm *kvm) +int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) { + if (type) + return -EINVAL; + return kvmppc_core_init_vm(kvm); } diff --git a/arch/s390/kvm/Kconfig b/arch/s390/kvm/Kconfig index a21634173a66..78eb9847008f 100644 --- a/arch/s390/kvm/Kconfig +++ b/arch/s390/kvm/Kconfig @@ -34,6 +34,15 @@ config KVM If unsure, say N. +config KVM_S390_UCONTROL + bool "Userspace controlled virtual machines" + depends on KVM + ---help--- + Allow CAP_SYS_ADMIN users to create KVM virtual machines that are + controlled by userspace. + + If unsure, say N. + # OK, it's a little counter-intuitive to do this, but it puts it neatly under # the virtualization menu. source drivers/vhost/Kconfig diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index d1c445732451..f0937552175b 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -171,11 +171,22 @@ long kvm_arch_vm_ioctl(struct file *filp, return r; } -int kvm_arch_init_vm(struct kvm *kvm) +int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) { int rc; char debug_name[16]; + rc = -EINVAL; +#ifdef CONFIG_KVM_S390_UCONTROL + if (type & ~KVM_VM_S390_UCONTROL) + goto out_err; + if ((type & KVM_VM_S390_UCONTROL) && (!capable(CAP_SYS_ADMIN))) + goto out_err; +#else + if (type) + goto out_err; +#endif + rc = s390_enable_sie(); if (rc) goto out_err; @@ -198,10 +209,13 @@ int kvm_arch_init_vm(struct kvm *kvm) debug_register_view(kvm->arch.dbf, &debug_sprintf_view); VM_EVENT(kvm, 3, "%s", "vm created"); - kvm->arch.gmap = gmap_alloc(current->mm); - if (!kvm->arch.gmap) - goto out_nogmap; - + if (type & KVM_VM_S390_UCONTROL) { + kvm->arch.gmap = NULL; + } else { + kvm->arch.gmap = gmap_alloc(current->mm); + if (!kvm->arch.gmap) + goto out_nogmap; + } return 0; out_nogmap: debug_unregister(kvm->arch.dbf); diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h index 99b0b7597115..45b236a7c730 100644 --- a/arch/s390/kvm/kvm-s390.h +++ b/arch/s390/kvm/kvm-s390.h @@ -47,6 +47,16 @@ static inline int __cpu_is_stopped(struct kvm_vcpu *vcpu) return atomic_read(&vcpu->arch.sie_block->cpuflags) & CPUSTAT_STOP_INT; } +static inline int kvm_is_ucontrol(struct kvm *kvm) +{ +#ifdef CONFIG_KVM_S390_UCONTROL + if (kvm->arch.gmap) + return 0; + return 1; +#else + return 0; +#endif +} int kvm_s390_handle_wait(struct kvm_vcpu *vcpu); enum hrtimer_restart kvm_s390_idle_wakeup(struct hrtimer *timer); void kvm_s390_tasklet(unsigned long parm); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 9cbfc0698118..06925b4bcc27 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -6031,8 +6031,11 @@ void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) free_page((unsigned long)vcpu->arch.pio_data); } -int kvm_arch_init_vm(struct kvm *kvm) +int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) { + if (type) + return -EINVAL; + INIT_LIST_HEAD(&kvm->arch.active_mmu_pages); INIT_LIST_HEAD(&kvm->arch.assigned_dev_head); diff --git a/include/linux/kvm.h b/include/linux/kvm.h index 68e67e50d028..bba393a6760f 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -431,6 +431,9 @@ struct kvm_ppc_pvinfo { #define KVMIO 0xAE +/* machine type bits, to be used as argument to KVM_CREATE_VM */ +#define KVM_VM_S390_UCONTROL 1 + /* * ioctls for /dev/kvm fds: */ diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 900c76337e8f..82375e145e64 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -520,7 +520,7 @@ static inline void kvm_arch_free_vm(struct kvm *kvm) } #endif -int kvm_arch_init_vm(struct kvm *kvm); +int kvm_arch_init_vm(struct kvm *kvm, unsigned long type); void kvm_arch_destroy_vm(struct kvm *kvm); void kvm_free_all_assigned_devices(struct kvm *kvm); void kvm_arch_sync_events(struct kvm *kvm); diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index a91f980077d8..32e3b048a6cf 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -449,7 +449,7 @@ static void kvm_init_memslots_id(struct kvm *kvm) slots->id_to_index[i] = slots->memslots[i].id = i; } -static struct kvm *kvm_create_vm(void) +static struct kvm *kvm_create_vm(unsigned long type) { int r, i; struct kvm *kvm = kvm_arch_alloc_vm(); @@ -457,7 +457,7 @@ static struct kvm *kvm_create_vm(void) if (!kvm) return ERR_PTR(-ENOMEM); - r = kvm_arch_init_vm(kvm); + r = kvm_arch_init_vm(kvm, type); if (r) goto out_err_nodisable; @@ -2198,12 +2198,12 @@ static struct file_operations kvm_vm_fops = { .llseek = noop_llseek, }; -static int kvm_dev_ioctl_create_vm(void) +static int kvm_dev_ioctl_create_vm(unsigned long type) { int r; struct kvm *kvm; - kvm = kvm_create_vm(); + kvm = kvm_create_vm(type); if (IS_ERR(kvm)) return PTR_ERR(kvm); #ifdef KVM_COALESCED_MMIO_PAGE_OFFSET @@ -2254,10 +2254,7 @@ static long kvm_dev_ioctl(struct file *filp, r = KVM_API_VERSION; break; case KVM_CREATE_VM: - r = -EINVAL; - if (arg) - goto out; - r = kvm_dev_ioctl_create_vm(); + r = kvm_dev_ioctl_create_vm(arg); break; case KVM_CHECK_EXTENSION: r = kvm_dev_ioctl_check_extension_generic(arg); -- cgit v1.2.3 From 27e0393f15fc8bc855c6a888387ff5ffd2181089 Mon Sep 17 00:00:00 2001 From: Carsten Otte Date: Wed, 4 Jan 2012 10:25:21 +0100 Subject: KVM: s390: ucontrol: per vcpu address spaces This patch introduces two ioctls for virtual cpus, that are only valid for kernel virtual machines that are controlled by userspace. Each virtual cpu has its individual address space in this mode of operation, and each address space is backed by the gmap implementation just like the address space for regular KVM guests. KVM_S390_UCAS_MAP allows to map a part of the user's virtual address space to the vcpu. Starting offset and length in both the user and the vcpu address space need to be aligned to 1M. KVM_S390_UCAS_UNMAP can be used to unmap a range of memory from a virtual cpu in a similar way. Signed-off-by: Carsten Otte Signed-off-by: Marcelo Tosatti Signed-off-by: Avi Kivity --- Documentation/virtual/kvm/api.txt | 38 +++++++++++++++++++++++++++++ arch/s390/kvm/kvm-s390.c | 50 ++++++++++++++++++++++++++++++++++++++- include/linux/kvm.h | 10 ++++++++ 3 files changed, 97 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 579d40b26a5a..ee394b263261 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -1496,6 +1496,44 @@ following algorithm: Some guests configure the LINT1 NMI input to cause a panic, aiding in debugging. +4.64 KVM_S390_UCAS_MAP + +Capability: KVM_CAP_S390_UCONTROL +Architectures: s390 +Type: vcpu ioctl +Parameters: struct kvm_s390_ucas_mapping (in) +Returns: 0 in case of success + +The parameter is defined like this: + struct kvm_s390_ucas_mapping { + __u64 user_addr; + __u64 vcpu_addr; + __u64 length; + }; + +This ioctl maps the memory at "user_addr" with the length "length" to +the vcpu's address space starting at "vcpu_addr". All parameters need to +be alligned by 1 megabyte. + +4.65 KVM_S390_UCAS_UNMAP + +Capability: KVM_CAP_S390_UCONTROL +Architectures: s390 +Type: vcpu ioctl +Parameters: struct kvm_s390_ucas_mapping (in) +Returns: 0 in case of success + +The parameter is defined like this: + struct kvm_s390_ucas_mapping { + __u64 user_addr; + __u64 vcpu_addr; + __u64 length; + }; + +This ioctl unmaps the memory in the vcpu's address space starting at +"vcpu_addr" with the length "length". The field "user_addr" is ignored. +All parameters need to be alligned by 1 megabyte. + 5. The kvm_run structure Application code obtains a pointer to the kvm_run structure by diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index f0937552175b..2d3248895def 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -233,6 +233,10 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu) (__u64) vcpu->arch.sie_block) vcpu->kvm->arch.sca->cpu[vcpu->vcpu_id].sda = 0; smp_mb(); + + if (kvm_is_ucontrol(vcpu->kvm)) + gmap_free(vcpu->arch.gmap); + free_page((unsigned long)(vcpu->arch.sie_block)); kvm_vcpu_uninit(vcpu); kfree(vcpu); @@ -263,12 +267,20 @@ void kvm_arch_destroy_vm(struct kvm *kvm) kvm_free_vcpus(kvm); free_page((unsigned long)(kvm->arch.sca)); debug_unregister(kvm->arch.dbf); - gmap_free(kvm->arch.gmap); + if (!kvm_is_ucontrol(kvm)) + gmap_free(kvm->arch.gmap); } /* Section: vcpu related */ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu) { + if (kvm_is_ucontrol(vcpu->kvm)) { + vcpu->arch.gmap = gmap_alloc(current->mm); + if (!vcpu->arch.gmap) + return -ENOMEM; + return 0; + } + vcpu->arch.gmap = vcpu->kvm->arch.gmap; return 0; } @@ -687,6 +699,42 @@ long kvm_arch_vcpu_ioctl(struct file *filp, case KVM_S390_INITIAL_RESET: r = kvm_arch_vcpu_ioctl_initial_reset(vcpu); break; +#ifdef CONFIG_KVM_S390_UCONTROL + case KVM_S390_UCAS_MAP: { + struct kvm_s390_ucas_mapping ucasmap; + + if (copy_from_user(&ucasmap, argp, sizeof(ucasmap))) { + r = -EFAULT; + break; + } + + if (!kvm_is_ucontrol(vcpu->kvm)) { + r = -EINVAL; + break; + } + + r = gmap_map_segment(vcpu->arch.gmap, ucasmap.user_addr, + ucasmap.vcpu_addr, ucasmap.length); + break; + } + case KVM_S390_UCAS_UNMAP: { + struct kvm_s390_ucas_mapping ucasmap; + + if (copy_from_user(&ucasmap, argp, sizeof(ucasmap))) { + r = -EFAULT; + break; + } + + if (!kvm_is_ucontrol(vcpu->kvm)) { + r = -EINVAL; + break; + } + + r = gmap_unmap_segment(vcpu->arch.gmap, ucasmap.vcpu_addr, + ucasmap.length); + break; + } +#endif default: r = -EINVAL; } diff --git a/include/linux/kvm.h b/include/linux/kvm.h index bba393a6760f..0a66c1072691 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -658,6 +658,16 @@ struct kvm_clock_data { struct kvm_userspace_memory_region) #define KVM_SET_TSS_ADDR _IO(KVMIO, 0x47) #define KVM_SET_IDENTITY_MAP_ADDR _IOW(KVMIO, 0x48, __u64) + +/* enable ucontrol for s390 */ +struct kvm_s390_ucas_mapping { + __u64 user_addr; + __u64 vcpu_addr; + __u64 length; +}; +#define KVM_S390_UCAS_MAP _IOW(KVMIO, 0x50, struct kvm_s390_ucas_mapping) +#define KVM_S390_UCAS_UNMAP _IOW(KVMIO, 0x51, struct kvm_s390_ucas_mapping) + /* Device model IOC */ #define KVM_CREATE_IRQCHIP _IO(KVMIO, 0x60) #define KVM_IRQ_LINE _IOW(KVMIO, 0x61, struct kvm_irq_level) -- cgit v1.2.3 From e168bf8de33e16a909df2401af1f7d419c5780de Mon Sep 17 00:00:00 2001 From: Carsten Otte Date: Wed, 4 Jan 2012 10:25:22 +0100 Subject: KVM: s390: ucontrol: export page faults to user This patch introduces a new exit reason in the kvm_run structure named KVM_EXIT_S390_UCONTROL. This exit indicates, that a virtual cpu has regognized a fault on the host page table. The idea is that userspace can handle this fault by mapping memory at the fault location into the cpu's address space and then continue to run the virtual cpu. Signed-off-by: Carsten Otte Signed-off-by: Marcelo Tosatti Signed-off-by: Avi Kivity --- Documentation/virtual/kvm/api.txt | 14 ++++++++++++++ arch/s390/kvm/kvm-s390.c | 32 +++++++++++++++++++++++++++----- arch/s390/kvm/kvm-s390.h | 1 + include/linux/kvm.h | 6 ++++++ 4 files changed, 48 insertions(+), 5 deletions(-) (limited to 'Documentation') diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index ee394b263261..6e53ff51422f 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -1694,6 +1694,20 @@ s390 specific. s390 specific. + /* KVM_EXIT_S390_UCONTROL */ + struct { + __u64 trans_exc_code; + __u32 pgm_code; + } s390_ucontrol; + +s390 specific. A page fault has occurred for a user controlled virtual +machine (KVM_VM_S390_UNCONTROL) on it's host page table that cannot be +resolved by the kernel. +The program code and the translation exception code that were placed +in the cpu's lowcore are presented here as defined by the z Architecture +Principles of Operation Book in the Chapter for Dynamic Address Translation +(DAT) + /* KVM_EXIT_DCR */ struct { __u32 dcrn; diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 2d3248895def..af05328aca25 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -493,8 +493,10 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, return -EINVAL; /* not implemented yet */ } -static void __vcpu_run(struct kvm_vcpu *vcpu) +static int __vcpu_run(struct kvm_vcpu *vcpu) { + int rc; + memcpy(&vcpu->arch.sie_block->gg14, &vcpu->arch.guest_gprs[14], 16); if (need_resched()) @@ -511,9 +513,15 @@ static void __vcpu_run(struct kvm_vcpu *vcpu) local_irq_enable(); VCPU_EVENT(vcpu, 6, "entering sie flags %x", atomic_read(&vcpu->arch.sie_block->cpuflags)); - if (sie64a(vcpu->arch.sie_block, vcpu->arch.guest_gprs)) { - VCPU_EVENT(vcpu, 3, "%s", "fault in sie instruction"); - kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING); + rc = sie64a(vcpu->arch.sie_block, vcpu->arch.guest_gprs); + if (rc) { + if (kvm_is_ucontrol(vcpu->kvm)) { + rc = SIE_INTERCEPT_UCONTROL; + } else { + VCPU_EVENT(vcpu, 3, "%s", "fault in sie instruction"); + kvm_s390_inject_program_int(vcpu, PGM_ADDRESSING); + rc = 0; + } } VCPU_EVENT(vcpu, 6, "exit sie icptcode %d", vcpu->arch.sie_block->icptcode); @@ -522,6 +530,7 @@ static void __vcpu_run(struct kvm_vcpu *vcpu) local_irq_enable(); memcpy(&vcpu->arch.guest_gprs[14], &vcpu->arch.sie_block->gg14, 16); + return rc; } int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run) @@ -542,6 +551,7 @@ rerun_vcpu: case KVM_EXIT_UNKNOWN: case KVM_EXIT_INTR: case KVM_EXIT_S390_RESET: + case KVM_EXIT_S390_UCONTROL: break; default: BUG(); @@ -553,7 +563,9 @@ rerun_vcpu: might_fault(); do { - __vcpu_run(vcpu); + rc = __vcpu_run(vcpu); + if (rc) + break; rc = kvm_handle_sie_intercept(vcpu); } while (!signal_pending(current) && !rc); @@ -565,6 +577,16 @@ rerun_vcpu: rc = -EINTR; } +#ifdef CONFIG_KVM_S390_UCONTROL + if (rc == SIE_INTERCEPT_UCONTROL) { + kvm_run->exit_reason = KVM_EXIT_S390_UCONTROL; + kvm_run->s390_ucontrol.trans_exc_code = + current->thread.gmap_addr; + kvm_run->s390_ucontrol.pgm_code = 0x10; + rc = 0; + } +#endif + if (rc == -EOPNOTSUPP) { /* intercept cannot be handled in-kernel, prepare kvm-run */ kvm_run->exit_reason = KVM_EXIT_S390_SIEIC; diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h index 45b236a7c730..62aa5f19bb98 100644 --- a/arch/s390/kvm/kvm-s390.h +++ b/arch/s390/kvm/kvm-s390.h @@ -26,6 +26,7 @@ typedef int (*intercept_handler_t)(struct kvm_vcpu *vcpu); /* negativ values are error codes, positive values for internal conditions */ #define SIE_INTERCEPT_RERUNVCPU (1<<0) +#define SIE_INTERCEPT_UCONTROL (1<<1) int kvm_handle_sie_intercept(struct kvm_vcpu *vcpu); #define VM_EVENT(d_kvm, d_loglevel, d_string, d_args...)\ diff --git a/include/linux/kvm.h b/include/linux/kvm.h index 0a66c1072691..7f686f6708b0 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -162,6 +162,7 @@ struct kvm_pit_config { #define KVM_EXIT_INTERNAL_ERROR 17 #define KVM_EXIT_OSI 18 #define KVM_EXIT_PAPR_HCALL 19 +#define KVM_EXIT_S390_UCONTROL 20 /* For KVM_EXIT_INTERNAL_ERROR */ #define KVM_INTERNAL_ERROR_EMULATION 1 @@ -249,6 +250,11 @@ struct kvm_run { #define KVM_S390_RESET_CPU_INIT 8 #define KVM_S390_RESET_IPL 16 __u64 s390_reset_flags; + /* KVM_EXIT_S390_UCONTROL */ + struct { + __u64 trans_exc_code; + __u32 pgm_code; + } s390_ucontrol; /* KVM_EXIT_DCR */ struct { __u32 dcrn; -- cgit v1.2.3 From 5b1c1493afe8d69909f9df3221bb2fffdf479f4a Mon Sep 17 00:00:00 2001 From: Carsten Otte Date: Wed, 4 Jan 2012 10:25:23 +0100 Subject: KVM: s390: ucontrol: export SIE control block to user This patch exports the s390 SIE hardware control block to userspace via the mapping of the vcpu file descriptor. In order to do so, a new arch callback named kvm_arch_vcpu_fault is introduced for all architectures. It allows to map architecture specific pages. Signed-off-by: Carsten Otte Signed-off-by: Marcelo Tosatti Signed-off-by: Avi Kivity --- Documentation/virtual/kvm/api.txt | 5 +++++ arch/ia64/kvm/kvm-ia64.c | 5 +++++ arch/powerpc/kvm/powerpc.c | 5 +++++ arch/s390/kvm/kvm-s390.c | 13 +++++++++++++ arch/x86/kvm/x86.c | 5 +++++ include/linux/kvm.h | 2 ++ include/linux/kvm_host.h | 1 + virt/kvm/kvm_main.c | 2 +- 8 files changed, 37 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 6e53ff51422f..5ebf47d99e56 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -218,6 +218,11 @@ allocation of vcpu ids. For example, if userspace wants single-threaded guest vcpus, it should make all vcpu ids be a multiple of the number of vcpus per vcore. +For virtual cpus that have been created with S390 user controlled virtual +machines, the resulting vcpu fd can be memory mapped at page offset +KVM_S390_SIE_PAGE_OFFSET in order to obtain a memory map of the virtual +cpu's hardware control block. + 4.8 KVM_GET_DIRTY_LOG (vm ioctl) Capability: basic diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c index df6b14194051..8ca7261e7b3d 100644 --- a/arch/ia64/kvm/kvm-ia64.c +++ b/arch/ia64/kvm/kvm-ia64.c @@ -1566,6 +1566,11 @@ out: return r; } +int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf) +{ + return VM_FAULT_SIGBUS; +} + int kvm_arch_prepare_memory_region(struct kvm *kvm, struct kvm_memory_slot *memslot, struct kvm_memory_slot old, diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index 83f244569874..a5671616af86 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -659,6 +659,11 @@ out: return r; } +int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf) +{ + return VM_FAULT_SIGBUS; +} + static int kvm_vm_ioctl_get_pvinfo(struct kvm_ppc_pvinfo *pvinfo) { u32 inst_lis = 0x3c000000; diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index af05328aca25..d6bc65aeb950 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -763,6 +763,19 @@ long kvm_arch_vcpu_ioctl(struct file *filp, return r; } +int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf) +{ +#ifdef CONFIG_KVM_S390_UCONTROL + if ((vmf->pgoff == KVM_S390_SIE_PAGE_OFFSET) + && (kvm_is_ucontrol(vcpu->kvm))) { + vmf->page = virt_to_page(vcpu->arch.sie_block); + get_page(vmf->page); + return 0; + } +#endif + return VM_FAULT_SIGBUS; +} + /* Section: memory related */ int kvm_arch_prepare_memory_region(struct kvm *kvm, struct kvm_memory_slot *memslot, diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 06925b4bcc27..a3ce196d21fe 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2814,6 +2814,11 @@ out: return r; } +int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf) +{ + return VM_FAULT_SIGBUS; +} + static int kvm_vm_ioctl_set_tss_addr(struct kvm *kvm, unsigned long addr) { int ret; diff --git a/include/linux/kvm.h b/include/linux/kvm.h index 7f686f6708b0..8f888df206a2 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -440,6 +440,8 @@ struct kvm_ppc_pvinfo { /* machine type bits, to be used as argument to KVM_CREATE_VM */ #define KVM_VM_S390_UCONTROL 1 +#define KVM_S390_SIE_PAGE_OFFSET 1 + /* * ioctls for /dev/kvm fds: */ diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 82375e145e64..d4d4d7092110 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -450,6 +450,7 @@ long kvm_arch_dev_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg); long kvm_arch_vcpu_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg); +int kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf); int kvm_dev_ioctl_check_extension(long ext); diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 32e3b048a6cf..64be836f3348 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1657,7 +1657,7 @@ static int kvm_vcpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf) page = virt_to_page(vcpu->kvm->coalesced_mmio_ring); #endif else - return VM_FAULT_SIGBUS; + return kvm_arch_vcpu_fault(vcpu, vmf); get_page(page); vmf->page = page; return 0; -- cgit v1.2.3 From ccc7910fe564d99415def7c041fa261e62a43011 Mon Sep 17 00:00:00 2001 From: Carsten Otte Date: Wed, 4 Jan 2012 10:25:26 +0100 Subject: KVM: s390: ucontrol: interface to inject faults on a vcpu page table This patch allows the user to fault in pages on a virtual cpus address space for user controlled virtual machines. Typically this is superfluous because userspace can just create a mapping and let the kernel's page fault logic take are of it. There is one exception: SIE won't start if the lowcore is not present. Normally the kernel takes care of this [handle_validity() in arch/s390/kvm/intercept.c] but since the kernel does not handle intercepts for user controlled virtual machines, userspace needs to be able to handle this condition. Signed-off-by: Carsten Otte Signed-off-by: Marcelo Tosatti Signed-off-by: Avi Kivity --- Documentation/virtual/kvm/api.txt | 16 ++++++++++++++++ arch/s390/kvm/kvm-s390.c | 6 ++++++ include/linux/kvm.h | 1 + 3 files changed, 23 insertions(+) (limited to 'Documentation') diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 5ebf47d99e56..a67fb35993fa 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -1539,6 +1539,22 @@ This ioctl unmaps the memory in the vcpu's address space starting at "vcpu_addr" with the length "length". The field "user_addr" is ignored. All parameters need to be alligned by 1 megabyte. +4.66 KVM_S390_VCPU_FAULT + +Capability: KVM_CAP_S390_UCONTROL +Architectures: s390 +Type: vcpu ioctl +Parameters: vcpu absolute address (in) +Returns: 0 in case of success + +This call creates a page table entry on the virtual cpu's address space +(for user controlled virtual machines) or the virtual machine's address +space (for regular virtual machines). This only works for minor faults, +thus it's recommended to access subject memory page via the user page +table upfront. This is useful to handle validity intercepts for user +controlled virtual machines to fault in the virtual cpu's lowcore pages +prior to calling the KVM_RUN ioctl. + 5. The kvm_run structure Application code obtains a pointer to the kvm_run structure by diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 5b5c28e471df..8489edf80c89 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -761,6 +761,12 @@ long kvm_arch_vcpu_ioctl(struct file *filp, break; } #endif + case KVM_S390_VCPU_FAULT: { + r = gmap_fault(arg, vcpu->arch.gmap); + if (!IS_ERR_VALUE(r)) + r = 0; + break; + } default: r = -EINVAL; } diff --git a/include/linux/kvm.h b/include/linux/kvm.h index 8f888df206a2..778e748927b4 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -675,6 +675,7 @@ struct kvm_s390_ucas_mapping { }; #define KVM_S390_UCAS_MAP _IOW(KVMIO, 0x50, struct kvm_s390_ucas_mapping) #define KVM_S390_UCAS_UNMAP _IOW(KVMIO, 0x51, struct kvm_s390_ucas_mapping) +#define KVM_S390_VCPU_FAULT _IOW(KVMIO, 0x52, unsigned long) /* Device model IOC */ #define KVM_CREATE_IRQCHIP _IO(KVMIO, 0x60) -- cgit v1.2.3 From b9e5dc8d4511e6a00862a795319569e7fe7f60f4 Mon Sep 17 00:00:00 2001 From: Christian Borntraeger Date: Wed, 11 Jan 2012 11:20:30 +0100 Subject: KVM: provide synchronous registers in kvm_run On some cpus the overhead for virtualization instructions is in the same range as a system call. Having to call multiple ioctls to get set registers will make certain userspace handled exits more expensive than necessary. Lets provide a section in kvm_run that works as a shared save area for guest registers. We also provide two 64bit flags fields (architecture specific), that will specify 1. which parts of these fields are valid. 2. which registers were modified by userspace Each bit for these flag fields will define a group of registers (like general purpose) or a single register. Signed-off-by: Christian Borntraeger Signed-off-by: Marcelo Tosatti Signed-off-by: Avi Kivity --- Documentation/virtual/kvm/api.txt | 23 +++++++++++++++++++++++ arch/ia64/include/asm/kvm.h | 4 ++++ arch/powerpc/include/asm/kvm.h | 4 ++++ arch/s390/include/asm/kvm.h | 3 +++ arch/x86/include/asm/kvm.h | 4 ++++ include/linux/kvm.h | 15 +++++++++++++++ 6 files changed, 53 insertions(+) (limited to 'Documentation') diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index a67fb35993fa..7ca696227d3a 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -1771,6 +1771,29 @@ developer registration required to access it). /* Fix the size of the union. */ char padding[256]; }; + + /* + * shared registers between kvm and userspace. + * kvm_valid_regs specifies the register classes set by the host + * kvm_dirty_regs specified the register classes dirtied by userspace + * struct kvm_sync_regs is architecture specific, as well as the + * bits for kvm_valid_regs and kvm_dirty_regs + */ + __u64 kvm_valid_regs; + __u64 kvm_dirty_regs; + union { + struct kvm_sync_regs regs; + char padding[1024]; + } s; + +If KVM_CAP_SYNC_REGS is defined, these fields allow userspace to access +certain guest registers without having to call SET/GET_*REGS. Thus we can +avoid some system call overhead if userspace has to handle the exit. +Userspace can query the validity of the structure by checking +kvm_valid_regs for specific bits. These bits are architecture specific +and usually define the validity of a groups of registers. (e.g. one bit + for general purpose registers) + }; 6. Capabilities that can be enabled diff --git a/arch/ia64/include/asm/kvm.h b/arch/ia64/include/asm/kvm.h index bc90c75adf67..b9f82c84f093 100644 --- a/arch/ia64/include/asm/kvm.h +++ b/arch/ia64/include/asm/kvm.h @@ -261,4 +261,8 @@ struct kvm_debug_exit_arch { struct kvm_guest_debug_arch { }; +/* definition of registers in kvm_run */ +struct kvm_sync_regs { +}; + #endif diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h index f7727d91ac6b..7d9d4de057ef 100644 --- a/arch/powerpc/include/asm/kvm.h +++ b/arch/powerpc/include/asm/kvm.h @@ -265,6 +265,10 @@ struct kvm_debug_exit_arch { struct kvm_guest_debug_arch { }; +/* definition of registers in kvm_run */ +struct kvm_sync_regs { +}; + #define KVM_REG_MASK 0x001f #define KVM_REG_EXT_MASK 0xffe0 #define KVM_REG_GPR 0x0000 diff --git a/arch/s390/include/asm/kvm.h b/arch/s390/include/asm/kvm.h index 82b32a100c7d..325560afb77e 100644 --- a/arch/s390/include/asm/kvm.h +++ b/arch/s390/include/asm/kvm.h @@ -41,4 +41,7 @@ struct kvm_debug_exit_arch { struct kvm_guest_debug_arch { }; +/* definition of registers in kvm_run */ +struct kvm_sync_regs { +}; #endif diff --git a/arch/x86/include/asm/kvm.h b/arch/x86/include/asm/kvm.h index 4d8dcbdfc120..e7d1c194d272 100644 --- a/arch/x86/include/asm/kvm.h +++ b/arch/x86/include/asm/kvm.h @@ -321,4 +321,8 @@ struct kvm_xcrs { __u64 padding[16]; }; +/* definition of registers in kvm_run */ +struct kvm_sync_regs { +}; + #endif /* _ASM_X86_KVM_H */ diff --git a/include/linux/kvm.h b/include/linux/kvm.h index 6cf048d9604b..245bcb3a0fcd 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -279,6 +279,20 @@ struct kvm_run { /* Fix the size of the union. */ char padding[256]; }; + + /* + * shared registers between kvm and userspace. + * kvm_valid_regs specifies the register classes set by the host + * kvm_dirty_regs specified the register classes dirtied by userspace + * struct kvm_sync_regs is architecture specific, as well as the + * bits for kvm_valid_regs and kvm_dirty_regs + */ + __u64 kvm_valid_regs; + __u64 kvm_dirty_regs; + union { + struct kvm_sync_regs regs; + char padding[1024]; + } s; }; /* for KVM_REGISTER_COALESCED_MMIO / KVM_UNREGISTER_COALESCED_MMIO */ @@ -570,6 +584,7 @@ struct kvm_ppc_pvinfo { #define KVM_CAP_S390_GMAP 71 #define KVM_CAP_TSC_DEADLINE_TIMER 72 #define KVM_CAP_S390_UCONTROL 73 +#define KVM_CAP_SYNC_REGS 74 #ifdef KVM_CAP_IRQ_ROUTING -- cgit v1.2.3 From dc83b8bc0256ee682506ed83853a98eaba529c6f Mon Sep 17 00:00:00 2001 From: Scott Wood Date: Thu, 18 Aug 2011 15:25:21 -0500 Subject: KVM: PPC: e500: MMU API This implements a shared-memory API for giving host userspace access to the guest's TLB. Signed-off-by: Scott Wood Signed-off-by: Alexander Graf Signed-off-by: Avi Kivity --- Documentation/virtual/kvm/api.txt | 74 +++++++ arch/powerpc/include/asm/kvm.h | 35 ++++ arch/powerpc/include/asm/kvm_e500.h | 24 +-- arch/powerpc/include/asm/kvm_ppc.h | 5 + arch/powerpc/kvm/e500.c | 5 +- arch/powerpc/kvm/e500_emulate.c | 12 +- arch/powerpc/kvm/e500_tlb.c | 393 ++++++++++++++++++++++++------------ arch/powerpc/kvm/e500_tlb.h | 38 ++-- arch/powerpc/kvm/powerpc.c | 28 +++ include/linux/kvm.h | 18 ++ 10 files changed, 469 insertions(+), 163 deletions(-) (limited to 'Documentation') diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 7ca696227d3a..bcd45d5afcab 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -1409,6 +1409,38 @@ The following flags are defined: If datamatch flag is set, the event will be signaled only if the written value to the registered address is equal to datamatch in struct kvm_ioeventfd. +4.59 KVM_DIRTY_TLB + +Capability: KVM_CAP_SW_TLB +Architectures: ppc +Type: vcpu ioctl +Parameters: struct kvm_dirty_tlb (in) +Returns: 0 on success, -1 on error + +struct kvm_dirty_tlb { + __u64 bitmap; + __u32 num_dirty; +}; + +This must be called whenever userspace has changed an entry in the shared +TLB, prior to calling KVM_RUN on the associated vcpu. + +The "bitmap" field is the userspace address of an array. This array +consists of a number of bits, equal to the total number of TLB entries as +determined by the last successful call to KVM_CONFIG_TLB, rounded up to the +nearest multiple of 64. + +Each bit corresponds to one TLB entry, ordered the same as in the shared TLB +array. + +The array is little-endian: the bit 0 is the least significant bit of the +first byte, bit 8 is the least significant bit of the second byte, etc. +This avoids any complications with differing word sizes. + +The "num_dirty" field is a performance hint for KVM to determine whether it +should skip processing the bitmap and just invalidate everything. It must +be set to the number of set bits in the bitmap. + 4.62 KVM_CREATE_SPAPR_TCE Capability: KVM_CAP_SPAPR_TCE @@ -1842,3 +1874,45 @@ HTAB address part of SDR1 contains an HVA instead of a GPA, as PAPR keeps the HTAB invisible to the guest. When this capability is enabled, KVM_EXIT_PAPR_HCALL can occur. + +6.3 KVM_CAP_SW_TLB + +Architectures: ppc +Parameters: args[0] is the address of a struct kvm_config_tlb +Returns: 0 on success; -1 on error + +struct kvm_config_tlb { + __u64 params; + __u64 array; + __u32 mmu_type; + __u32 array_len; +}; + +Configures the virtual CPU's TLB array, establishing a shared memory area +between userspace and KVM. The "params" and "array" fields are userspace +addresses of mmu-type-specific data structures. The "array_len" field is an +safety mechanism, and should be set to the size in bytes of the memory that +userspace has reserved for the array. It must be at least the size dictated +by "mmu_type" and "params". + +While KVM_RUN is active, the shared region is under control of KVM. Its +contents are undefined, and any modification by userspace results in +boundedly undefined behavior. + +On return from KVM_RUN, the shared region will reflect the current state of +the guest's TLB. If userspace makes any changes, it must call KVM_DIRTY_TLB +to tell KVM which entries have been changed, prior to calling KVM_RUN again +on this vcpu. + +For mmu types KVM_MMU_FSL_BOOKE_NOHV and KVM_MMU_FSL_BOOKE_HV: + - The "params" field is of type "struct kvm_book3e_206_tlb_params". + - The "array" field points to an array of type "struct + kvm_book3e_206_tlb_entry". + - The array consists of all entries in the first TLB, followed by all + entries in the second TLB. + - Within a TLB, entries are ordered first by increasing set number. Within a + set, entries are ordered by way (increasing ESEL). + - The hash for determining set number in TLB0 is: (MAS2 >> 12) & (num_sets - 1) + where "num_sets" is the tlb_sizes[] value divided by the tlb_ways[] value. + - The tsize field of mas1 shall be set to 4K on TLB0, even though the + hardware ignores this value for TLB0. diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h index 7d9d4de057ef..663c57f87165 100644 --- a/arch/powerpc/include/asm/kvm.h +++ b/arch/powerpc/include/asm/kvm.h @@ -296,4 +296,39 @@ struct kvm_allocate_rma { __u64 rma_size; }; +struct kvm_book3e_206_tlb_entry { + __u32 mas8; + __u32 mas1; + __u64 mas2; + __u64 mas7_3; +}; + +struct kvm_book3e_206_tlb_params { + /* + * For mmu types KVM_MMU_FSL_BOOKE_NOHV and KVM_MMU_FSL_BOOKE_HV: + * + * - The number of ways of TLB0 must be a power of two between 2 and + * 16. + * - TLB1 must be fully associative. + * - The size of TLB0 must be a multiple of the number of ways, and + * the number of sets must be a power of two. + * - The size of TLB1 may not exceed 64 entries. + * - TLB0 supports 4 KiB pages. + * - The page sizes supported by TLB1 are as indicated by + * TLB1CFG (if MMUCFG[MAVN] = 0) or TLB1PS (if MMUCFG[MAVN] = 1) + * as returned by KVM_GET_SREGS. + * - TLB2 and TLB3 are reserved, and their entries in tlb_sizes[] + * and tlb_ways[] must be zero. + * + * tlb_ways[n] = tlb_sizes[n] means the array is fully associative. + * + * KVM will adjust TLBnCFG based on the sizes configured here, + * though arrays greater than 2048 entries will have TLBnCFG[NENTRY] + * set to zero. + */ + __u32 tlb_sizes[4]; + __u32 tlb_ways[4]; + __u32 reserved[8]; +}; + #endif /* __LINUX_KVM_POWERPC_H */ diff --git a/arch/powerpc/include/asm/kvm_e500.h b/arch/powerpc/include/asm/kvm_e500.h index a5197d816ec7..bc17441535f2 100644 --- a/arch/powerpc/include/asm/kvm_e500.h +++ b/arch/powerpc/include/asm/kvm_e500.h @@ -22,13 +22,6 @@ #define E500_PID_NUM 3 #define E500_TLB_NUM 2 -struct tlbe{ - u32 mas1; - u32 mas2; - u32 mas3; - u32 mas7; -}; - #define E500_TLB_VALID 1 #define E500_TLB_DIRTY 2 @@ -48,13 +41,17 @@ struct kvmppc_e500_tlb_params { }; struct kvmppc_vcpu_e500 { - /* Unmodified copy of the guest's TLB. */ - struct tlbe *gtlb_arch[E500_TLB_NUM]; + /* Unmodified copy of the guest's TLB -- shared with host userspace. */ + struct kvm_book3e_206_tlb_entry *gtlb_arch; + + /* Starting entry number in gtlb_arch[] */ + int gtlb_offset[E500_TLB_NUM]; /* KVM internal information associated with each guest TLB entry */ struct tlbe_priv *gtlb_priv[E500_TLB_NUM]; - unsigned int gtlb_size[E500_TLB_NUM]; + struct kvmppc_e500_tlb_params gtlb_params[E500_TLB_NUM]; + unsigned int gtlb_nv[E500_TLB_NUM]; /* @@ -68,7 +65,6 @@ struct kvmppc_vcpu_e500 { * and back, and our host TLB entries got evicted). */ struct tlbe_ref *tlb_refs[E500_TLB_NUM]; - unsigned int host_tlb1_nv; u32 host_pid[E500_PID_NUM]; @@ -78,11 +74,10 @@ struct kvmppc_vcpu_e500 { u32 mas0; u32 mas1; u32 mas2; - u32 mas3; + u64 mas7_3; u32 mas4; u32 mas5; u32 mas6; - u32 mas7; /* vcpu id table */ struct vcpu_id_table *idt; @@ -95,6 +90,9 @@ struct kvmppc_vcpu_e500 { u32 tlb1cfg; u64 mcar; + struct page **shared_tlb_pages; + int num_shared_tlb_pages; + struct kvm_vcpu vcpu; }; diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index 46efd1a265c9..a284f209e2df 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -193,4 +193,9 @@ static inline void kvm_rma_init(void) {} #endif +int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu, + struct kvm_config_tlb *cfg); +int kvm_vcpu_ioctl_dirty_tlb(struct kvm_vcpu *vcpu, + struct kvm_dirty_tlb *cfg); + #endif /* __POWERPC_KVM_PPC_H__ */ diff --git a/arch/powerpc/kvm/e500.c b/arch/powerpc/kvm/e500.c index 8c0d45a6faf7..f17d7e732a1e 100644 --- a/arch/powerpc/kvm/e500.c +++ b/arch/powerpc/kvm/e500.c @@ -121,7 +121,7 @@ void kvmppc_core_get_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs) sregs->u.e.mas0 = vcpu_e500->mas0; sregs->u.e.mas1 = vcpu_e500->mas1; sregs->u.e.mas2 = vcpu_e500->mas2; - sregs->u.e.mas7_3 = ((u64)vcpu_e500->mas7 << 32) | vcpu_e500->mas3; + sregs->u.e.mas7_3 = vcpu_e500->mas7_3; sregs->u.e.mas4 = vcpu_e500->mas4; sregs->u.e.mas6 = vcpu_e500->mas6; @@ -154,8 +154,7 @@ int kvmppc_core_set_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs) vcpu_e500->mas0 = sregs->u.e.mas0; vcpu_e500->mas1 = sregs->u.e.mas1; vcpu_e500->mas2 = sregs->u.e.mas2; - vcpu_e500->mas7 = sregs->u.e.mas7_3 >> 32; - vcpu_e500->mas3 = (u32)sregs->u.e.mas7_3; + vcpu_e500->mas7_3 = sregs->u.e.mas7_3; vcpu_e500->mas4 = sregs->u.e.mas4; vcpu_e500->mas6 = sregs->u.e.mas6; } diff --git a/arch/powerpc/kvm/e500_emulate.c b/arch/powerpc/kvm/e500_emulate.c index d48ae396f41e..e0d36099c756 100644 --- a/arch/powerpc/kvm/e500_emulate.c +++ b/arch/powerpc/kvm/e500_emulate.c @@ -95,13 +95,17 @@ int kvmppc_core_emulate_mtspr(struct kvm_vcpu *vcpu, int sprn, int rs) case SPRN_MAS2: vcpu_e500->mas2 = spr_val; break; case SPRN_MAS3: - vcpu_e500->mas3 = spr_val; break; + vcpu_e500->mas7_3 &= ~(u64)0xffffffff; + vcpu_e500->mas7_3 |= spr_val; + break; case SPRN_MAS4: vcpu_e500->mas4 = spr_val; break; case SPRN_MAS6: vcpu_e500->mas6 = spr_val; break; case SPRN_MAS7: - vcpu_e500->mas7 = spr_val; break; + vcpu_e500->mas7_3 &= (u64)0xffffffff; + vcpu_e500->mas7_3 |= (u64)spr_val << 32; + break; case SPRN_L1CSR0: vcpu_e500->l1csr0 = spr_val; vcpu_e500->l1csr0 &= ~(L1CSR0_DCFI | L1CSR0_CLFC); @@ -158,13 +162,13 @@ int kvmppc_core_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, int rt) case SPRN_MAS2: kvmppc_set_gpr(vcpu, rt, vcpu_e500->mas2); break; case SPRN_MAS3: - kvmppc_set_gpr(vcpu, rt, vcpu_e500->mas3); break; + kvmppc_set_gpr(vcpu, rt, (u32)vcpu_e500->mas7_3); break; case SPRN_MAS4: kvmppc_set_gpr(vcpu, rt, vcpu_e500->mas4); break; case SPRN_MAS6: kvmppc_set_gpr(vcpu, rt, vcpu_e500->mas6); break; case SPRN_MAS7: - kvmppc_set_gpr(vcpu, rt, vcpu_e500->mas7); break; + kvmppc_set_gpr(vcpu, rt, vcpu_e500->mas7_3 >> 32); break; case SPRN_TLB0CFG: kvmppc_set_gpr(vcpu, rt, vcpu_e500->tlb0cfg); break; diff --git a/arch/powerpc/kvm/e500_tlb.c b/arch/powerpc/kvm/e500_tlb.c index 59221bb1e00e..f19ae2f61521 100644 --- a/arch/powerpc/kvm/e500_tlb.c +++ b/arch/powerpc/kvm/e500_tlb.c @@ -19,6 +19,11 @@ #include #include #include +#include +#include +#include +#include +#include #include #include @@ -66,6 +71,13 @@ static DEFINE_PER_CPU(unsigned long, pcpu_last_used_sid); static struct kvmppc_e500_tlb_params host_tlb_params[E500_TLB_NUM]; +static struct kvm_book3e_206_tlb_entry *get_entry( + struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel, int entry) +{ + int offset = vcpu_e500->gtlb_offset[tlbsel]; + return &vcpu_e500->gtlb_arch[offset + entry]; +} + /* * Allocate a free shadow id and setup a valid sid mapping in given entry. * A mapping is only valid when vcpu_id_table and pcpu_id_table are match. @@ -217,34 +229,13 @@ void kvmppc_e500_recalc_shadow_pid(struct kvmppc_vcpu_e500 *vcpu_e500) preempt_enable(); } -void kvmppc_dump_tlbs(struct kvm_vcpu *vcpu) -{ - struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu); - struct tlbe *tlbe; - int i, tlbsel; - - printk("| %8s | %8s | %8s | %8s | %8s |\n", - "nr", "mas1", "mas2", "mas3", "mas7"); - - for (tlbsel = 0; tlbsel < 2; tlbsel++) { - printk("Guest TLB%d:\n", tlbsel); - for (i = 0; i < vcpu_e500->gtlb_size[tlbsel]; i++) { - tlbe = &vcpu_e500->gtlb_arch[tlbsel][i]; - if (tlbe->mas1 & MAS1_VALID) - printk(" G[%d][%3d] | %08X | %08X | %08X | %08X |\n", - tlbsel, i, tlbe->mas1, tlbe->mas2, - tlbe->mas3, tlbe->mas7); - } - } -} - static inline unsigned int gtlb0_get_next_victim( struct kvmppc_vcpu_e500 *vcpu_e500) { unsigned int victim; victim = vcpu_e500->gtlb_nv[0]++; - if (unlikely(vcpu_e500->gtlb_nv[0] >= KVM_E500_TLB0_WAY_NUM)) + if (unlikely(vcpu_e500->gtlb_nv[0] >= vcpu_e500->gtlb_params[0].ways)) vcpu_e500->gtlb_nv[0] = 0; return victim; @@ -256,9 +247,9 @@ static inline unsigned int tlb1_max_shadow_size(void) return host_tlb_params[1].entries - tlbcam_index - 1; } -static inline int tlbe_is_writable(struct tlbe *tlbe) +static inline int tlbe_is_writable(struct kvm_book3e_206_tlb_entry *tlbe) { - return tlbe->mas3 & (MAS3_SW|MAS3_UW); + return tlbe->mas7_3 & (MAS3_SW|MAS3_UW); } static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) @@ -289,39 +280,41 @@ static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) /* * writing shadow tlb entry to host TLB */ -static inline void __write_host_tlbe(struct tlbe *stlbe, uint32_t mas0) +static inline void __write_host_tlbe(struct kvm_book3e_206_tlb_entry *stlbe, + uint32_t mas0) { unsigned long flags; local_irq_save(flags); mtspr(SPRN_MAS0, mas0); mtspr(SPRN_MAS1, stlbe->mas1); - mtspr(SPRN_MAS2, stlbe->mas2); - mtspr(SPRN_MAS3, stlbe->mas3); - mtspr(SPRN_MAS7, stlbe->mas7); + mtspr(SPRN_MAS2, (unsigned long)stlbe->mas2); + mtspr(SPRN_MAS3, (u32)stlbe->mas7_3); + mtspr(SPRN_MAS7, (u32)(stlbe->mas7_3 >> 32)); asm volatile("isync; tlbwe" : : : "memory"); local_irq_restore(flags); } /* esel is index into set, not whole array */ static inline void write_host_tlbe(struct kvmppc_vcpu_e500 *vcpu_e500, - int tlbsel, int esel, struct tlbe *stlbe) + int tlbsel, int esel, struct kvm_book3e_206_tlb_entry *stlbe) { if (tlbsel == 0) { - __write_host_tlbe(stlbe, MAS0_TLBSEL(0) | MAS0_ESEL(esel)); + int way = esel & (vcpu_e500->gtlb_params[0].ways - 1); + __write_host_tlbe(stlbe, MAS0_TLBSEL(0) | MAS0_ESEL(way)); } else { __write_host_tlbe(stlbe, MAS0_TLBSEL(1) | MAS0_ESEL(to_htlb1_esel(esel))); } trace_kvm_stlb_write(index_of(tlbsel, esel), stlbe->mas1, stlbe->mas2, - stlbe->mas3, stlbe->mas7); + (u32)stlbe->mas7_3, (u32)(stlbe->mas7_3 >> 32)); } void kvmppc_map_magic(struct kvm_vcpu *vcpu) { struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu); - struct tlbe magic; + struct kvm_book3e_206_tlb_entry magic; ulong shared_page = ((ulong)vcpu->arch.shared) & PAGE_MASK; unsigned int stid; pfn_t pfn; @@ -335,9 +328,8 @@ void kvmppc_map_magic(struct kvm_vcpu *vcpu) magic.mas1 = MAS1_VALID | MAS1_TS | MAS1_TID(stid) | MAS1_TSIZE(BOOK3E_PAGESZ_4K); magic.mas2 = vcpu->arch.magic_page_ea | MAS2_M; - magic.mas3 = (pfn << PAGE_SHIFT) | - MAS3_SW | MAS3_SR | MAS3_UW | MAS3_UR; - magic.mas7 = pfn >> (32 - PAGE_SHIFT); + magic.mas7_3 = ((u64)pfn << PAGE_SHIFT) | + MAS3_SW | MAS3_SR | MAS3_UW | MAS3_UR; __write_host_tlbe(&magic, MAS0_TLBSEL(1) | MAS0_ESEL(tlbcam_index)); preempt_enable(); @@ -358,7 +350,8 @@ void kvmppc_e500_tlb_put(struct kvm_vcpu *vcpu) static void inval_gtlbe_on_host(struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel, int esel) { - struct tlbe *gtlbe = &vcpu_e500->gtlb_arch[tlbsel][esel]; + struct kvm_book3e_206_tlb_entry *gtlbe = + get_entry(vcpu_e500, tlbsel, esel); struct vcpu_id_table *idt = vcpu_e500->idt; unsigned int pr, tid, ts, pid; u32 val, eaddr; @@ -424,9 +417,8 @@ static int tlb0_set_base(gva_t addr, int sets, int ways) static int gtlb0_set_base(struct kvmppc_vcpu_e500 *vcpu_e500, gva_t addr) { - int sets = KVM_E500_TLB0_SIZE / KVM_E500_TLB0_WAY_NUM; - - return tlb0_set_base(addr, sets, KVM_E500_TLB0_WAY_NUM); + return tlb0_set_base(addr, vcpu_e500->gtlb_params[0].sets, + vcpu_e500->gtlb_params[0].ways); } static int htlb0_set_base(gva_t addr) @@ -440,10 +432,10 @@ static unsigned int get_tlb_esel(struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel) unsigned int esel = get_tlb_esel_bit(vcpu_e500); if (tlbsel == 0) { - esel &= KVM_E500_TLB0_WAY_NUM_MASK; + esel &= vcpu_e500->gtlb_params[0].ways - 1; esel += gtlb0_set_base(vcpu_e500, vcpu_e500->mas2); } else { - esel &= vcpu_e500->gtlb_size[tlbsel] - 1; + esel &= vcpu_e500->gtlb_params[tlbsel].entries - 1; } return esel; @@ -453,19 +445,22 @@ static unsigned int get_tlb_esel(struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel) static int kvmppc_e500_tlb_index(struct kvmppc_vcpu_e500 *vcpu_e500, gva_t eaddr, int tlbsel, unsigned int pid, int as) { - int size = vcpu_e500->gtlb_size[tlbsel]; - unsigned int set_base; + int size = vcpu_e500->gtlb_params[tlbsel].entries; + unsigned int set_base, offset; int i; if (tlbsel == 0) { set_base = gtlb0_set_base(vcpu_e500, eaddr); - size = KVM_E500_TLB0_WAY_NUM; + size = vcpu_e500->gtlb_params[0].ways; } else { set_base = 0; } + offset = vcpu_e500->gtlb_offset[tlbsel]; + for (i = 0; i < size; i++) { - struct tlbe *tlbe = &vcpu_e500->gtlb_arch[tlbsel][set_base + i]; + struct kvm_book3e_206_tlb_entry *tlbe = + &vcpu_e500->gtlb_arch[offset + set_base + i]; unsigned int tid; if (eaddr < get_tlb_eaddr(tlbe)) @@ -491,7 +486,7 @@ static int kvmppc_e500_tlb_index(struct kvmppc_vcpu_e500 *vcpu_e500, } static inline void kvmppc_e500_ref_setup(struct tlbe_ref *ref, - struct tlbe *gtlbe, + struct kvm_book3e_206_tlb_entry *gtlbe, pfn_t pfn) { ref->pfn = pfn; @@ -518,7 +513,7 @@ static void clear_tlb_privs(struct kvmppc_vcpu_e500 *vcpu_e500) int tlbsel = 0; int i; - for (i = 0; i < vcpu_e500->gtlb_size[tlbsel]; i++) { + for (i = 0; i < vcpu_e500->gtlb_params[tlbsel].entries; i++) { struct tlbe_ref *ref = &vcpu_e500->gtlb_priv[tlbsel][i].ref; kvmppc_e500_ref_release(ref); @@ -530,6 +525,8 @@ static void clear_tlb_refs(struct kvmppc_vcpu_e500 *vcpu_e500) int stlbsel = 1; int i; + kvmppc_e500_id_table_reset_all(vcpu_e500); + for (i = 0; i < host_tlb_params[stlbsel].entries; i++) { struct tlbe_ref *ref = &vcpu_e500->tlb_refs[stlbsel][i]; @@ -559,18 +556,18 @@ static inline void kvmppc_e500_deliver_tlb_miss(struct kvm_vcpu *vcpu, | MAS1_TSIZE(tsized); vcpu_e500->mas2 = (eaddr & MAS2_EPN) | (vcpu_e500->mas4 & MAS2_ATTRIB_MASK); - vcpu_e500->mas3 &= MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3; + vcpu_e500->mas7_3 &= MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3; vcpu_e500->mas6 = (vcpu_e500->mas6 & MAS6_SPID1) | (get_cur_pid(vcpu) << 16) | (as ? MAS6_SAS : 0); - vcpu_e500->mas7 = 0; } /* TID must be supplied by the caller */ -static inline void kvmppc_e500_setup_stlbe(struct kvmppc_vcpu_e500 *vcpu_e500, - struct tlbe *gtlbe, int tsize, - struct tlbe_ref *ref, - u64 gvaddr, struct tlbe *stlbe) +static inline void kvmppc_e500_setup_stlbe( + struct kvmppc_vcpu_e500 *vcpu_e500, + struct kvm_book3e_206_tlb_entry *gtlbe, + int tsize, struct tlbe_ref *ref, u64 gvaddr, + struct kvm_book3e_206_tlb_entry *stlbe) { pfn_t pfn = ref->pfn; @@ -581,16 +578,16 @@ static inline void kvmppc_e500_setup_stlbe(struct kvmppc_vcpu_e500 *vcpu_e500, stlbe->mas2 = (gvaddr & MAS2_EPN) | e500_shadow_mas2_attrib(gtlbe->mas2, vcpu_e500->vcpu.arch.shared->msr & MSR_PR); - stlbe->mas3 = ((pfn << PAGE_SHIFT) & MAS3_RPN) - | e500_shadow_mas3_attrib(gtlbe->mas3, + stlbe->mas7_3 = ((u64)pfn << PAGE_SHIFT) + | e500_shadow_mas3_attrib(gtlbe->mas7_3, vcpu_e500->vcpu.arch.shared->msr & MSR_PR); - stlbe->mas7 = (pfn >> (32 - PAGE_SHIFT)) & MAS7_RPN; } /* sesel is an index into the entire array, not just the set */ static inline void kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500, - u64 gvaddr, gfn_t gfn, struct tlbe *gtlbe, int tlbsel, int sesel, - struct tlbe *stlbe, struct tlbe_ref *ref) + u64 gvaddr, gfn_t gfn, struct kvm_book3e_206_tlb_entry *gtlbe, + int tlbsel, int sesel, struct kvm_book3e_206_tlb_entry *stlbe, + struct tlbe_ref *ref) { struct kvm_memory_slot *slot; unsigned long pfn, hva; @@ -700,15 +697,16 @@ static inline void kvmppc_e500_shadow_map(struct kvmppc_vcpu_e500 *vcpu_e500, /* XXX only map the one-one case, for now use TLB0 */ static int kvmppc_e500_tlb0_map(struct kvmppc_vcpu_e500 *vcpu_e500, - int esel, struct tlbe *stlbe) + int esel, + struct kvm_book3e_206_tlb_entry *stlbe) { - struct tlbe *gtlbe; + struct kvm_book3e_206_tlb_entry *gtlbe; struct tlbe_ref *ref; int sesel = esel & (host_tlb_params[0].ways - 1); int sesel_base; gva_t ea; - gtlbe = &vcpu_e500->gtlb_arch[0][esel]; + gtlbe = get_entry(vcpu_e500, 0, esel); ref = &vcpu_e500->gtlb_priv[0][esel].ref; ea = get_tlb_eaddr(gtlbe); @@ -725,7 +723,8 @@ static int kvmppc_e500_tlb0_map(struct kvmppc_vcpu_e500 *vcpu_e500, * the shadow TLB. */ /* XXX for both one-one and one-to-many , for now use TLB1 */ static int kvmppc_e500_tlb1_map(struct kvmppc_vcpu_e500 *vcpu_e500, - u64 gvaddr, gfn_t gfn, struct tlbe *gtlbe, struct tlbe *stlbe) + u64 gvaddr, gfn_t gfn, struct kvm_book3e_206_tlb_entry *gtlbe, + struct kvm_book3e_206_tlb_entry *stlbe) { struct tlbe_ref *ref; unsigned int victim; @@ -754,7 +753,8 @@ static inline int kvmppc_e500_gtlbe_invalidate( struct kvmppc_vcpu_e500 *vcpu_e500, int tlbsel, int esel) { - struct tlbe *gtlbe = &vcpu_e500->gtlb_arch[tlbsel][esel]; + struct kvm_book3e_206_tlb_entry *gtlbe = + get_entry(vcpu_e500, tlbsel, esel); if (unlikely(get_tlb_iprot(gtlbe))) return -1; @@ -769,10 +769,10 @@ int kvmppc_e500_emul_mt_mmucsr0(struct kvmppc_vcpu_e500 *vcpu_e500, ulong value) int esel; if (value & MMUCSR0_TLB0FI) - for (esel = 0; esel < vcpu_e500->gtlb_size[0]; esel++) + for (esel = 0; esel < vcpu_e500->gtlb_params[0].entries; esel++) kvmppc_e500_gtlbe_invalidate(vcpu_e500, 0, esel); if (value & MMUCSR0_TLB1FI) - for (esel = 0; esel < vcpu_e500->gtlb_size[1]; esel++) + for (esel = 0; esel < vcpu_e500->gtlb_params[1].entries; esel++) kvmppc_e500_gtlbe_invalidate(vcpu_e500, 1, esel); /* Invalidate all vcpu id mappings */ @@ -797,7 +797,8 @@ int kvmppc_e500_emul_tlbivax(struct kvm_vcpu *vcpu, int ra, int rb) if (ia) { /* invalidate all entries */ - for (esel = 0; esel < vcpu_e500->gtlb_size[tlbsel]; esel++) + for (esel = 0; esel < vcpu_e500->gtlb_params[tlbsel].entries; + esel++) kvmppc_e500_gtlbe_invalidate(vcpu_e500, tlbsel, esel); } else { ea &= 0xfffff000; @@ -817,18 +818,17 @@ int kvmppc_e500_emul_tlbre(struct kvm_vcpu *vcpu) { struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu); int tlbsel, esel; - struct tlbe *gtlbe; + struct kvm_book3e_206_tlb_entry *gtlbe; tlbsel = get_tlb_tlbsel(vcpu_e500); esel = get_tlb_esel(vcpu_e500, tlbsel); - gtlbe = &vcpu_e500->gtlb_arch[tlbsel][esel]; + gtlbe = get_entry(vcpu_e500, tlbsel, esel); vcpu_e500->mas0 &= ~MAS0_NV(~0); vcpu_e500->mas0 |= MAS0_NV(vcpu_e500->gtlb_nv[tlbsel]); vcpu_e500->mas1 = gtlbe->mas1; vcpu_e500->mas2 = gtlbe->mas2; - vcpu_e500->mas3 = gtlbe->mas3; - vcpu_e500->mas7 = gtlbe->mas7; + vcpu_e500->mas7_3 = gtlbe->mas7_3; return EMULATE_DONE; } @@ -839,7 +839,7 @@ int kvmppc_e500_emul_tlbsx(struct kvm_vcpu *vcpu, int rb) int as = !!get_cur_sas(vcpu_e500); unsigned int pid = get_cur_spid(vcpu_e500); int esel, tlbsel; - struct tlbe *gtlbe = NULL; + struct kvm_book3e_206_tlb_entry *gtlbe = NULL; gva_t ea; ea = kvmppc_get_gpr(vcpu, rb); @@ -847,7 +847,7 @@ int kvmppc_e500_emul_tlbsx(struct kvm_vcpu *vcpu, int rb) for (tlbsel = 0; tlbsel < 2; tlbsel++) { esel = kvmppc_e500_tlb_index(vcpu_e500, ea, tlbsel, pid, as); if (esel >= 0) { - gtlbe = &vcpu_e500->gtlb_arch[tlbsel][esel]; + gtlbe = get_entry(vcpu_e500, tlbsel, esel); break; } } @@ -857,8 +857,7 @@ int kvmppc_e500_emul_tlbsx(struct kvm_vcpu *vcpu, int rb) | MAS0_NV(vcpu_e500->gtlb_nv[tlbsel]); vcpu_e500->mas1 = gtlbe->mas1; vcpu_e500->mas2 = gtlbe->mas2; - vcpu_e500->mas3 = gtlbe->mas3; - vcpu_e500->mas7 = gtlbe->mas7; + vcpu_e500->mas7_3 = gtlbe->mas7_3; } else { int victim; @@ -873,8 +872,7 @@ int kvmppc_e500_emul_tlbsx(struct kvm_vcpu *vcpu, int rb) | (vcpu_e500->mas4 & MAS4_TSIZED(~0)); vcpu_e500->mas2 &= MAS2_EPN; vcpu_e500->mas2 |= vcpu_e500->mas4 & MAS2_ATTRIB_MASK; - vcpu_e500->mas3 &= MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3; - vcpu_e500->mas7 = 0; + vcpu_e500->mas7_3 &= MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3; } kvmppc_set_exit_type(vcpu, EMULATED_TLBSX_EXITS); @@ -883,8 +881,8 @@ int kvmppc_e500_emul_tlbsx(struct kvm_vcpu *vcpu, int rb) /* sesel is index into the set, not the whole array */ static void write_stlbe(struct kvmppc_vcpu_e500 *vcpu_e500, - struct tlbe *gtlbe, - struct tlbe *stlbe, + struct kvm_book3e_206_tlb_entry *gtlbe, + struct kvm_book3e_206_tlb_entry *stlbe, int stlbsel, int sesel) { int stid; @@ -902,28 +900,27 @@ static void write_stlbe(struct kvmppc_vcpu_e500 *vcpu_e500, int kvmppc_e500_emul_tlbwe(struct kvm_vcpu *vcpu) { struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu); - struct tlbe *gtlbe; + struct kvm_book3e_206_tlb_entry *gtlbe; int tlbsel, esel; tlbsel = get_tlb_tlbsel(vcpu_e500); esel = get_tlb_esel(vcpu_e500, tlbsel); - gtlbe = &vcpu_e500->gtlb_arch[tlbsel][esel]; + gtlbe = get_entry(vcpu_e500, tlbsel, esel); if (get_tlb_v(gtlbe)) inval_gtlbe_on_host(vcpu_e500, tlbsel, esel); gtlbe->mas1 = vcpu_e500->mas1; gtlbe->mas2 = vcpu_e500->mas2; - gtlbe->mas3 = vcpu_e500->mas3; - gtlbe->mas7 = vcpu_e500->mas7; + gtlbe->mas7_3 = vcpu_e500->mas7_3; trace_kvm_gtlb_write(vcpu_e500->mas0, gtlbe->mas1, gtlbe->mas2, - gtlbe->mas3, gtlbe->mas7); + (u32)gtlbe->mas7_3, (u32)(gtlbe->mas7_3 >> 32)); /* Invalidate shadow mappings for the about-to-be-clobbered TLBE. */ if (tlbe_is_host_safe(vcpu, gtlbe)) { - struct tlbe stlbe; + struct kvm_book3e_206_tlb_entry stlbe; int stlbsel, sesel; u64 eaddr; u64 raddr; @@ -996,9 +993,11 @@ gpa_t kvmppc_mmu_xlate(struct kvm_vcpu *vcpu, unsigned int index, gva_t eaddr) { struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu); - struct tlbe *gtlbe = - &vcpu_e500->gtlb_arch[tlbsel_of(index)][esel_of(index)]; - u64 pgmask = get_tlb_bytes(gtlbe) - 1; + struct kvm_book3e_206_tlb_entry *gtlbe; + u64 pgmask; + + gtlbe = get_entry(vcpu_e500, tlbsel_of(index), esel_of(index)); + pgmask = get_tlb_bytes(gtlbe) - 1; return get_tlb_raddr(gtlbe) | (eaddr & pgmask); } @@ -1012,12 +1011,12 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 eaddr, gpa_t gpaddr, { struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu); struct tlbe_priv *priv; - struct tlbe *gtlbe, stlbe; + struct kvm_book3e_206_tlb_entry *gtlbe, stlbe; int tlbsel = tlbsel_of(index); int esel = esel_of(index); int stlbsel, sesel; - gtlbe = &vcpu_e500->gtlb_arch[tlbsel][esel]; + gtlbe = get_entry(vcpu_e500, tlbsel, esel); switch (tlbsel) { case 0: @@ -1073,25 +1072,174 @@ void kvmppc_set_pid(struct kvm_vcpu *vcpu, u32 pid) void kvmppc_e500_tlb_setup(struct kvmppc_vcpu_e500 *vcpu_e500) { - struct tlbe *tlbe; + struct kvm_book3e_206_tlb_entry *tlbe; /* Insert large initial mapping for guest. */ - tlbe = &vcpu_e500->gtlb_arch[1][0]; + tlbe = get_entry(vcpu_e500, 1, 0); tlbe->mas1 = MAS1_VALID | MAS1_TSIZE(BOOK3E_PAGESZ_256M); tlbe->mas2 = 0; - tlbe->mas3 = E500_TLB_SUPER_PERM_MASK; - tlbe->mas7 = 0; + tlbe->mas7_3 = E500_TLB_SUPER_PERM_MASK; /* 4K map for serial output. Used by kernel wrapper. */ - tlbe = &vcpu_e500->gtlb_arch[1][1]; + tlbe = get_entry(vcpu_e500, 1, 1); tlbe->mas1 = MAS1_VALID | MAS1_TSIZE(BOOK3E_PAGESZ_4K); tlbe->mas2 = (0xe0004500 & 0xFFFFF000) | MAS2_I | MAS2_G; - tlbe->mas3 = (0xe0004500 & 0xFFFFF000) | E500_TLB_SUPER_PERM_MASK; - tlbe->mas7 = 0; + tlbe->mas7_3 = (0xe0004500 & 0xFFFFF000) | E500_TLB_SUPER_PERM_MASK; +} + +static void free_gtlb(struct kvmppc_vcpu_e500 *vcpu_e500) +{ + int i; + + clear_tlb_refs(vcpu_e500); + kfree(vcpu_e500->gtlb_priv[0]); + kfree(vcpu_e500->gtlb_priv[1]); + + if (vcpu_e500->shared_tlb_pages) { + vfree((void *)(round_down((uintptr_t)vcpu_e500->gtlb_arch, + PAGE_SIZE))); + + for (i = 0; i < vcpu_e500->num_shared_tlb_pages; i++) { + set_page_dirty_lock(vcpu_e500->shared_tlb_pages[i]); + put_page(vcpu_e500->shared_tlb_pages[i]); + } + + vcpu_e500->num_shared_tlb_pages = 0; + vcpu_e500->shared_tlb_pages = NULL; + } else { + kfree(vcpu_e500->gtlb_arch); + } + + vcpu_e500->gtlb_arch = NULL; +} + +int kvm_vcpu_ioctl_config_tlb(struct kvm_vcpu *vcpu, + struct kvm_config_tlb *cfg) +{ + struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu); + struct kvm_book3e_206_tlb_params params; + char *virt; + struct page **pages; + struct tlbe_priv *privs[2] = {}; + size_t array_len; + u32 sets; + int num_pages, ret, i; + + if (cfg->mmu_type != KVM_MMU_FSL_BOOKE_NOHV) + return -EINVAL; + + if (copy_from_user(¶ms, (void __user *)(uintptr_t)cfg->params, + sizeof(params))) + return -EFAULT; + + if (params.tlb_sizes[1] > 64) + return -EINVAL; + if (params.tlb_ways[1] != params.tlb_sizes[1]) + return -EINVAL; + if (params.tlb_sizes[2] != 0 || params.tlb_sizes[3] != 0) + return -EINVAL; + if (params.tlb_ways[2] != 0 || params.tlb_ways[3] != 0) + return -EINVAL; + + if (!is_power_of_2(params.tlb_ways[0])) + return -EINVAL; + + sets = params.tlb_sizes[0] >> ilog2(params.tlb_ways[0]); + if (!is_power_of_2(sets)) + return -EINVAL; + + array_len = params.tlb_sizes[0] + params.tlb_sizes[1]; + array_len *= sizeof(struct kvm_book3e_206_tlb_entry); + + if (cfg->array_len < array_len) + return -EINVAL; + + num_pages = DIV_ROUND_UP(cfg->array + array_len - 1, PAGE_SIZE) - + cfg->array / PAGE_SIZE; + pages = kmalloc(sizeof(struct page *) * num_pages, GFP_KERNEL); + if (!pages) + return -ENOMEM; + + ret = get_user_pages_fast(cfg->array, num_pages, 1, pages); + if (ret < 0) + goto err_pages; + + if (ret != num_pages) { + num_pages = ret; + ret = -EFAULT; + goto err_put_page; + } + + virt = vmap(pages, num_pages, VM_MAP, PAGE_KERNEL); + if (!virt) + goto err_put_page; + + privs[0] = kzalloc(sizeof(struct tlbe_priv) * params.tlb_sizes[0], + GFP_KERNEL); + privs[1] = kzalloc(sizeof(struct tlbe_priv) * params.tlb_sizes[1], + GFP_KERNEL); + + if (!privs[0] || !privs[1]) + goto err_put_page; + + free_gtlb(vcpu_e500); + + vcpu_e500->gtlb_priv[0] = privs[0]; + vcpu_e500->gtlb_priv[1] = privs[1]; + + vcpu_e500->gtlb_arch = (struct kvm_book3e_206_tlb_entry *) + (virt + (cfg->array & (PAGE_SIZE - 1))); + + vcpu_e500->gtlb_params[0].entries = params.tlb_sizes[0]; + vcpu_e500->gtlb_params[1].entries = params.tlb_sizes[1]; + + vcpu_e500->gtlb_offset[0] = 0; + vcpu_e500->gtlb_offset[1] = params.tlb_sizes[0]; + + vcpu_e500->tlb0cfg = mfspr(SPRN_TLB0CFG) & ~0xfffUL; + if (params.tlb_sizes[0] <= 2048) + vcpu_e500->tlb0cfg |= params.tlb_sizes[0]; + + vcpu_e500->tlb1cfg = mfspr(SPRN_TLB1CFG) & ~0xfffUL; + vcpu_e500->tlb1cfg |= params.tlb_sizes[1]; + + vcpu_e500->shared_tlb_pages = pages; + vcpu_e500->num_shared_tlb_pages = num_pages; + + vcpu_e500->gtlb_params[0].ways = params.tlb_ways[0]; + vcpu_e500->gtlb_params[0].sets = sets; + + vcpu_e500->gtlb_params[1].ways = params.tlb_sizes[1]; + vcpu_e500->gtlb_params[1].sets = 1; + + return 0; + +err_put_page: + kfree(privs[0]); + kfree(privs[1]); + + for (i = 0; i < num_pages; i++) + put_page(pages[i]); + +err_pages: + kfree(pages); + return ret; +} + +int kvm_vcpu_ioctl_dirty_tlb(struct kvm_vcpu *vcpu, + struct kvm_dirty_tlb *dirty) +{ + struct kvmppc_vcpu_e500 *vcpu_e500 = to_e500(vcpu); + + clear_tlb_refs(vcpu_e500); + return 0; } int kvmppc_e500_tlb_init(struct kvmppc_vcpu_e500 *vcpu_e500) { + int entry_size = sizeof(struct kvm_book3e_206_tlb_entry); + int entries = KVM_E500_TLB0_SIZE + KVM_E500_TLB1_SIZE; + host_tlb_params[0].entries = mfspr(SPRN_TLB0CFG) & TLBnCFG_N_ENTRY; host_tlb_params[1].entries = mfspr(SPRN_TLB1CFG) & TLBnCFG_N_ENTRY; @@ -1124,17 +1272,22 @@ int kvmppc_e500_tlb_init(struct kvmppc_vcpu_e500 *vcpu_e500) host_tlb_params[0].entries / host_tlb_params[0].ways; host_tlb_params[1].sets = 1; - vcpu_e500->gtlb_size[0] = KVM_E500_TLB0_SIZE; - vcpu_e500->gtlb_arch[0] = - kzalloc(sizeof(struct tlbe) * KVM_E500_TLB0_SIZE, GFP_KERNEL); - if (vcpu_e500->gtlb_arch[0] == NULL) - goto err; + vcpu_e500->gtlb_params[0].entries = KVM_E500_TLB0_SIZE; + vcpu_e500->gtlb_params[1].entries = KVM_E500_TLB1_SIZE; - vcpu_e500->gtlb_size[1] = KVM_E500_TLB1_SIZE; - vcpu_e500->gtlb_arch[1] = - kzalloc(sizeof(struct tlbe) * KVM_E500_TLB1_SIZE, GFP_KERNEL); - if (vcpu_e500->gtlb_arch[1] == NULL) - goto err; + vcpu_e500->gtlb_params[0].ways = KVM_E500_TLB0_WAY_NUM; + vcpu_e500->gtlb_params[0].sets = + KVM_E500_TLB0_SIZE / KVM_E500_TLB0_WAY_NUM; + + vcpu_e500->gtlb_params[1].ways = KVM_E500_TLB1_SIZE; + vcpu_e500->gtlb_params[1].sets = 1; + + vcpu_e500->gtlb_arch = kmalloc(entries * entry_size, GFP_KERNEL); + if (!vcpu_e500->gtlb_arch) + return -ENOMEM; + + vcpu_e500->gtlb_offset[0] = 0; + vcpu_e500->gtlb_offset[1] = KVM_E500_TLB0_SIZE; vcpu_e500->tlb_refs[0] = kzalloc(sizeof(struct tlbe_ref) * host_tlb_params[0].entries, @@ -1148,15 +1301,15 @@ int kvmppc_e500_tlb_init(struct kvmppc_vcpu_e500 *vcpu_e500) if (!vcpu_e500->tlb_refs[1]) goto err; - vcpu_e500->gtlb_priv[0] = - kzalloc(sizeof(struct tlbe_ref) * vcpu_e500->gtlb_size[0], - GFP_KERNEL); + vcpu_e500->gtlb_priv[0] = kzalloc(sizeof(struct tlbe_ref) * + vcpu_e500->gtlb_params[0].entries, + GFP_KERNEL); if (!vcpu_e500->gtlb_priv[0]) goto err; - vcpu_e500->gtlb_priv[1] = - kzalloc(sizeof(struct tlbe_ref) * vcpu_e500->gtlb_size[1], - GFP_KERNEL); + vcpu_e500->gtlb_priv[1] = kzalloc(sizeof(struct tlbe_ref) * + vcpu_e500->gtlb_params[1].entries, + GFP_KERNEL); if (!vcpu_e500->gtlb_priv[1]) goto err; @@ -1165,32 +1318,24 @@ int kvmppc_e500_tlb_init(struct kvmppc_vcpu_e500 *vcpu_e500) /* Init TLB configuration register */ vcpu_e500->tlb0cfg = mfspr(SPRN_TLB0CFG) & ~0xfffUL; - vcpu_e500->tlb0cfg |= vcpu_e500->gtlb_size[0]; + vcpu_e500->tlb0cfg |= vcpu_e500->gtlb_params[0].entries; vcpu_e500->tlb1cfg = mfspr(SPRN_TLB1CFG) & ~0xfffUL; - vcpu_e500->tlb1cfg |= vcpu_e500->gtlb_size[1]; + vcpu_e500->tlb1cfg |= vcpu_e500->gtlb_params[1].entries; return 0; err: + free_gtlb(vcpu_e500); kfree(vcpu_e500->tlb_refs[0]); kfree(vcpu_e500->tlb_refs[1]); - kfree(vcpu_e500->gtlb_priv[0]); - kfree(vcpu_e500->gtlb_priv[1]); - kfree(vcpu_e500->gtlb_arch[0]); - kfree(vcpu_e500->gtlb_arch[1]); return -1; } void kvmppc_e500_tlb_uninit(struct kvmppc_vcpu_e500 *vcpu_e500) { - clear_tlb_refs(vcpu_e500); - + free_gtlb(vcpu_e500); kvmppc_e500_id_table_free(vcpu_e500); kfree(vcpu_e500->tlb_refs[0]); kfree(vcpu_e500->tlb_refs[1]); - kfree(vcpu_e500->gtlb_priv[0]); - kfree(vcpu_e500->gtlb_priv[1]); - kfree(vcpu_e500->gtlb_arch[1]); - kfree(vcpu_e500->gtlb_arch[0]); } diff --git a/arch/powerpc/kvm/e500_tlb.h b/arch/powerpc/kvm/e500_tlb.h index b587f691459a..2c296407e759 100644 --- a/arch/powerpc/kvm/e500_tlb.h +++ b/arch/powerpc/kvm/e500_tlb.h @@ -20,13 +20,9 @@ #include #include -#define KVM_E500_TLB0_WAY_SIZE_BIT 7 /* Fixed */ -#define KVM_E500_TLB0_WAY_SIZE (1UL << KVM_E500_TLB0_WAY_SIZE_BIT) -#define KVM_E500_TLB0_WAY_SIZE_MASK (KVM_E500_TLB0_WAY_SIZE - 1) - -#define KVM_E500_TLB0_WAY_NUM_BIT 1 /* No greater than 7 */ -#define KVM_E500_TLB0_WAY_NUM (1UL << KVM_E500_TLB0_WAY_NUM_BIT) -#define KVM_E500_TLB0_WAY_NUM_MASK (KVM_E500_TLB0_WAY_NUM - 1) +/* This geometry is the legacy default -- can be overridden by userspace */ +#define KVM_E500_TLB0_WAY_SIZE 128 +#define KVM_E500_TLB0_WAY_NUM 2 #define KVM_E500_TLB0_SIZE (KVM_E500_TLB0_WAY_SIZE * KVM_E500_TLB0_WAY_NUM) #define KVM_E500_TLB1_SIZE 16 @@ -58,50 +54,54 @@ extern void kvmppc_e500_tlb_setup(struct kvmppc_vcpu_e500 *); extern void kvmppc_e500_recalc_shadow_pid(struct kvmppc_vcpu_e500 *); /* TLB helper functions */ -static inline unsigned int get_tlb_size(const struct tlbe *tlbe) +static inline unsigned int +get_tlb_size(const struct kvm_book3e_206_tlb_entry *tlbe) { return (tlbe->mas1 >> 7) & 0x1f; } -static inline gva_t get_tlb_eaddr(const struct tlbe *tlbe) +static inline gva_t get_tlb_eaddr(const struct kvm_book3e_206_tlb_entry *tlbe) { return tlbe->mas2 & 0xfffff000; } -static inline u64 get_tlb_bytes(const struct tlbe *tlbe) +static inline u64 get_tlb_bytes(const struct kvm_book3e_206_tlb_entry *tlbe) { unsigned int pgsize = get_tlb_size(tlbe); return 1ULL << 10 << pgsize; } -static inline gva_t get_tlb_end(const struct tlbe *tlbe) +static inline gva_t get_tlb_end(const struct kvm_book3e_206_tlb_entry *tlbe) { u64 bytes = get_tlb_bytes(tlbe); return get_tlb_eaddr(tlbe) + bytes - 1; } -static inline u64 get_tlb_raddr(const struct tlbe *tlbe) +static inline u64 get_tlb_raddr(const struct kvm_book3e_206_tlb_entry *tlbe) { - u64 rpn = tlbe->mas7; - return (rpn << 32) | (tlbe->mas3 & 0xfffff000); + return tlbe->mas7_3 & ~0xfffULL; } -static inline unsigned int get_tlb_tid(const struct tlbe *tlbe) +static inline unsigned int +get_tlb_tid(const struct kvm_book3e_206_tlb_entry *tlbe) { return (tlbe->mas1 >> 16) & 0xff; } -static inline unsigned int get_tlb_ts(const struct tlbe *tlbe) +static inline unsigned int +get_tlb_ts(const struct kvm_book3e_206_tlb_entry *tlbe) { return (tlbe->mas1 >> 12) & 0x1; } -static inline unsigned int get_tlb_v(const struct tlbe *tlbe) +static inline unsigned int +get_tlb_v(const struct kvm_book3e_206_tlb_entry *tlbe) { return (tlbe->mas1 >> 31) & 0x1; } -static inline unsigned int get_tlb_iprot(const struct tlbe *tlbe) +static inline unsigned int +get_tlb_iprot(const struct kvm_book3e_206_tlb_entry *tlbe) { return (tlbe->mas1 >> 30) & 0x1; } @@ -156,7 +156,7 @@ static inline unsigned int get_tlb_esel_bit( } static inline int tlbe_is_host_safe(const struct kvm_vcpu *vcpu, - const struct tlbe *tlbe) + const struct kvm_book3e_206_tlb_entry *tlbe) { gpa_t gpa; diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index a5671616af86..3cf6fba513ac 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -222,6 +222,9 @@ int kvm_dev_ioctl_check_extension(long ext) case KVM_CAP_PPC_PAIRED_SINGLES: case KVM_CAP_PPC_OSI: case KVM_CAP_PPC_GET_PVINFO: +#ifdef CONFIG_KVM_E500 + case KVM_CAP_SW_TLB: +#endif r = 1; break; case KVM_CAP_COALESCED_MMIO: @@ -602,6 +605,19 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu, r = 0; vcpu->arch.papr_enabled = true; break; +#ifdef CONFIG_KVM_E500 + case KVM_CAP_SW_TLB: { + struct kvm_config_tlb cfg; + void __user *user_ptr = (void __user *)(uintptr_t)cap->args[0]; + + r = -EFAULT; + if (copy_from_user(&cfg, user_ptr, sizeof(cfg))) + break; + + r = kvm_vcpu_ioctl_config_tlb(vcpu, &cfg); + break; + } +#endif default: r = -EINVAL; break; @@ -651,6 +667,18 @@ long kvm_arch_vcpu_ioctl(struct file *filp, r = kvm_vcpu_ioctl_enable_cap(vcpu, &cap); break; } + +#ifdef CONFIG_KVM_E500 + case KVM_DIRTY_TLB: { + struct kvm_dirty_tlb dirty; + r = -EFAULT; + if (copy_from_user(&dirty, argp, sizeof(dirty))) + goto out; + r = kvm_vcpu_ioctl_dirty_tlb(vcpu, &dirty); + break; + } +#endif + default: r = -EINVAL; } diff --git a/include/linux/kvm.h b/include/linux/kvm.h index 245bcb3a0fcd..fa029ced4bd5 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -581,6 +581,7 @@ struct kvm_ppc_pvinfo { #define KVM_CAP_PPC_RMA 65 #define KVM_CAP_MAX_VCPUS 66 /* returns max vcpus per vm */ #define KVM_CAP_PPC_PAPR 68 +#define KVM_CAP_SW_TLB 69 #define KVM_CAP_S390_GMAP 71 #define KVM_CAP_TSC_DEADLINE_TIMER 72 #define KVM_CAP_S390_UCONTROL 73 @@ -664,6 +665,21 @@ struct kvm_clock_data { __u32 pad[9]; }; +#define KVM_MMU_FSL_BOOKE_NOHV 0 +#define KVM_MMU_FSL_BOOKE_HV 1 + +struct kvm_config_tlb { + __u64 params; + __u64 array; + __u32 mmu_type; + __u32 array_len; +}; + +struct kvm_dirty_tlb { + __u64 bitmap; + __u32 num_dirty; +}; + /* * ioctls for VM fds */ @@ -801,6 +817,8 @@ struct kvm_s390_ucas_mapping { #define KVM_CREATE_SPAPR_TCE _IOW(KVMIO, 0xa8, struct kvm_create_spapr_tce) /* Available with KVM_CAP_RMA */ #define KVM_ALLOCATE_RMA _IOR(KVMIO, 0xa9, struct kvm_allocate_rma) +/* Available with KVM_CAP_SW_TLB */ +#define KVM_DIRTY_TLB _IOW(KVMIO, 0xaa, struct kvm_dirty_tlb) #define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0) -- cgit v1.2.3 From e24ed81fedd551e80378be62fa0b0532480ea7d4 Mon Sep 17 00:00:00 2001 From: Alexander Graf Date: Wed, 14 Sep 2011 10:02:41 +0200 Subject: KVM: PPC: Add generic single register ioctls Right now we transfer a static struct every time we want to get or set registers. Unfortunately, over time we realize that there are more of these than we thought of before and the extensibility and flexibility of transferring a full struct every time is limited. So this is a new approach to the problem. With these new ioctls, we can get and set a single register that is identified by an ID. This allows for very precise and limited transmittal of data. When we later realize that it's a better idea to shove over multiple registers at once, we can reuse most of the infrastructure and simply implement a GET_MANY_REGS / SET_MANY_REGS interface. Signed-off-by: Alexander Graf Signed-off-by: Avi Kivity --- Documentation/virtual/kvm/api.txt | 46 ++++++++++++++++++++++++++++++++++++--- arch/powerpc/kvm/powerpc.c | 41 ++++++++++++++++++++++++++++++++++ include/linux/kvm.h | 35 +++++++++++++++++++++++++++++ 3 files changed, 119 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index bcd45d5afcab..e0c75dcb2ca3 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -1533,7 +1533,7 @@ following algorithm: Some guests configure the LINT1 NMI input to cause a panic, aiding in debugging. -4.64 KVM_S390_UCAS_MAP +4.65 KVM_S390_UCAS_MAP Capability: KVM_CAP_S390_UCONTROL Architectures: s390 @@ -1552,7 +1552,7 @@ This ioctl maps the memory at "user_addr" with the length "length" to the vcpu's address space starting at "vcpu_addr". All parameters need to be alligned by 1 megabyte. -4.65 KVM_S390_UCAS_UNMAP +4.66 KVM_S390_UCAS_UNMAP Capability: KVM_CAP_S390_UCONTROL Architectures: s390 @@ -1571,7 +1571,7 @@ This ioctl unmaps the memory in the vcpu's address space starting at "vcpu_addr" with the length "length". The field "user_addr" is ignored. All parameters need to be alligned by 1 megabyte. -4.66 KVM_S390_VCPU_FAULT +4.67 KVM_S390_VCPU_FAULT Capability: KVM_CAP_S390_UCONTROL Architectures: s390 @@ -1587,6 +1587,46 @@ table upfront. This is useful to handle validity intercepts for user controlled virtual machines to fault in the virtual cpu's lowcore pages prior to calling the KVM_RUN ioctl. +4.68 KVM_SET_ONE_REG + +Capability: KVM_CAP_ONE_REG +Architectures: all +Type: vcpu ioctl +Parameters: struct kvm_one_reg (in) +Returns: 0 on success, negative value on failure + +struct kvm_one_reg { + __u64 id; + __u64 addr; +}; + +Using this ioctl, a single vcpu register can be set to a specific value +defined by user space with the passed in struct kvm_one_reg, where id +refers to the register identifier as described below and addr is a pointer +to a variable with the respective size. There can be architecture agnostic +and architecture specific registers. Each have their own range of operation +and their own constants and width. To keep track of the implemented +registers, find a list below: + + Arch | Register | Width (bits) + | | + +4.69 KVM_GET_ONE_REG + +Capability: KVM_CAP_ONE_REG +Architectures: all +Type: vcpu ioctl +Parameters: struct kvm_one_reg (in and out) +Returns: 0 on success, negative value on failure + +This ioctl allows to receive the value of a single register implemented +in a vcpu. The register to read is indicated by the "id" field of the +kvm_one_reg struct passed in. On success, the register value can be found +at the memory location pointed to by "addr". + +The list of registers accessible using this interface is identical to the +list in 4.64. + 5. The kvm_run structure Application code obtains a pointer to the kvm_run structure by diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index f4380cb264e0..089c61bf0e16 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -217,6 +217,7 @@ int kvm_dev_ioctl_check_extension(long ext) case KVM_CAP_PPC_UNSET_IRQ: case KVM_CAP_PPC_IRQ_LEVEL: case KVM_CAP_ENABLE_CAP: + case KVM_CAP_ONE_REG: r = 1; break; #ifndef CONFIG_KVM_BOOK3S_64_HV @@ -645,6 +646,32 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu, return r; } +static int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, + struct kvm_one_reg *reg) +{ + int r = -EINVAL; + + switch (reg->id) { + default: + break; + } + + return r; +} + +static int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, + struct kvm_one_reg *reg) +{ + int r = -EINVAL; + + switch (reg->id) { + default: + break; + } + + return r; +} + int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu, struct kvm_mp_state *mp_state) { @@ -684,6 +711,20 @@ long kvm_arch_vcpu_ioctl(struct file *filp, break; } + case KVM_SET_ONE_REG: + case KVM_GET_ONE_REG: + { + struct kvm_one_reg reg; + r = -EFAULT; + if (copy_from_user(®, argp, sizeof(reg))) + goto out; + if (ioctl == KVM_SET_ONE_REG) + r = kvm_vcpu_ioctl_set_one_reg(vcpu, ®); + else + r = kvm_vcpu_ioctl_get_one_reg(vcpu, ®); + break; + } + #ifdef CONFIG_KVM_E500 case KVM_DIRTY_TLB: { struct kvm_dirty_tlb dirty; diff --git a/include/linux/kvm.h b/include/linux/kvm.h index fa029ced4bd5..4f7a9fb8ab06 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -582,6 +582,7 @@ struct kvm_ppc_pvinfo { #define KVM_CAP_MAX_VCPUS 66 /* returns max vcpus per vm */ #define KVM_CAP_PPC_PAPR 68 #define KVM_CAP_SW_TLB 69 +#define KVM_CAP_ONE_REG 70 #define KVM_CAP_S390_GMAP 71 #define KVM_CAP_TSC_DEADLINE_TIMER 72 #define KVM_CAP_S390_UCONTROL 73 @@ -680,6 +681,37 @@ struct kvm_dirty_tlb { __u32 num_dirty; }; +/* Available with KVM_CAP_ONE_REG */ + +#define KVM_REG_ARCH_MASK 0xff00000000000000ULL +#define KVM_REG_GENERIC 0x0000000000000000ULL + +/* + * Architecture specific registers are to be defined in arch headers and + * ORed with the arch identifier. + */ +#define KVM_REG_PPC 0x1000000000000000ULL +#define KVM_REG_X86 0x2000000000000000ULL +#define KVM_REG_IA64 0x3000000000000000ULL +#define KVM_REG_ARM 0x4000000000000000ULL +#define KVM_REG_S390 0x5000000000000000ULL + +#define KVM_REG_SIZE_SHIFT 52 +#define KVM_REG_SIZE_MASK 0x00f0000000000000ULL +#define KVM_REG_SIZE_U8 0x0000000000000000ULL +#define KVM_REG_SIZE_U16 0x0010000000000000ULL +#define KVM_REG_SIZE_U32 0x0020000000000000ULL +#define KVM_REG_SIZE_U64 0x0030000000000000ULL +#define KVM_REG_SIZE_U128 0x0040000000000000ULL +#define KVM_REG_SIZE_U256 0x0050000000000000ULL +#define KVM_REG_SIZE_U512 0x0060000000000000ULL +#define KVM_REG_SIZE_U1024 0x0070000000000000ULL + +struct kvm_one_reg { + __u64 id; + __u64 addr; +}; + /* * ioctls for VM fds */ @@ -819,6 +851,9 @@ struct kvm_s390_ucas_mapping { #define KVM_ALLOCATE_RMA _IOR(KVMIO, 0xa9, struct kvm_allocate_rma) /* Available with KVM_CAP_SW_TLB */ #define KVM_DIRTY_TLB _IOW(KVMIO, 0xaa, struct kvm_dirty_tlb) +/* Available with KVM_CAP_ONE_REG */ +#define KVM_GET_ONE_REG _IOW(KVMIO, 0xab, struct kvm_one_reg) +#define KVM_SET_ONE_REG _IOW(KVMIO, 0xac, struct kvm_one_reg) #define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0) -- cgit v1.2.3 From 1022fc3d3bfaca09d5d6bfcc93a168de16840814 Mon Sep 17 00:00:00 2001 From: Alexander Graf Date: Wed, 14 Sep 2011 21:45:23 +0200 Subject: KVM: PPC: Add support for explicit HIOR setting Until now, we always set HIOR based on the PVR, but this is just wrong. Instead, we should be setting HIOR explicitly, so user space can decide what the initial HIOR value is - just like on real hardware. We keep the old PVR based way around for backwards compatibility, but once user space uses the SET_ONE_REG based method, we drop the PVR logic. Signed-off-by: Alexander Graf Signed-off-by: Avi Kivity --- Documentation/virtual/kvm/api.txt | 1 + arch/powerpc/include/asm/kvm.h | 2 ++ arch/powerpc/include/asm/kvm_book3s.h | 2 ++ arch/powerpc/kvm/book3s_pr.c | 6 ++++-- arch/powerpc/kvm/powerpc.c | 13 +++++++++++++ include/linux/kvm.h | 1 + 6 files changed, 23 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index e0c75dcb2ca3..59a38264a0ed 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -1610,6 +1610,7 @@ registers, find a list below: Arch | Register | Width (bits) | | + PPC | KVM_REG_PPC_HIOR | 64 4.69 KVM_GET_ONE_REG diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h index 663c57f87165..f41adcda1468 100644 --- a/arch/powerpc/include/asm/kvm.h +++ b/arch/powerpc/include/asm/kvm.h @@ -331,4 +331,6 @@ struct kvm_book3e_206_tlb_params { __u32 reserved[8]; }; +#define KVM_REG_PPC_HIOR (KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x1) + #endif /* __LINUX_KVM_POWERPC_H */ diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h index 3c3edee672aa..aa795ccef294 100644 --- a/arch/powerpc/include/asm/kvm_book3s.h +++ b/arch/powerpc/include/asm/kvm_book3s.h @@ -90,6 +90,8 @@ struct kvmppc_vcpu_book3s { #endif int context_id[SID_CONTEXTS]; + bool hior_explicit; /* HIOR is set by ioctl, not PVR */ + struct hlist_head hpte_hash_pte[HPTEG_HASH_NUM_PTE]; struct hlist_head hpte_hash_pte_long[HPTEG_HASH_NUM_PTE_LONG]; struct hlist_head hpte_hash_vpte[HPTEG_HASH_NUM_VPTE]; diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c index c193625d5289..00efda6dc0e2 100644 --- a/arch/powerpc/kvm/book3s_pr.c +++ b/arch/powerpc/kvm/book3s_pr.c @@ -157,14 +157,16 @@ void kvmppc_set_pvr(struct kvm_vcpu *vcpu, u32 pvr) #ifdef CONFIG_PPC_BOOK3S_64 if ((pvr >= 0x330000) && (pvr < 0x70330000)) { kvmppc_mmu_book3s_64_init(vcpu); - to_book3s(vcpu)->hior = 0xfff00000; + if (!to_book3s(vcpu)->hior_explicit) + to_book3s(vcpu)->hior = 0xfff00000; to_book3s(vcpu)->msr_mask = 0xffffffffffffffffULL; vcpu->arch.cpu_type = KVM_CPU_3S_64; } else #endif { kvmppc_mmu_book3s_32_init(vcpu); - to_book3s(vcpu)->hior = 0; + if (!to_book3s(vcpu)->hior_explicit) + to_book3s(vcpu)->hior = 0; to_book3s(vcpu)->msr_mask = 0xffffffffULL; vcpu->arch.cpu_type = KVM_CPU_3S_32; } diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index 089c61bf0e16..59852091b389 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -212,6 +212,7 @@ int kvm_dev_ioctl_check_extension(long ext) case KVM_CAP_PPC_BOOKE_SREGS: #else case KVM_CAP_PPC_SEGSTATE: + case KVM_CAP_PPC_HIOR: case KVM_CAP_PPC_PAPR: #endif case KVM_CAP_PPC_UNSET_IRQ: @@ -652,6 +653,11 @@ static int kvm_vcpu_ioctl_get_one_reg(struct kvm_vcpu *vcpu, int r = -EINVAL; switch (reg->id) { +#ifdef CONFIG_PPC_BOOK3S + case KVM_REG_PPC_HIOR: + r = put_user(to_book3s(vcpu)->hior, (u64 __user *)reg->addr); + break; +#endif default: break; } @@ -665,6 +671,13 @@ static int kvm_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu, int r = -EINVAL; switch (reg->id) { +#ifdef CONFIG_PPC_BOOK3S + case KVM_ONE_REG_PPC_HIOR: + r = get_user(to_book3s(vcpu)->hior, (u64 __user *)reg->addr); + if (!r) + to_book3s(vcpu)->hior_explicit = true; + break; +#endif default: break; } diff --git a/include/linux/kvm.h b/include/linux/kvm.h index 4f7a9fb8ab06..acbe42939089 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -580,6 +580,7 @@ struct kvm_ppc_pvinfo { #define KVM_CAP_PPC_SMT 64 #define KVM_CAP_PPC_RMA 65 #define KVM_CAP_MAX_VCPUS 66 /* returns max vcpus per vm */ +#define KVM_CAP_PPC_HIOR 67 #define KVM_CAP_PPC_PAPR 68 #define KVM_CAP_SW_TLB 69 #define KVM_CAP_ONE_REG 70 -- cgit v1.2.3 From 54f65795c8f6e192540756085d738e66ab0917a0 Mon Sep 17 00:00:00 2001 From: Scott Wood Date: Wed, 11 Jan 2012 13:37:35 +0000 Subject: KVM: PPC: refer to paravirt docs in header file Instead of keeping separate copies of struct kvm_vcpu_arch_shared (one in the code, one in the docs) that inevitably fail to be kept in sync (already sr[] is missing from the doc version), just point to the header file as the source of documentation on the contents of the magic page. Signed-off-by: Scott Wood Acked-by: Avi Kivity Signed-off-by: Alexander Graf Signed-off-by: Avi Kivity --- Documentation/virtual/kvm/ppc-pv.txt | 24 ++---------------------- arch/powerpc/include/asm/kvm_para.h | 10 ++++++++++ 2 files changed, 12 insertions(+), 22 deletions(-) (limited to 'Documentation') diff --git a/Documentation/virtual/kvm/ppc-pv.txt b/Documentation/virtual/kvm/ppc-pv.txt index 2b7ce190cde4..6e7c37050930 100644 --- a/Documentation/virtual/kvm/ppc-pv.txt +++ b/Documentation/virtual/kvm/ppc-pv.txt @@ -81,28 +81,8 @@ additional registers to the magic page. If you add fields to the magic page, also define a new hypercall feature to indicate that the host can give you more registers. Only if the host supports the additional features, make use of them. -The magic page has the following layout as described in -arch/powerpc/include/asm/kvm_para.h: - -struct kvm_vcpu_arch_shared { - __u64 scratch1; - __u64 scratch2; - __u64 scratch3; - __u64 critical; /* Guest may not get interrupts if == r1 */ - __u64 sprg0; - __u64 sprg1; - __u64 sprg2; - __u64 sprg3; - __u64 srr0; - __u64 srr1; - __u64 dar; - __u64 msr; - __u32 dsisr; - __u32 int_pending; /* Tells the guest if we have an interrupt */ -}; - -Additions to the page must only occur at the end. Struct fields are always 32 -or 64 bit aligned, depending on them being 32 or 64 bit wide respectively. +The magic page layout is described by struct kvm_vcpu_arch_shared +in arch/powerpc/include/asm/kvm_para.h. Magic page features =================== diff --git a/arch/powerpc/include/asm/kvm_para.h b/arch/powerpc/include/asm/kvm_para.h index ece70fb36513..7b754e743003 100644 --- a/arch/powerpc/include/asm/kvm_para.h +++ b/arch/powerpc/include/asm/kvm_para.h @@ -22,6 +22,16 @@ #include +/* + * Additions to this struct must only occur at the end, and should be + * accompanied by a KVM_MAGIC_FEAT flag to advertise that they are present + * (albeit not necessarily relevant to the current target hardware platform). + * + * Struct fields are always 32 or 64 bit aligned, depending on them being 32 + * or 64 bit wide respectively. + * + * See Documentation/virtual/kvm/ppc-pv.txt + */ struct kvm_vcpu_arch_shared { __u64 scratch1; __u64 scratch2; -- cgit v1.2.3 From 3d2ddfdcf05f5f816f829f81858c54827d7be5b4 Mon Sep 17 00:00:00 2001 From: Laxman Dewangan Date: Fri, 17 Feb 2012 20:26:20 +0530 Subject: Documentation: gpio: Add details of open-drain/source configuration Adding details of open drain(open collector) and open source (open emitter) configuration of the gpio so that client can set the pin as open drain at the time of gpio request. Signed-off-by: Laxman Dewangan Reviwed-by: Mark Brown Acked-by: Linus Walleij Signed-off-by: Grant Likely --- Documentation/gpio.txt | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/gpio.txt b/Documentation/gpio.txt index 792faa3c06cf..f783e7fed764 100644 --- a/Documentation/gpio.txt +++ b/Documentation/gpio.txt @@ -302,6 +302,8 @@ where 'flags' is currently defined to specify the following properties: * GPIOF_INIT_LOW - as output, set initial level to LOW * GPIOF_INIT_HIGH - as output, set initial level to HIGH + * GPIOF_OPEN_DRAIN - gpio pin is open drain type. + * GPIOF_OPEN_SOURCE - gpio pin is open source type. since GPIOF_INIT_* are only valid when configured as output, so group valid combinations as: @@ -310,8 +312,19 @@ combinations as: * GPIOF_OUT_INIT_LOW - configured as output, initial level LOW * GPIOF_OUT_INIT_HIGH - configured as output, initial level HIGH -In the future, these flags can be extended to support more properties such -as open-drain status. +When setting the flag as GPIOF_OPEN_DRAIN then it will assume that pins is +open drain type. Such pins will not be driven to 1 in output mode. It is +require to connect pull-up on such pins. By enabling this flag, gpio lib will +make the direction to input when it is asked to set value of 1 in output mode +to make the pin HIGH. The pin is make to LOW by driving value 0 in output mode. + +When setting the flag as GPIOF_OPEN_SOURCE then it will assume that pins is +open source type. Such pins will not be driven to 0 in output mode. It is +require to connect pull-down on such pin. By enabling this flag, gpio lib will +make the direction to input when it is asked to set value of 0 in output mode +to make the pin LOW. The pin is make to HIGH by driving value 1 in output mode. + +In the future, these flags can be extended to support more properties. Further more, to ease the claim/release of multiple GPIOs, 'struct gpio' is introduced to encapsulate all three fields as: -- cgit v1.2.3 From 384ebe1c2849160d040df3e68634ec506f13d9ff Mon Sep 17 00:00:00 2001 From: Benoit Cousson Date: Tue, 16 Aug 2011 11:53:02 +0200 Subject: gpio/omap: Add DT support to GPIO driver Adapt the GPIO driver to retrieve information from a DT file. Allocate the irq_base dynamically and rename bank->virtual_irq_start to bank->irq_base. Change irq_base type to int instead of u16 to match irq_alloc_descs output. Add documentation for GPIO properties specific to OMAP. Signed-off-by: Benoit Cousson Cc: Tarun Kanti DebBarma Acked-by: Rob Herring --- .../devicetree/bindings/gpio/gpio-omap.txt | 36 ++++++ drivers/gpio/gpio-omap.c | 121 +++++++++++++++++++-- 2 files changed, 148 insertions(+), 9 deletions(-) create mode 100644 Documentation/devicetree/bindings/gpio/gpio-omap.txt (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/gpio/gpio-omap.txt b/Documentation/devicetree/bindings/gpio/gpio-omap.txt new file mode 100644 index 000000000000..bff51a2fee1e --- /dev/null +++ b/Documentation/devicetree/bindings/gpio/gpio-omap.txt @@ -0,0 +1,36 @@ +OMAP GPIO controller bindings + +Required properties: +- compatible: + - "ti,omap2-gpio" for OMAP2 controllers + - "ti,omap3-gpio" for OMAP3 controllers + - "ti,omap4-gpio" for OMAP4 controllers +- #gpio-cells : Should be two. + - first cell is the pin number + - second cell is used to specify optional parameters (unused) +- gpio-controller : Marks the device node as a GPIO controller. +- #interrupt-cells : Should be 2. +- interrupt-controller: Mark the device node as an interrupt controller + The first cell is the GPIO number. + The second cell is used to specify flags: + bits[3:0] trigger type and level flags: + 1 = low-to-high edge triggered. + 2 = high-to-low edge triggered. + 4 = active high level-sensitive. + 8 = active low level-sensitive. + +OMAP specific properties: +- ti,hwmods: Name of the hwmod associated to the GPIO: + "gpio", being the 1-based instance number from the HW spec + + +Example: + +gpio4: gpio4 { + compatible = "ti,omap4-gpio"; + ti,hwmods = "gpio4"; + #gpio-cells = <2>; + gpio-controller; + #interrupt-cells = <2>; + interrupt-controller; +}; diff --git a/drivers/gpio/gpio-omap.c b/drivers/gpio/gpio-omap.c index c3a9dc8fe732..bc2bd698ff2a 100644 --- a/drivers/gpio/gpio-omap.c +++ b/drivers/gpio/gpio-omap.c @@ -22,6 +22,9 @@ #include #include #include +#include +#include +#include #include #include @@ -52,7 +55,8 @@ struct gpio_bank { struct list_head node; void __iomem *base; u16 irq; - u16 virtual_irq_start; + int irq_base; + struct irq_domain *domain; u32 suspend_wakeup; u32 saved_wakeup; u32 non_wakeup_gpios; @@ -669,7 +673,7 @@ static void gpio_irq_handler(unsigned int irq, struct irq_desc *desc) if (!isr) break; - gpio_irq = bank->virtual_irq_start; + gpio_irq = bank->irq_base; for (; isr != 0; isr >>= 1, gpio_irq++) { gpio_index = GPIO_INDEX(bank, irq_to_gpio(gpio_irq)); @@ -915,7 +919,7 @@ static int gpio_2irq(struct gpio_chip *chip, unsigned offset) struct gpio_bank *bank; bank = container_of(chip, struct gpio_bank, chip); - return bank->virtual_irq_start + offset; + return bank->irq_base + offset; } /*---------------------------------------------------------------------*/ @@ -1028,8 +1032,7 @@ static void __devinit omap_gpio_chip_init(struct gpio_bank *bank) gpiochip_add(&bank->chip); - for (j = bank->virtual_irq_start; - j < bank->virtual_irq_start + bank->width; j++) { + for (j = bank->irq_base; j < bank->irq_base + bank->width; j++) { irq_set_lockdep_class(j, &gpio_lock_class); irq_set_chip_data(j, bank); if (bank->is_mpuio) { @@ -1044,15 +1047,22 @@ static void __devinit omap_gpio_chip_init(struct gpio_bank *bank) irq_set_handler_data(bank->irq, bank); } +static const struct of_device_id omap_gpio_match[]; + static int __devinit omap_gpio_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; + struct device_node *node = dev->of_node; + const struct of_device_id *match; struct omap_gpio_platform_data *pdata; struct resource *res; struct gpio_bank *bank; int ret = 0; - if (!dev->platform_data) + match = of_match_device(of_match_ptr(omap_gpio_match), dev); + + pdata = match ? match->data : dev->platform_data; + if (!pdata) return -EINVAL; bank = devm_kzalloc(&pdev->dev, sizeof(struct gpio_bank), GFP_KERNEL); @@ -1068,9 +1078,6 @@ static int __devinit omap_gpio_probe(struct platform_device *pdev) } bank->irq = res->start; - - pdata = pdev->dev.platform_data; - bank->virtual_irq_start = pdata->virtual_irq_start; bank->dev = dev; bank->dbck_flag = pdata->dbck_flag; bank->stride = pdata->bank_stride; @@ -1080,6 +1087,18 @@ static int __devinit omap_gpio_probe(struct platform_device *pdev) bank->loses_context = pdata->loses_context; bank->get_context_loss_count = pdata->get_context_loss_count; bank->regs = pdata->regs; +#ifdef CONFIG_OF_GPIO + bank->chip.of_node = of_node_get(node); +#endif + + bank->irq_base = irq_alloc_descs(-1, 0, bank->width, 0); + if (bank->irq_base < 0) { + dev_err(dev, "Couldn't allocate IRQ numbers\n"); + return -ENODEV; + } + + bank->domain = irq_domain_add_legacy(node, bank->width, bank->irq_base, + 0, &irq_domain_simple_ops, NULL); if (bank->regs->set_dataout && bank->regs->clr_dataout) bank->set_dataout = _set_gpio_dataout_reg; @@ -1387,11 +1406,95 @@ static const struct dev_pm_ops gpio_pm_ops = { NULL) }; +#if defined(CONFIG_OF) +static struct omap_gpio_reg_offs omap2_gpio_regs = { + .revision = OMAP24XX_GPIO_REVISION, + .direction = OMAP24XX_GPIO_OE, + .datain = OMAP24XX_GPIO_DATAIN, + .dataout = OMAP24XX_GPIO_DATAOUT, + .set_dataout = OMAP24XX_GPIO_SETDATAOUT, + .clr_dataout = OMAP24XX_GPIO_CLEARDATAOUT, + .irqstatus = OMAP24XX_GPIO_IRQSTATUS1, + .irqstatus2 = OMAP24XX_GPIO_IRQSTATUS2, + .irqenable = OMAP24XX_GPIO_IRQENABLE1, + .irqenable2 = OMAP24XX_GPIO_IRQENABLE2, + .set_irqenable = OMAP24XX_GPIO_SETIRQENABLE1, + .clr_irqenable = OMAP24XX_GPIO_CLEARIRQENABLE1, + .debounce = OMAP24XX_GPIO_DEBOUNCE_VAL, + .debounce_en = OMAP24XX_GPIO_DEBOUNCE_EN, + .ctrl = OMAP24XX_GPIO_CTRL, + .wkup_en = OMAP24XX_GPIO_WAKE_EN, + .leveldetect0 = OMAP24XX_GPIO_LEVELDETECT0, + .leveldetect1 = OMAP24XX_GPIO_LEVELDETECT1, + .risingdetect = OMAP24XX_GPIO_RISINGDETECT, + .fallingdetect = OMAP24XX_GPIO_FALLINGDETECT, +}; + +static struct omap_gpio_reg_offs omap4_gpio_regs = { + .revision = OMAP4_GPIO_REVISION, + .direction = OMAP4_GPIO_OE, + .datain = OMAP4_GPIO_DATAIN, + .dataout = OMAP4_GPIO_DATAOUT, + .set_dataout = OMAP4_GPIO_SETDATAOUT, + .clr_dataout = OMAP4_GPIO_CLEARDATAOUT, + .irqstatus = OMAP4_GPIO_IRQSTATUS0, + .irqstatus2 = OMAP4_GPIO_IRQSTATUS1, + .irqenable = OMAP4_GPIO_IRQSTATUSSET0, + .irqenable2 = OMAP4_GPIO_IRQSTATUSSET1, + .set_irqenable = OMAP4_GPIO_IRQSTATUSSET0, + .clr_irqenable = OMAP4_GPIO_IRQSTATUSCLR0, + .debounce = OMAP4_GPIO_DEBOUNCINGTIME, + .debounce_en = OMAP4_GPIO_DEBOUNCENABLE, + .ctrl = OMAP4_GPIO_CTRL, + .wkup_en = OMAP4_GPIO_IRQWAKEN0, + .leveldetect0 = OMAP4_GPIO_LEVELDETECT0, + .leveldetect1 = OMAP4_GPIO_LEVELDETECT1, + .risingdetect = OMAP4_GPIO_RISINGDETECT, + .fallingdetect = OMAP4_GPIO_FALLINGDETECT, +}; + +static struct omap_gpio_platform_data omap2_pdata = { + .regs = &omap2_gpio_regs, + .bank_width = 32, + .dbck_flag = false, +}; + +static struct omap_gpio_platform_data omap3_pdata = { + .regs = &omap2_gpio_regs, + .bank_width = 32, + .dbck_flag = true, +}; + +static struct omap_gpio_platform_data omap4_pdata = { + .regs = &omap4_gpio_regs, + .bank_width = 32, + .dbck_flag = true, +}; + +static const struct of_device_id omap_gpio_match[] = { + { + .compatible = "ti,omap4-gpio", + .data = &omap4_pdata, + }, + { + .compatible = "ti,omap3-gpio", + .data = &omap3_pdata, + }, + { + .compatible = "ti,omap2-gpio", + .data = &omap2_pdata, + }, + { }, +}; +MODULE_DEVICE_TABLE(of, omap_gpio_match); +#endif + static struct platform_driver omap_gpio_driver = { .probe = omap_gpio_probe, .driver = { .name = "omap_gpio", .pm = &gpio_pm_ops, + .of_match_table = of_match_ptr(omap_gpio_match), }, }; -- cgit v1.2.3 From d5dfcc91b179f40652318b9f3d76c31d2f908000 Mon Sep 17 00:00:00 2001 From: Simon Glass Date: Tue, 6 Mar 2012 21:04:32 -0800 Subject: arm: tegra: dts: Support host/device selection and legacy mode Some USB ports can support host and device operation. We add the dr_mode property (as found in Freescale) for this. One USB port has a 'legacy mode', left over from the days of pre-Tegra chips. I don't believe this is actually used, except that we must know to turn this off in the driver. Signed-off-by: Simon Glass Acked-by: Stephen Warren Signed-off-by: Olof Johansson --- Documentation/devicetree/bindings/usb/tegra-usb.txt | 13 +++++++++++++ 1 file changed, 13 insertions(+) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/usb/tegra-usb.txt b/Documentation/devicetree/bindings/usb/tegra-usb.txt index 035d63d5646d..007005ddbe12 100644 --- a/Documentation/devicetree/bindings/usb/tegra-usb.txt +++ b/Documentation/devicetree/bindings/usb/tegra-usb.txt @@ -11,3 +11,16 @@ Required properties : - phy_type : Should be one of "ulpi" or "utmi". - nvidia,vbus-gpio : If present, specifies a gpio that needs to be activated for the bus to be powered. + +Optional properties: + - dr_mode : dual role mode. Indicates the working mode for + nvidia,tegra20-ehci compatible controllers. Can be "host", "peripheral", + or "otg". Default to "host" if not defined for backward compatibility. + host means this is a host controller + peripheral means it is device controller + otg means it can operate as either ("on the go") + - nvidia,has-legacy-mode : boolean indicates whether this controller can + operate in legacy mode (as APX 2500 / 2600). In legacy mode some + registers are accessed through the APB_MISC base address instead of + the USB controller. Since this is a legacy issue it probably does not + warrant a compatible string of its own. -- cgit v1.2.3 From 07700a94b00a4fcbbfb07d1b72dc112a0e036735 Mon Sep 17 00:00:00 2001 From: Jan Kiszka Date: Tue, 28 Feb 2012 14:19:54 +0100 Subject: KVM: Allow host IRQ sharing for assigned PCI 2.3 devices PCI 2.3 allows to generically disable IRQ sources at device level. This enables us to share legacy IRQs of such devices with other host devices when passing them to a guest. The new IRQ sharing feature introduced here is optional, user space has to request it explicitly. Moreover, user space can inform us about its view of PCI_COMMAND_INTX_DISABLE so that we can avoid unmasking the interrupt and signaling it if the guest masked it via the virtualized PCI config space. Signed-off-by: Jan Kiszka Acked-by: Alex Williamson Acked-by: Michael S. Tsirkin Signed-off-by: Avi Kivity --- Documentation/virtual/kvm/api.txt | 41 ++++++++ arch/x86/kvm/x86.c | 1 + include/linux/kvm.h | 6 ++ include/linux/kvm_host.h | 2 + virt/kvm/assigned-dev.c | 209 ++++++++++++++++++++++++++++++++------ 5 files changed, 230 insertions(+), 29 deletions(-) (limited to 'Documentation') diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 59a38264a0ed..6386f8c0482e 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -1169,6 +1169,14 @@ following flags are specified: /* Depends on KVM_CAP_IOMMU */ #define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0) +/* The following two depend on KVM_CAP_PCI_2_3 */ +#define KVM_DEV_ASSIGN_PCI_2_3 (1 << 1) +#define KVM_DEV_ASSIGN_MASK_INTX (1 << 2) + +If KVM_DEV_ASSIGN_PCI_2_3 is set, the kernel will manage legacy INTx interrupts +via the PCI-2.3-compliant device-level mask, thus enable IRQ sharing with other +assigned devices or host devices. KVM_DEV_ASSIGN_MASK_INTX specifies the +guest's view on the INTx mask, see KVM_ASSIGN_SET_INTX_MASK for details. The KVM_DEV_ASSIGN_ENABLE_IOMMU flag is a mandatory option to ensure isolation of the device. Usages not specifying this flag are deprecated. @@ -1441,6 +1449,39 @@ The "num_dirty" field is a performance hint for KVM to determine whether it should skip processing the bitmap and just invalidate everything. It must be set to the number of set bits in the bitmap. +4.60 KVM_ASSIGN_SET_INTX_MASK + +Capability: KVM_CAP_PCI_2_3 +Architectures: x86 +Type: vm ioctl +Parameters: struct kvm_assigned_pci_dev (in) +Returns: 0 on success, -1 on error + +Allows userspace to mask PCI INTx interrupts from the assigned device. The +kernel will not deliver INTx interrupts to the guest between setting and +clearing of KVM_ASSIGN_SET_INTX_MASK via this interface. This enables use of +and emulation of PCI 2.3 INTx disable command register behavior. + +This may be used for both PCI 2.3 devices supporting INTx disable natively and +older devices lacking this support. Userspace is responsible for emulating the +read value of the INTx disable bit in the guest visible PCI command register. +When modifying the INTx disable state, userspace should precede updating the +physical device command register by calling this ioctl to inform the kernel of +the new intended INTx mask state. + +Note that the kernel uses the device INTx disable bit to internally manage the +device interrupt state for PCI 2.3 devices. Reads of this register may +therefore not match the expected value. Writes should always use the guest +intended INTx disable value rather than attempting to read-copy-update the +current physical device state. Races between user and kernel updates to the +INTx disable bit are handled lazily in the kernel. It's possible the device +may generate unintended interrupts, but they will not be injected into the +guest. + +See KVM_ASSIGN_DEV_IRQ for the data structure. The target device is specified +by assigned_dev_id. In the flags field, only KVM_DEV_ASSIGN_MASK_INTX is +evaluated. + 4.62 KVM_CREATE_SPAPR_TCE Capability: KVM_CAP_SPAPR_TCE diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 9477dc6cccae..6866083a48c1 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2143,6 +2143,7 @@ int kvm_dev_ioctl_check_extension(long ext) case KVM_CAP_XSAVE: case KVM_CAP_ASYNC_PF: case KVM_CAP_GET_TSC_KHZ: + case KVM_CAP_PCI_2_3: r = 1; break; case KVM_CAP_COALESCED_MMIO: diff --git a/include/linux/kvm.h b/include/linux/kvm.h index acbe42939089..6c322a90b92f 100644 --- a/include/linux/kvm.h +++ b/include/linux/kvm.h @@ -588,6 +588,7 @@ struct kvm_ppc_pvinfo { #define KVM_CAP_TSC_DEADLINE_TIMER 72 #define KVM_CAP_S390_UCONTROL 73 #define KVM_CAP_SYNC_REGS 74 +#define KVM_CAP_PCI_2_3 75 #ifdef KVM_CAP_IRQ_ROUTING @@ -784,6 +785,9 @@ struct kvm_s390_ucas_mapping { /* Available with KVM_CAP_TSC_CONTROL */ #define KVM_SET_TSC_KHZ _IO(KVMIO, 0xa2) #define KVM_GET_TSC_KHZ _IO(KVMIO, 0xa3) +/* Available with KVM_CAP_PCI_2_3 */ +#define KVM_ASSIGN_SET_INTX_MASK _IOW(KVMIO, 0xa4, \ + struct kvm_assigned_pci_dev) /* * ioctls for vcpu fds @@ -857,6 +861,8 @@ struct kvm_s390_ucas_mapping { #define KVM_SET_ONE_REG _IOW(KVMIO, 0xac, struct kvm_one_reg) #define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0) +#define KVM_DEV_ASSIGN_PCI_2_3 (1 << 1) +#define KVM_DEV_ASSIGN_MASK_INTX (1 << 2) struct kvm_assigned_pci_dev { __u32 assigned_dev_id; diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index e42d85ae8541..ec171c1d0878 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -546,6 +546,7 @@ struct kvm_assigned_dev_kernel { unsigned int entries_nr; int host_irq; bool host_irq_disabled; + bool pci_2_3; struct msix_entry *host_msix_entries; int guest_irq; struct msix_entry *guest_msix_entries; @@ -555,6 +556,7 @@ struct kvm_assigned_dev_kernel { struct pci_dev *dev; struct kvm *kvm; spinlock_t intx_lock; + struct mutex intx_mask_lock; char irq_name[32]; struct pci_saved_state *pci_saved_state; }; diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c index ece80612b594..08e05715df72 100644 --- a/virt/kvm/assigned-dev.c +++ b/virt/kvm/assigned-dev.c @@ -55,22 +55,66 @@ static int find_index_from_host_irq(struct kvm_assigned_dev_kernel return index; } -static irqreturn_t kvm_assigned_dev_thread(int irq, void *dev_id) +static irqreturn_t kvm_assigned_dev_intx(int irq, void *dev_id) { struct kvm_assigned_dev_kernel *assigned_dev = dev_id; + int ret; - if (assigned_dev->irq_requested_type & KVM_DEV_IRQ_HOST_INTX) { - spin_lock(&assigned_dev->intx_lock); + spin_lock(&assigned_dev->intx_lock); + if (pci_check_and_mask_intx(assigned_dev->dev)) { + assigned_dev->host_irq_disabled = true; + ret = IRQ_WAKE_THREAD; + } else + ret = IRQ_NONE; + spin_unlock(&assigned_dev->intx_lock); + + return ret; +} + +static void +kvm_assigned_dev_raise_guest_irq(struct kvm_assigned_dev_kernel *assigned_dev, + int vector) +{ + if (unlikely(assigned_dev->irq_requested_type & + KVM_DEV_IRQ_GUEST_INTX)) { + mutex_lock(&assigned_dev->intx_mask_lock); + if (!(assigned_dev->flags & KVM_DEV_ASSIGN_MASK_INTX)) + kvm_set_irq(assigned_dev->kvm, + assigned_dev->irq_source_id, vector, 1); + mutex_unlock(&assigned_dev->intx_mask_lock); + } else + kvm_set_irq(assigned_dev->kvm, assigned_dev->irq_source_id, + vector, 1); +} + +static irqreturn_t kvm_assigned_dev_thread_intx(int irq, void *dev_id) +{ + struct kvm_assigned_dev_kernel *assigned_dev = dev_id; + + if (!(assigned_dev->flags & KVM_DEV_ASSIGN_PCI_2_3)) { + spin_lock_irq(&assigned_dev->intx_lock); disable_irq_nosync(irq); assigned_dev->host_irq_disabled = true; - spin_unlock(&assigned_dev->intx_lock); + spin_unlock_irq(&assigned_dev->intx_lock); } - kvm_set_irq(assigned_dev->kvm, assigned_dev->irq_source_id, - assigned_dev->guest_irq, 1); + kvm_assigned_dev_raise_guest_irq(assigned_dev, + assigned_dev->guest_irq); + + return IRQ_HANDLED; +} + +#ifdef __KVM_HAVE_MSI +static irqreturn_t kvm_assigned_dev_thread_msi(int irq, void *dev_id) +{ + struct kvm_assigned_dev_kernel *assigned_dev = dev_id; + + kvm_assigned_dev_raise_guest_irq(assigned_dev, + assigned_dev->guest_irq); return IRQ_HANDLED; } +#endif #ifdef __KVM_HAVE_MSIX static irqreturn_t kvm_assigned_dev_thread_msix(int irq, void *dev_id) @@ -81,8 +125,7 @@ static irqreturn_t kvm_assigned_dev_thread_msix(int irq, void *dev_id) if (index >= 0) { vector = assigned_dev->guest_msix_entries[index].vector; - kvm_set_irq(assigned_dev->kvm, assigned_dev->irq_source_id, - vector, 1); + kvm_assigned_dev_raise_guest_irq(assigned_dev, vector); } return IRQ_HANDLED; @@ -98,15 +141,31 @@ static void kvm_assigned_dev_ack_irq(struct kvm_irq_ack_notifier *kian) kvm_set_irq(dev->kvm, dev->irq_source_id, dev->guest_irq, 0); - /* The guest irq may be shared so this ack may be - * from another device. - */ - spin_lock(&dev->intx_lock); - if (dev->host_irq_disabled) { - enable_irq(dev->host_irq); - dev->host_irq_disabled = false; + mutex_lock(&dev->intx_mask_lock); + + if (!(dev->flags & KVM_DEV_ASSIGN_MASK_INTX)) { + bool reassert = false; + + spin_lock_irq(&dev->intx_lock); + /* + * The guest IRQ may be shared so this ack can come from an + * IRQ for another guest device. + */ + if (dev->host_irq_disabled) { + if (!(dev->flags & KVM_DEV_ASSIGN_PCI_2_3)) + enable_irq(dev->host_irq); + else if (!pci_check_and_unmask_intx(dev->dev)) + reassert = true; + dev->host_irq_disabled = reassert; + } + spin_unlock_irq(&dev->intx_lock); + + if (reassert) + kvm_set_irq(dev->kvm, dev->irq_source_id, + dev->guest_irq, 1); } - spin_unlock(&dev->intx_lock); + + mutex_unlock(&dev->intx_mask_lock); } static void deassign_guest_irq(struct kvm *kvm, @@ -154,7 +213,15 @@ static void deassign_host_irq(struct kvm *kvm, pci_disable_msix(assigned_dev->dev); } else { /* Deal with MSI and INTx */ - disable_irq(assigned_dev->host_irq); + if ((assigned_dev->irq_requested_type & + KVM_DEV_IRQ_HOST_INTX) && + (assigned_dev->flags & KVM_DEV_ASSIGN_PCI_2_3)) { + spin_lock_irq(&assigned_dev->intx_lock); + pci_intx(assigned_dev->dev, false); + spin_unlock_irq(&assigned_dev->intx_lock); + synchronize_irq(assigned_dev->host_irq); + } else + disable_irq(assigned_dev->host_irq); free_irq(assigned_dev->host_irq, assigned_dev); @@ -235,15 +302,34 @@ void kvm_free_all_assigned_devices(struct kvm *kvm) static int assigned_device_enable_host_intx(struct kvm *kvm, struct kvm_assigned_dev_kernel *dev) { + irq_handler_t irq_handler; + unsigned long flags; + dev->host_irq = dev->dev->irq; - /* Even though this is PCI, we don't want to use shared - * interrupts. Sharing host devices with guest-assigned devices - * on the same interrupt line is not a happy situation: there - * are going to be long delays in accepting, acking, etc. + + /* + * We can only share the IRQ line with other host devices if we are + * able to disable the IRQ source at device-level - independently of + * the guest driver. Otherwise host devices may suffer from unbounded + * IRQ latencies when the guest keeps the line asserted. */ - if (request_threaded_irq(dev->host_irq, NULL, kvm_assigned_dev_thread, - IRQF_ONESHOT, dev->irq_name, dev)) + if (dev->flags & KVM_DEV_ASSIGN_PCI_2_3) { + irq_handler = kvm_assigned_dev_intx; + flags = IRQF_SHARED; + } else { + irq_handler = NULL; + flags = IRQF_ONESHOT; + } + if (request_threaded_irq(dev->host_irq, irq_handler, + kvm_assigned_dev_thread_intx, flags, + dev->irq_name, dev)) return -EIO; + + if (dev->flags & KVM_DEV_ASSIGN_PCI_2_3) { + spin_lock_irq(&dev->intx_lock); + pci_intx(dev->dev, true); + spin_unlock_irq(&dev->intx_lock); + } return 0; } @@ -260,8 +346,9 @@ static int assigned_device_enable_host_msi(struct kvm *kvm, } dev->host_irq = dev->dev->irq; - if (request_threaded_irq(dev->host_irq, NULL, kvm_assigned_dev_thread, - 0, dev->irq_name, dev)) { + if (request_threaded_irq(dev->host_irq, NULL, + kvm_assigned_dev_thread_msi, 0, + dev->irq_name, dev)) { pci_disable_msi(dev->dev); return -EIO; } @@ -319,7 +406,6 @@ static int assigned_device_enable_guest_msi(struct kvm *kvm, { dev->guest_irq = irq->guest_irq; dev->ack_notifier.gsi = -1; - dev->host_irq_disabled = false; return 0; } #endif @@ -331,7 +417,6 @@ static int assigned_device_enable_guest_msix(struct kvm *kvm, { dev->guest_irq = irq->guest_irq; dev->ack_notifier.gsi = -1; - dev->host_irq_disabled = false; return 0; } #endif @@ -365,6 +450,7 @@ static int assign_host_irq(struct kvm *kvm, default: r = -EINVAL; } + dev->host_irq_disabled = false; if (!r) dev->irq_requested_type |= host_irq_type; @@ -466,6 +552,7 @@ static int kvm_vm_ioctl_deassign_dev_irq(struct kvm *kvm, { int r = -ENODEV; struct kvm_assigned_dev_kernel *match; + unsigned long irq_type; mutex_lock(&kvm->lock); @@ -474,7 +561,9 @@ static int kvm_vm_ioctl_deassign_dev_irq(struct kvm *kvm, if (!match) goto out; - r = kvm_deassign_irq(kvm, match, assigned_irq->flags); + irq_type = assigned_irq->flags & (KVM_DEV_IRQ_HOST_MASK | + KVM_DEV_IRQ_GUEST_MASK); + r = kvm_deassign_irq(kvm, match, irq_type); out: mutex_unlock(&kvm->lock); return r; @@ -607,6 +696,10 @@ static int kvm_vm_ioctl_assign_device(struct kvm *kvm, if (!match->pci_saved_state) printk(KERN_DEBUG "%s: Couldn't store %s saved state\n", __func__, dev_name(&dev->dev)); + + if (!pci_intx_mask_supported(dev)) + assigned_dev->flags &= ~KVM_DEV_ASSIGN_PCI_2_3; + match->assigned_dev_id = assigned_dev->assigned_dev_id; match->host_segnr = assigned_dev->segnr; match->host_busnr = assigned_dev->busnr; @@ -614,6 +707,7 @@ static int kvm_vm_ioctl_assign_device(struct kvm *kvm, match->flags = assigned_dev->flags; match->dev = dev; spin_lock_init(&match->intx_lock); + mutex_init(&match->intx_mask_lock); match->irq_source_id = -1; match->kvm = kvm; match->ack_notifier.irq_acked = kvm_assigned_dev_ack_irq; @@ -759,6 +853,55 @@ msix_entry_out: } #endif +static int kvm_vm_ioctl_set_pci_irq_mask(struct kvm *kvm, + struct kvm_assigned_pci_dev *assigned_dev) +{ + int r = 0; + struct kvm_assigned_dev_kernel *match; + + mutex_lock(&kvm->lock); + + match = kvm_find_assigned_dev(&kvm->arch.assigned_dev_head, + assigned_dev->assigned_dev_id); + if (!match) { + r = -ENODEV; + goto out; + } + + mutex_lock(&match->intx_mask_lock); + + match->flags &= ~KVM_DEV_ASSIGN_MASK_INTX; + match->flags |= assigned_dev->flags & KVM_DEV_ASSIGN_MASK_INTX; + + if (match->irq_requested_type & KVM_DEV_IRQ_GUEST_INTX) { + if (assigned_dev->flags & KVM_DEV_ASSIGN_MASK_INTX) { + kvm_set_irq(match->kvm, match->irq_source_id, + match->guest_irq, 0); + /* + * Masking at hardware-level is performed on demand, + * i.e. when an IRQ actually arrives at the host. + */ + } else if (!(assigned_dev->flags & KVM_DEV_ASSIGN_PCI_2_3)) { + /* + * Unmask the IRQ line if required. Unmasking at + * device level will be performed by user space. + */ + spin_lock_irq(&match->intx_lock); + if (match->host_irq_disabled) { + enable_irq(match->host_irq); + match->host_irq_disabled = false; + } + spin_unlock_irq(&match->intx_lock); + } + } + + mutex_unlock(&match->intx_mask_lock); + +out: + mutex_unlock(&kvm->lock); + return r; +} + long kvm_vm_ioctl_assigned_device(struct kvm *kvm, unsigned ioctl, unsigned long arg) { @@ -866,6 +1009,15 @@ long kvm_vm_ioctl_assigned_device(struct kvm *kvm, unsigned ioctl, break; } #endif + case KVM_ASSIGN_SET_INTX_MASK: { + struct kvm_assigned_pci_dev assigned_dev; + + r = -EFAULT; + if (copy_from_user(&assigned_dev, argp, sizeof assigned_dev)) + goto out; + r = kvm_vm_ioctl_set_pci_irq_mask(kvm, &assigned_dev); + break; + } default: r = -ENOTTY; break; @@ -873,4 +1025,3 @@ long kvm_vm_ioctl_assigned_device(struct kvm *kvm, unsigned ioctl, out: return r; } - -- cgit v1.2.3 From 0dc665d426691fd75fe9b6b16295ad0c02677d21 Mon Sep 17 00:00:00 2001 From: Stephen Warren Date: Mon, 5 Mar 2012 17:22:14 -0700 Subject: Documentation/gpio.txt: Explain expected pinctrl interaction Update gpio.txt based on recent discussions regarding interaction with the pinctrl subsystem. Previously, gpio_request() was described as explicitly not performing any required mux setup operations etc. Now, gpio_request() is explicitly as explicitly performing any required mux setup operations where possible. In the case it isn't, platform code is required to have set up any required muxing or other configuration prior to gpio_request() being called, in order to maintain the same semantics. This is achieved by gpiolib drivers calling e.g. pinctrl_request_gpio() in their .request() operation. Cc: Randy Dunlap Cc: linux-doc@vger.kernel.org Signed-off-by: Stephen Warren Reviewed-by: Linus Walleij Signed-off-by: Grant Likely --- Documentation/gpio.txt | 23 ++++++++++++++++++++--- 1 file changed, 20 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/gpio.txt b/Documentation/gpio.txt index f783e7fed764..620a07844e8c 100644 --- a/Documentation/gpio.txt +++ b/Documentation/gpio.txt @@ -271,9 +271,26 @@ Some platforms may also use knowledge about what GPIOs are active for power management, such as by powering down unused chip sectors and, more easily, gating off unused clocks. -Note that requesting a GPIO does NOT cause it to be configured in any -way; it just marks that GPIO as in use. Separate code must handle any -pin setup (e.g. controlling which pin the GPIO uses, pullup/pulldown). +For GPIOs that use pins known to the pinctrl subsystem, that subsystem should +be informed of their use; a gpiolib driver's .request() operation may call +pinctrl_request_gpio(), and a gpiolib driver's .free() operation may call +pinctrl_free_gpio(). The pinctrl subsystem allows a pinctrl_request_gpio() +to succeed concurrently with a pin or pingroup being "owned" by a device for +pin multiplexing. + +Any programming of pin multiplexing hardware that is needed to route the +GPIO signal to the appropriate pin should occur within a GPIO driver's +.direction_input() or .direction_output() operations, and occur after any +setup of an output GPIO's value. This allows a glitch-free migration from a +pin's special function to GPIO. This is sometimes required when using a GPIO +to implement a workaround on signals typically driven by a non-GPIO HW block. + +Some platforms allow some or all GPIO signals to be routed to different pins. +Similarly, other aspects of the GPIO or pin may need to be configured, such as +pullup/pulldown. Platform software should arrange that any such details are +configured prior to gpio_request() being called for those GPIOs, e.g. using +the pinctrl subsystem's mapping table, so that GPIO users need not be aware +of these details. Also note that it's your responsibility to have stopped using a GPIO before you free it. -- cgit v1.2.3 From 770d7c39af940da24dd4c2c048576d778ac0abd4 Mon Sep 17 00:00:00 2001 From: Jean-Christophe PLAGNIOL-VILLARD Date: Sat, 28 Jan 2012 12:12:36 +0800 Subject: of/mtd/nand: add generic bindings and helpers - nand-ecc-mode : String, operation mode of the NAND ecc mode. Supported values are: "none", "soft", "hw", "hw_syndrome", "hw_oob_first", "soft_bch". - nand-bus-width : 8 or 16 bus width if not present 8 - nand-on-flash-bbt: boolean to enable on flash bbt option if not present false Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD Acked-by: Grant Likely Acked-by: Stefan Roese --- Documentation/devicetree/bindings/mtd/nand.txt | 7 +++ drivers/of/Kconfig | 4 ++ drivers/of/Makefile | 1 + drivers/of/of_mtd.c | 85 ++++++++++++++++++++++++++ include/linux/of_mtd.h | 19 ++++++ 5 files changed, 116 insertions(+) create mode 100644 Documentation/devicetree/bindings/mtd/nand.txt create mode 100644 drivers/of/of_mtd.c create mode 100644 include/linux/of_mtd.h (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/mtd/nand.txt b/Documentation/devicetree/bindings/mtd/nand.txt new file mode 100644 index 000000000000..03855c8c492a --- /dev/null +++ b/Documentation/devicetree/bindings/mtd/nand.txt @@ -0,0 +1,7 @@ +* MTD generic binding + +- nand-ecc-mode : String, operation mode of the NAND ecc mode. + Supported values are: "none", "soft", "hw", "hw_syndrome", "hw_oob_first", + "soft_bch". +- nand-bus-width : 8 or 16 bus width if not present 8 +- nand-on-flash-bbt: boolean to enable on flash bbt option if not present false diff --git a/drivers/of/Kconfig b/drivers/of/Kconfig index 268163dd71c7..fa666a93540c 100644 --- a/drivers/of/Kconfig +++ b/drivers/of/Kconfig @@ -90,4 +90,8 @@ config OF_PCI_IRQ help OpenFirmware PCI IRQ routing helpers +config OF_MTD + depends on MTD + def_bool y + endmenu # OF diff --git a/drivers/of/Makefile b/drivers/of/Makefile index a73f5a51ff4c..aa90e602c8a7 100644 --- a/drivers/of/Makefile +++ b/drivers/of/Makefile @@ -12,3 +12,4 @@ obj-$(CONFIG_OF_SELFTEST) += selftest.o obj-$(CONFIG_OF_MDIO) += of_mdio.o obj-$(CONFIG_OF_PCI) += of_pci.o obj-$(CONFIG_OF_PCI_IRQ) += of_pci_irq.o +obj-$(CONFIG_OF_MTD) += of_mtd.o diff --git a/drivers/of/of_mtd.c b/drivers/of/of_mtd.c new file mode 100644 index 000000000000..e7cad627a5d1 --- /dev/null +++ b/drivers/of/of_mtd.c @@ -0,0 +1,85 @@ +/* + * Copyright 2012 Jean-Christophe PLAGNIOL-VILLARD + * + * OF helpers for mtd. + * + * This file is released under the GPLv2 + * + */ +#include +#include +#include +#include + +/** + * It maps 'enum nand_ecc_modes_t' found in include/linux/mtd/nand.h + * into the device tree binding of 'nand-ecc', so that MTD + * device driver can get nand ecc from device tree. + */ +static const char *nand_ecc_modes[] = { + [NAND_ECC_NONE] = "none", + [NAND_ECC_SOFT] = "soft", + [NAND_ECC_HW] = "hw", + [NAND_ECC_HW_SYNDROME] = "hw_syndrome", + [NAND_ECC_HW_OOB_FIRST] = "hw_oob_first", + [NAND_ECC_SOFT_BCH] = "soft_bch", +}; + +/** + * of_get_nand_ecc_mode - Get nand ecc mode for given device_node + * @np: Pointer to the given device_node + * + * The function gets ecc mode string from property 'nand-ecc-mode', + * and return its index in nand_ecc_modes table, or errno in error case. + */ +const int of_get_nand_ecc_mode(struct device_node *np) +{ + const char *pm; + int err, i; + + err = of_property_read_string(np, "nand-ecc-mode", &pm); + if (err < 0) + return err; + + for (i = 0; i < ARRAY_SIZE(nand_ecc_modes); i++) + if (!strcasecmp(pm, nand_ecc_modes[i])) + return i; + + return -ENODEV; +} +EXPORT_SYMBOL_GPL(of_get_nand_ecc_mode); + +/** + * of_get_nand_bus_width - Get nand bus witdh for given device_node + * @np: Pointer to the given device_node + * + * return bus width option, or errno in error case. + */ +int of_get_nand_bus_width(struct device_node *np) +{ + u32 val; + + if (of_property_read_u32(np, "nand-bus-width", &val)) + return 8; + + switch(val) { + case 8: + case 16: + return val; + default: + return -EIO; + } +} +EXPORT_SYMBOL_GPL(of_get_nand_bus_width); + +/** + * of_get_nand_on_flash_bbt - Get nand on flash bbt for given device_node + * @np: Pointer to the given device_node + * + * return true if present false other wise + */ +bool of_get_nand_on_flash_bbt(struct device_node *np) +{ + return of_property_read_bool(np, "nand-on-flash-bbt"); +} +EXPORT_SYMBOL_GPL(of_get_nand_on_flash_bbt); diff --git a/include/linux/of_mtd.h b/include/linux/of_mtd.h new file mode 100644 index 000000000000..bae1b6094c63 --- /dev/null +++ b/include/linux/of_mtd.h @@ -0,0 +1,19 @@ +/* + * Copyright 2012 Jean-Christophe PLAGNIOL-VILLARD + * + * OF helpers for mtd. + * + * This file is released under the GPLv2 + */ + +#ifndef __LINUX_OF_MTD_H +#define __LINUX_OF_NET_H + +#ifdef CONFIG_OF_MTD +#include +extern const int of_get_nand_ecc_mode(struct device_node *np); +int of_get_nand_bus_width(struct device_node *np); +bool of_get_nand_on_flash_bbt(struct device_node *np); +#endif + +#endif /* __LINUX_OF_MTD_H */ -- cgit v1.2.3 From d6a016616ba834b7da7653effb98d413acde7aa2 Mon Sep 17 00:00:00 2001 From: Jean-Christophe PLAGNIOL-VILLARD Date: Thu, 26 Jan 2012 02:11:06 +0800 Subject: atmel/nand: add DT support Use a local copy of board informatin and fill with DT data. Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD Acked-by: Grant Likely Cc: Nicolas Ferre --- .../devicetree/bindings/mtd/atmel-nand.txt | 41 ++++++++ arch/arm/boot/dts/at91sam9g20.dtsi | 16 +++ arch/arm/boot/dts/at91sam9g45.dtsi | 16 +++ arch/arm/boot/dts/at91sam9m10g45ek.dts | 25 ++++- arch/arm/boot/dts/usb_a9g20.dts | 44 +++++++- arch/arm/mach-at91/at91sam9x5.c | 5 - arch/arm/mach-at91/board-dt.c | 51 ---------- drivers/mtd/nand/atmel_nand.c | 113 +++++++++++++++++---- 8 files changed, 233 insertions(+), 78 deletions(-) create mode 100644 Documentation/devicetree/bindings/mtd/atmel-nand.txt (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/mtd/atmel-nand.txt b/Documentation/devicetree/bindings/mtd/atmel-nand.txt new file mode 100644 index 000000000000..5903ecf6e895 --- /dev/null +++ b/Documentation/devicetree/bindings/mtd/atmel-nand.txt @@ -0,0 +1,41 @@ +Atmel NAND flash + +Required properties: +- compatible : "atmel,at91rm9200-nand". +- reg : should specify localbus address and size used for the chip, + and if availlable the ECC. +- atmel,nand-addr-offset : offset for the address latch. +- atmel,nand-cmd-offset : offset for the command latch. +- #address-cells, #size-cells : Must be present if the device has sub-nodes + representing partitions. + +- gpios : specifies the gpio pins to control the NAND device. detect is an + optional gpio and may be set to 0 if not present. + +Optional properties: +- nand-ecc-mode : String, operation mode of the NAND ecc mode, soft by default. + Supported values are: "none", "soft", "hw", "hw_syndrome", "hw_oob_first", + "soft_bch". +- nand-bus-width : 8 or 16 bus width if not present 8 +- nand-on-flash-bbt: boolean to enable on flash bbt option if not present false + +Examples: +nand0: nand@40000000,0 { + compatible = "atmel,at91rm9200-nand"; + #address-cells = <1>; + #size-cells = <1>; + reg = <0x40000000 0x10000000 + 0xffffe800 0x200 + >; + atmel,nand-addr-offset = <21>; + atmel,nand-cmd-offset = <22>; + nand-on-flash-bbt; + nand-ecc-mode = "soft"; + gpios = <&pioC 13 0 + &pioC 14 0 + 0 + >; + partition@0 { + ... + }; +}; diff --git a/arch/arm/boot/dts/at91sam9g20.dtsi b/arch/arm/boot/dts/at91sam9g20.dtsi index a100db03ec90..4b0dc99b9319 100644 --- a/arch/arm/boot/dts/at91sam9g20.dtsi +++ b/arch/arm/boot/dts/at91sam9g20.dtsi @@ -172,5 +172,21 @@ status = "disabled"; }; }; + + nand0: nand@40000000 { + compatible = "atmel,at91rm9200-nand"; + #address-cells = <1>; + #size-cells = <1>; + reg = <0x40000000 0x10000000 + 0xffffe800 0x200 + >; + atmel,nand-addr-offset = <21>; + atmel,nand-cmd-offset = <22>; + gpios = <&pioC 13 0 + &pioC 14 0 + 0 + >; + status = "disabled"; + }; }; }; diff --git a/arch/arm/boot/dts/at91sam9g45.dtsi b/arch/arm/boot/dts/at91sam9g45.dtsi index f779667159b1..d79021b831c4 100644 --- a/arch/arm/boot/dts/at91sam9g45.dtsi +++ b/arch/arm/boot/dts/at91sam9g45.dtsi @@ -180,5 +180,21 @@ status = "disabled"; }; }; + + nand0: nand@40000000 { + compatible = "atmel,at91rm9200-nand"; + #address-cells = <1>; + #size-cells = <1>; + reg = <0x40000000 0x10000000 + 0xffffe200 0x200 + >; + atmel,nand-addr-offset = <21>; + atmel,nand-cmd-offset = <22>; + gpios = <&pioC 8 0 + &pioC 14 0 + 0 + >; + status = "disabled"; + }; }; }; diff --git a/arch/arm/boot/dts/at91sam9m10g45ek.dts b/arch/arm/boot/dts/at91sam9m10g45ek.dts index 15e25f903cad..fd4531135431 100644 --- a/arch/arm/boot/dts/at91sam9m10g45ek.dts +++ b/arch/arm/boot/dts/at91sam9m10g45ek.dts @@ -14,7 +14,7 @@ compatible = "atmel,at91sam9m10g45ek", "atmel,at91sam9g45", "atmel,at91sam9"; chosen { - bootargs = "mem=64M console=ttyS0,115200 mtdparts=atmel_nand:4M(bootstrap/uboot/kernel)ro,60M(rootfs),-(data) root=/dev/mtdblock1 rw rootfstype=jffs2"; + bootargs = "mem=64M console=ttyS0,115200 root=/dev/mtdblock1 rw rootfstype=jffs2"; }; memory@70000000 { @@ -36,6 +36,29 @@ status = "okay"; }; }; + + nand0: nand@40000000 { + nand-bus-width = <8>; + nand-ecc-mode = "soft"; + nand-on-flash-bbt; + status = "okay"; + + boot@0 { + label = "bootstrap/uboot/kernel"; + reg = <0x0 0x400000>; + }; + + rootfs@400000 { + label = "rootfs"; + reg = <0x400000 0x3C00000>; + }; + + data@4000000 { + label = "data"; + reg = <0x4000000 0xC000000>; + }; + + }; }; leds { diff --git a/arch/arm/boot/dts/usb_a9g20.dts b/arch/arm/boot/dts/usb_a9g20.dts index d74545a2a77c..71d83ef316df 100644 --- a/arch/arm/boot/dts/usb_a9g20.dts +++ b/arch/arm/boot/dts/usb_a9g20.dts @@ -13,7 +13,7 @@ compatible = "calao,usb-a9g20", "atmel,at91sam9g20", "atmel,at91sam9"; chosen { - bootargs = "mem=64M console=ttyS0,115200 mtdparts=atmel_nand:128k(at91bootstrap),256k(barebox)ro,128k(bareboxenv),128k(bareboxenv2),4M(kernel),120M(rootfs),-(data) root=/dev/mtdblock5 rw rootfstype=ubifs"; + bootargs = "mem=64M console=ttyS0,115200 root=/dev/mtdblock5 rw rootfstype=ubifs"; }; memory@20000000 { @@ -31,6 +31,48 @@ status = "okay"; }; }; + + nand0: nand@40000000 { + nand-bus-width = <8>; + nand-ecc-mode = "soft"; + nand-on-flash-bbt; + status = "okay"; + + at91bootstrap@0 { + label = "at91bootstrap"; + reg = <0x0 0x20000>; + }; + + barebox@20000 { + label = "barebox"; + reg = <0x20000 0x40000>; + }; + + bareboxenv@60000 { + label = "bareboxenv"; + reg = <0x60000 0x20000>; + }; + + bareboxenv2@80000 { + label = "bareboxenv2"; + reg = <0x80000 0x20000>; + }; + + kernel@a0000 { + label = "kernel"; + reg = <0xa0000 0x400000>; + }; + + rootfs@4a0000 { + label = "rootfs"; + reg = <0x4a0000 0x7800000>; + }; + + data@7ca0000 { + label = "data"; + reg = <0x7ca0000 0x8360000>; + }; + }; }; leds { diff --git a/arch/arm/mach-at91/at91sam9x5.c b/arch/arm/mach-at91/at91sam9x5.c index a34d96afa746..7bec5a40d01a 100644 --- a/arch/arm/mach-at91/at91sam9x5.c +++ b/arch/arm/mach-at91/at91sam9x5.c @@ -313,11 +313,6 @@ void __init at91sam9x5_initialize(void) at91_gpio_init(NULL, 0); } -/* -------------------------------------------------------------------- - * AT91SAM9x5 devices (temporary before modification of code) - * -------------------------------------------------------------------- */ -void __init at91_add_device_nand(struct atmel_nand_data *data) {} - /* -------------------------------------------------------------------- * Interrupt initialization * -------------------------------------------------------------------- */ diff --git a/arch/arm/mach-at91/board-dt.c b/arch/arm/mach-at91/board-dt.c index 2eb294b2cc0e..9f729d6c8942 100644 --- a/arch/arm/mach-at91/board-dt.c +++ b/arch/arm/mach-at91/board-dt.c @@ -19,10 +19,7 @@ #include #include -#include #include -#include -#include #include #include @@ -30,7 +27,6 @@ #include #include -#include "sam9_smc.h" #include "generic.h" @@ -40,50 +36,6 @@ static void __init ek_init_early(void) at91_initialize(12000000); } -/* det_pin is not connected */ -static struct atmel_nand_data __initdata ek_nand_data = { - .ale = 21, - .cle = 22, - .det_pin = -EINVAL, - .rdy_pin = AT91_PIN_PC8, - .enable_pin = AT91_PIN_PC14, - .ecc_mode = NAND_ECC_SOFT, - .on_flash_bbt = 1, -}; - -static struct sam9_smc_config __initdata ek_nand_smc_config = { - .ncs_read_setup = 0, - .nrd_setup = 2, - .ncs_write_setup = 0, - .nwe_setup = 2, - - .ncs_read_pulse = 4, - .nrd_pulse = 4, - .ncs_write_pulse = 4, - .nwe_pulse = 4, - - .read_cycle = 7, - .write_cycle = 7, - - .mode = AT91_SMC_READMODE | AT91_SMC_WRITEMODE | AT91_SMC_EXNWMODE_DISABLE, - .tdf_cycles = 3, -}; - -static void __init ek_add_device_nand(void) -{ - ek_nand_data.bus_width_16 = board_have_nand_16bit(); - /* setup bus-width (8 or 16) */ - if (ek_nand_data.bus_width_16) - ek_nand_smc_config.mode |= AT91_SMC_DBW_16; - else - ek_nand_smc_config.mode |= AT91_SMC_DBW_8; - - /* configure chip-select 3 (NAND) */ - sam9_smc_configure(0, 3, &ek_nand_smc_config); - - at91_add_device_nand(&ek_nand_data); -} - static const struct of_device_id irq_of_match[] __initconst = { { .compatible = "atmel,at91rm9200-aic", .data = at91_aic_of_init }, @@ -100,9 +52,6 @@ static void __init at91_dt_init_irq(void) static void __init at91_dt_device_init(void) { of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL); - - /* NAND */ - ek_add_device_nand(); } static const char *at91_dt_board_compat[] __initdata = { diff --git a/drivers/mtd/nand/atmel_nand.c b/drivers/mtd/nand/atmel_nand.c index 045d174b8277..ae7e37d9ac17 100644 --- a/drivers/mtd/nand/atmel_nand.c +++ b/drivers/mtd/nand/atmel_nand.c @@ -27,6 +27,10 @@ #include #include #include +#include +#include +#include +#include #include #include #include @@ -83,7 +87,7 @@ struct atmel_nand_host { struct mtd_info mtd; void __iomem *io_base; dma_addr_t io_phys; - struct atmel_nand_data *board; + struct atmel_nand_data board; struct device *dev; void __iomem *ecc; @@ -101,8 +105,8 @@ static int cpu_has_dma(void) */ static void atmel_nand_enable(struct atmel_nand_host *host) { - if (gpio_is_valid(host->board->enable_pin)) - gpio_set_value(host->board->enable_pin, 0); + if (gpio_is_valid(host->board.enable_pin)) + gpio_set_value(host->board.enable_pin, 0); } /* @@ -110,8 +114,8 @@ static void atmel_nand_enable(struct atmel_nand_host *host) */ static void atmel_nand_disable(struct atmel_nand_host *host) { - if (gpio_is_valid(host->board->enable_pin)) - gpio_set_value(host->board->enable_pin, 1); + if (gpio_is_valid(host->board.enable_pin)) + gpio_set_value(host->board.enable_pin, 1); } /* @@ -132,9 +136,9 @@ static void atmel_nand_cmd_ctrl(struct mtd_info *mtd, int cmd, unsigned int ctrl return; if (ctrl & NAND_CLE) - writeb(cmd, host->io_base + (1 << host->board->cle)); + writeb(cmd, host->io_base + (1 << host->board.cle)); else - writeb(cmd, host->io_base + (1 << host->board->ale)); + writeb(cmd, host->io_base + (1 << host->board.ale)); } /* @@ -145,8 +149,8 @@ static int atmel_nand_device_ready(struct mtd_info *mtd) struct nand_chip *nand_chip = mtd->priv; struct atmel_nand_host *host = nand_chip->priv; - return gpio_get_value(host->board->rdy_pin) ^ - !!host->board->rdy_pin_active_low; + return gpio_get_value(host->board.rdy_pin) ^ + !!host->board.rdy_pin_active_low; } /* @@ -261,7 +265,7 @@ static void atmel_read_buf(struct mtd_info *mtd, u8 *buf, int len) if (atmel_nand_dma_op(mtd, buf, len, 1) == 0) return; - if (host->board->bus_width_16) + if (host->board.bus_width_16) atmel_read_buf16(mtd, buf, len); else atmel_read_buf8(mtd, buf, len); @@ -277,7 +281,7 @@ static void atmel_write_buf(struct mtd_info *mtd, const u8 *buf, int len) if (atmel_nand_dma_op(mtd, (void *)buf, len, 0) == 0) return; - if (host->board->bus_width_16) + if (host->board.bus_width_16) atmel_write_buf16(mtd, buf, len); else atmel_write_buf8(mtd, buf, len); @@ -469,6 +473,56 @@ static void atmel_nand_hwctl(struct mtd_info *mtd, int mode) } } +#if defined(CONFIG_OF) +static int __devinit atmel_of_init_port(struct atmel_nand_host *host, + struct device_node *np) +{ + u32 val; + int ecc_mode; + struct atmel_nand_data *board = &host->board; + enum of_gpio_flags flags; + + if (of_property_read_u32(np, "atmel,nand-addr-offset", &val) == 0) { + if (val >= 32) { + dev_err(host->dev, "invalid addr-offset %u\n", val); + return -EINVAL; + } + board->ale = val; + } + + if (of_property_read_u32(np, "atmel,nand-cmd-offset", &val) == 0) { + if (val >= 32) { + dev_err(host->dev, "invalid cmd-offset %u\n", val); + return -EINVAL; + } + board->cle = val; + } + + ecc_mode = of_get_nand_ecc_mode(np); + + board->ecc_mode = ecc_mode < 0 ? NAND_ECC_SOFT : ecc_mode; + + board->on_flash_bbt = of_get_nand_on_flash_bbt(np); + + if (of_get_nand_bus_width(np) == 16) + board->bus_width_16 = 1; + + board->rdy_pin = of_get_gpio_flags(np, 0, &flags); + board->rdy_pin_active_low = (flags == OF_GPIO_ACTIVE_LOW); + + board->enable_pin = of_get_gpio(np, 1); + board->det_pin = of_get_gpio(np, 2); + + return 0; +} +#else +static int __devinit atmel_of_init_port(struct atmel_nand_host *host, + struct device_node *np) +{ + return -EINVAL; +} +#endif + /* * Probe for the NAND device. */ @@ -479,6 +533,7 @@ static int __init atmel_nand_probe(struct platform_device *pdev) struct nand_chip *nand_chip; struct resource *regs; struct resource *mem; + struct mtd_part_parser_data ppdata = {}; int res; mem = platform_get_resource(pdev, IORESOURCE_MEM, 0); @@ -505,8 +560,15 @@ static int __init atmel_nand_probe(struct platform_device *pdev) mtd = &host->mtd; nand_chip = &host->nand_chip; - host->board = pdev->dev.platform_data; host->dev = &pdev->dev; + if (pdev->dev.of_node) { + res = atmel_of_init_port(host, pdev->dev.of_node); + if (res) + goto err_nand_ioremap; + } else { + memcpy(&host->board, pdev->dev.platform_data, + sizeof(struct atmel_nand_data)); + } nand_chip->priv = host; /* link the private data structures */ mtd->priv = nand_chip; @@ -517,10 +579,10 @@ static int __init atmel_nand_probe(struct platform_device *pdev) nand_chip->IO_ADDR_W = host->io_base; nand_chip->cmd_ctrl = atmel_nand_cmd_ctrl; - if (gpio_is_valid(host->board->rdy_pin)) + if (gpio_is_valid(host->board.rdy_pin)) nand_chip->dev_ready = atmel_nand_device_ready; - nand_chip->ecc.mode = host->board->ecc_mode; + nand_chip->ecc.mode = host->board.ecc_mode; regs = platform_get_resource(pdev, IORESOURCE_MEM, 1); if (!regs && nand_chip->ecc.mode == NAND_ECC_HW) { @@ -545,7 +607,7 @@ static int __init atmel_nand_probe(struct platform_device *pdev) nand_chip->chip_delay = 20; /* 20us command delay time */ - if (host->board->bus_width_16) /* 16-bit bus width */ + if (host->board.bus_width_16) /* 16-bit bus width */ nand_chip->options |= NAND_BUSWIDTH_16; nand_chip->read_buf = atmel_read_buf; @@ -554,15 +616,15 @@ static int __init atmel_nand_probe(struct platform_device *pdev) platform_set_drvdata(pdev, host); atmel_nand_enable(host); - if (gpio_is_valid(host->board->det_pin)) { - if (gpio_get_value(host->board->det_pin)) { + if (gpio_is_valid(host->board.det_pin)) { + if (gpio_get_value(host->board.det_pin)) { printk(KERN_INFO "No SmartMedia card inserted.\n"); res = -ENXIO; goto err_no_card; } } - if (host->board->on_flash_bbt || on_flash_bbt) { + if (host->board.on_flash_bbt || on_flash_bbt) { printk(KERN_INFO "atmel_nand: Use On Flash BBT\n"); nand_chip->bbt_options |= NAND_BBT_USE_FLASH; } @@ -637,8 +699,9 @@ static int __init atmel_nand_probe(struct platform_device *pdev) } mtd->name = "atmel_nand"; - res = mtd_device_parse_register(mtd, NULL, 0, - host->board->parts, host->board->num_parts); + ppdata.of_node = pdev->dev.of_node; + res = mtd_device_parse_register(mtd, NULL, &ppdata, + host->board.parts, host->board.num_parts); if (!res) return res; @@ -682,11 +745,21 @@ static int __exit atmel_nand_remove(struct platform_device *pdev) return 0; } +#if defined(CONFIG_OF) +static const struct of_device_id atmel_nand_dt_ids[] = { + { .compatible = "atmel,at91rm9200-nand" }, + { /* sentinel */ } +}; + +MODULE_DEVICE_TABLE(of, atmel_nand_dt_ids); +#endif + static struct platform_driver atmel_nand_driver = { .remove = __exit_p(atmel_nand_remove), .driver = { .name = "atmel_nand", .owner = THIS_MODULE, + .of_match_table = of_match_ptr(atmel_nand_dt_ids), }, }; -- cgit v1.2.3 From 8ffaa0f40db22564efc44588a9d861d78a1fae02 Mon Sep 17 00:00:00 2001 From: Jean-Christophe PLAGNIOL-VILLARD Date: Sun, 5 Feb 2012 18:22:34 +0800 Subject: i2c/gpio: add DT support To achieve DT support, we need to populate a custom platform_data in a private struct from DT information. To simplify code, the adapter and algorithm are also put into the private struct. Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD Acked-by: Grant Likely Acked-by: Rob Herring Acked-by: Wolfram Sang Cc: Nicolas Ferre --- .../devicetree/bindings/gpio/gpio_i2c.txt | 32 +++++++ drivers/i2c/busses/i2c-gpio.c | 98 +++++++++++++++++----- 2 files changed, 108 insertions(+), 22 deletions(-) create mode 100644 Documentation/devicetree/bindings/gpio/gpio_i2c.txt (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/gpio/gpio_i2c.txt b/Documentation/devicetree/bindings/gpio/gpio_i2c.txt new file mode 100644 index 000000000000..4f8ec947c6bd --- /dev/null +++ b/Documentation/devicetree/bindings/gpio/gpio_i2c.txt @@ -0,0 +1,32 @@ +Device-Tree bindings for i2c gpio driver + +Required properties: + - compatible = "i2c-gpio"; + - gpios: sda and scl gpio + + +Optional properties: + - i2c-gpio,sda-open-drain: sda as open drain + - i2c-gpio,scl-open-drain: scl as open drain + - i2c-gpio,scl-output-only: scl as output only + - i2c-gpio,delay-us: delay between GPIO operations (may depend on each platform) + - i2c-gpio,timeout-ms: timeout to get data + +Example nodes: + +i2c@0 { + compatible = "i2c-gpio"; + gpios = <&pioA 23 0 /* sda */ + &pioA 24 0 /* scl */ + >; + i2c-gpio,sda-open-drain; + i2c-gpio,scl-open-drain; + i2c-gpio,delay-us = <2>; /* ~100 kHz */ + #address-cells = <1>; + #size-cells = <0>; + + rv3029c2@56 { + compatible = "rv3029c2"; + reg = <0x56>; + }; +}; diff --git a/drivers/i2c/busses/i2c-gpio.c b/drivers/i2c/busses/i2c-gpio.c index a651779d9ff7..c0330a41db03 100644 --- a/drivers/i2c/busses/i2c-gpio.c +++ b/drivers/i2c/busses/i2c-gpio.c @@ -14,8 +14,15 @@ #include #include #include - -#include +#include +#include +#include + +struct i2c_gpio_private_data { + struct i2c_adapter adap; + struct i2c_algo_bit_data bit_data; + struct i2c_gpio_platform_data pdata; +}; /* Toggle SDA by changing the direction of the pin */ static void i2c_gpio_setsda_dir(void *data, int state) @@ -78,24 +85,62 @@ static int i2c_gpio_getscl(void *data) return gpio_get_value(pdata->scl_pin); } +static int __devinit of_i2c_gpio_probe(struct device_node *np, + struct i2c_gpio_platform_data *pdata) +{ + u32 reg; + + if (of_gpio_count(np) < 2) + return -ENODEV; + + pdata->sda_pin = of_get_gpio(np, 0); + pdata->scl_pin = of_get_gpio(np, 1); + + if (!gpio_is_valid(pdata->sda_pin) || !gpio_is_valid(pdata->scl_pin)) { + pr_err("%s: invalid GPIO pins, sda=%d/scl=%d\n", + np->full_name, pdata->sda_pin, pdata->scl_pin); + return -ENODEV; + } + + of_property_read_u32(np, "i2c-gpio,delay-us", &pdata->udelay); + + if (!of_property_read_u32(np, "i2c-gpio,timeout-ms", ®)) + pdata->timeout = msecs_to_jiffies(reg); + + pdata->sda_is_open_drain = + of_property_read_bool(np, "i2c-gpio,sda-open-drain"); + pdata->scl_is_open_drain = + of_property_read_bool(np, "i2c-gpio,scl-open-drain"); + pdata->scl_is_output_only = + of_property_read_bool(np, "i2c-gpio,scl-output-only"); + + return 0; +} + static int __devinit i2c_gpio_probe(struct platform_device *pdev) { + struct i2c_gpio_private_data *priv; struct i2c_gpio_platform_data *pdata; struct i2c_algo_bit_data *bit_data; struct i2c_adapter *adap; int ret; - pdata = pdev->dev.platform_data; - if (!pdata) - return -ENXIO; - - ret = -ENOMEM; - adap = kzalloc(sizeof(struct i2c_adapter), GFP_KERNEL); - if (!adap) - goto err_alloc_adap; - bit_data = kzalloc(sizeof(struct i2c_algo_bit_data), GFP_KERNEL); - if (!bit_data) - goto err_alloc_bit_data; + priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL); + if (!priv) + return -ENOMEM; + adap = &priv->adap; + bit_data = &priv->bit_data; + pdata = &priv->pdata; + + if (pdev->dev.of_node) { + ret = of_i2c_gpio_probe(pdev->dev.of_node, pdata); + if (ret) + return ret; + } else { + if (!pdev->dev.platform_data) + return -ENXIO; + memcpy(pdata, pdev->dev.platform_data, sizeof(*pdata)); + } ret = gpio_request(pdata->sda_pin, "sda"); if (ret) @@ -143,6 +188,7 @@ static int __devinit i2c_gpio_probe(struct platform_device *pdev) adap->algo_data = bit_data; adap->class = I2C_CLASS_HWMON | I2C_CLASS_SPD; adap->dev.parent = &pdev->dev; + adap->dev.of_node = pdev->dev.of_node; /* * If "dev->id" is negative we consider it as zero. @@ -154,7 +200,9 @@ static int __devinit i2c_gpio_probe(struct platform_device *pdev) if (ret) goto err_add_bus; - platform_set_drvdata(pdev, adap); + of_i2c_register_devices(adap); + + platform_set_drvdata(pdev, priv); dev_info(&pdev->dev, "using pins %u (SDA) and %u (SCL%s)\n", pdata->sda_pin, pdata->scl_pin, @@ -168,34 +216,40 @@ err_add_bus: err_request_scl: gpio_free(pdata->sda_pin); err_request_sda: - kfree(bit_data); -err_alloc_bit_data: - kfree(adap); -err_alloc_adap: return ret; } static int __devexit i2c_gpio_remove(struct platform_device *pdev) { + struct i2c_gpio_private_data *priv; struct i2c_gpio_platform_data *pdata; struct i2c_adapter *adap; - adap = platform_get_drvdata(pdev); - pdata = pdev->dev.platform_data; + priv = platform_get_drvdata(pdev); + adap = &priv->adap; + pdata = &priv->pdata; i2c_del_adapter(adap); gpio_free(pdata->scl_pin); gpio_free(pdata->sda_pin); - kfree(adap->algo_data); - kfree(adap); return 0; } +#if defined(CONFIG_OF) +static const struct of_device_id i2c_gpio_dt_ids[] = { + { .compatible = "i2c-gpio", }, + { /* sentinel */ } +}; + +MODULE_DEVICE_TABLE(of, i2c_gpio_dt_ids); +#endif + static struct platform_driver i2c_gpio_driver = { .driver = { .name = "i2c-gpio", .owner = THIS_MODULE, + .of_match_table = of_match_ptr(i2c_gpio_dt_ids), }, .probe = i2c_gpio_probe, .remove = __devexit_p(i2c_gpio_remove), -- cgit v1.2.3 From eb5e76ffd4e626655944e99bb85b07e17172620d Mon Sep 17 00:00:00 2001 From: Jean-Christophe PLAGNIOL-VILLARD Date: Fri, 2 Mar 2012 20:44:23 +0800 Subject: ARM: at91: add pmc DT support Specified the main Oscillator via clock binding. This will allow to do not hardcode it anymore in the DT board at 12MHz. Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD Acked-by: Rob Herring Acked-by: Nicolas Ferre --- .../devicetree/bindings/arm/atmel-pmc.txt | 11 +++++ arch/arm/boot/dts/at91sam9g20.dtsi | 5 ++ arch/arm/boot/dts/at91sam9g45.dtsi | 5 ++ arch/arm/boot/dts/at91sam9m10g45ek.dts | 11 +++++ arch/arm/boot/dts/at91sam9x5.dtsi | 5 ++ arch/arm/boot/dts/at91sam9x5cm.dtsi | 11 +++++ arch/arm/boot/dts/usb_a9g20.dts | 11 +++++ arch/arm/mach-at91/clock.c | 56 ++++++++++++++++++++-- arch/arm/mach-at91/generic.h | 1 + arch/arm/mach-at91/setup.c | 3 +- 10 files changed, 112 insertions(+), 7 deletions(-) create mode 100644 Documentation/devicetree/bindings/arm/atmel-pmc.txt (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/arm/atmel-pmc.txt b/Documentation/devicetree/bindings/arm/atmel-pmc.txt new file mode 100644 index 000000000000..389bed5056e8 --- /dev/null +++ b/Documentation/devicetree/bindings/arm/atmel-pmc.txt @@ -0,0 +1,11 @@ +* Power Management Controller (PMC) + +Required properties: +- compatible: Should be "atmel,at91rm9200-pmc" +- reg: Should contain PMC registers location and length + +Examples: + pmc: pmc@fffffc00 { + compatible = "atmel,at91rm9200-pmc"; + reg = <0xfffffc00 0x100>; + }; diff --git a/arch/arm/boot/dts/at91sam9g20.dtsi b/arch/arm/boot/dts/at91sam9g20.dtsi index a885a30d1c82..dd5d114a0e1d 100644 --- a/arch/arm/boot/dts/at91sam9g20.dtsi +++ b/arch/arm/boot/dts/at91sam9g20.dtsi @@ -59,6 +59,11 @@ reg = <0xfffff000 0x200>; }; + pmc: pmc@fffffc00 { + compatible = "atmel,at91rm9200-pmc"; + reg = <0xfffffc00 0x100>; + }; + pit: timer@fffffd30 { compatible = "atmel,at91sam9260-pit"; reg = <0xfffffd30 0xf>; diff --git a/arch/arm/boot/dts/at91sam9g45.dtsi b/arch/arm/boot/dts/at91sam9g45.dtsi index 92fe5a5c0eee..621a329307d6 100644 --- a/arch/arm/boot/dts/at91sam9g45.dtsi +++ b/arch/arm/boot/dts/at91sam9g45.dtsi @@ -60,6 +60,11 @@ reg = <0xfffff000 0x200>; }; + pmc: pmc@fffffc00 { + compatible = "atmel,at91rm9200-pmc"; + reg = <0xfffffc00 0x100>; + }; + pit: timer@fffffd30 { compatible = "atmel,at91sam9260-pit"; reg = <0xfffffd30 0xf>; diff --git a/arch/arm/boot/dts/at91sam9m10g45ek.dts b/arch/arm/boot/dts/at91sam9m10g45ek.dts index fd4531135431..a8958241f1d7 100644 --- a/arch/arm/boot/dts/at91sam9m10g45ek.dts +++ b/arch/arm/boot/dts/at91sam9m10g45ek.dts @@ -21,6 +21,17 @@ reg = <0x70000000 0x4000000>; }; + clocks { + #address-cells = <1>; + #size-cells = <1>; + ranges; + + main_clock: clock@0 { + compatible = "atmel,osc", "fixed-clock"; + clock-frequency = <12000000>; + }; + }; + ahb { apb { dbgu: serial@ffffee00 { diff --git a/arch/arm/boot/dts/at91sam9x5.dtsi b/arch/arm/boot/dts/at91sam9x5.dtsi index f0104f4e6ab4..3855843fc038 100644 --- a/arch/arm/boot/dts/at91sam9x5.dtsi +++ b/arch/arm/boot/dts/at91sam9x5.dtsi @@ -58,6 +58,11 @@ reg = <0xfffff000 0x200>; }; + pmc: pmc@fffffc00 { + compatible = "atmel,at91rm9200-pmc"; + reg = <0xfffffc00 0x100>; + }; + pit: timer@fffffe30 { compatible = "atmel,at91sam9260-pit"; reg = <0xfffffe30 0xf>; diff --git a/arch/arm/boot/dts/at91sam9x5cm.dtsi b/arch/arm/boot/dts/at91sam9x5cm.dtsi index 5b37033bed56..67936f83c694 100644 --- a/arch/arm/boot/dts/at91sam9x5cm.dtsi +++ b/arch/arm/boot/dts/at91sam9x5cm.dtsi @@ -12,6 +12,17 @@ reg = <0x20000000 0x8000000>; }; + clocks { + #address-cells = <1>; + #size-cells = <1>; + ranges; + + main_clock: clock@0 { + compatible = "atmel,osc", "fixed-clock"; + clock-frequency = <12000000>; + }; + }; + ahb { nand0: nand@40000000 { nand-bus-width = <8>; diff --git a/arch/arm/boot/dts/usb_a9g20.dts b/arch/arm/boot/dts/usb_a9g20.dts index 0ea90b5be512..73f1dc48f307 100644 --- a/arch/arm/boot/dts/usb_a9g20.dts +++ b/arch/arm/boot/dts/usb_a9g20.dts @@ -20,6 +20,17 @@ reg = <0x20000000 0x4000000>; }; + clocks { + #address-cells = <1>; + #size-cells = <1>; + ranges; + + main_clock: clock@0 { + compatible = "atmel,osc", "fixed-clock"; + clock-frequency = <12000000>; + }; + }; + ahb { apb { dbgu: serial@fffff200 { diff --git a/arch/arm/mach-at91/clock.c b/arch/arm/mach-at91/clock.c index be51ca7f694d..a0f4d7424cdc 100644 --- a/arch/arm/mach-at91/clock.c +++ b/arch/arm/mach-at91/clock.c @@ -23,6 +23,7 @@ #include #include #include +#include #include #include @@ -671,16 +672,12 @@ static void __init at91_upll_usbfs_clock_init(unsigned long main_clock) uhpck.rate_hz /= 1 + ((at91_pmc_read(AT91_PMC_USB) & AT91_PMC_OHCIUSBDIV) >> 8); } -int __init at91_clock_init(unsigned long main_clock) +static int __init at91_pmc_init(unsigned long main_clock) { unsigned tmp, freq, mckr; int i; int pll_overclock = false; - at91_pmc_base = ioremap(AT91_PMC, 256); - if (!at91_pmc_base) - panic("Impossible to ioremap AT91_PMC 0x%x\n", AT91_PMC); - /* * When the bootloader initialized the main oscillator correctly, * there's no problem using the cycle counter. But if it didn't, @@ -802,6 +799,55 @@ int __init at91_clock_init(unsigned long main_clock) return 0; } +#if defined(CONFIG_OF) +static struct of_device_id pmc_ids[] = { + { .compatible = "atmel,at91rm9200-pmc" }, + { /*sentinel*/ } +}; + +static struct of_device_id osc_ids[] = { + { .compatible = "atmel,osc" }, + { /*sentinel*/ } +}; + +int __init at91_dt_clock_init(void) +{ + struct device_node *np; + u32 main_clock = 0; + + np = of_find_matching_node(NULL, pmc_ids); + if (!np) + panic("unable to find compatible pmc node in dtb\n"); + + at91_pmc_base = of_iomap(np, 0); + if (!at91_pmc_base) + panic("unable to map pmc cpu registers\n"); + + of_node_put(np); + + /* retrieve the freqency of fixed clocks from device tree */ + np = of_find_matching_node(NULL, osc_ids); + if (np) { + u32 rate; + if (!of_property_read_u32(np, "clock-frequency", &rate)) + main_clock = rate; + } + + of_node_put(np); + + return at91_pmc_init(main_clock); +} +#endif + +int __init at91_clock_init(unsigned long main_clock) +{ + at91_pmc_base = ioremap(AT91_PMC, 256); + if (!at91_pmc_base) + panic("Impossible to ioremap AT91_PMC 0x%x\n", AT91_PMC); + + return at91_pmc_init(main_clock); +} + /* * Several unused clocks may be active. Turn them off. */ diff --git a/arch/arm/mach-at91/generic.h b/arch/arm/mach-at91/generic.h index d5f5083880f1..dd9b346c451d 100644 --- a/arch/arm/mach-at91/generic.h +++ b/arch/arm/mach-at91/generic.h @@ -53,6 +53,7 @@ extern void __init at91sam9rl_set_console_clock(int id); extern void __init at91sam9g45_set_console_clock(int id); #ifdef CONFIG_AT91_PMC_UNIT extern int __init at91_clock_init(unsigned long main_clock); +extern int __init at91_dt_clock_init(void); #else static int inline at91_clock_init(unsigned long main_clock) { return 0; } #endif diff --git a/arch/arm/mach-at91/setup.c b/arch/arm/mach-at91/setup.c index c0bd5a625695..d7abc25f6c64 100644 --- a/arch/arm/mach-at91/setup.c +++ b/arch/arm/mach-at91/setup.c @@ -292,9 +292,8 @@ void __init at91_dt_initialize(void) /* temporary until have the ramc binding*/ at91_boot_soc.ioremap_registers(); - /* temporary until have the pmc binding */ /* Init clock subsystem */ - at91_clock_init(12000000); + at91_dt_clock_init(); /* Register the processor-specific clocks */ at91_boot_soc.register_clocks(); -- cgit v1.2.3 From c8082d344ac4c05932fec1766e5e9ce72cf286ed Mon Sep 17 00:00:00 2001 From: Jean-Christophe PLAGNIOL-VILLARD Date: Sat, 3 Mar 2012 03:16:27 +0800 Subject: ARM: at91: add RSTC (Reset Controller) dt support Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD Acked-by: Rob Herring Acked-by: Nicolas Ferre --- .../devicetree/bindings/arm/atmel-at91.txt | 12 +++++++++ arch/arm/boot/dts/at91sam9g20.dtsi | 5 ++++ arch/arm/boot/dts/at91sam9g45.dtsi | 5 ++++ arch/arm/boot/dts/at91sam9x5.dtsi | 5 ++++ arch/arm/mach-at91/at91sam9x5.c | 1 - arch/arm/mach-at91/setup.c | 30 ++++++++++++++++++++++ 6 files changed, 57 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/arm/atmel-at91.txt b/Documentation/devicetree/bindings/arm/atmel-at91.txt index 1aeaf6f2a1ba..a64f86717b5d 100644 --- a/Documentation/devicetree/bindings/arm/atmel-at91.txt +++ b/Documentation/devicetree/bindings/arm/atmel-at91.txt @@ -30,3 +30,15 @@ One interrupt per TC channel in a TC block: reg = <0xfffdc000 0x100>; interrupts = <26 4 27 4 28 4>; }; + +RSTC Reset Controller required properties: +- compatible: Should be "atmel,-rstc". + can be "at91sam9260" or "at91sam9g45" +- reg: Should contain registers location and length + +Example: + + rstc@fffffd00 { + compatible = "atmel,at91sam9260-rstc"; + reg = <0xfffffd00 0x10>; + }; diff --git a/arch/arm/boot/dts/at91sam9g20.dtsi b/arch/arm/boot/dts/at91sam9g20.dtsi index dd5d114a0e1d..bcad6e7dccce 100644 --- a/arch/arm/boot/dts/at91sam9g20.dtsi +++ b/arch/arm/boot/dts/at91sam9g20.dtsi @@ -64,6 +64,11 @@ reg = <0xfffffc00 0x100>; }; + rstc@fffffd00 { + compatible = "atmel,at91sam9260-rstc"; + reg = <0xfffffd00 0x10>; + }; + pit: timer@fffffd30 { compatible = "atmel,at91sam9260-pit"; reg = <0xfffffd30 0xf>; diff --git a/arch/arm/boot/dts/at91sam9g45.dtsi b/arch/arm/boot/dts/at91sam9g45.dtsi index 621a329307d6..faccd4f5aace 100644 --- a/arch/arm/boot/dts/at91sam9g45.dtsi +++ b/arch/arm/boot/dts/at91sam9g45.dtsi @@ -65,6 +65,11 @@ reg = <0xfffffc00 0x100>; }; + rstc@fffffd00 { + compatible = "atmel,at91sam9g45-rstc"; + reg = <0xfffffd00 0x10>; + }; + pit: timer@fffffd30 { compatible = "atmel,at91sam9260-pit"; reg = <0xfffffd30 0xf>; diff --git a/arch/arm/boot/dts/at91sam9x5.dtsi b/arch/arm/boot/dts/at91sam9x5.dtsi index 3855843fc038..d9a93fdd35a5 100644 --- a/arch/arm/boot/dts/at91sam9x5.dtsi +++ b/arch/arm/boot/dts/at91sam9x5.dtsi @@ -63,6 +63,11 @@ reg = <0xfffffc00 0x100>; }; + rstc@fffffe00 { + compatible = "atmel,at91sam9g45-rstc"; + reg = <0xfffffe00 0x10>; + }; + pit: timer@fffffe30 { compatible = "atmel,at91sam9260-pit"; reg = <0xfffffe30 0xf>; diff --git a/arch/arm/mach-at91/at91sam9x5.c b/arch/arm/mach-at91/at91sam9x5.c index 7bec5a40d01a..c121fe5fabbd 100644 --- a/arch/arm/mach-at91/at91sam9x5.c +++ b/arch/arm/mach-at91/at91sam9x5.c @@ -306,7 +306,6 @@ static void __init at91sam9x5_ioremap_registers(void) void __init at91sam9x5_initialize(void) { - arm_pm_restart = at91sam9g45_restart; at91_extern_irq = (1 << AT91SAM9X5_ID_IRQ0); /* Register GPIO subsystem (using DT) */ diff --git a/arch/arm/mach-at91/setup.c b/arch/arm/mach-at91/setup.c index d7abc25f6c64..3e48b59dfa74 100644 --- a/arch/arm/mach-at91/setup.c +++ b/arch/arm/mach-at91/setup.c @@ -287,8 +287,38 @@ void __init at91_ioremap_matrix(u32 base_addr) } #if defined(CONFIG_OF) +static struct of_device_id rstc_ids[] = { + { .compatible = "atmel,at91sam9260-rstc", .data = at91sam9_alt_restart }, + { .compatible = "atmel,at91sam9g45-rstc", .data = at91sam9g45_restart }, + { /*sentinel*/ } +}; + +static void at91_dt_rstc(void) +{ + struct device_node *np; + const struct of_device_id *of_id; + + np = of_find_matching_node(NULL, rstc_ids); + if (!np) + panic("unable to find compatible rstc node in dtb\n"); + + at91_rstc_base = of_iomap(np, 0); + if (!at91_rstc_base) + panic("unable to map rstc cpu registers\n"); + + of_id = of_match_node(rstc_ids, np); + if (!of_id) + panic("AT91: rtsc no restart function availlable\n"); + + arm_pm_restart = of_id->data; + + of_node_put(np); +} + void __init at91_dt_initialize(void) { + at91_dt_rstc(); + /* temporary until have the ramc binding*/ at91_boot_soc.ioremap_registers(); -- cgit v1.2.3 From a7776ec625c8ca90d050953946a5b72eaf41c21c Mon Sep 17 00:00:00 2001 From: Jean-Christophe PLAGNIOL-VILLARD Date: Fri, 2 Mar 2012 20:54:37 +0800 Subject: ARM: at91: add ram controller DT support We can now drop the call to ioremap_registers() as we have the binding for the SDRAM/DDR Controller. Drop ioremap_registers() for sam9x5 too. Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD Acked-by: Rob Herring Acked-by: Nicolas Ferre --- .../devicetree/bindings/arm/atmel-at91.txt | 19 ++++++++++ arch/arm/boot/dts/at91sam9g20.dtsi | 5 +++ arch/arm/boot/dts/at91sam9g45.dtsi | 6 ++++ arch/arm/boot/dts/at91sam9x5.dtsi | 5 +++ arch/arm/mach-at91/at91sam9x5.c | 6 ---- arch/arm/mach-at91/include/mach/at91sam9x5.h | 5 --- arch/arm/mach-at91/pm.c | 13 ------- arch/arm/mach-at91/setup.c | 40 ++++++++++++++++++++-- 8 files changed, 72 insertions(+), 27 deletions(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/arm/atmel-at91.txt b/Documentation/devicetree/bindings/arm/atmel-at91.txt index a64f86717b5d..1f8782077433 100644 --- a/Documentation/devicetree/bindings/arm/atmel-at91.txt +++ b/Documentation/devicetree/bindings/arm/atmel-at91.txt @@ -42,3 +42,22 @@ Example: compatible = "atmel,at91sam9260-rstc"; reg = <0xfffffd00 0x10>; }; + +RAMC SDRAM/DDR Controller required properties: +- compatible: Should be "atmel,at91sam9260-sdramc", + "atmel,at91sam9g45-ddramc", +- reg: Should contain registers location and length + For at91sam9263 and at91sam9g45 you must specify 2 entries. + +Examples: + + ramc0: ramc@ffffe800 { + compatible = "atmel,at91sam9g45-ddramc"; + reg = <0xffffe800 0x200>; + }; + + ramc0: ramc@ffffe400 { + compatible = "atmel,at91sam9g45-ddramc"; + reg = <0xffffe400 0x200 + 0xffffe600 0x200>; + }; diff --git a/arch/arm/boot/dts/at91sam9g20.dtsi b/arch/arm/boot/dts/at91sam9g20.dtsi index bcad6e7dccce..0a1df8d9bfb6 100644 --- a/arch/arm/boot/dts/at91sam9g20.dtsi +++ b/arch/arm/boot/dts/at91sam9g20.dtsi @@ -59,6 +59,11 @@ reg = <0xfffff000 0x200>; }; + ramc0: ramc@ffffea00 { + compatible = "atmel,at91sam9260-sdramc"; + reg = <0xffffea00 0x200>; + }; + pmc: pmc@fffffc00 { compatible = "atmel,at91rm9200-pmc"; reg = <0xfffffc00 0x100>; diff --git a/arch/arm/boot/dts/at91sam9g45.dtsi b/arch/arm/boot/dts/at91sam9g45.dtsi index faccd4f5aace..587a1913c062 100644 --- a/arch/arm/boot/dts/at91sam9g45.dtsi +++ b/arch/arm/boot/dts/at91sam9g45.dtsi @@ -60,6 +60,12 @@ reg = <0xfffff000 0x200>; }; + ramc0: ramc@ffffe400 { + compatible = "atmel,at91sam9g45-ddramc"; + reg = <0xffffe400 0x200 + 0xffffe600 0x200>; + }; + pmc: pmc@fffffc00 { compatible = "atmel,at91rm9200-pmc"; reg = <0xfffffc00 0x100>; diff --git a/arch/arm/boot/dts/at91sam9x5.dtsi b/arch/arm/boot/dts/at91sam9x5.dtsi index d9a93fdd35a5..73c46e3dffab 100644 --- a/arch/arm/boot/dts/at91sam9x5.dtsi +++ b/arch/arm/boot/dts/at91sam9x5.dtsi @@ -58,6 +58,11 @@ reg = <0xfffff000 0x200>; }; + ramc0: ramc@ffffe800 { + compatible = "atmel,at91sam9g45-ddramc"; + reg = <0xffffe800 0x200>; + }; + pmc: pmc@fffffc00 { compatible = "atmel,at91rm9200-pmc"; reg = <0xfffffc00 0x100>; diff --git a/arch/arm/mach-at91/at91sam9x5.c b/arch/arm/mach-at91/at91sam9x5.c index c121fe5fabbd..01b2bd816a9a 100644 --- a/arch/arm/mach-at91/at91sam9x5.c +++ b/arch/arm/mach-at91/at91sam9x5.c @@ -299,11 +299,6 @@ static void __init at91sam9x5_map_io(void) at91_init_sram(0, AT91SAM9X5_SRAM_BASE, AT91SAM9X5_SRAM_SIZE); } -static void __init at91sam9x5_ioremap_registers(void) -{ - at91_ioremap_ramc(0, AT91SAM9X5_BASE_DDRSDRC0, 512); -} - void __init at91sam9x5_initialize(void) { at91_extern_irq = (1 << AT91SAM9X5_ID_IRQ0); @@ -356,7 +351,6 @@ static unsigned int at91sam9x5_default_irq_priority[NR_AIC_IRQS] __initdata = { struct at91_init_soc __initdata at91sam9x5_soc = { .map_io = at91sam9x5_map_io, .default_irq_priority = at91sam9x5_default_irq_priority, - .ioremap_registers = at91sam9x5_ioremap_registers, .register_clocks = at91sam9x5_register_clocks, .init = at91sam9x5_initialize, }; diff --git a/arch/arm/mach-at91/include/mach/at91sam9x5.h b/arch/arm/mach-at91/include/mach/at91sam9x5.h index a297a77d88e2..88e43d534cdf 100644 --- a/arch/arm/mach-at91/include/mach/at91sam9x5.h +++ b/arch/arm/mach-at91/include/mach/at91sam9x5.h @@ -54,11 +54,6 @@ #define AT91SAM9X5_BASE_USART1 0xf8020000 #define AT91SAM9X5_BASE_USART2 0xf8024000 -/* - * System Peripherals - */ -#define AT91SAM9X5_BASE_DDRSDRC0 0xffffe800 - /* * Base addresses for early serial code (uncompress.h) */ diff --git a/arch/arm/mach-at91/pm.c b/arch/arm/mach-at91/pm.c index 6c9d5e69ac28..f630250c6b87 100644 --- a/arch/arm/mach-at91/pm.c +++ b/arch/arm/mach-at91/pm.c @@ -197,19 +197,6 @@ extern void at91_slow_clock(void __iomem *pmc, void __iomem *ramc0, extern u32 at91_slow_clock_sz; #endif -void __iomem *at91_ramc_base[2]; - -void __init at91_ioremap_ramc(int id, u32 addr, u32 size) -{ - if (id < 0 || id > 1) { - pr_emerg("Wrong RAM controller id (%d), cannot continue\n", id); - BUG(); - } - at91_ramc_base[id] = ioremap(addr, size); - if (!at91_ramc_base[id]) - panic("Impossible to ioremap ramc.%d 0x%x\n", id, addr); -} - static int at91_pm_enter(suspend_state_t state) { at91_gpio_suspend(); diff --git a/arch/arm/mach-at91/setup.c b/arch/arm/mach-at91/setup.c index 3e48b59dfa74..46d0a56ba825 100644 --- a/arch/arm/mach-at91/setup.c +++ b/arch/arm/mach-at91/setup.c @@ -52,6 +52,19 @@ void __init at91_init_interrupts(unsigned int *priority) at91_gpio_irq_setup(); } +void __iomem *at91_ramc_base[2]; + +void __init at91_ioremap_ramc(int id, u32 addr, u32 size) +{ + if (id < 0 || id > 1) { + pr_emerg("Wrong RAM controller id (%d), cannot continue\n", id); + BUG(); + } + at91_ramc_base[id] = ioremap(addr, size); + if (!at91_ramc_base[id]) + panic("Impossible to ioremap ramc.%d 0x%x\n", id, addr); +} + static struct map_desc sram_desc[2] __initdata; void __init at91_init_sram(int bank, unsigned long base, unsigned int length) @@ -315,12 +328,33 @@ static void at91_dt_rstc(void) of_node_put(np); } +static struct of_device_id ramc_ids[] = { + { .compatible = "atmel,at91sam9260-sdramc" }, + { .compatible = "atmel,at91sam9g45-ddramc" }, + { /*sentinel*/ } +}; + +static void at91_dt_ramc(void) +{ + struct device_node *np; + + np = of_find_matching_node(NULL, ramc_ids); + if (!np) + panic("unable to find compatible ram conroller node in dtb\n"); + + at91_ramc_base[0] = of_iomap(np, 0); + if (!at91_ramc_base[0]) + panic("unable to map ramc[0] cpu registers\n"); + /* the controller may have 2 banks */ + at91_ramc_base[1] = of_iomap(np, 1); + + of_node_put(np); +} + void __init at91_dt_initialize(void) { at91_dt_rstc(); - - /* temporary until have the ramc binding*/ - at91_boot_soc.ioremap_registers(); + at91_dt_ramc(); /* Init clock subsystem */ at91_dt_clock_init(); -- cgit v1.2.3 From 82015c4eae2ac67cfed8e98f8d9a4ee77a2d26ca Mon Sep 17 00:00:00 2001 From: Jean-Christophe PLAGNIOL-VILLARD Date: Fri, 2 Mar 2012 21:01:00 +0800 Subject: ARM: at91: add Shutdown Controller (SHDWC) DT support Use a string to specific the wakeup mode to make it more readable. Add the Real-time Clock Wake-up support too for sam9g45 and sam9x5. Add AT91_SHDW_CPTWK0_MAX to specific the Max of the Wakeup Counter. Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD Acked-by: Rob Herring Acked-by: Nicolas Ferre --- .../devicetree/bindings/arm/atmel-at91.txt | 29 ++++++++ arch/arm/boot/dts/at91sam9g20.dtsi | 5 ++ arch/arm/boot/dts/at91sam9g45.dtsi | 5 ++ arch/arm/boot/dts/at91sam9x5.dtsi | 5 ++ arch/arm/mach-at91/include/mach/at91_shdwc.h | 4 +- arch/arm/mach-at91/setup.c | 77 ++++++++++++++++++++++ 6 files changed, 124 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/arm/atmel-at91.txt b/Documentation/devicetree/bindings/arm/atmel-at91.txt index 1f8782077433..ecc81e368715 100644 --- a/Documentation/devicetree/bindings/arm/atmel-at91.txt +++ b/Documentation/devicetree/bindings/arm/atmel-at91.txt @@ -61,3 +61,32 @@ Examples: reg = <0xffffe400 0x200 0xffffe600 0x200>; }; + +SHDWC Shutdown Controller + +required properties: +- compatible: Should be "atmel,-shdwc". + can be "at91sam9260", "at91sam9rl" or "at91sam9x5". +- reg: Should contain registers location and length + +optional properties: +- atmel,wakeup-mode: String, operation mode of the wakeup mode. + Supported values are: "none", "high", "low", "any". +- atmel,wakeup-counter: Counter on Wake-up 0 (between 0x0 and 0xf). + +optional at91sam9260 properties: +- atmel,wakeup-rtt-timer: boolean to enable Real-time Timer Wake-up. + +optional at91sam9rl properties: +- atmel,wakeup-rtc-timer: boolean to enable Real-time Clock Wake-up. +- atmel,wakeup-rtt-timer: boolean to enable Real-time Timer Wake-up. + +optional at91sam9x5 properties: +- atmel,wakeup-rtc-timer: boolean to enable Real-time Clock Wake-up. + +Example: + + rstc@fffffd00 { + compatible = "atmel,at91sam9260-rstc"; + reg = <0xfffffd00 0x10>; + }; diff --git a/arch/arm/boot/dts/at91sam9g20.dtsi b/arch/arm/boot/dts/at91sam9g20.dtsi index 0a1df8d9bfb6..9a0647bb387f 100644 --- a/arch/arm/boot/dts/at91sam9g20.dtsi +++ b/arch/arm/boot/dts/at91sam9g20.dtsi @@ -74,6 +74,11 @@ reg = <0xfffffd00 0x10>; }; + shdwc@fffffd10 { + compatible = "atmel,at91sam9260-shdwc"; + reg = <0xfffffd10 0x10>; + }; + pit: timer@fffffd30 { compatible = "atmel,at91sam9260-pit"; reg = <0xfffffd30 0xf>; diff --git a/arch/arm/boot/dts/at91sam9g45.dtsi b/arch/arm/boot/dts/at91sam9g45.dtsi index 587a1913c062..8908f078c307 100644 --- a/arch/arm/boot/dts/at91sam9g45.dtsi +++ b/arch/arm/boot/dts/at91sam9g45.dtsi @@ -83,6 +83,11 @@ }; + shdwc@fffffd10 { + compatible = "atmel,at91sam9rl-shdwc"; + reg = <0xfffffd10 0x10>; + }; + tcb0: timer@fff7c000 { compatible = "atmel,at91rm9200-tcb"; reg = <0xfff7c000 0x100>; diff --git a/arch/arm/boot/dts/at91sam9x5.dtsi b/arch/arm/boot/dts/at91sam9x5.dtsi index 73c46e3dffab..20155ccbbe14 100644 --- a/arch/arm/boot/dts/at91sam9x5.dtsi +++ b/arch/arm/boot/dts/at91sam9x5.dtsi @@ -73,6 +73,11 @@ reg = <0xfffffe00 0x10>; }; + shdwc@fffffe10 { + compatible = "atmel,at91sam9x5-shdwc"; + reg = <0xfffffe10 0x10>; + }; + pit: timer@fffffe30 { compatible = "atmel,at91sam9260-pit"; reg = <0xfffffe30 0xf>; diff --git a/arch/arm/mach-at91/include/mach/at91_shdwc.h b/arch/arm/mach-at91/include/mach/at91_shdwc.h index 1d4fe822c77a..60478ea8bd46 100644 --- a/arch/arm/mach-at91/include/mach/at91_shdwc.h +++ b/arch/arm/mach-at91/include/mach/at91_shdwc.h @@ -36,9 +36,11 @@ extern void __iomem *at91_shdwc_base; #define AT91_SHDW_WKMODE0_HIGH 1 #define AT91_SHDW_WKMODE0_LOW 2 #define AT91_SHDW_WKMODE0_ANYLEVEL 3 -#define AT91_SHDW_CPTWK0 (0xf << 4) /* Counter On Wake Up 0 */ +#define AT91_SHDW_CPTWK0_MAX 0xf /* Maximum Counter On Wake Up 0 */ +#define AT91_SHDW_CPTWK0 (AT91_SHDW_CPTWK0_MAX << 4) /* Counter On Wake Up 0 */ #define AT91_SHDW_CPTWK0_(x) ((x) << 4) #define AT91_SHDW_RTTWKEN (1 << 16) /* Real Time Timer Wake-up Enable */ +#define AT91_SHDW_RTCWKEN (1 << 17) /* Real Time Clock Wake-up Enable */ #define AT91_SHDW_SR 0x08 /* Shut Down Status Register */ #define AT91_SHDW_WAKEUP0 (1 << 0) /* Wake-up 0 Status */ diff --git a/arch/arm/mach-at91/setup.c b/arch/arm/mach-at91/setup.c index 46d0a56ba825..1083739e3065 100644 --- a/arch/arm/mach-at91/setup.c +++ b/arch/arm/mach-at91/setup.c @@ -351,10 +351,87 @@ static void at91_dt_ramc(void) of_node_put(np); } +static struct of_device_id shdwc_ids[] = { + { .compatible = "atmel,at91sam9260-shdwc", }, + { .compatible = "atmel,at91sam9rl-shdwc", }, + { .compatible = "atmel,at91sam9x5-shdwc", }, + { /*sentinel*/ } +}; + +static const char *shdwc_wakeup_modes[] = { + [AT91_SHDW_WKMODE0_NONE] = "none", + [AT91_SHDW_WKMODE0_HIGH] = "high", + [AT91_SHDW_WKMODE0_LOW] = "low", + [AT91_SHDW_WKMODE0_ANYLEVEL] = "any", +}; + +const int at91_dtget_shdwc_wakeup_mode(struct device_node *np) +{ + const char *pm; + int err, i; + + err = of_property_read_string(np, "atmel,wakeup-mode", &pm); + if (err < 0) + return AT91_SHDW_WKMODE0_ANYLEVEL; + + for (i = 0; i < ARRAY_SIZE(shdwc_wakeup_modes); i++) + if (!strcasecmp(pm, shdwc_wakeup_modes[i])) + return i; + + return -ENODEV; +} + +static void at91_dt_shdwc(void) +{ + struct device_node *np; + int wakeup_mode; + u32 reg; + u32 mode = 0; + + np = of_find_matching_node(NULL, shdwc_ids); + if (!np) { + pr_debug("AT91: unable to find compatible shutdown (shdwc) conroller node in dtb\n"); + return; + } + + at91_shdwc_base = of_iomap(np, 0); + if (!at91_shdwc_base) + panic("AT91: unable to map shdwc cpu registers\n"); + + wakeup_mode = at91_dtget_shdwc_wakeup_mode(np); + if (wakeup_mode < 0) { + pr_warn("AT91: shdwc unknown wakeup mode\n"); + goto end; + } + + if (!of_property_read_u32(np, "atmel,wakeup-counter", ®)) { + if (reg > AT91_SHDW_CPTWK0_MAX) { + pr_warn("AT91: shdwc wakeup conter 0x%x > 0x%x reduce it to 0x%x\n", + reg, AT91_SHDW_CPTWK0_MAX, AT91_SHDW_CPTWK0_MAX); + reg = AT91_SHDW_CPTWK0_MAX; + } + mode |= AT91_SHDW_CPTWK0_(reg); + } + + if (of_property_read_bool(np, "atmel,wakeup-rtc-timer")) + mode |= AT91_SHDW_RTCWKEN; + + if (of_property_read_bool(np, "atmel,wakeup-rtt-timer")) + mode |= AT91_SHDW_RTTWKEN; + + at91_shdwc_write(AT91_SHDW_MR, wakeup_mode | mode); + +end: + pm_power_off = at91sam9_poweroff; + + of_node_put(np); +} + void __init at91_dt_initialize(void) { at91_dt_rstc(); at91_dt_ramc(); + at91_dt_shdwc(); /* Init clock subsystem */ at91_dt_clock_init(); -- cgit v1.2.3 From 2419730f8f8ce04cce9e39a715c149283210ce27 Mon Sep 17 00:00:00 2001 From: Jean-Christophe PLAGNIOL-VILLARD Date: Mon, 21 Nov 2011 06:55:18 +0800 Subject: ARM: at91: usb ohci add dt support Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD Cc: Nicolas Ferre Acked-by: Grant Likely Acked-by: Greg Kroah-Hartman --- .../devicetree/bindings/usb/atmel-usb.txt | 19 ++++ drivers/usb/host/ohci-at91.c | 101 ++++++++++++++++++++- 2 files changed, 119 insertions(+), 1 deletion(-) create mode 100644 Documentation/devicetree/bindings/usb/atmel-usb.txt (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/usb/atmel-usb.txt b/Documentation/devicetree/bindings/usb/atmel-usb.txt new file mode 100644 index 000000000000..6c7f728a362e --- /dev/null +++ b/Documentation/devicetree/bindings/usb/atmel-usb.txt @@ -0,0 +1,19 @@ +Atmel SOC USB controllers + +OHCI + +Required properties: + - compatible: Should be "atmel,at91rm9200-ohci" for USB controllers + used in host mode. + - num-ports: Number of ports. + - atmel,vbus-gpio: If present, specifies a gpio that needs to be + activated for the bus to be powered. + - atmel,oc-gpio: If present, specifies a gpio that needs to be + activated for the overcurrent detection. + +usb0: ohci@00500000 { + compatible = "atmel,at91rm9200-ohci", "usb-ohci"; + reg = <0x00500000 0x100000>; + interrupts = <20 4>; + num-ports = <2>; +}; diff --git a/drivers/usb/host/ohci-at91.c b/drivers/usb/host/ohci-at91.c index 8e855eb0bf89..db8963f5fbce 100644 --- a/drivers/usb/host/ohci-at91.c +++ b/drivers/usb/host/ohci-at91.c @@ -14,6 +14,8 @@ #include #include +#include +#include #include #include @@ -477,13 +479,109 @@ static irqreturn_t ohci_hcd_at91_overcurrent_irq(int irq, void *data) return IRQ_HANDLED; } +#ifdef CONFIG_OF +static const struct of_device_id at91_ohci_dt_ids[] = { + { .compatible = "atmel,at91rm9200-ohci" }, + { /* sentinel */ } +}; + +MODULE_DEVICE_TABLE(of, at91_ohci_dt_ids); + +static u64 at91_ohci_dma_mask = DMA_BIT_MASK(32); + +static int __devinit ohci_at91_of_init(struct platform_device *pdev) +{ + struct device_node *np = pdev->dev.of_node; + int i, ret, gpio; + enum of_gpio_flags flags; + struct at91_usbh_data *pdata; + u32 ports; + + if (!np) + return 0; + + /* Right now device-tree probed devices don't get dma_mask set. + * Since shared usb code relies on it, set it here for now. + * Once we have dma capability bindings this can go away. + */ + if (!pdev->dev.dma_mask) + pdev->dev.dma_mask = &at91_ohci_dma_mask; + + pdata = devm_kzalloc(&pdev->dev, sizeof(*pdata), GFP_KERNEL); + if (!pdata) + return -ENOMEM; + + if (!of_property_read_u32(np, "num-ports", &ports)) + pdata->ports = ports; + + for (i = 0; i < 2; i++) { + gpio = of_get_named_gpio_flags(np, "atmel,vbus-gpio", i, &flags); + pdata->vbus_pin[i] = gpio; + if (!gpio_is_valid(gpio)) + continue; + pdata->vbus_pin_active_low[i] = flags & OF_GPIO_ACTIVE_LOW; + ret = gpio_request(gpio, "ohci_vbus"); + if (ret) { + dev_warn(&pdev->dev, "can't request vbus gpio %d", gpio); + continue; + } + ret = gpio_direction_output(gpio, !(flags & OF_GPIO_ACTIVE_LOW) ^ 1); + if (ret) + dev_warn(&pdev->dev, "can't put vbus gpio %d as output %d", + !(flags & OF_GPIO_ACTIVE_LOW) ^ 1, gpio); + } + + for (i = 0; i < 2; i++) { + gpio = of_get_named_gpio_flags(np, "atmel,oc-gpio", i, &flags); + pdata->overcurrent_pin[i] = gpio; + if (!gpio_is_valid(gpio)) + continue; + ret = gpio_request(gpio, "ohci_overcurrent"); + if (ret) { + dev_err(&pdev->dev, "can't request overcurrent gpio %d", gpio); + continue; + } + + ret = gpio_direction_input(gpio); + if (ret) { + dev_err(&pdev->dev, "can't configure overcurrent gpio %d as input", gpio); + continue; + } + + ret = request_irq(gpio_to_irq(gpio), + ohci_hcd_at91_overcurrent_irq, + IRQF_SHARED, "ohci_overcurrent", pdev); + if (ret) { + gpio_free(gpio); + dev_warn(& pdev->dev, "cannot get GPIO IRQ for overcurrent\n"); + } + } + + pdev->dev.platform_data = pdata; + + return 0; +} +#else +static int __devinit ohci_at91_of_init(struct platform_device *pdev) +{ + return 0; +} +#endif + /*-------------------------------------------------------------------------*/ static int ohci_hcd_at91_drv_probe(struct platform_device *pdev) { - struct at91_usbh_data *pdata = pdev->dev.platform_data; + struct at91_usbh_data *pdata; int i; + i = ohci_at91_of_init(pdev); + + if (i) + return i; + + pdata = pdev->dev.platform_data; + if (pdata) { for (i = 0; i < ARRAY_SIZE(pdata->vbus_pin); i++) { if (!gpio_is_valid(pdata->vbus_pin[i])) @@ -596,5 +694,6 @@ static struct platform_driver ohci_hcd_at91_driver = { .driver = { .name = "at91_ohci", .owner = THIS_MODULE, + .of_match_table = of_match_ptr(at91_ohci_dt_ids), }, }; -- cgit v1.2.3 From 9d843003357f0e4948ac624a99a411a2dc37dfaf Mon Sep 17 00:00:00 2001 From: Jean-Christophe PLAGNIOL-VILLARD Date: Tue, 22 Nov 2011 12:11:13 +0800 Subject: ARM: at91: usb ehci add dt support Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD Cc: Nicolas Ferre Acked-by: Grant Likely Acked-by: Greg Kroah-Hartman --- .../devicetree/bindings/usb/atmel-usb.txt | 12 +++++++++++ drivers/usb/host/ehci-atmel.c | 24 +++++++++++++++++++++- 2 files changed, 35 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/usb/atmel-usb.txt b/Documentation/devicetree/bindings/usb/atmel-usb.txt index 6c7f728a362e..0143d7c5b4b9 100644 --- a/Documentation/devicetree/bindings/usb/atmel-usb.txt +++ b/Documentation/devicetree/bindings/usb/atmel-usb.txt @@ -17,3 +17,15 @@ usb0: ohci@00500000 { interrupts = <20 4>; num-ports = <2>; }; + +EHCI + +Required properties: + - compatible: Should be "atmel,at91sam9g45-ehci" for USB controllers + used in host mode. + +usb1: ehci@00800000 { + compatible = "atmel,at91sam9g45-ehci", "usb-ehci"; + reg = <0x00800000 0x100000>; + interrupts = <22 4>; +}; diff --git a/drivers/usb/host/ehci-atmel.c b/drivers/usb/host/ehci-atmel.c index a5a3ef1f0096..19f318ababa2 100644 --- a/drivers/usb/host/ehci-atmel.c +++ b/drivers/usb/host/ehci-atmel.c @@ -13,6 +13,7 @@ #include #include +#include /* interface and function clocks */ static struct clk *iclk, *fclk; @@ -115,6 +116,8 @@ static const struct hc_driver ehci_atmel_hc_driver = { .clear_tt_buffer_complete = ehci_clear_tt_buffer_complete, }; +static u64 at91_ehci_dma_mask = DMA_BIT_MASK(32); + static int __devinit ehci_atmel_drv_probe(struct platform_device *pdev) { struct usb_hcd *hcd; @@ -137,6 +140,13 @@ static int __devinit ehci_atmel_drv_probe(struct platform_device *pdev) goto fail_create_hcd; } + /* Right now device-tree probed devices don't get dma_mask set. + * Since shared usb code relies on it, set it here for now. + * Once we have dma capability bindings this can go away. + */ + if (!pdev->dev.dma_mask) + pdev->dev.dma_mask = &at91_ehci_dma_mask; + hcd = usb_create_hcd(driver, &pdev->dev, dev_name(&pdev->dev)); if (!hcd) { retval = -ENOMEM; @@ -225,9 +235,21 @@ static int __devexit ehci_atmel_drv_remove(struct platform_device *pdev) return 0; } +#ifdef CONFIG_OF +static const struct of_device_id atmel_ehci_dt_ids[] = { + { .compatible = "atmel,at91sam9g45-ehci" }, + { /* sentinel */ } +}; + +MODULE_DEVICE_TABLE(of, atmel_ehci_dt_ids); +#endif + static struct platform_driver ehci_atmel_driver = { .probe = ehci_atmel_drv_probe, .remove = __devexit_p(ehci_atmel_drv_remove), .shutdown = usb_hcd_platform_shutdown, - .driver.name = "atmel-ehci", + .driver = { + .name = "atmel-ehci", + .of_match_table = of_match_ptr(atmel_ehci_dt_ids), + }, }; -- cgit v1.2.3 From d1494a340807c9b77aa44bc8d8166353df4cf1c3 Mon Sep 17 00:00:00 2001 From: Jean-Christophe PLAGNIOL-VILLARD Date: Sat, 28 Jan 2012 22:35:36 +0800 Subject: USB: at91: Device udc add dt support Allow to compile it if AT91 is enable. Signed-off-by: Jean-Christophe PLAGNIOL-VILLARD Acked-by: Greg Kroah-Hartman --- .../devicetree/bindings/usb/atmel-usb.txt | 18 ++++++++++ drivers/usb/gadget/Kconfig | 2 +- drivers/usb/gadget/at91_udc.c | 40 ++++++++++++++++++++-- 3 files changed, 57 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/usb/atmel-usb.txt b/Documentation/devicetree/bindings/usb/atmel-usb.txt index 0143d7c5b4b9..60bd2150a3e6 100644 --- a/Documentation/devicetree/bindings/usb/atmel-usb.txt +++ b/Documentation/devicetree/bindings/usb/atmel-usb.txt @@ -29,3 +29,21 @@ usb1: ehci@00800000 { reg = <0x00800000 0x100000>; interrupts = <22 4>; }; + +AT91 USB device controller + +Required properties: + - compatible: Should be "atmel,at91rm9200-udc" + - reg: Address and length of the register set for the device + - interrupts: Should contain macb interrupt + +Optional properties: + - atmel,vbus-gpio: If present, specifies a gpio that needs to be + activated for the bus to be powered. + +usb1: gadget@fffa4000 { + compatible = "atmel,at91rm9200-udc"; + reg = <0xfffa4000 0x4000>; + interrupts = <10 4>; + atmel,vbus-gpio = <&pioC 5 0>; +}; diff --git a/drivers/usb/gadget/Kconfig b/drivers/usb/gadget/Kconfig index 85ae4b46bb68..edf114412135 100644 --- a/drivers/usb/gadget/Kconfig +++ b/drivers/usb/gadget/Kconfig @@ -137,7 +137,7 @@ choice config USB_AT91 tristate "Atmel AT91 USB Device Port" - depends on ARCH_AT91 && !ARCH_AT91SAM9RL && !ARCH_AT91SAM9G45 + depends on ARCH_AT91 help Many Atmel AT91 processors (such as the AT91RM2000) have a full speed USB Device Port with support for five configurable diff --git a/drivers/usb/gadget/at91_udc.c b/drivers/usb/gadget/at91_udc.c index f99b3dc745bd..4063209fe8da 100644 --- a/drivers/usb/gadget/at91_udc.c +++ b/drivers/usb/gadget/at91_udc.c @@ -30,6 +30,8 @@ #include #include #include +#include +#include #include #include @@ -1707,7 +1709,27 @@ static void at91udc_shutdown(struct platform_device *dev) spin_unlock_irqrestore(&udc->lock, flags); } -static int __init at91udc_probe(struct platform_device *pdev) +static void __devinit at91udc_of_init(struct at91_udc *udc, + struct device_node *np) +{ + struct at91_udc_data *board = &udc->board; + u32 val; + enum of_gpio_flags flags; + + if (of_property_read_u32(np, "atmel,vbus-polled", &val) == 0) + board->vbus_polled = 1; + + board->vbus_pin = of_get_named_gpio_flags(np, "atmel,vbus-gpio", 0, + &flags); + board->vbus_active_low = (flags & OF_GPIO_ACTIVE_LOW) ? 1 : 0; + + board->pullup_pin = of_get_named_gpio_flags(np, "atmel,pullup-gpio", 0, + &flags); + + board->pullup_active_low = (flags & OF_GPIO_ACTIVE_LOW) ? 1 : 0; +} + +static int __devinit at91udc_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; struct at91_udc *udc; @@ -1742,7 +1764,11 @@ static int __init at91udc_probe(struct platform_device *pdev) /* init software state */ udc = &controller; udc->gadget.dev.parent = dev; - udc->board = *(struct at91_udc_data *) dev->platform_data; + if (pdev->dev.of_node) + at91udc_of_init(udc, pdev->dev.of_node); + else + memcpy(&udc->board, dev->platform_data, + sizeof(struct at91_udc_data)); udc->pdev = pdev; udc->enabled = 0; spin_lock_init(&udc->lock); @@ -1971,6 +1997,15 @@ static int at91udc_resume(struct platform_device *pdev) #define at91udc_resume NULL #endif +#if defined(CONFIG_OF) +static const struct of_device_id at91_udc_dt_ids[] = { + { .compatible = "atmel,at91rm9200-udc" }, + { /* sentinel */ } +}; + +MODULE_DEVICE_TABLE(of, at91_udc_dt_ids); +#endif + static struct platform_driver at91_udc_driver = { .remove = __exit_p(at91udc_remove), .shutdown = at91udc_shutdown, @@ -1979,6 +2014,7 @@ static struct platform_driver at91_udc_driver = { .driver = { .name = (char *) driver_name, .owner = THIS_MODULE, + .of_match_table = of_match_ptr(at91_udc_dt_ids), }, }; -- cgit v1.2.3 From 641cc938815dfd09f8fa1ec72deb814f0938ac33 Mon Sep 17 00:00:00 2001 From: Jiri Olsa Date: Thu, 15 Mar 2012 20:09:14 +0100 Subject: perf: Adding sysfs group format attribute for pmu device Adding sysfs group 'format' attribute for pmu device that contains a syntax description on how to construct raw events. The event configuration is described in following struct pefr_event_attr attributes: config config1 config2 Each sysfs attribute within the format attribute group, describes mapping of name and bitfield definition within one of above attributes. eg: "/sys/.../format/event" contains "config:0-7" "/sys/.../format/umask" contains "config:8-15" "/sys/.../format/usr" contains "config:16" the attribute value syntax is: line: config ':' bits config: 'config' | 'config1' | 'config2" bits: bits ',' bit_term | bit_term bit_term: VALUE '-' VALUE | VALUE Adding format attribute definitions for x86 cpu pmus. Acked-by: Peter Zijlstra Signed-off-by: Peter Zijlstra Signed-off-by: Jiri Olsa Link: http://lkml.kernel.org/n/tip-vhdk5y2hyype9j63prymty36@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- .../testing/sysfs-bus-event_source-devices-format | 14 +++++++++ arch/x86/kernel/cpu/perf_event.c | 7 +++++ arch/x86/kernel/cpu/perf_event.h | 1 + arch/x86/kernel/cpu/perf_event_amd.c | 18 +++++++++++ arch/x86/kernel/cpu/perf_event_intel.c | 36 ++++++++++++++++++++++ arch/x86/kernel/cpu/perf_event_p6.c | 19 ++++++++++++ include/linux/perf_event.h | 14 +++++++++ 7 files changed, 109 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-bus-event_source-devices-format (limited to 'Documentation') diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-format b/Documentation/ABI/testing/sysfs-bus-event_source-devices-format new file mode 100644 index 000000000000..079afc71363d --- /dev/null +++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-format @@ -0,0 +1,14 @@ +Where: /sys/bus/event_source/devices//format +Date: January 2012 +Kernel Version: 3.3 +Contact: Jiri Olsa +Description: + Attribute group to describe the magic bits that go into + perf_event_attr::config[012] for a particular pmu. + Each attribute of this group defines the 'hardware' bitmask + we want to export, so that userspace can deal with sane + name/value pairs. + + Example: 'config1:1,6-10,44' + Defines contents of attribute that occupies bits 1,6-10,44 of + perf_event_attr::config1. diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c index 0a18d16cb58d..453ac9497574 100644 --- a/arch/x86/kernel/cpu/perf_event.c +++ b/arch/x86/kernel/cpu/perf_event.c @@ -1314,6 +1314,11 @@ static void __init pmu_check_apic(void) pr_info("no hardware sampling interrupt available.\n"); } +static struct attribute_group x86_pmu_format_group = { + .name = "format", + .attrs = NULL, +}; + static int __init init_hw_perf_events(void) { struct x86_pmu_quirk *quirk; @@ -1388,6 +1393,7 @@ static int __init init_hw_perf_events(void) } x86_pmu.attr_rdpmc = 1; /* enable userspace RDPMC usage by default */ + x86_pmu_format_group.attrs = x86_pmu.format_attrs; pr_info("... version: %d\n", x86_pmu.version); pr_info("... bit width: %d\n", x86_pmu.cntval_bits); @@ -1668,6 +1674,7 @@ static struct attribute_group x86_pmu_attr_group = { static const struct attribute_group *x86_pmu_attr_groups[] = { &x86_pmu_attr_group, + &x86_pmu_format_group, NULL, }; diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h index 8484e77c211e..6638aaf54493 100644 --- a/arch/x86/kernel/cpu/perf_event.h +++ b/arch/x86/kernel/cpu/perf_event.h @@ -339,6 +339,7 @@ struct x86_pmu { * sysfs attrs */ int attr_rdpmc; + struct attribute **format_attrs; /* * CPU Hotplug hooks diff --git a/arch/x86/kernel/cpu/perf_event_amd.c b/arch/x86/kernel/cpu/perf_event_amd.c index dd002faff7a6..95e7fe1c5f0b 100644 --- a/arch/x86/kernel/cpu/perf_event_amd.c +++ b/arch/x86/kernel/cpu/perf_event_amd.c @@ -404,6 +404,21 @@ static void amd_pmu_cpu_dead(int cpu) } } +PMU_FORMAT_ATTR(event, "config:0-7,32-35"); +PMU_FORMAT_ATTR(umask, "config:8-15" ); +PMU_FORMAT_ATTR(edge, "config:18" ); +PMU_FORMAT_ATTR(inv, "config:23" ); +PMU_FORMAT_ATTR(cmask, "config:24-31" ); + +static struct attribute *amd_format_attr[] = { + &format_attr_event.attr, + &format_attr_umask.attr, + &format_attr_edge.attr, + &format_attr_inv.attr, + &format_attr_cmask.attr, + NULL, +}; + static __initconst const struct x86_pmu amd_pmu = { .name = "AMD", .handle_irq = x86_pmu_handle_irq, @@ -426,6 +441,8 @@ static __initconst const struct x86_pmu amd_pmu = { .get_event_constraints = amd_get_event_constraints, .put_event_constraints = amd_put_event_constraints, + .format_attrs = amd_format_attr, + .cpu_prepare = amd_pmu_cpu_prepare, .cpu_starting = amd_pmu_cpu_starting, .cpu_dead = amd_pmu_cpu_dead, @@ -596,6 +613,7 @@ static __initconst const struct x86_pmu amd_pmu_f15h = { .cpu_dead = amd_pmu_cpu_dead, #endif .cpu_starting = amd_pmu_cpu_starting, + .format_attrs = amd_format_attr, }; __init int amd_pmu_init(void) diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c index 6a84e7f28f05..26b3e2fef104 100644 --- a/arch/x86/kernel/cpu/perf_event_intel.c +++ b/arch/x86/kernel/cpu/perf_event_intel.c @@ -1431,6 +1431,24 @@ static void core_pmu_enable_all(int added) } } +PMU_FORMAT_ATTR(event, "config:0-7" ); +PMU_FORMAT_ATTR(umask, "config:8-15" ); +PMU_FORMAT_ATTR(edge, "config:18" ); +PMU_FORMAT_ATTR(pc, "config:19" ); +PMU_FORMAT_ATTR(any, "config:21" ); /* v3 + */ +PMU_FORMAT_ATTR(inv, "config:23" ); +PMU_FORMAT_ATTR(cmask, "config:24-31" ); + +static struct attribute *intel_arch_formats_attr[] = { + &format_attr_event.attr, + &format_attr_umask.attr, + &format_attr_edge.attr, + &format_attr_pc.attr, + &format_attr_inv.attr, + &format_attr_cmask.attr, + NULL, +}; + static __initconst const struct x86_pmu core_pmu = { .name = "core", .handle_irq = x86_pmu_handle_irq, @@ -1455,6 +1473,7 @@ static __initconst const struct x86_pmu core_pmu = { .put_event_constraints = intel_put_event_constraints, .event_constraints = intel_core_event_constraints, .guest_get_msrs = core_guest_get_msrs, + .format_attrs = intel_arch_formats_attr, }; struct intel_shared_regs *allocate_shared_regs(int cpu) @@ -1553,6 +1572,21 @@ static void intel_pmu_flush_branch_stack(void) intel_pmu_lbr_reset(); } +PMU_FORMAT_ATTR(offcore_rsp, "config1:0-63"); + +static struct attribute *intel_arch3_formats_attr[] = { + &format_attr_event.attr, + &format_attr_umask.attr, + &format_attr_edge.attr, + &format_attr_pc.attr, + &format_attr_any.attr, + &format_attr_inv.attr, + &format_attr_cmask.attr, + + &format_attr_offcore_rsp.attr, /* XXX do NHM/WSM + SNB breakout */ + NULL, +}; + static __initconst const struct x86_pmu intel_pmu = { .name = "Intel", .handle_irq = intel_pmu_handle_irq, @@ -1576,6 +1610,8 @@ static __initconst const struct x86_pmu intel_pmu = { .get_event_constraints = intel_get_event_constraints, .put_event_constraints = intel_put_event_constraints, + .format_attrs = intel_arch3_formats_attr, + .cpu_prepare = intel_pmu_cpu_prepare, .cpu_starting = intel_pmu_cpu_starting, .cpu_dying = intel_pmu_cpu_dying, diff --git a/arch/x86/kernel/cpu/perf_event_p6.c b/arch/x86/kernel/cpu/perf_event_p6.c index c7181befecde..32bcfc7dd230 100644 --- a/arch/x86/kernel/cpu/perf_event_p6.c +++ b/arch/x86/kernel/cpu/perf_event_p6.c @@ -87,6 +87,23 @@ static void p6_pmu_enable_event(struct perf_event *event) (void)checking_wrmsrl(hwc->config_base, val); } +PMU_FORMAT_ATTR(event, "config:0-7" ); +PMU_FORMAT_ATTR(umask, "config:8-15" ); +PMU_FORMAT_ATTR(edge, "config:18" ); +PMU_FORMAT_ATTR(pc, "config:19" ); +PMU_FORMAT_ATTR(inv, "config:23" ); +PMU_FORMAT_ATTR(cmask, "config:24-31" ); + +static struct attribute *intel_p6_formats_attr[] = { + &format_attr_event.attr, + &format_attr_umask.attr, + &format_attr_edge.attr, + &format_attr_pc.attr, + &format_attr_inv.attr, + &format_attr_cmask.attr, + NULL, +}; + static __initconst const struct x86_pmu p6_pmu = { .name = "p6", .handle_irq = x86_pmu_handle_irq, @@ -115,6 +132,8 @@ static __initconst const struct x86_pmu p6_pmu = { .cntval_mask = (1ULL << 32) - 1, .get_event_constraints = x86_get_event_constraints, .event_constraints = p6_event_constraints, + + .format_attrs = intel_p6_formats_attr, }; __init int p6_pmu_init(void) diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index bd9f55a5958d..57ae485e80fc 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -550,6 +550,7 @@ struct perf_guest_info_callbacks { #include #include #include +#include #include #define PERF_MAX_STACK_DEPTH 255 @@ -1291,5 +1292,18 @@ do { \ register_cpu_notifier(&fn##_nb); \ } while (0) + +#define PMU_FORMAT_ATTR(_name, _format) \ +static ssize_t \ +_name##_show(struct device *dev, \ + struct device_attribute *attr, \ + char *page) \ +{ \ + BUILD_BUG_ON(sizeof(_format) >= PAGE_SIZE); \ + return sprintf(page, _format "\n"); \ +} \ + \ +static struct device_attribute format_attr_##_name = __ATTR_RO(_name) + #endif /* __KERNEL__ */ #endif /* _LINUX_PERF_EVENT_H */ -- cgit v1.2.3 From 9652e8bd16e73f7a34cabf1ab114aaa5c97db660 Mon Sep 17 00:00:00 2001 From: Stefan Roese Date: Fri, 16 Mar 2012 14:03:23 +0100 Subject: ARM: SPEAr600: Add device-tree support to SPEAr600 boards This patch adds a generic target for SPEAr600 board that can be configured via the device-tree. Currently the following devices are supported via the devicetree: - VIC interrupts - PL011 UART - PL061 GPIO - Synopsys DW I2C - Synopsys DW ethernet Other peripheral devices (e.g. SMI flash, FSMC NAND flash etc) will follow in later patches. Only the spear600-evb is currently supported. Other SPEAr600 based boards will follow later. Since the current mainline SPEAr600 code only supports the SPEAr600 evaluation board, with nearly zero peripheral devices (only UART and GPIO), it makes sense to switch over to DT based configuration completely now. So this patch also removes all non-DT stuff, mainly platform device data. The files spear600.c and spear600_evb.c are removed completely. Signed-off-by: Stefan Roese Acked-by: Viresh Kumar Acked-by: Jean-Christophe PLAGNIOL-VILLARD Reviewed-by: Arnd Bergmann Signed-off-by: Arnd Bergmann --- Documentation/devicetree/bindings/arm/spear.txt | 8 ++ arch/arm/boot/dts/spear600-evb.dts | 47 +++++++ arch/arm/boot/dts/spear600.dtsi | 174 ++++++++++++++++++++++++ arch/arm/mach-spear6xx/Kconfig | 7 +- arch/arm/mach-spear6xx/Makefile | 6 - arch/arm/mach-spear6xx/clock.c | 14 +- arch/arm/mach-spear6xx/spear600.c | 25 ---- arch/arm/mach-spear6xx/spear600_evb.c | 54 -------- arch/arm/mach-spear6xx/spear6xx.c | 132 +++++------------- 9 files changed, 276 insertions(+), 191 deletions(-) create mode 100644 Documentation/devicetree/bindings/arm/spear.txt create mode 100644 arch/arm/boot/dts/spear600-evb.dts create mode 100644 arch/arm/boot/dts/spear600.dtsi delete mode 100644 arch/arm/mach-spear6xx/spear600.c delete mode 100644 arch/arm/mach-spear6xx/spear600_evb.c (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/arm/spear.txt b/Documentation/devicetree/bindings/arm/spear.txt new file mode 100644 index 000000000000..f8e54f092328 --- /dev/null +++ b/Documentation/devicetree/bindings/arm/spear.txt @@ -0,0 +1,8 @@ +ST SPEAr Platforms Device Tree Bindings +--------------------------------------- + +Boards with the ST SPEAr600 SoC shall have the following properties: + +Required root node property: + +compatible = "st,spear600"; diff --git a/arch/arm/boot/dts/spear600-evb.dts b/arch/arm/boot/dts/spear600-evb.dts new file mode 100644 index 000000000000..636292e18c90 --- /dev/null +++ b/arch/arm/boot/dts/spear600-evb.dts @@ -0,0 +1,47 @@ +/* + * Copyright 2012 Stefan Roese + * + * The code contained herein is licensed under the GNU General Public + * License. You may obtain a copy of the GNU General Public License + * Version 2 or later at the following locations: + * + * http://www.opensource.org/licenses/gpl-license.html + * http://www.gnu.org/copyleft/gpl.html + */ + +/dts-v1/; +/include/ "spear600.dtsi" + +/ { + model = "ST SPEAr600 Evaluation Board"; + compatible = "st,spear600-evb", "st,spear600"; + #address-cells = <1>; + #size-cells = <1>; + + memory { + device_type = "memory"; + reg = <0 0x10000000>; + }; + + ahb { + gmac: ethernet@e0800000 { + phy-mode = "gmii"; + status = "okay"; + }; + + apb { + serial@d0000000 { + status = "okay"; + }; + + serial@d0080000 { + status = "okay"; + }; + + i2c@d0200000 { + clock-frequency = <400000>; + status = "okay"; + }; + }; + }; +}; diff --git a/arch/arm/boot/dts/spear600.dtsi b/arch/arm/boot/dts/spear600.dtsi new file mode 100644 index 000000000000..ebe0885a2b98 --- /dev/null +++ b/arch/arm/boot/dts/spear600.dtsi @@ -0,0 +1,174 @@ +/* + * Copyright 2012 Stefan Roese + * + * The code contained herein is licensed under the GNU General Public + * License. You may obtain a copy of the GNU General Public License + * Version 2 or later at the following locations: + * + * http://www.opensource.org/licenses/gpl-license.html + * http://www.gnu.org/copyleft/gpl.html + */ + +/include/ "skeleton.dtsi" + +/ { + compatible = "st,spear600"; + + cpus { + cpu@0 { + compatible = "arm,arm926ejs"; + }; + }; + + memory { + device_type = "memory"; + reg = <0 0x40000000>; + }; + + ahb { + #address-cells = <1>; + #size-cells = <1>; + compatible = "simple-bus"; + ranges = <0xd0000000 0xd0000000 0x30000000>; + + vic0: interrupt-controller@f1100000 { + compatible = "arm,pl190-vic"; + interrupt-controller; + reg = <0xf1100000 0x1000>; + #interrupt-cells = <1>; + }; + + vic1: interrupt-controller@f1000000 { + compatible = "arm,pl190-vic"; + interrupt-controller; + reg = <0xf1000000 0x1000>; + #interrupt-cells = <1>; + }; + + gmac: ethernet@e0800000 { + compatible = "st,spear600-gmac"; + reg = <0xe0800000 0x8000>; + interrupt-parent = <&vic1>; + interrupts = <24 23>; + interrupt-names = "macirq", "eth_wake_irq"; + status = "disabled"; + }; + + fsmc: flash@d1800000 { + compatible = "st,spear600-fsmc-nand"; + #address-cells = <1>; + #size-cells = <1>; + reg = <0xd1800000 0x1000 /* FSMC Register */ + 0xd2000000 0x4000>; /* NAND Base */ + reg-names = "fsmc_regs", "nand_data"; + st,ale-off = <0x20000>; + st,cle-off = <0x10000>; + status = "disabled"; + }; + + smi: flash@fc000000 { + compatible = "st,spear600-smi"; + #address-cells = <1>; + #size-cells = <1>; + reg = <0xfc000000 0x1000>; + interrupt-parent = <&vic1>; + interrupts = <12>; + status = "disabled"; + }; + + ehci@e1800000 { + compatible = "st,spear600-ehci", "usb-ehci"; + reg = <0xe1800000 0x1000>; + interrupt-parent = <&vic1>; + interrupts = <27>; + status = "disabled"; + }; + + ehci@e2000000 { + compatible = "st,spear600-ehci", "usb-ehci"; + reg = <0xe2000000 0x1000>; + interrupt-parent = <&vic1>; + interrupts = <29>; + status = "disabled"; + }; + + ohci@e1900000 { + compatible = "st,spear600-ohci", "usb-ohci"; + reg = <0xe1900000 0x1000>; + interrupt-parent = <&vic1>; + interrupts = <26>; + status = "disabled"; + }; + + ohci@e2100000 { + compatible = "st,spear600-ohci", "usb-ohci"; + reg = <0xe2100000 0x1000>; + interrupt-parent = <&vic1>; + interrupts = <28>; + status = "disabled"; + }; + + apb { + #address-cells = <1>; + #size-cells = <1>; + compatible = "simple-bus"; + ranges = <0xd0000000 0xd0000000 0x30000000>; + + serial@d0000000 { + compatible = "arm,pl011", "arm,primecell"; + reg = <0xd0000000 0x1000>; + interrupt-parent = <&vic0>; + interrupts = <24>; + status = "disabled"; + }; + + serial@d0080000 { + compatible = "arm,pl011", "arm,primecell"; + reg = <0xd0080000 0x1000>; + interrupt-parent = <&vic0>; + interrupts = <25>; + status = "disabled"; + }; + + /* local/cpu GPIO */ + gpio0: gpio@f0100000 { + #gpio-cells = <2>; + compatible = "arm,pl061", "arm,primecell"; + gpio-controller; + reg = <0xf0100000 0x1000>; + interrupt-parent = <&vic0>; + interrupts = <18>; + }; + + /* basic GPIO */ + gpio1: gpio@fc980000 { + #gpio-cells = <2>; + compatible = "arm,pl061", "arm,primecell"; + gpio-controller; + reg = <0xfc980000 0x1000>; + interrupt-parent = <&vic1>; + interrupts = <19>; + }; + + /* appl GPIO */ + gpio2: gpio@d8100000 { + #gpio-cells = <2>; + compatible = "arm,pl061", "arm,primecell"; + gpio-controller; + reg = <0xd8100000 0x1000>; + interrupt-parent = <&vic1>; + interrupts = <4>; + }; + + i2c@d0200000 { + #address-cells = <1>; + #size-cells = <0>; + compatible = "snps,designware-i2c"; + reg = <0xd0200000 0x1000>; + interrupt-parent = <&vic0>; + interrupts = <28>; + status = "disabled"; + }; + }; + }; +}; diff --git a/arch/arm/mach-spear6xx/Kconfig b/arch/arm/mach-spear6xx/Kconfig index ff4ae5ba00f1..fbe298bd1d92 100644 --- a/arch/arm/mach-spear6xx/Kconfig +++ b/arch/arm/mach-spear6xx/Kconfig @@ -5,11 +5,12 @@ if ARCH_SPEAR6XX menu "SPEAr6xx Implementations" -config BOARD_SPEAR600_EVB - bool "SPEAr600 Evaluation Board" +config BOARD_SPEAR600_DT + bool "SPEAr600 generic board configured via device-tree" select MACH_SPEAR600 + select USE_OF help - Supports ST SPEAr600 Evaluation Board + Supports ST SPEAr600 boards configured via the device-tree endmenu diff --git a/arch/arm/mach-spear6xx/Makefile b/arch/arm/mach-spear6xx/Makefile index cc1a4d82d459..76e5750552fc 100644 --- a/arch/arm/mach-spear6xx/Makefile +++ b/arch/arm/mach-spear6xx/Makefile @@ -4,9 +4,3 @@ # common files obj-y += clock.o spear6xx.o - -# spear600 specific files -obj-$(CONFIG_MACH_SPEAR600) += spear600.o - -# spear600 boards files -obj-$(CONFIG_BOARD_SPEAR600_EVB) += spear600_evb.o diff --git a/arch/arm/mach-spear6xx/clock.c b/arch/arm/mach-spear6xx/clock.c index ac70e0d88fef..358f2800f17b 100644 --- a/arch/arm/mach-spear6xx/clock.c +++ b/arch/arm/mach-spear6xx/clock.c @@ -641,8 +641,8 @@ static struct clk_lookup spear_clk_lookups[] = { { .con_id = "gpt0_synth_clk", .clk = &gpt0_synth_clk}, { .con_id = "gpt2_synth_clk", .clk = &gpt2_synth_clk}, { .con_id = "gpt3_synth_clk", .clk = &gpt3_synth_clk}, - { .dev_id = "uart0", .clk = &uart0_clk}, - { .dev_id = "uart1", .clk = &uart1_clk}, + { .dev_id = "d0000000.serial", .clk = &uart0_clk}, + { .dev_id = "d0080000.serial", .clk = &uart1_clk}, { .dev_id = "firda", .clk = &firda_clk}, { .dev_id = "clcd", .clk = &clcd_clk}, { .dev_id = "gpt0", .clk = &gpt0_clk}, @@ -655,20 +655,20 @@ static struct clk_lookup spear_clk_lookups[] = { { .con_id = "usbh.1_clk", .clk = &usbh1_clk}, /* clock derived from ahb clk */ { .con_id = "apb_clk", .clk = &apb_clk}, - { .dev_id = "i2c_designware.0", .clk = &i2c_clk}, + { .dev_id = "d0200000.i2c", .clk = &i2c_clk}, { .dev_id = "dma", .clk = &dma_clk}, { .dev_id = "jpeg", .clk = &jpeg_clk}, { .dev_id = "gmac", .clk = &gmac_clk}, { .dev_id = "smi", .clk = &smi_clk}, - { .con_id = "fsmc", .clk = &fsmc_clk}, + { .dev_id = "fsmc-nand", .clk = &fsmc_clk}, /* clock derived from apb clk */ { .dev_id = "adc", .clk = &adc_clk}, { .dev_id = "ssp-pl022.0", .clk = &ssp0_clk}, { .dev_id = "ssp-pl022.1", .clk = &ssp1_clk}, { .dev_id = "ssp-pl022.2", .clk = &ssp2_clk}, - { .dev_id = "gpio0", .clk = &gpio0_clk}, - { .dev_id = "gpio1", .clk = &gpio1_clk}, - { .dev_id = "gpio2", .clk = &gpio2_clk}, + { .dev_id = "f0100000.gpio", .clk = &gpio0_clk}, + { .dev_id = "fc980000.gpio", .clk = &gpio1_clk}, + { .dev_id = "d8100000.gpio", .clk = &gpio2_clk}, }; void __init spear6xx_clk_init(void) diff --git a/arch/arm/mach-spear6xx/spear600.c b/arch/arm/mach-spear6xx/spear600.c deleted file mode 100644 index d0e6eeae9b04..000000000000 --- a/arch/arm/mach-spear6xx/spear600.c +++ /dev/null @@ -1,25 +0,0 @@ -/* - * arch/arm/mach-spear6xx/spear600.c - * - * SPEAr600 machine source file - * - * Copyright (C) 2009 ST Microelectronics - * Rajeev Kumar - * - * This file is licensed under the terms of the GNU General Public - * License version 2. This program is licensed "as is" without any - * warranty of any kind, whether express or implied. - */ - -#include -#include -#include -#include - -/* Add spear600 specific devices here */ - -void __init spear600_init(void) -{ - /* call spear6xx family common init function */ - spear6xx_init(); -} diff --git a/arch/arm/mach-spear6xx/spear600_evb.c b/arch/arm/mach-spear6xx/spear600_evb.c deleted file mode 100644 index c6e4254741cc..000000000000 --- a/arch/arm/mach-spear6xx/spear600_evb.c +++ /dev/null @@ -1,54 +0,0 @@ -/* - * arch/arm/mach-spear6xx/spear600_evb.c - * - * SPEAr600 evaluation board source file - * - * Copyright (C) 2009 ST Microelectronics - * Viresh Kumar - * - * This file is licensed under the terms of the GNU General Public - * License version 2. This program is licensed "as is" without any - * warranty of any kind, whether express or implied. - */ - -#include -#include -#include -#include -#include - -static struct amba_device *amba_devs[] __initdata = { - &gpio_device[0], - &gpio_device[1], - &gpio_device[2], - &uart_device[0], - &uart_device[1], -}; - -static struct platform_device *plat_devs[] __initdata = { -}; - -static void __init spear600_evb_init(void) -{ - unsigned int i; - - /* call spear600 machine init function */ - spear600_init(); - - /* Add Platform Devices */ - platform_add_devices(plat_devs, ARRAY_SIZE(plat_devs)); - - /* Add Amba Devices */ - for (i = 0; i < ARRAY_SIZE(amba_devs); i++) - amba_device_register(amba_devs[i], &iomem_resource); -} - -MACHINE_START(SPEAR600, "ST-SPEAR600-EVB") - .atag_offset = 0x100, - .map_io = spear6xx_map_io, - .init_irq = spear6xx_init_irq, - .handle_irq = vic_handle_irq, - .timer = &spear6xx_timer, - .init_machine = spear600_evb_init, - .restart = spear_restart, -MACHINE_END diff --git a/arch/arm/mach-spear6xx/spear6xx.c b/arch/arm/mach-spear6xx/spear6xx.c index e0f6628c8b2c..2ed8b14c82c8 100644 --- a/arch/arm/mach-spear6xx/spear6xx.c +++ b/arch/arm/mach-spear6xx/spear6xx.c @@ -6,111 +6,21 @@ * Copyright (C) 2009 ST Microelectronics * Rajeev Kumar * + * Copyright 2012 Stefan Roese + * * This file is licensed under the terms of the GNU General Public * License version 2. This program is licensed "as is" without any * warranty of any kind, whether express or implied. */ -#include -#include -#include -#include +#include +#include +#include +#include #include -#include #include #include #include -#include - -/* Add spear6xx machines common devices here */ -/* uart device registration */ -struct amba_device uart_device[] = { - { - .dev = { - .init_name = "uart0", - }, - .res = { - .start = SPEAR6XX_ICM1_UART0_BASE, - .end = SPEAR6XX_ICM1_UART0_BASE + SZ_4K - 1, - .flags = IORESOURCE_MEM, - }, - .irq = {IRQ_UART_0, NO_IRQ}, - }, { - .dev = { - .init_name = "uart1", - }, - .res = { - .start = SPEAR6XX_ICM1_UART1_BASE, - .end = SPEAR6XX_ICM1_UART1_BASE + SZ_4K - 1, - .flags = IORESOURCE_MEM, - }, - .irq = {IRQ_UART_1, NO_IRQ}, - } -}; - -/* gpio device registration */ -static struct pl061_platform_data gpio_plat_data[] = { - { - .gpio_base = 0, - .irq_base = SPEAR_GPIO0_INT_BASE, - }, { - .gpio_base = 8, - .irq_base = SPEAR_GPIO1_INT_BASE, - }, { - .gpio_base = 16, - .irq_base = SPEAR_GPIO2_INT_BASE, - }, -}; - -struct amba_device gpio_device[] = { - { - .dev = { - .init_name = "gpio0", - .platform_data = &gpio_plat_data[0], - }, - .res = { - .start = SPEAR6XX_CPU_GPIO_BASE, - .end = SPEAR6XX_CPU_GPIO_BASE + SZ_4K - 1, - .flags = IORESOURCE_MEM, - }, - .irq = {IRQ_LOCAL_GPIO, NO_IRQ}, - }, { - .dev = { - .init_name = "gpio1", - .platform_data = &gpio_plat_data[1], - }, - .res = { - .start = SPEAR6XX_ICM3_GPIO_BASE, - .end = SPEAR6XX_ICM3_GPIO_BASE + SZ_4K - 1, - .flags = IORESOURCE_MEM, - }, - .irq = {IRQ_BASIC_GPIO, NO_IRQ}, - }, { - .dev = { - .init_name = "gpio2", - .platform_data = &gpio_plat_data[2], - }, - .res = { - .start = SPEAR6XX_ICM2_GPIO_BASE, - .end = SPEAR6XX_ICM2_GPIO_BASE + SZ_4K - 1, - .flags = IORESOURCE_MEM, - }, - .irq = {IRQ_APPL_GPIO, NO_IRQ}, - } -}; - -/* This will add devices, and do machine specific tasks */ -void __init spear6xx_init(void) -{ - /* nothing to do for now */ -} - -/* This will initialize vic */ -void __init spear6xx_init_irq(void) -{ - vic_init((void __iomem *)VA_SPEAR6XX_CPU_VIC_PRI_BASE, 0, ~0, 0); - vic_init((void __iomem *)VA_SPEAR6XX_CPU_VIC_SEC_BASE, 32, ~0, 0); -} /* Following will create static virtual/physical mappings */ static struct map_desc spear6xx_io_desc[] __initdata = { @@ -181,3 +91,33 @@ static void __init spear6xx_timer_init(void) struct sys_timer spear6xx_timer = { .init = spear6xx_timer_init, }; + +static void __init spear600_dt_init(void) +{ + of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL); +} + +static const char *spear600_dt_board_compat[] = { + "st,spear600", + NULL +}; + +static const struct of_device_id vic_of_match[] __initconst = { + { .compatible = "arm,pl190-vic", .data = vic_of_init, }, + { /* Sentinel */ } +}; + +static void __init spear6xx_dt_init_irq(void) +{ + of_irq_init(vic_of_match); +} + +DT_MACHINE_START(SPEAR600_DT, "ST SPEAr600 (Flattened Device Tree)") + .map_io = spear6xx_map_io, + .init_irq = spear6xx_dt_init_irq, + .handle_irq = vic_handle_irq, + .timer = &spear6xx_timer, + .init_machine = spear600_dt_init, + .restart = spear_restart, + .dt_compat = spear600_dt_board_compat, +MACHINE_END -- cgit v1.2.3 From 69fe8a8e92ae6877167f222838bd0c92b35c7d72 Mon Sep 17 00:00:00 2001 From: Mike Turquette Date: Thu, 15 Mar 2012 23:11:18 -0700 Subject: Documentation: common clk API Provide documentation for the common clk structures and APIs. This code can be found in drivers/clk/ and include/linux/clk*.h. Signed-off-by: Mike Turquette Signed-off-by: Mike Turquette Reviewed-by: Andrew Lunn Cc: Russell King Cc: Jeremy Kerr Cc: Thomas Gleixner Cc: Arnd Bergman Cc: Paul Walmsley Cc: Shawn Guo Cc: Sascha Hauer Cc: Richard Zhao Cc: Saravana Kannan Cc: Magnus Damm Cc: Rob Herring Cc: Mark Brown Cc: Linus Walleij Cc: Stephen Boyd Cc: Amit Kucheria Cc: Deepak Saxena Cc: Grant Likely Signed-off-by: Arnd Bergmann --- Documentation/clk.txt | 233 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 233 insertions(+) create mode 100644 Documentation/clk.txt (limited to 'Documentation') diff --git a/Documentation/clk.txt b/Documentation/clk.txt new file mode 100644 index 000000000000..1943fae014fd --- /dev/null +++ b/Documentation/clk.txt @@ -0,0 +1,233 @@ + The Common Clk Framework + Mike Turquette + +This document endeavours to explain the common clk framework details, +and how to port a platform over to this framework. It is not yet a +detailed explanation of the clock api in include/linux/clk.h, but +perhaps someday it will include that information. + + Part 1 - introduction and interface split + +The common clk framework is an interface to control the clock nodes +available on various devices today. This may come in the form of clock +gating, rate adjustment, muxing or other operations. This framework is +enabled with the CONFIG_COMMON_CLK option. + +The interface itself is divided into two halves, each shielded from the +details of its counterpart. First is the common definition of struct +clk which unifies the framework-level accounting and infrastructure that +has traditionally been duplicated across a variety of platforms. Second +is a common implementation of the clk.h api, defined in +drivers/clk/clk.c. Finally there is struct clk_ops, whose operations +are invoked by the clk api implementation. + +The second half of the interface is comprised of the hardware-specific +callbacks registered with struct clk_ops and the corresponding +hardware-specific structures needed to model a particular clock. For +the remainder of this document any reference to a callback in struct +clk_ops, such as .enable or .set_rate, implies the hardware-specific +implementation of that code. Likewise, references to struct clk_foo +serve as a convenient shorthand for the implementation of the +hardware-specific bits for the hypothetical "foo" hardware. + +Tying the two halves of this interface together is struct clk_hw, which +is defined in struct clk_foo and pointed to within struct clk. This +allows easy for navigation between the two discrete halves of the common +clock interface. + + Part 2 - common data structures and api + +Below is the common struct clk definition from +include/linux/clk-private.h, modified for brevity: + + struct clk { + const char *name; + const struct clk_ops *ops; + struct clk_hw *hw; + char **parent_names; + struct clk **parents; + struct clk *parent; + struct hlist_head children; + struct hlist_node child_node; + ... + }; + +The members above make up the core of the clk tree topology. The clk +api itself defines several driver-facing functions which operate on +struct clk. That api is documented in include/linux/clk.h. + +Platforms and devices utilizing the common struct clk use the struct +clk_ops pointer in struct clk to perform the hardware-specific parts of +the operations defined in clk.h: + + struct clk_ops { + int (*prepare)(struct clk_hw *hw); + void (*unprepare)(struct clk_hw *hw); + int (*enable)(struct clk_hw *hw); + void (*disable)(struct clk_hw *hw); + int (*is_enabled)(struct clk_hw *hw); + unsigned long (*recalc_rate)(struct clk_hw *hw, + unsigned long parent_rate); + long (*round_rate)(struct clk_hw *hw, unsigned long, + unsigned long *); + int (*set_parent)(struct clk_hw *hw, u8 index); + u8 (*get_parent)(struct clk_hw *hw); + int (*set_rate)(struct clk_hw *hw, unsigned long); + void (*init)(struct clk_hw *hw); + }; + + Part 3 - hardware clk implementations + +The strength of the common struct clk comes from its .ops and .hw pointers +which abstract the details of struct clk from the hardware-specific bits, and +vice versa. To illustrate consider the simple gateable clk implementation in +drivers/clk/clk-gate.c: + +struct clk_gate { + struct clk_hw hw; + void __iomem *reg; + u8 bit_idx; + ... +}; + +struct clk_gate contains struct clk_hw hw as well as hardware-specific +knowledge about which register and bit controls this clk's gating. +Nothing about clock topology or accounting, such as enable_count or +notifier_count, is needed here. That is all handled by the common +framework code and struct clk. + +Let's walk through enabling this clk from driver code: + + struct clk *clk; + clk = clk_get(NULL, "my_gateable_clk"); + + clk_prepare(clk); + clk_enable(clk); + +The call graph for clk_enable is very simple: + +clk_enable(clk); + clk->ops->enable(clk->hw); + [resolves to...] + clk_gate_enable(hw); + [resolves struct clk gate with to_clk_gate(hw)] + clk_gate_set_bit(gate); + +And the definition of clk_gate_set_bit: + +static void clk_gate_set_bit(struct clk_gate *gate) +{ + u32 reg; + + reg = __raw_readl(gate->reg); + reg |= BIT(gate->bit_idx); + writel(reg, gate->reg); +} + +Note that to_clk_gate is defined as: + +#define to_clk_gate(_hw) container_of(_hw, struct clk_gate, clk) + +This pattern of abstraction is used for every clock hardware +representation. + + Part 4 - supporting your own clk hardware + +When implementing support for a new type of clock it only necessary to +include the following header: + +#include + +include/linux/clk.h is included within that header and clk-private.h +must never be included from the code which implements the operations for +a clock. More on that below in Part 5. + +To construct a clk hardware structure for your platform you must define +the following: + +struct clk_foo { + struct clk_hw hw; + ... hardware specific data goes here ... +}; + +To take advantage of your data you'll need to support valid operations +for your clk: + +struct clk_ops clk_foo_ops { + .enable = &clk_foo_enable; + .disable = &clk_foo_disable; +}; + +Implement the above functions using container_of: + +#define to_clk_foo(_hw) container_of(_hw, struct clk_foo, hw) + +int clk_foo_enable(struct clk_hw *hw) +{ + struct clk_foo *foo; + + foo = to_clk_foo(hw); + + ... perform magic on foo ... + + return 0; +}; + +Below is a matrix detailing which clk_ops are mandatory based upon the +hardware capbilities of that clock. A cell marked as "y" means +mandatory, a cell marked as "n" implies that either including that +callback is invalid or otherwise uneccesary. Empty cells are either +optional or must be evaluated on a case-by-case basis. + + clock hardware characteristics + ----------------------------------------------------------- + | gate | change rate | single parent | multiplexer | root | + |------|-------------|---------------|-------------|------| +.prepare | | | | | | +.unprepare | | | | | | + | | | | | | +.enable | y | | | | | +.disable | y | | | | | +.is_enabled | y | | | | | + | | | | | | +.recalc_rate | | y | | | | +.round_rate | | y | | | | +.set_rate | | y | | | | + | | | | | | +.set_parent | | | n | y | n | +.get_parent | | | n | y | n | + | | | | | | +.init | | | | | | + ----------------------------------------------------------- + +Finally, register your clock at run-time with a hardware-specific +registration function. This function simply populates struct clk_foo's +data and then passes the common struct clk parameters to the framework +with a call to: + +clk_register(...) + +See the basic clock types in drivers/clk/clk-*.c for examples. + + Part 5 - static initialization of clock data + +For platforms with many clocks (often numbering into the hundreds) it +may be desirable to statically initialize some clock data. This +presents a problem since the definition of struct clk should be hidden +from everyone except for the clock core in drivers/clk/clk.c. + +To get around this problem struct clk's definition is exposed in +include/linux/clk-private.h along with some macros for more easily +initializing instances of the basic clock types. These clocks must +still be initialized with the common clock framework via a call to +__clk_init. + +clk-private.h must NEVER be included by code which implements struct +clk_ops callbacks, nor must it be included by any logic which pokes +around inside of struct clk at run-time. To do so is a layering +violation. + +To better enforce this policy, always follow this simple rule: any +statically initialized clock data MUST be defined in a separate file +from the logic that implements its ops. Basically separate the logic +from the data and all is well. -- cgit v1.2.3 From 747a562f342895bbb6cfdfcb82104b4b2ae566e6 Mon Sep 17 00:00:00 2001 From: John Hughes Date: Wed, 16 Nov 2011 19:51:57 +0100 Subject: to fix scancodes returned by sony-laptop driver Fix scancodes returned by driver to match scancodes used to remap keys. (Before the patch FN/E returned scancode 0x1B, but to remap scancode 0x14 had to be used). The scancodes returned by the sony-laptop driver for function keys did not match the scancodes used to remap keys. Also, since the scancode was sent to the input subsystem after the mapped keysym the /lib/udev/keymap utility was confused about which scancode to report for which keysym. This patch fixes the driver so the correct scancode is shown for each key. It also adds to the documentation a description of where to find the scancodes. Signed-off-by: John Hughes Acked-by: Dmitry Torokhov Signed-off-by: Matthew Garrett --- Documentation/laptops/sony-laptop.txt | 5 +++++ drivers/platform/x86/sony-laptop.c | 13 ++++++++----- 2 files changed, 13 insertions(+), 5 deletions(-) (limited to 'Documentation') diff --git a/Documentation/laptops/sony-laptop.txt b/Documentation/laptops/sony-laptop.txt index 2bd4e82e5d9f..0d5ac7f5287e 100644 --- a/Documentation/laptops/sony-laptop.txt +++ b/Documentation/laptops/sony-laptop.txt @@ -17,6 +17,11 @@ subsystem. See the logs of acpid or /proc/acpi/event and devices are created by the driver. Additionally, loading the driver with the debug option will report all events in the kernel log. +The "scancodes" passed to the input system (that can be remapped with udev) +are indexes to the table "sony_laptop_input_keycode_map" in the sony-laptop.c +module. For example the "FN/E" key combination (EJECTCD on some models) +generates the scancode 20 (0x14). + Backlight control: ------------------ If your laptop model supports it, you will find sysfs files in the diff --git a/drivers/platform/x86/sony-laptop.c b/drivers/platform/x86/sony-laptop.c index 40c470507e50..8a51795aa02a 100644 --- a/drivers/platform/x86/sony-laptop.c +++ b/drivers/platform/x86/sony-laptop.c @@ -347,6 +347,7 @@ static void sony_laptop_report_input_event(u8 event) struct input_dev *jog_dev = sony_laptop_input.jog_dev; struct input_dev *key_dev = sony_laptop_input.key_dev; struct sony_laptop_keypress kp = { NULL }; + int scancode = -1; if (event == SONYPI_EVENT_FNKEY_RELEASED || event == SONYPI_EVENT_ANYBUTTON_RELEASED) { @@ -380,8 +381,8 @@ static void sony_laptop_report_input_event(u8 event) dprintk("sony_laptop_report_input_event, event not known: %d\n", event); break; } - if (sony_laptop_input_index[event] != -1) { - kp.key = sony_laptop_input_keycode_map[sony_laptop_input_index[event]]; + if ((scancode = sony_laptop_input_index[event]) != -1) { + kp.key = sony_laptop_input_keycode_map[scancode]; if (kp.key != KEY_UNKNOWN) kp.dev = key_dev; } @@ -389,9 +390,11 @@ static void sony_laptop_report_input_event(u8 event) } if (kp.dev) { + /* if we have a scancode we emit it so we can always + remap the key */ + if (scancode != -1) + input_event(kp.dev, EV_MSC, MSC_SCAN, scancode); input_report_key(kp.dev, kp.key, 1); - /* we emit the scancode so we can always remap the key */ - input_event(kp.dev, EV_MSC, MSC_SCAN, event); input_sync(kp.dev); /* schedule key release */ @@ -466,7 +469,7 @@ static int sony_laptop_setup_input(struct acpi_device *acpi_device) jog_dev->name = "Sony Vaio Jogdial"; jog_dev->id.bustype = BUS_ISA; jog_dev->id.vendor = PCI_VENDOR_ID_SONY; - key_dev->dev.parent = &acpi_device->dev; + jog_dev->dev.parent = &acpi_device->dev; input_set_capability(jog_dev, EV_KEY, BTN_MIDDLE); input_set_capability(jog_dev, EV_REL, REL_WHEEL); -- cgit v1.2.3 From cb5b5c912ee8d97ea60c2d6c43d17ab6585947b8 Mon Sep 17 00:00:00 2001 From: Corentin Chary Date: Sat, 26 Nov 2011 11:00:05 +0100 Subject: samsung-laptop: add battery life extender support Signed-off-by: Corentin Chary Acked-by: Greg Kroah-Hartman Signed-off-by: Matthew Garrett --- .../ABI/testing/sysfs-driver-samsung-laptop | 10 +++ drivers/platform/x86/samsung-laptop.c | 82 ++++++++++++++++++++++ 2 files changed, 92 insertions(+) (limited to 'Documentation') diff --git a/Documentation/ABI/testing/sysfs-driver-samsung-laptop b/Documentation/ABI/testing/sysfs-driver-samsung-laptop index 0a810231aad4..a6a56e11ebaf 100644 --- a/Documentation/ABI/testing/sysfs-driver-samsung-laptop +++ b/Documentation/ABI/testing/sysfs-driver-samsung-laptop @@ -17,3 +17,13 @@ Description: Some Samsung laptops have different "performance levels" Specifically, not all support the "overclock" option, and it's still unknown if this value even changes anything, other than making the user feel a bit better. + +What: /sys/devices/platform/samsung/battery_life_extender +Date: December 1, 2011 +KernelVersion: 3.3 +Contact: Corentin Chary +Description: Max battery charge level can be modified, battery cycle + life can be extended by reducing the max battery charge + level. + 0 means normal battery mode (100% charge) + 1 means battery life extender mode (80% charge) diff --git a/drivers/platform/x86/samsung-laptop.c b/drivers/platform/x86/samsung-laptop.c index b39fa53baa22..918fa358c11e 100644 --- a/drivers/platform/x86/samsung-laptop.c +++ b/drivers/platform/x86/samsung-laptop.c @@ -104,6 +104,10 @@ struct sabi_commands { u16 get_performance_level; u16 set_performance_level; + /* 0x80 is off, 0x81 is on */ + u16 get_battery_life_extender; + u16 set_battery_life_extender; + /* * Tell the BIOS that Linux is running on this machine. * 81 is on, 80 is off @@ -157,6 +161,9 @@ static const struct sabi_config sabi_configs[] = { .get_performance_level = 0x08, .set_performance_level = 0x09, + .get_battery_life_extender = 0xFFFF, + .set_battery_life_extender = 0xFFFF, + .set_linux = 0x0a, }, @@ -204,6 +211,9 @@ static const struct sabi_config sabi_configs[] = { .get_performance_level = 0x31, .set_performance_level = 0x32, + .get_battery_life_extender = 0x65, + .set_battery_life_extender = 0x66, + .set_linux = 0xff, }, @@ -543,8 +553,78 @@ static ssize_t set_performance_level(struct device *dev, static DEVICE_ATTR(performance_level, S_IWUSR | S_IRUGO, get_performance_level, set_performance_level); +static int read_battery_life_extender(struct samsung_laptop *samsung) +{ + const struct sabi_commands *commands = &samsung->config->commands; + struct sabi_data data; + int retval; + + if (commands->get_battery_life_extender == 0xFFFF) + return -ENODEV; + + memset(&data, 0, sizeof(data)); + data.data[0] = 0x80; + retval = sabi_command(samsung, commands->get_battery_life_extender, + &data, &data); + + if (retval) + return retval; + + if (data.data[0] != 0 && data.data[0] != 1) + return -ENODEV; + + return data.data[0]; +} + +static int write_battery_life_extender(struct samsung_laptop *samsung, + int enabled) +{ + const struct sabi_commands *commands = &samsung->config->commands; + struct sabi_data data; + + memset(&data, 0, sizeof(data)); + data.data[0] = 0x80 | enabled; + return sabi_command(samsung, commands->set_battery_life_extender, + &data, NULL); +} + +static ssize_t get_battery_life_extender(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct samsung_laptop *samsung = dev_get_drvdata(dev); + int ret; + + ret = read_battery_life_extender(samsung); + if (ret < 0) + return ret; + + return sprintf(buf, "%d\n", ret); +} + +static ssize_t set_battery_life_extender(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct samsung_laptop *samsung = dev_get_drvdata(dev); + int ret, value; + + if (!count || sscanf(buf, "%i", &value) != 1) + return -EINVAL; + + ret = write_battery_life_extender(samsung, !!value); + if (ret < 0) + return ret; + + return count; +} + +static DEVICE_ATTR(battery_life_extender, S_IWUSR | S_IRUGO, + get_battery_life_extender, set_battery_life_extender); + static struct attribute *platform_attributes[] = { &dev_attr_performance_level.attr, + &dev_attr_battery_life_extender.attr, NULL }; @@ -643,6 +723,8 @@ static mode_t samsung_sysfs_is_visible(struct kobject *kobj, if (attr == &dev_attr_performance_level.attr) ok = !!samsung->config->performance_levels[0].name; + if (attr == &dev_attr_battery_life_extender.attr) + ok = !!(read_battery_life_extender(samsung) >= 0); return ok ? attr->mode : 0; } -- cgit v1.2.3 From 3a75d378703495a90d6b2b49a3f5332c953879e2 Mon Sep 17 00:00:00 2001 From: Corentin Chary Date: Sat, 26 Nov 2011 11:00:06 +0100 Subject: samsung-laptop: add usb charge support Signed-off-by: Corentin Chary Acked-by: Greg Kroah-Hartman Signed-off-by: Matthew Garrett --- .../ABI/testing/sysfs-driver-samsung-laptop | 8 +++ drivers/platform/x86/samsung-laptop.c | 82 ++++++++++++++++++++++ 2 files changed, 90 insertions(+) (limited to 'Documentation') diff --git a/Documentation/ABI/testing/sysfs-driver-samsung-laptop b/Documentation/ABI/testing/sysfs-driver-samsung-laptop index a6a56e11ebaf..f05b897805e5 100644 --- a/Documentation/ABI/testing/sysfs-driver-samsung-laptop +++ b/Documentation/ABI/testing/sysfs-driver-samsung-laptop @@ -27,3 +27,11 @@ Description: Max battery charge level can be modified, battery cycle level. 0 means normal battery mode (100% charge) 1 means battery life extender mode (80% charge) + +What: /sys/devices/platform/samsung/usb_charge +Date: December 1, 2011 +KernelVersion: 3.3 +Contact: Corentin Chary +Description: Use your USB ports to charge devices, even + when your laptop is powered off. + 1 means enabled, 0 means disabled. diff --git a/drivers/platform/x86/samsung-laptop.c b/drivers/platform/x86/samsung-laptop.c index 918fa358c11e..b8d2145981e0 100644 --- a/drivers/platform/x86/samsung-laptop.c +++ b/drivers/platform/x86/samsung-laptop.c @@ -108,6 +108,10 @@ struct sabi_commands { u16 get_battery_life_extender; u16 set_battery_life_extender; + /* 0x80 is off, 0x81 is on */ + u16 get_usb_charge; + u16 set_usb_charge; + /* * Tell the BIOS that Linux is running on this machine. * 81 is on, 80 is off @@ -164,6 +168,9 @@ static const struct sabi_config sabi_configs[] = { .get_battery_life_extender = 0xFFFF, .set_battery_life_extender = 0xFFFF, + .get_usb_charge = 0xFFFF, + .set_usb_charge = 0xFFFF, + .set_linux = 0x0a, }, @@ -214,6 +221,9 @@ static const struct sabi_config sabi_configs[] = { .get_battery_life_extender = 0x65, .set_battery_life_extender = 0x66, + .get_usb_charge = 0x67, + .set_usb_charge = 0x68, + .set_linux = 0xff, }, @@ -622,9 +632,79 @@ static ssize_t set_battery_life_extender(struct device *dev, static DEVICE_ATTR(battery_life_extender, S_IWUSR | S_IRUGO, get_battery_life_extender, set_battery_life_extender); +static int read_usb_charge(struct samsung_laptop *samsung) +{ + const struct sabi_commands *commands = &samsung->config->commands; + struct sabi_data data; + int retval; + + if (commands->get_usb_charge == 0xFFFF) + return -ENODEV; + + memset(&data, 0, sizeof(data)); + data.data[0] = 0x80; + retval = sabi_command(samsung, commands->get_usb_charge, + &data, &data); + + if (retval) + return retval; + + if (data.data[0] != 0 && data.data[0] != 1) + return -ENODEV; + + return data.data[0]; +} + +static int write_usb_charge(struct samsung_laptop *samsung, + int enabled) +{ + const struct sabi_commands *commands = &samsung->config->commands; + struct sabi_data data; + + memset(&data, 0, sizeof(data)); + data.data[0] = 0x80 | enabled; + return sabi_command(samsung, commands->set_usb_charge, + &data, NULL); +} + +static ssize_t get_usb_charge(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct samsung_laptop *samsung = dev_get_drvdata(dev); + int ret; + + ret = read_usb_charge(samsung); + if (ret < 0) + return ret; + + return sprintf(buf, "%d\n", ret); +} + +static ssize_t set_usb_charge(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct samsung_laptop *samsung = dev_get_drvdata(dev); + int ret, value; + + if (!count || sscanf(buf, "%i", &value) != 1) + return -EINVAL; + + ret = write_usb_charge(samsung, !!value); + if (ret < 0) + return ret; + + return count; +} + +static DEVICE_ATTR(usb_charge, S_IWUSR | S_IRUGO, + get_usb_charge, set_usb_charge); + static struct attribute *platform_attributes[] = { &dev_attr_performance_level.attr, &dev_attr_battery_life_extender.attr, + &dev_attr_usb_charge.attr, NULL }; @@ -725,6 +805,8 @@ static mode_t samsung_sysfs_is_visible(struct kobject *kobj, ok = !!samsung->config->performance_levels[0].name; if (attr == &dev_attr_battery_life_extender.attr) ok = !!(read_battery_life_extender(samsung) >= 0); + if (attr == &dev_attr_usb_charge.attr) + ok = !!(read_usb_charge(samsung) >= 0); return ok ? attr->mode : 0; } -- cgit v1.2.3 From 7ec48ceda25c6c16ab3f69b6c318d3d196f7abd0 Mon Sep 17 00:00:00 2001 From: Corentin Chary Date: Thu, 15 Dec 2011 08:27:37 +0100 Subject: platform/x86: drop deprecated asus_acpi driver asus_acpi only support old models, it has been deprecated since 2009 in favor of asus-laptop, it's not built by any (sane) distro, so it is time to say good bye. Thanks to Julien Lerouge and Karol Kozimor for the work they have done on it, I would never have wrote asus-laptop and other asus related drivers without asus_acpi. Signed-off-by: Corentin Chary Signed-off-by: Matthew Garrett --- Documentation/laptops/asus-laptop.txt | 2 +- drivers/acpi/video_detect.c | 2 +- drivers/platform/x86/Kconfig | 43 +- drivers/platform/x86/Makefile | 1 - drivers/platform/x86/asus_acpi.c | 1513 --------------------------------- 5 files changed, 6 insertions(+), 1555 deletions(-) delete mode 100644 drivers/platform/x86/asus_acpi.c (limited to 'Documentation') diff --git a/Documentation/laptops/asus-laptop.txt b/Documentation/laptops/asus-laptop.txt index 803e51f6768b..a1e04d679289 100644 --- a/Documentation/laptops/asus-laptop.txt +++ b/Documentation/laptops/asus-laptop.txt @@ -45,7 +45,7 @@ Status Usage ----- - Try "modprobe asus_acpi". Check your dmesg (simply type dmesg). You should + Try "modprobe asus-laptop". Check your dmesg (simply type dmesg). You should see some lines like this : Asus Laptop Extras version 0.42 diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c index f3f0fe7e255a..45d8097ef4cf 100644 --- a/drivers/acpi/video_detect.c +++ b/drivers/acpi/video_detect.c @@ -23,7 +23,7 @@ * Depending on whether ACPI graphics extensions (cmp. ACPI spec Appendix B) * are available, video.ko should be used to handle the device. * - * Otherwise vendor specific drivers like thinkpad_acpi, asus_acpi, + * Otherwise vendor specific drivers like thinkpad_acpi, asus-laptop, * sony_acpi,... can take care about backlight brightness. * * If CONFIG_ACPI_VIDEO is neither set as "compiled in" (y) nor as a module (m) diff --git a/drivers/platform/x86/Kconfig b/drivers/platform/x86/Kconfig index 912ffef0f148..0b5519cda194 100644 --- a/drivers/platform/x86/Kconfig +++ b/drivers/platform/x86/Kconfig @@ -54,7 +54,6 @@ config ACERHDF config ASUS_LAPTOP tristate "Asus Laptop Extras" depends on ACPI - depends on !ACPI_ASUS select LEDS_CLASS select NEW_LEDS select BACKLIGHT_CLASS_DEVICE @@ -460,10 +459,9 @@ config INTEL_MENLOW If unsure, say N. config EEEPC_LAPTOP - tristate "Eee PC Hotkey Driver (EXPERIMENTAL)" + tristate "Eee PC Hotkey Driver" depends on ACPI depends on INPUT - depends on EXPERIMENTAL depends on RFKILL || RFKILL = n depends on HOTPLUG_PCI select BACKLIGHT_CLASS_DEVICE @@ -482,11 +480,10 @@ config EEEPC_LAPTOP doesn't work on your Eee PC, try eeepc-wmi instead. config ASUS_WMI - tristate "ASUS WMI Driver (EXPERIMENTAL)" + tristate "ASUS WMI Driver" depends on ACPI_WMI depends on INPUT depends on HWMON - depends on EXPERIMENTAL depends on BACKLIGHT_CLASS_DEVICE depends on RFKILL || RFKILL = n depends on HOTPLUG_PCI @@ -501,7 +498,7 @@ config ASUS_WMI be called asus-wmi. config ASUS_NB_WMI - tristate "Asus Notebook WMI Driver (EXPERIMENTAL)" + tristate "Asus Notebook WMI Driver" depends on ASUS_WMI ---help--- This is a driver for newer Asus notebooks. It adds extra features @@ -514,7 +511,7 @@ config ASUS_NB_WMI here. config EEEPC_WMI - tristate "Eee PC WMI Driver (EXPERIMENTAL)" + tristate "Eee PC WMI Driver" depends on ASUS_WMI ---help--- This is a driver for newer Eee PC laptops. It adds extra features @@ -559,38 +556,6 @@ config MSI_WMI To compile this driver as a module, choose M here: the module will be called msi-wmi. -config ACPI_ASUS - tristate "ASUS/Medion Laptop Extras (DEPRECATED)" - depends on ACPI - select BACKLIGHT_CLASS_DEVICE - ---help--- - This driver provides support for extra features of ACPI-compatible - ASUS laptops. As some of Medion laptops are made by ASUS, it may also - support some Medion laptops (such as 9675 for example). It makes all - the extra buttons generate standard ACPI events that go through - /proc/acpi/events, and (on some models) adds support for changing the - display brightness and output, switching the LCD backlight on and off, - and most importantly, allows you to blink those fancy LEDs intended - for reporting mail and wireless status. - - Note: display switching code is currently considered EXPERIMENTAL, - toying with these values may even lock your machine. - - All settings are changed via /proc/acpi/asus directory entries. Owner - and group for these entries can be set with asus_uid and asus_gid - parameters. - - More information and a userspace daemon for handling the extra buttons - at . - - If you have an ACPI-compatible ASUS laptop, say Y or M here. This - driver is still under development, so if your laptop is unsupported or - something works not quite as expected, please use the mailing list - available on the above page (acpi4asus-user@lists.sourceforge.net). - - NOTE: This driver is deprecated and will probably be removed soon, - use asus-laptop instead. - config TOPSTAR_LAPTOP tristate "Topstar Laptop Extras" depends on ACPI diff --git a/drivers/platform/x86/Makefile b/drivers/platform/x86/Makefile index d328f21e9fdd..3c07f8bfd42c 100644 --- a/drivers/platform/x86/Makefile +++ b/drivers/platform/x86/Makefile @@ -29,7 +29,6 @@ obj-$(CONFIG_PANASONIC_LAPTOP) += panasonic-laptop.o obj-$(CONFIG_INTEL_MENLOW) += intel_menlow.o obj-$(CONFIG_ACPI_WMI) += wmi.o obj-$(CONFIG_MSI_WMI) += msi-wmi.o -obj-$(CONFIG_ACPI_ASUS) += asus_acpi.o obj-$(CONFIG_TOPSTAR_LAPTOP) += topstar-laptop.o obj-$(CONFIG_ACPI_TOSHIBA) += toshiba_acpi.o obj-$(CONFIG_TOSHIBA_BT_RFKILL) += toshiba_bluetooth.o diff --git a/drivers/platform/x86/asus_acpi.c b/drivers/platform/x86/asus_acpi.c deleted file mode 100644 index 6f966d6c062b..000000000000 --- a/drivers/platform/x86/asus_acpi.c +++ /dev/null @@ -1,1513 +0,0 @@ -/* - * asus_acpi.c - Asus Laptop ACPI Extras - * - * - * Copyright (C) 2002-2005 Julien Lerouge, 2003-2006 Karol Kozimor - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License as published by - * the Free Software Foundation; either version 2 of the License, or - * (at your option) any later version. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program; if not, write to the Free Software - * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA - * - * - * The development page for this driver is located at - * http://sourceforge.net/projects/acpi4asus/ - * - * Credits: - * Pontus Fuchs - Helper functions, cleanup - * Johann Wiesner - Small compile fixes - * John Belmonte - ACPI code for Toshiba laptop was a good starting point. - * �ic Burghard - LED display support for W1N - * - */ - -#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#define ASUS_ACPI_VERSION "0.30" - -#define PROC_ASUS "asus" /* The directory */ -#define PROC_MLED "mled" -#define PROC_WLED "wled" -#define PROC_TLED "tled" -#define PROC_BT "bluetooth" -#define PROC_LEDD "ledd" -#define PROC_INFO "info" -#define PROC_LCD "lcd" -#define PROC_BRN "brn" -#define PROC_DISP "disp" - -#define ACPI_HOTK_NAME "Asus Laptop ACPI Extras Driver" -#define ACPI_HOTK_CLASS "hotkey" -#define ACPI_HOTK_DEVICE_NAME "Hotkey" - -/* - * Some events we use, same for all Asus - */ -#define BR_UP 0x10 -#define BR_DOWN 0x20 - -/* - * Flags for hotk status - */ -#define MLED_ON 0x01 /* Mail LED */ -#define WLED_ON 0x02 /* Wireless LED */ -#define TLED_ON 0x04 /* Touchpad LED */ -#define BT_ON 0x08 /* Internal Bluetooth */ - -MODULE_AUTHOR("Julien Lerouge, Karol Kozimor"); -MODULE_DESCRIPTION(ACPI_HOTK_NAME); -MODULE_LICENSE("GPL"); - -static uid_t asus_uid; -static gid_t asus_gid; -module_param(asus_uid, uint, 0); -MODULE_PARM_DESC(asus_uid, "UID for entries in /proc/acpi/asus"); -module_param(asus_gid, uint, 0); -MODULE_PARM_DESC(asus_gid, "GID for entries in /proc/acpi/asus"); - -/* For each model, all features implemented, - * those marked with R are relative to HOTK, A for absolute */ -struct model_data { - char *name; /* name of the laptop________________A */ - char *mt_mled; /* method to handle mled_____________R */ - char *mled_status; /* node to handle mled reading_______A */ - char *mt_wled; /* method to handle wled_____________R */ - char *wled_status; /* node to handle wled reading_______A */ - char *mt_tled; /* method to handle tled_____________R */ - char *tled_status; /* node to handle tled reading_______A */ - char *mt_ledd; /* method to handle LED display______R */ - char *mt_bt_switch; /* method to switch Bluetooth on/off_R */ - char *bt_status; /* no model currently supports this__? */ - char *mt_lcd_switch; /* method to turn LCD on/off_________A */ - char *lcd_status; /* node to read LCD panel state______A */ - char *brightness_up; /* method to set brightness up_______A */ - char *brightness_down; /* method to set brightness down ____A */ - char *brightness_set; /* method to set absolute brightness_R */ - char *brightness_get; /* method to get absolute brightness_R */ - char *brightness_status;/* node to get brightness____________A */ - char *display_set; /* method to set video output________R */ - char *display_get; /* method to get video output________R */ -}; - -/* - * This is the main structure, we can use it to store anything interesting - * about the hotk device - */ -struct asus_hotk { - struct acpi_device *device; /* the device we are in */ - acpi_handle handle; /* the handle of the hotk device */ - char status; /* status of the hotk, for LEDs */ - u32 ledd_status; /* status of the LED display */ - struct model_data *methods; /* methods available on the laptop */ - u8 brightness; /* brightness level */ - enum { - A1x = 0, /* A1340D, A1300F */ - A2x, /* A2500H */ - A4G, /* A4700G */ - D1x, /* D1 */ - L2D, /* L2000D */ - L3C, /* L3800C */ - L3D, /* L3400D */ - L3H, /* L3H, L2000E, L5D */ - L4R, /* L4500R */ - L5x, /* L5800C */ - L8L, /* L8400L */ - M1A, /* M1300A */ - M2E, /* M2400E, L4400L */ - M6N, /* M6800N, W3400N */ - M6R, /* M6700R, A3000G */ - P30, /* Samsung P30 */ - S1x, /* S1300A, but also L1400B and M2400A (L84F) */ - S2x, /* S200 (J1 reported), Victor MP-XP7210 */ - W1N, /* W1000N */ - W5A, /* W5A */ - W3V, /* W3030V */ - xxN, /* M2400N, M3700N, M5200N, M6800N, - S1300N, S5200N*/ - A4S, /* Z81sp */ - F3Sa, /* (Centrino) */ - R1F, - END_MODEL - } model; /* Models currently supported */ - u16 event_count[128]; /* Count for each event TODO make this better */ -}; - -/* Here we go */ -#define A1x_PREFIX "\\_SB.PCI0.ISA.EC0." -#define L3C_PREFIX "\\_SB.PCI0.PX40.ECD0." -#define M1A_PREFIX "\\_SB.PCI0.PX40.EC0." -#define P30_PREFIX "\\_SB.PCI0.LPCB.EC0." -#define S1x_PREFIX "\\_SB.PCI0.PX40." -#define S2x_PREFIX A1x_PREFIX -#define xxN_PREFIX "\\_SB.PCI0.SBRG.EC0." - -static struct model_data model_conf[END_MODEL] = { - /* - * TODO I have seen a SWBX and AIBX method on some models, like L1400B, - * it seems to be a kind of switch, but what for ? - */ - - { - .name = "A1x", - .mt_mled = "MLED", - .mled_status = "\\MAIL", - .mt_lcd_switch = A1x_PREFIX "_Q10", - .lcd_status = "\\BKLI", - .brightness_up = A1x_PREFIX "_Q0E", - .brightness_down = A1x_PREFIX "_Q0F"}, - - { - .name = "A2x", - .mt_mled = "MLED", - .mt_wled = "WLED", - .wled_status = "\\SG66", - .mt_lcd_switch = "\\Q10", - .lcd_status = "\\BAOF", - .brightness_set = "SPLV", - .brightness_get = "GPLV", - .display_set = "SDSP", - .display_get = "\\INFB"}, - - { - .name = "A4G", - .mt_mled = "MLED", -/* WLED present, but not controlled by ACPI */ - .mt_lcd_switch = xxN_PREFIX "_Q10", - .brightness_set = "SPLV", - .brightness_get = "GPLV", - .display_set = "SDSP", - .display_get = "\\ADVG"}, - - { - .name = "D1x", - .mt_mled = "MLED", - .mt_lcd_switch = "\\Q0D", - .lcd_status = "\\GP11", - .brightness_up = "\\Q0C", - .brightness_down = "\\Q0B", - .brightness_status = "\\BLVL", - .display_set = "SDSP", - .display_get = "\\INFB"}, - - { - .name = "L2D", - .mt_mled = "MLED", - .mled_status = "\\SGP6", - .mt_wled = "WLED", - .wled_status = "\\RCP3", - .mt_lcd_switch = "\\Q10", - .lcd_status = "\\SGP0", - .brightness_up = "\\Q0E", - .brightness_down = "\\Q0F", - .display_set = "SDSP", - .display_get = "\\INFB"}, - - { - .name = "L3C", - .mt_mled = "MLED", - .mt_wled = "WLED", - .mt_lcd_switch = L3C_PREFIX "_Q10", - .lcd_status = "\\GL32", - .brightness_set = "SPLV", - .brightness_get = "GPLV", - .display_set = "SDSP", - .display_get = "\\_SB.PCI0.PCI1.VGAC.NMAP"}, - - { - .name = "L3D", - .mt_mled = "MLED", - .mled_status = "\\MALD", - .mt_wled = "WLED", - .mt_lcd_switch = "\\Q10", - .lcd_status = "\\BKLG", - .brightness_set = "SPLV", - .brightness_get = "GPLV", - .display_set = "SDSP", - .display_get = "\\INFB"}, - - { - .name = "L3H", - .mt_mled = "MLED", - .mt_wled = "WLED", - .mt_lcd_switch = "EHK", - .lcd_status = "\\_SB.PCI0.PM.PBC", - .brightness_set = "SPLV", - .brightness_get = "GPLV", - .display_set = "SDSP", - .display_get = "\\INFB"}, - - { - .name = "L4R", - .mt_mled = "MLED", - .mt_wled = "WLED", - .wled_status = "\\_SB.PCI0.SBRG.SG13", - .mt_lcd_switch = xxN_PREFIX "_Q10", - .lcd_status = "\\_SB.PCI0.SBSM.SEO4", - .brightness_set = "SPLV", - .brightness_get = "GPLV", - .display_set = "SDSP", - .display_get = "\\_SB.PCI0.P0P1.VGA.GETD"}, - - { - .name = "L5x", - .mt_mled = "MLED", -/* WLED present, but not controlled by ACPI */ - .mt_tled = "TLED", - .mt_lcd_switch = "\\Q0D", - .lcd_status = "\\BAOF", - .brightness_set = "SPLV", - .brightness_get = "GPLV", - .display_set = "SDSP", - .display_get = "\\INFB"}, - - { - .name = "L8L" -/* No features, but at least support the hotkeys */ - }, - - { - .name = "M1A", - .mt_mled = "MLED", - .mt_lcd_switch = M1A_PREFIX "Q10", - .lcd_status = "\\PNOF", - .brightness_up = M1A_PREFIX "Q0E", - .brightness_down = M1A_PREFIX "Q0F", - .brightness_status = "\\BRIT", - .display_set = "SDSP", - .display_get = "\\INFB"}, - - { - .name = "M2E", - .mt_mled = "MLED", - .mt_wled = "WLED", - .mt_lcd_switch = "\\Q10", - .lcd_status = "\\GP06", - .brightness_set = "SPLV", - .brightness_get = "GPLV", - .display_set = "SDSP", - .display_get = "\\INFB"}, - - { - .name = "M6N", - .mt_mled = "MLED", - .mt_wled = "WLED", - .wled_status = "\\_SB.PCI0.SBRG.SG13", - .mt_lcd_switch = xxN_PREFIX "_Q10", - .lcd_status = "\\_SB.BKLT", - .brightness_set = "SPLV", - .brightness_get = "GPLV", - .display_set = "SDSP", - .display_get = "\\SSTE"}, - - { - .name = "M6R", - .mt_mled = "MLED", - .mt_wled = "WLED", - .mt_lcd_switch = xxN_PREFIX "_Q10", - .lcd_status = "\\_SB.PCI0.SBSM.SEO4", - .brightness_set = "SPLV", - .brightness_get = "GPLV", - .display_set = "SDSP", - .display_get = "\\_SB.PCI0.P0P1.VGA.GETD"}, - - { - .name = "P30", - .mt_wled = "WLED", - .mt_lcd_switch = P30_PREFIX "_Q0E", - .lcd_status = "\\BKLT", - .brightness_up = P30_PREFIX "_Q68", - .brightness_down = P30_PREFIX "_Q69", - .brightness_get = "GPLV", - .display_set = "SDSP", - .display_get = "\\DNXT"}, - - { - .name = "S1x", - .mt_mled = "MLED", - .mled_status = "\\EMLE", - .mt_wled = "WLED", - .mt_lcd_switch = S1x_PREFIX "Q10", - .lcd_status = "\\PNOF", - .brightness_set = "SPLV", - .brightness_get = "GPLV"}, - - { - .name = "S2x", - .mt_mled = "MLED", - .mled_status = "\\MAIL", - .mt_lcd_switch = S2x_PREFIX "_Q10", - .lcd_status = "\\BKLI", - .brightness_up = S2x_PREFIX "_Q0B", - .brightness_down = S2x_PREFIX "_Q0A"}, - - { - .name = "W1N", - .mt_mled = "MLED", - .mt_wled = "WLED", - .mt_ledd = "SLCM", - .mt_lcd_switch = xxN_PREFIX "_Q10", - .lcd_status = "\\BKLT", - .brightness_set = "SPLV", - .brightness_get = "GPLV", - .display_set = "SDSP", - .display_get = "\\ADVG"}, - - { - .name = "W5A", - .mt_bt_switch = "BLED", - .mt_wled = "WLED", - .mt_lcd_switch = xxN_PREFIX "_Q10", - .brightness_set = "SPLV", - .brightness_get = "GPLV", - .display_set = "SDSP", - .display_get = "\\ADVG"}, - - { - .name = "W3V", - .mt_mled = "MLED", - .mt_wled = "WLED", - .mt_lcd_switch = xxN_PREFIX "_Q10", - .lcd_status = "\\BKLT", - .brightness_set = "SPLV", - .brightness_get = "GPLV", - .display_set = "SDSP", - .display_get = "\\INFB"}, - - { - .name = "xxN", - .mt_mled = "MLED", -/* WLED present, but not controlled by ACPI */ - .mt_lcd_switch = xxN_PREFIX "_Q10", - .lcd_status = "\\BKLT", - .brightness_set = "SPLV", - .brightness_get = "GPLV", - .display_set = "SDSP", - .display_get = "\\ADVG"}, - - { - .name = "A4S", - .brightness_set = "SPLV", - .brightness_get = "GPLV", - .mt_bt_switch = "BLED", - .mt_wled = "WLED" - }, - - { - .name = "F3Sa", - .mt_bt_switch = "BLED", - .mt_wled = "WLED", - .mt_mled = "MLED", - .brightness_get = "GPLV", - .brightness_set = "SPLV", - .mt_lcd_switch = "\\_SB.PCI0.SBRG.EC0._Q10", - .lcd_status = "\\_SB.PCI0.SBRG.EC0.RPIN", - .display_get = "\\ADVG", - .display_set = "SDSP", - }, - { - .name = "R1F", - .mt_bt_switch = "BLED", - .mt_mled = "MLED", - .mt_wled = "WLED", - .mt_lcd_switch = "\\Q10", - .lcd_status = "\\GP06", - .brightness_set = "SPLV", - .brightness_get = "GPLV", - .display_set = "SDSP", - .display_get = "\\INFB" - } -}; - -/* procdir we use */ -static struct proc_dir_entry *asus_proc_dir; - -static struct backlight_device *asus_backlight_device; - -/* - * This header is made available to allow proper configuration given model, - * revision number , ... this info cannot go in struct asus_hotk because it is - * available before the hotk - */ -static struct acpi_table_header *asus_info; - -/* The actual device the driver binds to */ -static struct asus_hotk *hotk; - -/* - * The hotkey driver and autoloading declaration - */ -static int asus_hotk_add(struct acpi_device *device); -static int asus_hotk_remove(struct acpi_device *device, int type); -static void asus_hotk_notify(struct acpi_device *device, u32 event); - -static const struct acpi_device_id asus_device_ids[] = { - {"ATK0100", 0}, - {"", 0}, -}; -MODULE_DEVICE_TABLE(acpi, asus_device_ids); - -static struct acpi_driver asus_hotk_driver = { - .name = "asus_acpi", - .class = ACPI_HOTK_CLASS, - .owner = THIS_MODULE, - .ids = asus_device_ids, - .flags = ACPI_DRIVER_ALL_NOTIFY_EVENTS, - .ops = { - .add = asus_hotk_add, - .remove = asus_hotk_remove, - .notify = asus_hotk_notify, - }, -}; - -/* - * This function evaluates an ACPI method, given an int as parameter, the - * method is searched within the scope of the handle, can be NULL. The output - * of the method is written is output, which can also be NULL - * - * returns 1 if write is successful, 0 else. - */ -static int write_acpi_int(acpi_handle handle, const char *method, int val, - struct acpi_buffer *output) -{ - struct acpi_object_list params; /* list of input parameters (int) */ - union acpi_object in_obj; /* the only param we use */ - acpi_status status; - - params.count = 1; - params.pointer = &in_obj; - in_obj.type = ACPI_TYPE_INTEGER; - in_obj.integer.value = val; - - status = acpi_evaluate_object(handle, (char *)method, ¶ms, output); - return (status == AE_OK); -} - -static int read_acpi_int(acpi_handle handle, const char *method, int *val) -{ - struct acpi_buffer output; - union acpi_object out_obj; - acpi_status status; - - output.length = sizeof(out_obj); - output.pointer = &out_obj; - - status = acpi_evaluate_object(handle, (char *)method, NULL, &output); - *val = out_obj.integer.value; - return (status == AE_OK) && (out_obj.type == ACPI_TYPE_INTEGER); -} - -static int asus_info_proc_show(struct seq_file *m, void *v) -{ - int temp; - - seq_printf(m, ACPI_HOTK_NAME " " ASUS_ACPI_VERSION "\n"); - seq_printf(m, "Model reference : %s\n", hotk->methods->name); - /* - * The SFUN method probably allows the original driver to get the list - * of features supported by a given model. For now, 0x0100 or 0x0800 - * bit signifies that the laptop is equipped with a Wi-Fi MiniPCI card. - * The significance of others is yet to be found. - */ - if (read_acpi_int(hotk->handle, "SFUN", &temp)) - seq_printf(m, "SFUN value : 0x%04x\n", temp); - /* - * Another value for userspace: the ASYM method returns 0x02 for - * battery low and 0x04 for battery critical, its readings tend to be - * more accurate than those provided by _BST. - * Note: since not all the laptops provide this method, errors are - * silently ignored. - */ - if (read_acpi_int(hotk->handle, "ASYM", &temp)) - seq_printf(m, "ASYM value : 0x%04x\n", temp); - if (asus_info) { - seq_printf(m, "DSDT length : %d\n", asus_info->length); - seq_printf(m, "DSDT checksum : %d\n", asus_info->checksum); - seq_printf(m, "DSDT revision : %d\n", asus_info->revision); - seq_printf(m, "OEM id : %.*s\n", ACPI_OEM_ID_SIZE, asus_info->oem_id); - seq_printf(m, "OEM table id : %.*s\n", ACPI_OEM_TABLE_ID_SIZE, asus_info->oem_table_id); - seq_printf(m, "OEM revision : 0x%x\n", asus_info->oem_revision); - seq_printf(m, "ASL comp vendor id : %.*s\n", ACPI_NAME_SIZE, asus_info->asl_compiler_id); - seq_printf(m, "ASL comp revision : 0x%x\n", asus_info->asl_compiler_revision); - } - - return 0; -} - -static int asus_info_proc_open(struct inode *inode, struct file *file) -{ - return single_open(file, asus_info_proc_show, NULL); -} - -static const struct file_operations asus_info_proc_fops = { - .owner = THIS_MODULE, - .open = asus_info_proc_open, - .read = seq_read, - .llseek = seq_lseek, - .release = single_release, -}; - -/* - * /proc handlers - * We write our info in page, we begin at offset off and cannot write more - * than count bytes. We set eof to 1 if we handle those 2 values. We return the - * number of bytes written in page - */ - -/* Generic LED functions */ -static int read_led(const char *ledname, int ledmask) -{ - if (ledname) { - int led_status; - - if (read_acpi_int(NULL, ledname, &led_status)) - return led_status; - else - pr_warn("Error reading LED status\n"); - } - return (hotk->status & ledmask) ? 1 : 0; -} - -static int parse_arg(const char __user *buf, unsigned long count, int *val) -{ - char s[32]; - if (!count) - return 0; - if (count > 31) - return -EINVAL; - if (copy_from_user(s, buf, count)) - return -EFAULT; - s[count] = 0; - if (sscanf(s, "%i", val) != 1) - return -EINVAL; - return count; -} - -/* FIXME: kill extraneous args so it can be called independently */ -static int -write_led(const char __user *buffer, unsigned long count, - char *ledname, int ledmask, int invert) -{ - int rv, value; - int led_out = 0; - - rv = parse_arg(buffer, count, &value); - if (rv > 0) - led_out = value ? 1 : 0; - - hotk->status = - (led_out) ? (hotk->status | ledmask) : (hotk->status & ~ledmask); - - if (invert) /* invert target value */ - led_out = !led_out; - - if (!write_acpi_int(hotk->handle, ledname, led_out, NULL)) - pr_warn("LED (%s) write failed\n", ledname); - - return rv; -} - -/* - * Proc handlers for MLED - */ -static int mled_proc_show(struct seq_file *m, void *v) -{ - seq_printf(m, "%d\n", read_led(hotk->methods->mled_status, MLED_ON)); - return 0; -} - -static int mled_proc_open(struct inode *inode, struct file *file) -{ - return single_open(file, mled_proc_show, NULL); -} - -static ssize_t mled_proc_write(struct file *file, const char __user *buffer, - size_t count, loff_t *pos) -{ - return write_led(buffer, count, hotk->methods->mt_mled, MLED_ON, 1); -} - -static const struct file_operations mled_proc_fops = { - .owner = THIS_MODULE, - .open = mled_proc_open, - .read = seq_read, - .llseek = seq_lseek, - .release = single_release, - .write = mled_proc_write, -}; - -/* - * Proc handlers for LED display - */ -static int ledd_proc_show(struct seq_file *m, void *v) -{ - seq_printf(m, "0x%08x\n", hotk->ledd_status); - return 0; -} - -static int ledd_proc_open(struct inode *inode, struct file *file) -{ - return single_open(file, ledd_proc_show, NULL); -} - -static ssize_t ledd_proc_write(struct file *file, const char __user *buffer, - size_t count, loff_t *pos) -{ - int rv, value; - - rv = parse_arg(buffer, count, &value); - if (rv > 0) { - if (!write_acpi_int - (hotk->handle, hotk->methods->mt_ledd, value, NULL)) - pr_warn("LED display write failed\n"); - else - hotk->ledd_status = (u32) value; - } - return rv; -} - -static const struct file_operations ledd_proc_fops = { - .owner = THIS_MODULE, - .open = ledd_proc_open, - .read = seq_read, - .llseek = seq_lseek, - .release = single_release, - .write = ledd_proc_write, -}; - -/* - * Proc handlers for WLED - */ -static int wled_proc_show(struct seq_file *m, void *v) -{ - seq_printf(m, "%d\n", read_led(hotk->methods->wled_status, WLED_ON)); - return 0; -} - -static int wled_proc_open(struct inode *inode, struct file *file) -{ - return single_open(file, wled_proc_show, NULL); -} - -static ssize_t wled_proc_write(struct file *file, const char __user *buffer, - size_t count, loff_t *pos) -{ - return write_led(buffer, count, hotk->methods->mt_wled, WLED_ON, 0); -} - -static const struct file_operations wled_proc_fops = { - .owner = THIS_MODULE, - .open = wled_proc_open, - .read = seq_read, - .llseek = seq_lseek, - .release = single_release, - .write = wled_proc_write, -}; - -/* - * Proc handlers for Bluetooth - */ -static int bluetooth_proc_show(struct seq_file *m, void *v) -{ - seq_printf(m, "%d\n", read_led(hotk->methods->bt_status, BT_ON)); - return 0; -} - -static int bluetooth_proc_open(struct inode *inode, struct file *file) -{ - return single_open(file, bluetooth_proc_show, NULL); -} - -static ssize_t bluetooth_proc_write(struct file *file, - const char __user *buffer, size_t count, loff_t *pos) -{ - /* Note: mt_bt_switch controls both internal Bluetooth adapter's - presence and its LED */ - return write_led(buffer, count, hotk->methods->mt_bt_switch, BT_ON, 0); -} - -static const struct file_operations bluetooth_proc_fops = { - .owner = THIS_MODULE, - .open = bluetooth_proc_open, - .read = seq_read, - .llseek = seq_lseek, - .release = single_release, - .write = bluetooth_proc_write, -}; - -/* - * Proc handlers for TLED - */ -static int tled_proc_show(struct seq_file *m, void *v) -{ - seq_printf(m, "%d\n", read_led(hotk->methods->tled_status, TLED_ON)); - return 0; -} - -static int tled_proc_open(struct inode *inode, struct file *file) -{ - return single_open(file, tled_proc_show, NULL); -} - -static ssize_t tled_proc_write(struct file *file, const char __user *buffer, - size_t count, loff_t *pos) -{ - return write_led(buffer, count, hotk->methods->mt_tled, TLED_ON, 0); -} - -static const struct file_operations tled_proc_fops = { - .owner = THIS_MODULE, - .open = tled_proc_open, - .read = seq_read, - .llseek = seq_lseek, - .release = single_release, - .write = tled_proc_write, -}; - -static int get_lcd_state(void) -{ - int lcd = 0; - - if (hotk->model == L3H) { - /* L3H and the like have to be handled differently */ - acpi_status status = 0; - struct acpi_object_list input; - union acpi_object mt_params[2]; - struct acpi_buffer output; - union acpi_object out_obj; - - input.count = 2; - input.pointer = mt_params; - /* Note: the following values are partly guessed up, but - otherwise they seem to work */ - mt_params[0].type = ACPI_TYPE_INTEGER; - mt_params[0].integer.value = 0x02; - mt_params[1].type = ACPI_TYPE_INTEGER; - mt_params[1].integer.value = 0x02; - - output.length = sizeof(out_obj); - output.pointer = &out_obj; - - status = - acpi_evaluate_object(NULL, hotk->methods->lcd_status, - &input, &output); - if (status != AE_OK) - return -1; - if (out_obj.type == ACPI_TYPE_INTEGER) - /* That's what the AML code does */ - lcd = out_obj.integer.value >> 8; - } else if (hotk->model == F3Sa) { - unsigned long long tmp; - union acpi_object param; - struct acpi_object_list input; - acpi_status status; - - /* Read pin 11 */ - param.type = ACPI_TYPE_INTEGER; - param.integer.value = 0x11; - input.count = 1; - input.pointer = ¶m; - - status = acpi_evaluate_integer(NULL, hotk->methods->lcd_status, - &input, &tmp); - if (status != AE_OK) - return -1; - - lcd = tmp; - } else { - /* We don't have to check anything if we are here */ - if (!read_acpi_int(NULL, hotk->methods->lcd_status, &lcd)) - pr_warn("Error reading LCD status\n"); - - if (hotk->model == L2D) - lcd = ~lcd; - } - - return (lcd & 1); -} - -static int set_lcd_state(int value) -{ - int lcd = 0; - acpi_status status = 0; - - lcd = value ? 1 : 0; - if (lcd != get_lcd_state()) { - /* switch */ - if (hotk->model != L3H) { - status = - acpi_evaluate_object(NULL, - hotk->methods->mt_lcd_switch, - NULL, NULL); - } else { - /* L3H and the like must be handled differently */ - if (!write_acpi_int - (hotk->handle, hotk->methods->mt_lcd_switch, 0x07, - NULL)) - status = AE_ERROR; - /* L3H's AML executes EHK (0x07) upon Fn+F7 keypress, - the exact behaviour is simulated here */ - } - if (ACPI_FAILURE(status)) - pr_warn("Error switching LCD\n"); - } - return 0; - -} - -static int lcd_proc_show(struct seq_file *m, void *v) -{ - seq_printf(m, "%d\n", get_lcd_state()); - return 0; -} - -static int lcd_proc_open(struct inode *inode, struct file *file) -{ - return single_open(file, lcd_proc_show, NULL); -} - -static ssize_t lcd_proc_write(struct file *file, const char __user *buffer, - size_t count, loff_t *pos) -{ - int rv, value; - - rv = parse_arg(buffer, count, &value); - if (rv > 0) - set_lcd_state(value); - return rv; -} - -static const struct file_operations lcd_proc_fops = { - .owner = THIS_MODULE, - .open = lcd_proc_open, - .read = seq_read, - .llseek = seq_lseek, - .release = single_release, - .write = lcd_proc_write, -}; - -static int read_brightness(struct backlight_device *bd) -{ - int value; - - if (hotk->methods->brightness_get) { /* SPLV/GPLV laptop */ - if (!read_acpi_int(hotk->handle, hotk->methods->brightness_get, - &value)) - pr_warn("Error reading brightness\n"); - } else if (hotk->methods->brightness_status) { /* For D1 for example */ - if (!read_acpi_int(NULL, hotk->methods->brightness_status, - &value)) - pr_warn("Error reading brightness\n"); - } else /* No GPLV method */ - value = hotk->brightness; - return value; -} - -/* - * Change the brightness level - */ -static int set_brightness(int value) -{ - acpi_status status = 0; - int ret = 0; - - /* SPLV laptop */ - if (hotk->methods->brightness_set) { - if (!write_acpi_int(hotk->handle, hotk->methods->brightness_set, - value, NULL)) { - pr_warn("Error changing brightness\n"); - ret = -EIO; - } - goto out; - } - - /* No SPLV method if we are here, act as appropriate */ - value -= read_brightness(NULL); - while (value != 0) { - status = acpi_evaluate_object(NULL, (value > 0) ? - hotk->methods->brightness_up : - hotk->methods->brightness_down, - NULL, NULL); - (value > 0) ? value-- : value++; - if (ACPI_FAILURE(status)) { - pr_warn("Error changing brightness\n"); - ret = -EIO; - } - } -out: - return ret; -} - -static int set_brightness_status(struct backlight_device *bd) -{ - return set_brightness(bd->props.brightness); -} - -static int brn_proc_show(struct seq_file *m, void *v) -{ - seq_printf(m, "%d\n", read_brightness(NULL)); - return 0; -} - -static int brn_proc_open(struct inode *inode, struct file *file) -{ - return single_open(file, brn_proc_show, NULL); -} - -static ssize_t brn_proc_write(struct file *file, const char __user *buffer, - size_t count, loff_t *pos) -{ - int rv, value; - - rv = parse_arg(buffer, count, &value); - if (rv > 0) { - value = (0 < value) ? ((15 < value) ? 15 : value) : 0; - /* 0 <= value <= 15 */ - set_brightness(value); - } - return rv; -} - -static const struct file_operations brn_proc_fops = { - .owner = THIS_MODULE, - .open = brn_proc_open, - .read = seq_read, - .llseek = seq_lseek, - .release = single_release, - .write = brn_proc_write, -}; - -static void set_display(int value) -{ - /* no sanity check needed for now */ - if (!write_acpi_int(hotk->handle, hotk->methods->display_set, - value, NULL)) - pr_warn("Error setting display\n"); - return; -} - -/* - * Now, *this* one could be more user-friendly, but so far, no-one has - * complained. The significance of bits is the same as in proc_write_disp() - */ -static int disp_proc_show(struct seq_file *m, void *v) -{ - int value = 0; - - if (!read_acpi_int(hotk->handle, hotk->methods->display_get, &value)) - pr_warn("Error reading display status\n"); - value &= 0x07; /* needed for some models, shouldn't hurt others */ - seq_printf(m, "%d\n", value); - return 0; -} - -static int disp_proc_open(struct inode *inode, struct file *file) -{ - return single_open(file, disp_proc_show, NULL); -} - -/* - * Experimental support for display switching. As of now: 1 should activate - * the LCD output, 2 should do for CRT, and 4 for TV-Out. Any combination - * (bitwise) of these will suffice. I never actually tested 3 displays hooked - * up simultaneously, so be warned. See the acpi4asus README for more info. - */ -static ssize_t disp_proc_write(struct file *file, const char __user *buffer, - size_t count, loff_t *pos) -{ - int rv, value; - - rv = parse_arg(buffer, count, &value); - if (rv > 0) - set_display(value); - return rv; -} - -static const struct file_operations disp_proc_fops = { - .owner = THIS_MODULE, - .open = disp_proc_open, - .read = seq_read, - .llseek = seq_lseek, - .release = single_release, - .write = disp_proc_write, -}; - -static int -asus_proc_add(char *name, const struct file_operations *proc_fops, umode_t mode, - struct acpi_device *device) -{ - struct proc_dir_entry *proc; - - proc = proc_create_data(name, mode, acpi_device_dir(device), - proc_fops, acpi_driver_data(device)); - if (!proc) { - pr_warn(" Unable to create %s fs entry\n", name); - return -1; - } - proc->uid = asus_uid; - proc->gid = asus_gid; - return 0; -} - -static int asus_hotk_add_fs(struct acpi_device *device) -{ - struct proc_dir_entry *proc; - umode_t mode; - - if ((asus_uid == 0) && (asus_gid == 0)) { - mode = S_IFREG | S_IRUGO | S_IWUSR | S_IWGRP; - } else { - mode = S_IFREG | S_IRUSR | S_IRGRP | S_IWUSR | S_IWGRP; - pr_warn(" asus_uid and asus_gid parameters are " - "deprecated, use chown and chmod instead!\n"); - } - - acpi_device_dir(device) = asus_proc_dir; - if (!acpi_device_dir(device)) - return -ENODEV; - - proc = proc_create(PROC_INFO, mode, acpi_device_dir(device), - &asus_info_proc_fops); - if (proc) { - proc->uid = asus_uid; - proc->gid = asus_gid; - } else { - pr_warn(" Unable to create " PROC_INFO " fs entry\n"); - } - - if (hotk->methods->mt_wled) { - asus_proc_add(PROC_WLED, &wled_proc_fops, mode, device); - } - - if (hotk->methods->mt_ledd) { - asus_proc_add(PROC_LEDD, &ledd_proc_fops, mode, device); - } - - if (hotk->methods->mt_mled) { - asus_proc_add(PROC_MLED, &mled_proc_fops, mode, device); - } - - if (hotk->methods->mt_tled) { - asus_proc_add(PROC_TLED, &tled_proc_fops, mode, device); - } - - if (hotk->methods->mt_bt_switch) { - asus_proc_add(PROC_BT, &bluetooth_proc_fops, mode, device); - } - - /* - * We need both read node and write method as LCD switch is also - * accessible from the keyboard - */ - if (hotk->methods->mt_lcd_switch && hotk->methods->lcd_status) { - asus_proc_add(PROC_LCD, &lcd_proc_fops, mode, device); - } - - if ((hotk->methods->brightness_up && hotk->methods->brightness_down) || - (hotk->methods->brightness_get && hotk->methods->brightness_set)) { - asus_proc_add(PROC_BRN, &brn_proc_fops, mode, device); - } - - if (hotk->methods->display_set) { - asus_proc_add(PROC_DISP, &disp_proc_fops, mode, device); - } - - return 0; -} - -static int asus_hotk_remove_fs(struct acpi_device *device) -{ - if (acpi_device_dir(device)) { - remove_proc_entry(PROC_INFO, acpi_device_dir(device)); - if (hotk->methods->mt_wled) - remove_proc_entry(PROC_WLED, acpi_device_dir(device)); - if (hotk->methods->mt_mled) - remove_proc_entry(PROC_MLED, acpi_device_dir(device)); - if (hotk->methods->mt_tled) - remove_proc_entry(PROC_TLED, acpi_device_dir(device)); - if (hotk->methods->mt_ledd) - remove_proc_entry(PROC_LEDD, acpi_device_dir(device)); - if (hotk->methods->mt_bt_switch) - remove_proc_entry(PROC_BT, acpi_device_dir(device)); - if (hotk->methods->mt_lcd_switch && hotk->methods->lcd_status) - remove_proc_entry(PROC_LCD, acpi_device_dir(device)); - if ((hotk->methods->brightness_up - && hotk->methods->brightness_down) - || (hotk->methods->brightness_get - && hotk->methods->brightness_set)) - remove_proc_entry(PROC_BRN, acpi_device_dir(device)); - if (hotk->methods->display_set) - remove_proc_entry(PROC_DISP, acpi_device_dir(device)); - } - return 0; -} - -static void asus_hotk_notify(struct acpi_device *device, u32 event) -{ - /* TODO Find a better way to handle events count. */ - if (!hotk) - return; - - /* - * The BIOS *should* be sending us device events, but apparently - * Asus uses system events instead, so just ignore any device - * events we get. - */ - if (event > ACPI_MAX_SYS_NOTIFY) - return; - - if ((event & ~((u32) BR_UP)) < 16) - hotk->brightness = (event & ~((u32) BR_UP)); - else if ((event & ~((u32) BR_DOWN)) < 16) - hotk->brightness = (event & ~((u32) BR_DOWN)); - - acpi_bus_generate_proc_event(hotk->device, event, - hotk->event_count[event % 128]++); - - return; -} - -/* - * Match the model string to the list of supported models. Return END_MODEL if - * no match or model is NULL. - */ -static int asus_model_match(char *model) -{ - if (model == NULL) - return END_MODEL; - - if (strncmp(model, "L3D", 3) == 0) - return L3D; - else if (strncmp(model, "L2E", 3) == 0 || - strncmp(model, "L3H", 3) == 0 || strncmp(model, "L5D", 3) == 0) - return L3H; - else if (strncmp(model, "L3", 2) == 0 || strncmp(model, "L2B", 3) == 0) - return L3C; - else if (strncmp(model, "L8L", 3) == 0) - return L8L; - else if (strncmp(model, "L4R", 3) == 0) - return L4R; - else if (strncmp(model, "M6N", 3) == 0 || strncmp(model, "W3N", 3) == 0) - return M6N; - else if (strncmp(model, "M6R", 3) == 0 || strncmp(model, "A3G", 3) == 0) - return M6R; - else if (strncmp(model, "M2N", 3) == 0 || - strncmp(model, "M3N", 3) == 0 || - strncmp(model, "M5N", 3) == 0 || - strncmp(model, "S1N", 3) == 0 || - strncmp(model, "S5N", 3) == 0) - return xxN; - else if (strncmp(model, "M1", 2) == 0) - return M1A; - else if (strncmp(model, "M2", 2) == 0 || strncmp(model, "L4E", 3) == 0) - return M2E; - else if (strncmp(model, "L2", 2) == 0) - return L2D; - else if (strncmp(model, "L8", 2) == 0) - return S1x; - else if (strncmp(model, "D1", 2) == 0) - return D1x; - else if (strncmp(model, "A1", 2) == 0) - return A1x; - else if (strncmp(model, "A2", 2) == 0) - return A2x; - else if (strncmp(model, "J1", 2) == 0) - return S2x; - else if (strncmp(model, "L5", 2) == 0) - return L5x; - else if (strncmp(model, "A4G", 3) == 0) - return A4G; - else if (strncmp(model, "W1N", 3) == 0) - return W1N; - else if (strncmp(model, "W3V", 3) == 0) - return W3V; - else if (strncmp(model, "W5A", 3) == 0) - return W5A; - else if (strncmp(model, "R1F", 3) == 0) - return R1F; - else if (strncmp(model, "A4S", 3) == 0) - return A4S; - else if (strncmp(model, "F3Sa", 4) == 0) - return F3Sa; - else - return END_MODEL; -} - -/* - * This function is used to initialize the hotk with right values. In this - * method, we can make all the detection we want, and modify the hotk struct - */ -static int asus_hotk_get_info(void) -{ - struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL }; - union acpi_object *model = NULL; - int bsts_result; - char *string = NULL; - acpi_status status; - - /* - * Get DSDT headers early enough to allow for differentiating between - * models, but late enough to allow acpi_bus_register_driver() to fail - * before doing anything ACPI-specific. Should we encounter a machine, - * which needs special handling (i.e. its hotkey device has a different - * HID), this bit will be moved. A global variable asus_info contains - * the DSDT header. - */ - status = acpi_get_table(ACPI_SIG_DSDT, 1, &asus_info); - if (ACPI_FAILURE(status)) - pr_warn(" Couldn't get the DSDT table header\n"); - - /* We have to write 0 on init this far for all ASUS models */ - if (!write_acpi_int(hotk->handle, "INIT", 0, &buffer)) { - pr_err(" Hotkey initialization failed\n"); - return -ENODEV; - } - - /* This needs to be called for some laptops to init properly */ - if (!read_acpi_int(hotk->handle, "BSTS", &bsts_result)) - pr_warn(" Error calling BSTS\n"); - else if (bsts_result) - pr_notice(" BSTS called, 0x%02x returned\n", bsts_result); - - /* - * Try to match the object returned by INIT to the specific model. - * Handle every possible object (or the lack of thereof) the DSDT - * writers might throw at us. When in trouble, we pass NULL to - * asus_model_match() and try something completely different. - */ - if (buffer.pointer) { - model = buffer.pointer; - switch (model->type) { - case ACPI_TYPE_STRING: - string = model->string.pointer; - break; - case ACPI_TYPE_BUFFER: - string = model->buffer.pointer; - break; - default: - kfree(model); - model = NULL; - break; - } - } - hotk->model = asus_model_match(string); - if (hotk->model == END_MODEL) { /* match failed */ - if (asus_info && - strncmp(asus_info->oem_table_id, "ODEM", 4) == 0) { - hotk->model = P30; - pr_notice(" Samsung P30 detected, supported\n"); - hotk->methods = &model_conf[hotk->model]; - kfree(model); - return 0; - } else { - hotk->model = M2E; - pr_notice(" unsupported model %s, trying default values\n", - string); - pr_notice(" send /proc/acpi/dsdt to the developers\n"); - kfree(model); - return -ENODEV; - } - } - hotk->methods = &model_conf[hotk->model]; - pr_notice(" %s model detected, supported\n", string); - - /* Sort of per-model blacklist */ - if (strncmp(string, "L2B", 3) == 0) - hotk->methods->lcd_status = NULL; - /* L2B is similar enough to L3C to use its settings, with this only - exception */ - else if (strncmp(string, "A3G", 3) == 0) - hotk->methods->lcd_status = "\\BLFG"; - /* A3G is like M6R */ - else if (strncmp(string, "S5N", 3) == 0 || - strncmp(string, "M5N", 3) == 0 || - strncmp(string, "W3N", 3) == 0) - hotk->methods->mt_mled = NULL; - /* S5N, M5N and W3N have no MLED */ - else if (strncmp(string, "L5D", 3) == 0) - hotk->methods->mt_wled = NULL; - /* L5D's WLED is not controlled by ACPI */ - else if (strncmp(string, "M2N", 3) == 0 || - strncmp(string, "W3V", 3) == 0 || - strncmp(string, "S1N", 3) == 0) - hotk->methods->mt_wled = "WLED"; - /* M2N, S1N and W3V have a usable WLED */ - else if (asus_info) { - if (strncmp(asus_info->oem_table_id, "L1", 2) == 0) - hotk->methods->mled_status = NULL; - /* S1300A reports L84F, but L1400B too, account for that */ - } - - kfree(model); - - return 0; -} - -static int asus_hotk_check(void) -{ - int result = 0; - - result = acpi_bus_get_status(hotk->device); - if (result) - return result; - - if (hotk->device->status.present) { - result = asus_hotk_get_info(); - } else { - pr_err(" Hotkey device not present, aborting\n"); - return -EINVAL; - } - - return result; -} - -static int asus_hotk_found; - -static int asus_hotk_add(struct acpi_device *device) -{ - acpi_status status = AE_OK; - int result; - - pr_notice("Asus Laptop ACPI Extras version %s\n", ASUS_ACPI_VERSION); - - hotk = kzalloc(sizeof(struct asus_hotk), GFP_KERNEL); - if (!hotk) - return -ENOMEM; - - hotk->handle = device->handle; - strcpy(acpi_device_name(device), ACPI_HOTK_DEVICE_NAME); - strcpy(acpi_device_class(device), ACPI_HOTK_CLASS); - device->driver_data = hotk; - hotk->device = device; - - result = asus_hotk_check(); - if (result) - goto end; - - result = asus_hotk_add_fs(device); - if (result) - goto end; - - /* For laptops without GPLV: init the hotk->brightness value */ - if ((!hotk->methods->brightness_get) - && (!hotk->methods->brightness_status) - && (hotk->methods->brightness_up && hotk->methods->brightness_down)) { - status = - acpi_evaluate_object(NULL, hotk->methods->brightness_down, - NULL, NULL); - if (ACPI_FAILURE(status)) - pr_warn(" Error changing brightness\n"); - else { - status = - acpi_evaluate_object(NULL, - hotk->methods->brightness_up, - NULL, NULL); - if (ACPI_FAILURE(status)) - pr_warn(" Strange, error changing brightness\n"); - } - } - - asus_hotk_found = 1; - - /* LED display is off by default */ - hotk->ledd_status = 0xFFF; - -end: - if (result) - kfree(hotk); - - return result; -} - -static int asus_hotk_remove(struct acpi_device *device, int type) -{ - asus_hotk_remove_fs(device); - - kfree(hotk); - - return 0; -} - -static const struct backlight_ops asus_backlight_data = { - .get_brightness = read_brightness, - .update_status = set_brightness_status, -}; - -static void asus_acpi_exit(void) -{ - if (asus_backlight_device) - backlight_device_unregister(asus_backlight_device); - - acpi_bus_unregister_driver(&asus_hotk_driver); - remove_proc_entry(PROC_ASUS, acpi_root_dir); - - return; -} - -static int __init asus_acpi_init(void) -{ - struct backlight_properties props; - int result; - - result = acpi_bus_register_driver(&asus_hotk_driver); - if (result < 0) - return result; - - asus_proc_dir = proc_mkdir(PROC_ASUS, acpi_root_dir); - if (!asus_proc_dir) { - pr_err("Unable to create /proc entry\n"); - acpi_bus_unregister_driver(&asus_hotk_driver); - return -ENODEV; - } - - /* - * This is a bit of a kludge. We only want this module loaded - * for ASUS systems, but there's currently no way to probe the - * ACPI namespace for ASUS HIDs. So we just return failure if - * we didn't find one, which will cause the module to be - * unloaded. - */ - if (!asus_hotk_found) { - acpi_bus_unregister_driver(&asus_hotk_driver); - remove_proc_entry(PROC_ASUS, acpi_root_dir); - return -ENODEV; - } - - memset(&props, 0, sizeof(struct backlight_properties)); - props.type = BACKLIGHT_PLATFORM; - props.max_brightness = 15; - asus_backlight_device = backlight_device_register("asus", NULL, NULL, - &asus_backlight_data, - &props); - if (IS_ERR(asus_backlight_device)) { - pr_err("Could not register asus backlight device\n"); - asus_backlight_device = NULL; - asus_acpi_exit(); - return -ENODEV; - } - - return 0; -} - -module_init(asus_acpi_init); -module_exit(asus_acpi_exit); -- cgit v1.2.3 From b8589e2a8065b8e7773742b60ae96b63b757bb69 Mon Sep 17 00:00:00 2001 From: Benoit Cousson Date: Wed, 29 Feb 2012 22:51:32 +0100 Subject: gpio/twl: Add DT support to gpio-twl4030 driver Add the DT support for the I2C GPIO expander inside the twl4030. Note: The pdata parameters still have to be properly adapted using dedicated bindings. Signed-off-by: Benoit Cousson Acked-by: Felipe Balbi Acked-by: Grant Likely Signed-off-by: Samuel Ortiz --- .../devicetree/bindings/gpio/gpio-twl4030.txt | 23 +++++++ drivers/gpio/gpio-twl4030.c | 79 ++++++++++++++-------- 2 files changed, 73 insertions(+), 29 deletions(-) create mode 100644 Documentation/devicetree/bindings/gpio/gpio-twl4030.txt (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/gpio/gpio-twl4030.txt b/Documentation/devicetree/bindings/gpio/gpio-twl4030.txt new file mode 100644 index 000000000000..16695d9cf1e8 --- /dev/null +++ b/Documentation/devicetree/bindings/gpio/gpio-twl4030.txt @@ -0,0 +1,23 @@ +twl4030 GPIO controller bindings + +Required properties: +- compatible: + - "ti,twl4030-gpio" for twl4030 GPIO controller +- #gpio-cells : Should be two. + - first cell is the pin number + - second cell is used to specify optional parameters (unused) +- gpio-controller : Marks the device node as a GPIO controller. +- #interrupt-cells : Should be 2. +- interrupt-controller: Mark the device node as an interrupt controller + The first cell is the GPIO number. + The second cell is not used. + +Example: + +twl_gpio: gpio { + compatible = "ti,twl4030-gpio"; + #gpio-cells = <2>; + gpio-controller; + #interrupt-cells = <2>; + interrupt-controller; +}; diff --git a/drivers/gpio/gpio-twl4030.c b/drivers/gpio/gpio-twl4030.c index 49e5c6eb403a..94256fe7bf36 100644 --- a/drivers/gpio/gpio-twl4030.c +++ b/drivers/gpio/gpio-twl4030.c @@ -32,6 +32,8 @@ #include #include #include +#include +#include #include @@ -256,7 +258,8 @@ static int twl_request(struct gpio_chip *chip, unsigned offset) * and vMMC2 power supplies based on card presence. */ pdata = chip->dev->platform_data; - value |= pdata->mmc_cd & 0x03; + if (pdata) + value |= pdata->mmc_cd & 0x03; status = gpio_twl4030_write(REG_GPIO_CTRL, value); } @@ -395,6 +398,7 @@ static int gpio_twl4030_remove(struct platform_device *pdev); static int __devinit gpio_twl4030_probe(struct platform_device *pdev) { struct twl4030_gpio_platform_data *pdata = pdev->dev.platform_data; + struct device_node *node = pdev->dev.of_node; int ret, irq_base; /* maybe setup IRQs */ @@ -409,6 +413,9 @@ static int __devinit gpio_twl4030_probe(struct platform_device *pdev) return irq_base; } + irq_domain_add_legacy(node, TWL4030_GPIO_MAX, irq_base, 0, + &irq_domain_simple_ops, NULL); + ret = twl4030_sih_setup(&pdev->dev, TWL4030_MODULE_GPIO, irq_base); if (ret < 0) return ret; @@ -416,40 +423,45 @@ static int __devinit gpio_twl4030_probe(struct platform_device *pdev) twl4030_gpio_irq_base = irq_base; no_irqs: - /* - * NOTE: boards may waste power if they don't set pullups - * and pulldowns correctly ... default for non-ULPI pins is - * pulldown, and some other pins may have external pullups - * or pulldowns. Careful! - */ - ret = gpio_twl4030_pulls(pdata->pullups, pdata->pulldowns); - if (ret) - dev_dbg(&pdev->dev, "pullups %.05x %.05x --> %d\n", - pdata->pullups, pdata->pulldowns, - ret); - - ret = gpio_twl4030_debounce(pdata->debounce, pdata->mmc_cd); - if (ret) - dev_dbg(&pdev->dev, "debounce %.03x %.01x --> %d\n", - pdata->debounce, pdata->mmc_cd, - ret); - - twl_gpiochip.base = pdata->gpio_base; + twl_gpiochip.base = -1; twl_gpiochip.ngpio = TWL4030_GPIO_MAX; twl_gpiochip.dev = &pdev->dev; - /* NOTE: we assume VIBRA_CTL.VIBRA_EN, in MODULE_AUDIO_VOICE, - * is (still) clear if use_leds is set. - */ - if (pdata->use_leds) - twl_gpiochip.ngpio += 2; + if (pdata) { + twl_gpiochip.base = pdata->gpio_base; + + /* + * NOTE: boards may waste power if they don't set pullups + * and pulldowns correctly ... default for non-ULPI pins is + * pulldown, and some other pins may have external pullups + * or pulldowns. Careful! + */ + ret = gpio_twl4030_pulls(pdata->pullups, pdata->pulldowns); + if (ret) + dev_dbg(&pdev->dev, "pullups %.05x %.05x --> %d\n", + pdata->pullups, pdata->pulldowns, + ret); + + ret = gpio_twl4030_debounce(pdata->debounce, pdata->mmc_cd); + if (ret) + dev_dbg(&pdev->dev, "debounce %.03x %.01x --> %d\n", + pdata->debounce, pdata->mmc_cd, + ret); + + /* + * NOTE: we assume VIBRA_CTL.VIBRA_EN, in MODULE_AUDIO_VOICE, + * is (still) clear if use_leds is set. + */ + if (pdata->use_leds) + twl_gpiochip.ngpio += 2; + } ret = gpiochip_add(&twl_gpiochip); if (ret < 0) { dev_err(&pdev->dev, "could not register gpiochip, %d\n", ret); twl_gpiochip.ngpio = 0; gpio_twl4030_remove(pdev); - } else if (pdata->setup) { + } else if (pdata && pdata->setup) { int status; status = pdata->setup(&pdev->dev, @@ -467,7 +479,7 @@ static int gpio_twl4030_remove(struct platform_device *pdev) struct twl4030_gpio_platform_data *pdata = pdev->dev.platform_data; int status; - if (pdata->teardown) { + if (pdata && pdata->teardown) { status = pdata->teardown(&pdev->dev, pdata->gpio_base, TWL4030_GPIO_MAX); if (status) { @@ -488,12 +500,21 @@ static int gpio_twl4030_remove(struct platform_device *pdev) return -EIO; } +static const struct of_device_id twl_gpio_match[] = { + { .compatible = "ti,twl4030-gpio", }, + { }, +}; +MODULE_DEVICE_TABLE(of, twl_gpio_match); + /* Note: this hardware lives inside an I2C-based multi-function device. */ MODULE_ALIAS("platform:twl4030_gpio"); static struct platform_driver gpio_twl4030_driver = { - .driver.name = "twl4030_gpio", - .driver.owner = THIS_MODULE, + .driver = { + .name = "twl4030_gpio", + .owner = THIS_MODULE, + .of_match_table = of_match_ptr(twl_gpio_match), + }, .probe = gpio_twl4030_probe, .remove = gpio_twl4030_remove, }; -- cgit v1.2.3 From 02608bef8f774c058779546926889a2f2717a499 Mon Sep 17 00:00:00 2001 From: Dave Young Date: Wed, 1 Feb 2012 10:33:14 +0800 Subject: module: add kernel param to force disable module load Sometimes we need to test a kernel of same version with code or config option changes. We already have sysctl to disable module load, but add a kernel parameter will be more convenient. Since modules_disabled is int, so here use bint type in core_param. TODO: make sysctl accept bool and change modules_disabled to bool Signed-off-by: Dave Young Signed-off-by: Rusty Russell --- Documentation/kernel-parameters.txt | 2 ++ kernel/module.c | 1 + 2 files changed, 3 insertions(+) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 58eac231fe69..e2f8c297a8a4 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1869,6 +1869,8 @@ bytes respectively. Such letter suffixes can also be entirely omitted. shutdown the other cpus. Instead use the REBOOT_VECTOR irq. + nomodule Disable module load + nopat [X86] Disable PAT (page attribute table extension of pagetables) support. diff --git a/kernel/module.c b/kernel/module.c index 2c932760fd33..7e31da9750c0 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -105,6 +105,7 @@ struct list_head *kdb_modules = &modules; /* kdb needs the list of modules */ /* Block module loading/unloading? */ int modules_disabled = 0; +core_param(nomodule, modules_disabled, bint, 0); /* Waiting for a module to finish initializing? */ static DECLARE_WAIT_QUEUE_HEAD(module_wq); -- cgit v1.2.3 From b0b40f24848440aa65418532fe97d8d46acceb3d Mon Sep 17 00:00:00 2001 From: Daniel Vetter Date: Mon, 19 Mar 2012 00:34:27 +0100 Subject: dma_buf: Add documentation for the new cpu access support v2: Fix spelling issues noticed by Rob Clark. Signed-off-by: Daniel Vetter Signed-off-by: Rob Clark Signed-off-by: Sumit Semwal --- Documentation/dma-buf-sharing.txt | 102 ++++++++++++++++++++++++++++++++++++-- 1 file changed, 99 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt index 225f96d88f55..9f3aeeffd608 100644 --- a/Documentation/dma-buf-sharing.txt +++ b/Documentation/dma-buf-sharing.txt @@ -32,8 +32,12 @@ The buffer-user *IMPORTANT*: [see https://lkml.org/lkml/2011/12/20/211 for more details] For this first version, A buffer shared using the dma_buf sharing API: - *may* be exported to user space using "mmap" *ONLY* by exporter, outside of - this framework. -- may be used *ONLY* by importers that do not need CPU access to the buffer. + this framework. +- with this new iteration of the dma-buf api cpu access from the kernel has been + enable, see below for the details. + +dma-buf operations for device dma only +-------------------------------------- The dma_buf buffer sharing API usage contains the following steps: @@ -219,7 +223,99 @@ NOTES: If the exporter chooses not to allow an attach() operation once a map_dma_buf() API has been called, it simply returns an error. -Miscellaneous notes: +Kernel cpu access to a dma-buf buffer object +-------------------------------------------- + +The motivation to allow cpu access from the kernel to a dma-buf object from the +importers side are: +- fallback operations, e.g. if the devices is connected to a usb bus and the + kernel needs to shuffle the data around first before sending it away. +- full transparency for existing users on the importer side, i.e. userspace + should not notice the difference between a normal object from that subsystem + and an imported one backed by a dma-buf. This is really important for drm + opengl drivers that expect to still use all the existing upload/download + paths. + +Access to a dma_buf from the kernel context involves three steps: + +1. Prepare access, which invalidate any necessary caches and make the object + available for cpu access. +2. Access the object page-by-page with the dma_buf map apis +3. Finish access, which will flush any necessary cpu caches and free reserved + resources. + +1. Prepare access + + Before an importer can access a dma_buf object with the cpu from the kernel + context, it needs to notify the exporter of the access that is about to + happen. + + Interface: + int dma_buf_begin_cpu_access(struct dma_buf *dmabuf, + size_t start, size_t len, + enum dma_data_direction direction) + + This allows the exporter to ensure that the memory is actually available for + cpu access - the exporter might need to allocate or swap-in and pin the + backing storage. The exporter also needs to ensure that cpu access is + coherent for the given range and access direction. The range and access + direction can be used by the exporter to optimize the cache flushing, i.e. + access outside of the range or with a different direction (read instead of + write) might return stale or even bogus data (e.g. when the exporter needs to + copy the data to temporary storage). + + This step might fail, e.g. in oom conditions. + +2. Accessing the buffer + + To support dma_buf objects residing in highmem cpu access is page-based using + an api similar to kmap. Accessing a dma_buf is done in aligned chunks of + PAGE_SIZE size. Before accessing a chunk it needs to be mapped, which returns + a pointer in kernel virtual address space. Afterwards the chunk needs to be + unmapped again. There is no limit on how often a given chunk can be mapped + and unmapped, i.e. the importer does not need to call begin_cpu_access again + before mapping the same chunk again. + + Interfaces: + void *dma_buf_kmap(struct dma_buf *, unsigned long); + void dma_buf_kunmap(struct dma_buf *, unsigned long, void *); + + There are also atomic variants of these interfaces. Like for kmap they + facilitate non-blocking fast-paths. Neither the importer nor the exporter (in + the callback) is allowed to block when using these. + + Interfaces: + void *dma_buf_kmap_atomic(struct dma_buf *, unsigned long); + void dma_buf_kunmap_atomic(struct dma_buf *, unsigned long, void *); + + For importers all the restrictions of using kmap apply, like the limited + supply of kmap_atomic slots. Hence an importer shall only hold onto at most 2 + atomic dma_buf kmaps at the same time (in any given process context). + + dma_buf kmap calls outside of the range specified in begin_cpu_access are + undefined. If the range is not PAGE_SIZE aligned, kmap needs to succeed on + the partial chunks at the beginning and end but may return stale or bogus + data outside of the range (in these partial chunks). + + Note that these calls need to always succeed. The exporter needs to complete + any preparations that might fail in begin_cpu_access. + +3. Finish access + + When the importer is done accessing the range specified in begin_cpu_access, + it needs to announce this to the exporter (to facilitate cache flushing and + unpinning of any pinned resources). The result of of any dma_buf kmap calls + after end_cpu_access is undefined. + + Interface: + void dma_buf_end_cpu_access(struct dma_buf *dma_buf, + size_t start, size_t len, + enum dma_data_direction dir); + + +Miscellaneous notes +------------------- + - Any exporters or users of the dma-buf buffer sharing framework must have a 'select DMA_SHARED_BUFFER' in their respective Kconfigs. -- cgit v1.2.3 From fbb231e1a98cb0360b681b6a6195a619e98d7077 Mon Sep 17 00:00:00 2001 From: Rob Clark Date: Mon, 19 Mar 2012 16:42:49 -0500 Subject: dma-buf: document fd flags and O_CLOEXEC requirement Otherwise subsystems will get this wrong and end up with a second export ioctl with the flag and O_CLOEXEC support added. Signed-off-by: Rob Clark Reviewed-by: Daniel Vetter Signed-off-by: Sumit Semwal --- Documentation/dma-buf-sharing.txt | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) (limited to 'Documentation') diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt index 9f3aeeffd608..3bbd5c51605a 100644 --- a/Documentation/dma-buf-sharing.txt +++ b/Documentation/dma-buf-sharing.txt @@ -319,6 +319,24 @@ Miscellaneous notes - Any exporters or users of the dma-buf buffer sharing framework must have a 'select DMA_SHARED_BUFFER' in their respective Kconfigs. +- In order to avoid fd leaks on exec, the FD_CLOEXEC flag must be set + on the file descriptor. This is not just a resource leak, but a + potential security hole. It could give the newly exec'd application + access to buffers, via the leaked fd, to which it should otherwise + not be permitted access. + + The problem with doing this via a separate fcntl() call, versus doing it + atomically when the fd is created, is that this is inherently racy in a + multi-threaded app[3]. The issue is made worse when it is library code + opening/creating the file descriptor, as the application may not even be + aware of the fd's. + + To avoid this problem, userspace must have a way to request O_CLOEXEC + flag be set when the dma-buf fd is created. So any API provided by + the exporting driver to create a dmabuf fd must provide a way to let + userspace control setting of O_CLOEXEC flag passed in to dma_buf_fd(). + References: [1] struct dma_buf_ops in include/linux/dma-buf.h [2] All interfaces mentioned above defined in include/linux/dma-buf.h +[3] https://lwn.net/Articles/236486/ -- cgit v1.2.3 From e9541ce8efc22c233a045f091c2b969923709038 Mon Sep 17 00:00:00 2001 From: "J. Bruce Fields" Date: Thu, 22 Mar 2012 16:07:18 -0400 Subject: nfsd4: allow numeric idmapping Mimic the client side by providing a module parameter that turns off idmapping in the auth_sys case, for backwards compatibility with NFSv2 and NFSv3. Unlike in the client case, we don't have any way to negotiate, since the client can return an error to us if it doesn't like the id that we return to it in (for example) a getattr call. However, it has always been possible for servers to return numeric id's, and as far as we're aware clients have always been able to handle them. Also, in the auth_sys case clients already need to have numeric id's the same between client and server. Therefore we believe it's safe to default this to on; but the module parameter is available to return to previous behavior if this proves to be a problem in some unexpected setup. Signed-off-by: J. Bruce Fields --- Documentation/kernel-parameters.txt | 6 +++++ fs/nfsd/nfs4idmap.c | 53 ++++++++++++++++++++++++++++++++++--- 2 files changed, 55 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 0d79a88f4de9..e4f84f013b57 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -1686,6 +1686,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted. The default is to send the implementation identification information. + nfsd.nfs4_disable_idmapping= + [NFSv4] When set to the default of '1', the NFSv4 + server will return only numeric uids and gids to + clients using auth_sys, and will accept numeric uids + and gids from such clients. This is intended to ease + migration from NFSv2/v3. objlayoutdriver.osd_login_prog= [NFS] [OBJLAYOUT] sets the pathname to the program which diff --git a/fs/nfsd/nfs4idmap.c b/fs/nfsd/nfs4idmap.c index 94096273cd6c..69ca9c5bb600 100644 --- a/fs/nfsd/nfs4idmap.c +++ b/fs/nfsd/nfs4idmap.c @@ -40,6 +40,14 @@ #include "idmap.h" #include "nfsd.h" +/* + * Turn off idmapping when using AUTH_SYS. + */ +static bool nfs4_disable_idmapping = true; +module_param(nfs4_disable_idmapping, bool, 0644); +MODULE_PARM_DESC(nfs4_disable_idmapping, + "Turn off server's NFSv4 idmapping when using 'sec=sys'"); + /* * Cache entry */ @@ -561,28 +569,65 @@ idmap_id_to_name(struct svc_rqst *rqstp, int type, uid_t id, char *name) return ret; } +static bool +numeric_name_to_id(struct svc_rqst *rqstp, int type, const char *name, u32 namelen, uid_t *id) +{ + int ret; + char buf[11]; + + if (namelen + 1 > sizeof(buf)) + /* too long to represent a 32-bit id: */ + return false; + /* Just to make sure it's null-terminated: */ + memcpy(buf, name, namelen); + buf[namelen] = '\0'; + ret = strict_strtoul(name, 10, (unsigned long *)id); + return ret == 0; +} + +static __be32 +do_name_to_id(struct svc_rqst *rqstp, int type, const char *name, u32 namelen, uid_t *id) +{ + if (nfs4_disable_idmapping && rqstp->rq_flavor < RPC_AUTH_GSS) + if (numeric_name_to_id(rqstp, type, name, namelen, id)) + return 0; + /* + * otherwise, fall through and try idmapping, for + * backwards compatibility with clients sending names: + */ + return idmap_name_to_id(rqstp, type, name, namelen, id); +} + +static int +do_id_to_name(struct svc_rqst *rqstp, int type, uid_t id, char *name) +{ + if (nfs4_disable_idmapping && rqstp->rq_flavor < RPC_AUTH_GSS) + return sprintf(name, "%u", id); + return idmap_id_to_name(rqstp, type, id, name); +} + __be32 nfsd_map_name_to_uid(struct svc_rqst *rqstp, const char *name, size_t namelen, __u32 *id) { - return idmap_name_to_id(rqstp, IDMAP_TYPE_USER, name, namelen, id); + return do_name_to_id(rqstp, IDMAP_TYPE_USER, name, namelen, id); } __be32 nfsd_map_name_to_gid(struct svc_rqst *rqstp, const char *name, size_t namelen, __u32 *id) { - return idmap_name_to_id(rqstp, IDMAP_TYPE_GROUP, name, namelen, id); + return do_name_to_id(rqstp, IDMAP_TYPE_GROUP, name, namelen, id); } int nfsd_map_uid_to_name(struct svc_rqst *rqstp, __u32 id, char *name) { - return idmap_id_to_name(rqstp, IDMAP_TYPE_USER, id, name); + return do_id_to_name(rqstp, IDMAP_TYPE_USER, id, name); } int nfsd_map_gid_to_name(struct svc_rqst *rqstp, __u32 id, char *name) { - return idmap_id_to_name(rqstp, IDMAP_TYPE_GROUP, id, name); + return do_id_to_name(rqstp, IDMAP_TYPE_GROUP, id, name); } -- cgit v1.2.3 From 3832246ddd83391187edff8af328f9e9684d3177 Mon Sep 17 00:00:00 2001 From: Karol Lewandowski Date: Wed, 22 Feb 2012 19:06:22 +0100 Subject: max17042_battery: Make it possible to instantiate driver from DT Allow both device tree (preferred) and platform data-based driver instantiation. Signed-off-by: Karol Lewandowski Signed-off-by: Kyungmin Park Signed-off-by: Anton Vorontsov --- .../bindings/power_supply/max17042_battery.txt | 18 ++++++++ drivers/power/max17042_battery.c | 50 +++++++++++++++++++++- 2 files changed, 67 insertions(+), 1 deletion(-) create mode 100644 Documentation/devicetree/bindings/power_supply/max17042_battery.txt (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/power_supply/max17042_battery.txt b/Documentation/devicetree/bindings/power_supply/max17042_battery.txt new file mode 100644 index 000000000000..5bc9b685cf8a --- /dev/null +++ b/Documentation/devicetree/bindings/power_supply/max17042_battery.txt @@ -0,0 +1,18 @@ +max17042_battery +~~~~~~~~~~~~~~~~ + +Required properties : + - compatible : "maxim,max17042" + +Optional properties : + - maxim,rsns-microohm : Resistance of rsns resistor in micro Ohms + (datasheet-recommended value is 10000). + Defining this property enables current-sense functionality. + +Example: + + battery-charger@36 { + compatible = "maxim,max17042"; + reg = <0x36>; + maxim,rsns-microohm = <10000>; + }; diff --git a/drivers/power/max17042_battery.c b/drivers/power/max17042_battery.c index 872e9b4a5ade..e36763a4f0f0 100644 --- a/drivers/power/max17042_battery.c +++ b/drivers/power/max17042_battery.c @@ -31,6 +31,7 @@ #include #include #include +#include /* Status register bits */ #define STATUS_POR_BIT (1 << 1) @@ -608,6 +609,40 @@ static void max17042_init_worker(struct work_struct *work) chip->init_complete = 1; } +#ifdef CONFIG_OF +static struct max17042_platform_data * +max17042_get_pdata(struct device *dev) +{ + struct device_node *np = dev->of_node; + u32 prop; + struct max17042_platform_data *pdata; + + if (!np) + return dev->platform_data; + + pdata = devm_kzalloc(dev, sizeof(*pdata), GFP_KERNEL); + if (!pdata) + return NULL; + + /* + * Require current sense resistor value to be specified for + * current-sense functionality to be enabled at all. + */ + if (of_property_read_u32(np, "maxim,rsns-microohm", &prop) == 0) { + pdata->r_sns = prop; + pdata->enable_current_sense = true; + } + + return pdata; +} +#else +static struct max17042_platform_data * +max17042_get_pdata(struct device *dev) +{ + return dev->platform_data; +} +#endif + static int __devinit max17042_probe(struct i2c_client *client, const struct i2c_device_id *id) { @@ -624,7 +659,11 @@ static int __devinit max17042_probe(struct i2c_client *client, return -ENOMEM; chip->client = client; - chip->pdata = client->dev.platform_data; + chip->pdata = max17042_get_pdata(&client->dev); + if (!chip->pdata) { + dev_err(&client->dev, "no platform data provided\n"); + return -EINVAL; + } i2c_set_clientdata(client, chip); @@ -689,6 +728,14 @@ static int __devexit max17042_remove(struct i2c_client *client) return 0; } +#ifdef CONFIG_OF +static const struct of_device_id max17042_dt_match[] = { + { .compatible = "maxim,max17042" }, + { }, +}; +MODULE_DEVICE_TABLE(of, max17042_dt_match); +#endif + static const struct i2c_device_id max17042_id[] = { { "max17042", 0 }, { } @@ -698,6 +745,7 @@ MODULE_DEVICE_TABLE(i2c, max17042_id); static struct i2c_driver max17042_i2c_driver = { .driver = { .name = "max17042", + .of_match_table = of_match_ptr(max17042_dt_match), }, .probe = max17042_probe, .remove = __devexit_p(max17042_remove), -- cgit v1.2.3 From 062737fb6d90c632439b1f77ad6a4965cfc84a20 Mon Sep 17 00:00:00 2001 From: Seth Heasley Date: Mon, 26 Mar 2012 21:47:19 +0200 Subject: i2c-i801: Add device IDs for Intel Lynx Point Add the SMBus controller device IDs for the Intel Lynx Point PCH. Signed-off-by: Seth Heasley Signed-off-by: Jean Delvare --- Documentation/i2c/busses/i2c-i801 | 1 + drivers/i2c/busses/Kconfig | 1 + drivers/i2c/busses/i2c-i801.c | 3 +++ 3 files changed, 5 insertions(+) (limited to 'Documentation') diff --git a/Documentation/i2c/busses/i2c-i801 b/Documentation/i2c/busses/i2c-i801 index 2871fd500349..71f55bbcefc8 100644 --- a/Documentation/i2c/busses/i2c-i801 +++ b/Documentation/i2c/busses/i2c-i801 @@ -20,6 +20,7 @@ Supported adapters: * Intel Patsburg (PCH) * Intel DH89xxCC (PCH) * Intel Panther Point (PCH) + * Intel Lynx Point (PCH) Datasheets: Publicly available at the Intel website On Intel Patsburg and later chipsets, both the normal host SMBus controller diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig index 71c1b0a7535c..d2c5095deeac 100644 --- a/drivers/i2c/busses/Kconfig +++ b/drivers/i2c/busses/Kconfig @@ -103,6 +103,7 @@ config I2C_I801 Patsburg (PCH) DH89xxCC (PCH) Panther Point (PCH) + Lynx Point (PCH) This driver can also be built as a module. If so, the module will be called i2c-i801. diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c index 5d2e2816831f..07d0c9565fc5 100644 --- a/drivers/i2c/busses/i2c-i801.c +++ b/drivers/i2c/busses/i2c-i801.c @@ -51,6 +51,7 @@ Patsburg (PCH) IDF 0x1d72 32 hard yes yes yes DH89xxCC (PCH) 0x2330 32 hard yes yes yes Panther Point (PCH) 0x1e22 32 hard yes yes yes + Lynx Point (PCH) 0x8c22 32 hard yes yes yes Features supported by this driver: Software PEC no @@ -145,6 +146,7 @@ #define PCI_DEVICE_ID_INTEL_PANTHERPOINT_SMBUS 0x1e22 #define PCI_DEVICE_ID_INTEL_DH89XXCC_SMBUS 0x2330 #define PCI_DEVICE_ID_INTEL_5_3400_SERIES_SMBUS 0x3b30 +#define PCI_DEVICE_ID_INTEL_LYNXPOINT_SMBUS 0x8c22 struct i801_priv { struct i2c_adapter adapter; @@ -633,6 +635,7 @@ static DEFINE_PCI_DEVICE_TABLE(i801_ids) = { { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PATSBURG_SMBUS_IDF2) }, { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_DH89XXCC_SMBUS) }, { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PANTHERPOINT_SMBUS) }, + { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_LYNXPOINT_SMBUS) }, { 0, } }; -- cgit v1.2.3 From eea628199d5b12429c47db17035a954b0062e554 Mon Sep 17 00:00:00 2001 From: Stefan Roese Date: Fri, 16 Mar 2012 10:19:31 +0100 Subject: mtd: Add device-tree support to fsmc_nand This patch adds support to configure the FSMC NAND driver (used amongst others on SPEAr platforms) via device-tree instead of platform_data. Signed-off-by: Stefan Roese Signed-off-by: Artem Bityutskiy Signed-off-by: David Woodhouse --- .../devicetree/bindings/mtd/fsmc-nand.txt | 33 +++++++++++++ drivers/mtd/nand/fsmc_nand.c | 57 +++++++++++++++++++++- include/linux/mtd/fsmc.h | 4 +- 3 files changed, 91 insertions(+), 3 deletions(-) create mode 100644 Documentation/devicetree/bindings/mtd/fsmc-nand.txt (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/mtd/fsmc-nand.txt b/Documentation/devicetree/bindings/mtd/fsmc-nand.txt new file mode 100644 index 000000000000..e2c663b354d2 --- /dev/null +++ b/Documentation/devicetree/bindings/mtd/fsmc-nand.txt @@ -0,0 +1,33 @@ +* FSMC NAND + +Required properties: +- compatible : "st,spear600-fsmc-nand" +- reg : Address range of the mtd chip +- reg-names: Should contain the reg names "fsmc_regs" and "nand_data" +- st,ale-off : Chip specific offset to ALE +- st,cle-off : Chip specific offset to CLE + +Optional properties: +- bank-width : Width (in bytes) of the device. If not present, the width + defaults to 1 byte +- nand-skip-bbtscan: Indicates the the BBT scanning should be skipped + +Example: + + fsmc: flash@d1800000 { + compatible = "st,spear600-fsmc-nand"; + #address-cells = <1>; + #size-cells = <1>; + reg = <0xd1800000 0x1000 /* FSMC Register */ + 0xd2000000 0x4000>; /* NAND Base */ + reg-names = "fsmc_regs", "nand_data"; + st,ale-off = <0x20000>; + st,cle-off = <0x10000>; + + bank-width = <1>; + nand-skip-bbtscan; + + partition@0 { + ... + }; + }; diff --git a/drivers/mtd/nand/fsmc_nand.c b/drivers/mtd/nand/fsmc_nand.c index 049018213090..1b8330e1155a 100644 --- a/drivers/mtd/nand/fsmc_nand.c +++ b/drivers/mtd/nand/fsmc_nand.c @@ -31,6 +31,7 @@ #include #include #include +#include #include #include #include @@ -854,6 +855,38 @@ static bool filter(struct dma_chan *chan, void *slave) return true; } +#ifdef CONFIG_OF +static int __devinit fsmc_nand_probe_config_dt(struct platform_device *pdev, + struct device_node *np) +{ + struct fsmc_nand_platform_data *pdata = dev_get_platdata(&pdev->dev); + u32 val; + + /* Set default NAND width to 8 bits */ + pdata->width = 8; + if (!of_property_read_u32(np, "bank-width", &val)) { + if (val == 2) { + pdata->width = 16; + } else if (val != 1) { + dev_err(&pdev->dev, "invalid bank-width %u\n", val); + return -EINVAL; + } + } + of_property_read_u32(np, "st,ale-off", &pdata->ale_off); + of_property_read_u32(np, "st,cle-off", &pdata->cle_off); + if (of_get_property(np, "nand-skip-bbtscan", NULL)) + pdata->options = NAND_SKIP_BBTSCAN; + + return 0; +} +#else +static int __devinit fsmc_nand_probe_config_dt(struct platform_device *pdev, + struct device_node *np) +{ + return -ENOSYS; +} +#endif + /* * fsmc_nand_probe - Probe function * @pdev: platform device structure @@ -861,6 +894,8 @@ static bool filter(struct dma_chan *chan, void *slave) static int __init fsmc_nand_probe(struct platform_device *pdev) { struct fsmc_nand_platform_data *pdata = dev_get_platdata(&pdev->dev); + struct device_node __maybe_unused *np = pdev->dev.of_node; + struct mtd_part_parser_data ppdata = {}; struct fsmc_nand_data *host; struct mtd_info *mtd; struct nand_chip *nand; @@ -870,6 +905,16 @@ static int __init fsmc_nand_probe(struct platform_device *pdev) u32 pid; int i; + if (np) { + pdata = devm_kzalloc(&pdev->dev, sizeof(*pdata), GFP_KERNEL); + pdev->dev.platform_data = pdata; + ret = fsmc_nand_probe_config_dt(pdev, np); + if (ret) { + dev_err(&pdev->dev, "no platform data\n"); + return -ENODEV; + } + } + if (!pdata) { dev_err(&pdev->dev, "platform data is NULL\n"); return -EINVAL; @@ -1113,7 +1158,8 @@ static int __init fsmc_nand_probe(struct platform_device *pdev) * Check for partition info passed */ host->mtd.name = "nand"; - ret = mtd_device_parse_register(&host->mtd, NULL, NULL, + ppdata.of_node = np; + ret = mtd_device_parse_register(&host->mtd, NULL, &ppdata, host->partitions, host->nr_partitions); if (ret) goto err_probe; @@ -1183,11 +1229,20 @@ static int fsmc_nand_resume(struct device *dev) static SIMPLE_DEV_PM_OPS(fsmc_nand_pm_ops, fsmc_nand_suspend, fsmc_nand_resume); #endif +#ifdef CONFIG_OF +static const struct of_device_id fsmc_nand_id_table[] = { + { .compatible = "st,spear600-fsmc-nand" }, + {} +}; +MODULE_DEVICE_TABLE(of, fsmc_nand_id_table); +#endif + static struct platform_driver fsmc_nand_driver = { .remove = fsmc_nand_remove, .driver = { .owner = THIS_MODULE, .name = "fsmc-nand", + .of_match_table = of_match_ptr(fsmc_nand_id_table), #ifdef CONFIG_PM .pm = &fsmc_nand_pm_ops, #endif diff --git a/include/linux/mtd/fsmc.h b/include/linux/mtd/fsmc.h index 823359c0f5a1..b20029221fb1 100644 --- a/include/linux/mtd/fsmc.h +++ b/include/linux/mtd/fsmc.h @@ -156,8 +156,8 @@ struct fsmc_nand_platform_data { unsigned int bank; /* CLE, ALE offsets */ - unsigned long cle_off; - unsigned long ale_off; + unsigned int cle_off; + unsigned int ale_off; enum access_mode mode; void (*select_bank)(uint32_t bank, uint32_t busw); -- cgit v1.2.3 From 6551ab5d30d6bf0cea0c6cb294686ce3c7fc6042 Mon Sep 17 00:00:00 2001 From: Stefan Roese Date: Fri, 16 Mar 2012 11:42:11 +0100 Subject: mtd: add device-tree support to spear_smi This patch adds support to configure the SPEAr SMI driver via device-tree instead of platform_data. Signed-off-by: Stefan Roese Signed-off-by: Artem Bityutskiy Signed-off-by: David Woodhouse --- .../devicetree/bindings/mtd/spear_smi.txt | 31 ++++++ drivers/mtd/devices/spear_smi.c | 111 ++++++++++++++++++--- include/linux/mtd/spear_smi.h | 5 + 3 files changed, 135 insertions(+), 12 deletions(-) create mode 100644 Documentation/devicetree/bindings/mtd/spear_smi.txt (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/mtd/spear_smi.txt b/Documentation/devicetree/bindings/mtd/spear_smi.txt new file mode 100644 index 000000000000..7248aadd89e4 --- /dev/null +++ b/Documentation/devicetree/bindings/mtd/spear_smi.txt @@ -0,0 +1,31 @@ +* SPEAr SMI + +Required properties: +- compatible : "st,spear600-smi" +- reg : Address range of the mtd chip +- #address-cells, #size-cells : Must be present if the device has sub-nodes + representing partitions. +- interrupt-parent: Should be the phandle for the interrupt controller + that services interrupts for this device +- interrupts: Should contain the STMMAC interrupts +- clock-rate : Functional clock rate of SMI in Hz + +Optional properties: +- st,smi-fast-mode : Flash supports read in fast mode + +Example: + + smi: flash@fc000000 { + compatible = "st,spear600-smi"; + #address-cells = <1>; + #size-cells = <1>; + reg = <0xfc000000 0x1000>; + interrupt-parent = <&vic1>; + interrupts = <12>; + clock-rate = <50000000>; /* 50MHz */ + + flash@f8000000 { + st,smi-fast-mode; + ... + }; + }; diff --git a/drivers/mtd/devices/spear_smi.c b/drivers/mtd/devices/spear_smi.c index f7f34fd1a3e5..797d43cd3550 100644 --- a/drivers/mtd/devices/spear_smi.c +++ b/drivers/mtd/devices/spear_smi.c @@ -33,9 +33,8 @@ #include #include #include - -/* max possible slots for serial-nor flash chip in the SMI controller */ -#define MAX_NUM_FLASH_CHIP 4 +#include +#include /* SMI clock rate */ #define SMI_MAX_CLOCK_FREQ 50000000 /* 50 MHz */ @@ -748,9 +747,63 @@ err_probe: return ret; } -static int spear_smi_setup_banks(struct platform_device *pdev, u32 bank) + +#ifdef CONFIG_OF +static int __devinit spear_smi_probe_config_dt(struct platform_device *pdev, + struct device_node *np) +{ + struct spear_smi_plat_data *pdata = dev_get_platdata(&pdev->dev); + struct device_node *pp = NULL; + const __be32 *addr; + u32 val; + int len; + int i = 0; + + if (!np) + return -ENODEV; + + of_property_read_u32(np, "clock-rate", &val); + pdata->clk_rate = val; + + pdata->board_flash_info = devm_kzalloc(&pdev->dev, + sizeof(*pdata->board_flash_info), + GFP_KERNEL); + + /* Fill structs for each subnode (flash device) */ + while ((pp = of_get_next_child(np, pp))) { + struct spear_smi_flash_info *flash_info; + + flash_info = &pdata->board_flash_info[i]; + pdata->np[i] = pp; + + /* Read base-addr and size from DT */ + addr = of_get_property(pp, "reg", &len); + pdata->board_flash_info->mem_base = be32_to_cpup(&addr[0]); + pdata->board_flash_info->size = be32_to_cpup(&addr[1]); + + if (of_get_property(pp, "st,smi-fast-mode", NULL)) + pdata->board_flash_info->fast_mode = 1; + + i++; + } + + pdata->num_flashes = i; + + return 0; +} +#else +static int __devinit spear_smi_probe_config_dt(struct platform_device *pdev, + struct device_node *np) +{ + return -ENOSYS; +} +#endif + +static int spear_smi_setup_banks(struct platform_device *pdev, + u32 bank, struct device_node *np) { struct spear_smi *dev = platform_get_drvdata(pdev); + struct mtd_part_parser_data ppdata = {}; struct spear_smi_flash_info *flash_info; struct spear_smi_plat_data *pdata; struct spear_snor_flash *flash; @@ -816,11 +869,16 @@ static int spear_smi_setup_banks(struct platform_device *pdev, u32 bank) dev_info(&dev->pdev->dev, ".erasesize = 0x%x(%uK)\n", flash->mtd.erasesize, flash->mtd.erasesize / 1024); +#ifndef CONFIG_OF if (flash_info->partitions) { parts = flash_info->partitions; count = flash_info->nr_partitions; } - ret = mtd_device_parse_register(&flash->mtd, NULL, NULL, parts, count); +#endif + ppdata.of_node = np; + + ret = mtd_device_parse_register(&flash->mtd, NULL, &ppdata, parts, + count); if (ret) { dev_err(&dev->pdev->dev, "Err MTD partition=%d\n", ret); goto err_map; @@ -847,17 +905,34 @@ err_probe: */ static int __devinit spear_smi_probe(struct platform_device *pdev) { - struct spear_smi_plat_data *pdata; + struct device_node *np = pdev->dev.of_node; + struct spear_smi_plat_data *pdata = NULL; struct spear_smi *dev; struct resource *smi_base; int irq, ret = 0; int i; - pdata = dev_get_platdata(&pdev->dev); - if (pdata < 0) { - ret = -ENODEV; - dev_err(&pdev->dev, "no platform data\n"); - goto err; + if (np) { + pdata = devm_kzalloc(&pdev->dev, sizeof(*pdata), GFP_KERNEL); + if (!pdata) { + pr_err("%s: ERROR: no memory", __func__); + ret = -ENOMEM; + goto err; + } + pdev->dev.platform_data = pdata; + ret = spear_smi_probe_config_dt(pdev, np); + if (ret) { + ret = -ENODEV; + dev_err(&pdev->dev, "no platform data\n"); + goto err; + } + } else { + pdata = dev_get_platdata(&pdev->dev); + if (pdata < 0) { + ret = -ENODEV; + dev_err(&pdev->dev, "no platform data\n"); + goto err; + } } smi_base = platform_get_resource(pdev, IORESOURCE_MEM, 0); @@ -932,7 +1007,7 @@ static int __devinit spear_smi_probe(struct platform_device *pdev) /* loop for each serial nor-flash which is connected to smi */ for (i = 0; i < dev->num_flashes; i++) { - ret = spear_smi_setup_banks(pdev, i); + ret = spear_smi_setup_banks(pdev, i, pdata->np[i]); if (ret) { dev_err(&dev->pdev->dev, "bank setup failed\n"); goto err_bank_setup; @@ -967,6 +1042,7 @@ err: static int __devexit spear_smi_remove(struct platform_device *pdev) { struct spear_smi *dev; + struct spear_smi_plat_data *pdata; struct spear_snor_flash *flash; struct resource *smi_base; int ret; @@ -978,6 +1054,8 @@ static int __devexit spear_smi_remove(struct platform_device *pdev) return -ENODEV; } + pdata = dev_get_platdata(&pdev->dev); + /* clean up for all nor flash */ for (i = 0; i < dev->num_flashes; i++) { flash = dev->flash[i]; @@ -1031,11 +1109,20 @@ int spear_smi_resume(struct platform_device *pdev) return ret; } +#ifdef CONFIG_OF +static const struct of_device_id spear_smi_id_table[] = { + { .compatible = "st,spear600-smi" }, + {} +}; +MODULE_DEVICE_TABLE(of, spear_smi_id_table); +#endif + static struct platform_driver spear_smi_driver = { .driver = { .name = "smi", .bus = &platform_bus_type, .owner = THIS_MODULE, + .of_match_table = of_match_ptr(spear_smi_id_table), }, .probe = spear_smi_probe, .remove = __devexit_p(spear_smi_remove), diff --git a/include/linux/mtd/spear_smi.h b/include/linux/mtd/spear_smi.h index 4e26b4e38da3..8ae1726044c3 100644 --- a/include/linux/mtd/spear_smi.h +++ b/include/linux/mtd/spear_smi.h @@ -14,6 +14,10 @@ #include #include #include +#include + +/* max possible slots for serial-nor flash chip in the SMI controller */ +#define MAX_NUM_FLASH_CHIP 4 /* macro to define partitions for flash devices */ #define DEFINE_PARTS(n, of, s) \ @@ -55,6 +59,7 @@ struct spear_smi_plat_data { unsigned long clk_rate; int num_flashes; struct spear_smi_flash_info *board_flash_info; + struct device_node *np[MAX_NUM_FLASH_CHIP]; }; #endif /* __MTD_SPEAR_SMI_H */ -- cgit v1.2.3 From 7a3e97b0dc4bbac2ba7803564ab0057722689921 Mon Sep 17 00:00:00 2001 From: Santosh Yaraganavi Date: Wed, 29 Feb 2012 12:11:50 +0530 Subject: [SCSI] ufshcd: UFS Host controller driver This patch adds support for Universal Flash Storage(UFS) host controllers. The UFS host controller driver includes host controller initialization method. The Initialization process involves following steps: - Initiate UFS Host Controller initialization process by writing to Host controller enable register - Configure UFS Host controller registers with host memory space datastructure offsets. - Unipro link startup procedure - Check for connected device - Configure UFS host controller to process requests - Enable required interrupts - Configure interrupt aggregation [jejb: fix warnings in 32 bit compile] Signed-off-by: Santosh Yaraganavi Signed-off-by: Vinayak Holikatti Reviewed-by: Arnd Bergmann Reviewed-by: Vishak G Reviewed-by: Girish K S Reviewed-by: Namjae Jeon Signed-off-by: James Bottomley --- Documentation/scsi/00-INDEX | 2 + Documentation/scsi/ufs.txt | 133 +++ drivers/scsi/Kconfig | 1 + drivers/scsi/Makefile | 1 + drivers/scsi/ufs/Kconfig | 49 ++ drivers/scsi/ufs/Makefile | 2 + drivers/scsi/ufs/ufs.h | 207 +++++ drivers/scsi/ufs/ufshcd.c | 1978 +++++++++++++++++++++++++++++++++++++++++++ drivers/scsi/ufs/ufshci.h | 376 ++++++++ 9 files changed, 2749 insertions(+) create mode 100644 Documentation/scsi/ufs.txt create mode 100644 drivers/scsi/ufs/Kconfig create mode 100644 drivers/scsi/ufs/Makefile create mode 100644 drivers/scsi/ufs/ufs.h create mode 100644 drivers/scsi/ufs/ufshcd.c create mode 100644 drivers/scsi/ufs/ufshci.h (limited to 'Documentation') diff --git a/Documentation/scsi/00-INDEX b/Documentation/scsi/00-INDEX index b48ded55b555..b7dd6502bec5 100644 --- a/Documentation/scsi/00-INDEX +++ b/Documentation/scsi/00-INDEX @@ -94,3 +94,5 @@ sym53c8xx_2.txt - info on second generation driver for sym53c8xx based adapters tmscsim.txt - info on driver for AM53c974 based adapters +ufs.txt + - info on Universal Flash Storage(UFS) and UFS host controller driver. diff --git a/Documentation/scsi/ufs.txt b/Documentation/scsi/ufs.txt new file mode 100644 index 000000000000..41a6164592aa --- /dev/null +++ b/Documentation/scsi/ufs.txt @@ -0,0 +1,133 @@ + Universal Flash Storage + ======================= + + +Contents +-------- + +1. Overview +2. UFS Architecture Overview + 2.1 Application Layer + 2.2 UFS Transport Protocol(UTP) layer + 2.3 UFS Interconnect(UIC) Layer +3. UFSHCD Overview + 3.1 UFS controller initialization + 3.2 UTP Transfer requests + 3.3 UFS error handling + 3.4 SCSI Error handling + + +1. Overview +----------- + +Universal Flash Storage(UFS) is a storage specification for flash devices. +It is aimed to provide a universal storage interface for both +embedded and removable flash memory based storage in mobile +devices such as smart phones and tablet computers. The specification +is defined by JEDEC Solid State Technology Association. UFS is based +on MIPI M-PHY physical layer standard. UFS uses MIPI M-PHY as the +physical layer and MIPI Unipro as the link layer. + +The main goals of UFS is to provide, + * Optimized performance: + For UFS version 1.0 and 1.1 the target performance is as follows, + Support for Gear1 is mandatory (rate A: 1248Mbps, rate B: 1457.6Mbps) + Support for Gear2 is optional (rate A: 2496Mbps, rate B: 2915.2Mbps) + Future version of the standard, + Gear3 (rate A: 4992Mbps, rate B: 5830.4Mbps) + * Low power consumption + * High random IOPs and low latency + + +2. UFS Architecture Overview +---------------------------- + +UFS has a layered communication architecture which is based on SCSI +SAM-5 architectural model. + +UFS communication architecture consists of following layers, + +2.1 Application Layer + + The Application layer is composed of UFS command set layer(UCS), + Task Manager and Device manager. The UFS interface is designed to be + protocol agnostic, however SCSI has been selected as a baseline + protocol for versions 1.0 and 1.1 of UFS protocol layer. + UFS supports subset of SCSI commands defined by SPC-4 and SBC-3. + * UCS: It handles SCSI commands supported by UFS specification. + * Task manager: It handles task management functions defined by the + UFS which are meant for command queue control. + * Device manager: It handles device level operations and device + configuration operations. Device level operations mainly involve + device power management operations and commands to Interconnect + layers. Device level configurations involve handling of query + requests which are used to modify and retrieve configuration + information of the device. + +2.2 UFS Transport Protocol(UTP) layer + + UTP layer provides services for + the higher layers through Service Access Points. UTP defines 3 + service access points for higher layers. + * UDM_SAP: Device manager service access point is exposed to device + manager for device level operations. These device level operations + are done through query requests. + * UTP_CMD_SAP: Command service access point is exposed to UFS command + set layer(UCS) to transport commands. + * UTP_TM_SAP: Task management service access point is exposed to task + manager to transport task management functions. + UTP transports messages through UFS protocol information unit(UPIU). + +2.3 UFS Interconnect(UIC) Layer + + UIC is the lowest layer of UFS layered architecture. It handles + connection between UFS host and UFS device. UIC consists of + MIPI UniPro and MIPI M-PHY. UIC provides 2 service access points + to upper layer, + * UIC_SAP: To transport UPIU between UFS host and UFS device. + * UIO_SAP: To issue commands to Unipro layers. + + +3. UFSHCD Overview +------------------ + +The UFS host controller driver is based on Linux SCSI Framework. +UFSHCD is a low level device driver which acts as an interface between +SCSI Midlayer and PCIe based UFS host controllers. + +The current UFSHCD implementation supports following functionality, + +3.1 UFS controller initialization + + The initialization module brings UFS host controller to active state + and prepares the controller to transfer commands/response between + UFSHCD and UFS device. + +3.2 UTP Transfer requests + + Transfer request handling module of UFSHCD receives SCSI commands + from SCSI Midlayer, forms UPIUs and issues the UPIUs to UFS Host + controller. Also, the module decodes, responses received from UFS + host controller in the form of UPIUs and intimates the SCSI Midlayer + of the status of the command. + +3.3 UFS error handling + + Error handling module handles Host controller fatal errors, + Device fatal errors and UIC interconnect layer related errors. + +3.4 SCSI Error handling + + This is done through UFSHCD SCSI error handling routines registered + with SCSI Midlayer. Examples of some of the error handling commands + issues by SCSI Midlayer are Abort task, Lun reset and host reset. + UFSHCD Routines to perform these tasks are registered with + SCSI Midlayer through .eh_abort_handler, .eh_device_reset_handler and + .eh_host_reset_handler. + +In this version of UFSHCD Query requests and power management +functionality are not implemented. + +UFS Specifications can be found at, +UFS - http://www.jedec.org/sites/default/files/docs/JESD220.pdf +UFSHCI - http://www.jedec.org/sites/default/files/docs/JESD223.pdf diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig index 827ebaf4a4a2..af479aa4f494 100644 --- a/drivers/scsi/Kconfig +++ b/drivers/scsi/Kconfig @@ -619,6 +619,7 @@ config SCSI_ARCMSR source "drivers/scsi/megaraid/Kconfig.megaraid" source "drivers/scsi/mpt2sas/Kconfig" +source "drivers/scsi/ufs/Kconfig" config SCSI_HPTIOP tristate "HighPoint RocketRAID 3xxx/4xxx Controller support" diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile index 351b28b3d7df..d469816942f4 100644 --- a/drivers/scsi/Makefile +++ b/drivers/scsi/Makefile @@ -108,6 +108,7 @@ obj-$(CONFIG_MEGARAID_LEGACY) += megaraid.o obj-$(CONFIG_MEGARAID_NEWGEN) += megaraid/ obj-$(CONFIG_MEGARAID_SAS) += megaraid/ obj-$(CONFIG_SCSI_MPT2SAS) += mpt2sas/ +obj-$(CONFIG_SCSI_UFSHCD) += ufs/ obj-$(CONFIG_SCSI_ACARD) += atp870u.o obj-$(CONFIG_SCSI_SUNESP) += esp_scsi.o sun_esp.o obj-$(CONFIG_SCSI_GDTH) += gdth.o diff --git a/drivers/scsi/ufs/Kconfig b/drivers/scsi/ufs/Kconfig new file mode 100644 index 000000000000..8f27f9d6f91d --- /dev/null +++ b/drivers/scsi/ufs/Kconfig @@ -0,0 +1,49 @@ +# +# Kernel configuration file for the UFS Host Controller +# +# This code is based on drivers/scsi/ufs/Kconfig +# Copyright (C) 2011 Samsung Samsung India Software Operations +# +# Santosh Yaraganavi +# Vinayak Holikatti + +# This program is free software; you can redistribute it and/or +# modify it under the terms of the GNU General Public License +# as published by the Free Software Foundation; either version 2 +# of the License, or (at your option) any later version. + +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. + +# NO WARRANTY +# THE PROGRAM IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OR +# CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED INCLUDING, WITHOUT +# LIMITATION, ANY WARRANTIES OR CONDITIONS OF TITLE, NON-INFRINGEMENT, +# MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Each Recipient is +# solely responsible for determining the appropriateness of using and +# distributing the Program and assumes all risks associated with its +# exercise of rights under this Agreement, including but not limited to +# the risks and costs of program errors, damage to or loss of data, +# programs or equipment, and unavailability or interruption of operations. + +# DISCLAIMER OF LIABILITY +# NEITHER RECIPIENT NOR ANY CONTRIBUTORS SHALL HAVE ANY LIABILITY FOR ANY +# DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL +# DAMAGES (INCLUDING WITHOUT LIMITATION LOST PROFITS), HOWEVER CAUSED AND +# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR +# TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE +# USE OR DISTRIBUTION OF THE PROGRAM OR THE EXERCISE OF ANY RIGHTS GRANTED +# HEREUNDER, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES + +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software +# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, +# USA. + +config SCSI_UFSHCD + tristate "Universal Flash Storage host controller driver" + depends on PCI && SCSI + ---help--- + This is a generic driver which supports PCIe UFS Host controllers. diff --git a/drivers/scsi/ufs/Makefile b/drivers/scsi/ufs/Makefile new file mode 100644 index 000000000000..adf7895a6a91 --- /dev/null +++ b/drivers/scsi/ufs/Makefile @@ -0,0 +1,2 @@ +# UFSHCD makefile +obj-$(CONFIG_SCSI_UFSHCD) += ufshcd.o diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h new file mode 100644 index 000000000000..b207529f8d54 --- /dev/null +++ b/drivers/scsi/ufs/ufs.h @@ -0,0 +1,207 @@ +/* + * Universal Flash Storage Host controller driver + * + * This code is based on drivers/scsi/ufs/ufs.h + * Copyright (C) 2011-2012 Samsung India Software Operations + * + * Santosh Yaraganavi + * Vinayak Holikatti + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * NO WARRANTY + * THE PROGRAM IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OR + * CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED INCLUDING, WITHOUT + * LIMITATION, ANY WARRANTIES OR CONDITIONS OF TITLE, NON-INFRINGEMENT, + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Each Recipient is + * solely responsible for determining the appropriateness of using and + * distributing the Program and assumes all risks associated with its + * exercise of rights under this Agreement, including but not limited to + * the risks and costs of program errors, damage to or loss of data, + * programs or equipment, and unavailability or interruption of operations. + + * DISCLAIMER OF LIABILITY + * NEITHER RECIPIENT NOR ANY CONTRIBUTORS SHALL HAVE ANY LIABILITY FOR ANY + * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING WITHOUT LIMITATION LOST PROFITS), HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR + * TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE + * USE OR DISTRIBUTION OF THE PROGRAM OR THE EXERCISE OF ANY RIGHTS GRANTED + * HEREUNDER, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES + + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, + * USA. + */ + +#ifndef _UFS_H +#define _UFS_H + +#define MAX_CDB_SIZE 16 + +#define UPIU_HEADER_DWORD(byte3, byte2, byte1, byte0)\ + ((byte3 << 24) | (byte2 << 16) |\ + (byte1 << 8) | (byte0)) + +/* + * UFS Protocol Information Unit related definitions + */ + +/* Task management functions */ +enum { + UFS_ABORT_TASK = 0x01, + UFS_ABORT_TASK_SET = 0x02, + UFS_CLEAR_TASK_SET = 0x04, + UFS_LOGICAL_RESET = 0x08, + UFS_QUERY_TASK = 0x80, + UFS_QUERY_TASK_SET = 0x81, +}; + +/* UTP UPIU Transaction Codes Initiator to Target */ +enum { + UPIU_TRANSACTION_NOP_OUT = 0x00, + UPIU_TRANSACTION_COMMAND = 0x01, + UPIU_TRANSACTION_DATA_OUT = 0x02, + UPIU_TRANSACTION_TASK_REQ = 0x04, + UPIU_TRANSACTION_QUERY_REQ = 0x26, +}; + +/* UTP UPIU Transaction Codes Target to Initiator */ +enum { + UPIU_TRANSACTION_NOP_IN = 0x20, + UPIU_TRANSACTION_RESPONSE = 0x21, + UPIU_TRANSACTION_DATA_IN = 0x22, + UPIU_TRANSACTION_TASK_RSP = 0x24, + UPIU_TRANSACTION_READY_XFER = 0x31, + UPIU_TRANSACTION_QUERY_RSP = 0x36, +}; + +/* UPIU Read/Write flags */ +enum { + UPIU_CMD_FLAGS_NONE = 0x00, + UPIU_CMD_FLAGS_WRITE = 0x20, + UPIU_CMD_FLAGS_READ = 0x40, +}; + +/* UPIU Task Attributes */ +enum { + UPIU_TASK_ATTR_SIMPLE = 0x00, + UPIU_TASK_ATTR_ORDERED = 0x01, + UPIU_TASK_ATTR_HEADQ = 0x02, + UPIU_TASK_ATTR_ACA = 0x03, +}; + +/* UTP QUERY Transaction Specific Fields OpCode */ +enum { + UPIU_QUERY_OPCODE_NOP = 0x0, + UPIU_QUERY_OPCODE_READ_DESC = 0x1, + UPIU_QUERY_OPCODE_WRITE_DESC = 0x2, + UPIU_QUERY_OPCODE_READ_ATTR = 0x3, + UPIU_QUERY_OPCODE_WRITE_ATTR = 0x4, + UPIU_QUERY_OPCODE_READ_FLAG = 0x5, + UPIU_QUERY_OPCODE_SET_FLAG = 0x6, + UPIU_QUERY_OPCODE_CLEAR_FLAG = 0x7, + UPIU_QUERY_OPCODE_TOGGLE_FLAG = 0x8, +}; + +/* UTP Transfer Request Command Type (CT) */ +enum { + UPIU_COMMAND_SET_TYPE_SCSI = 0x0, + UPIU_COMMAND_SET_TYPE_UFS = 0x1, + UPIU_COMMAND_SET_TYPE_QUERY = 0x2, +}; + +enum { + MASK_SCSI_STATUS = 0xFF, + MASK_TASK_RESPONSE = 0xFF00, + MASK_RSP_UPIU_RESULT = 0xFFFF, +}; + +/* Task management service response */ +enum { + UPIU_TASK_MANAGEMENT_FUNC_COMPL = 0x00, + UPIU_TASK_MANAGEMENT_FUNC_NOT_SUPPORTED = 0x04, + UPIU_TASK_MANAGEMENT_FUNC_SUCCEEDED = 0x08, + UPIU_TASK_MANAGEMENT_FUNC_FAILED = 0x05, + UPIU_INCORRECT_LOGICAL_UNIT_NO = 0x09, +}; +/** + * struct utp_upiu_header - UPIU header structure + * @dword_0: UPIU header DW-0 + * @dword_1: UPIU header DW-1 + * @dword_2: UPIU header DW-2 + */ +struct utp_upiu_header { + u32 dword_0; + u32 dword_1; + u32 dword_2; +}; + +/** + * struct utp_upiu_cmd - Command UPIU structure + * @header: UPIU header structure DW-0 to DW-2 + * @data_transfer_len: Data Transfer Length DW-3 + * @cdb: Command Descriptor Block CDB DW-4 to DW-7 + */ +struct utp_upiu_cmd { + struct utp_upiu_header header; + u32 exp_data_transfer_len; + u8 cdb[MAX_CDB_SIZE]; +}; + +/** + * struct utp_upiu_rsp - Response UPIU structure + * @header: UPIU header DW-0 to DW-2 + * @residual_transfer_count: Residual transfer count DW-3 + * @reserved: Reserved double words DW-4 to DW-7 + * @sense_data_len: Sense data length DW-8 U16 + * @sense_data: Sense data field DW-8 to DW-12 + */ +struct utp_upiu_rsp { + struct utp_upiu_header header; + u32 residual_transfer_count; + u32 reserved[4]; + u16 sense_data_len; + u8 sense_data[18]; +}; + +/** + * struct utp_upiu_task_req - Task request UPIU structure + * @header - UPIU header structure DW0 to DW-2 + * @input_param1: Input parameter 1 DW-3 + * @input_param2: Input parameter 2 DW-4 + * @input_param3: Input parameter 3 DW-5 + * @reserved: Reserved double words DW-6 to DW-7 + */ +struct utp_upiu_task_req { + struct utp_upiu_header header; + u32 input_param1; + u32 input_param2; + u32 input_param3; + u32 reserved[2]; +}; + +/** + * struct utp_upiu_task_rsp - Task Management Response UPIU structure + * @header: UPIU header structure DW0-DW-2 + * @output_param1: Ouput parameter 1 DW3 + * @output_param2: Output parameter 2 DW4 + * @reserved: Reserved double words DW-5 to DW-7 + */ +struct utp_upiu_task_rsp { + struct utp_upiu_header header; + u32 output_param1; + u32 output_param2; + u32 reserved[3]; +}; + +#endif /* End of Header */ diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c new file mode 100644 index 000000000000..52b96e8bf92e --- /dev/null +++ b/drivers/scsi/ufs/ufshcd.c @@ -0,0 +1,1978 @@ +/* + * Universal Flash Storage Host controller driver + * + * This code is based on drivers/scsi/ufs/ufshcd.c + * Copyright (C) 2011-2012 Samsung India Software Operations + * + * Santosh Yaraganavi + * Vinayak Holikatti + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * NO WARRANTY + * THE PROGRAM IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OR + * CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED INCLUDING, WITHOUT + * LIMITATION, ANY WARRANTIES OR CONDITIONS OF TITLE, NON-INFRINGEMENT, + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Each Recipient is + * solely responsible for determining the appropriateness of using and + * distributing the Program and assumes all risks associated with its + * exercise of rights under this Agreement, including but not limited to + * the risks and costs of program errors, damage to or loss of data, + * programs or equipment, and unavailability or interruption of operations. + + * DISCLAIMER OF LIABILITY + * NEITHER RECIPIENT NOR ANY CONTRIBUTORS SHALL HAVE ANY LIABILITY FOR ANY + * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING WITHOUT LIMITATION LOST PROFITS), HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR + * TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE + * USE OR DISTRIBUTION OF THE PROGRAM OR THE EXERCISE OF ANY RIGHTS GRANTED + * HEREUNDER, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES + + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, + * USA. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "ufs.h" +#include "ufshci.h" + +#define UFSHCD "ufshcd" +#define UFSHCD_DRIVER_VERSION "0.1" + +enum { + UFSHCD_MAX_CHANNEL = 0, + UFSHCD_MAX_ID = 1, + UFSHCD_MAX_LUNS = 8, + UFSHCD_CMD_PER_LUN = 32, + UFSHCD_CAN_QUEUE = 32, +}; + +/* UFSHCD states */ +enum { + UFSHCD_STATE_OPERATIONAL, + UFSHCD_STATE_RESET, + UFSHCD_STATE_ERROR, +}; + +/* Interrupt configuration options */ +enum { + UFSHCD_INT_DISABLE, + UFSHCD_INT_ENABLE, + UFSHCD_INT_CLEAR, +}; + +/* Interrupt aggregation options */ +enum { + INT_AGGR_RESET, + INT_AGGR_CONFIG, +}; + +/** + * struct uic_command - UIC command structure + * @command: UIC command + * @argument1: UIC command argument 1 + * @argument2: UIC command argument 2 + * @argument3: UIC command argument 3 + * @cmd_active: Indicate if UIC command is outstanding + * @result: UIC command result + */ +struct uic_command { + u32 command; + u32 argument1; + u32 argument2; + u32 argument3; + int cmd_active; + int result; +}; + +/** + * struct ufs_hba - per adapter private structure + * @mmio_base: UFSHCI base register address + * @ucdl_base_addr: UFS Command Descriptor base address + * @utrdl_base_addr: UTP Transfer Request Descriptor base address + * @utmrdl_base_addr: UTP Task Management Descriptor base address + * @ucdl_dma_addr: UFS Command Descriptor DMA address + * @utrdl_dma_addr: UTRDL DMA address + * @utmrdl_dma_addr: UTMRDL DMA address + * @host: Scsi_Host instance of the driver + * @pdev: PCI device handle + * @lrb: local reference block + * @outstanding_tasks: Bits representing outstanding task requests + * @outstanding_reqs: Bits representing outstanding transfer requests + * @capabilities: UFS Controller Capabilities + * @nutrs: Transfer Request Queue depth supported by controller + * @nutmrs: Task Management Queue depth supported by controller + * @active_uic_cmd: handle of active UIC command + * @ufshcd_tm_wait_queue: wait queue for task management + * @tm_condition: condition variable for task management + * @ufshcd_state: UFSHCD states + * @int_enable_mask: Interrupt Mask Bits + * @uic_workq: Work queue for UIC completion handling + * @feh_workq: Work queue for fatal controller error handling + * @errors: HBA errors + */ +struct ufs_hba { + void __iomem *mmio_base; + + /* Virtual memory reference */ + struct utp_transfer_cmd_desc *ucdl_base_addr; + struct utp_transfer_req_desc *utrdl_base_addr; + struct utp_task_req_desc *utmrdl_base_addr; + + /* DMA memory reference */ + dma_addr_t ucdl_dma_addr; + dma_addr_t utrdl_dma_addr; + dma_addr_t utmrdl_dma_addr; + + struct Scsi_Host *host; + struct pci_dev *pdev; + + struct ufshcd_lrb *lrb; + + unsigned long outstanding_tasks; + unsigned long outstanding_reqs; + + u32 capabilities; + int nutrs; + int nutmrs; + u32 ufs_version; + + struct uic_command active_uic_cmd; + wait_queue_head_t ufshcd_tm_wait_queue; + unsigned long tm_condition; + + u32 ufshcd_state; + u32 int_enable_mask; + + /* Work Queues */ + struct work_struct uic_workq; + struct work_struct feh_workq; + + /* HBA Errors */ + u32 errors; +}; + +/** + * struct ufshcd_lrb - local reference block + * @utr_descriptor_ptr: UTRD address of the command + * @ucd_cmd_ptr: UCD address of the command + * @ucd_rsp_ptr: Response UPIU address for this command + * @ucd_prdt_ptr: PRDT address of the command + * @cmd: pointer to SCSI command + * @sense_buffer: pointer to sense buffer address of the SCSI command + * @sense_bufflen: Length of the sense buffer + * @scsi_status: SCSI status of the command + * @command_type: SCSI, UFS, Query. + * @task_tag: Task tag of the command + * @lun: LUN of the command + */ +struct ufshcd_lrb { + struct utp_transfer_req_desc *utr_descriptor_ptr; + struct utp_upiu_cmd *ucd_cmd_ptr; + struct utp_upiu_rsp *ucd_rsp_ptr; + struct ufshcd_sg_entry *ucd_prdt_ptr; + + struct scsi_cmnd *cmd; + u8 *sense_buffer; + unsigned int sense_bufflen; + int scsi_status; + + int command_type; + int task_tag; + unsigned int lun; +}; + +/** + * ufshcd_get_ufs_version - Get the UFS version supported by the HBA + * @hba - Pointer to adapter instance + * + * Returns UFSHCI version supported by the controller + */ +static inline u32 ufshcd_get_ufs_version(struct ufs_hba *hba) +{ + return readl(hba->mmio_base + REG_UFS_VERSION); +} + +/** + * ufshcd_is_device_present - Check if any device connected to + * the host controller + * @reg_hcs - host controller status register value + * + * Returns 0 if device present, non-zero if no device detected + */ +static inline int ufshcd_is_device_present(u32 reg_hcs) +{ + return (DEVICE_PRESENT & reg_hcs) ? 0 : -1; +} + +/** + * ufshcd_get_tr_ocs - Get the UTRD Overall Command Status + * @lrb: pointer to local command reference block + * + * This function is used to get the OCS field from UTRD + * Returns the OCS field in the UTRD + */ +static inline int ufshcd_get_tr_ocs(struct ufshcd_lrb *lrbp) +{ + return lrbp->utr_descriptor_ptr->header.dword_2 & MASK_OCS; +} + +/** + * ufshcd_get_tmr_ocs - Get the UTMRD Overall Command Status + * @task_req_descp: pointer to utp_task_req_desc structure + * + * This function is used to get the OCS field from UTMRD + * Returns the OCS field in the UTMRD + */ +static inline int +ufshcd_get_tmr_ocs(struct utp_task_req_desc *task_req_descp) +{ + return task_req_descp->header.dword_2 & MASK_OCS; +} + +/** + * ufshcd_get_tm_free_slot - get a free slot for task management request + * @hba: per adapter instance + * + * Returns maximum number of task management request slots in case of + * task management queue full or returns the free slot number + */ +static inline int ufshcd_get_tm_free_slot(struct ufs_hba *hba) +{ + return find_first_zero_bit(&hba->outstanding_tasks, hba->nutmrs); +} + +/** + * ufshcd_utrl_clear - Clear a bit in UTRLCLR register + * @hba: per adapter instance + * @pos: position of the bit to be cleared + */ +static inline void ufshcd_utrl_clear(struct ufs_hba *hba, u32 pos) +{ + writel(~(1 << pos), + (hba->mmio_base + REG_UTP_TRANSFER_REQ_LIST_CLEAR)); +} + +/** + * ufshcd_get_lists_status - Check UCRDY, UTRLRDY and UTMRLRDY + * @reg: Register value of host controller status + * + * Returns integer, 0 on Success and positive value if failed + */ +static inline int ufshcd_get_lists_status(u32 reg) +{ + /* + * The mask 0xFF is for the following HCS register bits + * Bit Description + * 0 Device Present + * 1 UTRLRDY + * 2 UTMRLRDY + * 3 UCRDY + * 4 HEI + * 5 DEI + * 6-7 reserved + */ + return (((reg) & (0xFF)) >> 1) ^ (0x07); +} + +/** + * ufshcd_get_uic_cmd_result - Get the UIC command result + * @hba: Pointer to adapter instance + * + * This function gets the result of UIC command completion + * Returns 0 on success, non zero value on error + */ +static inline int ufshcd_get_uic_cmd_result(struct ufs_hba *hba) +{ + return readl(hba->mmio_base + REG_UIC_COMMAND_ARG_2) & + MASK_UIC_COMMAND_RESULT; +} + +/** + * ufshcd_free_hba_memory - Free allocated memory for LRB, request + * and task lists + * @hba: Pointer to adapter instance + */ +static inline void ufshcd_free_hba_memory(struct ufs_hba *hba) +{ + size_t utmrdl_size, utrdl_size, ucdl_size; + + kfree(hba->lrb); + + if (hba->utmrdl_base_addr) { + utmrdl_size = sizeof(struct utp_task_req_desc) * hba->nutmrs; + dma_free_coherent(&hba->pdev->dev, utmrdl_size, + hba->utmrdl_base_addr, hba->utmrdl_dma_addr); + } + + if (hba->utrdl_base_addr) { + utrdl_size = + (sizeof(struct utp_transfer_req_desc) * hba->nutrs); + dma_free_coherent(&hba->pdev->dev, utrdl_size, + hba->utrdl_base_addr, hba->utrdl_dma_addr); + } + + if (hba->ucdl_base_addr) { + ucdl_size = + (sizeof(struct utp_transfer_cmd_desc) * hba->nutrs); + dma_free_coherent(&hba->pdev->dev, ucdl_size, + hba->ucdl_base_addr, hba->ucdl_dma_addr); + } +} + +/** + * ufshcd_is_valid_req_rsp - checks if controller TR response is valid + * @ucd_rsp_ptr: pointer to response UPIU + * + * This function checks the response UPIU for valid transaction type in + * response field + * Returns 0 on success, non-zero on failure + */ +static inline int +ufshcd_is_valid_req_rsp(struct utp_upiu_rsp *ucd_rsp_ptr) +{ + return ((be32_to_cpu(ucd_rsp_ptr->header.dword_0) >> 24) == + UPIU_TRANSACTION_RESPONSE) ? 0 : DID_ERROR << 16; +} + +/** + * ufshcd_get_rsp_upiu_result - Get the result from response UPIU + * @ucd_rsp_ptr: pointer to response UPIU + * + * This function gets the response status and scsi_status from response UPIU + * Returns the response result code. + */ +static inline int +ufshcd_get_rsp_upiu_result(struct utp_upiu_rsp *ucd_rsp_ptr) +{ + return be32_to_cpu(ucd_rsp_ptr->header.dword_1) & MASK_RSP_UPIU_RESULT; +} + +/** + * ufshcd_config_int_aggr - Configure interrupt aggregation values. + * Currently there is no use case where we want to configure + * interrupt aggregation dynamically. So to configure interrupt + * aggregation, #define INT_AGGR_COUNTER_THRESHOLD_VALUE and + * INT_AGGR_TIMEOUT_VALUE are used. + * @hba: per adapter instance + * @option: Interrupt aggregation option + */ +static inline void +ufshcd_config_int_aggr(struct ufs_hba *hba, int option) +{ + switch (option) { + case INT_AGGR_RESET: + writel((INT_AGGR_ENABLE | + INT_AGGR_COUNTER_AND_TIMER_RESET), + (hba->mmio_base + + REG_UTP_TRANSFER_REQ_INT_AGG_CONTROL)); + break; + case INT_AGGR_CONFIG: + writel((INT_AGGR_ENABLE | + INT_AGGR_PARAM_WRITE | + INT_AGGR_COUNTER_THRESHOLD_VALUE | + INT_AGGR_TIMEOUT_VALUE), + (hba->mmio_base + + REG_UTP_TRANSFER_REQ_INT_AGG_CONTROL)); + break; + } +} + +/** + * ufshcd_enable_run_stop_reg - Enable run-stop registers, + * When run-stop registers are set to 1, it indicates the + * host controller that it can process the requests + * @hba: per adapter instance + */ +static void ufshcd_enable_run_stop_reg(struct ufs_hba *hba) +{ + writel(UTP_TASK_REQ_LIST_RUN_STOP_BIT, + (hba->mmio_base + + REG_UTP_TASK_REQ_LIST_RUN_STOP)); + writel(UTP_TRANSFER_REQ_LIST_RUN_STOP_BIT, + (hba->mmio_base + + REG_UTP_TRANSFER_REQ_LIST_RUN_STOP)); +} + +/** + * ufshcd_hba_stop - Send controller to reset state + * @hba: per adapter instance + */ +static inline void ufshcd_hba_stop(struct ufs_hba *hba) +{ + writel(CONTROLLER_DISABLE, (hba->mmio_base + REG_CONTROLLER_ENABLE)); +} + +/** + * ufshcd_hba_start - Start controller initialization sequence + * @hba: per adapter instance + */ +static inline void ufshcd_hba_start(struct ufs_hba *hba) +{ + writel(CONTROLLER_ENABLE , (hba->mmio_base + REG_CONTROLLER_ENABLE)); +} + +/** + * ufshcd_is_hba_active - Get controller state + * @hba: per adapter instance + * + * Returns zero if controller is active, 1 otherwise + */ +static inline int ufshcd_is_hba_active(struct ufs_hba *hba) +{ + return (readl(hba->mmio_base + REG_CONTROLLER_ENABLE) & 0x1) ? 0 : 1; +} + +/** + * ufshcd_send_command - Send SCSI or device management commands + * @hba: per adapter instance + * @task_tag: Task tag of the command + */ +static inline +void ufshcd_send_command(struct ufs_hba *hba, unsigned int task_tag) +{ + __set_bit(task_tag, &hba->outstanding_reqs); + writel((1 << task_tag), + (hba->mmio_base + REG_UTP_TRANSFER_REQ_DOOR_BELL)); +} + +/** + * ufshcd_copy_sense_data - Copy sense data in case of check condition + * @lrb - pointer to local reference block + */ +static inline void ufshcd_copy_sense_data(struct ufshcd_lrb *lrbp) +{ + int len; + if (lrbp->sense_buffer) { + len = be16_to_cpu(lrbp->ucd_rsp_ptr->sense_data_len); + memcpy(lrbp->sense_buffer, + lrbp->ucd_rsp_ptr->sense_data, + min_t(int, len, SCSI_SENSE_BUFFERSIZE)); + } +} + +/** + * ufshcd_hba_capabilities - Read controller capabilities + * @hba: per adapter instance + */ +static inline void ufshcd_hba_capabilities(struct ufs_hba *hba) +{ + hba->capabilities = + readl(hba->mmio_base + REG_CONTROLLER_CAPABILITIES); + + /* nutrs and nutmrs are 0 based values */ + hba->nutrs = (hba->capabilities & MASK_TRANSFER_REQUESTS_SLOTS) + 1; + hba->nutmrs = + ((hba->capabilities & MASK_TASK_MANAGEMENT_REQUEST_SLOTS) >> 16) + 1; +} + +/** + * ufshcd_send_uic_command - Send UIC commands to unipro layers + * @hba: per adapter instance + * @uic_command: UIC command + */ +static inline void +ufshcd_send_uic_command(struct ufs_hba *hba, struct uic_command *uic_cmnd) +{ + /* Write Args */ + writel(uic_cmnd->argument1, + (hba->mmio_base + REG_UIC_COMMAND_ARG_1)); + writel(uic_cmnd->argument2, + (hba->mmio_base + REG_UIC_COMMAND_ARG_2)); + writel(uic_cmnd->argument3, + (hba->mmio_base + REG_UIC_COMMAND_ARG_3)); + + /* Write UIC Cmd */ + writel((uic_cmnd->command & COMMAND_OPCODE_MASK), + (hba->mmio_base + REG_UIC_COMMAND)); +} + +/** + * ufshcd_map_sg - Map scatter-gather list to prdt + * @lrbp - pointer to local reference block + * + * Returns 0 in case of success, non-zero value in case of failure + */ +static int ufshcd_map_sg(struct ufshcd_lrb *lrbp) +{ + struct ufshcd_sg_entry *prd_table; + struct scatterlist *sg; + struct scsi_cmnd *cmd; + int sg_segments; + int i; + + cmd = lrbp->cmd; + sg_segments = scsi_dma_map(cmd); + if (sg_segments < 0) + return sg_segments; + + if (sg_segments) { + lrbp->utr_descriptor_ptr->prd_table_length = + cpu_to_le16((u16) (sg_segments)); + + prd_table = (struct ufshcd_sg_entry *)lrbp->ucd_prdt_ptr; + + scsi_for_each_sg(cmd, sg, sg_segments, i) { + prd_table[i].size = + cpu_to_le32(((u32) sg_dma_len(sg))-1); + prd_table[i].base_addr = + cpu_to_le32(lower_32_bits(sg->dma_address)); + prd_table[i].upper_addr = + cpu_to_le32(upper_32_bits(sg->dma_address)); + } + } else { + lrbp->utr_descriptor_ptr->prd_table_length = 0; + } + + return 0; +} + +/** + * ufshcd_int_config - enable/disable interrupts + * @hba: per adapter instance + * @option: interrupt option + */ +static void ufshcd_int_config(struct ufs_hba *hba, u32 option) +{ + switch (option) { + case UFSHCD_INT_ENABLE: + writel(hba->int_enable_mask, + (hba->mmio_base + REG_INTERRUPT_ENABLE)); + break; + case UFSHCD_INT_DISABLE: + if (hba->ufs_version == UFSHCI_VERSION_10) + writel(INTERRUPT_DISABLE_MASK_10, + (hba->mmio_base + REG_INTERRUPT_ENABLE)); + else + writel(INTERRUPT_DISABLE_MASK_11, + (hba->mmio_base + REG_INTERRUPT_ENABLE)); + break; + } +} + +/** + * ufshcd_compose_upiu - form UFS Protocol Information Unit(UPIU) + * @lrb - pointer to local reference block + */ +static void ufshcd_compose_upiu(struct ufshcd_lrb *lrbp) +{ + struct utp_transfer_req_desc *req_desc; + struct utp_upiu_cmd *ucd_cmd_ptr; + u32 data_direction; + u32 upiu_flags; + + ucd_cmd_ptr = lrbp->ucd_cmd_ptr; + req_desc = lrbp->utr_descriptor_ptr; + + switch (lrbp->command_type) { + case UTP_CMD_TYPE_SCSI: + if (lrbp->cmd->sc_data_direction == DMA_FROM_DEVICE) { + data_direction = UTP_DEVICE_TO_HOST; + upiu_flags = UPIU_CMD_FLAGS_READ; + } else if (lrbp->cmd->sc_data_direction == DMA_TO_DEVICE) { + data_direction = UTP_HOST_TO_DEVICE; + upiu_flags = UPIU_CMD_FLAGS_WRITE; + } else { + data_direction = UTP_NO_DATA_TRANSFER; + upiu_flags = UPIU_CMD_FLAGS_NONE; + } + + /* Transfer request descriptor header fields */ + req_desc->header.dword_0 = + cpu_to_le32(data_direction | UTP_SCSI_COMMAND); + + /* + * assigning invalid value for command status. Controller + * updates OCS on command completion, with the command + * status + */ + req_desc->header.dword_2 = + cpu_to_le32(OCS_INVALID_COMMAND_STATUS); + + /* command descriptor fields */ + ucd_cmd_ptr->header.dword_0 = + cpu_to_be32(UPIU_HEADER_DWORD(UPIU_TRANSACTION_COMMAND, + upiu_flags, + lrbp->lun, + lrbp->task_tag)); + ucd_cmd_ptr->header.dword_1 = + cpu_to_be32( + UPIU_HEADER_DWORD(UPIU_COMMAND_SET_TYPE_SCSI, + 0, + 0, + 0)); + + /* Total EHS length and Data segment length will be zero */ + ucd_cmd_ptr->header.dword_2 = 0; + + ucd_cmd_ptr->exp_data_transfer_len = + cpu_to_be32(lrbp->cmd->transfersize); + + memcpy(ucd_cmd_ptr->cdb, + lrbp->cmd->cmnd, + (min_t(unsigned short, + lrbp->cmd->cmd_len, + MAX_CDB_SIZE))); + break; + case UTP_CMD_TYPE_DEV_MANAGE: + /* For query function implementation */ + break; + case UTP_CMD_TYPE_UFS: + /* For UFS native command implementation */ + break; + } /* end of switch */ +} + +/** + * ufshcd_queuecommand - main entry point for SCSI requests + * @cmd: command from SCSI Midlayer + * @done: call back function + * + * Returns 0 for success, non-zero in case of failure + */ +static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd) +{ + struct ufshcd_lrb *lrbp; + struct ufs_hba *hba; + unsigned long flags; + int tag; + int err = 0; + + hba = shost_priv(host); + + tag = cmd->request->tag; + + if (hba->ufshcd_state != UFSHCD_STATE_OPERATIONAL) { + err = SCSI_MLQUEUE_HOST_BUSY; + goto out; + } + + lrbp = &hba->lrb[tag]; + + lrbp->cmd = cmd; + lrbp->sense_bufflen = SCSI_SENSE_BUFFERSIZE; + lrbp->sense_buffer = cmd->sense_buffer; + lrbp->task_tag = tag; + lrbp->lun = cmd->device->lun; + + lrbp->command_type = UTP_CMD_TYPE_SCSI; + + /* form UPIU before issuing the command */ + ufshcd_compose_upiu(lrbp); + err = ufshcd_map_sg(lrbp); + if (err) + goto out; + + /* issue command to the controller */ + spin_lock_irqsave(hba->host->host_lock, flags); + ufshcd_send_command(hba, tag); + spin_unlock_irqrestore(hba->host->host_lock, flags); +out: + return err; +} + +/** + * ufshcd_memory_alloc - allocate memory for host memory space data structures + * @hba: per adapter instance + * + * 1. Allocate DMA memory for Command Descriptor array + * Each command descriptor consist of Command UPIU, Response UPIU and PRDT + * 2. Allocate DMA memory for UTP Transfer Request Descriptor List (UTRDL). + * 3. Allocate DMA memory for UTP Task Management Request Descriptor List + * (UTMRDL) + * 4. Allocate memory for local reference block(lrb). + * + * Returns 0 for success, non-zero in case of failure + */ +static int ufshcd_memory_alloc(struct ufs_hba *hba) +{ + size_t utmrdl_size, utrdl_size, ucdl_size; + + /* Allocate memory for UTP command descriptors */ + ucdl_size = (sizeof(struct utp_transfer_cmd_desc) * hba->nutrs); + hba->ucdl_base_addr = dma_alloc_coherent(&hba->pdev->dev, + ucdl_size, + &hba->ucdl_dma_addr, + GFP_KERNEL); + + /* + * UFSHCI requires UTP command descriptor to be 128 byte aligned. + * make sure hba->ucdl_dma_addr is aligned to PAGE_SIZE + * if hba->ucdl_dma_addr is aligned to PAGE_SIZE, then it will + * be aligned to 128 bytes as well + */ + if (!hba->ucdl_base_addr || + WARN_ON(hba->ucdl_dma_addr & (PAGE_SIZE - 1))) { + dev_err(&hba->pdev->dev, + "Command Descriptor Memory allocation failed\n"); + goto out; + } + + /* + * Allocate memory for UTP Transfer descriptors + * UFSHCI requires 1024 byte alignment of UTRD + */ + utrdl_size = (sizeof(struct utp_transfer_req_desc) * hba->nutrs); + hba->utrdl_base_addr = dma_alloc_coherent(&hba->pdev->dev, + utrdl_size, + &hba->utrdl_dma_addr, + GFP_KERNEL); + if (!hba->utrdl_base_addr || + WARN_ON(hba->utrdl_dma_addr & (PAGE_SIZE - 1))) { + dev_err(&hba->pdev->dev, + "Transfer Descriptor Memory allocation failed\n"); + goto out; + } + + /* + * Allocate memory for UTP Task Management descriptors + * UFSHCI requires 1024 byte alignment of UTMRD + */ + utmrdl_size = sizeof(struct utp_task_req_desc) * hba->nutmrs; + hba->utmrdl_base_addr = dma_alloc_coherent(&hba->pdev->dev, + utmrdl_size, + &hba->utmrdl_dma_addr, + GFP_KERNEL); + if (!hba->utmrdl_base_addr || + WARN_ON(hba->utmrdl_dma_addr & (PAGE_SIZE - 1))) { + dev_err(&hba->pdev->dev, + "Task Management Descriptor Memory allocation failed\n"); + goto out; + } + + /* Allocate memory for local reference block */ + hba->lrb = kcalloc(hba->nutrs, sizeof(struct ufshcd_lrb), GFP_KERNEL); + if (!hba->lrb) { + dev_err(&hba->pdev->dev, "LRB Memory allocation failed\n"); + goto out; + } + return 0; +out: + ufshcd_free_hba_memory(hba); + return -ENOMEM; +} + +/** + * ufshcd_host_memory_configure - configure local reference block with + * memory offsets + * @hba: per adapter instance + * + * Configure Host memory space + * 1. Update Corresponding UTRD.UCDBA and UTRD.UCDBAU with UCD DMA + * address. + * 2. Update each UTRD with Response UPIU offset, Response UPIU length + * and PRDT offset. + * 3. Save the corresponding addresses of UTRD, UCD.CMD, UCD.RSP and UCD.PRDT + * into local reference block. + */ +static void ufshcd_host_memory_configure(struct ufs_hba *hba) +{ + struct utp_transfer_cmd_desc *cmd_descp; + struct utp_transfer_req_desc *utrdlp; + dma_addr_t cmd_desc_dma_addr; + dma_addr_t cmd_desc_element_addr; + u16 response_offset; + u16 prdt_offset; + int cmd_desc_size; + int i; + + utrdlp = hba->utrdl_base_addr; + cmd_descp = hba->ucdl_base_addr; + + response_offset = + offsetof(struct utp_transfer_cmd_desc, response_upiu); + prdt_offset = + offsetof(struct utp_transfer_cmd_desc, prd_table); + + cmd_desc_size = sizeof(struct utp_transfer_cmd_desc); + cmd_desc_dma_addr = hba->ucdl_dma_addr; + + for (i = 0; i < hba->nutrs; i++) { + /* Configure UTRD with command descriptor base address */ + cmd_desc_element_addr = + (cmd_desc_dma_addr + (cmd_desc_size * i)); + utrdlp[i].command_desc_base_addr_lo = + cpu_to_le32(lower_32_bits(cmd_desc_element_addr)); + utrdlp[i].command_desc_base_addr_hi = + cpu_to_le32(upper_32_bits(cmd_desc_element_addr)); + + /* Response upiu and prdt offset should be in double words */ + utrdlp[i].response_upiu_offset = + cpu_to_le16((response_offset >> 2)); + utrdlp[i].prd_table_offset = + cpu_to_le16((prdt_offset >> 2)); + utrdlp[i].response_upiu_length = + cpu_to_le16(ALIGNED_UPIU_SIZE); + + hba->lrb[i].utr_descriptor_ptr = (utrdlp + i); + hba->lrb[i].ucd_cmd_ptr = + (struct utp_upiu_cmd *)(cmd_descp + i); + hba->lrb[i].ucd_rsp_ptr = + (struct utp_upiu_rsp *)cmd_descp[i].response_upiu; + hba->lrb[i].ucd_prdt_ptr = + (struct ufshcd_sg_entry *)cmd_descp[i].prd_table; + } +} + +/** + * ufshcd_dme_link_startup - Notify Unipro to perform link startup + * @hba: per adapter instance + * + * UIC_CMD_DME_LINK_STARTUP command must be issued to Unipro layer, + * in order to initialize the Unipro link startup procedure. + * Once the Unipro links are up, the device connected to the controller + * is detected. + * + * Returns 0 on success, non-zero value on failure + */ +static int ufshcd_dme_link_startup(struct ufs_hba *hba) +{ + struct uic_command *uic_cmd; + unsigned long flags; + + /* check if controller is ready to accept UIC commands */ + if (((readl(hba->mmio_base + REG_CONTROLLER_STATUS)) & + UIC_COMMAND_READY) == 0x0) { + dev_err(&hba->pdev->dev, + "Controller not ready" + " to accept UIC commands\n"); + return -EIO; + } + + spin_lock_irqsave(hba->host->host_lock, flags); + + /* form UIC command */ + uic_cmd = &hba->active_uic_cmd; + uic_cmd->command = UIC_CMD_DME_LINK_STARTUP; + uic_cmd->argument1 = 0; + uic_cmd->argument2 = 0; + uic_cmd->argument3 = 0; + + /* enable UIC related interrupts */ + hba->int_enable_mask |= UIC_COMMAND_COMPL; + ufshcd_int_config(hba, UFSHCD_INT_ENABLE); + + /* sending UIC commands to controller */ + ufshcd_send_uic_command(hba, uic_cmd); + spin_unlock_irqrestore(hba->host->host_lock, flags); + return 0; +} + +/** + * ufshcd_make_hba_operational - Make UFS controller operational + * @hba: per adapter instance + * + * To bring UFS host controller to operational state, + * 1. Check if device is present + * 2. Configure run-stop-registers + * 3. Enable required interrupts + * 4. Configure interrupt aggregation + * + * Returns 0 on success, non-zero value on failure + */ +static int ufshcd_make_hba_operational(struct ufs_hba *hba) +{ + int err = 0; + u32 reg; + + /* check if device present */ + reg = readl((hba->mmio_base + REG_CONTROLLER_STATUS)); + if (ufshcd_is_device_present(reg)) { + dev_err(&hba->pdev->dev, "cc: Device not present\n"); + err = -ENXIO; + goto out; + } + + /* + * UCRDY, UTMRLDY and UTRLRDY bits must be 1 + * DEI, HEI bits must be 0 + */ + if (!(ufshcd_get_lists_status(reg))) { + ufshcd_enable_run_stop_reg(hba); + } else { + dev_err(&hba->pdev->dev, + "Host controller not ready to process requests"); + err = -EIO; + goto out; + } + + /* Enable required interrupts */ + hba->int_enable_mask |= (UTP_TRANSFER_REQ_COMPL | + UIC_ERROR | + UTP_TASK_REQ_COMPL | + DEVICE_FATAL_ERROR | + CONTROLLER_FATAL_ERROR | + SYSTEM_BUS_FATAL_ERROR); + ufshcd_int_config(hba, UFSHCD_INT_ENABLE); + + /* Configure interrupt aggregation */ + ufshcd_config_int_aggr(hba, INT_AGGR_CONFIG); + + if (hba->ufshcd_state == UFSHCD_STATE_RESET) + scsi_unblock_requests(hba->host); + + hba->ufshcd_state = UFSHCD_STATE_OPERATIONAL; + scsi_scan_host(hba->host); +out: + return err; +} + +/** + * ufshcd_hba_enable - initialize the controller + * @hba: per adapter instance + * + * The controller resets itself and controller firmware initialization + * sequence kicks off. When controller is ready it will set + * the Host Controller Enable bit to 1. + * + * Returns 0 on success, non-zero value on failure + */ +static int ufshcd_hba_enable(struct ufs_hba *hba) +{ + int retry; + + /* + * msleep of 1 and 5 used in this function might result in msleep(20), + * but it was necessary to send the UFS FPGA to reset mode during + * development and testing of this driver. msleep can be changed to + * mdelay and retry count can be reduced based on the controller. + */ + if (!ufshcd_is_hba_active(hba)) { + + /* change controller state to "reset state" */ + ufshcd_hba_stop(hba); + + /* + * This delay is based on the testing done with UFS host + * controller FPGA. The delay can be changed based on the + * host controller used. + */ + msleep(5); + } + + /* start controller initialization sequence */ + ufshcd_hba_start(hba); + + /* + * To initialize a UFS host controller HCE bit must be set to 1. + * During initialization the HCE bit value changes from 1->0->1. + * When the host controller completes initialization sequence + * it sets the value of HCE bit to 1. The same HCE bit is read back + * to check if the controller has completed initialization sequence. + * So without this delay the value HCE = 1, set in the previous + * instruction might be read back. + * This delay can be changed based on the controller. + */ + msleep(1); + + /* wait for the host controller to complete initialization */ + retry = 10; + while (ufshcd_is_hba_active(hba)) { + if (retry) { + retry--; + } else { + dev_err(&hba->pdev->dev, + "Controller enable failed\n"); + return -EIO; + } + msleep(5); + } + return 0; +} + +/** + * ufshcd_initialize_hba - start the initialization process + * @hba: per adapter instance + * + * 1. Enable the controller via ufshcd_hba_enable. + * 2. Program the Transfer Request List Address with the starting address of + * UTRDL. + * 3. Program the Task Management Request List Address with starting address + * of UTMRDL. + * + * Returns 0 on success, non-zero value on failure. + */ +static int ufshcd_initialize_hba(struct ufs_hba *hba) +{ + if (ufshcd_hba_enable(hba)) + return -EIO; + + /* Configure UTRL and UTMRL base address registers */ + writel(hba->utrdl_dma_addr, + (hba->mmio_base + REG_UTP_TRANSFER_REQ_LIST_BASE_L)); + writel(lower_32_bits(hba->utrdl_dma_addr), + (hba->mmio_base + REG_UTP_TRANSFER_REQ_LIST_BASE_H)); + writel(hba->utmrdl_dma_addr, + (hba->mmio_base + REG_UTP_TASK_REQ_LIST_BASE_L)); + writel(upper_32_bits(hba->utmrdl_dma_addr), + (hba->mmio_base + REG_UTP_TASK_REQ_LIST_BASE_H)); + + /* Initialize unipro link startup procedure */ + return ufshcd_dme_link_startup(hba); +} + +/** + * ufshcd_do_reset - reset the host controller + * @hba: per adapter instance + * + * Returns SUCCESS/FAILED + */ +static int ufshcd_do_reset(struct ufs_hba *hba) +{ + struct ufshcd_lrb *lrbp; + unsigned long flags; + int tag; + + /* block commands from midlayer */ + scsi_block_requests(hba->host); + + spin_lock_irqsave(hba->host->host_lock, flags); + hba->ufshcd_state = UFSHCD_STATE_RESET; + + /* send controller to reset state */ + ufshcd_hba_stop(hba); + spin_unlock_irqrestore(hba->host->host_lock, flags); + + /* abort outstanding commands */ + for (tag = 0; tag < hba->nutrs; tag++) { + if (test_bit(tag, &hba->outstanding_reqs)) { + lrbp = &hba->lrb[tag]; + scsi_dma_unmap(lrbp->cmd); + lrbp->cmd->result = DID_RESET << 16; + lrbp->cmd->scsi_done(lrbp->cmd); + lrbp->cmd = NULL; + } + } + + /* clear outstanding request/task bit maps */ + hba->outstanding_reqs = 0; + hba->outstanding_tasks = 0; + + /* start the initialization process */ + if (ufshcd_initialize_hba(hba)) { + dev_err(&hba->pdev->dev, + "Reset: Controller initialization failed\n"); + return FAILED; + } + return SUCCESS; +} + +/** + * ufshcd_slave_alloc - handle initial SCSI device configurations + * @sdev: pointer to SCSI device + * + * Returns success + */ +static int ufshcd_slave_alloc(struct scsi_device *sdev) +{ + struct ufs_hba *hba; + + hba = shost_priv(sdev->host); + sdev->tagged_supported = 1; + + /* Mode sense(6) is not supported by UFS, so use Mode sense(10) */ + sdev->use_10_for_ms = 1; + scsi_set_tag_type(sdev, MSG_SIMPLE_TAG); + + /* + * Inform SCSI Midlayer that the LUN queue depth is same as the + * controller queue depth. If a LUN queue depth is less than the + * controller queue depth and if the LUN reports + * SAM_STAT_TASK_SET_FULL, the LUN queue depth will be adjusted + * with scsi_adjust_queue_depth. + */ + scsi_activate_tcq(sdev, hba->nutrs); + return 0; +} + +/** + * ufshcd_slave_destroy - remove SCSI device configurations + * @sdev: pointer to SCSI device + */ +static void ufshcd_slave_destroy(struct scsi_device *sdev) +{ + struct ufs_hba *hba; + + hba = shost_priv(sdev->host); + scsi_deactivate_tcq(sdev, hba->nutrs); +} + +/** + * ufshcd_task_req_compl - handle task management request completion + * @hba: per adapter instance + * @index: index of the completed request + * + * Returns SUCCESS/FAILED + */ +static int ufshcd_task_req_compl(struct ufs_hba *hba, u32 index) +{ + struct utp_task_req_desc *task_req_descp; + struct utp_upiu_task_rsp *task_rsp_upiup; + unsigned long flags; + int ocs_value; + int task_result; + + spin_lock_irqsave(hba->host->host_lock, flags); + + /* Clear completed tasks from outstanding_tasks */ + __clear_bit(index, &hba->outstanding_tasks); + + task_req_descp = hba->utmrdl_base_addr; + ocs_value = ufshcd_get_tmr_ocs(&task_req_descp[index]); + + if (ocs_value == OCS_SUCCESS) { + task_rsp_upiup = (struct utp_upiu_task_rsp *) + task_req_descp[index].task_rsp_upiu; + task_result = be32_to_cpu(task_rsp_upiup->header.dword_1); + task_result = ((task_result & MASK_TASK_RESPONSE) >> 8); + + if (task_result != UPIU_TASK_MANAGEMENT_FUNC_COMPL || + task_result != UPIU_TASK_MANAGEMENT_FUNC_SUCCEEDED) + task_result = FAILED; + } else { + task_result = FAILED; + dev_err(&hba->pdev->dev, + "trc: Invalid ocs = %x\n", ocs_value); + } + spin_unlock_irqrestore(hba->host->host_lock, flags); + return task_result; +} + +/** + * ufshcd_adjust_lun_qdepth - Update LUN queue depth if device responds with + * SAM_STAT_TASK_SET_FULL SCSI command status. + * @cmd: pointer to SCSI command + */ +static void ufshcd_adjust_lun_qdepth(struct scsi_cmnd *cmd) +{ + struct ufs_hba *hba; + int i; + int lun_qdepth = 0; + + hba = shost_priv(cmd->device->host); + + /* + * LUN queue depth can be obtained by counting outstanding commands + * on the LUN. + */ + for (i = 0; i < hba->nutrs; i++) { + if (test_bit(i, &hba->outstanding_reqs)) { + + /* + * Check if the outstanding command belongs + * to the LUN which reported SAM_STAT_TASK_SET_FULL. + */ + if (cmd->device->lun == hba->lrb[i].lun) + lun_qdepth++; + } + } + + /* + * LUN queue depth will be total outstanding commands, except the + * command for which the LUN reported SAM_STAT_TASK_SET_FULL. + */ + scsi_adjust_queue_depth(cmd->device, MSG_SIMPLE_TAG, lun_qdepth - 1); +} + +/** + * ufshcd_scsi_cmd_status - Update SCSI command result based on SCSI status + * @lrb: pointer to local reference block of completed command + * @scsi_status: SCSI command status + * + * Returns value base on SCSI command status + */ +static inline int +ufshcd_scsi_cmd_status(struct ufshcd_lrb *lrbp, int scsi_status) +{ + int result = 0; + + switch (scsi_status) { + case SAM_STAT_GOOD: + result |= DID_OK << 16 | + COMMAND_COMPLETE << 8 | + SAM_STAT_GOOD; + break; + case SAM_STAT_CHECK_CONDITION: + result |= DID_OK << 16 | + COMMAND_COMPLETE << 8 | + SAM_STAT_CHECK_CONDITION; + ufshcd_copy_sense_data(lrbp); + break; + case SAM_STAT_BUSY: + result |= SAM_STAT_BUSY; + break; + case SAM_STAT_TASK_SET_FULL: + + /* + * If a LUN reports SAM_STAT_TASK_SET_FULL, then the LUN queue + * depth needs to be adjusted to the exact number of + * outstanding commands the LUN can handle at any given time. + */ + ufshcd_adjust_lun_qdepth(lrbp->cmd); + result |= SAM_STAT_TASK_SET_FULL; + break; + case SAM_STAT_TASK_ABORTED: + result |= SAM_STAT_TASK_ABORTED; + break; + default: + result |= DID_ERROR << 16; + break; + } /* end of switch */ + + return result; +} + +/** + * ufshcd_transfer_rsp_status - Get overall status of the response + * @hba: per adapter instance + * @lrb: pointer to local reference block of completed command + * + * Returns result of the command to notify SCSI midlayer + */ +static inline int +ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp) +{ + int result = 0; + int scsi_status; + int ocs; + + /* overall command status of utrd */ + ocs = ufshcd_get_tr_ocs(lrbp); + + switch (ocs) { + case OCS_SUCCESS: + + /* check if the returned transfer response is valid */ + result = ufshcd_is_valid_req_rsp(lrbp->ucd_rsp_ptr); + if (result) { + dev_err(&hba->pdev->dev, + "Invalid response = %x\n", result); + break; + } + + /* + * get the response UPIU result to extract + * the SCSI command status + */ + result = ufshcd_get_rsp_upiu_result(lrbp->ucd_rsp_ptr); + + /* + * get the result based on SCSI status response + * to notify the SCSI midlayer of the command status + */ + scsi_status = result & MASK_SCSI_STATUS; + result = ufshcd_scsi_cmd_status(lrbp, scsi_status); + break; + case OCS_ABORTED: + result |= DID_ABORT << 16; + break; + case OCS_INVALID_CMD_TABLE_ATTR: + case OCS_INVALID_PRDT_ATTR: + case OCS_MISMATCH_DATA_BUF_SIZE: + case OCS_MISMATCH_RESP_UPIU_SIZE: + case OCS_PEER_COMM_FAILURE: + case OCS_FATAL_ERROR: + default: + result |= DID_ERROR << 16; + dev_err(&hba->pdev->dev, + "OCS error from controller = %x\n", ocs); + break; + } /* end of switch */ + + return result; +} + +/** + * ufshcd_transfer_req_compl - handle SCSI and query command completion + * @hba: per adapter instance + */ +static void ufshcd_transfer_req_compl(struct ufs_hba *hba) +{ + struct ufshcd_lrb *lrb; + unsigned long completed_reqs; + u32 tr_doorbell; + int result; + int index; + + lrb = hba->lrb; + tr_doorbell = + readl(hba->mmio_base + REG_UTP_TRANSFER_REQ_DOOR_BELL); + completed_reqs = tr_doorbell ^ hba->outstanding_reqs; + + for (index = 0; index < hba->nutrs; index++) { + if (test_bit(index, &completed_reqs)) { + + result = ufshcd_transfer_rsp_status(hba, &lrb[index]); + + if (lrb[index].cmd) { + scsi_dma_unmap(lrb[index].cmd); + lrb[index].cmd->result = result; + lrb[index].cmd->scsi_done(lrb[index].cmd); + + /* Mark completed command as NULL in LRB */ + lrb[index].cmd = NULL; + } + } /* end of if */ + } /* end of for */ + + /* clear corresponding bits of completed commands */ + hba->outstanding_reqs ^= completed_reqs; + + /* Reset interrupt aggregation counters */ + ufshcd_config_int_aggr(hba, INT_AGGR_RESET); +} + +/** + * ufshcd_uic_cc_handler - handle UIC command completion + * @work: pointer to a work queue structure + * + * Returns 0 on success, non-zero value on failure + */ +static void ufshcd_uic_cc_handler (struct work_struct *work) +{ + struct ufs_hba *hba; + + hba = container_of(work, struct ufs_hba, uic_workq); + + if ((hba->active_uic_cmd.command == UIC_CMD_DME_LINK_STARTUP) && + !(ufshcd_get_uic_cmd_result(hba))) { + + if (ufshcd_make_hba_operational(hba)) + dev_err(&hba->pdev->dev, + "cc: hba not operational state\n"); + return; + } +} + +/** + * ufshcd_fatal_err_handler - handle fatal errors + * @hba: per adapter instance + */ +static void ufshcd_fatal_err_handler(struct work_struct *work) +{ + struct ufs_hba *hba; + hba = container_of(work, struct ufs_hba, feh_workq); + + /* check if reset is already in progress */ + if (hba->ufshcd_state != UFSHCD_STATE_RESET) + ufshcd_do_reset(hba); +} + +/** + * ufshcd_err_handler - Check for fatal errors + * @work: pointer to a work queue structure + */ +static void ufshcd_err_handler(struct ufs_hba *hba) +{ + u32 reg; + + if (hba->errors & INT_FATAL_ERRORS) + goto fatal_eh; + + if (hba->errors & UIC_ERROR) { + + reg = readl(hba->mmio_base + + REG_UIC_ERROR_CODE_PHY_ADAPTER_LAYER); + if (reg & UIC_DATA_LINK_LAYER_ERROR_PA_INIT) + goto fatal_eh; + } + return; +fatal_eh: + hba->ufshcd_state = UFSHCD_STATE_ERROR; + schedule_work(&hba->feh_workq); +} + +/** + * ufshcd_tmc_handler - handle task management function completion + * @hba: per adapter instance + */ +static void ufshcd_tmc_handler(struct ufs_hba *hba) +{ + u32 tm_doorbell; + + tm_doorbell = readl(hba->mmio_base + REG_UTP_TASK_REQ_DOOR_BELL); + hba->tm_condition = tm_doorbell ^ hba->outstanding_tasks; + wake_up_interruptible(&hba->ufshcd_tm_wait_queue); +} + +/** + * ufshcd_sl_intr - Interrupt service routine + * @hba: per adapter instance + * @intr_status: contains interrupts generated by the controller + */ +static void ufshcd_sl_intr(struct ufs_hba *hba, u32 intr_status) +{ + hba->errors = UFSHCD_ERROR_MASK & intr_status; + if (hba->errors) + ufshcd_err_handler(hba); + + if (intr_status & UIC_COMMAND_COMPL) + schedule_work(&hba->uic_workq); + + if (intr_status & UTP_TASK_REQ_COMPL) + ufshcd_tmc_handler(hba); + + if (intr_status & UTP_TRANSFER_REQ_COMPL) + ufshcd_transfer_req_compl(hba); +} + +/** + * ufshcd_intr - Main interrupt service routine + * @irq: irq number + * @__hba: pointer to adapter instance + * + * Returns IRQ_HANDLED - If interrupt is valid + * IRQ_NONE - If invalid interrupt + */ +static irqreturn_t ufshcd_intr(int irq, void *__hba) +{ + u32 intr_status; + irqreturn_t retval = IRQ_NONE; + struct ufs_hba *hba = __hba; + + spin_lock(hba->host->host_lock); + intr_status = readl(hba->mmio_base + REG_INTERRUPT_STATUS); + + if (intr_status) { + ufshcd_sl_intr(hba, intr_status); + + /* If UFSHCI 1.0 then clear interrupt status register */ + if (hba->ufs_version == UFSHCI_VERSION_10) + writel(intr_status, + (hba->mmio_base + REG_INTERRUPT_STATUS)); + retval = IRQ_HANDLED; + } + spin_unlock(hba->host->host_lock); + return retval; +} + +/** + * ufshcd_issue_tm_cmd - issues task management commands to controller + * @hba: per adapter instance + * @lrbp: pointer to local reference block + * + * Returns SUCCESS/FAILED + */ +static int +ufshcd_issue_tm_cmd(struct ufs_hba *hba, + struct ufshcd_lrb *lrbp, + u8 tm_function) +{ + struct utp_task_req_desc *task_req_descp; + struct utp_upiu_task_req *task_req_upiup; + struct Scsi_Host *host; + unsigned long flags; + int free_slot = 0; + int err; + + host = hba->host; + + spin_lock_irqsave(host->host_lock, flags); + + /* If task management queue is full */ + free_slot = ufshcd_get_tm_free_slot(hba); + if (free_slot >= hba->nutmrs) { + spin_unlock_irqrestore(host->host_lock, flags); + dev_err(&hba->pdev->dev, "Task management queue full\n"); + err = FAILED; + goto out; + } + + task_req_descp = hba->utmrdl_base_addr; + task_req_descp += free_slot; + + /* Configure task request descriptor */ + task_req_descp->header.dword_0 = cpu_to_le32(UTP_REQ_DESC_INT_CMD); + task_req_descp->header.dword_2 = + cpu_to_le32(OCS_INVALID_COMMAND_STATUS); + + /* Configure task request UPIU */ + task_req_upiup = + (struct utp_upiu_task_req *) task_req_descp->task_req_upiu; + task_req_upiup->header.dword_0 = + cpu_to_be32(UPIU_HEADER_DWORD(UPIU_TRANSACTION_TASK_REQ, 0, + lrbp->lun, lrbp->task_tag)); + task_req_upiup->header.dword_1 = + cpu_to_be32(UPIU_HEADER_DWORD(0, tm_function, 0, 0)); + + task_req_upiup->input_param1 = lrbp->lun; + task_req_upiup->input_param1 = + cpu_to_be32(task_req_upiup->input_param1); + task_req_upiup->input_param2 = lrbp->task_tag; + task_req_upiup->input_param2 = + cpu_to_be32(task_req_upiup->input_param2); + + /* send command to the controller */ + __set_bit(free_slot, &hba->outstanding_tasks); + writel((1 << free_slot), + (hba->mmio_base + REG_UTP_TASK_REQ_DOOR_BELL)); + + spin_unlock_irqrestore(host->host_lock, flags); + + /* wait until the task management command is completed */ + err = + wait_event_interruptible_timeout(hba->ufshcd_tm_wait_queue, + (test_bit(free_slot, + &hba->tm_condition) != 0), + 60 * HZ); + if (!err) { + dev_err(&hba->pdev->dev, + "Task management command timed-out\n"); + err = FAILED; + goto out; + } + clear_bit(free_slot, &hba->tm_condition); + return ufshcd_task_req_compl(hba, free_slot); +out: + return err; +} + +/** + * ufshcd_device_reset - reset device and abort all the pending commands + * @cmd: SCSI command pointer + * + * Returns SUCCESS/FAILED + */ +static int ufshcd_device_reset(struct scsi_cmnd *cmd) +{ + struct Scsi_Host *host; + struct ufs_hba *hba; + unsigned int tag; + u32 pos; + int err; + + host = cmd->device->host; + hba = shost_priv(host); + tag = cmd->request->tag; + + err = ufshcd_issue_tm_cmd(hba, &hba->lrb[tag], UFS_LOGICAL_RESET); + if (err) + goto out; + + for (pos = 0; pos < hba->nutrs; pos++) { + if (test_bit(pos, &hba->outstanding_reqs) && + (hba->lrb[tag].lun == hba->lrb[pos].lun)) { + + /* clear the respective UTRLCLR register bit */ + ufshcd_utrl_clear(hba, pos); + + clear_bit(pos, &hba->outstanding_reqs); + + if (hba->lrb[pos].cmd) { + scsi_dma_unmap(hba->lrb[pos].cmd); + hba->lrb[pos].cmd->result = + DID_ABORT << 16; + hba->lrb[pos].cmd->scsi_done(cmd); + hba->lrb[pos].cmd = NULL; + } + } + } /* end of for */ +out: + return err; +} + +/** + * ufshcd_host_reset - Main reset function registered with scsi layer + * @cmd: SCSI command pointer + * + * Returns SUCCESS/FAILED + */ +static int ufshcd_host_reset(struct scsi_cmnd *cmd) +{ + struct ufs_hba *hba; + + hba = shost_priv(cmd->device->host); + + if (hba->ufshcd_state == UFSHCD_STATE_RESET) + return SUCCESS; + + return (ufshcd_do_reset(hba) == SUCCESS) ? SUCCESS : FAILED; +} + +/** + * ufshcd_abort - abort a specific command + * @cmd: SCSI command pointer + * + * Returns SUCCESS/FAILED + */ +static int ufshcd_abort(struct scsi_cmnd *cmd) +{ + struct Scsi_Host *host; + struct ufs_hba *hba; + unsigned long flags; + unsigned int tag; + int err; + + host = cmd->device->host; + hba = shost_priv(host); + tag = cmd->request->tag; + + spin_lock_irqsave(host->host_lock, flags); + + /* check if command is still pending */ + if (!(test_bit(tag, &hba->outstanding_reqs))) { + err = FAILED; + spin_unlock_irqrestore(host->host_lock, flags); + goto out; + } + spin_unlock_irqrestore(host->host_lock, flags); + + err = ufshcd_issue_tm_cmd(hba, &hba->lrb[tag], UFS_ABORT_TASK); + if (err) + goto out; + + scsi_dma_unmap(cmd); + + spin_lock_irqsave(host->host_lock, flags); + + /* clear the respective UTRLCLR register bit */ + ufshcd_utrl_clear(hba, tag); + + __clear_bit(tag, &hba->outstanding_reqs); + hba->lrb[tag].cmd = NULL; + spin_unlock_irqrestore(host->host_lock, flags); +out: + return err; +} + +static struct scsi_host_template ufshcd_driver_template = { + .module = THIS_MODULE, + .name = UFSHCD, + .proc_name = UFSHCD, + .queuecommand = ufshcd_queuecommand, + .slave_alloc = ufshcd_slave_alloc, + .slave_destroy = ufshcd_slave_destroy, + .eh_abort_handler = ufshcd_abort, + .eh_device_reset_handler = ufshcd_device_reset, + .eh_host_reset_handler = ufshcd_host_reset, + .this_id = -1, + .sg_tablesize = SG_ALL, + .cmd_per_lun = UFSHCD_CMD_PER_LUN, + .can_queue = UFSHCD_CAN_QUEUE, +}; + +/** + * ufshcd_shutdown - main function to put the controller in reset state + * @pdev: pointer to PCI device handle + */ +static void ufshcd_shutdown(struct pci_dev *pdev) +{ + ufshcd_hba_stop((struct ufs_hba *)pci_get_drvdata(pdev)); +} + +#ifdef CONFIG_PM +/** + * ufshcd_suspend - suspend power management function + * @pdev: pointer to PCI device handle + * @state: power state + * + * Returns -ENOSYS + */ +static int ufshcd_suspend(struct pci_dev *pdev, pm_message_t state) +{ + /* + * TODO: + * 1. Block SCSI requests from SCSI midlayer + * 2. Change the internal driver state to non operational + * 3. Set UTRLRSR and UTMRLRSR bits to zero + * 4. Wait until outstanding commands are completed + * 5. Set HCE to zero to send the UFS host controller to reset state + */ + + return -ENOSYS; +} + +/** + * ufshcd_resume - resume power management function + * @pdev: pointer to PCI device handle + * + * Returns -ENOSYS + */ +static int ufshcd_resume(struct pci_dev *pdev) +{ + /* + * TODO: + * 1. Set HCE to 1, to start the UFS host controller + * initialization process + * 2. Set UTRLRSR and UTMRLRSR bits to 1 + * 3. Change the internal driver state to operational + * 4. Unblock SCSI requests from SCSI midlayer + */ + + return -ENOSYS; +} +#endif /* CONFIG_PM */ + +/** + * ufshcd_hba_free - free allocated memory for + * host memory space data structures + * @hba: per adapter instance + */ +static void ufshcd_hba_free(struct ufs_hba *hba) +{ + iounmap(hba->mmio_base); + ufshcd_free_hba_memory(hba); + pci_release_regions(hba->pdev); +} + +/** + * ufshcd_remove - de-allocate PCI/SCSI host and host memory space + * data structure memory + * @pdev - pointer to PCI handle + */ +static void ufshcd_remove(struct pci_dev *pdev) +{ + struct ufs_hba *hba = pci_get_drvdata(pdev); + + /* disable interrupts */ + ufshcd_int_config(hba, UFSHCD_INT_DISABLE); + free_irq(pdev->irq, hba); + + ufshcd_hba_stop(hba); + ufshcd_hba_free(hba); + + scsi_remove_host(hba->host); + scsi_host_put(hba->host); + pci_set_drvdata(pdev, NULL); + pci_clear_master(pdev); + pci_disable_device(pdev); +} + +/** + * ufshcd_set_dma_mask - Set dma mask based on the controller + * addressing capability + * @pdev: PCI device structure + * + * Returns 0 for success, non-zero for failure + */ +static int ufshcd_set_dma_mask(struct ufs_hba *hba) +{ + int err; + u64 dma_mask; + + /* + * If controller supports 64 bit addressing mode, then set the DMA + * mask to 64-bit, else set the DMA mask to 32-bit + */ + if (hba->capabilities & MASK_64_ADDRESSING_SUPPORT) + dma_mask = DMA_BIT_MASK(64); + else + dma_mask = DMA_BIT_MASK(32); + + err = pci_set_dma_mask(hba->pdev, dma_mask); + if (err) + return err; + + err = pci_set_consistent_dma_mask(hba->pdev, dma_mask); + + return err; +} + +/** + * ufshcd_probe - probe routine of the driver + * @pdev: pointer to PCI device handle + * @id: PCI device id + * + * Returns 0 on success, non-zero value on failure + */ +static int __devinit +ufshcd_probe(struct pci_dev *pdev, const struct pci_device_id *id) +{ + struct Scsi_Host *host; + struct ufs_hba *hba; + int err; + + err = pci_enable_device(pdev); + if (err) { + dev_err(&pdev->dev, "pci_enable_device failed\n"); + goto out_error; + } + + pci_set_master(pdev); + + host = scsi_host_alloc(&ufshcd_driver_template, + sizeof(struct ufs_hba)); + if (!host) { + dev_err(&pdev->dev, "scsi_host_alloc failed\n"); + err = -ENOMEM; + goto out_disable; + } + hba = shost_priv(host); + + err = pci_request_regions(pdev, UFSHCD); + if (err < 0) { + dev_err(&pdev->dev, "request regions failed\n"); + goto out_disable; + } + + hba->mmio_base = pci_ioremap_bar(pdev, 0); + if (!hba->mmio_base) { + dev_err(&pdev->dev, "memory map failed\n"); + err = -ENOMEM; + goto out_release_regions; + } + + hba->host = host; + hba->pdev = pdev; + + /* Read capabilities registers */ + ufshcd_hba_capabilities(hba); + + /* Get UFS version supported by the controller */ + hba->ufs_version = ufshcd_get_ufs_version(hba); + + err = ufshcd_set_dma_mask(hba); + if (err) { + dev_err(&pdev->dev, "set dma mask failed\n"); + goto out_iounmap; + } + + /* Allocate memory for host memory space */ + err = ufshcd_memory_alloc(hba); + if (err) { + dev_err(&pdev->dev, "Memory allocation failed\n"); + goto out_iounmap; + } + + /* Configure LRB */ + ufshcd_host_memory_configure(hba); + + host->can_queue = hba->nutrs; + host->cmd_per_lun = hba->nutrs; + host->max_id = UFSHCD_MAX_ID; + host->max_lun = UFSHCD_MAX_LUNS; + host->max_channel = UFSHCD_MAX_CHANNEL; + host->unique_id = host->host_no; + host->max_cmd_len = MAX_CDB_SIZE; + + /* Initailize wait queue for task management */ + init_waitqueue_head(&hba->ufshcd_tm_wait_queue); + + /* Initialize work queues */ + INIT_WORK(&hba->uic_workq, ufshcd_uic_cc_handler); + INIT_WORK(&hba->feh_workq, ufshcd_fatal_err_handler); + + /* IRQ registration */ + err = request_irq(pdev->irq, ufshcd_intr, IRQF_SHARED, UFSHCD, hba); + if (err) { + dev_err(&pdev->dev, "request irq failed\n"); + goto out_lrb_free; + } + + /* Enable SCSI tag mapping */ + err = scsi_init_shared_tag_map(host, host->can_queue); + if (err) { + dev_err(&pdev->dev, "init shared queue failed\n"); + goto out_free_irq; + } + + pci_set_drvdata(pdev, hba); + + err = scsi_add_host(host, &pdev->dev); + if (err) { + dev_err(&pdev->dev, "scsi_add_host failed\n"); + goto out_free_irq; + } + + /* Initialization routine */ + err = ufshcd_initialize_hba(hba); + if (err) { + dev_err(&pdev->dev, "Initialization failed\n"); + goto out_free_irq; + } + + return 0; + +out_free_irq: + free_irq(pdev->irq, hba); +out_lrb_free: + ufshcd_free_hba_memory(hba); +out_iounmap: + iounmap(hba->mmio_base); +out_release_regions: + pci_release_regions(pdev); +out_disable: + scsi_host_put(host); + pci_clear_master(pdev); + pci_disable_device(pdev); +out_error: + return err; +} + +static DEFINE_PCI_DEVICE_TABLE(ufshcd_pci_tbl) = { + { PCI_VENDOR_ID_SAMSUNG, 0xC00C, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 }, + { } /* terminate list */ +}; + +MODULE_DEVICE_TABLE(pci, ufshcd_pci_tbl); + +static struct pci_driver ufshcd_pci_driver = { + .name = UFSHCD, + .id_table = ufshcd_pci_tbl, + .probe = ufshcd_probe, + .remove = __devexit_p(ufshcd_remove), + .shutdown = ufshcd_shutdown, +#ifdef CONFIG_PM + .suspend = ufshcd_suspend, + .resume = ufshcd_resume, +#endif +}; + +/** + * ufshcd_init - Driver registration routine + */ +static int __init ufshcd_init(void) +{ + return pci_register_driver(&ufshcd_pci_driver); +} +module_init(ufshcd_init); + +/** + * ufshcd_exit - Driver exit clean-up routine + */ +static void __exit ufshcd_exit(void) +{ + pci_unregister_driver(&ufshcd_pci_driver); +} +module_exit(ufshcd_exit); + + +MODULE_AUTHOR("Santosh Yaragnavi , " + "Vinayak Holikatti "); +MODULE_DESCRIPTION("Generic UFS host controller driver"); +MODULE_LICENSE("GPL"); +MODULE_VERSION(UFSHCD_DRIVER_VERSION); diff --git a/drivers/scsi/ufs/ufshci.h b/drivers/scsi/ufs/ufshci.h new file mode 100644 index 000000000000..6e3510f71167 --- /dev/null +++ b/drivers/scsi/ufs/ufshci.h @@ -0,0 +1,376 @@ +/* + * Universal Flash Storage Host controller driver + * + * This code is based on drivers/scsi/ufs/ufshci.h + * Copyright (C) 2011-2012 Samsung India Software Operations + * + * Santosh Yaraganavi + * Vinayak Holikatti + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * NO WARRANTY + * THE PROGRAM IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OR + * CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED INCLUDING, WITHOUT + * LIMITATION, ANY WARRANTIES OR CONDITIONS OF TITLE, NON-INFRINGEMENT, + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Each Recipient is + * solely responsible for determining the appropriateness of using and + * distributing the Program and assumes all risks associated with its + * exercise of rights under this Agreement, including but not limited to + * the risks and costs of program errors, damage to or loss of data, + * programs or equipment, and unavailability or interruption of operations. + + * DISCLAIMER OF LIABILITY + * NEITHER RECIPIENT NOR ANY CONTRIBUTORS SHALL HAVE ANY LIABILITY FOR ANY + * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING WITHOUT LIMITATION LOST PROFITS), HOWEVER CAUSED AND + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR + * TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE + * USE OR DISTRIBUTION OF THE PROGRAM OR THE EXERCISE OF ANY RIGHTS GRANTED + * HEREUNDER, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES + + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, + * USA. + */ + +#ifndef _UFSHCI_H +#define _UFSHCI_H + +enum { + TASK_REQ_UPIU_SIZE_DWORDS = 8, + TASK_RSP_UPIU_SIZE_DWORDS = 8, + ALIGNED_UPIU_SIZE = 128, +}; + +/* UFSHCI Registers */ +enum { + REG_CONTROLLER_CAPABILITIES = 0x00, + REG_UFS_VERSION = 0x08, + REG_CONTROLLER_DEV_ID = 0x10, + REG_CONTROLLER_PROD_ID = 0x14, + REG_INTERRUPT_STATUS = 0x20, + REG_INTERRUPT_ENABLE = 0x24, + REG_CONTROLLER_STATUS = 0x30, + REG_CONTROLLER_ENABLE = 0x34, + REG_UIC_ERROR_CODE_PHY_ADAPTER_LAYER = 0x38, + REG_UIC_ERROR_CODE_DATA_LINK_LAYER = 0x3C, + REG_UIC_ERROR_CODE_NETWORK_LAYER = 0x40, + REG_UIC_ERROR_CODE_TRANSPORT_LAYER = 0x44, + REG_UIC_ERROR_CODE_DME = 0x48, + REG_UTP_TRANSFER_REQ_INT_AGG_CONTROL = 0x4C, + REG_UTP_TRANSFER_REQ_LIST_BASE_L = 0x50, + REG_UTP_TRANSFER_REQ_LIST_BASE_H = 0x54, + REG_UTP_TRANSFER_REQ_DOOR_BELL = 0x58, + REG_UTP_TRANSFER_REQ_LIST_CLEAR = 0x5C, + REG_UTP_TRANSFER_REQ_LIST_RUN_STOP = 0x60, + REG_UTP_TASK_REQ_LIST_BASE_L = 0x70, + REG_UTP_TASK_REQ_LIST_BASE_H = 0x74, + REG_UTP_TASK_REQ_DOOR_BELL = 0x78, + REG_UTP_TASK_REQ_LIST_CLEAR = 0x7C, + REG_UTP_TASK_REQ_LIST_RUN_STOP = 0x80, + REG_UIC_COMMAND = 0x90, + REG_UIC_COMMAND_ARG_1 = 0x94, + REG_UIC_COMMAND_ARG_2 = 0x98, + REG_UIC_COMMAND_ARG_3 = 0x9C, +}; + +/* Controller capability masks */ +enum { + MASK_TRANSFER_REQUESTS_SLOTS = 0x0000001F, + MASK_TASK_MANAGEMENT_REQUEST_SLOTS = 0x00070000, + MASK_64_ADDRESSING_SUPPORT = 0x01000000, + MASK_OUT_OF_ORDER_DATA_DELIVERY_SUPPORT = 0x02000000, + MASK_UIC_DME_TEST_MODE_SUPPORT = 0x04000000, +}; + +/* UFS Version 08h */ +#define MINOR_VERSION_NUM_MASK UFS_MASK(0xFFFF, 0) +#define MAJOR_VERSION_NUM_MASK UFS_MASK(0xFFFF, 16) + +/* Controller UFSHCI version */ +enum { + UFSHCI_VERSION_10 = 0x00010000, + UFSHCI_VERSION_11 = 0x00010100, +}; + +/* + * HCDDID - Host Controller Identification Descriptor + * - Device ID and Device Class 10h + */ +#define DEVICE_CLASS UFS_MASK(0xFFFF, 0) +#define DEVICE_ID UFS_MASK(0xFF, 24) + +/* + * HCPMID - Host Controller Identification Descriptor + * - Product/Manufacturer ID 14h + */ +#define MANUFACTURE_ID_MASK UFS_MASK(0xFFFF, 0) +#define PRODUCT_ID_MASK UFS_MASK(0xFFFF, 16) + +#define UFS_BIT(x) (1L << (x)) + +#define UTP_TRANSFER_REQ_COMPL UFS_BIT(0) +#define UIC_DME_END_PT_RESET UFS_BIT(1) +#define UIC_ERROR UFS_BIT(2) +#define UIC_TEST_MODE UFS_BIT(3) +#define UIC_POWER_MODE UFS_BIT(4) +#define UIC_HIBERNATE_EXIT UFS_BIT(5) +#define UIC_HIBERNATE_ENTER UFS_BIT(6) +#define UIC_LINK_LOST UFS_BIT(7) +#define UIC_LINK_STARTUP UFS_BIT(8) +#define UTP_TASK_REQ_COMPL UFS_BIT(9) +#define UIC_COMMAND_COMPL UFS_BIT(10) +#define DEVICE_FATAL_ERROR UFS_BIT(11) +#define CONTROLLER_FATAL_ERROR UFS_BIT(16) +#define SYSTEM_BUS_FATAL_ERROR UFS_BIT(17) + +#define UFSHCD_ERROR_MASK (UIC_ERROR |\ + DEVICE_FATAL_ERROR |\ + CONTROLLER_FATAL_ERROR |\ + SYSTEM_BUS_FATAL_ERROR) + +#define INT_FATAL_ERRORS (DEVICE_FATAL_ERROR |\ + CONTROLLER_FATAL_ERROR |\ + SYSTEM_BUS_FATAL_ERROR) + +/* HCS - Host Controller Status 30h */ +#define DEVICE_PRESENT UFS_BIT(0) +#define UTP_TRANSFER_REQ_LIST_READY UFS_BIT(1) +#define UTP_TASK_REQ_LIST_READY UFS_BIT(2) +#define UIC_COMMAND_READY UFS_BIT(3) +#define HOST_ERROR_INDICATOR UFS_BIT(4) +#define DEVICE_ERROR_INDICATOR UFS_BIT(5) +#define UIC_POWER_MODE_CHANGE_REQ_STATUS_MASK UFS_MASK(0x7, 8) + +/* HCE - Host Controller Enable 34h */ +#define CONTROLLER_ENABLE UFS_BIT(0) +#define CONTROLLER_DISABLE 0x0 + +/* UECPA - Host UIC Error Code PHY Adapter Layer 38h */ +#define UIC_PHY_ADAPTER_LAYER_ERROR UFS_BIT(31) +#define UIC_PHY_ADAPTER_LAYER_ERROR_CODE_MASK 0x1F + +/* UECDL - Host UIC Error Code Data Link Layer 3Ch */ +#define UIC_DATA_LINK_LAYER_ERROR UFS_BIT(31) +#define UIC_DATA_LINK_LAYER_ERROR_CODE_MASK 0x7FFF +#define UIC_DATA_LINK_LAYER_ERROR_PA_INIT 0x2000 + +/* UECN - Host UIC Error Code Network Layer 40h */ +#define UIC_NETWORK_LAYER_ERROR UFS_BIT(31) +#define UIC_NETWORK_LAYER_ERROR_CODE_MASK 0x7 + +/* UECT - Host UIC Error Code Transport Layer 44h */ +#define UIC_TRANSPORT_LAYER_ERROR UFS_BIT(31) +#define UIC_TRANSPORT_LAYER_ERROR_CODE_MASK 0x7F + +/* UECDME - Host UIC Error Code DME 48h */ +#define UIC_DME_ERROR UFS_BIT(31) +#define UIC_DME_ERROR_CODE_MASK 0x1 + +#define INT_AGGR_TIMEOUT_VAL_MASK 0xFF +#define INT_AGGR_COUNTER_THRESHOLD_MASK UFS_MASK(0x1F, 8) +#define INT_AGGR_COUNTER_AND_TIMER_RESET UFS_BIT(16) +#define INT_AGGR_STATUS_BIT UFS_BIT(20) +#define INT_AGGR_PARAM_WRITE UFS_BIT(24) +#define INT_AGGR_ENABLE UFS_BIT(31) + +/* UTRLRSR - UTP Transfer Request Run-Stop Register 60h */ +#define UTP_TRANSFER_REQ_LIST_RUN_STOP_BIT UFS_BIT(0) + +/* UTMRLRSR - UTP Task Management Request Run-Stop Register 80h */ +#define UTP_TASK_REQ_LIST_RUN_STOP_BIT UFS_BIT(0) + +/* UICCMD - UIC Command */ +#define COMMAND_OPCODE_MASK 0xFF +#define GEN_SELECTOR_INDEX_MASK 0xFFFF + +#define MIB_ATTRIBUTE_MASK UFS_MASK(0xFFFF, 16) +#define RESET_LEVEL 0xFF + +#define ATTR_SET_TYPE_MASK UFS_MASK(0xFF, 16) +#define CONFIG_RESULT_CODE_MASK 0xFF +#define GENERIC_ERROR_CODE_MASK 0xFF + +/* UIC Commands */ +enum { + UIC_CMD_DME_GET = 0x01, + UIC_CMD_DME_SET = 0x02, + UIC_CMD_DME_PEER_GET = 0x03, + UIC_CMD_DME_PEER_SET = 0x04, + UIC_CMD_DME_POWERON = 0x10, + UIC_CMD_DME_POWEROFF = 0x11, + UIC_CMD_DME_ENABLE = 0x12, + UIC_CMD_DME_RESET = 0x14, + UIC_CMD_DME_END_PT_RST = 0x15, + UIC_CMD_DME_LINK_STARTUP = 0x16, + UIC_CMD_DME_HIBER_ENTER = 0x17, + UIC_CMD_DME_HIBER_EXIT = 0x18, + UIC_CMD_DME_TEST_MODE = 0x1A, +}; + +/* UIC Config result code / Generic error code */ +enum { + UIC_CMD_RESULT_SUCCESS = 0x00, + UIC_CMD_RESULT_INVALID_ATTR = 0x01, + UIC_CMD_RESULT_FAILURE = 0x01, + UIC_CMD_RESULT_INVALID_ATTR_VALUE = 0x02, + UIC_CMD_RESULT_READ_ONLY_ATTR = 0x03, + UIC_CMD_RESULT_WRITE_ONLY_ATTR = 0x04, + UIC_CMD_RESULT_BAD_INDEX = 0x05, + UIC_CMD_RESULT_LOCKED_ATTR = 0x06, + UIC_CMD_RESULT_BAD_TEST_FEATURE_INDEX = 0x07, + UIC_CMD_RESULT_PEER_COMM_FAILURE = 0x08, + UIC_CMD_RESULT_BUSY = 0x09, + UIC_CMD_RESULT_DME_FAILURE = 0x0A, +}; + +#define MASK_UIC_COMMAND_RESULT 0xFF + +#define INT_AGGR_COUNTER_THRESHOLD_VALUE (0x1F << 8) +#define INT_AGGR_TIMEOUT_VALUE (0x02) + +/* Interrupt disable masks */ +enum { + /* Interrupt disable mask for UFSHCI v1.0 */ + INTERRUPT_DISABLE_MASK_10 = 0xFFFF, + + /* Interrupt disable mask for UFSHCI v1.1 */ + INTERRUPT_DISABLE_MASK_11 = 0x0, +}; + +/* + * Request Descriptor Definitions + */ + +/* Transfer request command type */ +enum { + UTP_CMD_TYPE_SCSI = 0x0, + UTP_CMD_TYPE_UFS = 0x1, + UTP_CMD_TYPE_DEV_MANAGE = 0x2, +}; + +enum { + UTP_SCSI_COMMAND = 0x00000000, + UTP_NATIVE_UFS_COMMAND = 0x10000000, + UTP_DEVICE_MANAGEMENT_FUNCTION = 0x20000000, + UTP_REQ_DESC_INT_CMD = 0x01000000, +}; + +/* UTP Transfer Request Data Direction (DD) */ +enum { + UTP_NO_DATA_TRANSFER = 0x00000000, + UTP_HOST_TO_DEVICE = 0x02000000, + UTP_DEVICE_TO_HOST = 0x04000000, +}; + +/* Overall command status values */ +enum { + OCS_SUCCESS = 0x0, + OCS_INVALID_CMD_TABLE_ATTR = 0x1, + OCS_INVALID_PRDT_ATTR = 0x2, + OCS_MISMATCH_DATA_BUF_SIZE = 0x3, + OCS_MISMATCH_RESP_UPIU_SIZE = 0x4, + OCS_PEER_COMM_FAILURE = 0x5, + OCS_ABORTED = 0x6, + OCS_FATAL_ERROR = 0x7, + OCS_INVALID_COMMAND_STATUS = 0x0F, + MASK_OCS = 0x0F, +}; + +/** + * struct ufshcd_sg_entry - UFSHCI PRD Entry + * @base_addr: Lower 32bit physical address DW-0 + * @upper_addr: Upper 32bit physical address DW-1 + * @reserved: Reserved for future use DW-2 + * @size: size of physical segment DW-3 + */ +struct ufshcd_sg_entry { + u32 base_addr; + u32 upper_addr; + u32 reserved; + u32 size; +}; + +/** + * struct utp_transfer_cmd_desc - UFS Command Descriptor structure + * @command_upiu: Command UPIU Frame address + * @response_upiu: Response UPIU Frame address + * @prd_table: Physical Region Descriptor + */ +struct utp_transfer_cmd_desc { + u8 command_upiu[ALIGNED_UPIU_SIZE]; + u8 response_upiu[ALIGNED_UPIU_SIZE]; + struct ufshcd_sg_entry prd_table[SG_ALL]; +}; + +/** + * struct request_desc_header - Descriptor Header common to both UTRD and UTMRD + * @dword0: Descriptor Header DW0 + * @dword1: Descriptor Header DW1 + * @dword2: Descriptor Header DW2 + * @dword3: Descriptor Header DW3 + */ +struct request_desc_header { + u32 dword_0; + u32 dword_1; + u32 dword_2; + u32 dword_3; +}; + +/** + * struct utp_transfer_req_desc - UTRD structure + * @header: UTRD header DW-0 to DW-3 + * @command_desc_base_addr_lo: UCD base address low DW-4 + * @command_desc_base_addr_hi: UCD base address high DW-5 + * @response_upiu_length: response UPIU length DW-6 + * @response_upiu_offset: response UPIU offset DW-6 + * @prd_table_length: Physical region descriptor length DW-7 + * @prd_table_offset: Physical region descriptor offset DW-7 + */ +struct utp_transfer_req_desc { + + /* DW 0-3 */ + struct request_desc_header header; + + /* DW 4-5*/ + u32 command_desc_base_addr_lo; + u32 command_desc_base_addr_hi; + + /* DW 6 */ + u16 response_upiu_length; + u16 response_upiu_offset; + + /* DW 7 */ + u16 prd_table_length; + u16 prd_table_offset; +}; + +/** + * struct utp_task_req_desc - UTMRD structure + * @header: UTMRD header DW-0 to DW-3 + * @task_req_upiu: Pointer to task request UPIU DW-4 to DW-11 + * @task_rsp_upiu: Pointer to task response UPIU DW12 to DW-19 + */ +struct utp_task_req_desc { + + /* DW 0-3 */ + struct request_desc_header header; + + /* DW 4-11 */ + u32 task_req_upiu[TASK_REQ_UPIU_SIZE_DWORDS]; + + /* DW 12-19 */ + u32 task_rsp_upiu[TASK_RSP_UPIU_SIZE_DWORDS]; +}; + +#endif /* End of Header */ -- cgit v1.2.3 From c743e44fbb1f8668941e83de07662b1ecd33d083 Mon Sep 17 00:00:00 2001 From: Lee Duncan Date: Thu, 1 Mar 2012 12:41:01 -0800 Subject: [SCSI] st: expand ability to write immediate filemarks The st tape driver recently added the MTWEOFI ioctl, which writes a tape filemark (EOF), like the MTWEOF ioctl, except that MTWEOFI returns immediately. This makes certain applications, like backup software, run much more quickly on buffered tape drives. Since legacy applications do not know about this new MTWEOFI ioctl, this patch adds a new ioctl option that tells the st driver to return immediately when writing an EOF (i.e. a filemark). This new flag is much like the existing flag that tells the st driver to perform writes (and certain other IOs) immediately, but this new flag only applies to writing EOFs. This new feature is controlled via the MTSETDRVBUFFER ioctl, using the newly-defined MT_ST_NOWAIT_EOF flag. Use of this new feature is displayed via the sysfs tape "options" attribute. The st documentation was updated to mention this new flag, as well as the problems that can occur from using it. Signed-off-by: Lee Duncan Acked-by: Kai Makisara Signed-off-by: James Bottomley --- Documentation/scsi/st.txt | 4 ++++ drivers/scsi/st.c | 21 ++++++++++++++++++--- drivers/scsi/st.h | 1 + include/linux/mtio.h | 1 + 4 files changed, 24 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/scsi/st.txt b/Documentation/scsi/st.txt index 691ca292c24d..685bf3582abe 100644 --- a/Documentation/scsi/st.txt +++ b/Documentation/scsi/st.txt @@ -390,6 +390,10 @@ MTSETDRVBUFFER MT_ST_SYSV sets the SYSV semantics (mode) MT_ST_NOWAIT enables immediate mode (i.e., don't wait for the command to finish) for some commands (e.g., rewind) + MT_ST_NOWAIT_EOF enables immediate filemark mode (i.e. when + writing a filemark, don't wait for it to complete). Please + see the BASICS note about MTWEOFI with respect to the + possible dangers of writing immediate filemarks. MT_ST_SILI enables setting the SILI bit in SCSI commands when reading in variable block mode to enhance performance when reading blocks shorter than the byte count; set this only diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c index 9262cdfa4b23..8c663e16fd02 100644 --- a/drivers/scsi/st.c +++ b/drivers/scsi/st.c @@ -1106,6 +1106,12 @@ static int check_tape(struct scsi_tape *STp, struct file *filp) STp->drv_buffer)); } STp->drv_write_prot = ((STp->buffer)->b_data[2] & 0x80) != 0; + if (!STp->drv_buffer && STp->immediate_filemark) { + printk(KERN_WARNING + "%s: non-buffered tape: disabling writing immediate filemarks\n", + name); + STp->immediate_filemark = 0; + } } st_release_request(SRpnt); SRpnt = NULL; @@ -1314,6 +1320,8 @@ static int st_flush(struct file *filp, fl_owner_t id) memset(cmd, 0, MAX_COMMAND_SIZE); cmd[0] = WRITE_FILEMARKS; + if (STp->immediate_filemark) + cmd[1] = 1; cmd[4] = 1 + STp->two_fm; SRpnt = st_do_scsi(NULL, STp, cmd, 0, DMA_NONE, @@ -2181,8 +2189,9 @@ static void st_log_options(struct scsi_tape * STp, struct st_modedef * STm, char name, STm->defaults_for_writes, STp->omit_blklims, STp->can_partitions, STp->scsi2_logical); printk(KERN_INFO - "%s: sysv: %d nowait: %d sili: %d\n", name, STm->sysv, STp->immediate, - STp->sili); + "%s: sysv: %d nowait: %d sili: %d nowait_filemark: %d\n", + name, STm->sysv, STp->immediate, STp->sili, + STp->immediate_filemark); printk(KERN_INFO "%s: debugging: %d\n", name, debugging); } @@ -2224,6 +2233,7 @@ static int st_set_options(struct scsi_tape *STp, long options) STp->can_partitions = (options & MT_ST_CAN_PARTITIONS) != 0; STp->scsi2_logical = (options & MT_ST_SCSI2LOGICAL) != 0; STp->immediate = (options & MT_ST_NOWAIT) != 0; + STp->immediate_filemark = (options & MT_ST_NOWAIT_EOF) != 0; STm->sysv = (options & MT_ST_SYSV) != 0; STp->sili = (options & MT_ST_SILI) != 0; DEB( debugging = (options & MT_ST_DEBUGGING) != 0; @@ -2255,6 +2265,8 @@ static int st_set_options(struct scsi_tape *STp, long options) STp->scsi2_logical = value; if ((options & MT_ST_NOWAIT) != 0) STp->immediate = value; + if ((options & MT_ST_NOWAIT_EOF) != 0) + STp->immediate_filemark = value; if ((options & MT_ST_SYSV) != 0) STm->sysv = value; if ((options & MT_ST_SILI) != 0) @@ -2714,7 +2726,8 @@ static int st_int_ioctl(struct scsi_tape *STp, unsigned int cmd_in, unsigned lon cmd[0] = WRITE_FILEMARKS; if (cmd_in == MTWSM) cmd[1] = 2; - if (cmd_in == MTWEOFI) + if (cmd_in == MTWEOFI || + (cmd_in == MTWEOF && STp->immediate_filemark)) cmd[1] |= 1; cmd[2] = (arg >> 16); cmd[3] = (arg >> 8); @@ -4093,6 +4106,7 @@ static int st_probe(struct device *dev) tpnt->scsi2_logical = ST_SCSI2LOGICAL; tpnt->sili = ST_SILI; tpnt->immediate = ST_NOWAIT; + tpnt->immediate_filemark = 0; tpnt->default_drvbuffer = 0xff; /* No forced buffering */ tpnt->partition = 0; tpnt->new_partition = 0; @@ -4478,6 +4492,7 @@ st_options_show(struct device *dev, struct device_attribute *attr, char *buf) options |= STp->scsi2_logical ? MT_ST_SCSI2LOGICAL : 0; options |= STm->sysv ? MT_ST_SYSV : 0; options |= STp->immediate ? MT_ST_NOWAIT : 0; + options |= STp->immediate_filemark ? MT_ST_NOWAIT_EOF : 0; options |= STp->sili ? MT_ST_SILI : 0; l = snprintf(buf, PAGE_SIZE, "0x%08x\n", options); diff --git a/drivers/scsi/st.h b/drivers/scsi/st.h index f91a67c6d968..ea35632b986c 100644 --- a/drivers/scsi/st.h +++ b/drivers/scsi/st.h @@ -120,6 +120,7 @@ struct scsi_tape { unsigned char c_algo; /* compression algorithm */ unsigned char pos_unknown; /* after reset position unknown */ unsigned char sili; /* use SILI when reading in variable b mode */ + unsigned char immediate_filemark; /* write filemark immediately */ int tape_type; int long_timeout; /* timeout for commands known to take long time */ diff --git a/include/linux/mtio.h b/include/linux/mtio.h index 8f825756c459..18543e2db06f 100644 --- a/include/linux/mtio.h +++ b/include/linux/mtio.h @@ -194,6 +194,7 @@ struct mtpos { #define MT_ST_SYSV 0x1000 #define MT_ST_NOWAIT 0x2000 #define MT_ST_SILI 0x4000 +#define MT_ST_NOWAIT_EOF 0x8000 /* The mode parameters to be controlled. Parameter chosen with bits 20-28 */ #define MT_ST_CLEAR_DEFAULT 0xfffff -- cgit v1.2.3 From 46856a68dcb5f67c779d211fd6bcb5d7a2a7f19b Mon Sep 17 00:00:00 2001 From: Rajendra Nayak Date: Mon, 12 Mar 2012 20:32:37 +0530 Subject: mmc: omap_hsmmc: Convert hsmmc driver to use device tree Define dt bindings for the ti-omap-hsmmc, and adapt the driver to extract data (which was earlier passed as platform_data) from device tree. Signed-off-by: Rajendra Nayak Acked-by: Rob Herring Signed-off-by: Chris Ball --- .../devicetree/bindings/mmc/ti-omap-hsmmc.txt | 33 ++++++++++ drivers/mmc/host/omap_hsmmc.c | 73 ++++++++++++++++++++++ 2 files changed, 106 insertions(+) create mode 100644 Documentation/devicetree/bindings/mmc/ti-omap-hsmmc.txt (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/mmc/ti-omap-hsmmc.txt b/Documentation/devicetree/bindings/mmc/ti-omap-hsmmc.txt new file mode 100644 index 000000000000..dbd4368ab8cc --- /dev/null +++ b/Documentation/devicetree/bindings/mmc/ti-omap-hsmmc.txt @@ -0,0 +1,33 @@ +* TI Highspeed MMC host controller for OMAP + +The Highspeed MMC Host Controller on TI OMAP family +provides an interface for MMC, SD, and SDIO types of memory cards. + +Required properties: +- compatible: + Should be "ti,omap2-hsmmc", for OMAP2 controllers + Should be "ti,omap3-hsmmc", for OMAP3 controllers + Should be "ti,omap4-hsmmc", for OMAP4 controllers +- ti,hwmods: Must be "mmc", n is controller instance starting 1 +- reg : should contain hsmmc registers location and length + +Optional properties: +ti,dual-volt: boolean, supports dual voltage cards +-supply: phandle to the regulator device tree node +"supply-name" examples are "vmmc", "vmmc_aux" etc +ti,bus-width: Number of data lines, default assumed is 1 if the property is missing. +cd-gpios: GPIOs for card detection +wp-gpios: GPIOs for write protection +ti,non-removable: non-removable slot (like eMMC) +ti,needs-special-reset: Requires a special softreset sequence + +Example: + mmc1: mmc@0x4809c000 { + compatible = "ti,omap4-hsmmc"; + reg = <0x4809c000 0x400>; + ti,hwmods = "mmc1"; + ti,dual-volt; + ti,bus-width = <4>; + vmmc-supply = <&vmmc>; /* phandle to regulator node */ + ti,non-removable; + }; diff --git a/drivers/mmc/host/omap_hsmmc.c b/drivers/mmc/host/omap_hsmmc.c index 0306aea329d4..e0808d4a7681 100644 --- a/drivers/mmc/host/omap_hsmmc.c +++ b/drivers/mmc/host/omap_hsmmc.c @@ -26,6 +26,9 @@ #include #include #include +#include +#include +#include #include #include #include @@ -1710,6 +1713,65 @@ static void omap_hsmmc_debugfs(struct mmc_host *mmc) #endif +#ifdef CONFIG_OF +static u16 omap4_reg_offset = 0x100; + +static const struct of_device_id omap_mmc_of_match[] = { + { + .compatible = "ti,omap2-hsmmc", + }, + { + .compatible = "ti,omap3-hsmmc", + }, + { + .compatible = "ti,omap4-hsmmc", + .data = &omap4_reg_offset, + }, + {}, +} +MODULE_DEVICE_TABLE(of, omap_mmc_of_match); + +static struct omap_mmc_platform_data *of_get_hsmmc_pdata(struct device *dev) +{ + struct omap_mmc_platform_data *pdata; + struct device_node *np = dev->of_node; + u32 bus_width; + + pdata = devm_kzalloc(dev, sizeof(*pdata), GFP_KERNEL); + if (!pdata) + return NULL; /* out of memory */ + + if (of_find_property(np, "ti,dual-volt", NULL)) + pdata->controller_flags |= OMAP_HSMMC_SUPPORTS_DUAL_VOLT; + + /* This driver only supports 1 slot */ + pdata->nr_slots = 1; + pdata->slots[0].switch_pin = of_get_named_gpio(np, "cd-gpios", 0); + pdata->slots[0].gpio_wp = of_get_named_gpio(np, "wp-gpios", 0); + + if (of_find_property(np, "ti,non-removable", NULL)) { + pdata->slots[0].nonremovable = true; + pdata->slots[0].no_regulator_off_init = true; + } + of_property_read_u32(np, "ti,bus-width", &bus_width); + if (bus_width == 4) + pdata->slots[0].caps |= MMC_CAP_4_BIT_DATA; + else if (bus_width == 8) + pdata->slots[0].caps |= MMC_CAP_8_BIT_DATA; + + if (of_find_property(np, "ti,needs-special-reset", NULL)) + pdata->slots[0].features |= HSMMC_HAS_UPDATED_RESET; + + return pdata; +} +#else +static inline struct omap_mmc_platform_data + *of_get_hsmmc_pdata(struct device *dev) +{ + return NULL; +} +#endif + static int __init omap_hsmmc_probe(struct platform_device *pdev) { struct omap_mmc_platform_data *pdata = pdev->dev.platform_data; @@ -1717,6 +1779,16 @@ static int __init omap_hsmmc_probe(struct platform_device *pdev) struct omap_hsmmc_host *host = NULL; struct resource *res; int ret, irq; + const struct of_device_id *match; + + match = of_match_device(of_match_ptr(omap_mmc_of_match), &pdev->dev); + if (match) { + pdata = of_get_hsmmc_pdata(&pdev->dev); + if (match->data) { + u16 *offsetp = match->data; + pdata->reg_offset = *offsetp; + } + } if (pdata == NULL) { dev_err(&pdev->dev, "Platform Data is missing\n"); @@ -2122,6 +2194,7 @@ static struct platform_driver omap_hsmmc_driver = { .name = DRIVER_NAME, .owner = THIS_MODULE, .pm = &omap_hsmmc_dev_pm_ops, + .of_match_table = of_match_ptr(omap_mmc_of_match), }, }; -- cgit v1.2.3 From 5ba927e8ca3f73acb98f417d126652e26ab40a57 Mon Sep 17 00:00:00 2001 From: Wolfram Sang Date: Mon, 23 Jan 2012 20:17:38 +0100 Subject: watchdog: documentation: remove index-file The 00-index file in the watchdog directory is, like many others, outdated (conversion-howto is missing) and doesn't contain worthwhile additional information. As it seems to be a maintenance burden without much gain, simply remove it. Signed-off-by: Wolfram Sang Signed-off-by: Wim Van Sebroeck --- Documentation/watchdog/00-INDEX | 19 ------------------- 1 file changed, 19 deletions(-) delete mode 100644 Documentation/watchdog/00-INDEX (limited to 'Documentation') diff --git a/Documentation/watchdog/00-INDEX b/Documentation/watchdog/00-INDEX deleted file mode 100644 index fc9082a1477a..000000000000 --- a/Documentation/watchdog/00-INDEX +++ /dev/null @@ -1,19 +0,0 @@ -00-INDEX - - this file. -convert_drivers_to_kernel_api.txt - - how-to for converting old watchdog drivers to the new kernel API. -hpwdt.txt - - information on the HP iLO2 NMI watchdog -pcwd-watchdog.txt - - documentation for Berkshire Products PC Watchdog ISA cards. -src/ - - directory holding watchdog related example programs. -watchdog-api.txt - - description of the Linux Watchdog driver API. -watchdog-kernel-api.txt - - description of the Linux WatchDog Timer Driver Core kernel API. -watchdog-parameters.txt - - information on driver parameters (for drivers other than - the ones that have driver-specific files here) -wdt.txt - - description of the Watchdog Timer Interfaces for Linux. -- cgit v1.2.3 From b10f7c12e051762b84457f6d38d4b71acbf76a02 Mon Sep 17 00:00:00 2001 From: Hans de Goede Date: Mon, 12 Sep 2011 11:56:59 +0200 Subject: watchdog: watchdog_dev: Let the driver update the timeout field on set_timeout success When a set_timeout operation succeeds this does not necessarily mean that the exact timeout requested has been achieved, because the watchdog does not necessarily have a 1 second resolution. So rather then have the core set the timeout member of the watchdog_device struct to the exact requested value, instead the driver should set it to the actually achieved timeout value. Signed-off-by: Hans de Goede Signed-off-by: Wim Van Sebroeck --- Documentation/watchdog/watchdog-kernel-api.txt | 7 ++++--- drivers/watchdog/watchdog_dev.c | 1 - 2 files changed, 4 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/watchdog/watchdog-kernel-api.txt b/Documentation/watchdog/watchdog-kernel-api.txt index 9e162465b0cf..7d9d1da762b6 100644 --- a/Documentation/watchdog/watchdog-kernel-api.txt +++ b/Documentation/watchdog/watchdog-kernel-api.txt @@ -117,9 +117,10 @@ they are supported. These optional routines/operations are: status of the device is reported with watchdog WDIOF_* status flags/bits. * set_timeout: this routine checks and changes the timeout of the watchdog timer device. It returns 0 on success, -EINVAL for "parameter out of range" - and -EIO for "could not write value to the watchdog". On success the timeout - value of the watchdog_device will be changed to the value that was just used - to re-program the watchdog timer device. + and -EIO for "could not write value to the watchdog". On success this + routine should set the timeout value of the watchdog_device to the + achieved timeout value (which may be different from the requested one + because the watchdog does not necessarily has a 1 second resolution). (Note: the WDIOF_SETTIMEOUT needs to be set in the options field of the watchdog's info structure). * ioctl: if this routine is present then it will be called first before we do diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c index 55b1f60b19d2..c6e1b8dd80cc 100644 --- a/drivers/watchdog/watchdog_dev.c +++ b/drivers/watchdog/watchdog_dev.c @@ -226,7 +226,6 @@ static long watchdog_ioctl(struct file *file, unsigned int cmd, err = wdd->ops->set_timeout(wdd, val); if (err < 0) return err; - wdd->timeout = val; /* If the watchdog is active then we send a keepalive ping * to make sure that the watchdog keep's running (and if * possible that it takes the new timeout) */ -- cgit v1.2.3 From fd7b673c92731fc6c0b1e999adfd87b6762ee797 Mon Sep 17 00:00:00 2001 From: Viresh Kumar Date: Fri, 16 Mar 2012 09:14:00 +0100 Subject: watchdog: Add support for WDIOC_GETTIMELEFT IOCTL in watchdog core This patch adds support for WDIOC_GETTIMELEFT IOCTL in watchdog core. So, there is another function pointer added to struct watchdog_ops, which can be passed by drivers to support this IOCTL. Related documentation is updated too. Signed-off-by: Viresh Kumar Signed-off-by: Linus Walleij Signed-off-by: Wim Van Sebroeck --- Documentation/watchdog/convert_drivers_to_kernel_api.txt | 4 ++++ Documentation/watchdog/watchdog-kernel-api.txt | 4 +++- drivers/watchdog/watchdog_dev.c | 5 +++++ include/linux/watchdog.h | 2 ++ 4 files changed, 14 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/watchdog/convert_drivers_to_kernel_api.txt b/Documentation/watchdog/convert_drivers_to_kernel_api.txt index be8119bb15d2..271b8850dde7 100644 --- a/Documentation/watchdog/convert_drivers_to_kernel_api.txt +++ b/Documentation/watchdog/convert_drivers_to_kernel_api.txt @@ -59,6 +59,10 @@ Here is a overview of the functions and probably needed actions: WDIOC_GETTIMEOUT: No preparations needed + WDIOC_GETTIMELEFT: + It needs get_timeleft() callback to be defined. Otherwise it + will return EOPNOTSUPP + Other IOCTLs can be served using the ioctl-callback. Note that this is mainly intended for porting old drivers; new drivers should not invent private IOCTLs. Private IOCTLs are processed first. When the callback returns with diff --git a/Documentation/watchdog/watchdog-kernel-api.txt b/Documentation/watchdog/watchdog-kernel-api.txt index 7d9d1da762b6..227f6cd0e5fa 100644 --- a/Documentation/watchdog/watchdog-kernel-api.txt +++ b/Documentation/watchdog/watchdog-kernel-api.txt @@ -1,6 +1,6 @@ The Linux WatchDog Timer Driver Core kernel API. =============================================== -Last reviewed: 29-Nov-2011 +Last reviewed: 16-Mar-2012 Wim Van Sebroeck @@ -77,6 +77,7 @@ struct watchdog_ops { int (*ping)(struct watchdog_device *); unsigned int (*status)(struct watchdog_device *); int (*set_timeout)(struct watchdog_device *, unsigned int); + unsigned int (*get_timeleft)(struct watchdog_device *); long (*ioctl)(struct watchdog_device *, unsigned int, unsigned long); }; @@ -123,6 +124,7 @@ they are supported. These optional routines/operations are: because the watchdog does not necessarily has a 1 second resolution). (Note: the WDIOF_SETTIMEOUT needs to be set in the options field of the watchdog's info structure). +* get_timeleft: this routines returns the time that's left before a reset. * ioctl: if this routine is present then it will be called first before we do our own internal ioctl call handling. This routine should return -ENOIOCTLCMD if a command is not supported. The parameters that are passed to the ioctl diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c index c6e1b8dd80cc..8558da912c42 100644 --- a/drivers/watchdog/watchdog_dev.c +++ b/drivers/watchdog/watchdog_dev.c @@ -236,6 +236,11 @@ static long watchdog_ioctl(struct file *file, unsigned int cmd, if (wdd->timeout == 0) return -EOPNOTSUPP; return put_user(wdd->timeout, p); + case WDIOC_GETTIMELEFT: + if (!wdd->ops->get_timeleft) + return -EOPNOTSUPP; + + return put_user(wdd->ops->get_timeleft(wdd), p); default: return -ENOTTY; } diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h index de75167093bf..ac40716b44e9 100644 --- a/include/linux/watchdog.h +++ b/include/linux/watchdog.h @@ -66,6 +66,7 @@ struct watchdog_device; * @ping: The routine that sends a keepalive ping to the watchdog device. * @status: The routine that shows the status of the watchdog device. * @set_timeout:The routine for setting the watchdog devices timeout value. + * @get_timeleft:The routine that get's the time that's left before a reset. * @ioctl: The routines that handles extra ioctl calls. * * The watchdog_ops structure contains a list of low-level operations @@ -82,6 +83,7 @@ struct watchdog_ops { int (*ping)(struct watchdog_device *); unsigned int (*status)(struct watchdog_device *); int (*set_timeout)(struct watchdog_device *, unsigned int); + unsigned int (*get_timeleft)(struct watchdog_device *); long (*ioctl)(struct watchdog_device *, unsigned int, unsigned long); }; -- cgit v1.2.3 From a3c8121b8724c3d496dc00201ab40e8313edcf0d Mon Sep 17 00:00:00 2001 From: Daniel Drake Date: Tue, 27 Mar 2012 16:07:40 +0100 Subject: x86/olpc: Add debugfs interface for EC commands Add a debugfs interface for sending commands to the OLPC Embedded Controller (EC) and reading the responses. The EC provides functionality for machine identification, battery and AC control, wakeup control, etc. Having a debugfs interface available is useful for EC development and debugging. Based on code by Paul Fox (who also approves of the end result). Signed-off-by: Daniel Drake Acked-by: Paul Fox Cc: "H. Peter Anvin" Cc: Andres Salomon Link: http://lkml.kernel.org/r/20120327150740.667D09D401E@zog.reactivated.net Signed-off-by: Ingo Molnar --- Documentation/ABI/testing/debugfs-olpc | 16 ++++++ arch/x86/platform/olpc/olpc.c | 97 ++++++++++++++++++++++++++++++++++ 2 files changed, 113 insertions(+) create mode 100644 Documentation/ABI/testing/debugfs-olpc (limited to 'Documentation') diff --git a/Documentation/ABI/testing/debugfs-olpc b/Documentation/ABI/testing/debugfs-olpc new file mode 100644 index 000000000000..bd76cc6d55f9 --- /dev/null +++ b/Documentation/ABI/testing/debugfs-olpc @@ -0,0 +1,16 @@ +What: /sys/kernel/debug/olpc-ec/cmd +Date: Dec 2011 +KernelVersion: 3.4 +Contact: devel@lists.laptop.org +Description: + +A generic interface for executing OLPC Embedded Controller commands and +reading their responses. + +To execute a command, write data with the format: CC:N A A A A +CC is the (hex) command, N is the count of expected reply bytes, and A A A A +are optional (hex) arguments. + +To read the response (if any), read from the generic node after executing +a command. Hex reply bytes will be returned, *whether or not* they came from +the immediately previous command. diff --git a/arch/x86/platform/olpc/olpc.c b/arch/x86/platform/olpc/olpc.c index 7cce722667b8..a4bee53c2e54 100644 --- a/arch/x86/platform/olpc/olpc.c +++ b/arch/x86/platform/olpc/olpc.c @@ -20,6 +20,8 @@ #include #include #include +#include +#include #include #include @@ -31,6 +33,15 @@ EXPORT_SYMBOL_GPL(olpc_platform_info); static DEFINE_SPINLOCK(ec_lock); +/* debugfs interface to EC commands */ +#define EC_MAX_CMD_ARGS (5 + 1) /* cmd byte + 5 args */ +#define EC_MAX_CMD_REPLY (8) + +static struct dentry *ec_debugfs_dir; +static DEFINE_MUTEX(ec_debugfs_cmd_lock); +static unsigned char ec_debugfs_resp[EC_MAX_CMD_REPLY]; +static unsigned int ec_debugfs_resp_bytes; + /* EC event mask to be applied during suspend (defining wakeup sources). */ static u16 ec_wakeup_mask; @@ -269,6 +280,91 @@ int olpc_ec_sci_query(u16 *sci_value) } EXPORT_SYMBOL_GPL(olpc_ec_sci_query); +static ssize_t ec_debugfs_cmd_write(struct file *file, const char __user *buf, + size_t size, loff_t *ppos) +{ + int i, m; + unsigned char ec_cmd[EC_MAX_CMD_ARGS]; + unsigned int ec_cmd_int[EC_MAX_CMD_ARGS]; + char cmdbuf[64]; + int ec_cmd_bytes; + + mutex_lock(&ec_debugfs_cmd_lock); + + size = simple_write_to_buffer(cmdbuf, sizeof(cmdbuf), ppos, buf, size); + + m = sscanf(cmdbuf, "%x:%u %x %x %x %x %x", &ec_cmd_int[0], + &ec_debugfs_resp_bytes, + &ec_cmd_int[1], &ec_cmd_int[2], &ec_cmd_int[3], + &ec_cmd_int[4], &ec_cmd_int[5]); + if (m < 2 || ec_debugfs_resp_bytes > EC_MAX_CMD_REPLY) { + /* reset to prevent overflow on read */ + ec_debugfs_resp_bytes = 0; + + printk(KERN_DEBUG "olpc-ec: bad ec cmd: " + "cmd:response-count [arg1 [arg2 ...]]\n"); + size = -EINVAL; + goto out; + } + + /* convert scanf'd ints to char */ + ec_cmd_bytes = m - 2; + for (i = 0; i <= ec_cmd_bytes; i++) + ec_cmd[i] = ec_cmd_int[i]; + + printk(KERN_DEBUG "olpc-ec: debugfs cmd 0x%02x with %d args " + "%02x %02x %02x %02x %02x, want %d returns\n", + ec_cmd[0], ec_cmd_bytes, ec_cmd[1], ec_cmd[2], ec_cmd[3], + ec_cmd[4], ec_cmd[5], ec_debugfs_resp_bytes); + + olpc_ec_cmd(ec_cmd[0], (ec_cmd_bytes == 0) ? NULL : &ec_cmd[1], + ec_cmd_bytes, ec_debugfs_resp, ec_debugfs_resp_bytes); + + printk(KERN_DEBUG "olpc-ec: response " + "%02x %02x %02x %02x %02x %02x %02x %02x (%d bytes expected)\n", + ec_debugfs_resp[0], ec_debugfs_resp[1], ec_debugfs_resp[2], + ec_debugfs_resp[3], ec_debugfs_resp[4], ec_debugfs_resp[5], + ec_debugfs_resp[6], ec_debugfs_resp[7], ec_debugfs_resp_bytes); + +out: + mutex_unlock(&ec_debugfs_cmd_lock); + return size; +} + +static ssize_t ec_debugfs_cmd_read(struct file *file, char __user *buf, + size_t size, loff_t *ppos) +{ + unsigned int i, r; + char *rp; + char respbuf[64]; + + mutex_lock(&ec_debugfs_cmd_lock); + rp = respbuf; + rp += sprintf(rp, "%02x", ec_debugfs_resp[0]); + for (i = 1; i < ec_debugfs_resp_bytes; i++) + rp += sprintf(rp, ", %02x", ec_debugfs_resp[i]); + mutex_unlock(&ec_debugfs_cmd_lock); + rp += sprintf(rp, "\n"); + + r = rp - respbuf; + return simple_read_from_buffer(buf, size, ppos, respbuf, r); +} + +static const struct file_operations ec_debugfs_genops = { + .write = ec_debugfs_cmd_write, + .read = ec_debugfs_cmd_read, +}; + +static void setup_debugfs(void) +{ + ec_debugfs_dir = debugfs_create_dir("olpc-ec", 0); + if (ec_debugfs_dir == ERR_PTR(-ENODEV)) + return; + + debugfs_create_file("cmd", 0600, ec_debugfs_dir, NULL, + &ec_debugfs_genops); +} + static int olpc_ec_suspend(void) { return olpc_ec_mask_write(ec_wakeup_mask); @@ -372,6 +468,7 @@ static int __init olpc_init(void) } register_syscore_ops(&olpc_syscore_ops); + setup_debugfs(); return 0; } -- cgit v1.2.3 From 2f2cc27f50e3d232602d3b7c972071b4a30e5e38 Mon Sep 17 00:00:00 2001 From: "Ying-Chun Liu (PaulLiu)" Date: Tue, 27 Mar 2012 15:54:01 +0800 Subject: regulator: anatop: patching to device-tree property "reg". Change "reg" to "anatop-reg-offset" due to there is a warning of handling no size field in reg. This patch also adds the missing device-tree binding documentation. Signed-off-by: Ying-Chun Liu (PaulLiu) Acked-by: Shawn Guo Signed-off-by: Mark Brown --- .../bindings/regulator/anatop-regulator.txt | 29 ++++++++++++++++++++++ drivers/regulator/anatop-regulator.c | 5 ++-- 2 files changed, 32 insertions(+), 2 deletions(-) create mode 100644 Documentation/devicetree/bindings/regulator/anatop-regulator.txt (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/regulator/anatop-regulator.txt b/Documentation/devicetree/bindings/regulator/anatop-regulator.txt new file mode 100644 index 000000000000..357758cb6e92 --- /dev/null +++ b/Documentation/devicetree/bindings/regulator/anatop-regulator.txt @@ -0,0 +1,29 @@ +Anatop Voltage regulators + +Required properties: +- compatible: Must be "fsl,anatop-regulator" +- anatop-reg-offset: Anatop MFD register offset +- anatop-vol-bit-shift: Bit shift for the register +- anatop-vol-bit-width: Number of bits used in the register +- anatop-min-bit-val: Minimum value of this register +- anatop-min-voltage: Minimum voltage of this regulator +- anatop-max-voltage: Maximum voltage of this regulator + +Any property defined as part of the core regulator +binding, defined in regulator.txt, can also be used. + +Example: + + regulator-vddpu { + compatible = "fsl,anatop-regulator"; + regulator-name = "vddpu"; + regulator-min-microvolt = <725000>; + regulator-max-microvolt = <1300000>; + regulator-always-on; + anatop-reg-offset = <0x140>; + anatop-vol-bit-shift = <9>; + anatop-vol-bit-width = <5>; + anatop-min-bit-val = <1>; + anatop-min-voltage = <725000>; + anatop-max-voltage = <1300000>; + }; diff --git a/drivers/regulator/anatop-regulator.c b/drivers/regulator/anatop-regulator.c index 17499a55113d..53969af17558 100644 --- a/drivers/regulator/anatop-regulator.c +++ b/drivers/regulator/anatop-regulator.c @@ -138,9 +138,10 @@ static int __devinit anatop_regulator_probe(struct platform_device *pdev) rdesc->type = REGULATOR_VOLTAGE; rdesc->owner = THIS_MODULE; sreg->mfd = anatopmfd; - ret = of_property_read_u32(np, "reg", &sreg->control_reg); + ret = of_property_read_u32(np, "anatop-reg-offset", + &sreg->control_reg); if (ret) { - dev_err(dev, "no reg property set\n"); + dev_err(dev, "no anatop-reg-offset property set\n"); goto anatop_probe_end; } ret = of_property_read_u32(np, "anatop-vol-bit-width", -- cgit v1.2.3 From 8a4134322bd429d24f71147eb59a47a981e8f63a Mon Sep 17 00:00:00 2001 From: Marek Szyprowski Date: Fri, 23 Dec 2011 09:30:47 +0100 Subject: common: DMA-mapping: add WRITE_COMBINE attribute DMA_ATTR_WRITE_COMBINE specifies that writes to the mapping may be buffered to improve performance. It will be used by the replacement for ARM/ARV32 specific dma_alloc_writecombine() function. Signed-off-by: Marek Szyprowski Acked-by: Kyungmin Park Reviewed-by: Arnd Bergmann --- Documentation/DMA-attributes.txt | 10 ++++++++++ include/linux/dma-attrs.h | 1 + 2 files changed, 11 insertions(+) (limited to 'Documentation') diff --git a/Documentation/DMA-attributes.txt b/Documentation/DMA-attributes.txt index b768cc0e402b..811a5d458dae 100644 --- a/Documentation/DMA-attributes.txt +++ b/Documentation/DMA-attributes.txt @@ -31,3 +31,13 @@ may be weakly ordered, that is that reads and writes may pass each other. Since it is optional for platforms to implement DMA_ATTR_WEAK_ORDERING, those that do not will simply ignore the attribute and exhibit default behavior. + +DMA_ATTR_WRITE_COMBINE +---------------------- + +DMA_ATTR_WRITE_COMBINE specifies that writes to the mapping may be +buffered to improve performance. + +Since it is optional for platforms to implement DMA_ATTR_WRITE_COMBINE, +those that do not will simply ignore the attribute and exhibit default +behavior. diff --git a/include/linux/dma-attrs.h b/include/linux/dma-attrs.h index 71ad34eca6e3..ada61e1abf29 100644 --- a/include/linux/dma-attrs.h +++ b/include/linux/dma-attrs.h @@ -13,6 +13,7 @@ enum dma_attr { DMA_ATTR_WRITE_BARRIER, DMA_ATTR_WEAK_ORDERING, + DMA_ATTR_WRITE_COMBINE, DMA_ATTR_MAX, }; -- cgit v1.2.3 From 64d70fe5d3640e1a89790ed21120921278f8cb86 Mon Sep 17 00:00:00 2001 From: Marek Szyprowski Date: Wed, 28 Mar 2012 07:55:56 +0200 Subject: common: DMA-mapping: add NON-CONSISTENT attribute DMA_ATTR_NON_CONSISTENT lets the platform to choose to return either consistent or non-consistent memory as it sees fit. By using this API, you are guaranteeing to the platform that you have all the correct and necessary sync points for this memory in the driver. Signed-off-by: Marek Szyprowski Acked-by: Kyungmin Park Reviewed-by: Arnd Bergmann --- Documentation/DMA-attributes.txt | 8 ++++++++ include/linux/dma-attrs.h | 1 + 2 files changed, 9 insertions(+) (limited to 'Documentation') diff --git a/Documentation/DMA-attributes.txt b/Documentation/DMA-attributes.txt index 811a5d458dae..5c72eed89563 100644 --- a/Documentation/DMA-attributes.txt +++ b/Documentation/DMA-attributes.txt @@ -41,3 +41,11 @@ buffered to improve performance. Since it is optional for platforms to implement DMA_ATTR_WRITE_COMBINE, those that do not will simply ignore the attribute and exhibit default behavior. + +DMA_ATTR_NON_CONSISTENT +----------------------- + +DMA_ATTR_NON_CONSISTENT lets the platform to choose to return either +consistent or non-consistent memory as it sees fit. By using this API, +you are guaranteeing to the platform that you have all the correct and +necessary sync points for this memory in the driver. diff --git a/include/linux/dma-attrs.h b/include/linux/dma-attrs.h index ada61e1abf29..547ab568d3ae 100644 --- a/include/linux/dma-attrs.h +++ b/include/linux/dma-attrs.h @@ -14,6 +14,7 @@ enum dma_attr { DMA_ATTR_WRITE_BARRIER, DMA_ATTR_WEAK_ORDERING, DMA_ATTR_WRITE_COMBINE, + DMA_ATTR_NON_CONSISTENT, DMA_ATTR_MAX, }; -- cgit v1.2.3 From 1d330ec6f9186bda04fdf63ed513cd0cc84bd102 Mon Sep 17 00:00:00 2001 From: Milan Broz Date: Wed, 28 Mar 2012 18:41:23 +0100 Subject: dm: document sysfs entries Describe attributes provided by device-mapper in /sys/block. Signed-off-by: Milan Broz Signed-off-by: Alasdair G Kergon --- Documentation/ABI/testing/sysfs-block-dm | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-block-dm (limited to 'Documentation') diff --git a/Documentation/ABI/testing/sysfs-block-dm b/Documentation/ABI/testing/sysfs-block-dm new file mode 100644 index 000000000000..87ca5691e29b --- /dev/null +++ b/Documentation/ABI/testing/sysfs-block-dm @@ -0,0 +1,25 @@ +What: /sys/block/dm-/dm/name +Date: January 2009 +KernelVersion: 2.6.29 +Contact: dm-devel@redhat.com +Description: Device-mapper device name. + Read-only string containing mapped device name. +Users: util-linux, device-mapper udev rules + +What: /sys/block/dm-/dm/uuid +Date: January 2009 +KernelVersion: 2.6.29 +Contact: dm-devel@redhat.com +Description: Device-mapper device UUID. + Read-only string containing DM-UUID or empty string + if DM-UUID is not set. +Users: util-linux, device-mapper udev rules + +What: /sys/block/dm-/dm/suspended +Date: June 2009 +KernelVersion: 2.6.31 +Contact: dm-devel@redhat.com +Description: Device-mapper device suspend state. + Contains the value 1 while the device is suspended. + Otherwise it contains 0. Read-only attribute. +Users: util-linux, device-mapper udev rules -- cgit v1.2.3 From fe878f34df89ad4af758f40bbec829807dc93a00 Mon Sep 17 00:00:00 2001 From: Joe Thornber Date: Wed, 28 Mar 2012 18:41:24 +0100 Subject: dm thin: correct comments Remove documentation for unimplemented 'trim' message. I'd planned a 'trim' target message for shrinking thin devices, but this is better handled via the discard ioctl. Signed-off-by: Joe Thornber Signed-off-by: Mike Snitzer Signed-off-by: Alasdair G Kergon --- Documentation/device-mapper/thin-provisioning.txt | 10 ---------- drivers/md/dm-thin.c | 2 +- 2 files changed, 1 insertion(+), 11 deletions(-) (limited to 'Documentation') diff --git a/Documentation/device-mapper/thin-provisioning.txt b/Documentation/device-mapper/thin-provisioning.txt index 1ff044d87ca4..da50347c0053 100644 --- a/Documentation/device-mapper/thin-provisioning.txt +++ b/Documentation/device-mapper/thin-provisioning.txt @@ -237,16 +237,6 @@ iii) Messages Deletes a thin device. Irreversible. - trim - - Delete mappings from the end of a thin device. Irreversible. - You might want to use this if you're reducing the size of - your thinly-provisioned device. In many cases, due to the - sharing of blocks between devices, it is not possible to - determine in advance how much space 'trim' will release. (In - future a userspace tool might be able to perform this - calculation.) - set_transaction_id Userland volume managers, such as LVM, need a way to diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c index da2f0217df66..1791134cf477 100644 --- a/drivers/md/dm-thin.c +++ b/drivers/md/dm-thin.c @@ -72,7 +72,7 @@ * missed out if the io covers the block. (schedule_copy). * * iv) insert the new mapping into the origin's btree - * (process_prepared_mappings). This act of inserting breaks some + * (process_prepared_mapping). This act of inserting breaks some * sharing of btree nodes between the two devices. Breaking sharing only * effects the btree of that specific device. Btrees for the other * devices that share the block never change. The btree for the origin -- cgit v1.2.3 From c4a69ecdb463a901b4645230613961e134e897cd Mon Sep 17 00:00:00 2001 From: Mike Snitzer Date: Wed, 28 Mar 2012 18:41:28 +0100 Subject: dm thin: relax hard limit on the maximum size of a metadata device The thin metadata format can only make use of a device that is <= THIN_METADATA_MAX_SECTORS (currently 15.9375 GB). Therefore, there is no practical benefit to using a larger device. However, it may be that other factors impose a certain granularity for the space that is allocated to a device (E.g. lvm2 can impose a coarse granularity through the use of large, >= 1 GB, physical extents). Rather than reject a larger metadata device, during thin-pool device construction, switch to allowing it but issue a warning if a device larger than THIN_METADATA_MAX_SECTORS_WARNING (16 GB) is provided. Any space over 15.9375 GB will not be used. Signed-off-by: Mike Snitzer Signed-off-by: Alasdair G Kergon --- Documentation/device-mapper/thin-provisioning.txt | 8 +++++--- drivers/md/dm-thin-metadata.c | 3 +++ drivers/md/dm-thin-metadata.h | 13 +++++++++++++ drivers/md/dm-thin.c | 19 ++++--------------- 4 files changed, 25 insertions(+), 18 deletions(-) (limited to 'Documentation') diff --git a/Documentation/device-mapper/thin-provisioning.txt b/Documentation/device-mapper/thin-provisioning.txt index da50347c0053..a119fbdd0604 100644 --- a/Documentation/device-mapper/thin-provisioning.txt +++ b/Documentation/device-mapper/thin-provisioning.txt @@ -75,10 +75,12 @@ less sharing than average you'll need a larger-than-average metadata device. As a guide, we suggest you calculate the number of bytes to use in the metadata device as 48 * $data_dev_size / $data_block_size but round it up -to 2MB if the answer is smaller. The largest size supported is 16GB. +to 2MB if the answer is smaller. If you're creating large numbers of +snapshots which are recording large amounts of change, you may find you +need to increase this. -If you're creating large numbers of snapshots which are recording large -amounts of change, you may need find you need to increase this. +The largest size supported is 16GB: If the device is larger, +a warning will be issued and the excess space will not be used. Reloading a pool table ---------------------- diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c index a680c761341f..737d38865b69 100644 --- a/drivers/md/dm-thin-metadata.c +++ b/drivers/md/dm-thin-metadata.c @@ -713,6 +713,9 @@ struct dm_pool_metadata *dm_pool_metadata_open(struct block_device *bdev, if (r) goto bad; + if (bdev_size > THIN_METADATA_MAX_SECTORS) + bdev_size = THIN_METADATA_MAX_SECTORS; + disk_super = dm_block_data(sblock); disk_super->magic = cpu_to_le64(THIN_SUPERBLOCK_MAGIC); disk_super->version = cpu_to_le32(THIN_VERSION); diff --git a/drivers/md/dm-thin-metadata.h b/drivers/md/dm-thin-metadata.h index 859c16896877..ed4725e67c96 100644 --- a/drivers/md/dm-thin-metadata.h +++ b/drivers/md/dm-thin-metadata.h @@ -11,6 +11,19 @@ #define THIN_METADATA_BLOCK_SIZE 4096 +/* + * The metadata device is currently limited in size. + * + * We have one block of index, which can hold 255 index entries. Each + * index entry contains allocation info about 16k metadata blocks. + */ +#define THIN_METADATA_MAX_SECTORS (255 * (1 << 14) * (THIN_METADATA_BLOCK_SIZE / (1 << SECTOR_SHIFT))) + +/* + * A metadata device larger than 16GB triggers a warning. + */ +#define THIN_METADATA_MAX_SECTORS_WARNING (16 * (1024 * 1024 * 1024 >> SECTOR_SHIFT)) + /*----------------------------------------------------------------*/ struct dm_pool_metadata; diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c index bcb143396fe0..2378ee88b1e8 100644 --- a/drivers/md/dm-thin.c +++ b/drivers/md/dm-thin.c @@ -32,16 +32,6 @@ #define DATA_DEV_BLOCK_SIZE_MIN_SECTORS (64 * 1024 >> SECTOR_SHIFT) #define DATA_DEV_BLOCK_SIZE_MAX_SECTORS (1024 * 1024 * 1024 >> SECTOR_SHIFT) -/* - * The metadata device is currently limited in size. The limitation is - * checked lower down in dm-space-map-metadata, but we also check it here - * so we can fail early. - * - * We have one block of index, which can hold 255 index entries. Each - * index entry contains allocation info about 16k metadata blocks. - */ -#define METADATA_DEV_MAX_SECTORS (255 * (1 << 14) * (THIN_METADATA_BLOCK_SIZE / (1 << SECTOR_SHIFT))) - /* * Device id is restricted to 24 bits. */ @@ -1736,6 +1726,7 @@ static int pool_ctr(struct dm_target *ti, unsigned argc, char **argv) dm_block_t low_water_blocks; struct dm_dev *metadata_dev; sector_t metadata_dev_size; + char b[BDEVNAME_SIZE]; /* * FIXME Remove validation from scope of lock. @@ -1757,11 +1748,9 @@ static int pool_ctr(struct dm_target *ti, unsigned argc, char **argv) } metadata_dev_size = i_size_read(metadata_dev->bdev->bd_inode) >> SECTOR_SHIFT; - if (metadata_dev_size > METADATA_DEV_MAX_SECTORS) { - ti->error = "Metadata device is too large"; - r = -EINVAL; - goto out_metadata; - } + if (metadata_dev_size > THIN_METADATA_MAX_SECTORS_WARNING) + DMWARN("Metadata device %s is larger than %u sectors: excess space will not be used.", + bdevname(metadata_dev->bdev, b), THIN_METADATA_MAX_SECTORS); r = dm_get_device(ti, argv[1], FMODE_READ | FMODE_WRITE, &data_dev); if (r) { -- cgit v1.2.3 From 2dd9c257fbc243aa76ee6db0bb8371f9f74fad2d Mon Sep 17 00:00:00 2001 From: Joe Thornber Date: Wed, 28 Mar 2012 18:41:28 +0100 Subject: dm thin: support read only external snapshot origins Support the use of an external _read only_ device as an origin for a thin device. Any read to an unprovisioned area of the thin device will be passed through to the origin. Writes trigger allocation of new blocks as usual. One possible use case for this would be VM hosts that want to run guests on thinly-provisioned volumes but have the base image on another device (possibly shared between many VMs). Signed-off-by: Joe Thornber Signed-off-by: Mike Snitzer Signed-off-by: Alasdair G Kergon --- Documentation/device-mapper/thin-provisioning.txt | 39 ++++++++++- drivers/md/dm-thin.c | 85 +++++++++++++++++++---- 2 files changed, 109 insertions(+), 15 deletions(-) (limited to 'Documentation') diff --git a/Documentation/device-mapper/thin-provisioning.txt b/Documentation/device-mapper/thin-provisioning.txt index a119fbdd0604..310c1201d621 100644 --- a/Documentation/device-mapper/thin-provisioning.txt +++ b/Documentation/device-mapper/thin-provisioning.txt @@ -169,6 +169,38 @@ ii) Using an internal snapshot. dmsetup create snap --table "0 2097152 thin /dev/mapper/pool 1" +External snapshots +------------------ + +You can use an external _read only_ device as an origin for a +thinly-provisioned volume. Any read to an unprovisioned area of the +thin device will be passed through to the origin. Writes trigger +the allocation of new blocks as usual. + +One use case for this is VM hosts that want to run guests on +thinly-provisioned volumes but have the base image on another device +(possibly shared between many VMs). + +You must not write to the origin device if you use this technique! +Of course, you may write to the thin device and take internal snapshots +of the thin volume. + +i) Creating a snapshot of an external device + + This is the same as creating a thin device. + You don't mention the origin at this stage. + + dmsetup message /dev/mapper/pool 0 "create_thin 0" + +ii) Using a snapshot of an external device. + + Append an extra parameter to the thin target specifying the origin: + + dmsetup create snap --table "0 2097152 thin /dev/mapper/pool 0 /dev/image" + + N.B. All descendants (internal snapshots) of this snapshot require the + same extra origin parameter. + Deactivation ------------ @@ -254,7 +286,7 @@ iii) Messages i) Constructor - thin + thin [] pool dev: the thin-pool device, e.g. /dev/mapper/my_pool or 253:0 @@ -263,6 +295,11 @@ i) Constructor the internal device identifier of the device to be activated. + external origin dev: + an optional block device outside the pool to be treated as a + read-only snapshot origin: reads to unprovisioned areas of the + thin target will be mapped to this device. + The pool doesn't store any size against the thin devices. If you load a thin target that is smaller than you've been using previously, then you'll have no access to blocks mapped beyond the end. If you diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c index 2378ee88b1e8..2d9e75586d60 100644 --- a/drivers/md/dm-thin.c +++ b/drivers/md/dm-thin.c @@ -549,6 +549,7 @@ struct pool_c { */ struct thin_c { struct dm_dev *pool_dev; + struct dm_dev *origin_dev; dm_thin_id dev_id; struct pool *pool; @@ -666,14 +667,16 @@ static void remap(struct thin_c *tc, struct bio *bio, dm_block_t block) (bio->bi_sector & pool->offset_mask); } -static void remap_and_issue(struct thin_c *tc, struct bio *bio, - dm_block_t block) +static void remap_to_origin(struct thin_c *tc, struct bio *bio) +{ + bio->bi_bdev = tc->origin_dev->bdev; +} + +static void issue(struct thin_c *tc, struct bio *bio) { struct pool *pool = tc->pool; unsigned long flags; - remap(tc, bio, block); - /* * Batch together any FUA/FLUSH bios we find and then issue * a single commit for them in process_deferred_bios(). @@ -686,6 +689,19 @@ static void remap_and_issue(struct thin_c *tc, struct bio *bio, generic_make_request(bio); } +static void remap_to_origin_and_issue(struct thin_c *tc, struct bio *bio) +{ + remap_to_origin(tc, bio); + issue(tc, bio); +} + +static void remap_and_issue(struct thin_c *tc, struct bio *bio, + dm_block_t block) +{ + remap(tc, bio, block); + issue(tc, bio); +} + /* * wake_worker() is used when new work is queued and when pool_resume is * ready to continue deferred IO processing. @@ -932,7 +948,8 @@ static struct new_mapping *get_next_mapping(struct pool *pool) } static void schedule_copy(struct thin_c *tc, dm_block_t virt_block, - dm_block_t data_origin, dm_block_t data_dest, + struct dm_dev *origin, dm_block_t data_origin, + dm_block_t data_dest, struct cell *cell, struct bio *bio) { int r; @@ -964,7 +981,7 @@ static void schedule_copy(struct thin_c *tc, dm_block_t virt_block, } else { struct dm_io_region from, to; - from.bdev = tc->pool_dev->bdev; + from.bdev = origin->bdev; from.sector = data_origin * pool->sectors_per_block; from.count = pool->sectors_per_block; @@ -982,6 +999,22 @@ static void schedule_copy(struct thin_c *tc, dm_block_t virt_block, } } +static void schedule_internal_copy(struct thin_c *tc, dm_block_t virt_block, + dm_block_t data_origin, dm_block_t data_dest, + struct cell *cell, struct bio *bio) +{ + schedule_copy(tc, virt_block, tc->pool_dev, + data_origin, data_dest, cell, bio); +} + +static void schedule_external_copy(struct thin_c *tc, dm_block_t virt_block, + dm_block_t data_dest, + struct cell *cell, struct bio *bio) +{ + schedule_copy(tc, virt_block, tc->origin_dev, + virt_block, data_dest, cell, bio); +} + static void schedule_zero(struct thin_c *tc, dm_block_t virt_block, dm_block_t data_block, struct cell *cell, struct bio *bio) @@ -1128,8 +1161,8 @@ static void break_sharing(struct thin_c *tc, struct bio *bio, dm_block_t block, r = alloc_data_block(tc, &data_block); switch (r) { case 0: - schedule_copy(tc, block, lookup_result->block, - data_block, cell, bio); + schedule_internal_copy(tc, block, lookup_result->block, + data_block, cell, bio); break; case -ENOSPC: @@ -1203,7 +1236,10 @@ static void provision_block(struct thin_c *tc, struct bio *bio, dm_block_t block r = alloc_data_block(tc, &data_block); switch (r) { case 0: - schedule_zero(tc, block, data_block, cell, bio); + if (tc->origin_dev) + schedule_external_copy(tc, block, data_block, cell, bio); + else + schedule_zero(tc, block, data_block, cell, bio); break; case -ENOSPC: @@ -1254,7 +1290,11 @@ static void process_bio(struct thin_c *tc, struct bio *bio) break; case -ENODATA: - provision_block(tc, bio, block, cell); + if (bio_data_dir(bio) == READ && tc->origin_dev) { + cell_release_singleton(cell, bio); + remap_to_origin_and_issue(tc, bio); + } else + provision_block(tc, bio, block, cell); break; default: @@ -2237,6 +2277,8 @@ static void thin_dtr(struct dm_target *ti) __pool_dec(tc->pool); dm_pool_close_thin_device(tc->td); dm_put_device(ti, tc->pool_dev); + if (tc->origin_dev) + dm_put_device(ti, tc->origin_dev); kfree(tc); mutex_unlock(&dm_thin_pool_table.mutex); @@ -2245,21 +2287,22 @@ static void thin_dtr(struct dm_target *ti) /* * Thin target parameters: * - * + * [origin_dev] * * pool_dev: the path to the pool (eg, /dev/mapper/my_pool) * dev_id: the internal device identifier + * origin_dev: a device external to the pool that should act as the origin */ static int thin_ctr(struct dm_target *ti, unsigned argc, char **argv) { int r; struct thin_c *tc; - struct dm_dev *pool_dev; + struct dm_dev *pool_dev, *origin_dev; struct mapped_device *pool_md; mutex_lock(&dm_thin_pool_table.mutex); - if (argc != 2) { + if (argc != 2 && argc != 3) { ti->error = "Invalid argument count"; r = -EINVAL; goto out_unlock; @@ -2272,6 +2315,15 @@ static int thin_ctr(struct dm_target *ti, unsigned argc, char **argv) goto out_unlock; } + if (argc == 3) { + r = dm_get_device(ti, argv[2], FMODE_READ, &origin_dev); + if (r) { + ti->error = "Error opening origin device"; + goto bad_origin_dev; + } + tc->origin_dev = origin_dev; + } + r = dm_get_device(ti, argv[0], dm_table_get_mode(ti->table), &pool_dev); if (r) { ti->error = "Error opening pool device"; @@ -2324,6 +2376,9 @@ bad_pool_lookup: bad_common: dm_put_device(ti, tc->pool_dev); bad_pool_dev: + if (tc->origin_dev) + dm_put_device(ti, tc->origin_dev); +bad_origin_dev: kfree(tc); out_unlock: mutex_unlock(&dm_thin_pool_table.mutex); @@ -2382,6 +2437,8 @@ static int thin_status(struct dm_target *ti, status_type_t type, DMEMIT("%s %lu", format_dev_t(buf, tc->pool_dev->bdev->bd_dev), (unsigned long) tc->dev_id); + if (tc->origin_dev) + DMEMIT(" %s", format_dev_t(buf, tc->origin_dev->bdev->bd_dev)); break; } } @@ -2419,7 +2476,7 @@ static void thin_io_hints(struct dm_target *ti, struct queue_limits *limits) static struct target_type thin_target = { .name = "thin", - .version = {1, 0, 0}, + .version = {1, 1, 0}, .module = THIS_MODULE, .ctr = thin_ctr, .dtr = thin_dtr, -- cgit v1.2.3 From 67e2e2b281812b5caf4923a38aadc6b89e34f064 Mon Sep 17 00:00:00 2001 From: Joe Thornber Date: Wed, 28 Mar 2012 18:41:29 +0100 Subject: dm thin: add pool target flags to control discard Add dm thin target arguments to control discard support. ignore_discard: Disables discard support no_discard_passdown: Don't pass discards down to the underlying data device, but just remove the mapping within the thin provisioning target. Signed-off-by: Joe Thornber Signed-off-by: Mike Snitzer Signed-off-by: Alasdair G Kergon --- Documentation/device-mapper/thin-provisioning.txt | 8 +- drivers/md/dm-thin.c | 135 +++++++++++++++++----- 2 files changed, 115 insertions(+), 28 deletions(-) (limited to 'Documentation') diff --git a/Documentation/device-mapper/thin-provisioning.txt b/Documentation/device-mapper/thin-provisioning.txt index 310c1201d621..3370bc4d7b98 100644 --- a/Documentation/device-mapper/thin-provisioning.txt +++ b/Documentation/device-mapper/thin-provisioning.txt @@ -223,7 +223,13 @@ i) Constructor [ []*] Optional feature arguments: - - 'skip_block_zeroing': skips the zeroing of newly-provisioned blocks. + + skip_block_zeroing: Skip the zeroing of newly-provisioned blocks. + + ignore_discard: Disable discard support. + + no_discard_passdown: Don't pass discards down to the underlying + data device, but just remove the mapping. Data block size must be between 64KB (128 sectors) and 1GB (2097152 sectors) inclusive. diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c index 703bbbc4f16f..213ae32a0fc4 100644 --- a/drivers/md/dm-thin.c +++ b/drivers/md/dm-thin.c @@ -489,6 +489,13 @@ static void build_virtual_key(struct dm_thin_device *td, dm_block_t b, * devices. */ struct new_mapping; + +struct pool_features { + unsigned zero_new_blocks:1; + unsigned discard_enabled:1; + unsigned discard_passdown:1; +}; + struct pool { struct list_head list; struct dm_target *ti; /* Only set if a pool target is bound */ @@ -502,7 +509,7 @@ struct pool { dm_block_t offset_mask; dm_block_t low_water_blocks; - unsigned zero_new_blocks:1; + struct pool_features pf; unsigned low_water_triggered:1; /* A dm event has been sent */ unsigned no_free_space:1; /* A -ENOSPC warning has been issued */ @@ -543,7 +550,7 @@ struct pool_c { struct dm_target_callbacks callbacks; dm_block_t low_water_blocks; - unsigned zero_new_blocks:1; + struct pool_features pf; }; /* @@ -1051,7 +1058,7 @@ static void schedule_zero(struct thin_c *tc, dm_block_t virt_block, * zeroing pre-existing data, we can issue the bio immediately. * Otherwise we use kcopyd to zero the data first. */ - if (!pool->zero_new_blocks) + if (!pool->pf.zero_new_blocks) process_prepared_mapping(m); else if (io_overwrites_block(pool, bio)) { @@ -1202,7 +1209,7 @@ static void process_discard(struct thin_c *tc, struct bio *bio) */ m = get_next_mapping(pool); m->tc = tc; - m->pass_discard = !lookup_result.shared; + m->pass_discard = (!lookup_result.shared) & pool->pf.discard_passdown; m->virt_block = block; m->data_block = lookup_result.block; m->cell = cell; @@ -1617,7 +1624,7 @@ static int bind_control_target(struct pool *pool, struct dm_target *ti) pool->ti = ti; pool->low_water_blocks = pt->low_water_blocks; - pool->zero_new_blocks = pt->zero_new_blocks; + pool->pf = pt->pf; return 0; } @@ -1631,6 +1638,14 @@ static void unbind_control_target(struct pool *pool, struct dm_target *ti) /*---------------------------------------------------------------- * Pool creation *--------------------------------------------------------------*/ +/* Initialize pool features. */ +static void pool_features_init(struct pool_features *pf) +{ + pf->zero_new_blocks = 1; + pf->discard_enabled = 1; + pf->discard_passdown = 1; +} + static void __pool_destroy(struct pool *pool) { __pool_table_remove(pool); @@ -1678,7 +1693,7 @@ static struct pool *pool_create(struct mapped_device *pool_md, pool->block_shift = ffs(block_size) - 1; pool->offset_mask = block_size - 1; pool->low_water_blocks = 0; - pool->zero_new_blocks = 1; + pool_features_init(&pool->pf); pool->prison = prison_create(PRISON_CELLS); if (!pool->prison) { *error = "Error creating pool's bio prison"; @@ -1775,7 +1790,8 @@ static void __pool_dec(struct pool *pool) static struct pool *__pool_find(struct mapped_device *pool_md, struct block_device *metadata_dev, - unsigned long block_size, char **error) + unsigned long block_size, char **error, + int *created) { struct pool *pool = __pool_table_lookup_metadata_dev(metadata_dev); @@ -1791,8 +1807,10 @@ static struct pool *__pool_find(struct mapped_device *pool_md, return ERR_PTR(-EINVAL); __pool_inc(pool); - } else + } else { pool = pool_create(pool_md, metadata_dev, block_size, error); + *created = 1; + } } return pool; @@ -1816,10 +1834,6 @@ static void pool_dtr(struct dm_target *ti) mutex_unlock(&dm_thin_pool_table.mutex); } -struct pool_features { - unsigned zero_new_blocks:1; -}; - static int parse_pool_features(struct dm_arg_set *as, struct pool_features *pf, struct dm_target *ti) { @@ -1828,7 +1842,7 @@ static int parse_pool_features(struct dm_arg_set *as, struct pool_features *pf, const char *arg_name; static struct dm_arg _args[] = { - {0, 1, "Invalid number of pool feature arguments"}, + {0, 3, "Invalid number of pool feature arguments"}, }; /* @@ -1848,6 +1862,12 @@ static int parse_pool_features(struct dm_arg_set *as, struct pool_features *pf, if (!strcasecmp(arg_name, "skip_block_zeroing")) { pf->zero_new_blocks = 0; continue; + } else if (!strcasecmp(arg_name, "ignore_discard")) { + pf->discard_enabled = 0; + continue; + } else if (!strcasecmp(arg_name, "no_discard_passdown")) { + pf->discard_passdown = 0; + continue; } ti->error = "Unrecognised pool feature requested"; @@ -1865,10 +1885,12 @@ static int parse_pool_features(struct dm_arg_set *as, struct pool_features *pf, * * Optional feature arguments are: * skip_block_zeroing: skips the zeroing of newly-provisioned blocks. + * ignore_discard: disable discard + * no_discard_passdown: don't pass discards down to the data device */ static int pool_ctr(struct dm_target *ti, unsigned argc, char **argv) { - int r; + int r, pool_created = 0; struct pool_c *pt; struct pool *pool; struct pool_features pf; @@ -1928,8 +1950,7 @@ static int pool_ctr(struct dm_target *ti, unsigned argc, char **argv) /* * Set default pool features. */ - memset(&pf, 0, sizeof(pf)); - pf.zero_new_blocks = 1; + pool_features_init(&pf); dm_consume_args(&as, 4); r = parse_pool_features(&as, &pf, ti); @@ -1943,21 +1964,58 @@ static int pool_ctr(struct dm_target *ti, unsigned argc, char **argv) } pool = __pool_find(dm_table_get_md(ti->table), metadata_dev->bdev, - block_size, &ti->error); + block_size, &ti->error, &pool_created); if (IS_ERR(pool)) { r = PTR_ERR(pool); goto out_free_pt; } + /* + * 'pool_created' reflects whether this is the first table load. + * Top level discard support is not allowed to be changed after + * initial load. This would require a pool reload to trigger thin + * device changes. + */ + if (!pool_created && pf.discard_enabled != pool->pf.discard_enabled) { + ti->error = "Discard support cannot be disabled once enabled"; + r = -EINVAL; + goto out_flags_changed; + } + + /* + * If discard_passdown was enabled verify that the data device + * supports discards. Disable discard_passdown if not; otherwise + * -EOPNOTSUPP will be returned. + */ + if (pf.discard_passdown) { + struct request_queue *q = bdev_get_queue(data_dev->bdev); + if (!q || !blk_queue_discard(q)) { + DMWARN("Discard unsupported by data device: Disabling discard passdown."); + pf.discard_passdown = 0; + } + } + pt->pool = pool; pt->ti = ti; pt->metadata_dev = metadata_dev; pt->data_dev = data_dev; pt->low_water_blocks = low_water_blocks; - pt->zero_new_blocks = pf.zero_new_blocks; + pt->pf = pf; ti->num_flush_requests = 1; - ti->num_discard_requests = 1; - ti->discards_supported = 1; + /* + * Only need to enable discards if the pool should pass + * them down to the data device. The thin device's discard + * processing will cause mappings to be removed from the btree. + */ + if (pf.discard_enabled && pf.discard_passdown) { + ti->num_discard_requests = 1; + /* + * Setting 'discards_supported' circumvents the normal + * stacking of discard limits (this keeps the pool and + * thin devices' discard limits consistent). + */ + ti->discards_supported = 1; + } ti->private = pt; pt->callbacks.congested_fn = pool_is_congested; @@ -1967,6 +2025,8 @@ static int pool_ctr(struct dm_target *ti, unsigned argc, char **argv) return 0; +out_flags_changed: + __pool_dec(pool); out_free_pt: kfree(pt); out: @@ -2255,7 +2315,7 @@ static int pool_message(struct dm_target *ti, unsigned argc, char **argv) static int pool_status(struct dm_target *ti, status_type_t type, char *result, unsigned maxlen) { - int r; + int r, count; unsigned sz = 0; uint64_t transaction_id; dm_block_t nr_free_blocks_data; @@ -2318,10 +2378,19 @@ static int pool_status(struct dm_target *ti, status_type_t type, (unsigned long)pool->sectors_per_block, (unsigned long long)pt->low_water_blocks); - DMEMIT("%u ", !pool->zero_new_blocks); + count = !pool->pf.zero_new_blocks + !pool->pf.discard_enabled + + !pool->pf.discard_passdown; + DMEMIT("%u ", count); - if (!pool->zero_new_blocks) + if (!pool->pf.zero_new_blocks) DMEMIT("skip_block_zeroing "); + + if (!pool->pf.discard_enabled) + DMEMIT("ignore_discard "); + + if (!pool->pf.discard_passdown) + DMEMIT("no_discard_passdown "); + break; } @@ -2352,6 +2421,9 @@ static int pool_merge(struct dm_target *ti, struct bvec_merge_data *bvm, static void set_discard_limits(struct pool *pool, struct queue_limits *limits) { + /* + * FIXME: these limits may be incompatible with the pool's data device + */ limits->max_discard_sectors = pool->sectors_per_block; /* @@ -2359,6 +2431,7 @@ static void set_discard_limits(struct pool *pool, struct queue_limits *limits) * bios that overlap 2 blocks. */ limits->discard_granularity = pool->sectors_per_block << SECTOR_SHIFT; + limits->discard_zeroes_data = pool->pf.zero_new_blocks; } static void pool_io_hints(struct dm_target *ti, struct queue_limits *limits) @@ -2368,14 +2441,15 @@ static void pool_io_hints(struct dm_target *ti, struct queue_limits *limits) blk_limits_io_min(limits, 0); blk_limits_io_opt(limits, pool->sectors_per_block << SECTOR_SHIFT); - set_discard_limits(pool, limits); + if (pool->pf.discard_enabled) + set_discard_limits(pool, limits); } static struct target_type pool_target = { .name = "thin-pool", .features = DM_TARGET_SINGLETON | DM_TARGET_ALWAYS_WRITEABLE | DM_TARGET_IMMUTABLE, - .version = {1, 0, 0}, + .version = {1, 1, 0}, .module = THIS_MODULE, .ctr = pool_ctr, .dtr = pool_dtr, @@ -2417,6 +2491,9 @@ static void thin_dtr(struct dm_target *ti) * pool_dev: the path to the pool (eg, /dev/mapper/my_pool) * dev_id: the internal device identifier * origin_dev: a device external to the pool that should act as the origin + * + * If the pool device has discards disabled, they get disabled for the thin + * device as well. */ static int thin_ctr(struct dm_target *ti, unsigned argc, char **argv) { @@ -2485,8 +2562,12 @@ static int thin_ctr(struct dm_target *ti, unsigned argc, char **argv) ti->split_io = tc->pool->sectors_per_block; ti->num_flush_requests = 1; - ti->num_discard_requests = 1; - ti->discards_supported = 1; + + /* In case the pool supports discards, pass them on. */ + if (tc->pool->pf.discard_enabled) { + ti->discards_supported = 1; + ti->num_discard_requests = 1; + } dm_put(pool_md); -- cgit v1.2.3 From a4ffc152198efba2ed9e6eac0eb97f17bfebce85 Mon Sep 17 00:00:00 2001 From: Mikulas Patocka Date: Wed, 28 Mar 2012 18:43:38 +0100 Subject: dm: add verity target This device-mapper target creates a read-only device that transparently validates the data on one underlying device against a pre-generated tree of cryptographic checksums stored on a second device. Two checksum device formats are supported: version 0 which is already shipping in Chromium OS and version 1 which incorporates some improvements. Signed-off-by: Mikulas Patocka Signed-off-by: Mandeep Singh Baines Signed-off-by: Will Drewry Signed-off-by: Elly Jones Cc: Milan Broz Cc: Olof Johansson Cc: Steffen Klassert Cc: Andrew Morton Signed-off-by: Alasdair G Kergon --- Documentation/device-mapper/verity.txt | 194 +++++++ drivers/md/Kconfig | 20 + drivers/md/Makefile | 1 + drivers/md/dm-verity.c | 913 +++++++++++++++++++++++++++++++++ 4 files changed, 1128 insertions(+) create mode 100644 Documentation/device-mapper/verity.txt create mode 100644 drivers/md/dm-verity.c (limited to 'Documentation') diff --git a/Documentation/device-mapper/verity.txt b/Documentation/device-mapper/verity.txt new file mode 100644 index 000000000000..32e48797a14f --- /dev/null +++ b/Documentation/device-mapper/verity.txt @@ -0,0 +1,194 @@ +dm-verity +========== + +Device-Mapper's "verity" target provides transparent integrity checking of +block devices using a cryptographic digest provided by the kernel crypto API. +This target is read-only. + +Construction Parameters +======================= + + + + + + + This is the version number of the on-disk format. + + 0 is the original format used in the Chromium OS. + The salt is appended when hashing, digests are stored continuously and + the rest of the block is padded with zeros. + + 1 is the current format that should be used for new devices. + The salt is prepended when hashing and each digest is + padded with zeros to the power of two. + + + This is the device containing the data the integrity of which needs to be + checked. It may be specified as a path, like /dev/sdaX, or a device number, + :. + + + This is the device that that supplies the hash tree data. It may be + specified similarly to the device path and may be the same device. If the + same device is used, the hash_start should be outside of the dm-verity + configured device size. + + + The block size on a data device. Each block corresponds to one digest on + the hash device. + + + The size of a hash block. + + + The number of data blocks on the data device. Additional blocks are + inaccessible. You can place hashes to the same partition as data, in this + case hashes are placed after . + + + This is the offset, in -blocks, from the start of hash_dev + to the root block of the hash tree. + + + The cryptographic hash algorithm used for this device. This should + be the name of the algorithm, like "sha1". + + + The hexadecimal encoding of the cryptographic hash of the root hash block + and the salt. This hash should be trusted as there is no other authenticity + beyond this point. + + + The hexadecimal encoding of the salt value. + +Theory of operation +=================== + +dm-verity is meant to be setup as part of a verified boot path. This +may be anything ranging from a boot using tboot or trustedgrub to just +booting from a known-good device (like a USB drive or CD). + +When a dm-verity device is configured, it is expected that the caller +has been authenticated in some way (cryptographic signatures, etc). +After instantiation, all hashes will be verified on-demand during +disk access. If they cannot be verified up to the root node of the +tree, the root hash, then the I/O will fail. This should identify +tampering with any data on the device and the hash data. + +Cryptographic hashes are used to assert the integrity of the device on a +per-block basis. This allows for a lightweight hash computation on first read +into the page cache. Block hashes are stored linearly-aligned to the nearest +block the size of a page. + +Hash Tree +--------- + +Each node in the tree is a cryptographic hash. If it is a leaf node, the hash +is of some block data on disk. If it is an intermediary node, then the hash is +of a number of child nodes. + +Each entry in the tree is a collection of neighboring nodes that fit in one +block. The number is determined based on block_size and the size of the +selected cryptographic digest algorithm. The hashes are linearly-ordered in +this entry and any unaligned trailing space is ignored but included when +calculating the parent node. + +The tree looks something like: + +alg = sha256, num_blocks = 32768, block_size = 4096 + + [ root ] + / . . . \ + [entry_0] [entry_1] + / . . . \ . . . \ + [entry_0_0] . . . [entry_0_127] . . . . [entry_1_127] + / ... \ / . . . \ / \ + blk_0 ... blk_127 blk_16256 blk_16383 blk_32640 . . . blk_32767 + + +On-disk format +============== + +Below is the recommended on-disk format. The verity kernel code does not +read the on-disk header. It only reads the hash blocks which directly +follow the header. It is expected that a user-space tool will verify the +integrity of the verity_header and then call dmsetup with the correct +parameters. Alternatively, the header can be omitted and the dmsetup +parameters can be passed via the kernel command-line in a rooted chain +of trust where the command-line is verified. + +The on-disk format is especially useful in cases where the hash blocks +are on a separate partition. The magic number allows easy identification +of the partition contents. Alternatively, the hash blocks can be stored +in the same partition as the data to be verified. In such a configuration +the filesystem on the partition would be sized a little smaller than +the full-partition, leaving room for the hash blocks. + +struct superblock { + uint8_t signature[8] + "verity\0\0"; + + uint8_t version; + 1 - current format + + uint8_t data_block_bits; + log2(data block size) + + uint8_t hash_block_bits; + log2(hash block size) + + uint8_t pad1[1]; + zero padding + + uint16_t salt_size; + big-endian salt size + + uint8_t pad2[2]; + zero padding + + uint32_t data_blocks_hi; + big-endian high 32 bits of the 64-bit number of data blocks + + uint32_t data_blocks_lo; + big-endian low 32 bits of the 64-bit number of data blocks + + uint8_t algorithm[16]; + cryptographic algorithm + + uint8_t salt[384]; + salt (the salt size is specified above) + + uint8_t pad3[88]; + zero padding to 512-byte boundary +} + +Directly following the header (and with sector number padded to the next hash +block boundary) are the hash blocks which are stored a depth at a time +(starting from the root), sorted in order of increasing index. + +Status +====== +V (for Valid) is returned if every check performed so far was valid. +If any check failed, C (for Corruption) is returned. + +Example +======= + +Setup a device: + dmsetup create vroot --table \ + "0 2097152 "\ + "verity 1 /dev/sda1 /dev/sda2 4096 4096 2097152 1 "\ + "4392712ba01368efdf14b05c76f9e4df0d53664630b5d48632ed17a137f39076 "\ + "1234000000000000000000000000000000000000000000000000000000000000" + +A command line tool veritysetup is available to compute or verify +the hash tree or activate the kernel driver. This is available from +the LVM2 upstream repository and may be supplied as a package called +device-mapper-verity-tools: + git://sources.redhat.com/git/lvm2 + http://sourceware.org/git/?p=lvm2.git + http://sourceware.org/cgi-bin/cvsweb.cgi/LVM2/verity?cvsroot=lvm2 + +veritysetup -a vroot /dev/sda1 /dev/sda2 \ + 4392712ba01368efdf14b05c76f9e4df0d53664630b5d48632ed17a137f39076 diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig index 71000078351a..10f122a3a856 100644 --- a/drivers/md/Kconfig +++ b/drivers/md/Kconfig @@ -370,4 +370,24 @@ config DM_FLAKEY ---help--- A target that intermittently fails I/O for debugging purposes. +config DM_VERITY + tristate "Verity target support (EXPERIMENTAL)" + depends on BLK_DEV_DM && EXPERIMENTAL + select CRYPTO + select CRYPTO_HASH + select DM_BUFIO + ---help--- + This device-mapper target creates a read-only device that + transparently validates the data on one underlying device against + a pre-generated tree of cryptographic checksums stored on a second + device. + + You'll need to activate the digests you're going to use in the + cryptoapi configuration. + + To compile this code as a module, choose M here: the module will + be called dm-verity. + + If unsure, say N. + endif # MD diff --git a/drivers/md/Makefile b/drivers/md/Makefile index 046860c7a166..8b2e0dffe82e 100644 --- a/drivers/md/Makefile +++ b/drivers/md/Makefile @@ -42,6 +42,7 @@ obj-$(CONFIG_DM_LOG_USERSPACE) += dm-log-userspace.o obj-$(CONFIG_DM_ZERO) += dm-zero.o obj-$(CONFIG_DM_RAID) += dm-raid.o obj-$(CONFIG_DM_THIN_PROVISIONING) += dm-thin-pool.o +obj-$(CONFIG_DM_VERITY) += dm-verity.o ifeq ($(CONFIG_DM_UEVENT),y) dm-mod-objs += dm-uevent.o diff --git a/drivers/md/dm-verity.c b/drivers/md/dm-verity.c new file mode 100644 index 000000000000..fa365d39b612 --- /dev/null +++ b/drivers/md/dm-verity.c @@ -0,0 +1,913 @@ +/* + * Copyright (C) 2012 Red Hat, Inc. + * + * Author: Mikulas Patocka + * + * Based on Chromium dm-verity driver (C) 2011 The Chromium OS Authors + * + * This file is released under the GPLv2. + * + * In the file "/sys/module/dm_verity/parameters/prefetch_cluster" you can set + * default prefetch value. Data are read in "prefetch_cluster" chunks from the + * hash device. Setting this greatly improves performance when data and hash + * are on the same disk on different partitions on devices with poor random + * access behavior. + */ + +#include "dm-bufio.h" + +#include +#include +#include + +#define DM_MSG_PREFIX "verity" + +#define DM_VERITY_IO_VEC_INLINE 16 +#define DM_VERITY_MEMPOOL_SIZE 4 +#define DM_VERITY_DEFAULT_PREFETCH_SIZE 262144 + +#define DM_VERITY_MAX_LEVELS 63 + +static unsigned dm_verity_prefetch_cluster = DM_VERITY_DEFAULT_PREFETCH_SIZE; + +module_param_named(prefetch_cluster, dm_verity_prefetch_cluster, uint, S_IRUGO | S_IWUSR); + +struct dm_verity { + struct dm_dev *data_dev; + struct dm_dev *hash_dev; + struct dm_target *ti; + struct dm_bufio_client *bufio; + char *alg_name; + struct crypto_shash *tfm; + u8 *root_digest; /* digest of the root block */ + u8 *salt; /* salt: its size is salt_size */ + unsigned salt_size; + sector_t data_start; /* data offset in 512-byte sectors */ + sector_t hash_start; /* hash start in blocks */ + sector_t data_blocks; /* the number of data blocks */ + sector_t hash_blocks; /* the number of hash blocks */ + unsigned char data_dev_block_bits; /* log2(data blocksize) */ + unsigned char hash_dev_block_bits; /* log2(hash blocksize) */ + unsigned char hash_per_block_bits; /* log2(hashes in hash block) */ + unsigned char levels; /* the number of tree levels */ + unsigned char version; + unsigned digest_size; /* digest size for the current hash algorithm */ + unsigned shash_descsize;/* the size of temporary space for crypto */ + int hash_failed; /* set to 1 if hash of any block failed */ + + mempool_t *io_mempool; /* mempool of struct dm_verity_io */ + mempool_t *vec_mempool; /* mempool of bio vector */ + + struct workqueue_struct *verify_wq; + + /* starting blocks for each tree level. 0 is the lowest level. */ + sector_t hash_level_block[DM_VERITY_MAX_LEVELS]; +}; + +struct dm_verity_io { + struct dm_verity *v; + struct bio *bio; + + /* original values of bio->bi_end_io and bio->bi_private */ + bio_end_io_t *orig_bi_end_io; + void *orig_bi_private; + + sector_t block; + unsigned n_blocks; + + /* saved bio vector */ + struct bio_vec *io_vec; + unsigned io_vec_size; + + struct work_struct work; + + /* A space for short vectors; longer vectors are allocated separately. */ + struct bio_vec io_vec_inline[DM_VERITY_IO_VEC_INLINE]; + + /* + * Three variably-size fields follow this struct: + * + * u8 hash_desc[v->shash_descsize]; + * u8 real_digest[v->digest_size]; + * u8 want_digest[v->digest_size]; + * + * To access them use: io_hash_desc(), io_real_digest() and io_want_digest(). + */ +}; + +static struct shash_desc *io_hash_desc(struct dm_verity *v, struct dm_verity_io *io) +{ + return (struct shash_desc *)(io + 1); +} + +static u8 *io_real_digest(struct dm_verity *v, struct dm_verity_io *io) +{ + return (u8 *)(io + 1) + v->shash_descsize; +} + +static u8 *io_want_digest(struct dm_verity *v, struct dm_verity_io *io) +{ + return (u8 *)(io + 1) + v->shash_descsize + v->digest_size; +} + +/* + * Auxiliary structure appended to each dm-bufio buffer. If the value + * hash_verified is nonzero, hash of the block has been verified. + * + * The variable hash_verified is set to 0 when allocating the buffer, then + * it can be changed to 1 and it is never reset to 0 again. + * + * There is no lock around this value, a race condition can at worst cause + * that multiple processes verify the hash of the same buffer simultaneously + * and write 1 to hash_verified simultaneously. + * This condition is harmless, so we don't need locking. + */ +struct buffer_aux { + int hash_verified; +}; + +/* + * Initialize struct buffer_aux for a freshly created buffer. + */ +static void dm_bufio_alloc_callback(struct dm_buffer *buf) +{ + struct buffer_aux *aux = dm_bufio_get_aux_data(buf); + + aux->hash_verified = 0; +} + +/* + * Translate input sector number to the sector number on the target device. + */ +static sector_t verity_map_sector(struct dm_verity *v, sector_t bi_sector) +{ + return v->data_start + dm_target_offset(v->ti, bi_sector); +} + +/* + * Return hash position of a specified block at a specified tree level + * (0 is the lowest level). + * The lowest "hash_per_block_bits"-bits of the result denote hash position + * inside a hash block. The remaining bits denote location of the hash block. + */ +static sector_t verity_position_at_level(struct dm_verity *v, sector_t block, + int level) +{ + return block >> (level * v->hash_per_block_bits); +} + +static void verity_hash_at_level(struct dm_verity *v, sector_t block, int level, + sector_t *hash_block, unsigned *offset) +{ + sector_t position = verity_position_at_level(v, block, level); + unsigned idx; + + *hash_block = v->hash_level_block[level] + (position >> v->hash_per_block_bits); + + if (!offset) + return; + + idx = position & ((1 << v->hash_per_block_bits) - 1); + if (!v->version) + *offset = idx * v->digest_size; + else + *offset = idx << (v->hash_dev_block_bits - v->hash_per_block_bits); +} + +/* + * Verify hash of a metadata block pertaining to the specified data block + * ("block" argument) at a specified level ("level" argument). + * + * On successful return, io_want_digest(v, io) contains the hash value for + * a lower tree level or for the data block (if we're at the lowest leve). + * + * If "skip_unverified" is true, unverified buffer is skipped and 1 is returned. + * If "skip_unverified" is false, unverified buffer is hashed and verified + * against current value of io_want_digest(v, io). + */ +static int verity_verify_level(struct dm_verity_io *io, sector_t block, + int level, bool skip_unverified) +{ + struct dm_verity *v = io->v; + struct dm_buffer *buf; + struct buffer_aux *aux; + u8 *data; + int r; + sector_t hash_block; + unsigned offset; + + verity_hash_at_level(v, block, level, &hash_block, &offset); + + data = dm_bufio_read(v->bufio, hash_block, &buf); + if (unlikely(IS_ERR(data))) + return PTR_ERR(data); + + aux = dm_bufio_get_aux_data(buf); + + if (!aux->hash_verified) { + struct shash_desc *desc; + u8 *result; + + if (skip_unverified) { + r = 1; + goto release_ret_r; + } + + desc = io_hash_desc(v, io); + desc->tfm = v->tfm; + desc->flags = CRYPTO_TFM_REQ_MAY_SLEEP; + r = crypto_shash_init(desc); + if (r < 0) { + DMERR("crypto_shash_init failed: %d", r); + goto release_ret_r; + } + + if (likely(v->version >= 1)) { + r = crypto_shash_update(desc, v->salt, v->salt_size); + if (r < 0) { + DMERR("crypto_shash_update failed: %d", r); + goto release_ret_r; + } + } + + r = crypto_shash_update(desc, data, 1 << v->hash_dev_block_bits); + if (r < 0) { + DMERR("crypto_shash_update failed: %d", r); + goto release_ret_r; + } + + if (!v->version) { + r = crypto_shash_update(desc, v->salt, v->salt_size); + if (r < 0) { + DMERR("crypto_shash_update failed: %d", r); + goto release_ret_r; + } + } + + result = io_real_digest(v, io); + r = crypto_shash_final(desc, result); + if (r < 0) { + DMERR("crypto_shash_final failed: %d", r); + goto release_ret_r; + } + if (unlikely(memcmp(result, io_want_digest(v, io), v->digest_size))) { + DMERR_LIMIT("metadata block %llu is corrupted", + (unsigned long long)hash_block); + v->hash_failed = 1; + r = -EIO; + goto release_ret_r; + } else + aux->hash_verified = 1; + } + + data += offset; + + memcpy(io_want_digest(v, io), data, v->digest_size); + + dm_bufio_release(buf); + return 0; + +release_ret_r: + dm_bufio_release(buf); + + return r; +} + +/* + * Verify one "dm_verity_io" structure. + */ +static int verity_verify_io(struct dm_verity_io *io) +{ + struct dm_verity *v = io->v; + unsigned b; + int i; + unsigned vector = 0, offset = 0; + + for (b = 0; b < io->n_blocks; b++) { + struct shash_desc *desc; + u8 *result; + int r; + unsigned todo; + + if (likely(v->levels)) { + /* + * First, we try to get the requested hash for + * the current block. If the hash block itself is + * verified, zero is returned. If it isn't, this + * function returns 0 and we fall back to whole + * chain verification. + */ + int r = verity_verify_level(io, io->block + b, 0, true); + if (likely(!r)) + goto test_block_hash; + if (r < 0) + return r; + } + + memcpy(io_want_digest(v, io), v->root_digest, v->digest_size); + + for (i = v->levels - 1; i >= 0; i--) { + int r = verity_verify_level(io, io->block + b, i, false); + if (unlikely(r)) + return r; + } + +test_block_hash: + desc = io_hash_desc(v, io); + desc->tfm = v->tfm; + desc->flags = CRYPTO_TFM_REQ_MAY_SLEEP; + r = crypto_shash_init(desc); + if (r < 0) { + DMERR("crypto_shash_init failed: %d", r); + return r; + } + + if (likely(v->version >= 1)) { + r = crypto_shash_update(desc, v->salt, v->salt_size); + if (r < 0) { + DMERR("crypto_shash_update failed: %d", r); + return r; + } + } + + todo = 1 << v->data_dev_block_bits; + do { + struct bio_vec *bv; + u8 *page; + unsigned len; + + BUG_ON(vector >= io->io_vec_size); + bv = &io->io_vec[vector]; + page = kmap_atomic(bv->bv_page); + len = bv->bv_len - offset; + if (likely(len >= todo)) + len = todo; + r = crypto_shash_update(desc, + page + bv->bv_offset + offset, len); + kunmap_atomic(page); + if (r < 0) { + DMERR("crypto_shash_update failed: %d", r); + return r; + } + offset += len; + if (likely(offset == bv->bv_len)) { + offset = 0; + vector++; + } + todo -= len; + } while (todo); + + if (!v->version) { + r = crypto_shash_update(desc, v->salt, v->salt_size); + if (r < 0) { + DMERR("crypto_shash_update failed: %d", r); + return r; + } + } + + result = io_real_digest(v, io); + r = crypto_shash_final(desc, result); + if (r < 0) { + DMERR("crypto_shash_final failed: %d", r); + return r; + } + if (unlikely(memcmp(result, io_want_digest(v, io), v->digest_size))) { + DMERR_LIMIT("data block %llu is corrupted", + (unsigned long long)(io->block + b)); + v->hash_failed = 1; + return -EIO; + } + } + BUG_ON(vector != io->io_vec_size); + BUG_ON(offset); + + return 0; +} + +/* + * End one "io" structure with a given error. + */ +static void verity_finish_io(struct dm_verity_io *io, int error) +{ + struct bio *bio = io->bio; + struct dm_verity *v = io->v; + + bio->bi_end_io = io->orig_bi_end_io; + bio->bi_private = io->orig_bi_private; + + if (io->io_vec != io->io_vec_inline) + mempool_free(io->io_vec, v->vec_mempool); + + mempool_free(io, v->io_mempool); + + bio_endio(bio, error); +} + +static void verity_work(struct work_struct *w) +{ + struct dm_verity_io *io = container_of(w, struct dm_verity_io, work); + + verity_finish_io(io, verity_verify_io(io)); +} + +static void verity_end_io(struct bio *bio, int error) +{ + struct dm_verity_io *io = bio->bi_private; + + if (error) { + verity_finish_io(io, error); + return; + } + + INIT_WORK(&io->work, verity_work); + queue_work(io->v->verify_wq, &io->work); +} + +/* + * Prefetch buffers for the specified io. + * The root buffer is not prefetched, it is assumed that it will be cached + * all the time. + */ +static void verity_prefetch_io(struct dm_verity *v, struct dm_verity_io *io) +{ + int i; + + for (i = v->levels - 2; i >= 0; i--) { + sector_t hash_block_start; + sector_t hash_block_end; + verity_hash_at_level(v, io->block, i, &hash_block_start, NULL); + verity_hash_at_level(v, io->block + io->n_blocks - 1, i, &hash_block_end, NULL); + if (!i) { + unsigned cluster = *(volatile unsigned *)&dm_verity_prefetch_cluster; + + cluster >>= v->data_dev_block_bits; + if (unlikely(!cluster)) + goto no_prefetch_cluster; + + if (unlikely(cluster & (cluster - 1))) + cluster = 1 << (fls(cluster) - 1); + + hash_block_start &= ~(sector_t)(cluster - 1); + hash_block_end |= cluster - 1; + if (unlikely(hash_block_end >= v->hash_blocks)) + hash_block_end = v->hash_blocks - 1; + } +no_prefetch_cluster: + dm_bufio_prefetch(v->bufio, hash_block_start, + hash_block_end - hash_block_start + 1); + } +} + +/* + * Bio map function. It allocates dm_verity_io structure and bio vector and + * fills them. Then it issues prefetches and the I/O. + */ +static int verity_map(struct dm_target *ti, struct bio *bio, + union map_info *map_context) +{ + struct dm_verity *v = ti->private; + struct dm_verity_io *io; + + bio->bi_bdev = v->data_dev->bdev; + bio->bi_sector = verity_map_sector(v, bio->bi_sector); + + if (((unsigned)bio->bi_sector | bio_sectors(bio)) & + ((1 << (v->data_dev_block_bits - SECTOR_SHIFT)) - 1)) { + DMERR_LIMIT("unaligned io"); + return -EIO; + } + + if ((bio->bi_sector + bio_sectors(bio)) >> + (v->data_dev_block_bits - SECTOR_SHIFT) > v->data_blocks) { + DMERR_LIMIT("io out of range"); + return -EIO; + } + + if (bio_data_dir(bio) == WRITE) + return -EIO; + + io = mempool_alloc(v->io_mempool, GFP_NOIO); + io->v = v; + io->bio = bio; + io->orig_bi_end_io = bio->bi_end_io; + io->orig_bi_private = bio->bi_private; + io->block = bio->bi_sector >> (v->data_dev_block_bits - SECTOR_SHIFT); + io->n_blocks = bio->bi_size >> v->data_dev_block_bits; + + bio->bi_end_io = verity_end_io; + bio->bi_private = io; + io->io_vec_size = bio->bi_vcnt - bio->bi_idx; + if (io->io_vec_size < DM_VERITY_IO_VEC_INLINE) + io->io_vec = io->io_vec_inline; + else + io->io_vec = mempool_alloc(v->vec_mempool, GFP_NOIO); + memcpy(io->io_vec, bio_iovec(bio), + io->io_vec_size * sizeof(struct bio_vec)); + + verity_prefetch_io(v, io); + + generic_make_request(bio); + + return DM_MAPIO_SUBMITTED; +} + +/* + * Status: V (valid) or C (corruption found) + */ +static int verity_status(struct dm_target *ti, status_type_t type, + char *result, unsigned maxlen) +{ + struct dm_verity *v = ti->private; + unsigned sz = 0; + unsigned x; + + switch (type) { + case STATUSTYPE_INFO: + DMEMIT("%c", v->hash_failed ? 'C' : 'V'); + break; + case STATUSTYPE_TABLE: + DMEMIT("%u %s %s %u %u %llu %llu %s ", + v->version, + v->data_dev->name, + v->hash_dev->name, + 1 << v->data_dev_block_bits, + 1 << v->hash_dev_block_bits, + (unsigned long long)v->data_blocks, + (unsigned long long)v->hash_start, + v->alg_name + ); + for (x = 0; x < v->digest_size; x++) + DMEMIT("%02x", v->root_digest[x]); + DMEMIT(" "); + if (!v->salt_size) + DMEMIT("-"); + else + for (x = 0; x < v->salt_size; x++) + DMEMIT("%02x", v->salt[x]); + break; + } + + return 0; +} + +static int verity_ioctl(struct dm_target *ti, unsigned cmd, + unsigned long arg) +{ + struct dm_verity *v = ti->private; + int r = 0; + + if (v->data_start || + ti->len != i_size_read(v->data_dev->bdev->bd_inode) >> SECTOR_SHIFT) + r = scsi_verify_blk_ioctl(NULL, cmd); + + return r ? : __blkdev_driver_ioctl(v->data_dev->bdev, v->data_dev->mode, + cmd, arg); +} + +static int verity_merge(struct dm_target *ti, struct bvec_merge_data *bvm, + struct bio_vec *biovec, int max_size) +{ + struct dm_verity *v = ti->private; + struct request_queue *q = bdev_get_queue(v->data_dev->bdev); + + if (!q->merge_bvec_fn) + return max_size; + + bvm->bi_bdev = v->data_dev->bdev; + bvm->bi_sector = verity_map_sector(v, bvm->bi_sector); + + return min(max_size, q->merge_bvec_fn(q, bvm, biovec)); +} + +static int verity_iterate_devices(struct dm_target *ti, + iterate_devices_callout_fn fn, void *data) +{ + struct dm_verity *v = ti->private; + + return fn(ti, v->data_dev, v->data_start, ti->len, data); +} + +static void verity_io_hints(struct dm_target *ti, struct queue_limits *limits) +{ + struct dm_verity *v = ti->private; + + if (limits->logical_block_size < 1 << v->data_dev_block_bits) + limits->logical_block_size = 1 << v->data_dev_block_bits; + + if (limits->physical_block_size < 1 << v->data_dev_block_bits) + limits->physical_block_size = 1 << v->data_dev_block_bits; + + blk_limits_io_min(limits, limits->logical_block_size); +} + +static void verity_dtr(struct dm_target *ti) +{ + struct dm_verity *v = ti->private; + + if (v->verify_wq) + destroy_workqueue(v->verify_wq); + + if (v->vec_mempool) + mempool_destroy(v->vec_mempool); + + if (v->io_mempool) + mempool_destroy(v->io_mempool); + + if (v->bufio) + dm_bufio_client_destroy(v->bufio); + + kfree(v->salt); + kfree(v->root_digest); + + if (v->tfm) + crypto_free_shash(v->tfm); + + kfree(v->alg_name); + + if (v->hash_dev) + dm_put_device(ti, v->hash_dev); + + if (v->data_dev) + dm_put_device(ti, v->data_dev); + + kfree(v); +} + +/* + * Target parameters: + * The current format is version 1. + * Vsn 0 is compatible with original Chromium OS releases. + * + * + * + * + * + * + * + * + * Hex string or "-" if no salt. + */ +static int verity_ctr(struct dm_target *ti, unsigned argc, char **argv) +{ + struct dm_verity *v; + unsigned num; + unsigned long long num_ll; + int r; + int i; + sector_t hash_position; + char dummy; + + v = kzalloc(sizeof(struct dm_verity), GFP_KERNEL); + if (!v) { + ti->error = "Cannot allocate verity structure"; + return -ENOMEM; + } + ti->private = v; + v->ti = ti; + + if ((dm_table_get_mode(ti->table) & ~FMODE_READ)) { + ti->error = "Device must be readonly"; + r = -EINVAL; + goto bad; + } + + if (argc != 10) { + ti->error = "Invalid argument count: exactly 10 arguments required"; + r = -EINVAL; + goto bad; + } + + if (sscanf(argv[0], "%d%c", &num, &dummy) != 1 || + num < 0 || num > 1) { + ti->error = "Invalid version"; + r = -EINVAL; + goto bad; + } + v->version = num; + + r = dm_get_device(ti, argv[1], FMODE_READ, &v->data_dev); + if (r) { + ti->error = "Data device lookup failed"; + goto bad; + } + + r = dm_get_device(ti, argv[2], FMODE_READ, &v->hash_dev); + if (r) { + ti->error = "Data device lookup failed"; + goto bad; + } + + if (sscanf(argv[3], "%u%c", &num, &dummy) != 1 || + !num || (num & (num - 1)) || + num < bdev_logical_block_size(v->data_dev->bdev) || + num > PAGE_SIZE) { + ti->error = "Invalid data device block size"; + r = -EINVAL; + goto bad; + } + v->data_dev_block_bits = ffs(num) - 1; + + if (sscanf(argv[4], "%u%c", &num, &dummy) != 1 || + !num || (num & (num - 1)) || + num < bdev_logical_block_size(v->hash_dev->bdev) || + num > INT_MAX) { + ti->error = "Invalid hash device block size"; + r = -EINVAL; + goto bad; + } + v->hash_dev_block_bits = ffs(num) - 1; + + if (sscanf(argv[5], "%llu%c", &num_ll, &dummy) != 1 || + num_ll << (v->data_dev_block_bits - SECTOR_SHIFT) != + (sector_t)num_ll << (v->data_dev_block_bits - SECTOR_SHIFT)) { + ti->error = "Invalid data blocks"; + r = -EINVAL; + goto bad; + } + v->data_blocks = num_ll; + + if (ti->len > (v->data_blocks << (v->data_dev_block_bits - SECTOR_SHIFT))) { + ti->error = "Data device is too small"; + r = -EINVAL; + goto bad; + } + + if (sscanf(argv[6], "%llu%c", &num_ll, &dummy) != 1 || + num_ll << (v->hash_dev_block_bits - SECTOR_SHIFT) != + (sector_t)num_ll << (v->hash_dev_block_bits - SECTOR_SHIFT)) { + ti->error = "Invalid hash start"; + r = -EINVAL; + goto bad; + } + v->hash_start = num_ll; + + v->alg_name = kstrdup(argv[7], GFP_KERNEL); + if (!v->alg_name) { + ti->error = "Cannot allocate algorithm name"; + r = -ENOMEM; + goto bad; + } + + v->tfm = crypto_alloc_shash(v->alg_name, 0, 0); + if (IS_ERR(v->tfm)) { + ti->error = "Cannot initialize hash function"; + r = PTR_ERR(v->tfm); + v->tfm = NULL; + goto bad; + } + v->digest_size = crypto_shash_digestsize(v->tfm); + if ((1 << v->hash_dev_block_bits) < v->digest_size * 2) { + ti->error = "Digest size too big"; + r = -EINVAL; + goto bad; + } + v->shash_descsize = + sizeof(struct shash_desc) + crypto_shash_descsize(v->tfm); + + v->root_digest = kmalloc(v->digest_size, GFP_KERNEL); + if (!v->root_digest) { + ti->error = "Cannot allocate root digest"; + r = -ENOMEM; + goto bad; + } + if (strlen(argv[8]) != v->digest_size * 2 || + hex2bin(v->root_digest, argv[8], v->digest_size)) { + ti->error = "Invalid root digest"; + r = -EINVAL; + goto bad; + } + + if (strcmp(argv[9], "-")) { + v->salt_size = strlen(argv[9]) / 2; + v->salt = kmalloc(v->salt_size, GFP_KERNEL); + if (!v->salt) { + ti->error = "Cannot allocate salt"; + r = -ENOMEM; + goto bad; + } + if (strlen(argv[9]) != v->salt_size * 2 || + hex2bin(v->salt, argv[9], v->salt_size)) { + ti->error = "Invalid salt"; + r = -EINVAL; + goto bad; + } + } + + v->hash_per_block_bits = + fls((1 << v->hash_dev_block_bits) / v->digest_size) - 1; + + v->levels = 0; + if (v->data_blocks) + while (v->hash_per_block_bits * v->levels < 64 && + (unsigned long long)(v->data_blocks - 1) >> + (v->hash_per_block_bits * v->levels)) + v->levels++; + + if (v->levels > DM_VERITY_MAX_LEVELS) { + ti->error = "Too many tree levels"; + r = -E2BIG; + goto bad; + } + + hash_position = v->hash_start; + for (i = v->levels - 1; i >= 0; i--) { + sector_t s; + v->hash_level_block[i] = hash_position; + s = verity_position_at_level(v, v->data_blocks, i); + s = (s >> v->hash_per_block_bits) + + !!(s & ((1 << v->hash_per_block_bits) - 1)); + if (hash_position + s < hash_position) { + ti->error = "Hash device offset overflow"; + r = -E2BIG; + goto bad; + } + hash_position += s; + } + v->hash_blocks = hash_position; + + v->bufio = dm_bufio_client_create(v->hash_dev->bdev, + 1 << v->hash_dev_block_bits, 1, sizeof(struct buffer_aux), + dm_bufio_alloc_callback, NULL); + if (IS_ERR(v->bufio)) { + ti->error = "Cannot initialize dm-bufio"; + r = PTR_ERR(v->bufio); + v->bufio = NULL; + goto bad; + } + + if (dm_bufio_get_device_size(v->bufio) < v->hash_blocks) { + ti->error = "Hash device is too small"; + r = -E2BIG; + goto bad; + } + + v->io_mempool = mempool_create_kmalloc_pool(DM_VERITY_MEMPOOL_SIZE, + sizeof(struct dm_verity_io) + v->shash_descsize + v->digest_size * 2); + if (!v->io_mempool) { + ti->error = "Cannot allocate io mempool"; + r = -ENOMEM; + goto bad; + } + + v->vec_mempool = mempool_create_kmalloc_pool(DM_VERITY_MEMPOOL_SIZE, + BIO_MAX_PAGES * sizeof(struct bio_vec)); + if (!v->vec_mempool) { + ti->error = "Cannot allocate vector mempool"; + r = -ENOMEM; + goto bad; + } + + /* WQ_UNBOUND greatly improves performance when running on ramdisk */ + v->verify_wq = alloc_workqueue("kverityd", WQ_CPU_INTENSIVE | WQ_MEM_RECLAIM | WQ_UNBOUND, num_online_cpus()); + if (!v->verify_wq) { + ti->error = "Cannot allocate workqueue"; + r = -ENOMEM; + goto bad; + } + + return 0; + +bad: + verity_dtr(ti); + + return r; +} + +static struct target_type verity_target = { + .name = "verity", + .version = {1, 0, 0}, + .module = THIS_MODULE, + .ctr = verity_ctr, + .dtr = verity_dtr, + .map = verity_map, + .status = verity_status, + .ioctl = verity_ioctl, + .merge = verity_merge, + .iterate_devices = verity_iterate_devices, + .io_hints = verity_io_hints, +}; + +static int __init dm_verity_init(void) +{ + int r; + + r = dm_register_target(&verity_target); + if (r < 0) + DMERR("register failed %d", r); + + return r; +} + +static void __exit dm_verity_exit(void) +{ + dm_unregister_target(&verity_target); +} + +module_init(dm_verity_init); +module_exit(dm_verity_exit); + +MODULE_AUTHOR("Mikulas Patocka "); +MODULE_AUTHOR("Mandeep Baines "); +MODULE_AUTHOR("Will Drewry "); +MODULE_DESCRIPTION(DM_NAME " target for transparent disk integrity checking"); +MODULE_LICENSE("GPL"); -- cgit v1.2.3 From 83619ea08e9abe0f5ebcfc569a829d1105a1685e Mon Sep 17 00:00:00 2001 From: Jamie Lentin Date: Tue, 27 Mar 2012 22:54:15 +0100 Subject: mtd: Move fdt partition documentation to a seperate file Partitions are described in the same way for all mtd devices when using devicetree, move the documentation to a separate file and add references to it. Signed-off-by: Jamie Lentin Acked-by: Arnd Bergmann Acked-by: Jason Cooper Signed-off-by: Grant Likely --- .../devicetree/bindings/mtd/arm-versatile.txt | 4 +-- .../devicetree/bindings/mtd/atmel-dataflash.txt | 3 ++ .../devicetree/bindings/mtd/fsl-upm-nand.txt | 4 +++ .../devicetree/bindings/mtd/gpio-control-nand.txt | 3 ++ .../devicetree/bindings/mtd/mtd-physmap.txt | 23 ++----------- .../devicetree/bindings/mtd/partition.txt | 38 ++++++++++++++++++++++ 6 files changed, 52 insertions(+), 23 deletions(-) create mode 100644 Documentation/devicetree/bindings/mtd/partition.txt (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/mtd/arm-versatile.txt b/Documentation/devicetree/bindings/mtd/arm-versatile.txt index 476845db94d0..beace4b89daa 100644 --- a/Documentation/devicetree/bindings/mtd/arm-versatile.txt +++ b/Documentation/devicetree/bindings/mtd/arm-versatile.txt @@ -4,5 +4,5 @@ Required properties: - compatible : must be "arm,versatile-flash"; - bank-width : width in bytes of flash interface. -Optional properties: -- Subnode partition map from mtd flash binding +The device tree may optionally contain sub-nodes describing partitions of the +address space. See partition.txt for more detail. diff --git a/Documentation/devicetree/bindings/mtd/atmel-dataflash.txt b/Documentation/devicetree/bindings/mtd/atmel-dataflash.txt index ef66ddd01da0..1889a4db5b7c 100644 --- a/Documentation/devicetree/bindings/mtd/atmel-dataflash.txt +++ b/Documentation/devicetree/bindings/mtd/atmel-dataflash.txt @@ -3,6 +3,9 @@ Required properties: - compatible : "atmel,", "atmel,", "atmel,dataflash". +The device tree may optionally contain sub-nodes describing partitions of the +address space. See partition.txt for more detail. + Example: flash@1 { diff --git a/Documentation/devicetree/bindings/mtd/fsl-upm-nand.txt b/Documentation/devicetree/bindings/mtd/fsl-upm-nand.txt index 00f1f546b32e..fce4894f5a98 100644 --- a/Documentation/devicetree/bindings/mtd/fsl-upm-nand.txt +++ b/Documentation/devicetree/bindings/mtd/fsl-upm-nand.txt @@ -19,6 +19,10 @@ Optional properties: read registers (tR). Required if property "gpios" is not used (R/B# pins not connected). +Each flash chip described may optionally contain additional sub-nodes +describing partitions of the address space. See partition.txt for more +detail. + Examples: upm@1,0 { diff --git a/Documentation/devicetree/bindings/mtd/gpio-control-nand.txt b/Documentation/devicetree/bindings/mtd/gpio-control-nand.txt index 719f4dc58df7..36ef07d3c90f 100644 --- a/Documentation/devicetree/bindings/mtd/gpio-control-nand.txt +++ b/Documentation/devicetree/bindings/mtd/gpio-control-nand.txt @@ -25,6 +25,9 @@ Optional properties: GPIO state and before and after command byte writes, this register will be read to ensure that the GPIO accesses have completed. +The device tree may optionally contain sub-nodes describing partitions of the +address space. See partition.txt for more detail. + Examples: gpio-nand@1,0 { diff --git a/Documentation/devicetree/bindings/mtd/mtd-physmap.txt b/Documentation/devicetree/bindings/mtd/mtd-physmap.txt index 80152cb567d9..a63c2bd7de2b 100644 --- a/Documentation/devicetree/bindings/mtd/mtd-physmap.txt +++ b/Documentation/devicetree/bindings/mtd/mtd-physmap.txt @@ -23,27 +23,8 @@ are defined: - vendor-id : Contains the flash chip's vendor id (1 byte). - device-id : Contains the flash chip's device id (1 byte). -In addition to the information on the mtd bank itself, the -device tree may optionally contain additional information -describing partitions of the address space. This can be -used on platforms which have strong conventions about which -portions of a flash are used for what purposes, but which don't -use an on-flash partition table such as RedBoot. - -Each partition is represented as a sub-node of the mtd device. -Each node's name represents the name of the corresponding -partition of the mtd device. - -Flash partitions - - reg : The partition's offset and size within the mtd bank. - - label : (optional) The label / name for this partition. - If omitted, the label is taken from the node name (excluding - the unit address). - - read-only : (optional) This parameter, if present, is a hint to - Linux that this partition should only be mounted - read-only. This is usually used for flash partitions - containing early-boot firmware images or data which should not - be clobbered. +The device tree may optionally contain sub-nodes describing partitions of the +address space. See partition.txt for more detail. Example: diff --git a/Documentation/devicetree/bindings/mtd/partition.txt b/Documentation/devicetree/bindings/mtd/partition.txt new file mode 100644 index 000000000000..f114ce1657c2 --- /dev/null +++ b/Documentation/devicetree/bindings/mtd/partition.txt @@ -0,0 +1,38 @@ +Representing flash partitions in devicetree + +Partitions can be represented by sub-nodes of an mtd device. This can be used +on platforms which have strong conventions about which portions of a flash are +used for what purposes, but which don't use an on-flash partition table such +as RedBoot. + +#address-cells & #size-cells must both be present in the mtd device and be +equal to 1. + +Required properties: +- reg : The partition's offset and size within the mtd bank. + +Optional properties: +- label : The label / name for this partition. If omitted, the label is taken + from the node name (excluding the unit address). +- read-only : This parameter, if present, is a hint to Linux that this + partition should only be mounted read-only. This is usually used for flash + partitions containing early-boot firmware images or data which should not be + clobbered. + +Examples: + + +flash@0 { + #address-cells = <1>; + #size-cells = <1>; + + partition@0 { + label = "u-boot"; + reg = <0x0000000 0x100000>; + read-only; + }; + + uimage@100000 { + reg = <0x0100000 0x200000>; + }; +]; -- cgit v1.2.3 From c6dd897f3bfc54a44942d742d6dfa842e33d88e0 Mon Sep 17 00:00:00 2001 From: Dave Young Date: Wed, 28 Mar 2012 14:42:55 -0700 Subject: mm: move page-types.c from Documentation to tools/vm tools/ is the better place for vm tools which are used by many people. Moving them to tools also make them open to more users instead of hide in Documentation folder. This patch moves page-types.c to tools/vm/page-types.c. Also add a Makefile in tools/vm and fix two coding style problems: a) change const arrary to 'const char * const', b) change a space to tab for indent. Signed-off-by: Dave Young Acked-by: Wu Fengguang Cc: Christoph Lameter Cc: Pekka Enberg Cc: Frederic Weisbecker Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/vm/Makefile | 2 +- Documentation/vm/page-types.c | 1102 ----------------------------------------- tools/vm/Makefile | 11 + tools/vm/page-types.c | 1102 +++++++++++++++++++++++++++++++++++++++++ 4 files changed, 1114 insertions(+), 1103 deletions(-) delete mode 100644 Documentation/vm/page-types.c create mode 100644 tools/vm/Makefile create mode 100644 tools/vm/page-types.c (limited to 'Documentation') diff --git a/Documentation/vm/Makefile b/Documentation/vm/Makefile index 3fa4d0668864..e538864bfc63 100644 --- a/Documentation/vm/Makefile +++ b/Documentation/vm/Makefile @@ -2,7 +2,7 @@ obj- := dummy.o # List of programs to build -hostprogs-y := page-types hugepage-mmap hugepage-shm map_hugetlb +hostprogs-y := hugepage-mmap hugepage-shm map_hugetlb # Tell kbuild to always build the programs always := $(hostprogs-y) diff --git a/Documentation/vm/page-types.c b/Documentation/vm/page-types.c deleted file mode 100644 index 0b13f02d4059..000000000000 --- a/Documentation/vm/page-types.c +++ /dev/null @@ -1,1102 +0,0 @@ -/* - * page-types: Tool for querying page flags - * - * This program is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License as published by the Free - * Software Foundation; version 2. - * - * This program is distributed in the hope that it will be useful, but WITHOUT - * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or - * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for - * more details. - * - * You should find a copy of v2 of the GNU General Public License somewhere on - * your Linux system; if not, write to the Free Software Foundation, Inc., 59 - * Temple Place, Suite 330, Boston, MA 02111-1307 USA. - * - * Copyright (C) 2009 Intel corporation - * - * Authors: Wu Fengguang - */ - -#define _LARGEFILE64_SOURCE -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include "../../include/linux/magic.h" - - -#ifndef MAX_PATH -# define MAX_PATH 256 -#endif - -#ifndef STR -# define _STR(x) #x -# define STR(x) _STR(x) -#endif - -/* - * pagemap kernel ABI bits - */ - -#define PM_ENTRY_BYTES sizeof(uint64_t) -#define PM_STATUS_BITS 3 -#define PM_STATUS_OFFSET (64 - PM_STATUS_BITS) -#define PM_STATUS_MASK (((1LL << PM_STATUS_BITS) - 1) << PM_STATUS_OFFSET) -#define PM_STATUS(nr) (((nr) << PM_STATUS_OFFSET) & PM_STATUS_MASK) -#define PM_PSHIFT_BITS 6 -#define PM_PSHIFT_OFFSET (PM_STATUS_OFFSET - PM_PSHIFT_BITS) -#define PM_PSHIFT_MASK (((1LL << PM_PSHIFT_BITS) - 1) << PM_PSHIFT_OFFSET) -#define PM_PSHIFT(x) (((u64) (x) << PM_PSHIFT_OFFSET) & PM_PSHIFT_MASK) -#define PM_PFRAME_MASK ((1LL << PM_PSHIFT_OFFSET) - 1) -#define PM_PFRAME(x) ((x) & PM_PFRAME_MASK) - -#define PM_PRESENT PM_STATUS(4LL) -#define PM_SWAP PM_STATUS(2LL) - - -/* - * kernel page flags - */ - -#define KPF_BYTES 8 -#define PROC_KPAGEFLAGS "/proc/kpageflags" - -/* copied from kpageflags_read() */ -#define KPF_LOCKED 0 -#define KPF_ERROR 1 -#define KPF_REFERENCED 2 -#define KPF_UPTODATE 3 -#define KPF_DIRTY 4 -#define KPF_LRU 5 -#define KPF_ACTIVE 6 -#define KPF_SLAB 7 -#define KPF_WRITEBACK 8 -#define KPF_RECLAIM 9 -#define KPF_BUDDY 10 - -/* [11-20] new additions in 2.6.31 */ -#define KPF_MMAP 11 -#define KPF_ANON 12 -#define KPF_SWAPCACHE 13 -#define KPF_SWAPBACKED 14 -#define KPF_COMPOUND_HEAD 15 -#define KPF_COMPOUND_TAIL 16 -#define KPF_HUGE 17 -#define KPF_UNEVICTABLE 18 -#define KPF_HWPOISON 19 -#define KPF_NOPAGE 20 -#define KPF_KSM 21 -#define KPF_THP 22 - -/* [32-] kernel hacking assistances */ -#define KPF_RESERVED 32 -#define KPF_MLOCKED 33 -#define KPF_MAPPEDTODISK 34 -#define KPF_PRIVATE 35 -#define KPF_PRIVATE_2 36 -#define KPF_OWNER_PRIVATE 37 -#define KPF_ARCH 38 -#define KPF_UNCACHED 39 - -/* [48-] take some arbitrary free slots for expanding overloaded flags - * not part of kernel API - */ -#define KPF_READAHEAD 48 -#define KPF_SLOB_FREE 49 -#define KPF_SLUB_FROZEN 50 -#define KPF_SLUB_DEBUG 51 - -#define KPF_ALL_BITS ((uint64_t)~0ULL) -#define KPF_HACKERS_BITS (0xffffULL << 32) -#define KPF_OVERLOADED_BITS (0xffffULL << 48) -#define BIT(name) (1ULL << KPF_##name) -#define BITS_COMPOUND (BIT(COMPOUND_HEAD) | BIT(COMPOUND_TAIL)) - -static const char *page_flag_names[] = { - [KPF_LOCKED] = "L:locked", - [KPF_ERROR] = "E:error", - [KPF_REFERENCED] = "R:referenced", - [KPF_UPTODATE] = "U:uptodate", - [KPF_DIRTY] = "D:dirty", - [KPF_LRU] = "l:lru", - [KPF_ACTIVE] = "A:active", - [KPF_SLAB] = "S:slab", - [KPF_WRITEBACK] = "W:writeback", - [KPF_RECLAIM] = "I:reclaim", - [KPF_BUDDY] = "B:buddy", - - [KPF_MMAP] = "M:mmap", - [KPF_ANON] = "a:anonymous", - [KPF_SWAPCACHE] = "s:swapcache", - [KPF_SWAPBACKED] = "b:swapbacked", - [KPF_COMPOUND_HEAD] = "H:compound_head", - [KPF_COMPOUND_TAIL] = "T:compound_tail", - [KPF_HUGE] = "G:huge", - [KPF_UNEVICTABLE] = "u:unevictable", - [KPF_HWPOISON] = "X:hwpoison", - [KPF_NOPAGE] = "n:nopage", - [KPF_KSM] = "x:ksm", - [KPF_THP] = "t:thp", - - [KPF_RESERVED] = "r:reserved", - [KPF_MLOCKED] = "m:mlocked", - [KPF_MAPPEDTODISK] = "d:mappedtodisk", - [KPF_PRIVATE] = "P:private", - [KPF_PRIVATE_2] = "p:private_2", - [KPF_OWNER_PRIVATE] = "O:owner_private", - [KPF_ARCH] = "h:arch", - [KPF_UNCACHED] = "c:uncached", - - [KPF_READAHEAD] = "I:readahead", - [KPF_SLOB_FREE] = "P:slob_free", - [KPF_SLUB_FROZEN] = "A:slub_frozen", - [KPF_SLUB_DEBUG] = "E:slub_debug", -}; - - -static const char *debugfs_known_mountpoints[] = { - "/sys/kernel/debug", - "/debug", - 0, -}; - -/* - * data structures - */ - -static int opt_raw; /* for kernel developers */ -static int opt_list; /* list pages (in ranges) */ -static int opt_no_summary; /* don't show summary */ -static pid_t opt_pid; /* process to walk */ - -#define MAX_ADDR_RANGES 1024 -static int nr_addr_ranges; -static unsigned long opt_offset[MAX_ADDR_RANGES]; -static unsigned long opt_size[MAX_ADDR_RANGES]; - -#define MAX_VMAS 10240 -static int nr_vmas; -static unsigned long pg_start[MAX_VMAS]; -static unsigned long pg_end[MAX_VMAS]; - -#define MAX_BIT_FILTERS 64 -static int nr_bit_filters; -static uint64_t opt_mask[MAX_BIT_FILTERS]; -static uint64_t opt_bits[MAX_BIT_FILTERS]; - -static int page_size; - -static int pagemap_fd; -static int kpageflags_fd; - -static int opt_hwpoison; -static int opt_unpoison; - -static char hwpoison_debug_fs[MAX_PATH+1]; -static int hwpoison_inject_fd; -static int hwpoison_forget_fd; - -#define HASH_SHIFT 13 -#define HASH_SIZE (1 << HASH_SHIFT) -#define HASH_MASK (HASH_SIZE - 1) -#define HASH_KEY(flags) (flags & HASH_MASK) - -static unsigned long total_pages; -static unsigned long nr_pages[HASH_SIZE]; -static uint64_t page_flags[HASH_SIZE]; - - -/* - * helper functions - */ - -#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) - -#define min_t(type, x, y) ({ \ - type __min1 = (x); \ - type __min2 = (y); \ - __min1 < __min2 ? __min1 : __min2; }) - -#define max_t(type, x, y) ({ \ - type __max1 = (x); \ - type __max2 = (y); \ - __max1 > __max2 ? __max1 : __max2; }) - -static unsigned long pages2mb(unsigned long pages) -{ - return (pages * page_size) >> 20; -} - -static void fatal(const char *x, ...) -{ - va_list ap; - - va_start(ap, x); - vfprintf(stderr, x, ap); - va_end(ap); - exit(EXIT_FAILURE); -} - -static int checked_open(const char *pathname, int flags) -{ - int fd = open(pathname, flags); - - if (fd < 0) { - perror(pathname); - exit(EXIT_FAILURE); - } - - return fd; -} - -/* - * pagemap/kpageflags routines - */ - -static unsigned long do_u64_read(int fd, char *name, - uint64_t *buf, - unsigned long index, - unsigned long count) -{ - long bytes; - - if (index > ULONG_MAX / 8) - fatal("index overflow: %lu\n", index); - - if (lseek(fd, index * 8, SEEK_SET) < 0) { - perror(name); - exit(EXIT_FAILURE); - } - - bytes = read(fd, buf, count * 8); - if (bytes < 0) { - perror(name); - exit(EXIT_FAILURE); - } - if (bytes % 8) - fatal("partial read: %lu bytes\n", bytes); - - return bytes / 8; -} - -static unsigned long kpageflags_read(uint64_t *buf, - unsigned long index, - unsigned long pages) -{ - return do_u64_read(kpageflags_fd, PROC_KPAGEFLAGS, buf, index, pages); -} - -static unsigned long pagemap_read(uint64_t *buf, - unsigned long index, - unsigned long pages) -{ - return do_u64_read(pagemap_fd, "/proc/pid/pagemap", buf, index, pages); -} - -static unsigned long pagemap_pfn(uint64_t val) -{ - unsigned long pfn; - - if (val & PM_PRESENT) - pfn = PM_PFRAME(val); - else - pfn = 0; - - return pfn; -} - - -/* - * page flag names - */ - -static char *page_flag_name(uint64_t flags) -{ - static char buf[65]; - int present; - int i, j; - - for (i = 0, j = 0; i < ARRAY_SIZE(page_flag_names); i++) { - present = (flags >> i) & 1; - if (!page_flag_names[i]) { - if (present) - fatal("unknown flag bit %d\n", i); - continue; - } - buf[j++] = present ? page_flag_names[i][0] : '_'; - } - - return buf; -} - -static char *page_flag_longname(uint64_t flags) -{ - static char buf[1024]; - int i, n; - - for (i = 0, n = 0; i < ARRAY_SIZE(page_flag_names); i++) { - if (!page_flag_names[i]) - continue; - if ((flags >> i) & 1) - n += snprintf(buf + n, sizeof(buf) - n, "%s,", - page_flag_names[i] + 2); - } - if (n) - n--; - buf[n] = '\0'; - - return buf; -} - - -/* - * page list and summary - */ - -static void show_page_range(unsigned long voffset, - unsigned long offset, uint64_t flags) -{ - static uint64_t flags0; - static unsigned long voff; - static unsigned long index; - static unsigned long count; - - if (flags == flags0 && offset == index + count && - (!opt_pid || voffset == voff + count)) { - count++; - return; - } - - if (count) { - if (opt_pid) - printf("%lx\t", voff); - printf("%lx\t%lx\t%s\n", - index, count, page_flag_name(flags0)); - } - - flags0 = flags; - index = offset; - voff = voffset; - count = 1; -} - -static void show_page(unsigned long voffset, - unsigned long offset, uint64_t flags) -{ - if (opt_pid) - printf("%lx\t", voffset); - printf("%lx\t%s\n", offset, page_flag_name(flags)); -} - -static void show_summary(void) -{ - int i; - - printf(" flags\tpage-count MB" - " symbolic-flags\t\t\tlong-symbolic-flags\n"); - - for (i = 0; i < ARRAY_SIZE(nr_pages); i++) { - if (nr_pages[i]) - printf("0x%016llx\t%10lu %8lu %s\t%s\n", - (unsigned long long)page_flags[i], - nr_pages[i], - pages2mb(nr_pages[i]), - page_flag_name(page_flags[i]), - page_flag_longname(page_flags[i])); - } - - printf(" total\t%10lu %8lu\n", - total_pages, pages2mb(total_pages)); -} - - -/* - * page flag filters - */ - -static int bit_mask_ok(uint64_t flags) -{ - int i; - - for (i = 0; i < nr_bit_filters; i++) { - if (opt_bits[i] == KPF_ALL_BITS) { - if ((flags & opt_mask[i]) == 0) - return 0; - } else { - if ((flags & opt_mask[i]) != opt_bits[i]) - return 0; - } - } - - return 1; -} - -static uint64_t expand_overloaded_flags(uint64_t flags) -{ - /* SLOB/SLUB overload several page flags */ - if (flags & BIT(SLAB)) { - if (flags & BIT(PRIVATE)) - flags ^= BIT(PRIVATE) | BIT(SLOB_FREE); - if (flags & BIT(ACTIVE)) - flags ^= BIT(ACTIVE) | BIT(SLUB_FROZEN); - if (flags & BIT(ERROR)) - flags ^= BIT(ERROR) | BIT(SLUB_DEBUG); - } - - /* PG_reclaim is overloaded as PG_readahead in the read path */ - if ((flags & (BIT(RECLAIM) | BIT(WRITEBACK))) == BIT(RECLAIM)) - flags ^= BIT(RECLAIM) | BIT(READAHEAD); - - return flags; -} - -static uint64_t well_known_flags(uint64_t flags) -{ - /* hide flags intended only for kernel hacker */ - flags &= ~KPF_HACKERS_BITS; - - /* hide non-hugeTLB compound pages */ - if ((flags & BITS_COMPOUND) && !(flags & BIT(HUGE))) - flags &= ~BITS_COMPOUND; - - return flags; -} - -static uint64_t kpageflags_flags(uint64_t flags) -{ - flags = expand_overloaded_flags(flags); - - if (!opt_raw) - flags = well_known_flags(flags); - - return flags; -} - -/* verify that a mountpoint is actually a debugfs instance */ -static int debugfs_valid_mountpoint(const char *debugfs) -{ - struct statfs st_fs; - - if (statfs(debugfs, &st_fs) < 0) - return -ENOENT; - else if (st_fs.f_type != (long) DEBUGFS_MAGIC) - return -ENOENT; - - return 0; -} - -/* find the path to the mounted debugfs */ -static const char *debugfs_find_mountpoint(void) -{ - const char **ptr; - char type[100]; - FILE *fp; - - ptr = debugfs_known_mountpoints; - while (*ptr) { - if (debugfs_valid_mountpoint(*ptr) == 0) { - strcpy(hwpoison_debug_fs, *ptr); - return hwpoison_debug_fs; - } - ptr++; - } - - /* give up and parse /proc/mounts */ - fp = fopen("/proc/mounts", "r"); - if (fp == NULL) - perror("Can't open /proc/mounts for read"); - - while (fscanf(fp, "%*s %" - STR(MAX_PATH) - "s %99s %*s %*d %*d\n", - hwpoison_debug_fs, type) == 2) { - if (strcmp(type, "debugfs") == 0) - break; - } - fclose(fp); - - if (strcmp(type, "debugfs") != 0) - return NULL; - - return hwpoison_debug_fs; -} - -/* mount the debugfs somewhere if it's not mounted */ - -static void debugfs_mount(void) -{ - const char **ptr; - - /* see if it's already mounted */ - if (debugfs_find_mountpoint()) - return; - - ptr = debugfs_known_mountpoints; - while (*ptr) { - if (mount(NULL, *ptr, "debugfs", 0, NULL) == 0) { - /* save the mountpoint */ - strcpy(hwpoison_debug_fs, *ptr); - break; - } - ptr++; - } - - if (*ptr == NULL) { - perror("mount debugfs"); - exit(EXIT_FAILURE); - } -} - -/* - * page actions - */ - -static void prepare_hwpoison_fd(void) -{ - char buf[MAX_PATH + 1]; - - debugfs_mount(); - - if (opt_hwpoison && !hwpoison_inject_fd) { - snprintf(buf, MAX_PATH, "%s/hwpoison/corrupt-pfn", - hwpoison_debug_fs); - hwpoison_inject_fd = checked_open(buf, O_WRONLY); - } - - if (opt_unpoison && !hwpoison_forget_fd) { - snprintf(buf, MAX_PATH, "%s/hwpoison/unpoison-pfn", - hwpoison_debug_fs); - hwpoison_forget_fd = checked_open(buf, O_WRONLY); - } -} - -static int hwpoison_page(unsigned long offset) -{ - char buf[100]; - int len; - - len = sprintf(buf, "0x%lx\n", offset); - len = write(hwpoison_inject_fd, buf, len); - if (len < 0) { - perror("hwpoison inject"); - return len; - } - return 0; -} - -static int unpoison_page(unsigned long offset) -{ - char buf[100]; - int len; - - len = sprintf(buf, "0x%lx\n", offset); - len = write(hwpoison_forget_fd, buf, len); - if (len < 0) { - perror("hwpoison forget"); - return len; - } - return 0; -} - -/* - * page frame walker - */ - -static int hash_slot(uint64_t flags) -{ - int k = HASH_KEY(flags); - int i; - - /* Explicitly reserve slot 0 for flags 0: the following logic - * cannot distinguish an unoccupied slot from slot (flags==0). - */ - if (flags == 0) - return 0; - - /* search through the remaining (HASH_SIZE-1) slots */ - for (i = 1; i < ARRAY_SIZE(page_flags); i++, k++) { - if (!k || k >= ARRAY_SIZE(page_flags)) - k = 1; - if (page_flags[k] == 0) { - page_flags[k] = flags; - return k; - } - if (page_flags[k] == flags) - return k; - } - - fatal("hash table full: bump up HASH_SHIFT?\n"); - exit(EXIT_FAILURE); -} - -static void add_page(unsigned long voffset, - unsigned long offset, uint64_t flags) -{ - flags = kpageflags_flags(flags); - - if (!bit_mask_ok(flags)) - return; - - if (opt_hwpoison) - hwpoison_page(offset); - if (opt_unpoison) - unpoison_page(offset); - - if (opt_list == 1) - show_page_range(voffset, offset, flags); - else if (opt_list == 2) - show_page(voffset, offset, flags); - - nr_pages[hash_slot(flags)]++; - total_pages++; -} - -#define KPAGEFLAGS_BATCH (64 << 10) /* 64k pages */ -static void walk_pfn(unsigned long voffset, - unsigned long index, - unsigned long count) -{ - uint64_t buf[KPAGEFLAGS_BATCH]; - unsigned long batch; - long pages; - unsigned long i; - - while (count) { - batch = min_t(unsigned long, count, KPAGEFLAGS_BATCH); - pages = kpageflags_read(buf, index, batch); - if (pages == 0) - break; - - for (i = 0; i < pages; i++) - add_page(voffset + i, index + i, buf[i]); - - index += pages; - count -= pages; - } -} - -#define PAGEMAP_BATCH (64 << 10) -static void walk_vma(unsigned long index, unsigned long count) -{ - uint64_t buf[PAGEMAP_BATCH]; - unsigned long batch; - unsigned long pages; - unsigned long pfn; - unsigned long i; - - while (count) { - batch = min_t(unsigned long, count, PAGEMAP_BATCH); - pages = pagemap_read(buf, index, batch); - if (pages == 0) - break; - - for (i = 0; i < pages; i++) { - pfn = pagemap_pfn(buf[i]); - if (pfn) - walk_pfn(index + i, pfn, 1); - } - - index += pages; - count -= pages; - } -} - -static void walk_task(unsigned long index, unsigned long count) -{ - const unsigned long end = index + count; - unsigned long start; - int i = 0; - - while (index < end) { - - while (pg_end[i] <= index) - if (++i >= nr_vmas) - return; - if (pg_start[i] >= end) - return; - - start = max_t(unsigned long, pg_start[i], index); - index = min_t(unsigned long, pg_end[i], end); - - assert(start < index); - walk_vma(start, index - start); - } -} - -static void add_addr_range(unsigned long offset, unsigned long size) -{ - if (nr_addr_ranges >= MAX_ADDR_RANGES) - fatal("too many addr ranges\n"); - - opt_offset[nr_addr_ranges] = offset; - opt_size[nr_addr_ranges] = min_t(unsigned long, size, ULONG_MAX-offset); - nr_addr_ranges++; -} - -static void walk_addr_ranges(void) -{ - int i; - - kpageflags_fd = checked_open(PROC_KPAGEFLAGS, O_RDONLY); - - if (!nr_addr_ranges) - add_addr_range(0, ULONG_MAX); - - for (i = 0; i < nr_addr_ranges; i++) - if (!opt_pid) - walk_pfn(0, opt_offset[i], opt_size[i]); - else - walk_task(opt_offset[i], opt_size[i]); - - close(kpageflags_fd); -} - - -/* - * user interface - */ - -static const char *page_flag_type(uint64_t flag) -{ - if (flag & KPF_HACKERS_BITS) - return "(r)"; - if (flag & KPF_OVERLOADED_BITS) - return "(o)"; - return " "; -} - -static void usage(void) -{ - int i, j; - - printf( -"page-types [options]\n" -" -r|--raw Raw mode, for kernel developers\n" -" -d|--describe flags Describe flags\n" -" -a|--addr addr-spec Walk a range of pages\n" -" -b|--bits bits-spec Walk pages with specified bits\n" -" -p|--pid pid Walk process address space\n" -#if 0 /* planned features */ -" -f|--file filename Walk file address space\n" -#endif -" -l|--list Show page details in ranges\n" -" -L|--list-each Show page details one by one\n" -" -N|--no-summary Don't show summary info\n" -" -X|--hwpoison hwpoison pages\n" -" -x|--unpoison unpoison pages\n" -" -h|--help Show this usage message\n" -"flags:\n" -" 0x10 bitfield format, e.g.\n" -" anon bit-name, e.g.\n" -" 0x10,anon comma-separated list, e.g.\n" -"addr-spec:\n" -" N one page at offset N (unit: pages)\n" -" N+M pages range from N to N+M-1\n" -" N,M pages range from N to M-1\n" -" N, pages range from N to end\n" -" ,M pages range from 0 to M-1\n" -"bits-spec:\n" -" bit1,bit2 (flags & (bit1|bit2)) != 0\n" -" bit1,bit2=bit1 (flags & (bit1|bit2)) == bit1\n" -" bit1,~bit2 (flags & (bit1|bit2)) == bit1\n" -" =bit1,bit2 flags == (bit1|bit2)\n" -"bit-names:\n" - ); - - for (i = 0, j = 0; i < ARRAY_SIZE(page_flag_names); i++) { - if (!page_flag_names[i]) - continue; - printf("%16s%s", page_flag_names[i] + 2, - page_flag_type(1ULL << i)); - if (++j > 3) { - j = 0; - putchar('\n'); - } - } - printf("\n " - "(r) raw mode bits (o) overloaded bits\n"); -} - -static unsigned long long parse_number(const char *str) -{ - unsigned long long n; - - n = strtoll(str, NULL, 0); - - if (n == 0 && str[0] != '0') - fatal("invalid name or number: %s\n", str); - - return n; -} - -static void parse_pid(const char *str) -{ - FILE *file; - char buf[5000]; - - opt_pid = parse_number(str); - - sprintf(buf, "/proc/%d/pagemap", opt_pid); - pagemap_fd = checked_open(buf, O_RDONLY); - - sprintf(buf, "/proc/%d/maps", opt_pid); - file = fopen(buf, "r"); - if (!file) { - perror(buf); - exit(EXIT_FAILURE); - } - - while (fgets(buf, sizeof(buf), file) != NULL) { - unsigned long vm_start; - unsigned long vm_end; - unsigned long long pgoff; - int major, minor; - char r, w, x, s; - unsigned long ino; - int n; - - n = sscanf(buf, "%lx-%lx %c%c%c%c %llx %x:%x %lu", - &vm_start, - &vm_end, - &r, &w, &x, &s, - &pgoff, - &major, &minor, - &ino); - if (n < 10) { - fprintf(stderr, "unexpected line: %s\n", buf); - continue; - } - pg_start[nr_vmas] = vm_start / page_size; - pg_end[nr_vmas] = vm_end / page_size; - if (++nr_vmas >= MAX_VMAS) { - fprintf(stderr, "too many VMAs\n"); - break; - } - } - fclose(file); -} - -static void parse_file(const char *name) -{ -} - -static void parse_addr_range(const char *optarg) -{ - unsigned long offset; - unsigned long size; - char *p; - - p = strchr(optarg, ','); - if (!p) - p = strchr(optarg, '+'); - - if (p == optarg) { - offset = 0; - size = parse_number(p + 1); - } else if (p) { - offset = parse_number(optarg); - if (p[1] == '\0') - size = ULONG_MAX; - else { - size = parse_number(p + 1); - if (*p == ',') { - if (size < offset) - fatal("invalid range: %lu,%lu\n", - offset, size); - size -= offset; - } - } - } else { - offset = parse_number(optarg); - size = 1; - } - - add_addr_range(offset, size); -} - -static void add_bits_filter(uint64_t mask, uint64_t bits) -{ - if (nr_bit_filters >= MAX_BIT_FILTERS) - fatal("too much bit filters\n"); - - opt_mask[nr_bit_filters] = mask; - opt_bits[nr_bit_filters] = bits; - nr_bit_filters++; -} - -static uint64_t parse_flag_name(const char *str, int len) -{ - int i; - - if (!*str || !len) - return 0; - - if (len <= 8 && !strncmp(str, "compound", len)) - return BITS_COMPOUND; - - for (i = 0; i < ARRAY_SIZE(page_flag_names); i++) { - if (!page_flag_names[i]) - continue; - if (!strncmp(str, page_flag_names[i] + 2, len)) - return 1ULL << i; - } - - return parse_number(str); -} - -static uint64_t parse_flag_names(const char *str, int all) -{ - const char *p = str; - uint64_t flags = 0; - - while (1) { - if (*p == ',' || *p == '=' || *p == '\0') { - if ((*str != '~') || (*str == '~' && all && *++str)) - flags |= parse_flag_name(str, p - str); - if (*p != ',') - break; - str = p + 1; - } - p++; - } - - return flags; -} - -static void parse_bits_mask(const char *optarg) -{ - uint64_t mask; - uint64_t bits; - const char *p; - - p = strchr(optarg, '='); - if (p == optarg) { - mask = KPF_ALL_BITS; - bits = parse_flag_names(p + 1, 0); - } else if (p) { - mask = parse_flag_names(optarg, 0); - bits = parse_flag_names(p + 1, 0); - } else if (strchr(optarg, '~')) { - mask = parse_flag_names(optarg, 1); - bits = parse_flag_names(optarg, 0); - } else { - mask = parse_flag_names(optarg, 0); - bits = KPF_ALL_BITS; - } - - add_bits_filter(mask, bits); -} - -static void describe_flags(const char *optarg) -{ - uint64_t flags = parse_flag_names(optarg, 0); - - printf("0x%016llx\t%s\t%s\n", - (unsigned long long)flags, - page_flag_name(flags), - page_flag_longname(flags)); -} - -static const struct option opts[] = { - { "raw" , 0, NULL, 'r' }, - { "pid" , 1, NULL, 'p' }, - { "file" , 1, NULL, 'f' }, - { "addr" , 1, NULL, 'a' }, - { "bits" , 1, NULL, 'b' }, - { "describe" , 1, NULL, 'd' }, - { "list" , 0, NULL, 'l' }, - { "list-each" , 0, NULL, 'L' }, - { "no-summary", 0, NULL, 'N' }, - { "hwpoison" , 0, NULL, 'X' }, - { "unpoison" , 0, NULL, 'x' }, - { "help" , 0, NULL, 'h' }, - { NULL , 0, NULL, 0 } -}; - -int main(int argc, char *argv[]) -{ - int c; - - page_size = getpagesize(); - - while ((c = getopt_long(argc, argv, - "rp:f:a:b:d:lLNXxh", opts, NULL)) != -1) { - switch (c) { - case 'r': - opt_raw = 1; - break; - case 'p': - parse_pid(optarg); - break; - case 'f': - parse_file(optarg); - break; - case 'a': - parse_addr_range(optarg); - break; - case 'b': - parse_bits_mask(optarg); - break; - case 'd': - describe_flags(optarg); - exit(0); - case 'l': - opt_list = 1; - break; - case 'L': - opt_list = 2; - break; - case 'N': - opt_no_summary = 1; - break; - case 'X': - opt_hwpoison = 1; - prepare_hwpoison_fd(); - break; - case 'x': - opt_unpoison = 1; - prepare_hwpoison_fd(); - break; - case 'h': - usage(); - exit(0); - default: - usage(); - exit(1); - } - } - - if (opt_list && opt_pid) - printf("voffset\t"); - if (opt_list == 1) - printf("offset\tlen\tflags\n"); - if (opt_list == 2) - printf("offset\tflags\n"); - - walk_addr_ranges(); - - if (opt_list == 1) - show_page_range(0, 0, 0); /* drain the buffer */ - - if (opt_no_summary) - return 0; - - if (opt_list) - printf("\n\n"); - - show_summary(); - - return 0; -} diff --git a/tools/vm/Makefile b/tools/vm/Makefile new file mode 100644 index 000000000000..3823d4b1fa75 --- /dev/null +++ b/tools/vm/Makefile @@ -0,0 +1,11 @@ +# Makefile for vm tools + +CC = $(CROSS_COMPILE)gcc +CFLAGS = -Wall -Wextra + +all: page-types +%: %.c + $(CC) $(CFLAGS) -o $@ $^ + +clean: + $(RM) page-types diff --git a/tools/vm/page-types.c b/tools/vm/page-types.c new file mode 100644 index 000000000000..7dab7b25b5c6 --- /dev/null +++ b/tools/vm/page-types.c @@ -0,0 +1,1102 @@ +/* + * page-types: Tool for querying page flags + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the Free + * Software Foundation; version 2. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should find a copy of v2 of the GNU General Public License somewhere on + * your Linux system; if not, write to the Free Software Foundation, Inc., 59 + * Temple Place, Suite 330, Boston, MA 02111-1307 USA. + * + * Copyright (C) 2009 Intel corporation + * + * Authors: Wu Fengguang + */ + +#define _LARGEFILE64_SOURCE +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "../../include/linux/magic.h" + + +#ifndef MAX_PATH +# define MAX_PATH 256 +#endif + +#ifndef STR +# define _STR(x) #x +# define STR(x) _STR(x) +#endif + +/* + * pagemap kernel ABI bits + */ + +#define PM_ENTRY_BYTES sizeof(uint64_t) +#define PM_STATUS_BITS 3 +#define PM_STATUS_OFFSET (64 - PM_STATUS_BITS) +#define PM_STATUS_MASK (((1LL << PM_STATUS_BITS) - 1) << PM_STATUS_OFFSET) +#define PM_STATUS(nr) (((nr) << PM_STATUS_OFFSET) & PM_STATUS_MASK) +#define PM_PSHIFT_BITS 6 +#define PM_PSHIFT_OFFSET (PM_STATUS_OFFSET - PM_PSHIFT_BITS) +#define PM_PSHIFT_MASK (((1LL << PM_PSHIFT_BITS) - 1) << PM_PSHIFT_OFFSET) +#define PM_PSHIFT(x) (((u64) (x) << PM_PSHIFT_OFFSET) & PM_PSHIFT_MASK) +#define PM_PFRAME_MASK ((1LL << PM_PSHIFT_OFFSET) - 1) +#define PM_PFRAME(x) ((x) & PM_PFRAME_MASK) + +#define PM_PRESENT PM_STATUS(4LL) +#define PM_SWAP PM_STATUS(2LL) + + +/* + * kernel page flags + */ + +#define KPF_BYTES 8 +#define PROC_KPAGEFLAGS "/proc/kpageflags" + +/* copied from kpageflags_read() */ +#define KPF_LOCKED 0 +#define KPF_ERROR 1 +#define KPF_REFERENCED 2 +#define KPF_UPTODATE 3 +#define KPF_DIRTY 4 +#define KPF_LRU 5 +#define KPF_ACTIVE 6 +#define KPF_SLAB 7 +#define KPF_WRITEBACK 8 +#define KPF_RECLAIM 9 +#define KPF_BUDDY 10 + +/* [11-20] new additions in 2.6.31 */ +#define KPF_MMAP 11 +#define KPF_ANON 12 +#define KPF_SWAPCACHE 13 +#define KPF_SWAPBACKED 14 +#define KPF_COMPOUND_HEAD 15 +#define KPF_COMPOUND_TAIL 16 +#define KPF_HUGE 17 +#define KPF_UNEVICTABLE 18 +#define KPF_HWPOISON 19 +#define KPF_NOPAGE 20 +#define KPF_KSM 21 +#define KPF_THP 22 + +/* [32-] kernel hacking assistances */ +#define KPF_RESERVED 32 +#define KPF_MLOCKED 33 +#define KPF_MAPPEDTODISK 34 +#define KPF_PRIVATE 35 +#define KPF_PRIVATE_2 36 +#define KPF_OWNER_PRIVATE 37 +#define KPF_ARCH 38 +#define KPF_UNCACHED 39 + +/* [48-] take some arbitrary free slots for expanding overloaded flags + * not part of kernel API + */ +#define KPF_READAHEAD 48 +#define KPF_SLOB_FREE 49 +#define KPF_SLUB_FROZEN 50 +#define KPF_SLUB_DEBUG 51 + +#define KPF_ALL_BITS ((uint64_t)~0ULL) +#define KPF_HACKERS_BITS (0xffffULL << 32) +#define KPF_OVERLOADED_BITS (0xffffULL << 48) +#define BIT(name) (1ULL << KPF_##name) +#define BITS_COMPOUND (BIT(COMPOUND_HEAD) | BIT(COMPOUND_TAIL)) + +static const char * const page_flag_names[] = { + [KPF_LOCKED] = "L:locked", + [KPF_ERROR] = "E:error", + [KPF_REFERENCED] = "R:referenced", + [KPF_UPTODATE] = "U:uptodate", + [KPF_DIRTY] = "D:dirty", + [KPF_LRU] = "l:lru", + [KPF_ACTIVE] = "A:active", + [KPF_SLAB] = "S:slab", + [KPF_WRITEBACK] = "W:writeback", + [KPF_RECLAIM] = "I:reclaim", + [KPF_BUDDY] = "B:buddy", + + [KPF_MMAP] = "M:mmap", + [KPF_ANON] = "a:anonymous", + [KPF_SWAPCACHE] = "s:swapcache", + [KPF_SWAPBACKED] = "b:swapbacked", + [KPF_COMPOUND_HEAD] = "H:compound_head", + [KPF_COMPOUND_TAIL] = "T:compound_tail", + [KPF_HUGE] = "G:huge", + [KPF_UNEVICTABLE] = "u:unevictable", + [KPF_HWPOISON] = "X:hwpoison", + [KPF_NOPAGE] = "n:nopage", + [KPF_KSM] = "x:ksm", + [KPF_THP] = "t:thp", + + [KPF_RESERVED] = "r:reserved", + [KPF_MLOCKED] = "m:mlocked", + [KPF_MAPPEDTODISK] = "d:mappedtodisk", + [KPF_PRIVATE] = "P:private", + [KPF_PRIVATE_2] = "p:private_2", + [KPF_OWNER_PRIVATE] = "O:owner_private", + [KPF_ARCH] = "h:arch", + [KPF_UNCACHED] = "c:uncached", + + [KPF_READAHEAD] = "I:readahead", + [KPF_SLOB_FREE] = "P:slob_free", + [KPF_SLUB_FROZEN] = "A:slub_frozen", + [KPF_SLUB_DEBUG] = "E:slub_debug", +}; + + +static const char * const debugfs_known_mountpoints[] = { + "/sys/kernel/debug", + "/debug", + 0, +}; + +/* + * data structures + */ + +static int opt_raw; /* for kernel developers */ +static int opt_list; /* list pages (in ranges) */ +static int opt_no_summary; /* don't show summary */ +static pid_t opt_pid; /* process to walk */ + +#define MAX_ADDR_RANGES 1024 +static int nr_addr_ranges; +static unsigned long opt_offset[MAX_ADDR_RANGES]; +static unsigned long opt_size[MAX_ADDR_RANGES]; + +#define MAX_VMAS 10240 +static int nr_vmas; +static unsigned long pg_start[MAX_VMAS]; +static unsigned long pg_end[MAX_VMAS]; + +#define MAX_BIT_FILTERS 64 +static int nr_bit_filters; +static uint64_t opt_mask[MAX_BIT_FILTERS]; +static uint64_t opt_bits[MAX_BIT_FILTERS]; + +static int page_size; + +static int pagemap_fd; +static int kpageflags_fd; + +static int opt_hwpoison; +static int opt_unpoison; + +static char hwpoison_debug_fs[MAX_PATH+1]; +static int hwpoison_inject_fd; +static int hwpoison_forget_fd; + +#define HASH_SHIFT 13 +#define HASH_SIZE (1 << HASH_SHIFT) +#define HASH_MASK (HASH_SIZE - 1) +#define HASH_KEY(flags) (flags & HASH_MASK) + +static unsigned long total_pages; +static unsigned long nr_pages[HASH_SIZE]; +static uint64_t page_flags[HASH_SIZE]; + + +/* + * helper functions + */ + +#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) + +#define min_t(type, x, y) ({ \ + type __min1 = (x); \ + type __min2 = (y); \ + __min1 < __min2 ? __min1 : __min2; }) + +#define max_t(type, x, y) ({ \ + type __max1 = (x); \ + type __max2 = (y); \ + __max1 > __max2 ? __max1 : __max2; }) + +static unsigned long pages2mb(unsigned long pages) +{ + return (pages * page_size) >> 20; +} + +static void fatal(const char *x, ...) +{ + va_list ap; + + va_start(ap, x); + vfprintf(stderr, x, ap); + va_end(ap); + exit(EXIT_FAILURE); +} + +static int checked_open(const char *pathname, int flags) +{ + int fd = open(pathname, flags); + + if (fd < 0) { + perror(pathname); + exit(EXIT_FAILURE); + } + + return fd; +} + +/* + * pagemap/kpageflags routines + */ + +static unsigned long do_u64_read(int fd, char *name, + uint64_t *buf, + unsigned long index, + unsigned long count) +{ + long bytes; + + if (index > ULONG_MAX / 8) + fatal("index overflow: %lu\n", index); + + if (lseek(fd, index * 8, SEEK_SET) < 0) { + perror(name); + exit(EXIT_FAILURE); + } + + bytes = read(fd, buf, count * 8); + if (bytes < 0) { + perror(name); + exit(EXIT_FAILURE); + } + if (bytes % 8) + fatal("partial read: %lu bytes\n", bytes); + + return bytes / 8; +} + +static unsigned long kpageflags_read(uint64_t *buf, + unsigned long index, + unsigned long pages) +{ + return do_u64_read(kpageflags_fd, PROC_KPAGEFLAGS, buf, index, pages); +} + +static unsigned long pagemap_read(uint64_t *buf, + unsigned long index, + unsigned long pages) +{ + return do_u64_read(pagemap_fd, "/proc/pid/pagemap", buf, index, pages); +} + +static unsigned long pagemap_pfn(uint64_t val) +{ + unsigned long pfn; + + if (val & PM_PRESENT) + pfn = PM_PFRAME(val); + else + pfn = 0; + + return pfn; +} + + +/* + * page flag names + */ + +static char *page_flag_name(uint64_t flags) +{ + static char buf[65]; + int present; + int i, j; + + for (i = 0, j = 0; i < ARRAY_SIZE(page_flag_names); i++) { + present = (flags >> i) & 1; + if (!page_flag_names[i]) { + if (present) + fatal("unknown flag bit %d\n", i); + continue; + } + buf[j++] = present ? page_flag_names[i][0] : '_'; + } + + return buf; +} + +static char *page_flag_longname(uint64_t flags) +{ + static char buf[1024]; + int i, n; + + for (i = 0, n = 0; i < ARRAY_SIZE(page_flag_names); i++) { + if (!page_flag_names[i]) + continue; + if ((flags >> i) & 1) + n += snprintf(buf + n, sizeof(buf) - n, "%s,", + page_flag_names[i] + 2); + } + if (n) + n--; + buf[n] = '\0'; + + return buf; +} + + +/* + * page list and summary + */ + +static void show_page_range(unsigned long voffset, + unsigned long offset, uint64_t flags) +{ + static uint64_t flags0; + static unsigned long voff; + static unsigned long index; + static unsigned long count; + + if (flags == flags0 && offset == index + count && + (!opt_pid || voffset == voff + count)) { + count++; + return; + } + + if (count) { + if (opt_pid) + printf("%lx\t", voff); + printf("%lx\t%lx\t%s\n", + index, count, page_flag_name(flags0)); + } + + flags0 = flags; + index = offset; + voff = voffset; + count = 1; +} + +static void show_page(unsigned long voffset, + unsigned long offset, uint64_t flags) +{ + if (opt_pid) + printf("%lx\t", voffset); + printf("%lx\t%s\n", offset, page_flag_name(flags)); +} + +static void show_summary(void) +{ + int i; + + printf(" flags\tpage-count MB" + " symbolic-flags\t\t\tlong-symbolic-flags\n"); + + for (i = 0; i < ARRAY_SIZE(nr_pages); i++) { + if (nr_pages[i]) + printf("0x%016llx\t%10lu %8lu %s\t%s\n", + (unsigned long long)page_flags[i], + nr_pages[i], + pages2mb(nr_pages[i]), + page_flag_name(page_flags[i]), + page_flag_longname(page_flags[i])); + } + + printf(" total\t%10lu %8lu\n", + total_pages, pages2mb(total_pages)); +} + + +/* + * page flag filters + */ + +static int bit_mask_ok(uint64_t flags) +{ + int i; + + for (i = 0; i < nr_bit_filters; i++) { + if (opt_bits[i] == KPF_ALL_BITS) { + if ((flags & opt_mask[i]) == 0) + return 0; + } else { + if ((flags & opt_mask[i]) != opt_bits[i]) + return 0; + } + } + + return 1; +} + +static uint64_t expand_overloaded_flags(uint64_t flags) +{ + /* SLOB/SLUB overload several page flags */ + if (flags & BIT(SLAB)) { + if (flags & BIT(PRIVATE)) + flags ^= BIT(PRIVATE) | BIT(SLOB_FREE); + if (flags & BIT(ACTIVE)) + flags ^= BIT(ACTIVE) | BIT(SLUB_FROZEN); + if (flags & BIT(ERROR)) + flags ^= BIT(ERROR) | BIT(SLUB_DEBUG); + } + + /* PG_reclaim is overloaded as PG_readahead in the read path */ + if ((flags & (BIT(RECLAIM) | BIT(WRITEBACK))) == BIT(RECLAIM)) + flags ^= BIT(RECLAIM) | BIT(READAHEAD); + + return flags; +} + +static uint64_t well_known_flags(uint64_t flags) +{ + /* hide flags intended only for kernel hacker */ + flags &= ~KPF_HACKERS_BITS; + + /* hide non-hugeTLB compound pages */ + if ((flags & BITS_COMPOUND) && !(flags & BIT(HUGE))) + flags &= ~BITS_COMPOUND; + + return flags; +} + +static uint64_t kpageflags_flags(uint64_t flags) +{ + flags = expand_overloaded_flags(flags); + + if (!opt_raw) + flags = well_known_flags(flags); + + return flags; +} + +/* verify that a mountpoint is actually a debugfs instance */ +static int debugfs_valid_mountpoint(const char *debugfs) +{ + struct statfs st_fs; + + if (statfs(debugfs, &st_fs) < 0) + return -ENOENT; + else if (st_fs.f_type != (long) DEBUGFS_MAGIC) + return -ENOENT; + + return 0; +} + +/* find the path to the mounted debugfs */ +static const char *debugfs_find_mountpoint(void) +{ + const char **ptr; + char type[100]; + FILE *fp; + + ptr = debugfs_known_mountpoints; + while (*ptr) { + if (debugfs_valid_mountpoint(*ptr) == 0) { + strcpy(hwpoison_debug_fs, *ptr); + return hwpoison_debug_fs; + } + ptr++; + } + + /* give up and parse /proc/mounts */ + fp = fopen("/proc/mounts", "r"); + if (fp == NULL) + perror("Can't open /proc/mounts for read"); + + while (fscanf(fp, "%*s %" + STR(MAX_PATH) + "s %99s %*s %*d %*d\n", + hwpoison_debug_fs, type) == 2) { + if (strcmp(type, "debugfs") == 0) + break; + } + fclose(fp); + + if (strcmp(type, "debugfs") != 0) + return NULL; + + return hwpoison_debug_fs; +} + +/* mount the debugfs somewhere if it's not mounted */ + +static void debugfs_mount(void) +{ + const char **ptr; + + /* see if it's already mounted */ + if (debugfs_find_mountpoint()) + return; + + ptr = debugfs_known_mountpoints; + while (*ptr) { + if (mount(NULL, *ptr, "debugfs", 0, NULL) == 0) { + /* save the mountpoint */ + strcpy(hwpoison_debug_fs, *ptr); + break; + } + ptr++; + } + + if (*ptr == NULL) { + perror("mount debugfs"); + exit(EXIT_FAILURE); + } +} + +/* + * page actions + */ + +static void prepare_hwpoison_fd(void) +{ + char buf[MAX_PATH + 1]; + + debugfs_mount(); + + if (opt_hwpoison && !hwpoison_inject_fd) { + snprintf(buf, MAX_PATH, "%s/hwpoison/corrupt-pfn", + hwpoison_debug_fs); + hwpoison_inject_fd = checked_open(buf, O_WRONLY); + } + + if (opt_unpoison && !hwpoison_forget_fd) { + snprintf(buf, MAX_PATH, "%s/hwpoison/unpoison-pfn", + hwpoison_debug_fs); + hwpoison_forget_fd = checked_open(buf, O_WRONLY); + } +} + +static int hwpoison_page(unsigned long offset) +{ + char buf[100]; + int len; + + len = sprintf(buf, "0x%lx\n", offset); + len = write(hwpoison_inject_fd, buf, len); + if (len < 0) { + perror("hwpoison inject"); + return len; + } + return 0; +} + +static int unpoison_page(unsigned long offset) +{ + char buf[100]; + int len; + + len = sprintf(buf, "0x%lx\n", offset); + len = write(hwpoison_forget_fd, buf, len); + if (len < 0) { + perror("hwpoison forget"); + return len; + } + return 0; +} + +/* + * page frame walker + */ + +static int hash_slot(uint64_t flags) +{ + int k = HASH_KEY(flags); + int i; + + /* Explicitly reserve slot 0 for flags 0: the following logic + * cannot distinguish an unoccupied slot from slot (flags==0). + */ + if (flags == 0) + return 0; + + /* search through the remaining (HASH_SIZE-1) slots */ + for (i = 1; i < ARRAY_SIZE(page_flags); i++, k++) { + if (!k || k >= ARRAY_SIZE(page_flags)) + k = 1; + if (page_flags[k] == 0) { + page_flags[k] = flags; + return k; + } + if (page_flags[k] == flags) + return k; + } + + fatal("hash table full: bump up HASH_SHIFT?\n"); + exit(EXIT_FAILURE); +} + +static void add_page(unsigned long voffset, + unsigned long offset, uint64_t flags) +{ + flags = kpageflags_flags(flags); + + if (!bit_mask_ok(flags)) + return; + + if (opt_hwpoison) + hwpoison_page(offset); + if (opt_unpoison) + unpoison_page(offset); + + if (opt_list == 1) + show_page_range(voffset, offset, flags); + else if (opt_list == 2) + show_page(voffset, offset, flags); + + nr_pages[hash_slot(flags)]++; + total_pages++; +} + +#define KPAGEFLAGS_BATCH (64 << 10) /* 64k pages */ +static void walk_pfn(unsigned long voffset, + unsigned long index, + unsigned long count) +{ + uint64_t buf[KPAGEFLAGS_BATCH]; + unsigned long batch; + long pages; + unsigned long i; + + while (count) { + batch = min_t(unsigned long, count, KPAGEFLAGS_BATCH); + pages = kpageflags_read(buf, index, batch); + if (pages == 0) + break; + + for (i = 0; i < pages; i++) + add_page(voffset + i, index + i, buf[i]); + + index += pages; + count -= pages; + } +} + +#define PAGEMAP_BATCH (64 << 10) +static void walk_vma(unsigned long index, unsigned long count) +{ + uint64_t buf[PAGEMAP_BATCH]; + unsigned long batch; + unsigned long pages; + unsigned long pfn; + unsigned long i; + + while (count) { + batch = min_t(unsigned long, count, PAGEMAP_BATCH); + pages = pagemap_read(buf, index, batch); + if (pages == 0) + break; + + for (i = 0; i < pages; i++) { + pfn = pagemap_pfn(buf[i]); + if (pfn) + walk_pfn(index + i, pfn, 1); + } + + index += pages; + count -= pages; + } +} + +static void walk_task(unsigned long index, unsigned long count) +{ + const unsigned long end = index + count; + unsigned long start; + int i = 0; + + while (index < end) { + + while (pg_end[i] <= index) + if (++i >= nr_vmas) + return; + if (pg_start[i] >= end) + return; + + start = max_t(unsigned long, pg_start[i], index); + index = min_t(unsigned long, pg_end[i], end); + + assert(start < index); + walk_vma(start, index - start); + } +} + +static void add_addr_range(unsigned long offset, unsigned long size) +{ + if (nr_addr_ranges >= MAX_ADDR_RANGES) + fatal("too many addr ranges\n"); + + opt_offset[nr_addr_ranges] = offset; + opt_size[nr_addr_ranges] = min_t(unsigned long, size, ULONG_MAX-offset); + nr_addr_ranges++; +} + +static void walk_addr_ranges(void) +{ + int i; + + kpageflags_fd = checked_open(PROC_KPAGEFLAGS, O_RDONLY); + + if (!nr_addr_ranges) + add_addr_range(0, ULONG_MAX); + + for (i = 0; i < nr_addr_ranges; i++) + if (!opt_pid) + walk_pfn(0, opt_offset[i], opt_size[i]); + else + walk_task(opt_offset[i], opt_size[i]); + + close(kpageflags_fd); +} + + +/* + * user interface + */ + +static const char *page_flag_type(uint64_t flag) +{ + if (flag & KPF_HACKERS_BITS) + return "(r)"; + if (flag & KPF_OVERLOADED_BITS) + return "(o)"; + return " "; +} + +static void usage(void) +{ + int i, j; + + printf( +"page-types [options]\n" +" -r|--raw Raw mode, for kernel developers\n" +" -d|--describe flags Describe flags\n" +" -a|--addr addr-spec Walk a range of pages\n" +" -b|--bits bits-spec Walk pages with specified bits\n" +" -p|--pid pid Walk process address space\n" +#if 0 /* planned features */ +" -f|--file filename Walk file address space\n" +#endif +" -l|--list Show page details in ranges\n" +" -L|--list-each Show page details one by one\n" +" -N|--no-summary Don't show summary info\n" +" -X|--hwpoison hwpoison pages\n" +" -x|--unpoison unpoison pages\n" +" -h|--help Show this usage message\n" +"flags:\n" +" 0x10 bitfield format, e.g.\n" +" anon bit-name, e.g.\n" +" 0x10,anon comma-separated list, e.g.\n" +"addr-spec:\n" +" N one page at offset N (unit: pages)\n" +" N+M pages range from N to N+M-1\n" +" N,M pages range from N to M-1\n" +" N, pages range from N to end\n" +" ,M pages range from 0 to M-1\n" +"bits-spec:\n" +" bit1,bit2 (flags & (bit1|bit2)) != 0\n" +" bit1,bit2=bit1 (flags & (bit1|bit2)) == bit1\n" +" bit1,~bit2 (flags & (bit1|bit2)) == bit1\n" +" =bit1,bit2 flags == (bit1|bit2)\n" +"bit-names:\n" + ); + + for (i = 0, j = 0; i < ARRAY_SIZE(page_flag_names); i++) { + if (!page_flag_names[i]) + continue; + printf("%16s%s", page_flag_names[i] + 2, + page_flag_type(1ULL << i)); + if (++j > 3) { + j = 0; + putchar('\n'); + } + } + printf("\n " + "(r) raw mode bits (o) overloaded bits\n"); +} + +static unsigned long long parse_number(const char *str) +{ + unsigned long long n; + + n = strtoll(str, NULL, 0); + + if (n == 0 && str[0] != '0') + fatal("invalid name or number: %s\n", str); + + return n; +} + +static void parse_pid(const char *str) +{ + FILE *file; + char buf[5000]; + + opt_pid = parse_number(str); + + sprintf(buf, "/proc/%d/pagemap", opt_pid); + pagemap_fd = checked_open(buf, O_RDONLY); + + sprintf(buf, "/proc/%d/maps", opt_pid); + file = fopen(buf, "r"); + if (!file) { + perror(buf); + exit(EXIT_FAILURE); + } + + while (fgets(buf, sizeof(buf), file) != NULL) { + unsigned long vm_start; + unsigned long vm_end; + unsigned long long pgoff; + int major, minor; + char r, w, x, s; + unsigned long ino; + int n; + + n = sscanf(buf, "%lx-%lx %c%c%c%c %llx %x:%x %lu", + &vm_start, + &vm_end, + &r, &w, &x, &s, + &pgoff, + &major, &minor, + &ino); + if (n < 10) { + fprintf(stderr, "unexpected line: %s\n", buf); + continue; + } + pg_start[nr_vmas] = vm_start / page_size; + pg_end[nr_vmas] = vm_end / page_size; + if (++nr_vmas >= MAX_VMAS) { + fprintf(stderr, "too many VMAs\n"); + break; + } + } + fclose(file); +} + +static void parse_file(const char *name) +{ +} + +static void parse_addr_range(const char *optarg) +{ + unsigned long offset; + unsigned long size; + char *p; + + p = strchr(optarg, ','); + if (!p) + p = strchr(optarg, '+'); + + if (p == optarg) { + offset = 0; + size = parse_number(p + 1); + } else if (p) { + offset = parse_number(optarg); + if (p[1] == '\0') + size = ULONG_MAX; + else { + size = parse_number(p + 1); + if (*p == ',') { + if (size < offset) + fatal("invalid range: %lu,%lu\n", + offset, size); + size -= offset; + } + } + } else { + offset = parse_number(optarg); + size = 1; + } + + add_addr_range(offset, size); +} + +static void add_bits_filter(uint64_t mask, uint64_t bits) +{ + if (nr_bit_filters >= MAX_BIT_FILTERS) + fatal("too much bit filters\n"); + + opt_mask[nr_bit_filters] = mask; + opt_bits[nr_bit_filters] = bits; + nr_bit_filters++; +} + +static uint64_t parse_flag_name(const char *str, int len) +{ + int i; + + if (!*str || !len) + return 0; + + if (len <= 8 && !strncmp(str, "compound", len)) + return BITS_COMPOUND; + + for (i = 0; i < ARRAY_SIZE(page_flag_names); i++) { + if (!page_flag_names[i]) + continue; + if (!strncmp(str, page_flag_names[i] + 2, len)) + return 1ULL << i; + } + + return parse_number(str); +} + +static uint64_t parse_flag_names(const char *str, int all) +{ + const char *p = str; + uint64_t flags = 0; + + while (1) { + if (*p == ',' || *p == '=' || *p == '\0') { + if ((*str != '~') || (*str == '~' && all && *++str)) + flags |= parse_flag_name(str, p - str); + if (*p != ',') + break; + str = p + 1; + } + p++; + } + + return flags; +} + +static void parse_bits_mask(const char *optarg) +{ + uint64_t mask; + uint64_t bits; + const char *p; + + p = strchr(optarg, '='); + if (p == optarg) { + mask = KPF_ALL_BITS; + bits = parse_flag_names(p + 1, 0); + } else if (p) { + mask = parse_flag_names(optarg, 0); + bits = parse_flag_names(p + 1, 0); + } else if (strchr(optarg, '~')) { + mask = parse_flag_names(optarg, 1); + bits = parse_flag_names(optarg, 0); + } else { + mask = parse_flag_names(optarg, 0); + bits = KPF_ALL_BITS; + } + + add_bits_filter(mask, bits); +} + +static void describe_flags(const char *optarg) +{ + uint64_t flags = parse_flag_names(optarg, 0); + + printf("0x%016llx\t%s\t%s\n", + (unsigned long long)flags, + page_flag_name(flags), + page_flag_longname(flags)); +} + +static const struct option opts[] = { + { "raw" , 0, NULL, 'r' }, + { "pid" , 1, NULL, 'p' }, + { "file" , 1, NULL, 'f' }, + { "addr" , 1, NULL, 'a' }, + { "bits" , 1, NULL, 'b' }, + { "describe" , 1, NULL, 'd' }, + { "list" , 0, NULL, 'l' }, + { "list-each" , 0, NULL, 'L' }, + { "no-summary", 0, NULL, 'N' }, + { "hwpoison" , 0, NULL, 'X' }, + { "unpoison" , 0, NULL, 'x' }, + { "help" , 0, NULL, 'h' }, + { NULL , 0, NULL, 0 } +}; + +int main(int argc, char *argv[]) +{ + int c; + + page_size = getpagesize(); + + while ((c = getopt_long(argc, argv, + "rp:f:a:b:d:lLNXxh", opts, NULL)) != -1) { + switch (c) { + case 'r': + opt_raw = 1; + break; + case 'p': + parse_pid(optarg); + break; + case 'f': + parse_file(optarg); + break; + case 'a': + parse_addr_range(optarg); + break; + case 'b': + parse_bits_mask(optarg); + break; + case 'd': + describe_flags(optarg); + exit(0); + case 'l': + opt_list = 1; + break; + case 'L': + opt_list = 2; + break; + case 'N': + opt_no_summary = 1; + break; + case 'X': + opt_hwpoison = 1; + prepare_hwpoison_fd(); + break; + case 'x': + opt_unpoison = 1; + prepare_hwpoison_fd(); + break; + case 'h': + usage(); + exit(0); + default: + usage(); + exit(1); + } + } + + if (opt_list && opt_pid) + printf("voffset\t"); + if (opt_list == 1) + printf("offset\tlen\tflags\n"); + if (opt_list == 2) + printf("offset\tflags\n"); + + walk_addr_ranges(); + + if (opt_list == 1) + show_page_range(0, 0, 0); /* drain the buffer */ + + if (opt_no_summary) + return 0; + + if (opt_list) + printf("\n\n"); + + show_summary(); + + return 0; +} -- cgit v1.2.3 From f0f57b2b1488251970c25deea0ea150a8d0911ed Mon Sep 17 00:00:00 2001 From: Dave Young Date: Wed, 28 Mar 2012 14:42:56 -0700 Subject: mm: move hugepage test examples to tools/testing/selftests/vm hugepage-mmap.c, hugepage-shm.c and map_hugetlb.c in Documentation/vm are simple pass/fail tests, It's better to promote them to tools/testing/selftests. Thanks suggestion of Andrew Morton about this. They all need firstly setting up proper nr_hugepages and hugepage-mmap need to mount hugetlbfs. So I add a shell script run_vmtests to do such work which will call the three test programs and check the return value of them. Changes to original code including below: a. add run_vmtests script b. return error when read_bytes mismatch with writed bytes. c. coding style fixes: do not use assignment in if condition [akpm@linux-foundation.org: build the targets before trying to execute them] [akpm@linux-foundation.org: Documentation/vm/ no longer has a Makefile. Fixes "make clean"] Signed-off-by: Dave Young Cc: Wu Fengguang Cc: Christoph Lameter Cc: Pekka Enberg Cc: Frederic Weisbecker Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Documentation/Makefile | 2 +- Documentation/vm/Makefile | 8 --- Documentation/vm/hugepage-mmap.c | 91 -------------------------- Documentation/vm/hugepage-shm.c | 98 ---------------------------- Documentation/vm/map_hugetlb.c | 77 ---------------------- tools/testing/selftests/Makefile | 2 +- tools/testing/selftests/vm/Makefile | 14 ++++ tools/testing/selftests/vm/hugepage-mmap.c | 92 ++++++++++++++++++++++++++ tools/testing/selftests/vm/hugepage-shm.c | 100 +++++++++++++++++++++++++++++ tools/testing/selftests/vm/map_hugetlb.c | 79 +++++++++++++++++++++++ tools/testing/selftests/vm/run_vmtests | 77 ++++++++++++++++++++++ 11 files changed, 364 insertions(+), 276 deletions(-) delete mode 100644 Documentation/vm/Makefile delete mode 100644 Documentation/vm/hugepage-mmap.c delete mode 100644 Documentation/vm/hugepage-shm.c delete mode 100644 Documentation/vm/map_hugetlb.c create mode 100644 tools/testing/selftests/vm/Makefile create mode 100644 tools/testing/selftests/vm/hugepage-mmap.c create mode 100644 tools/testing/selftests/vm/hugepage-shm.c create mode 100644 tools/testing/selftests/vm/map_hugetlb.c create mode 100644 tools/testing/selftests/vm/run_vmtests (limited to 'Documentation') diff --git a/Documentation/Makefile b/Documentation/Makefile index 9b4bc5c76f33..30b656ece7aa 100644 --- a/Documentation/Makefile +++ b/Documentation/Makefile @@ -1,3 +1,3 @@ obj-m := DocBook/ accounting/ auxdisplay/ connector/ \ filesystems/ filesystems/configfs/ ia64/ laptops/ networking/ \ - pcmcia/ spi/ timers/ vm/ watchdog/src/ + pcmcia/ spi/ timers/ watchdog/src/ diff --git a/Documentation/vm/Makefile b/Documentation/vm/Makefile deleted file mode 100644 index e538864bfc63..000000000000 --- a/Documentation/vm/Makefile +++ /dev/null @@ -1,8 +0,0 @@ -# kbuild trick to avoid linker error. Can be omitted if a module is built. -obj- := dummy.o - -# List of programs to build -hostprogs-y := hugepage-mmap hugepage-shm map_hugetlb - -# Tell kbuild to always build the programs -always := $(hostprogs-y) diff --git a/Documentation/vm/hugepage-mmap.c b/Documentation/vm/hugepage-mmap.c deleted file mode 100644 index db0dd9a33d54..000000000000 --- a/Documentation/vm/hugepage-mmap.c +++ /dev/null @@ -1,91 +0,0 @@ -/* - * hugepage-mmap: - * - * Example of using huge page memory in a user application using the mmap - * system call. Before running this application, make sure that the - * administrator has mounted the hugetlbfs filesystem (on some directory - * like /mnt) using the command mount -t hugetlbfs nodev /mnt. In this - * example, the app is requesting memory of size 256MB that is backed by - * huge pages. - * - * For the ia64 architecture, the Linux kernel reserves Region number 4 for - * huge pages. That means that if one requires a fixed address, a huge page - * aligned address starting with 0x800000... will be required. If a fixed - * address is not required, the kernel will select an address in the proper - * range. - * Other architectures, such as ppc64, i386 or x86_64 are not so constrained. - */ - -#include -#include -#include -#include -#include - -#define FILE_NAME "/mnt/hugepagefile" -#define LENGTH (256UL*1024*1024) -#define PROTECTION (PROT_READ | PROT_WRITE) - -/* Only ia64 requires this */ -#ifdef __ia64__ -#define ADDR (void *)(0x8000000000000000UL) -#define FLAGS (MAP_SHARED | MAP_FIXED) -#else -#define ADDR (void *)(0x0UL) -#define FLAGS (MAP_SHARED) -#endif - -static void check_bytes(char *addr) -{ - printf("First hex is %x\n", *((unsigned int *)addr)); -} - -static void write_bytes(char *addr) -{ - unsigned long i; - - for (i = 0; i < LENGTH; i++) - *(addr + i) = (char)i; -} - -static void read_bytes(char *addr) -{ - unsigned long i; - - check_bytes(addr); - for (i = 0; i < LENGTH; i++) - if (*(addr + i) != (char)i) { - printf("Mismatch at %lu\n", i); - break; - } -} - -int main(void) -{ - void *addr; - int fd; - - fd = open(FILE_NAME, O_CREAT | O_RDWR, 0755); - if (fd < 0) { - perror("Open failed"); - exit(1); - } - - addr = mmap(ADDR, LENGTH, PROTECTION, FLAGS, fd, 0); - if (addr == MAP_FAILED) { - perror("mmap"); - unlink(FILE_NAME); - exit(1); - } - - printf("Returned address is %p\n", addr); - check_bytes(addr); - write_bytes(addr); - read_bytes(addr); - - munmap(addr, LENGTH); - close(fd); - unlink(FILE_NAME); - - return 0; -} diff --git a/Documentation/vm/hugepage-shm.c b/Documentation/vm/hugepage-shm.c deleted file mode 100644 index 07956d8592c9..000000000000 --- a/Documentation/vm/hugepage-shm.c +++ /dev/null @@ -1,98 +0,0 @@ -/* - * hugepage-shm: - * - * Example of using huge page memory in a user application using Sys V shared - * memory system calls. In this example the app is requesting 256MB of - * memory that is backed by huge pages. The application uses the flag - * SHM_HUGETLB in the shmget system call to inform the kernel that it is - * requesting huge pages. - * - * For the ia64 architecture, the Linux kernel reserves Region number 4 for - * huge pages. That means that if one requires a fixed address, a huge page - * aligned address starting with 0x800000... will be required. If a fixed - * address is not required, the kernel will select an address in the proper - * range. - * Other architectures, such as ppc64, i386 or x86_64 are not so constrained. - * - * Note: The default shared memory limit is quite low on many kernels, - * you may need to increase it via: - * - * echo 268435456 > /proc/sys/kernel/shmmax - * - * This will increase the maximum size per shared memory segment to 256MB. - * The other limit that you will hit eventually is shmall which is the - * total amount of shared memory in pages. To set it to 16GB on a system - * with a 4kB pagesize do: - * - * echo 4194304 > /proc/sys/kernel/shmall - */ - -#include -#include -#include -#include -#include -#include - -#ifndef SHM_HUGETLB -#define SHM_HUGETLB 04000 -#endif - -#define LENGTH (256UL*1024*1024) - -#define dprintf(x) printf(x) - -/* Only ia64 requires this */ -#ifdef __ia64__ -#define ADDR (void *)(0x8000000000000000UL) -#define SHMAT_FLAGS (SHM_RND) -#else -#define ADDR (void *)(0x0UL) -#define SHMAT_FLAGS (0) -#endif - -int main(void) -{ - int shmid; - unsigned long i; - char *shmaddr; - - if ((shmid = shmget(2, LENGTH, - SHM_HUGETLB | IPC_CREAT | SHM_R | SHM_W)) < 0) { - perror("shmget"); - exit(1); - } - printf("shmid: 0x%x\n", shmid); - - shmaddr = shmat(shmid, ADDR, SHMAT_FLAGS); - if (shmaddr == (char *)-1) { - perror("Shared memory attach failure"); - shmctl(shmid, IPC_RMID, NULL); - exit(2); - } - printf("shmaddr: %p\n", shmaddr); - - dprintf("Starting the writes:\n"); - for (i = 0; i < LENGTH; i++) { - shmaddr[i] = (char)(i); - if (!(i % (1024 * 1024))) - dprintf("."); - } - dprintf("\n"); - - dprintf("Starting the Check..."); - for (i = 0; i < LENGTH; i++) - if (shmaddr[i] != (char)i) - printf("\nIndex %lu mismatched\n", i); - dprintf("Done.\n"); - - if (shmdt((const void *)shmaddr) != 0) { - perror("Detach failure"); - shmctl(shmid, IPC_RMID, NULL); - exit(3); - } - - shmctl(shmid, IPC_RMID, NULL); - - return 0; -} diff --git a/Documentation/vm/map_hugetlb.c b/Documentation/vm/map_hugetlb.c deleted file mode 100644 index eda1a6d3578a..000000000000 --- a/Documentation/vm/map_hugetlb.c +++ /dev/null @@ -1,77 +0,0 @@ -/* - * Example of using hugepage memory in a user application using the mmap - * system call with MAP_HUGETLB flag. Before running this program make - * sure the administrator has allocated enough default sized huge pages - * to cover the 256 MB allocation. - * - * For ia64 architecture, Linux kernel reserves Region number 4 for hugepages. - * That means the addresses starting with 0x800000... will need to be - * specified. Specifying a fixed address is not required on ppc64, i386 - * or x86_64. - */ -#include -#include -#include -#include -#include - -#define LENGTH (256UL*1024*1024) -#define PROTECTION (PROT_READ | PROT_WRITE) - -#ifndef MAP_HUGETLB -#define MAP_HUGETLB 0x40000 /* arch specific */ -#endif - -/* Only ia64 requires this */ -#ifdef __ia64__ -#define ADDR (void *)(0x8000000000000000UL) -#define FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_FIXED) -#else -#define ADDR (void *)(0x0UL) -#define FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB) -#endif - -static void check_bytes(char *addr) -{ - printf("First hex is %x\n", *((unsigned int *)addr)); -} - -static void write_bytes(char *addr) -{ - unsigned long i; - - for (i = 0; i < LENGTH; i++) - *(addr + i) = (char)i; -} - -static void read_bytes(char *addr) -{ - unsigned long i; - - check_bytes(addr); - for (i = 0; i < LENGTH; i++) - if (*(addr + i) != (char)i) { - printf("Mismatch at %lu\n", i); - break; - } -} - -int main(void) -{ - void *addr; - - addr = mmap(ADDR, LENGTH, PROTECTION, FLAGS, 0, 0); - if (addr == MAP_FAILED) { - perror("mmap"); - exit(1); - } - - printf("Returned address is %p\n", addr); - check_bytes(addr); - write_bytes(addr); - read_bytes(addr); - - munmap(addr, LENGTH); - - return 0; -} diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile index 9203cd77fc33..28bc57ee757c 100644 --- a/tools/testing/selftests/Makefile +++ b/tools/testing/selftests/Makefile @@ -1,4 +1,4 @@ -TARGETS = breakpoints +TARGETS = breakpoints vm all: for TARGET in $(TARGETS); do \ diff --git a/tools/testing/selftests/vm/Makefile b/tools/testing/selftests/vm/Makefile new file mode 100644 index 000000000000..b336b24aa6c0 --- /dev/null +++ b/tools/testing/selftests/vm/Makefile @@ -0,0 +1,14 @@ +# Makefile for vm selftests + +CC = $(CROSS_COMPILE)gcc +CFLAGS = -Wall -Wextra + +all: hugepage-mmap hugepage-shm map_hugetlb +%: %.c + $(CC) $(CFLAGS) -o $@ $^ + +run_tests: all + /bin/sh ./run_vmtests + +clean: + $(RM) hugepage-mmap hugepage-shm map_hugetlb diff --git a/tools/testing/selftests/vm/hugepage-mmap.c b/tools/testing/selftests/vm/hugepage-mmap.c new file mode 100644 index 000000000000..a10f310d2362 --- /dev/null +++ b/tools/testing/selftests/vm/hugepage-mmap.c @@ -0,0 +1,92 @@ +/* + * hugepage-mmap: + * + * Example of using huge page memory in a user application using the mmap + * system call. Before running this application, make sure that the + * administrator has mounted the hugetlbfs filesystem (on some directory + * like /mnt) using the command mount -t hugetlbfs nodev /mnt. In this + * example, the app is requesting memory of size 256MB that is backed by + * huge pages. + * + * For the ia64 architecture, the Linux kernel reserves Region number 4 for + * huge pages. That means that if one requires a fixed address, a huge page + * aligned address starting with 0x800000... will be required. If a fixed + * address is not required, the kernel will select an address in the proper + * range. + * Other architectures, such as ppc64, i386 or x86_64 are not so constrained. + */ + +#include +#include +#include +#include +#include + +#define FILE_NAME "huge/hugepagefile" +#define LENGTH (256UL*1024*1024) +#define PROTECTION (PROT_READ | PROT_WRITE) + +/* Only ia64 requires this */ +#ifdef __ia64__ +#define ADDR (void *)(0x8000000000000000UL) +#define FLAGS (MAP_SHARED | MAP_FIXED) +#else +#define ADDR (void *)(0x0UL) +#define FLAGS (MAP_SHARED) +#endif + +static void check_bytes(char *addr) +{ + printf("First hex is %x\n", *((unsigned int *)addr)); +} + +static void write_bytes(char *addr) +{ + unsigned long i; + + for (i = 0; i < LENGTH; i++) + *(addr + i) = (char)i; +} + +static int read_bytes(char *addr) +{ + unsigned long i; + + check_bytes(addr); + for (i = 0; i < LENGTH; i++) + if (*(addr + i) != (char)i) { + printf("Mismatch at %lu\n", i); + return 1; + } + return 0; +} + +int main(void) +{ + void *addr; + int fd, ret; + + fd = open(FILE_NAME, O_CREAT | O_RDWR, 0755); + if (fd < 0) { + perror("Open failed"); + exit(1); + } + + addr = mmap(ADDR, LENGTH, PROTECTION, FLAGS, fd, 0); + if (addr == MAP_FAILED) { + perror("mmap"); + unlink(FILE_NAME); + exit(1); + } + + printf("Returned address is %p\n", addr); + check_bytes(addr); + write_bytes(addr); + ret = read_bytes(addr); + + munmap(addr, LENGTH); + close(fd); + unlink(FILE_NAME); + + return ret; +} diff --git a/tools/testing/selftests/vm/hugepage-shm.c b/tools/testing/selftests/vm/hugepage-shm.c new file mode 100644 index 000000000000..0d0ef4fc0c04 --- /dev/null +++ b/tools/testing/selftests/vm/hugepage-shm.c @@ -0,0 +1,100 @@ +/* + * hugepage-shm: + * + * Example of using huge page memory in a user application using Sys V shared + * memory system calls. In this example the app is requesting 256MB of + * memory that is backed by huge pages. The application uses the flag + * SHM_HUGETLB in the shmget system call to inform the kernel that it is + * requesting huge pages. + * + * For the ia64 architecture, the Linux kernel reserves Region number 4 for + * huge pages. That means that if one requires a fixed address, a huge page + * aligned address starting with 0x800000... will be required. If a fixed + * address is not required, the kernel will select an address in the proper + * range. + * Other architectures, such as ppc64, i386 or x86_64 are not so constrained. + * + * Note: The default shared memory limit is quite low on many kernels, + * you may need to increase it via: + * + * echo 268435456 > /proc/sys/kernel/shmmax + * + * This will increase the maximum size per shared memory segment to 256MB. + * The other limit that you will hit eventually is shmall which is the + * total amount of shared memory in pages. To set it to 16GB on a system + * with a 4kB pagesize do: + * + * echo 4194304 > /proc/sys/kernel/shmall + */ + +#include +#include +#include +#include +#include +#include + +#ifndef SHM_HUGETLB +#define SHM_HUGETLB 04000 +#endif + +#define LENGTH (256UL*1024*1024) + +#define dprintf(x) printf(x) + +/* Only ia64 requires this */ +#ifdef __ia64__ +#define ADDR (void *)(0x8000000000000000UL) +#define SHMAT_FLAGS (SHM_RND) +#else +#define ADDR (void *)(0x0UL) +#define SHMAT_FLAGS (0) +#endif + +int main(void) +{ + int shmid; + unsigned long i; + char *shmaddr; + + shmid = shmget(2, LENGTH, SHM_HUGETLB | IPC_CREAT | SHM_R | SHM_W); + if (shmid < 0) { + perror("shmget"); + exit(1); + } + printf("shmid: 0x%x\n", shmid); + + shmaddr = shmat(shmid, ADDR, SHMAT_FLAGS); + if (shmaddr == (char *)-1) { + perror("Shared memory attach failure"); + shmctl(shmid, IPC_RMID, NULL); + exit(2); + } + printf("shmaddr: %p\n", shmaddr); + + dprintf("Starting the writes:\n"); + for (i = 0; i < LENGTH; i++) { + shmaddr[i] = (char)(i); + if (!(i % (1024 * 1024))) + dprintf("."); + } + dprintf("\n"); + + dprintf("Starting the Check..."); + for (i = 0; i < LENGTH; i++) + if (shmaddr[i] != (char)i) { + printf("\nIndex %lu mismatched\n", i); + exit(3); + } + dprintf("Done.\n"); + + if (shmdt((const void *)shmaddr) != 0) { + perror("Detach failure"); + shmctl(shmid, IPC_RMID, NULL); + exit(4); + } + + shmctl(shmid, IPC_RMID, NULL); + + return 0; +} diff --git a/tools/testing/selftests/vm/map_hugetlb.c b/tools/testing/selftests/vm/map_hugetlb.c new file mode 100644 index 000000000000..ac56639dd4a9 --- /dev/null +++ b/tools/testing/selftests/vm/map_hugetlb.c @@ -0,0 +1,79 @@ +/* + * Example of using hugepage memory in a user application using the mmap + * system call with MAP_HUGETLB flag. Before running this program make + * sure the administrator has allocated enough default sized huge pages + * to cover the 256 MB allocation. + * + * For ia64 architecture, Linux kernel reserves Region number 4 for hugepages. + * That means the addresses starting with 0x800000... will need to be + * specified. Specifying a fixed address is not required on ppc64, i386 + * or x86_64. + */ +#include +#include +#include +#include +#include + +#define LENGTH (256UL*1024*1024) +#define PROTECTION (PROT_READ | PROT_WRITE) + +#ifndef MAP_HUGETLB +#define MAP_HUGETLB 0x40000 /* arch specific */ +#endif + +/* Only ia64 requires this */ +#ifdef __ia64__ +#define ADDR (void *)(0x8000000000000000UL) +#define FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_FIXED) +#else +#define ADDR (void *)(0x0UL) +#define FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB) +#endif + +static void check_bytes(char *addr) +{ + printf("First hex is %x\n", *((unsigned int *)addr)); +} + +static void write_bytes(char *addr) +{ + unsigned long i; + + for (i = 0; i < LENGTH; i++) + *(addr + i) = (char)i; +} + +static int read_bytes(char *addr) +{ + unsigned long i; + + check_bytes(addr); + for (i = 0; i < LENGTH; i++) + if (*(addr + i) != (char)i) { + printf("Mismatch at %lu\n", i); + return 1; + } + return 0; +} + +int main(void) +{ + void *addr; + int ret; + + addr = mmap(ADDR, LENGTH, PROTECTION, FLAGS, 0, 0); + if (addr == MAP_FAILED) { + perror("mmap"); + exit(1); + } + + printf("Returned address is %p\n", addr); + check_bytes(addr); + write_bytes(addr); + ret = read_bytes(addr); + + munmap(addr, LENGTH); + + return ret; +} diff --git a/tools/testing/selftests/vm/run_vmtests b/tools/testing/selftests/vm/run_vmtests new file mode 100644 index 000000000000..8b40bd5e5cc2 --- /dev/null +++ b/tools/testing/selftests/vm/run_vmtests @@ -0,0 +1,77 @@ +#!/bin/bash +#please run as root + +#we need 256M, below is the size in kB +needmem=262144 +mnt=./huge + +#get pagesize and freepages from /proc/meminfo +while read name size unit; do + if [ "$name" = "HugePages_Free:" ]; then + freepgs=$size + fi + if [ "$name" = "Hugepagesize:" ]; then + pgsize=$size + fi +done < /proc/meminfo + +#set proper nr_hugepages +if [ -n "$freepgs" ] && [ -n "$pgsize" ]; then + nr_hugepgs=`cat /proc/sys/vm/nr_hugepages` + needpgs=`expr $needmem / $pgsize` + if [ $freepgs -lt $needpgs ]; then + lackpgs=$(( $needpgs - $freepgs )) + echo $(( $lackpgs + $nr_hugepgs )) > /proc/sys/vm/nr_hugepages + if [ $? -ne 0 ]; then + echo "Please run this test as root" + exit 1 + fi + fi +else + echo "no hugetlbfs support in kernel?" + exit 1 +fi + +mkdir $mnt +mount -t hugetlbfs none $mnt + +echo "--------------------" +echo "runing hugepage-mmap" +echo "--------------------" +./hugepage-mmap +if [ $? -ne 0 ]; then + echo "[FAIL]" +else + echo "[PASS]" +fi + +shmmax=`cat /proc/sys/kernel/shmmax` +shmall=`cat /proc/sys/kernel/shmall` +echo 268435456 > /proc/sys/kernel/shmmax +echo 4194304 > /proc/sys/kernel/shmall +echo "--------------------" +echo "runing hugepage-shm" +echo "--------------------" +./hugepage-shm +if [ $? -ne 0 ]; then + echo "[FAIL]" +else + echo "[PASS]" +fi +echo $shmmax > /proc/sys/kernel/shmmax +echo $shmall > /proc/sys/kernel/shmall + +echo "--------------------" +echo "runing map_hugetlb" +echo "--------------------" +./map_hugetlb +if [ $? -ne 0 ]; then + echo "[FAIL]" +else + echo "[PASS]" +fi + +#cleanup +umount $mnt +rm -rf $mnt +echo $nr_hugepgs > /proc/sys/vm/nr_hugepages -- cgit v1.2.3 From 5f054e31c63be774bf1ce252f20d56012a00f8a5 Mon Sep 17 00:00:00 2001 From: Rusty Russell Date: Thu, 29 Mar 2012 15:38:31 +1030 Subject: documentation: remove references to cpu_*_map. This has been obsolescent for a while, fix documentation and misc comments. Signed-off-by: Rusty Russell --- Documentation/cgroups/cpusets.txt | 2 +- Documentation/cpu-hotplug.txt | 22 +++++++++++----------- arch/alpha/kernel/smp.c | 2 +- arch/ia64/kernel/acpi.c | 2 +- arch/mips/cavium-octeon/smp.c | 2 +- arch/mips/pmc-sierra/yosemite/smp.c | 2 +- arch/mips/sibyte/bcm1480/smp.c | 2 +- arch/tile/kernel/setup.c | 2 +- arch/x86/xen/enlighten.c | 2 +- init/Kconfig | 4 ++-- kernel/cpuset.c | 10 +++++----- 11 files changed, 26 insertions(+), 26 deletions(-) (limited to 'Documentation') diff --git a/Documentation/cgroups/cpusets.txt b/Documentation/cgroups/cpusets.txt index 5c51ed406d1d..cefd3d8bbd11 100644 --- a/Documentation/cgroups/cpusets.txt +++ b/Documentation/cgroups/cpusets.txt @@ -217,7 +217,7 @@ and name space for cpusets, with a minimum of additional kernel code. The cpus and mems files in the root (top_cpuset) cpuset are read-only. The cpus file automatically tracks the value of -cpu_online_map using a CPU hotplug notifier, and the mems file +cpu_online_mask using a CPU hotplug notifier, and the mems file automatically tracks the value of node_states[N_HIGH_MEMORY]--i.e., nodes with memory--using the cpuset_track_online_nodes() hook. diff --git a/Documentation/cpu-hotplug.txt b/Documentation/cpu-hotplug.txt index a20bfd415e41..66ef8f35613d 100644 --- a/Documentation/cpu-hotplug.txt +++ b/Documentation/cpu-hotplug.txt @@ -47,7 +47,7 @@ maxcpus=n Restrict boot time cpus to n. Say if you have 4 cpus, using other cpus later online, read FAQ's for more info. additional_cpus=n (*) Use this to limit hotpluggable cpus. This option sets - cpu_possible_map = cpu_present_map + additional_cpus + cpu_possible_mask = cpu_present_mask + additional_cpus cede_offline={"off","on"} Use this option to disable/enable putting offlined processors to an extended H_CEDE state on @@ -64,11 +64,11 @@ should only rely on this to count the # of cpus, but *MUST* not rely on the apicid values in those tables for disabled apics. In the event BIOS doesn't mark such hot-pluggable cpus as disabled entries, one could use this parameter "additional_cpus=x" to represent those cpus in the -cpu_possible_map. +cpu_possible_mask. possible_cpus=n [s390,x86_64] use this to set hotpluggable cpus. This option sets possible_cpus bits in - cpu_possible_map. Thus keeping the numbers of bits set + cpu_possible_mask. Thus keeping the numbers of bits set constant even if the machine gets rebooted. CPU maps and such @@ -76,7 +76,7 @@ CPU maps and such [More on cpumaps and primitive to manipulate, please check include/linux/cpumask.h that has more descriptive text.] -cpu_possible_map: Bitmap of possible CPUs that can ever be available in the +cpu_possible_mask: Bitmap of possible CPUs that can ever be available in the system. This is used to allocate some boot time memory for per_cpu variables that aren't designed to grow/shrink as CPUs are made available or removed. Once set during boot time discovery phase, the map is static, i.e no bits @@ -84,13 +84,13 @@ are added or removed anytime. Trimming it accurately for your system needs upfront can save some boot time memory. See below for how we use heuristics in x86_64 case to keep this under check. -cpu_online_map: Bitmap of all CPUs currently online. Its set in __cpu_up() +cpu_online_mask: Bitmap of all CPUs currently online. Its set in __cpu_up() after a cpu is available for kernel scheduling and ready to receive interrupts from devices. Its cleared when a cpu is brought down using __cpu_disable(), before which all OS services including interrupts are migrated to another target CPU. -cpu_present_map: Bitmap of CPUs currently present in the system. Not all +cpu_present_mask: Bitmap of CPUs currently present in the system. Not all of them may be online. When physical hotplug is processed by the relevant subsystem (e.g ACPI) can change and new bit either be added or removed from the map depending on the event is hot-add/hot-remove. There are currently @@ -99,22 +99,22 @@ at which time hotplug is disabled. You really dont need to manipulate any of the system cpu maps. They should be read-only for most use. When setting up per-cpu resources almost always use -cpu_possible_map/for_each_possible_cpu() to iterate. +cpu_possible_mask/for_each_possible_cpu() to iterate. Never use anything other than cpumask_t to represent bitmap of CPUs. #include - for_each_possible_cpu - Iterate over cpu_possible_map - for_each_online_cpu - Iterate over cpu_online_map - for_each_present_cpu - Iterate over cpu_present_map + for_each_possible_cpu - Iterate over cpu_possible_mask + for_each_online_cpu - Iterate over cpu_online_mask + for_each_present_cpu - Iterate over cpu_present_mask for_each_cpu_mask(x,mask) - Iterate over some random collection of cpu mask. #include get_online_cpus() and put_online_cpus(): The above calls are used to inhibit cpu hotplug operations. While the -cpu_hotplug.refcount is non zero, the cpu_online_map will not change. +cpu_hotplug.refcount is non zero, the cpu_online_mask will not change. If you merely need to avoid cpus going away, you could also use preempt_disable() and preempt_enable() for those sections. Just remember the critical section cannot call any diff --git a/arch/alpha/kernel/smp.c b/arch/alpha/kernel/smp.c index 4087a569b43b..50d438db1f6b 100644 --- a/arch/alpha/kernel/smp.c +++ b/arch/alpha/kernel/smp.c @@ -450,7 +450,7 @@ setup_smp(void) smp_num_probed = 1; } - printk(KERN_INFO "SMP: %d CPUs probed -- cpu_present_map = %lx\n", + printk(KERN_INFO "SMP: %d CPUs probed -- cpu_present_mask = %lx\n", smp_num_probed, cpumask_bits(cpu_present_mask)[0]); } diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c index ac795d311f44..6f38b6120d96 100644 --- a/arch/ia64/kernel/acpi.c +++ b/arch/ia64/kernel/acpi.c @@ -839,7 +839,7 @@ static __init int setup_additional_cpus(char *s) early_param("additional_cpus", setup_additional_cpus); /* - * cpu_possible_map should be static, it cannot change as CPUs + * cpu_possible_mask should be static, it cannot change as CPUs * are onlined, or offlined. The reason is per-cpu data-structures * are allocated by some modules at init time, and dont expect to * do this dynamically on cpu arrival/departure. diff --git a/arch/mips/cavium-octeon/smp.c b/arch/mips/cavium-octeon/smp.c index ef56d8a0215f..97e7ce9b50ed 100644 --- a/arch/mips/cavium-octeon/smp.c +++ b/arch/mips/cavium-octeon/smp.c @@ -78,7 +78,7 @@ static inline void octeon_send_ipi_mask(const struct cpumask *mask, } /** - * Detect available CPUs, populate cpu_possible_map + * Detect available CPUs, populate cpu_possible_mask */ static void octeon_smp_hotplug_setup(void) { diff --git a/arch/mips/pmc-sierra/yosemite/smp.c b/arch/mips/pmc-sierra/yosemite/smp.c index 00a0d2e90d04..b71fae231049 100644 --- a/arch/mips/pmc-sierra/yosemite/smp.c +++ b/arch/mips/pmc-sierra/yosemite/smp.c @@ -146,7 +146,7 @@ static void __cpuinit yos_boot_secondary(int cpu, struct task_struct *idle) } /* - * Detect available CPUs, populate cpu_possible_map before smp_init + * Detect available CPUs, populate cpu_possible_mask before smp_init * * We don't want to start the secondary CPU yet nor do we have a nice probing * feature in PMON so we just assume presence of the secondary core. diff --git a/arch/mips/sibyte/bcm1480/smp.c b/arch/mips/sibyte/bcm1480/smp.c index 63d2211e6167..de88e22694a0 100644 --- a/arch/mips/sibyte/bcm1480/smp.c +++ b/arch/mips/sibyte/bcm1480/smp.c @@ -138,7 +138,7 @@ static void __cpuinit bcm1480_boot_secondary(int cpu, struct task_struct *idle) /* * Use CFE to find out how many CPUs are available, setting up - * cpu_possible_map and the logical/physical mappings. + * cpu_possible_mask and the logical/physical mappings. * XXXKW will the boot CPU ever not be physical 0? * * Common setup before any secondaries are started diff --git a/arch/tile/kernel/setup.c b/arch/tile/kernel/setup.c index 5124093b2e1d..92a94f4920ad 100644 --- a/arch/tile/kernel/setup.c +++ b/arch/tile/kernel/setup.c @@ -1100,7 +1100,7 @@ EXPORT_SYMBOL(hash_for_home_map); /* * cpu_cacheable_map lists all the cpus whose caches the hypervisor can - * flush on our behalf. It is set to cpu_possible_map OR'ed with + * flush on our behalf. It is set to cpu_possible_mask OR'ed with * hash_for_home_map, and it is what should be passed to * hv_flush_remote() to flush all caches. Note that if there are * dedicated hypervisor driver tiles that have authorized use of their diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index b132ade26f77..4f51bebac02c 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -967,7 +967,7 @@ void xen_setup_shared_info(void) xen_setup_mfn_list_list(); } -/* This is called once we have the cpu_possible_map */ +/* This is called once we have the cpu_possible_mask */ void xen_setup_vcpu_info_placement(void) { int cpu; diff --git a/init/Kconfig b/init/Kconfig index 72f33faca44f..6cfd71d06463 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -1414,8 +1414,8 @@ endif # MODULES config INIT_ALL_POSSIBLE bool help - Back when each arch used to define their own cpu_online_map and - cpu_possible_map, some of them chose to initialize cpu_possible_map + Back when each arch used to define their own cpu_online_mask and + cpu_possible_mask, some of them chose to initialize cpu_possible_mask with all 1s, and others with all 0s. When they were centralised, it was better to provide this option than to break all the archs and have several arch maintainers pursuing me down dark alleys. diff --git a/kernel/cpuset.c b/kernel/cpuset.c index 1010cc61931f..eedeebe64b1a 100644 --- a/kernel/cpuset.c +++ b/kernel/cpuset.c @@ -270,11 +270,11 @@ static struct file_system_type cpuset_fs_type = { * are online. If none are online, walk up the cpuset hierarchy * until we find one that does have some online cpus. If we get * all the way to the top and still haven't found any online cpus, - * return cpu_online_map. Or if passed a NULL cs from an exit'ing - * task, return cpu_online_map. + * return cpu_online_mask. Or if passed a NULL cs from an exit'ing + * task, return cpu_online_mask. * * One way or another, we guarantee to return some non-empty subset - * of cpu_online_map. + * of cpu_online_mask. * * Call with callback_mutex held. */ @@ -867,7 +867,7 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs, int retval; int is_load_balanced; - /* top_cpuset.cpus_allowed tracks cpu_online_map; it's read-only */ + /* top_cpuset.cpus_allowed tracks cpu_online_mask; it's read-only */ if (cs == &top_cpuset) return -EACCES; @@ -2149,7 +2149,7 @@ void __init cpuset_init_smp(void) * * Description: Returns the cpumask_var_t cpus_allowed of the cpuset * attached to the specified @tsk. Guaranteed to return some non-empty - * subset of cpu_online_map, even if this means going outside the + * subset of cpu_online_mask, even if this means going outside the * tasks cpuset. **/ -- cgit v1.2.3 From ec0c4274e33c0373e476b73e01995c53128f1257 Mon Sep 17 00:00:00 2001 From: Kees Cook Date: Fri, 23 Mar 2012 12:08:55 -0700 Subject: futex: Mark get_robust_list as deprecated Notify get_robust_list users that the syscall is going away. Suggested-by: Thomas Gleixner Signed-off-by: Kees Cook Cc: Randy Dunlap Cc: Darren Hart Cc: Peter Zijlstra Cc: Jiri Kosina Cc: Eric W. Biederman Cc: David Howells Cc: Serge E. Hallyn Cc: kernel-hardening@lists.openwall.com Cc: spender@grsecurity.net Link: http://lkml.kernel.org/r/20120323190855.GA27213@www.outflux.net Signed-off-by: Thomas Gleixner --- Documentation/feature-removal-schedule.txt | 10 ++++++++++ kernel/futex.c | 2 ++ kernel/futex_compat.c | 2 ++ 3 files changed, 14 insertions(+) (limited to 'Documentation') diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index 0cad4803ffac..c1be8066ea59 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -529,3 +529,13 @@ When: 3.5 Why: The old kmap_atomic() with two arguments is deprecated, we only keep it for backward compatibility for few cycles and then drop it. Who: Cong Wang + +---------------------------- + +What: get_robust_list syscall +When: 2013 +Why: There appear to be no production users of the get_robust_list syscall, + and it runs the risk of leaking address locations, allowing the bypass + of ASLR. It was only ever intended for debugging, so it should be + removed. +Who: Kees Cook diff --git a/kernel/futex.c b/kernel/futex.c index d701be57c423..e2b0fb9a0b3b 100644 --- a/kernel/futex.c +++ b/kernel/futex.c @@ -2449,6 +2449,8 @@ SYSCALL_DEFINE3(get_robust_list, int, pid, if (!futex_cmpxchg_enabled) return -ENOSYS; + WARN_ONCE(1, "deprecated: get_robust_list will be deleted in 2013.\n"); + rcu_read_lock(); ret = -ESRCH; diff --git a/kernel/futex_compat.c b/kernel/futex_compat.c index a9642d528630..83e368b005fc 100644 --- a/kernel/futex_compat.c +++ b/kernel/futex_compat.c @@ -142,6 +142,8 @@ compat_sys_get_robust_list(int pid, compat_uptr_t __user *head_ptr, if (!futex_cmpxchg_enabled) return -ENOSYS; + WARN_ONCE(1, "deprecated: get_robust_list will be deleted in 2013.\n"); + rcu_read_lock(); ret = -ESRCH; -- cgit v1.2.3 From 31134efc681a5440e2b952eed3bf9a5306a95062 Mon Sep 17 00:00:00 2001 From: Grant Likely Date: Fri, 4 Nov 2011 11:51:22 -0400 Subject: dt: Linux DT usage model documentation v2: 2nd draft - Editorial cleanups (Randy Dunlap and Stephen Warren) - Added missing Microblaze reference (Stephen Neuendorffer) - Make example of platform_device creation clearer (Shawn Guo) - Expand on PowerPC history and mention i2c mess (David Gibson) - convert to plain text (remove bits of html formating) Signed-off-by: Grant Likely --- Documentation/devicetree/usage-model.txt | 412 +++++++++++++++++++++++++++++++ 1 file changed, 412 insertions(+) create mode 100644 Documentation/devicetree/usage-model.txt (limited to 'Documentation') diff --git a/Documentation/devicetree/usage-model.txt b/Documentation/devicetree/usage-model.txt new file mode 100644 index 000000000000..c5a80099b71c --- /dev/null +++ b/Documentation/devicetree/usage-model.txt @@ -0,0 +1,412 @@ +Linux and the Device Tree +------------------------- +The Linux usage model for device tree data + +Author: Grant Likely + +This article describes how Linux uses the device tree. An overview of +the device tree data format can be found on the device tree usage page +at devicetree.org[1]. + +[1] http://devicetree.org/Device_Tree_Usage + +The "Open Firmware Device Tree", or simply Device Tree (DT), is a data +structure and language for describing hardware. More specifically, it +is a description of hardware that is readable by an operating system +so that the operating system doesn't need to hard code details of the +machine. + +Structurally, the DT is a tree, or acyclic graph with named nodes, and +nodes may have an arbitrary number of named properties encapsulating +arbitrary data. A mechanism also exists to create arbitrary +links from one node to another outside of the natural tree structure. + +Conceptually, a common set of usage conventions, called 'bindings', +is defined for how data should appear in the tree to describe typical +hardware characteristics including data busses, interrupt lines, GPIO +connections, and peripheral devices. + +As much as possible, hardware is described using existing bindings to +maximize use of existing support code, but since property and node +names are simply text strings, it is easy to extend existing bindings +or create new ones by defining new nodes and properties. Be wary, +however, of creating a new binding without first doing some homework +about what already exists. There are currently two different, +incompatible, bindings for i2c busses that came about because the new +binding was created without first investigating how i2c devices were +already being enumerated in existing systems. + +1. History +---------- +The DT was originally created by Open Firmware as part of the +communication method for passing data from Open Firmware to a client +program (like to an operating system). An operating system used the +Device Tree to discover the topology of the hardware at runtime, and +thereby support a majority of available hardware without hard coded +information (assuming drivers were available for all devices). + +Since Open Firmware is commonly used on PowerPC and SPARC platforms, +the Linux support for those architectures has for a long time used the +Device Tree. + +In 2005, when PowerPC Linux began a major cleanup and to merge 32-bit +and 64-bit support, the decision was made to require DT support on all +powerpc platforms, regardless of whether or not they used Open +Firmware. To do this, a DT representation called the Flattened Device +Tree (FDT) was created which could be passed to the kernel as a binary +blob without requiring a real Open Firmware implementation. U-Boot, +kexec, and other bootloaders were modified to support both passing a +Device Tree Binary (dtb) and to modify a dtb at boot time. DT was +also added to the PowerPC boot wrapper (arch/powerpc/boot/*) so that +a dtb could be wrapped up with the kernel image to support booting +existing non-DT aware firmware. + +Some time later, FDT infrastructure was generalized to be usable by +all architectures. At the time of this writing, 6 mainlined +architectures (arm, microblaze, mips, powerpc, sparc, and x86) and 1 +out of mainline (nios) have some level of DT support. + +2. Data Model +------------- +If you haven't already read the Device Tree Usage[1] page, +then go read it now. It's okay, I'll wait.... + +2.1 High Level View +------------------- +The most important thing to understand is that the DT is simply a data +structure that describes the hardware. There is nothing magical about +it, and it doesn't magically make all hardware configuration problems +go away. What it does do is provide a language for decoupling the +hardware configuration from the board and device driver support in the +Linux kernel (or any other operating system for that matter). Using +it allows board and device support to become data driven; to make +setup decisions based on data passed into the kernel instead of on +per-machine hard coded selections. + +Ideally, data driven platform setup should result in less code +duplication and make it easier to support a wide range of hardware +with a single kernel image. + +Linux uses DT data for three major purposes: +1) platform identification, +2) runtime configuration, and +3) device population. + +2.2 Platform Identification +--------------------------- +First and foremost, the kernel will use data in the DT to identify the +specific machine. In a perfect world, the specific platform shouldn't +matter to the kernel because all platform details would be described +perfectly by the device tree in a consistent and reliable manner. +Hardware is not perfect though, and so the kernel must identify the +machine during early boot so that it has the opportunity to run +machine-specific fixups. + +In the majority of cases, the machine identity is irrelevant, and the +kernel will instead select setup code based on the machine's core +CPU or SoC. On ARM for example, setup_arch() in +arch/arm/kernel/setup.c will call setup_machine_fdt() in +arch/arm/kernel/devicetree.c which searches through the machine_desc +table and selects the machine_desc which best matches the device tree +data. It determines the best match by looking at the 'compatible' +property in the root device tree node, and comparing it with the +dt_compat list in struct machine_desc. + +The 'compatible' property contains a sorted list of strings starting +with the exact name of the machine, followed by an optional list of +boards it is compatible with sorted from most compatible to least. For +example, the root compatible properties for the TI BeagleBoard and its +successor, the BeagleBoard xM board might look like: + + compatible = "ti,omap3-beagleboard", "ti,omap3450", "ti,omap3"; + compatible = "ti,omap3-beagleboard-xm", "ti,omap3450", "ti,omap3"; + +Where "ti,omap3-beagleboard-xm" specifies the exact model, it also +claims that it compatible with the OMAP 3450 SoC, and the omap3 family +of SoCs in general. You'll notice that the list is sorted from most +specific (exact board) to least specific (SoC family). + +Astute readers might point out that the Beagle xM could also claim +compatibility with the original Beagle board. However, one should be +cautioned about doing so at the board level since there is typically a +high level of change from one board to another, even within the same +product line, and it is hard to nail down exactly what is meant when one +board claims to be compatible with another. For the top level, it is +better to err on the side of caution and not claim one board is +compatible with another. The notable exception would be when one +board is a carrier for another, such as a CPU module attached to a +carrier board. + +One more note on compatible values. Any string used in a compatible +property must be documented as to what it indicates. Add +documentation for compatible strings in Documentation/devicetree/bindings. + +Again on ARM, for each machine_desc, the kernel looks to see if +any of the dt_compat list entries appear in the compatible property. +If one does, then that machine_desc is a candidate for driving the +machine. After searching the entire table of machine_descs, +setup_machine_fdt() returns the 'most compatible' machine_desc based +on which entry in the compatible property each machine_desc matches +against. If no matching machine_desc is found, then it returns NULL. + +The reasoning behind this scheme is the observation that in the majority +of cases, a single machine_desc can support a large number of boards +if they all use the same SoC, or same family of SoCs. However, +invariably there will be some exceptions where a specific board will +require special setup code that is not useful in the generic case. +Special cases could be handled by explicitly checking for the +troublesome board(s) in generic setup code, but doing so very quickly +becomes ugly and/or unmaintainable if it is more than just a couple of +cases. + +Instead, the compatible list allows a generic machine_desc to provide +support for a wide common set of boards by specifying "less +compatible" value in the dt_compat list. In the example above, +generic board support can claim compatibility with "ti,omap3" or +"ti,omap3450". If a bug was discovered on the original beagleboard +that required special workaround code during early boot, then a new +machine_desc could be added which implements the workarounds and only +matches on "ti,omap3-beagleboard". + +PowerPC uses a slightly different scheme where it calls the .probe() +hook from each machine_desc, and the first one returning TRUE is used. +However, this approach does not take into account the priority of the +compatible list, and probably should be avoided for new architecture +support. + +2.3 Runtime configuration +------------------------- +In most cases, a DT will be the sole method of communicating data from +firmware to the kernel, so also gets used to pass in runtime and +configuration data like the kernel parameters string and the location +of an initrd image. + +Most of this data is contained in the /chosen node, and when booting +Linux it will look something like this: + + chosen { + bootargs = "console=ttyS0,115200 loglevel=8"; + initrd-start = <0xc8000000>; + initrd-end = <0xc8200000>; + }; + +The bootargs property contains the kernel arguments, and the initrd-* +properties define the address and size of an initrd blob. The +chosen node may also optionally contain an arbitrary number of +additional properties for platform-specific configuration data. + +During early boot, the architecture setup code calls of_scan_flat_dt() +several times with different helper callbacks to parse device tree +data before paging is setup. The of_scan_flat_dt() code scans through +the device tree and uses the helpers to extract information required +during early boot. Typically the early_init_dt_scan_chosen() helper +is used to parse the chosen node including kernel parameters, +early_init_dt_scan_root() to initialize the DT address space model, +and early_init_dt_scan_memory() to determine the size and +location of usable RAM. + +On ARM, the function setup_machine_fdt() is responsible for early +scanning of the device tree after selecting the correct machine_desc +that supports the board. + +2.4 Device population +--------------------- +After the board has been identified, and after the early configuration data +has been parsed, then kernel initialization can proceed in the normal +way. At some point in this process, unflatten_device_tree() is called +to convert the data into a more efficient runtime representation. +This is also when machine-specific setup hooks will get called, like +the machine_desc .init_early(), .init_irq() and .init_machine() hooks +on ARM. The remainder of this section uses examples from the ARM +implementation, but all architectures will do pretty much the same +thing when using a DT. + +As can be guessed by the names, .init_early() is used for any machine- +specific setup that needs to be executed early in the boot process, +and .init_irq() is used to set up interrupt handling. Using a DT +doesn't materially change the behaviour of either of these functions. +If a DT is provided, then both .init_early() and .init_irq() are able +to call any of the DT query functions (of_* in include/linux/of*.h) to +get additional data about the platform. + +The most interesting hook in the DT context is .init_machine() which +is primarily responsible for populating the Linux device model with +data about the platform. Historically this has been implemented on +embedded platforms by defining a set of static clock structures, +platform_devices, and other data in the board support .c file, and +registering it en-masse in .init_machine(). When DT is used, then +instead of hard coding static devices for each platform, the list of +devices can be obtained by parsing the DT, and allocating device +structures dynamically. + +The simplest case is when .init_machine() is only responsible for +registering a block of platform_devices. A platform_device is a concept +used by Linux for memory or I/O mapped devices which cannot be detected +by hardware, and for 'composite' or 'virtual' devices (more on those +later). While there is no 'platform device' terminology for the DT, +platform devices roughly correspond to device nodes at the root of the +tree and children of simple memory mapped bus nodes. + +About now is a good time to lay out an example. Here is part of the +device tree for the NVIDIA Tegra board. + +/{ + compatible = "nvidia,harmony", "nvidia,tegra20"; + #address-cells = <1>; + #size-cells = <1>; + interrupt-parent = <&intc>; + + chosen { }; + aliases { }; + + memory { + device_type = "memory"; + reg = <0x00000000 0x40000000>; + }; + + soc { + compatible = "nvidia,tegra20-soc", "simple-bus"; + #address-cells = <1>; + #size-cells = <1>; + ranges; + + intc: interrupt-controller@50041000 { + compatible = "nvidia,tegra20-gic"; + interrupt-controller; + #interrupt-cells = <1>; + reg = <0x50041000 0x1000>, < 0x50040100 0x0100 >; + }; + + serial@70006300 { + compatible = "nvidia,tegra20-uart"; + reg = <0x70006300 0x100>; + interrupts = <122>; + }; + + i2s1: i2s@70002800 { + compatible = "nvidia,tegra20-i2s"; + reg = <0x70002800 0x100>; + interrupts = <77>; + codec = <&wm8903>; + }; + + i2c@7000c000 { + compatible = "nvidia,tegra20-i2c"; + #address-cells = <1>; + #size-cells = <0>; + reg = <0x7000c000 0x100>; + interrupts = <70>; + + wm8903: codec@1a { + compatible = "wlf,wm8903"; + reg = <0x1a>; + interrupts = <347>; + }; + }; + }; + + sound { + compatible = "nvidia,harmony-sound"; + i2s-controller = <&i2s1>; + i2s-codec = <&wm8903>; + }; +}; + +At .machine_init() time, Tegra board support code will need to look at +this DT and decide which nodes to create platform_devices for. +However, looking at the tree, it is not immediately obvious what kind +of device each node represents, or even if a node represents a device +at all. The /chosen, /aliases, and /memory nodes are informational +nodes that don't describe devices (although arguably memory could be +considered a device). The children of the /soc node are memory mapped +devices, but the codec@1a is an i2c device, and the sound node +represents not a device, but rather how other devices are connected +together to create the audio subsystem. I know what each device is +because I'm familiar with the board design, but how does the kernel +know what to do with each node? + +The trick is that the kernel starts at the root of the tree and looks +for nodes that have a 'compatible' property. First, it is generally +assumed that any node with a 'compatible' property represents a device +of some kind, and second, it can be assumed that any node at the root +of the tree is either directly attached to the processor bus, or is a +miscellaneous system device that cannot be described any other way. +For each of these nodes, Linux allocates and registers a +platform_device, which in turn may get bound to a platform_driver. + +Why is using a platform_device for these nodes a safe assumption? +Well, for the way that Linux models devices, just about all bus_types +assume that its devices are children of a bus controller. For +example, each i2c_client is a child of an i2c_master. Each spi_device +is a child of an SPI bus. Similarly for USB, PCI, MDIO, etc. The +same hierarchy is also found in the DT, where I2C device nodes only +ever appear as children of an I2C bus node. Ditto for SPI, MDIO, USB, +etc. The only devices which do not require a specific type of parent +device are platform_devices (and amba_devices, but more on that +later), which will happily live at the base of the Linux /sys/devices +tree. Therefore, if a DT node is at the root of the tree, then it +really probably is best registered as a platform_device. + +Linux board support code calls of_platform_populate(NULL, NULL, NULL) +to kick off discovery of devices at the root of the tree. The +parameters are all NULL because when starting from the root of the +tree, there is no need to provide a starting node (the first NULL), a +parent struct device (the last NULL), and we're not using a match +table (yet). For a board that only needs to register devices, +.init_machine() can be completely empty except for the +of_platform_populate() call. + +In the Tegra example, this accounts for the /soc and /sound nodes, but +what about the children of the SoC node? Shouldn't they be registered +as platform devices too? For Linux DT support, the generic behaviour +is for child devices to be registered by the parent's device driver at +driver .probe() time. So, an i2c bus device driver will register a +i2c_client for each child node, an SPI bus driver will register +its spi_device children, and similarly for other bus_types. +According to that model, a driver could be written that binds to the +SoC node and simply registers platform_devices for each of its +children. The board support code would allocate and register an SoC +device, a (theoretical) SoC device driver could bind to the SoC device, +and register platform_devices for /soc/interrupt-controller, /soc/serial, +/soc/i2s, and /soc/i2c in its .probe() hook. Easy, right? + +Actually, it turns out that registering children of some +platform_devices as more platform_devices is a common pattern, and the +device tree support code reflects that and makes the above example +simpler. The second argument to of_platform_populate() is an +of_device_id table, and any node that matches an entry in that table +will also get its child nodes registered. In the tegra case, the code +can look something like this: + +static void __init harmony_init_machine(void) +{ + /* ... */ + of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL); +} + +"simple-bus" is defined in the ePAPR 1.0 specification as a property +meaning a simple memory mapped bus, so the of_platform_populate() code +could be written to just assume simple-bus compatible nodes will +always be traversed. However, we pass it in as an argument so that +board support code can always override the default behaviour. + +[Need to add discussion of adding i2c/spi/etc child devices] + +Appendix A: AMBA devices +------------------------ + +ARM Primecells are a certain kind of device attached to the ARM AMBA +bus which include some support for hardware detection and power +management. In Linux, struct amba_device and the amba_bus_type is +used to represent Primecell devices. However, the fiddly bit is that +not all devices on an AMBA bus are Primecells, and for Linux it is +typical for both amba_device and platform_device instances to be +siblings of the same bus segment. + +When using the DT, this creates problems for of_platform_populate() +because it must decide whether to register each node as either a +platform_device or an amba_device. This unfortunately complicates the +device creation model a little bit, but the solution turns out not to +be too invasive. If a node is compatible with "arm,amba-primecell", then +of_platform_populate() will register it as an amba_device instead of a +platform_device. -- cgit v1.2.3 From 3a53396b0381ec9d5180fd8fe7a681c8ce95fd9a Mon Sep 17 00:00:00 2001 From: ShuoX Liu Date: Wed, 28 Mar 2012 15:19:11 -0700 Subject: cpuidle: add a sysfs entry to disable specific C state for debug purpose. Some C states of new CPU might be not good. One reason is BIOS might configure them incorrectly. To help developers root cause it quickly, the patch adds a new sysfs entry, so developers could disable specific C state manually. In addition, C state might have much impact on performance tuning, as it takes much time to enter/exit C states, which might delay interrupt processing. With the new debug option, developers could check if a deep C state could impact performance and how much impact it could cause. Also add this option in Documentation/cpuidle/sysfs.txt. [akpm@linux-foundation.org: check kstrtol return value] Signed-off-by: ShuoX Liu Reviewed-by: Yanmin Zhang Reviewed-and-Tested-by: Deepthi Dharwar Signed-off-by: Andrew Morton Signed-off-by: Len Brown --- Documentation/cpuidle/sysfs.txt | 5 +++++ drivers/cpuidle/cpuidle.c | 1 + drivers/cpuidle/governors/menu.c | 5 ++++- drivers/cpuidle/sysfs.c | 40 ++++++++++++++++++++++++++++++++++++++++ include/linux/cpuidle.h | 1 + 5 files changed, 51 insertions(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/cpuidle/sysfs.txt b/Documentation/cpuidle/sysfs.txt index 50d7b1642759..9d28a3406e74 100644 --- a/Documentation/cpuidle/sysfs.txt +++ b/Documentation/cpuidle/sysfs.txt @@ -36,6 +36,7 @@ drwxr-xr-x 2 root root 0 Feb 8 10:42 state3 /sys/devices/system/cpu/cpu0/cpuidle/state0: total 0 -r--r--r-- 1 root root 4096 Feb 8 10:42 desc +-rw-r--r-- 1 root root 4096 Feb 8 10:42 disable -r--r--r-- 1 root root 4096 Feb 8 10:42 latency -r--r--r-- 1 root root 4096 Feb 8 10:42 name -r--r--r-- 1 root root 4096 Feb 8 10:42 power @@ -45,6 +46,7 @@ total 0 /sys/devices/system/cpu/cpu0/cpuidle/state1: total 0 -r--r--r-- 1 root root 4096 Feb 8 10:42 desc +-rw-r--r-- 1 root root 4096 Feb 8 10:42 disable -r--r--r-- 1 root root 4096 Feb 8 10:42 latency -r--r--r-- 1 root root 4096 Feb 8 10:42 name -r--r--r-- 1 root root 4096 Feb 8 10:42 power @@ -54,6 +56,7 @@ total 0 /sys/devices/system/cpu/cpu0/cpuidle/state2: total 0 -r--r--r-- 1 root root 4096 Feb 8 10:42 desc +-rw-r--r-- 1 root root 4096 Feb 8 10:42 disable -r--r--r-- 1 root root 4096 Feb 8 10:42 latency -r--r--r-- 1 root root 4096 Feb 8 10:42 name -r--r--r-- 1 root root 4096 Feb 8 10:42 power @@ -63,6 +66,7 @@ total 0 /sys/devices/system/cpu/cpu0/cpuidle/state3: total 0 -r--r--r-- 1 root root 4096 Feb 8 10:42 desc +-rw-r--r-- 1 root root 4096 Feb 8 10:42 disable -r--r--r-- 1 root root 4096 Feb 8 10:42 latency -r--r--r-- 1 root root 4096 Feb 8 10:42 name -r--r--r-- 1 root root 4096 Feb 8 10:42 power @@ -72,6 +76,7 @@ total 0 * desc : Small description about the idle state (string) +* disable : Option to disable this idle state (bool) * latency : Latency to exit out of this idle state (in microseconds) * name : Name of the idle state (string) * power : Power consumed while in this idle state (in milliwatts) diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index 4869b5500234..77304b6b8aef 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -245,6 +245,7 @@ static void poll_idle_init(struct cpuidle_driver *drv) state->power_usage = -1; state->flags = 0; state->enter = poll_idle; + state->disable = 0; } #else static void poll_idle_init(struct cpuidle_driver *drv) {} diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c index ad0952601ae2..5c17ca112fc2 100644 --- a/drivers/cpuidle/governors/menu.c +++ b/drivers/cpuidle/governors/menu.c @@ -280,7 +280,8 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev) * We want to default to C1 (hlt), not to busy polling * unless the timer is happening really really soon. */ - if (data->expected_us > 5) + if (data->expected_us > 5 && + drv->states[CPUIDLE_DRIVER_STATE_START].disable == 0) data->last_state_idx = CPUIDLE_DRIVER_STATE_START; /* @@ -290,6 +291,8 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev) for (i = CPUIDLE_DRIVER_STATE_START; i < drv->state_count; i++) { struct cpuidle_state *s = &drv->states[i]; + if (s->disable) + continue; if (s->target_residency > data->predicted_us) continue; if (s->exit_latency > latency_req) diff --git a/drivers/cpuidle/sysfs.c b/drivers/cpuidle/sysfs.c index 3fe41fe4851a..88032b4dc6d2 100644 --- a/drivers/cpuidle/sysfs.c +++ b/drivers/cpuidle/sysfs.c @@ -11,6 +11,7 @@ #include #include #include +#include #include "cpuidle.h" @@ -222,6 +223,9 @@ struct cpuidle_state_attr { #define define_one_state_ro(_name, show) \ static struct cpuidle_state_attr attr_##_name = __ATTR(_name, 0444, show, NULL) +#define define_one_state_rw(_name, show, store) \ +static struct cpuidle_state_attr attr_##_name = __ATTR(_name, 0644, show, store) + #define define_show_state_function(_name) \ static ssize_t show_state_##_name(struct cpuidle_state *state, \ struct cpuidle_state_usage *state_usage, char *buf) \ @@ -229,6 +233,24 @@ static ssize_t show_state_##_name(struct cpuidle_state *state, \ return sprintf(buf, "%u\n", state->_name);\ } +#define define_store_state_function(_name) \ +static ssize_t store_state_##_name(struct cpuidle_state *state, \ + const char *buf, size_t size) \ +{ \ + long value; \ + int err; \ + if (!capable(CAP_SYS_ADMIN)) \ + return -EPERM; \ + err = kstrtol(buf, 0, &value); \ + if (err) \ + return err; \ + if (value) \ + state->disable = 1; \ + else \ + state->disable = 0; \ + return size; \ +} + #define define_show_state_ull_function(_name) \ static ssize_t show_state_##_name(struct cpuidle_state *state, \ struct cpuidle_state_usage *state_usage, char *buf) \ @@ -251,6 +273,8 @@ define_show_state_ull_function(usage) define_show_state_ull_function(time) define_show_state_str_function(name) define_show_state_str_function(desc) +define_show_state_function(disable) +define_store_state_function(disable) define_one_state_ro(name, show_state_name); define_one_state_ro(desc, show_state_desc); @@ -258,6 +282,7 @@ define_one_state_ro(latency, show_state_exit_latency); define_one_state_ro(power, show_state_power_usage); define_one_state_ro(usage, show_state_usage); define_one_state_ro(time, show_state_time); +define_one_state_rw(disable, show_state_disable, store_state_disable); static struct attribute *cpuidle_state_default_attrs[] = { &attr_name.attr, @@ -266,6 +291,7 @@ static struct attribute *cpuidle_state_default_attrs[] = { &attr_power.attr, &attr_usage.attr, &attr_time.attr, + &attr_disable.attr, NULL }; @@ -287,8 +313,22 @@ static ssize_t cpuidle_state_show(struct kobject * kobj, return ret; } +static ssize_t cpuidle_state_store(struct kobject *kobj, + struct attribute *attr, const char *buf, size_t size) +{ + int ret = -EIO; + struct cpuidle_state *state = kobj_to_state(kobj); + struct cpuidle_state_attr *cattr = attr_to_stateattr(attr); + + if (cattr->store) + ret = cattr->store(state, buf, size); + + return ret; +} + static const struct sysfs_ops cpuidle_state_sysfs_ops = { .show = cpuidle_state_show, + .store = cpuidle_state_store, }; static void cpuidle_state_sysfs_release(struct kobject *kobj) diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h index 927db28a2a4c..ca4e4983773f 100644 --- a/include/linux/cpuidle.h +++ b/include/linux/cpuidle.h @@ -46,6 +46,7 @@ struct cpuidle_state { unsigned int exit_latency; /* in US */ unsigned int power_usage; /* in mW */ unsigned int target_residency; /* in US */ + unsigned int disable; int (*enter) (struct cpuidle_device *dev, struct cpuidle_driver *drv, -- cgit v1.2.3 From f6365201d8a21fb347260f89d6e9b3e718d63c70 Mon Sep 17 00:00:00 2001 From: Len Brown Date: Thu, 29 Mar 2012 14:49:17 -0700 Subject: x86: Remove the ancient and deprecated disable_hlt() and enable_hlt() facility The X86_32-only disable_hlt/enable_hlt mechanism was used by the 32-bit floppy driver. Its effect was to replace the use of the HLT instruction inside default_idle() with cpu_relax() - essentially it turned off the use of HLT. This workaround was commented in the code as: "disable hlt during certain critical i/o operations" "This halt magic was a workaround for ancient floppy DMA wreckage. It should be safe to remove." H. Peter Anvin additionally adds: "To the best of my knowledge, no-hlt only existed because of flaky power distributions on 386/486 systems which were sold to run DOS. Since DOS did no power management of any kind, including HLT, the power draw was fairly uniform; when exposed to the much hhigher noise levels you got when Linux used HLT caused some of these systems to fail. They were by far in the minority even back then." Alan Cox further says: "Also for the Cyrix 5510 which tended to go castors up if a HLT occurred during a DMA cycle and on a few other boxes HLT during DMA tended to go astray. Do we care ? I doubt it. The 5510 was pretty obscure, the 5520 fixed it, the 5530 is probably the oldest still in any kind of use." So, let's finally drop this. Signed-off-by: Len Brown Signed-off-by: Josh Boyer Signed-off-by: Andrew Morton Acked-by: "H. Peter Anvin" Acked-by: Alan Cox Cc: Stephen Hemminger Cc: Link: http://lkml.kernel.org/n/tip-3rhk9bzf0x9rljkv488tloib@git.kernel.org [ If anyone cares then alternative instruction patching could be used to replace HLT with a one-byte NOP instruction. Much simpler. ] Signed-off-by: Ingo Molnar --- Documentation/feature-removal-schedule.txt | 8 ------- arch/x86/include/asm/processor.h | 10 --------- arch/x86/kernel/process.c | 24 -------------------- drivers/block/floppy.c | 36 ------------------------------ 4 files changed, 78 deletions(-) (limited to 'Documentation') diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt index 0cad4803ffac..7c950d48d76e 100644 --- a/Documentation/feature-removal-schedule.txt +++ b/Documentation/feature-removal-schedule.txt @@ -6,14 +6,6 @@ be removed from this file. --------------------------- -What: x86 floppy disable_hlt -When: 2012 -Why: ancient workaround of dubious utility clutters the - code used by everybody else. -Who: Len Brown - ---------------------------- - What: CONFIG_APM_CPU_IDLE, and its ability to call APM BIOS in idle When: 2012 Why: This optional sub-feature of APM is of dubious reliability, diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 7284c9a6a0b5..4fa7dcceb6c0 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -974,16 +974,6 @@ extern bool cpu_has_amd_erratum(const int *); #define cpu_has_amd_erratum(x) (false) #endif /* CONFIG_CPU_SUP_AMD */ -#ifdef CONFIG_X86_32 -/* - * disable hlt during certain critical i/o operations - */ -#define HAVE_DISABLE_HLT -#endif - -void disable_hlt(void); -void enable_hlt(void); - void cpu_idle_wait(void); extern unsigned long arch_align_stack(unsigned long sp); diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index a33afaa5ddb7..1d92a5ab6e8b 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -362,34 +362,10 @@ void (*pm_idle)(void); EXPORT_SYMBOL(pm_idle); #endif -#ifdef CONFIG_X86_32 -/* - * This halt magic was a workaround for ancient floppy DMA - * wreckage. It should be safe to remove. - */ -static int hlt_counter; -void disable_hlt(void) -{ - hlt_counter++; -} -EXPORT_SYMBOL(disable_hlt); - -void enable_hlt(void) -{ - hlt_counter--; -} -EXPORT_SYMBOL(enable_hlt); - -static inline int hlt_use_halt(void) -{ - return (!hlt_counter && boot_cpu_data.hlt_works_ok); -} -#else static inline int hlt_use_halt(void) { return 1; } -#endif #ifndef CONFIG_SMP static inline void play_dead(void) diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c index 76a08236430a..b0b00d70c166 100644 --- a/drivers/block/floppy.c +++ b/drivers/block/floppy.c @@ -1030,37 +1030,6 @@ static int fd_wait_for_completion(unsigned long delay, timeout_fn function) return 0; } -static DEFINE_SPINLOCK(floppy_hlt_lock); -static int hlt_disabled; -static void floppy_disable_hlt(void) -{ - unsigned long flags; - - WARN_ONCE(1, "floppy_disable_hlt() scheduled for removal in 2012"); - spin_lock_irqsave(&floppy_hlt_lock, flags); - if (!hlt_disabled) { - hlt_disabled = 1; -#ifdef HAVE_DISABLE_HLT - disable_hlt(); -#endif - } - spin_unlock_irqrestore(&floppy_hlt_lock, flags); -} - -static void floppy_enable_hlt(void) -{ - unsigned long flags; - - spin_lock_irqsave(&floppy_hlt_lock, flags); - if (hlt_disabled) { - hlt_disabled = 0; -#ifdef HAVE_DISABLE_HLT - enable_hlt(); -#endif - } - spin_unlock_irqrestore(&floppy_hlt_lock, flags); -} - static void setup_DMA(void) { unsigned long f; @@ -1105,7 +1074,6 @@ static void setup_DMA(void) fd_enable_dma(); release_dma_lock(f); #endif - floppy_disable_hlt(); } static void show_floppy(void); @@ -1707,7 +1675,6 @@ irqreturn_t floppy_interrupt(int irq, void *dev_id) fd_disable_dma(); release_dma_lock(f); - floppy_enable_hlt(); do_floppy = NULL; if (fdc >= N_FDC || FDCS->address == -1) { /* we don't even know which FDC is the culprit */ @@ -1856,8 +1823,6 @@ static void floppy_shutdown(unsigned long data) show_floppy(); cancel_activity(); - floppy_enable_hlt(); - flags = claim_dma_lock(); fd_disable_dma(); release_dma_lock(flags); @@ -4508,7 +4473,6 @@ static void floppy_release_irq_and_dma(void) #if N_FDC > 1 set_dor(1, ~8, 0); #endif - floppy_enable_hlt(); if (floppy_track_buffer && max_buffer_sectors) { tmpsize = max_buffer_sectors * 1024; -- cgit v1.2.3 From 6ef19ab7fa1535d35006535dba6c407dad2d845c Mon Sep 17 00:00:00 2001 From: Chen Gong Date: Thu, 15 Mar 2012 16:53:37 +0800 Subject: Update documentation for parameter *notrigger* in einj.txt Add description of parameter notrigger in the einj.txt. One can utilize this new parameter to do some SRAR injection test. Pay attention, the operation is highly depended on the BIOS implementation. If no proper BIOS supports it, even if enabling this parameter, expected result will not happen. v2: Update the documentation suggested by Tony Suggested-by: Tony Luck Signed-off-by: Chen Gong Signed-off-by: Len Brown --- Documentation/acpi/apei/einj.txt | 8 ++++++++ 1 file changed, 8 insertions(+) (limited to 'Documentation') diff --git a/Documentation/acpi/apei/einj.txt b/Documentation/acpi/apei/einj.txt index e7cc36397217..e20b6daaced4 100644 --- a/Documentation/acpi/apei/einj.txt +++ b/Documentation/acpi/apei/einj.txt @@ -53,6 +53,14 @@ directory apei/einj. The following files are provided. This file is used to set the second error parameter value. Effect of parameter depends on error_type specified. +- notrigger + The EINJ mechanism is a two step process. First inject the error, then + perform some actions to trigger it. Setting "notrigger" to 1 skips the + trigger phase, which *may* allow the user to cause the error in some other + context by a simple access to the cpu, memory location, or device that is + the target of the error injection. Whether this actually works depends + on what operations the BIOS actually includes in the trigger phase. + BIOS versions based in the ACPI 4.0 specification have limited options to control where the errors are injected. Your BIOS may support an extension (enabled with the param_extension=1 module parameter, or -- cgit v1.2.3 From d1ff4b1cdbabb9ab9813f3d6e1cbec42cc5d6ed8 Mon Sep 17 00:00:00 2001 From: Matthew Garrett Date: Tue, 31 Jan 2012 13:19:20 -0500 Subject: ACPI: Add support for exposing BGRT data ACPI 5.0 adds the BGRT, a table that contains a pointer to the firmware boot splash and associated metadata. This simple driver exposes it via /sys/firmware/acpi in order to allow bootsplash applications to draw their splash around the firmware image and reduce the number of jarring graphical transitions during boot. Signed-off-by: Matthew Garrett Signed-off-by: Len Brown --- Documentation/ABI/testing/sysfs-firmware-acpi | 20 +++ drivers/acpi/Kconfig | 9 ++ drivers/acpi/Makefile | 1 + drivers/acpi/bgrt.c | 175 ++++++++++++++++++++++++++ 4 files changed, 205 insertions(+) create mode 100644 drivers/acpi/bgrt.c (limited to 'Documentation') diff --git a/Documentation/ABI/testing/sysfs-firmware-acpi b/Documentation/ABI/testing/sysfs-firmware-acpi index 4f9ba3c2fca7..dd930c8db41f 100644 --- a/Documentation/ABI/testing/sysfs-firmware-acpi +++ b/Documentation/ABI/testing/sysfs-firmware-acpi @@ -1,3 +1,23 @@ +What: /sys/firmware/acpi/bgrt/ +Date: January 2012 +Contact: Matthew Garrett +Description: + The BGRT is an ACPI 5.0 feature that allows the OS + to obtain a copy of the firmware boot splash and + some associated metadata. This is intended to be used + by boot splash applications in order to interact with + the firmware boot splash in order to avoid jarring + transitions. + + image: The image bitmap. Currently a 32-bit BMP. + status: 1 if the image is valid, 0 if firmware invalidated it. + type: 0 indicates image is in BMP format. + version: The version of the BGRT. Currently 1. + xoffset: The number of pixels between the left of the screen + and the left edge of the image. + yoffset: The number of pixels between the top of the screen + and the top edge of the image. + What: /sys/firmware/acpi/interrupts/ Date: February 2008 Contact: Len Brown diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig index 7556913aba45..47768ff87343 100644 --- a/drivers/acpi/Kconfig +++ b/drivers/acpi/Kconfig @@ -384,6 +384,15 @@ config ACPI_CUSTOM_METHOD load additional kernel modules after boot, this feature may be used to override that restriction). +config ACPI_BGRT + tristate "Boottime Graphics Resource Table support" + default n + help + This driver adds support for exposing the ACPI Boottime Graphics + Resource Table, which allows the operating system to obtain + data from the firmware boot splash. It will appear under + /sys/firmware/acpi/bgrt/ . + source "drivers/acpi/apei/Kconfig" endif # ACPI diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile index 1567028d2038..47199e2a9130 100644 --- a/drivers/acpi/Makefile +++ b/drivers/acpi/Makefile @@ -62,6 +62,7 @@ obj-$(CONFIG_ACPI_SBS) += sbs.o obj-$(CONFIG_ACPI_HED) += hed.o obj-$(CONFIG_ACPI_EC_DEBUGFS) += ec_sys.o obj-$(CONFIG_ACPI_CUSTOM_METHOD)+= custom_method.o +obj-$(CONFIG_ACPI_BGRT) += bgrt.o # processor has its own "processor." module_param namespace processor-y := processor_driver.o processor_throttling.o diff --git a/drivers/acpi/bgrt.c b/drivers/acpi/bgrt.c new file mode 100644 index 000000000000..8cf6c46e99fb --- /dev/null +++ b/drivers/acpi/bgrt.c @@ -0,0 +1,175 @@ +/* + * Copyright 2012 Red Hat, Inc + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#include +#include +#include +#include +#include +#include +#include + +static struct acpi_table_bgrt *bgrt_tab; +static struct kobject *bgrt_kobj; + +struct bmp_header { + u16 id; + u32 size; +} __attribute ((packed)); + +static struct bmp_header bmp_header; + +static ssize_t show_version(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return snprintf(buf, PAGE_SIZE, "%d\n", bgrt_tab->version); +} +static DEVICE_ATTR(version, S_IRUGO, show_version, NULL); + +static ssize_t show_status(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return snprintf(buf, PAGE_SIZE, "%d\n", bgrt_tab->status); +} +static DEVICE_ATTR(status, S_IRUGO, show_status, NULL); + +static ssize_t show_type(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return snprintf(buf, PAGE_SIZE, "%d\n", bgrt_tab->image_type); +} +static DEVICE_ATTR(type, S_IRUGO, show_type, NULL); + +static ssize_t show_xoffset(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return snprintf(buf, PAGE_SIZE, "%d\n", bgrt_tab->image_offset_x); +} +static DEVICE_ATTR(xoffset, S_IRUGO, show_xoffset, NULL); + +static ssize_t show_yoffset(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return snprintf(buf, PAGE_SIZE, "%d\n", bgrt_tab->image_offset_y); +} +static DEVICE_ATTR(yoffset, S_IRUGO, show_yoffset, NULL); + +static ssize_t show_image(struct file *file, struct kobject *kobj, + struct bin_attribute *attr, char *buf, loff_t off, size_t count) +{ + int size = attr->size; + void __iomem *image = attr->private; + + if (off >= size) { + count = 0; + } else { + if (off + count > size) + count = size - off; + + memcpy_fromio(buf, image+off, count); + } + + return count; +} + +static struct bin_attribute image_attr = { + .attr = { + .name = "image", + .mode = S_IRUGO, + }, + .read = show_image, +}; + +static struct attribute *bgrt_attributes[] = { + &dev_attr_version.attr, + &dev_attr_status.attr, + &dev_attr_type.attr, + &dev_attr_xoffset.attr, + &dev_attr_yoffset.attr, + NULL, +}; + +static struct attribute_group bgrt_attribute_group = { + .attrs = bgrt_attributes, +}; + +static int __init bgrt_init(void) +{ + acpi_status status; + int ret; + void __iomem *bgrt; + + if (acpi_disabled) + return -ENODEV; + + status = acpi_get_table("BGRT", 0, + (struct acpi_table_header **)&bgrt_tab); + + if (ACPI_FAILURE(status)) + return -ENODEV; + + sysfs_bin_attr_init(&image_attr); + + bgrt = ioremap(bgrt_tab->image_address, sizeof(struct bmp_header)); + + if (!bgrt) { + ret = -EINVAL; + goto out_err; + } + + memcpy_fromio(&bmp_header, bgrt, sizeof(bmp_header)); + image_attr.size = bmp_header.size; + iounmap(bgrt); + + image_attr.private = ioremap(bgrt_tab->image_address, image_attr.size); + + if (!image_attr.private) { + ret = -EINVAL; + goto out_err; + } + + + bgrt_kobj = kobject_create_and_add("bgrt", acpi_kobj); + if (!bgrt_kobj) { + ret = -EINVAL; + goto out_iounmap; + } + + ret = sysfs_create_group(bgrt_kobj, &bgrt_attribute_group); + if (ret) + goto out_kobject; + + ret = sysfs_create_bin_file(bgrt_kobj, &image_attr); + if (ret) + goto out_group; + + return 0; + +out_group: + sysfs_remove_group(bgrt_kobj, &bgrt_attribute_group); +out_kobject: + kobject_put(bgrt_kobj); +out_iounmap: + iounmap(image_attr.private); +out_err: + return ret; +} + +static void __exit bgrt_exit(void) +{ + iounmap(image_attr.private); + sysfs_remove_group(bgrt_kobj, &bgrt_attribute_group); + sysfs_remove_bin_file(bgrt_kobj, &image_attr); +} + +module_init(bgrt_init); +module_exit(bgrt_exit); + +MODULE_AUTHOR("Matthew Garrett"); +MODULE_DESCRIPTION("BGRT boot graphic support"); +MODULE_LICENSE("GPL"); -- cgit v1.2.3 From f52a759327b8b30fbf66da2d75961d2dbc7c6cfa Mon Sep 17 00:00:00 2001 From: H Hartley Sweeten Date: Fri, 30 Mar 2012 13:36:58 -0700 Subject: Documentation: remove 'mach' from dontdiff file The mach entry in the dontdiff file causes all the arch/arm/mach-*/include/mach directories to be skipped. Signed-off-by: H Hartley Sweeten Signed-off-by: Randy Dunlap Signed-off-by: Linus Torvalds --- Documentation/dontdiff | 1 - 1 file changed, 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/dontdiff b/Documentation/dontdiff index 0c083c5c2faa..b4a898f43c37 100644 --- a/Documentation/dontdiff +++ b/Documentation/dontdiff @@ -158,7 +158,6 @@ logo_*.c logo_*_clut224.c logo_*_mono.c lxdialog -mach mach-types mach-types.h machtypes.h -- cgit v1.2.3 From 673d29f9a00b0c63ad3f4e2136eb4e181281fbfd Mon Sep 17 00:00:00 2001 From: Javi Merino Date: Fri, 30 Mar 2012 13:37:02 -0700 Subject: Documentation: mention scripts/diffconfig tool The kconfig documentation suggests using plain 'diff' to compare config files and then adds "Yes, we need something better here". Commit a717417e7f96 ("kconfig: add diffconfig utility") added what that comment was looking for. Signed-off-by: Javi Merino Cc: Michal Marek Signed-off-by: Linus Torvalds --- Documentation/kbuild/kconfig.txt | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) (limited to 'Documentation') diff --git a/Documentation/kbuild/kconfig.txt b/Documentation/kbuild/kconfig.txt index c313d71324b4..9d5f2a90dca9 100644 --- a/Documentation/kbuild/kconfig.txt +++ b/Documentation/kbuild/kconfig.txt @@ -28,12 +28,10 @@ new (default) values, so you can use: grep "(NEW)" conf.new -to see the new config symbols or you can 'diff' the previous and -new .config files to see the differences: +to see the new config symbols or you can use diffconfig to see the +differences between the previous and new .config files: - diff .config.old .config | less - -(Yes, we need something better here.) + scripts/diffconfig .config.old .config | less ______________________________________________________________________ Environment variables for '*config' -- cgit v1.2.3 From 21106b0114f75cff199d6834f676877c3edae928 Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 30 Mar 2012 13:37:06 -0700 Subject: Documentation: sysrq: Crutcher Dunnavant is unavailable MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Reported-by: KÅ™iÅ¡tof Želechovski Signed-off-by: Randy Dunlap Signed-off-by: Linus Torvalds --- Documentation/sysrq.txt | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/sysrq.txt b/Documentation/sysrq.txt index 312e3754e8c5..642f84495b29 100644 --- a/Documentation/sysrq.txt +++ b/Documentation/sysrq.txt @@ -241,9 +241,8 @@ command you are interested in. * I have more questions, who can I ask? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -And I'll answer any questions about the registration system you got, also -responding as soon as possible. - -Crutcher +Just ask them on the linux-kernel mailing list: + linux-kernel@vger.kernel.org * Credits ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -- cgit v1.2.3 From 9a7c48b7c3d58835b3a91d86c55e0ae77d15ddd5 Mon Sep 17 00:00:00 2001 From: Josh Triplett Date: Fri, 30 Mar 2012 13:37:10 -0700 Subject: Documentation: CodingStyle: add inline assembly guidelines Signed-off-by: Josh Triplett Signed-off-by: Randy Dunlap Signed-off-by: Linus Torvalds --- Documentation/CodingStyle | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) (limited to 'Documentation') diff --git a/Documentation/CodingStyle b/Documentation/CodingStyle index 2b90d328b3ba..c58b236bbe04 100644 --- a/Documentation/CodingStyle +++ b/Documentation/CodingStyle @@ -793,6 +793,35 @@ own custom mode, or may have some other magic method for making indentation work correctly. + Chapter 19: Inline assembly + +In architecture-specific code, you may need to use inline assembly to interface +with CPU or platform functionality. Don't hesitate to do so when necessary. +However, don't use inline assembly gratuitously when C can do the job. You can +and should poke hardware from C when possible. + +Consider writing simple helper functions that wrap common bits of inline +assembly, rather than repeatedly writing them with slight variations. Remember +that inline assembly can use C parameters. + +Large, non-trivial assembly functions should go in .S files, with corresponding +C prototypes defined in C header files. The C prototypes for assembly +functions should use "asmlinkage". + +You may need to mark your asm statement as volatile, to prevent GCC from +removing it if GCC doesn't notice any side effects. You don't always need to +do so, though, and doing so unnecessarily can limit optimization. + +When writing a single inline assembly statement containing multiple +instructions, put each instruction on a separate line in a separate quoted +string, and end each string except the last with \n\t to properly indent the +next instruction in the assembly output: + + asm ("magic %reg1, #42\n\t" + "more_magic %reg2, %reg3" + : /* outputs */ : /* inputs */ : /* clobbers */); + + Appendix I: References -- cgit v1.2.3 From 096015236df46c64be8b86e41fd4e28522e5f7e5 Mon Sep 17 00:00:00 2001 From: Randy Dunlap Date: Fri, 30 Mar 2012 13:37:13 -0700 Subject: Documentation: input.txt: clarify mousedev 'cat' command syntax Clarify that the 'cat' command does not include the (c, 13, 32) after it. Reported-by: Dan Jidanni Jacobson Signed-off-by: Randy Dunlap Cc: Dmitry Torokhov Signed-off-by: Linus Torvalds --- Documentation/input/input.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/input/input.txt b/Documentation/input/input.txt index b3d6787b4fb1..666c06c5ab0c 100644 --- a/Documentation/input/input.txt +++ b/Documentation/input/input.txt @@ -250,8 +250,8 @@ And so on up to event31. a USB keyboard works and is correctly connected to the kernel keyboard driver. - Doing a cat /dev/input/mouse0 (c, 13, 32) will verify that a mouse -is also emulated, characters should appear if you move it. + Doing a "cat /dev/input/mouse0" (c, 13, 32) will verify that a mouse +is also emulated; characters should appear if you move it. You can test the joystick emulation with the 'jstest' utility, available in the joystick package (see Documentation/input/joystick.txt). -- cgit v1.2.3 From 970e2486492aa1eb47a436a5a4c81e92558986a9 Mon Sep 17 00:00:00 2001 From: Lucas De Marchi Date: Fri, 30 Mar 2012 13:37:16 -0700 Subject: Documentation: remove references to /etc/modprobe.conf Usage of /etc/modprobe.conf file was deprecated by module-init-tools and is no longer parsed by new kmod tool. References to this file are replaced in Documentation, comments and Kconfig according to the context. There are also some references to the old /etc/modules.conf from 2.4 kernels that are being removed. Signed-off-by: Lucas De Marchi Acked-by: Takashi Iwai Acked-by: Mauro Carvalho Chehab Signed-off-by: Randy Dunlap Signed-off-by: Linus Torvalds --- Documentation/aoe/aoe.txt | 2 +- Documentation/aoe/autoload.sh | 4 +-- Documentation/blockdev/floppy.txt | 2 +- Documentation/fb/intel810.txt | 2 +- Documentation/fb/intelfb.txt | 2 +- Documentation/i2c/busses/scx200_acb | 2 +- Documentation/ide/ide.txt | 2 +- Documentation/isdn/README.gigaset | 16 ++++----- Documentation/laptops/sonypi.txt | 2 +- Documentation/mono.txt | 8 ++--- Documentation/networking/baycom.txt | 2 +- Documentation/networking/bonding.txt | 43 +++++++++++-------------- Documentation/networking/dl2k.txt | 11 ++++--- Documentation/networking/e100.txt | 6 ++-- Documentation/networking/ipv6.txt | 6 ++-- Documentation/networking/ixgb.txt | 6 ++-- Documentation/networking/ltpc.txt | 2 +- Documentation/networking/vortex.txt | 6 ++-- Documentation/parport.txt | 13 ++++---- Documentation/s390/3270.txt | 21 ++++++------ Documentation/scsi/aic79xx.txt | 2 +- Documentation/scsi/aic7xxx.txt | 2 +- Documentation/scsi/osst.txt | 2 +- Documentation/serial/computone.txt | 8 ++--- Documentation/serial/rocket.txt | 2 +- Documentation/serial/stallion.txt | 4 +-- Documentation/sound/alsa/ALSA-Configuration.txt | 10 +++--- Documentation/sound/alsa/Audiophile-Usb.txt | 4 +-- Documentation/sound/alsa/MIXART.txt | 6 ++-- Documentation/sound/alsa/OSS-Emulation.txt | 2 +- Documentation/sound/oss/AudioExcelDSP16 | 6 ++-- Documentation/sound/oss/CMI8330 | 5 ++- Documentation/sound/oss/Introduction | 10 +++--- Documentation/sound/oss/Opti | 8 ++--- Documentation/sound/oss/PAS16 | 4 +-- Documentation/sound/oss/README.modules | 8 ++--- Documentation/usb/power-management.txt | 3 +- Documentation/video4linux/CQcam.txt | 14 ++------ Documentation/video4linux/Zoran | 2 +- Documentation/video4linux/bttv/Modules.conf | 2 +- Documentation/video4linux/meye.txt | 2 +- drivers/net/wan/Kconfig | 4 +-- drivers/scsi/aic7xxx/aic79xx_osm.c | 8 ++--- drivers/scsi/aic7xxx/aic7xxx_osm.c | 8 ++--- drivers/staging/asus_oled/README | 2 +- drivers/tty/isicom.c | 2 +- drivers/usb/serial/ftdi_sio.c | 3 +- drivers/usb/storage/Kconfig | 2 +- sound/core/seq/seq_dummy.c | 2 +- sound/drivers/Kconfig | 3 +- 50 files changed, 138 insertions(+), 160 deletions(-) (limited to 'Documentation') diff --git a/Documentation/aoe/aoe.txt b/Documentation/aoe/aoe.txt index b5aada9f20cc..5f5aa16047ff 100644 --- a/Documentation/aoe/aoe.txt +++ b/Documentation/aoe/aoe.txt @@ -35,7 +35,7 @@ CREATING DEVICE NODES sh Documentation/aoe/mkshelf.sh /dev/etherd 0 There is also an autoload script that shows how to edit - /etc/modprobe.conf to ensure that the aoe module is loaded when + /etc/modprobe.d/aoe.conf to ensure that the aoe module is loaded when necessary. USING DEVICE NODES diff --git a/Documentation/aoe/autoload.sh b/Documentation/aoe/autoload.sh index 78dad1334c6f..815dff4691c9 100644 --- a/Documentation/aoe/autoload.sh +++ b/Documentation/aoe/autoload.sh @@ -1,8 +1,8 @@ #!/bin/sh # set aoe to autoload by installing the -# aliases in /etc/modprobe.conf +# aliases in /etc/modprobe.d/ -f=/etc/modprobe.conf +f=/etc/modprobe.d/aoe.conf if test ! -r $f || test ! -w $f; then echo "cannot configure $f for module autoloading" 1>&2 diff --git a/Documentation/blockdev/floppy.txt b/Documentation/blockdev/floppy.txt index 6ccab88705cb..470fe4b5e379 100644 --- a/Documentation/blockdev/floppy.txt +++ b/Documentation/blockdev/floppy.txt @@ -49,7 +49,7 @@ you can put: options floppy omnibook messages -in /etc/modprobe.conf. +in a configuration file in /etc/modprobe.d/. The floppy driver related options are: diff --git a/Documentation/fb/intel810.txt b/Documentation/fb/intel810.txt index be3e7836abef..a8e9f5bca6f3 100644 --- a/Documentation/fb/intel810.txt +++ b/Documentation/fb/intel810.txt @@ -211,7 +211,7 @@ Using the same setup as described above, load the module like this: modprobe i810fb vram=2 xres=1024 bpp=8 hsync1=30 hsync2=55 vsync1=50 \ vsync2=85 accel=1 mtrr=1 -Or just add the following to /etc/modprobe.conf +Or just add the following to a configuration file in /etc/modprobe.d/ options i810fb vram=2 xres=1024 bpp=16 hsync1=30 hsync2=55 vsync1=50 \ vsync2=85 accel=1 mtrr=1 diff --git a/Documentation/fb/intelfb.txt b/Documentation/fb/intelfb.txt index dd9e944ea628..feac4e4d6968 100644 --- a/Documentation/fb/intelfb.txt +++ b/Documentation/fb/intelfb.txt @@ -120,7 +120,7 @@ Using the same setup as described above, load the module like this: modprobe intelfb mode=800x600-32@75 vram=8 accel=1 hwcursor=1 -Or just add the following to /etc/modprobe.conf +Or just add the following to a configuration file in /etc/modprobe.d/ options intelfb mode=800x600-32@75 vram=8 accel=1 hwcursor=1 diff --git a/Documentation/i2c/busses/scx200_acb b/Documentation/i2c/busses/scx200_acb index 7c07883d4dfc..ce83c871fe95 100644 --- a/Documentation/i2c/busses/scx200_acb +++ b/Documentation/i2c/busses/scx200_acb @@ -28,5 +28,5 @@ If the scx200_acb driver is built into the kernel, add the following parameter to your boot command line: scx200_acb.base=0x810,0x820 If the scx200_acb driver is built as a module, add the following line to -the file /etc/modprobe.conf instead: +a configuration file in /etc/modprobe.d/ instead: options scx200_acb base=0x810,0x820 diff --git a/Documentation/ide/ide.txt b/Documentation/ide/ide.txt index e77bebfa7b0d..7aca987c23d9 100644 --- a/Documentation/ide/ide.txt +++ b/Documentation/ide/ide.txt @@ -169,7 +169,7 @@ When using ide.c as a module in combination with kmod, add: alias block-major-3 ide-probe -to /etc/modprobe.conf. +to a configuration file in /etc/modprobe.d/. When ide.c is used as a module, you can pass command line parameters to the driver using the "options=" keyword to insmod, while replacing any ',' with diff --git a/Documentation/isdn/README.gigaset b/Documentation/isdn/README.gigaset index ef3343eaa002..7534c6039adc 100644 --- a/Documentation/isdn/README.gigaset +++ b/Documentation/isdn/README.gigaset @@ -97,8 +97,7 @@ GigaSet 307x Device Driver 2.5.): 1=on (default), 0=off Depending on your distribution you may want to create a separate module - configuration file /etc/modprobe.d/gigaset for these, or add them to a - custom file like /etc/modprobe.conf.local. + configuration file like /etc/modprobe.d/gigaset.conf for these. 2.2. Device nodes for user space programs ------------------------------------ @@ -212,8 +211,8 @@ GigaSet 307x Device Driver options ppp_async flag_time=0 - to an appropriate module configuration file, like /etc/modprobe.d/gigaset - or /etc/modprobe.conf.local. + to an appropriate module configuration file, like + /etc/modprobe.d/gigaset.conf. Unimodem mode is needed for making some devices [e.g. SX100] work which do not support the regular Gigaset command set. If debug output (see @@ -237,8 +236,8 @@ GigaSet 307x Device Driver modprobe usb_gigaset startmode=0 or by adding a line like options usb_gigaset startmode=0 - to an appropriate module configuration file, like /etc/modprobe.d/gigaset - or /etc/modprobe.conf.local. + to an appropriate module configuration file, like + /etc/modprobe.d/gigaset.conf 2.6. Call-ID (CID) mode ------------------ @@ -310,7 +309,7 @@ GigaSet 307x Device Driver options isdn dialtimeout=15 - to /etc/modprobe.d/gigaset, /etc/modprobe.conf.local or a similar file. + to /etc/modprobe.d/gigaset.conf or a similar file. Problem: The isdnlog program emits error messages or just doesn't work. @@ -350,8 +349,7 @@ GigaSet 307x Device Driver The initial value can be set using the debug parameter when loading the module "gigaset", e.g. by adding a line options gigaset debug=0 - to your module configuration file, eg. /etc/modprobe.d/gigaset or - /etc/modprobe.conf.local. + to your module configuration file, eg. /etc/modprobe.d/gigaset.conf Generated debugging information can be found - as output of the command diff --git a/Documentation/laptops/sonypi.txt b/Documentation/laptops/sonypi.txt index 4857acfc50f1..606bdb9ce036 100644 --- a/Documentation/laptops/sonypi.txt +++ b/Documentation/laptops/sonypi.txt @@ -110,7 +110,7 @@ Module use: ----------- In order to automatically load the sonypi module on use, you can put those -lines in your /etc/modprobe.conf file: +lines a configuration file in /etc/modprobe.d/: alias char-major-10-250 sonypi options sonypi minor=250 diff --git a/Documentation/mono.txt b/Documentation/mono.txt index e8e1758e87da..d01ac6052194 100644 --- a/Documentation/mono.txt +++ b/Documentation/mono.txt @@ -38,11 +38,11 @@ if [ ! -e /proc/sys/fs/binfmt_misc/register ]; then /sbin/modprobe binfmt_misc # Some distributions, like Fedora Core, perform # the following command automatically when the - # binfmt_misc module is loaded into the kernel. + # binfmt_misc module is loaded into the kernel + # or during normal boot up (systemd-based systems). # Thus, it is possible that the following line - # is not needed at all. Look at /etc/modprobe.conf - # to check whether this is applicable or not. - mount -t binfmt_misc none /proc/sys/fs/binfmt_misc + # is not needed at all. + mount -t binfmt_misc none /proc/sys/fs/binfmt_misc fi # Register support for .NET CLR binaries diff --git a/Documentation/networking/baycom.txt b/Documentation/networking/baycom.txt index 4e68849d5639..688f18fd4467 100644 --- a/Documentation/networking/baycom.txt +++ b/Documentation/networking/baycom.txt @@ -93,7 +93,7 @@ Every time a driver is inserted into the kernel, it has to know which modems it should access at which ports. This can be done with the setbaycom utility. If you are only using one modem, you can also configure the driver from the insmod command line (or by means of an option line in -/etc/modprobe.conf). +/etc/modprobe.d/*.conf). Examples: modprobe baycom_ser_fdx mode="ser12*" iobase=0x3f8 irq=4 diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt index 080ad26690ae..d5e869814040 100644 --- a/Documentation/networking/bonding.txt +++ b/Documentation/networking/bonding.txt @@ -173,9 +173,8 @@ bonding module at load time, or are specified via sysfs. Module options may be given as command line arguments to the insmod or modprobe command, but are usually specified in either the -/etc/modules.conf or /etc/modprobe.conf configuration file, or in a -distro-specific configuration file (some of which are detailed in the next -section). +/etc/modrobe.d/*.conf configuration files, or in a distro-specific +configuration file (some of which are detailed in the next section). Details on bonding support for sysfs is provided in the "Configuring Bonding Manually via Sysfs" section, below. @@ -1021,7 +1020,7 @@ ifcfg-bondX files. Because the sysconfig scripts supply the bonding module options in the ifcfg-bondX file, it is not necessary to add them to -the system /etc/modules.conf or /etc/modprobe.conf configuration file. +the system /etc/modules.d/*.conf configuration files. 3.2 Configuration with Initscripts Support ------------------------------------------ @@ -1098,15 +1097,13 @@ queried targets, e.g., arp_ip_target=+192.168.1.1 arp_ip_target=+192.168.1.2 is the proper syntax to specify multiple targets. When specifying -options via BONDING_OPTS, it is not necessary to edit /etc/modules.conf or -/etc/modprobe.conf. +options via BONDING_OPTS, it is not necessary to edit /etc/modprobe.d/*.conf. For even older versions of initscripts that do not support -BONDING_OPTS, it is necessary to edit /etc/modules.conf (or -/etc/modprobe.conf, depending upon your distro) to load the bonding module -with your desired options when the bond0 interface is brought up. The -following lines in /etc/modules.conf (or modprobe.conf) will load the -bonding module, and select its options: +BONDING_OPTS, it is necessary to edit /etc/modprobe.d/*.conf, depending upon +your distro) to load the bonding module with your desired options when the +bond0 interface is brought up. The following lines in /etc/modprobe.d/*.conf +will load the bonding module, and select its options: alias bond0 bonding options bond0 mode=balance-alb miimon=100 @@ -1152,7 +1149,7 @@ knowledge of bonding. One such distro is SuSE Linux Enterprise Server version 8. The general method for these systems is to place the bonding -module parameters into /etc/modules.conf or /etc/modprobe.conf (as +module parameters into a config file in /etc/modprobe.d/ (as appropriate for the installed distro), then add modprobe and/or ifenslave commands to the system's global init script. The name of the global init script differs; for sysconfig, it is @@ -1228,7 +1225,7 @@ network initialization scripts. specify a different name for each instance (the module loading system requires that every loaded module, even multiple instances of the same module, have a unique name). This is accomplished by supplying multiple -sets of bonding options in /etc/modprobe.conf, for example: +sets of bonding options in /etc/modprobe.d/*.conf, for example: alias bond0 bonding options bond0 -o bond0 mode=balance-rr miimon=100 @@ -1793,8 +1790,8 @@ route additions may cause trouble. On systems with network configuration scripts that do not associate physical devices directly with network interface names (so that the same physical device always has the same "ethX" name), it may -be necessary to add some special logic to either /etc/modules.conf or -/etc/modprobe.conf (depending upon which is installed on the system). +be necessary to add some special logic to config files in +/etc/modprobe.d/. For example, given a modules.conf containing the following: @@ -1821,20 +1818,16 @@ add above bonding e1000 tg3 bonding is loaded. This command is fully documented in the modules.conf manual page. - On systems utilizing modprobe.conf (or modprobe.conf.local), -an equivalent problem can occur. In this case, the following can be -added to modprobe.conf (or modprobe.conf.local, as appropriate), as -follows (all on one line; it has been split here for clarity): + On systems utilizing modprobe an equivalent problem can occur. +In this case, the following can be added to config files in +/etc/modprobe.d/ as: install bonding /sbin/modprobe tg3; /sbin/modprobe e1000; /sbin/modprobe --ignore-install bonding - This will, when loading the bonding module, rather than -performing the normal action, instead execute the provided command. -This command loads the device drivers in the order needed, then calls -modprobe with --ignore-install to cause the normal action to then take -place. Full documentation on this can be found in the modprobe.conf -and modprobe manual pages. + This will load tg3 and e1000 modules before loading the bonding one. +Full documentation on this can be found in the modprobe.d and modprobe +manual pages. 8.3. Painfully Slow Or No Failed Link Detection By Miimon --------------------------------------------------------- diff --git a/Documentation/networking/dl2k.txt b/Documentation/networking/dl2k.txt index 10e8490fa406..cba74f7a3abc 100644 --- a/Documentation/networking/dl2k.txt +++ b/Documentation/networking/dl2k.txt @@ -45,12 +45,13 @@ Now eth0 should active, you can test it by "ping" or get more information by "ifconfig". If tested ok, continue the next step. 4. cp dl2k.ko /lib/modules/`uname -r`/kernel/drivers/net -5. Add the following line to /etc/modprobe.conf: +5. Add the following line to /etc/modprobe.d/dl2k.conf: alias eth0 dl2k -6. Run "netconfig" or "netconf" to create configuration script ifcfg-eth0 +6. Run depmod to updated module indexes. +7. Run "netconfig" or "netconf" to create configuration script ifcfg-eth0 located at /etc/sysconfig/network-scripts or create it manually. [see - Configuration Script Sample] -7. Driver will automatically load and configure at next boot time. +8. Driver will automatically load and configure at next boot time. Compiling the Driver ==================== @@ -154,8 +155,8 @@ Installing the Driver ----------------- 1. Copy dl2k.o to the network modules directory, typically /lib/modules/2.x.x-xx/net or /lib/modules/2.x.x/kernel/drivers/net. - 2. Locate the boot module configuration file, most commonly modprobe.conf - or modules.conf (for 2.4) in the /etc directory. Add the following lines: + 2. Locate the boot module configuration file, most commonly in the + /etc/modprobe.d/ directory. Add the following lines: alias ethx dl2k options dl2k diff --git a/Documentation/networking/e100.txt b/Documentation/networking/e100.txt index 162f323a7a1f..fcb6c71cdb69 100644 --- a/Documentation/networking/e100.txt +++ b/Documentation/networking/e100.txt @@ -94,8 +94,8 @@ Additional Configurations Configuring a network driver to load properly when the system is started is distribution dependent. Typically, the configuration process involves adding - an alias line to /etc/modules.conf or /etc/modprobe.conf as well as editing - other system startup scripts and/or configuration files. Many popular Linux + an alias line to /etc/modprobe.d/*.conf as well as editing other system + startup scripts and/or configuration files. Many popular Linux distributions ship with tools to make these changes for you. To learn the proper way to configure a network device for your system, refer to your distribution documentation. If during this process you are asked for the @@ -103,7 +103,7 @@ Additional Configurations PRO/100 Family of Adapters is e100. As an example, if you install the e100 driver for two PRO/100 adapters - (eth0 and eth1), add the following to modules.conf or modprobe.conf: + (eth0 and eth1), add the following to a configuraton file in /etc/modprobe.d/ alias eth0 e100 alias eth1 e100 diff --git a/Documentation/networking/ipv6.txt b/Documentation/networking/ipv6.txt index 9fd7e21296c8..6cd74fa55358 100644 --- a/Documentation/networking/ipv6.txt +++ b/Documentation/networking/ipv6.txt @@ -2,9 +2,9 @@ Options for the ipv6 module are supplied as parameters at load time. Module options may be given as command line arguments to the insmod -or modprobe command, but are usually specified in either the -/etc/modules.conf or /etc/modprobe.conf configuration file, or in a -distro-specific configuration file. +or modprobe command, but are usually specified in either +/etc/modules.d/*.conf configuration files, or in a distro-specific +configuration file. The available ipv6 module parameters are listed below. If a parameter is not specified the default value is used. diff --git a/Documentation/networking/ixgb.txt b/Documentation/networking/ixgb.txt index e196f16df313..d75a1f9565bb 100644 --- a/Documentation/networking/ixgb.txt +++ b/Documentation/networking/ixgb.txt @@ -274,9 +274,9 @@ Additional Configurations ------------------------------------------------- Configuring a network driver to load properly when the system is started is distribution dependent. Typically, the configuration process involves adding - an alias line to /etc/modprobe.conf as well as editing other system startup - scripts and/or configuration files. Many popular Linux distributions ship - with tools to make these changes for you. To learn the proper way to + an alias line to files in /etc/modprobe.d/ as well as editing other system + startup scripts and/or configuration files. Many popular Linux distributions + ship with tools to make these changes for you. To learn the proper way to configure a network device for your system, refer to your distribution documentation. If during this process you are asked for the driver or module name, the name for the Linux Base Driver for the Intel 10GbE Family of diff --git a/Documentation/networking/ltpc.txt b/Documentation/networking/ltpc.txt index fe2a9129d959..0bf3220c715b 100644 --- a/Documentation/networking/ltpc.txt +++ b/Documentation/networking/ltpc.txt @@ -25,7 +25,7 @@ the driver will try to determine them itself. If you load the driver as a module, you can pass the parameters "io=", "irq=", and "dma=" on the command line with insmod or modprobe, or add -them as options in /etc/modprobe.conf: +them as options in a configuration file in /etc/modprobe.d/ directory: alias lt0 ltpc # autoload the module when the interface is configured options ltpc io=0x240 irq=9 dma=1 diff --git a/Documentation/networking/vortex.txt b/Documentation/networking/vortex.txt index bd70976b8160..b4038ffb3bc5 100644 --- a/Documentation/networking/vortex.txt +++ b/Documentation/networking/vortex.txt @@ -67,8 +67,8 @@ Module parameters ================= There are several parameters which may be provided to the driver when -its module is loaded. These are usually placed in /etc/modprobe.conf -(/etc/modules.conf in 2.4). Example: +its module is loaded. These are usually placed in /etc/modprobe.d/*.conf +configuretion files. Example: options 3c59x debug=3 rx_copybreak=300 @@ -425,7 +425,7 @@ steps you should take: 1) Increase the debug level. Usually this is done via: a) modprobe driver debug=7 - b) In /etc/modprobe.conf (or /etc/modules.conf for 2.4): + b) In /etc/modprobe.d/driver.conf: options driver debug=7 2) Recreate the problem with the higher debug level, diff --git a/Documentation/parport.txt b/Documentation/parport.txt index 93a7ceef398d..c208e4366c03 100644 --- a/Documentation/parport.txt +++ b/Documentation/parport.txt @@ -36,18 +36,17 @@ addresses should not be specified for supported PCI cards since they are automatically detected. -KMod ----- +modprobe +-------- -If you use kmod, you will find it useful to edit /etc/modprobe.conf. -Here is an example of the lines that need to be added: +If you use modprobe , you will find it useful to add lines as below to a +configuration file in /etc/modprobe.d/ directory:. alias parport_lowlevel parport_pc options parport_pc io=0x378,0x278 irq=7,auto -KMod will then automatically load parport_pc (with the options -"io=0x378,0x278 irq=7,auto") whenever a parallel port device driver -(such as lp) is loaded. +modprobe will load parport_pc (with the options "io=0x378,0x278 irq=7,auto") +whenever a parallel port device driver (such as lp) is loaded. Note that these are example lines only! You shouldn't in general need to specify any options to parport_pc in order to be able to use a diff --git a/Documentation/s390/3270.txt b/Documentation/s390/3270.txt index 7a5c73a7ed7f..7c715de99774 100644 --- a/Documentation/s390/3270.txt +++ b/Documentation/s390/3270.txt @@ -47,9 +47,9 @@ including the console 3270, changes subchannel identifier relative to one another. ReIPL as soon as possible after running the configuration script and the resulting /tmp/mkdev3270. -If you have chosen to make tub3270 a module, you add a line to -/etc/modprobe.conf. If you are working on a VM virtual machine, you -can use DEF GRAF to define virtual 3270 devices. +If you have chosen to make tub3270 a module, you add a line to a +configuration file under /etc/modprobe.d/. If you are working on a VM +virtual machine, you can use DEF GRAF to define virtual 3270 devices. You may generate both 3270 and 3215 console support, or one or the other, or neither. If you generate both, the console type under VM is @@ -60,7 +60,7 @@ at boot time to a 3270 if it is a 3215. In brief, these are the steps: 1. Install the tub3270 patch - 2. (If a module) add a line to /etc/modprobe.conf + 2. (If a module) add a line to a file in /etc/modprobe.d/*.conf 3. (If VM) define devices with DEF GRAF 4. Reboot 5. Configure @@ -84,13 +84,12 @@ Here are the installation steps in detail: make modules_install 2. (Perform this step only if you have configured tub3270 as a - module.) Add a line to /etc/modprobe.conf to automatically - load the driver when it's needed. With this line added, - you will see login prompts appear on your 3270s as soon as - boot is complete (or with emulated 3270s, as soon as you dial - into your vm guest using the command "DIAL "). - Since the line-mode major number is 227, the line to add to - /etc/modprobe.conf should be: + module.) Add a line to a file /etc/modprobe.d/*.conf to automatically + load the driver when it's needed. With this line added, you will see + login prompts appear on your 3270s as soon as boot is complete (or + with emulated 3270s, as soon as you dial into your vm guest using the + command "DIAL "). Since the line-mode major number is + 227, the line to add should be: alias char-major-227 tub3270 3. Define graphic devices to your vm guest machine, if you diff --git a/Documentation/scsi/aic79xx.txt b/Documentation/scsi/aic79xx.txt index 64ac7093c872..e2d3273000d4 100644 --- a/Documentation/scsi/aic79xx.txt +++ b/Documentation/scsi/aic79xx.txt @@ -215,7 +215,7 @@ The following information is available in this file: INCORRECTLY CAN RENDER YOUR SYSTEM INOPERABLE. USE THEM WITH CAUTION. - Edit the file "modprobe.conf" in the directory /etc and add/edit a + Put a .conf file in the /etc/modprobe.d/ directory and add/edit a line containing 'options aic79xx aic79xx=[command[,command...]]' where 'command' is one or more of the following: ----------------------------------------------------------------- diff --git a/Documentation/scsi/aic7xxx.txt b/Documentation/scsi/aic7xxx.txt index 18f8d1905e6a..7c5d0223d444 100644 --- a/Documentation/scsi/aic7xxx.txt +++ b/Documentation/scsi/aic7xxx.txt @@ -190,7 +190,7 @@ The following information is available in this file: INCORRECTLY CAN RENDER YOUR SYSTEM INOPERABLE. USE THEM WITH CAUTION. - Edit the file "modprobe.conf" in the directory /etc and add/edit a + Put a .conf file in the /etc/modprobe.d directory and add/edit a line containing 'options aic7xxx aic7xxx=[command[,command...]]' where 'command' is one or more of the following: ----------------------------------------------------------------- diff --git a/Documentation/scsi/osst.txt b/Documentation/scsi/osst.txt index ad86c6d1e898..00c8ebb2fd18 100644 --- a/Documentation/scsi/osst.txt +++ b/Documentation/scsi/osst.txt @@ -66,7 +66,7 @@ recognized. If you want to have the module autoloaded on access to /dev/osst, you may add something like alias char-major-206 osst -to your /etc/modprobe.conf (before 2.6: modules.conf). +to a file under /etc/modprobe.d/ directory. You may find it convenient to create a symbolic link ln -s nosst0 /dev/tape diff --git a/Documentation/serial/computone.txt b/Documentation/serial/computone.txt index 39ddcdbeeb85..a6a1158ea2ba 100644 --- a/Documentation/serial/computone.txt +++ b/Documentation/serial/computone.txt @@ -49,7 +49,7 @@ Hardware - If you have an ISA card, find a free interrupt and io port. Note the hardware address from the Computone ISA cards installed into the system. These are required for editing ip2.c or editing - /etc/modprobe.conf, or for specification on the modprobe + /etc/modprobe.d/*.conf, or for specification on the modprobe command line. Note that the /etc/modules.conf should be used for older (pre-2.6) @@ -66,7 +66,7 @@ b) Run "make config" or "make menuconfig" or "make xconfig" c) Set address on ISA cards then: edit /usr/src/linux/drivers/char/ip2.c if needed or - edit /etc/modprobe.conf if needed (module). + edit config file in /etc/modprobe.d/ if needed (module). or both to match this setting. d) Run "make modules" e) Run "make modules_install" @@ -153,11 +153,11 @@ the irqs are not specified the driver uses the default in ip2.c (which selects polled mode). If no base addresses are specified the defaults in ip2.c are used. If you are autoloading the driver module with kerneld or kmod the base addresses and interrupt number must also be set in ip2.c -and recompile or just insert and options line in /etc/modprobe.conf or both. +and recompile or just insert and options line in /etc/modprobe.d/*.conf or both. The options line is equivalent to the command line and takes precedence over what is in ip2.c. -/etc/modprobe.conf sample: +config sample to put /etc/modprobe.d/*.conf: options ip2 io=1,0x328 irq=1,10 alias char-major-71 ip2 alias char-major-72 ip2 diff --git a/Documentation/serial/rocket.txt b/Documentation/serial/rocket.txt index 1d8582990435..60b039891057 100644 --- a/Documentation/serial/rocket.txt +++ b/Documentation/serial/rocket.txt @@ -62,7 +62,7 @@ in the system log at /var/log/messages. If installed as a module, the module must be loaded. This can be done manually by entering "modprobe rocket". To have the module loaded automatically -upon system boot, edit the /etc/modprobe.conf file and add the line +upon system boot, edit a /etc/modprobe.d/*.conf file and add the line "alias char-major-46 rocket". In order to use the ports, their device names (nodes) must be created with mknod. diff --git a/Documentation/serial/stallion.txt b/Documentation/serial/stallion.txt index 5c4902d9a5be..55090914a9c5 100644 --- a/Documentation/serial/stallion.txt +++ b/Documentation/serial/stallion.txt @@ -139,8 +139,8 @@ secondary address 0x280 and IRQ 10. You will probably want to enter this module load and configuration information into your system startup scripts so that the drivers are loaded and configured -on each system boot. Typically the start up script would be something like -/etc/modprobe.conf. +on each system boot. Typically configuration files are put in the +/etc/modprobe.d/ directory. 2.2 STATIC DRIVER CONFIGURATION: diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index 6f75ba3b8a39..8c16d50f6cb6 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -2044,7 +2044,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Install the necessary firmware files in alsa-firmware package. When no hotplug fw loader is available, you need to load the firmware via vxloader utility in alsa-tools package. To invoke - vxloader automatically, add the following to /etc/modprobe.conf + vxloader automatically, add the following to /etc/modprobe.d/alsa.conf install snd-vx222 /sbin/modprobe --first-time -i snd-vx222 && /usr/bin/vxloader @@ -2168,10 +2168,10 @@ corresponds to the card index of ALSA. Usually, define this as the same card module. An example configuration for a single emu10k1 card is like below: ------ /etc/modprobe.conf +----- /etc/modprobe.d/alsa.conf alias snd-card-0 snd-emu10k1 alias sound-slot-0 snd-emu10k1 ------ /etc/modprobe.conf +----- /etc/modprobe.d/alsa.conf The available number of auto-loaded sound cards depends on the module option "cards_limit" of snd module. As default it's set to 1. @@ -2184,7 +2184,7 @@ cards is kept consistent. An example configuration for two sound cards is like below: ------ /etc/modprobe.conf +----- /etc/modprobe.d/alsa.conf # ALSA portion options snd cards_limit=2 alias snd-card-0 snd-interwave @@ -2194,7 +2194,7 @@ options snd-ens1371 index=1 # OSS/Free portion alias sound-slot-0 snd-interwave alias sound-slot-1 snd-ens1371 ------ /etc/modprobe.conf +----- /etc/modprobe.d/alsa.conf In this example, the interwave card is always loaded as the first card (index 0) and ens1371 as the second (index 1). diff --git a/Documentation/sound/alsa/Audiophile-Usb.txt b/Documentation/sound/alsa/Audiophile-Usb.txt index a4c53d8961e1..654dd3b694a8 100644 --- a/Documentation/sound/alsa/Audiophile-Usb.txt +++ b/Documentation/sound/alsa/Audiophile-Usb.txt @@ -232,7 +232,7 @@ The parameter can be given: # modprobe snd-usb-audio index=1 device_setup=0x09 * Or while configuring the modules options in your modules configuration file - - For Fedora distributions, edit the /etc/modprobe.conf file: + (tipically a .conf file in /etc/modprobe.d/ directory: alias snd-card-1 snd-usb-audio options snd-usb-audio index=1 device_setup=0x09 @@ -253,7 +253,7 @@ CAUTION when initializing the device - first turn off the device - de-register the snd-usb-audio module (modprobe -r) - change the device_setup parameter by changing the device_setup - option in /etc/modprobe.conf + option in /etc/modprobe.d/*.conf - turn on the device * A workaround for this last issue has been applied to kernel 2.6.23, but it may not be enough to ensure the 'stability' of the device initialization. diff --git a/Documentation/sound/alsa/MIXART.txt b/Documentation/sound/alsa/MIXART.txt index ef42c44fa1f2..4ee35b4fbe4a 100644 --- a/Documentation/sound/alsa/MIXART.txt +++ b/Documentation/sound/alsa/MIXART.txt @@ -76,9 +76,9 @@ FIRMWARE when CONFIG_FW_LOADER is set. The mixartloader is necessary only for older versions or when you build the driver into kernel.] -For loading the firmware automatically after the module is loaded, use -the post-install command. For example, add the following entry to -/etc/modprobe.conf for miXart driver: +For loading the firmware automatically after the module is loaded, use a +install command. For example, add the following entry to +/etc/modprobe.d/mixart.conf for miXart driver: install snd-mixart /sbin/modprobe --first-time -i snd-mixart && \ /usr/bin/mixartloader diff --git a/Documentation/sound/alsa/OSS-Emulation.txt b/Documentation/sound/alsa/OSS-Emulation.txt index 022aaeb0e9dd..152ca2a3f1bd 100644 --- a/Documentation/sound/alsa/OSS-Emulation.txt +++ b/Documentation/sound/alsa/OSS-Emulation.txt @@ -19,7 +19,7 @@ the card number and the minor unit number. Usually you don't have to define these aliases by yourself. Only necessary step for auto-loading of OSS modules is to define the -card alias in /etc/modprobe.conf, such as +card alias in /etc/modprobe.d/alsa.conf, such as alias sound-slot-0 snd-emu10k1 diff --git a/Documentation/sound/oss/AudioExcelDSP16 b/Documentation/sound/oss/AudioExcelDSP16 index e0dc0641b480..e863f9cd5941 100644 --- a/Documentation/sound/oss/AudioExcelDSP16 +++ b/Documentation/sound/oss/AudioExcelDSP16 @@ -41,7 +41,7 @@ mpu_base I/O base address for activate MPU-401 mode (0x300, 0x310, 0x320 or 0x330) mpu_irq MPU-401 irq line (5, 7, 9, 10 or 0) -The /etc/modprobe.conf will have lines like this: +A configuration file in /etc/modprobe.d/ directory will have lines like this: options opl3 io=0x388 options ad1848 io=0x530 irq=11 dma=3 @@ -51,8 +51,8 @@ Where the aedsp16 options are the options for this driver while opl3 and ad1848 are the corresponding options for the MSS and OPL3 modules. Loading MSS and OPL3 needs to pre load the aedsp16 module to set up correctly -the sound card. Installation dependencies must be written in the modprobe.conf -file: +the sound card. Installation dependencies must be written in configuration +files under /etc/modprobe.d/ directory: install ad1848 /sbin/modprobe aedsp16 && /sbin/modprobe -i ad1848 install opl3 /sbin/modprobe aedsp16 && /sbin/modprobe -i opl3 diff --git a/Documentation/sound/oss/CMI8330 b/Documentation/sound/oss/CMI8330 index 9c439f1a6dba..8a5fd1611c6f 100644 --- a/Documentation/sound/oss/CMI8330 +++ b/Documentation/sound/oss/CMI8330 @@ -143,11 +143,10 @@ CONFIG_SOUND_MSS=m -Alma Chao suggests the following /etc/modprobe.conf: +Alma Chao suggests the following in +a /etc/modprobe.d/*conf file: alias sound ad1848 alias synth0 opl3 options ad1848 io=0x530 irq=7 dma=0 soundpro=1 options opl3 io=0x388 - - diff --git a/Documentation/sound/oss/Introduction b/Documentation/sound/oss/Introduction index 75d967ff9266..42da2d8fa372 100644 --- a/Documentation/sound/oss/Introduction +++ b/Documentation/sound/oss/Introduction @@ -167,8 +167,8 @@ in a file such as /root/soundon.sh. MODPROBE: ========= -If loading via modprobe, these common files are automatically loaded -when requested by modprobe. For example, my /etc/modprobe.conf contains: +If loading via modprobe, these common files are automatically loaded when +requested by modprobe. For example, my /etc/modprobe.d/oss.conf contains: alias sound sb options sb io=0x240 irq=9 dma=3 dma16=5 mpu_io=0x300 @@ -228,7 +228,7 @@ http://www.opensound.com. Before loading the commercial sound driver, you should do the following: 1. remove sound modules (detailed above) -2. remove the sound modules from /etc/modprobe.conf +2. remove the sound modules from /etc/modprobe.d/*.conf 3. move the sound modules from /lib/modules//misc (for example, I make a /lib/modules//misc/tmp directory and copy the sound module files to that @@ -265,7 +265,7 @@ twice, you need to do the following: sb.o could be copied (or symlinked) to sb1.o for the second SoundBlaster. -2. Make a second entry in /etc/modprobe.conf, for example, +2. Make a second entry in /etc/modprobe.d/*conf, for example, sound1 or sb1. This second entry should refer to the new module names for example sb1, and should include the I/O, etc. for the second sound card. @@ -369,7 +369,7 @@ There are several ways of configuring your sound: 2) On the command line when using insmod or in a bash script using command line calls to load sound. -3) In /etc/modprobe.conf when using modprobe. +3) In /etc/modprobe.d/*conf when using modprobe. 4) Via Red Hat's GPL'd /usr/sbin/sndconfig program (text based). diff --git a/Documentation/sound/oss/Opti b/Documentation/sound/oss/Opti index c15af3c07d46..4cd5d9ab3580 100644 --- a/Documentation/sound/oss/Opti +++ b/Documentation/sound/oss/Opti @@ -18,7 +18,7 @@ force the card into a mode in which it can be programmed. If you have another OS installed on your computer it is recommended that Linux and the other OS use the same resources. -Also, it is recommended that resources specified in /etc/modprobe.conf +Also, it is recommended that resources specified in /etc/modprobe.d/*.conf and resources specified in /etc/isapnp.conf agree. Compiling the sound driver @@ -67,11 +67,7 @@ address is hard-coded into the driver. Using kmod and autoloading the sound driver ------------------------------------------- -Comment: as of linux-2.1.90 kmod is replacing kerneld. -The config file '/etc/modprobe.conf' is used as before. - -This is the sound part of my /etc/modprobe.conf file. -Following that I will explain each line. +Config files in '/etc/modprobe.d/' are used as below: alias mixer0 mad16 alias audio0 mad16 diff --git a/Documentation/sound/oss/PAS16 b/Documentation/sound/oss/PAS16 index 3dca4b75988e..5c27229eec8c 100644 --- a/Documentation/sound/oss/PAS16 +++ b/Documentation/sound/oss/PAS16 @@ -128,7 +128,7 @@ CONFIG_SOUND_YM3812 You can then get OPL3 functionality by issuing the command: insmod opl3 In addition, you must either add the following line to - /etc/modprobe.conf: + /etc/modprobe.d/*.conf: options opl3 io=0x388 or else add the following line to /etc/lilo.conf: opl3=0x388 @@ -158,5 +158,5 @@ following line would be appropriate: append="pas2=0x388,10,3,-1,0,-1,-1,-1 opl3=0x388" If sound is built totally modular, the above options may be -specified in /etc/modprobe.conf for pas2, sb and opl3 +specified in /etc/modprobe.d/*.conf for pas2, sb and opl3 respectively. diff --git a/Documentation/sound/oss/README.modules b/Documentation/sound/oss/README.modules index e691d74e1e5e..bf5142a7be79 100644 --- a/Documentation/sound/oss/README.modules +++ b/Documentation/sound/oss/README.modules @@ -26,7 +26,7 @@ Note that it is no longer necessary or possible to configure sound in the drivers/sound dir. Now one simply configures and makes one's kernel and modules in the usual way. - Then, add to your /etc/modprobe.conf something like: + Then, add to your /etc/modprobe.d/oss.conf something like: alias char-major-14-* sb install sb /sbin/modprobe -i sb && /sbin/modprobe adlib_card @@ -66,12 +66,12 @@ args are expected. Note that at present there is no way to configure the io, irq and other parameters for the modular drivers as one does for the wired drivers.. One needs to pass the modules the necessary parameters as arguments, either -with /etc/modprobe.conf or with command-line args to modprobe, e.g. +with /etc/modprobe.d/*.conf or with command-line args to modprobe, e.g. modprobe sb io=0x220 irq=7 dma=1 dma16=5 mpu_io=0x330 modprobe adlib_card io=0x388 - recommend using /etc/modprobe.conf. + recommend using /etc/modprobe.d/*.conf. Persistent DMA Buffers: @@ -89,7 +89,7 @@ wasteful of RAM, but it guarantees that sound always works. To make the sound driver use persistent DMA buffers we need to pass the sound.o module a "dmabuf=1" command-line argument. This is normally done -in /etc/modprobe.conf like so: +in /etc/modprobe.d/*.conf files like so: options sound dmabuf=1 diff --git a/Documentation/usb/power-management.txt b/Documentation/usb/power-management.txt index 817df299ea07..4204eb01fd38 100644 --- a/Documentation/usb/power-management.txt +++ b/Documentation/usb/power-management.txt @@ -179,7 +179,8 @@ do: modprobe usbcore autosuspend=5 -Equivalently, you could add to /etc/modprobe.conf a line saying: +Equivalently, you could add to a configuration file in /etc/modprobe.d +a line saying: options usbcore autosuspend=5 diff --git a/Documentation/video4linux/CQcam.txt b/Documentation/video4linux/CQcam.txt index 8977e7ce4dab..6e680fec1e9c 100644 --- a/Documentation/video4linux/CQcam.txt +++ b/Documentation/video4linux/CQcam.txt @@ -61,29 +61,19 @@ But that is my personal preference. 2.2 Configuration The configuration requires module configuration and device -configuration. I like kmod or kerneld process with the -/etc/modprobe.conf file so the modules can automatically load/unload as -they are used. The video devices could already exist, be generated -using MAKEDEV, or need to be created. The following sections detail -these procedures. +configuration. The following sections detail these procedures. 2.1 Module Configuration Using modules requires a bit of work to install and pass the -parameters. Understand that entries in /etc/modprobe.conf of: +parameters. Understand that entries in /etc/modprobe.d/*.conf of: alias parport_lowlevel parport_pc options parport_pc io=0x378 irq=none alias char-major-81 videodev alias char-major-81-0 c-qcam -will cause the kmod/modprobe to do certain things. If you are -using kmod, then a request for a 'char-major-81-0' will cause -the 'c-qcam' module to load. If you have other video sources with -modules, you might want to assign the different minor numbers to -different modules. - 2.2 Device Configuration At this point, we need to ensure that the device files exist. diff --git a/Documentation/video4linux/Zoran b/Documentation/video4linux/Zoran index 9ed629d4874b..b5a911fd0602 100644 --- a/Documentation/video4linux/Zoran +++ b/Documentation/video4linux/Zoran @@ -255,7 +255,7 @@ Load zr36067.o. If it can't autodetect your card, use the card=X insmod option with X being the card number as given in the previous section. To have more than one card, use card=X1[,X2[,X3,[X4[..]]]] -To automate this, add the following to your /etc/modprobe.conf: +To automate this, add the following to your /etc/modprobe.d/zoran.conf: options zr36067 card=X1[,X2[,X3[,X4[..]]]] alias char-major-81-0 zr36067 diff --git a/Documentation/video4linux/bttv/Modules.conf b/Documentation/video4linux/bttv/Modules.conf index 753f15956eb8..8f258faf18f1 100644 --- a/Documentation/video4linux/bttv/Modules.conf +++ b/Documentation/video4linux/bttv/Modules.conf @@ -1,4 +1,4 @@ -# For modern kernels (2.6 or above), this belongs in /etc/modprobe.conf +# For modern kernels (2.6 or above), this belongs in /etc/modprobe.d/*.conf # For for 2.4 kernels or earlier, this belongs in /etc/modules.conf. # i2c diff --git a/Documentation/video4linux/meye.txt b/Documentation/video4linux/meye.txt index 34e2842c70ae..a051152ea99c 100644 --- a/Documentation/video4linux/meye.txt +++ b/Documentation/video4linux/meye.txt @@ -55,7 +55,7 @@ Module use: ----------- In order to automatically load the meye module on use, you can put those lines -in your /etc/modprobe.conf file: +in your /etc/modprobe.d/meye.conf file: alias char-major-81 videodev alias char-major-81-0 meye diff --git a/drivers/net/wan/Kconfig b/drivers/net/wan/Kconfig index 423eb26386c8..d70ede7a7f96 100644 --- a/drivers/net/wan/Kconfig +++ b/drivers/net/wan/Kconfig @@ -290,8 +290,8 @@ config FARSYNC Frame Relay or X.25/LAPB. If you want the module to be automatically loaded when the interface - is referenced then you should add "alias hdlcX farsync" to - /etc/modprobe.conf for each interface, where X is 0, 1, 2, ..., or + is referenced then you should add "alias hdlcX farsync" to a file + in /etc/modprobe.d/ for each interface, where X is 0, 1, 2, ..., or simply use "alias hdlc* farsync" to indicate all of them. To compile this driver as a module, choose M here: the diff --git a/drivers/scsi/aic7xxx/aic79xx_osm.c b/drivers/scsi/aic7xxx/aic79xx_osm.c index 7d48700257a7..9328121804bb 100644 --- a/drivers/scsi/aic7xxx/aic79xx_osm.c +++ b/drivers/scsi/aic7xxx/aic79xx_osm.c @@ -341,10 +341,10 @@ MODULE_PARM_DESC(aic79xx, " (0/256ms,1/128ms,2/64ms,3/32ms)\n" " slowcrc Turn on the SLOWCRC bit (Rev B only)\n" "\n" -" Sample /etc/modprobe.conf line:\n" -" Enable verbose logging\n" -" Set tag depth on Controller 2/Target 2 to 10 tags\n" -" Shorten the selection timeout to 128ms\n" +" Sample modprobe configuration file:\n" +" # Enable verbose logging\n" +" # Set tag depth on Controller 2/Target 2 to 10 tags\n" +" # Shorten the selection timeout to 128ms\n" "\n" " options aic79xx 'aic79xx=verbose.tag_info:{{}.{}.{..10}}.seltime:1'\n" ); diff --git a/drivers/scsi/aic7xxx/aic7xxx_osm.c b/drivers/scsi/aic7xxx/aic7xxx_osm.c index c6251bb4f438..5a477cdc780d 100644 --- a/drivers/scsi/aic7xxx/aic7xxx_osm.c +++ b/drivers/scsi/aic7xxx/aic7xxx_osm.c @@ -360,10 +360,10 @@ MODULE_PARM_DESC(aic7xxx, " seltime: Selection Timeout\n" " (0/256ms,1/128ms,2/64ms,3/32ms)\n" "\n" -" Sample /etc/modprobe.conf line:\n" -" Toggle EISA/VLB probing\n" -" Set tag depth on Controller 1/Target 1 to 10 tags\n" -" Shorten the selection timeout to 128ms\n" +" Sample modprobe configuration file:\n" +" # Toggle EISA/VLB probing\n" +" # Set tag depth on Controller 1/Target 1 to 10 tags\n" +" # Shorten the selection timeout to 128ms\n" "\n" " options aic7xxx 'aic7xxx=probe_eisa_vl.tag_info:{{}.{.10}}.seltime:1'\n" ); diff --git a/drivers/staging/asus_oled/README b/drivers/staging/asus_oled/README index 0d82a6d5fa58..2d721232467a 100644 --- a/drivers/staging/asus_oled/README +++ b/drivers/staging/asus_oled/README @@ -52,7 +52,7 @@ Configuration There is only one option: start_off. You can use it by: 'modprobe asus_oled start_off=1', or by adding this - line to /etc/modprobe.conf: + line to /etc/modprobe.d/asus_oled.conf: options asus_oled start_off=1 With this option provided, asus_oled driver will switch off the display diff --git a/drivers/tty/isicom.c b/drivers/tty/isicom.c index 794ecb40017c..e1235accab74 100644 --- a/drivers/tty/isicom.c +++ b/drivers/tty/isicom.c @@ -102,7 +102,7 @@ * You can find the original tools for this direct from Multitech * ftp://ftp.multitech.com/ISI-Cards/ * - * Having installed the cards the module options (/etc/modprobe.conf) + * Having installed the cards the module options (/etc/modprobe.d/) * * options isicom io=card1,card2,card3,card4 irq=card1,card2,card3,card4 * diff --git a/drivers/usb/serial/ftdi_sio.c b/drivers/usb/serial/ftdi_sio.c index 7c229d304684..ff8605b4b4be 100644 --- a/drivers/usb/serial/ftdi_sio.c +++ b/drivers/usb/serial/ftdi_sio.c @@ -1724,7 +1724,8 @@ static void ftdi_HE_TIRA1_setup(struct ftdi_private *priv) /* * Module parameter to control latency timer for NDI FTDI-based USB devices. - * If this value is not set in modprobe.conf.local its value will be set to 1ms. + * If this value is not set in /etc/modprobe.d/ its value will be set + * to 1ms. */ static int ndi_latency_timer = 1; diff --git a/drivers/usb/storage/Kconfig b/drivers/usb/storage/Kconfig index fe2d803a6347..7691c866637b 100644 --- a/drivers/usb/storage/Kconfig +++ b/drivers/usb/storage/Kconfig @@ -222,7 +222,7 @@ config USB_LIBUSUAL for usb-storage and ub drivers, and allows to switch binding of these devices without rebuilding modules. - Typical syntax of /etc/modprobe.conf is: + Typical syntax of /etc/modprobe.d/*conf is: options libusual bias="ub" diff --git a/sound/core/seq/seq_dummy.c b/sound/core/seq/seq_dummy.c index bbe32d2177d9..dbc550716790 100644 --- a/sound/core/seq/seq_dummy.c +++ b/sound/core/seq/seq_dummy.c @@ -46,7 +46,7 @@ The number of ports to be created can be specified via the module parameter "ports". For example, to create four ports, add the - following option in /etc/modprobe.conf: + following option in a configuration file under /etc/modprobe.d/: option snd-seq-dummy ports=4 diff --git a/sound/drivers/Kconfig b/sound/drivers/Kconfig index c8961165277c..fe5ae09ffccb 100644 --- a/sound/drivers/Kconfig +++ b/sound/drivers/Kconfig @@ -50,7 +50,8 @@ config SND_PCSP before the other sound driver of yours, making the pc-speaker a default sound device. Which is likely not what you want. To make this driver play nicely with other - sound driver, you can add this into your /etc/modprobe.conf: + sound driver, you can add this in a configuration file under + /etc/modprobe.d/ directory: options snd-pcsp index=2 You don't need this driver if you only want your pc-speaker to beep. -- cgit v1.2.3 From 78286cdf054212c6d2fe6524fbf673fb9ead1abe Mon Sep 17 00:00:00 2001 From: Lucas De Marchi Date: Fri, 30 Mar 2012 13:37:20 -0700 Subject: Documentation: replace install commands with softdeps Install commands should not be used to specify soft dependencies among modules. When loading modules it's much better to have a softdep that modprobe knows what's being done than having to fork/exec another instance of modprobe to load the other module. By using a softdep user has also an option to remove the dependencies when removing the module (and if its refcount dropped to 0) Signed-off-by: Lucas De Marchi Signed-off-by: Linus Torvalds --- Documentation/networking/bonding.txt | 3 +-- Documentation/sound/oss/AudioExcelDSP16 | 4 ++-- Documentation/sound/oss/README.modules | 2 +- 3 files changed, 4 insertions(+), 5 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt index d5e869814040..bfea8a338901 100644 --- a/Documentation/networking/bonding.txt +++ b/Documentation/networking/bonding.txt @@ -1822,8 +1822,7 @@ modules.conf manual page. In this case, the following can be added to config files in /etc/modprobe.d/ as: -install bonding /sbin/modprobe tg3; /sbin/modprobe e1000; - /sbin/modprobe --ignore-install bonding +softdep bonding pre: tg3 e1000 This will load tg3 and e1000 modules before loading the bonding one. Full documentation on this can be found in the modprobe.d and modprobe diff --git a/Documentation/sound/oss/AudioExcelDSP16 b/Documentation/sound/oss/AudioExcelDSP16 index e863f9cd5941..ea8549faede9 100644 --- a/Documentation/sound/oss/AudioExcelDSP16 +++ b/Documentation/sound/oss/AudioExcelDSP16 @@ -54,8 +54,8 @@ Loading MSS and OPL3 needs to pre load the aedsp16 module to set up correctly the sound card. Installation dependencies must be written in configuration files under /etc/modprobe.d/ directory: -install ad1848 /sbin/modprobe aedsp16 && /sbin/modprobe -i ad1848 -install opl3 /sbin/modprobe aedsp16 && /sbin/modprobe -i opl3 +softdep ad1848 pre: aedsp16 +softdep opl3 pre: aedsp16 Then you must load the sound modules stack in this order: sound -> aedsp16 -> [ ad1848, opl3 ] diff --git a/Documentation/sound/oss/README.modules b/Documentation/sound/oss/README.modules index bf5142a7be79..cdc039421a46 100644 --- a/Documentation/sound/oss/README.modules +++ b/Documentation/sound/oss/README.modules @@ -36,7 +36,7 @@ options adlib_card io=0x388 # FM synthesizer Alternatively, if you have compiled in kernel level ISAPnP support: alias char-major-14 sb -post-install sb /sbin/modprobe "-k" "adlib_card" +softdep sb post: adlib_card options adlib_card io=0x388 The effect of this is that the sound driver and all necessary bits and -- cgit v1.2.3 From c48013824b855c4498092aa6f2fa5345962c1f65 Mon Sep 17 00:00:00 2001 From: Rafal Kapela Date: Fri, 30 Mar 2012 13:37:26 -0700 Subject: Documentation: fix typo in ABI/stable/sysfs-driver-usb-usbtmc Fix "the the" in ABI/stable/sysfs-driver-usb-usbtmc Signed-off-by: Rafal Kapela Signed-off-by: Randy Dunlap Cc: Greg KH Signed-off-by: Linus Torvalds --- Documentation/ABI/stable/sysfs-driver-usb-usbtmc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'Documentation') diff --git a/Documentation/ABI/stable/sysfs-driver-usb-usbtmc b/Documentation/ABI/stable/sysfs-driver-usb-usbtmc index 23a43b8207e6..2a7f9a00cb0a 100644 --- a/Documentation/ABI/stable/sysfs-driver-usb-usbtmc +++ b/Documentation/ABI/stable/sysfs-driver-usb-usbtmc @@ -55,7 +55,7 @@ What: /sys/bus/usb/drivers/usbtmc/devices/*/auto_abort Date: August 2008 Contact: Greg Kroah-Hartman Description: - This file determines if the the transaction of the USB TMC + This file determines if the transaction of the USB TMC device is to be automatically aborted if there is any error. For more details about this, please see the document, "Universal Serial Bus Test and Measurement Class Specification -- cgit v1.2.3 From fbc729a446f7d80ec8b73fe90d8c0cc3e95ad277 Mon Sep 17 00:00:00 2001 From: Andre Przywara Date: Fri, 30 Mar 2012 16:48:20 -0400 Subject: hwmon: (k10temp) Add support for AMD Trinity CPUs The on-chip northbridge's temperature sensor of the upcoming AMD Trinity CPUs works the same as for the previous CPUs. Since it has a different PCI-ID, we just add the new one to the list supported by k10temp. This allows to use the k10temp driver on those CPUs. Signed-off-by: Andre Przywara Cc: stable@vger.kernel.org # 3.0+ Signed-off-by: Guenter Roeck --- Documentation/hwmon/k10temp | 2 +- drivers/hwmon/Kconfig | 3 ++- drivers/hwmon/k10temp.c | 4 ++++ 3 files changed, 7 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/hwmon/k10temp b/Documentation/hwmon/k10temp index a10f73624ad3..90956b618025 100644 --- a/Documentation/hwmon/k10temp +++ b/Documentation/hwmon/k10temp @@ -11,7 +11,7 @@ Supported chips: Socket S1G2: Athlon (X2), Sempron (X2), Turion X2 (Ultra) * AMD Family 12h processors: "Llano" (E2/A4/A6/A8-Series) * AMD Family 14h processors: "Brazos" (C/E/G/Z-Series) -* AMD Family 15h processors: "Bulldozer" +* AMD Family 15h processors: "Bulldozer" (FX-Series), "Trinity" Prefix: 'k10temp' Addresses scanned: PCI space diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig index 670f274a18e6..8deedc1b9840 100644 --- a/drivers/hwmon/Kconfig +++ b/drivers/hwmon/Kconfig @@ -253,7 +253,8 @@ config SENSORS_K10TEMP If you say yes here you get support for the temperature sensor(s) inside your CPU. Supported are later revisions of the AMD Family 10h and all revisions of the AMD Family 11h, - 12h (Llano), 14h (Brazos) and 15h (Bulldozer) microarchitectures. + 12h (Llano), 14h (Brazos) and 15h (Bulldozer/Trinity) + microarchitectures. This driver can also be built as a module. If so, the module will be called k10temp. diff --git a/drivers/hwmon/k10temp.c b/drivers/hwmon/k10temp.c index aba29d63f195..307bb325dde9 100644 --- a/drivers/hwmon/k10temp.c +++ b/drivers/hwmon/k10temp.c @@ -33,6 +33,9 @@ static bool force; module_param(force, bool, 0444); MODULE_PARM_DESC(force, "force loading on processors with erratum 319"); +/* PCI-IDs for Northbridge devices not used anywhere else */ +#define PCI_DEVICE_ID_AMD_15H_M10H_NB_F3 0x1403 + /* CPUID function 0x80000001, ebx */ #define CPUID_PKGTYPE_MASK 0xf0000000 #define CPUID_PKGTYPE_F 0x00000000 @@ -210,6 +213,7 @@ static DEFINE_PCI_DEVICE_TABLE(k10temp_id_table) = { { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_11H_NB_MISC) }, { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_CNB17H_F3) }, { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_15H_NB_F3) }, + { PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_15H_M10H_NB_F3) }, {} }; MODULE_DEVICE_TABLE(pci, k10temp_id_table); -- cgit v1.2.3 From 5d6bd8619db5a30668093c1b2967674645cf0736 Mon Sep 17 00:00:00 2001 From: Fernando Luis Vazquez Cao Date: Tue, 3 Apr 2012 08:41:40 +0000 Subject: TCP: update ip_local_port_range documentation The explanation of ip_local_port_range in Documentation/networking/ip-sysctl.txt contains several factual errors: - The default value of ip_local_port_range does not depend on the amount of memory available in the system. - tcp_tw_recycle is not enabled by default. - 1024-4999 is not the default value. - Etc. Clean up the mess. Signed-off-by: Fernando Luis Vazquez Cao Signed-off-by: David S. Miller --- Documentation/networking/ip-sysctl.txt | 11 ++--------- 1 file changed, 2 insertions(+), 9 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index ad3e80e17b4f..bd80ba5847d2 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -604,15 +604,8 @@ IP Variables: ip_local_port_range - 2 INTEGERS Defines the local port range that is used by TCP and UDP to choose the local port. The first number is the first, the - second the last local port number. Default value depends on - amount of memory available on the system: - > 128Mb 32768-61000 - < 128Mb 1024-4999 or even less. - This number defines number of active connections, which this - system can issue simultaneously to systems not supporting - TCP extensions (timestamps). With tcp_tw_recycle enabled - (i.e. by default) range 1024-4999 is enough to issue up to - 2000 connections per second to systems supporting timestamps. + second the last local port number. The default values are + 32768 and 61000 respectively. ip_local_reserved_ports - list of comma separated ranges Specify the ports which are reserved for known third-party -- cgit v1.2.3 From c16524e6a957bc96ed02738d5396d355c0027d00 Mon Sep 17 00:00:00 2001 From: Nicolas Ferre Date: Thu, 22 Mar 2012 14:48:47 +0100 Subject: ARM: at91/NAND DT bindings: add comments Add comments to NAND "gpios" property to make it clearer. Signed-off-by: Nicolas Ferre Acked-by: Jean-Christophe PLAGNIOL-VILLARD --- Documentation/devicetree/bindings/mtd/atmel-nand.txt | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) (limited to 'Documentation') diff --git a/Documentation/devicetree/bindings/mtd/atmel-nand.txt b/Documentation/devicetree/bindings/mtd/atmel-nand.txt index 5903ecf6e895..a20069502f5a 100644 --- a/Documentation/devicetree/bindings/mtd/atmel-nand.txt +++ b/Documentation/devicetree/bindings/mtd/atmel-nand.txt @@ -27,13 +27,13 @@ nand0: nand@40000000,0 { reg = <0x40000000 0x10000000 0xffffe800 0x200 >; - atmel,nand-addr-offset = <21>; - atmel,nand-cmd-offset = <22>; + atmel,nand-addr-offset = <21>; /* ale */ + atmel,nand-cmd-offset = <22>; /* cle */ nand-on-flash-bbt; nand-ecc-mode = "soft"; - gpios = <&pioC 13 0 - &pioC 14 0 - 0 + gpios = <&pioC 13 0 /* rdy */ + &pioC 14 0 /* nce */ + 0 /* cd */ >; partition@0 { ... -- cgit v1.2.3 From 93b6a3adbd159174772702744b142d60e3891dfa Mon Sep 17 00:00:00 2001 From: Ben Hutchings Date: Thu, 5 Apr 2012 14:39:10 +0000 Subject: doc, net: Remove obsolete reference to dev->poll Commit bea3348eef27e6044b6161fd04c3152215f96411 ('[NET]: Make NAPI polling independent of struct net_device objects.') removed the automatic disabling of NAPI polling by dev_close(), and drivers must now do this themselves. Signed-off-by: Ben Hutchings Signed-off-by: David S. Miller --- Documentation/networking/netdevices.txt | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/netdevices.txt b/Documentation/networking/netdevices.txt index 89358341682a..336fe8e85b20 100644 --- a/Documentation/networking/netdevices.txt +++ b/Documentation/networking/netdevices.txt @@ -54,8 +54,7 @@ dev->open: dev->stop: Synchronization: rtnl_lock() semaphore. Context: process - Note1: netif_running() is guaranteed false - Note2: dev->poll() is guaranteed to be stopped + Note: netif_running() is guaranteed false dev->do_ioctl: Synchronization: rtnl_lock() semaphore. -- cgit v1.2.3 From 04fd3d3515612b71f96b851db7888bfe58ef2142 Mon Sep 17 00:00:00 2001 From: Ben Hutchings Date: Thu, 5 Apr 2012 14:39:30 +0000 Subject: doc, net: Update documentation of synchronisation for TX multiqueue Commits e308a5d806c852f56590ffdd3834d0df0cbed8d7 ('netdev: Add netdev->addr_list_lock protection.') and e8a0464cc950972824e2e128028ae3db666ec1ed ('netdev: Allocate multiple queues for TX.') introduced more fine-grained locks. Signed-off-by: Ben Hutchings Signed-off-by: David S. Miller --- Documentation/networking/netdevices.txt | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/netdevices.txt b/Documentation/networking/netdevices.txt index 336fe8e85b20..b107733cfdcd 100644 --- a/Documentation/networking/netdevices.txt +++ b/Documentation/networking/netdevices.txt @@ -65,7 +65,7 @@ dev->get_stats: Context: nominally process, but don't sleep inside an rwlock dev->hard_start_xmit: - Synchronization: netif_tx_lock spinlock. + Synchronization: __netif_tx_lock spinlock. When the driver sets NETIF_F_LLTX in dev->features this will be called without holding netif_tx_lock. In this case the driver @@ -87,12 +87,12 @@ dev->hard_start_xmit: Only valid when NETIF_F_LLTX is set. dev->tx_timeout: - Synchronization: netif_tx_lock spinlock. + Synchronization: netif_tx_lock spinlock; all TX queues frozen. Context: BHs disabled Notes: netif_queue_stopped() is guaranteed true dev->set_rx_mode: - Synchronization: netif_tx_lock spinlock. + Synchronization: netif_addr_lock spinlock. Context: BHs disabled struct napi_struct synchronization rules -- cgit v1.2.3 From b3cf65457fc0c8d183bdb9bc4358e5706aa63cc5 Mon Sep 17 00:00:00 2001 From: Ben Hutchings Date: Thu, 5 Apr 2012 14:39:47 +0000 Subject: doc, net: Update netdev operation names Commits d314774cf2cd5dfeb39a00d37deee65d4c627927 ('netdev: network device operations infrastructure') and 008298231abbeb91bc7be9e8b078607b816d1a4a ('netdev: add more functions to netdevice ops') moved and renamed net device operation pointers. Signed-off-by: Ben Hutchings Signed-off-by: David S. Miller --- Documentation/networking/driver.txt | 12 ++++++------ Documentation/networking/netdevices.txt | 16 ++++++++-------- 2 files changed, 14 insertions(+), 14 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/driver.txt b/Documentation/networking/driver.txt index 03283daa64fe..83ce06080ceb 100644 --- a/Documentation/networking/driver.txt +++ b/Documentation/networking/driver.txt @@ -2,7 +2,7 @@ Document about softnet driver issues Transmit path guidelines: -1) The hard_start_xmit method must never return '1' under any +1) The ndo_start_xmit method must never return '1' under any normal circumstances. It is considered a hard error unless there is no way your device can tell ahead of time when it's transmit function will become busy. @@ -61,10 +61,10 @@ Transmit path guidelines: 2) Do not forget to update netdev->trans_start to jiffies after each new tx packet is given to the hardware. -3) A hard_start_xmit method must not modify the shared parts of a +3) An ndo_start_xmit method must not modify the shared parts of a cloned SKB. -4) Do not forget that once you return 0 from your hard_start_xmit +4) Do not forget that once you return 0 from your ndo_start_xmit method, it is your driver's responsibility to free up the SKB and in some finite amount of time. @@ -74,7 +74,7 @@ Transmit path guidelines: This error can deadlock sockets waiting for send buffer room to be freed up. - If you return 1 from the hard_start_xmit method, you must not keep + If you return 1 from the ndo_start_xmit method, you must not keep any reference to that SKB and you must not attempt to free it up. Probing guidelines: @@ -85,10 +85,10 @@ Probing guidelines: Close/stop guidelines: -1) After the dev->stop routine has been called, the hardware must +1) After the ndo_stop routine has been called, the hardware must not receive or transmit any data. All in flight packets must be aborted. If necessary, poll or wait for completion of any reset commands. -2) The dev->stop routine will be called by unregister_netdevice +2) The ndo_stop routine will be called by unregister_netdevice if device is still UP. diff --git a/Documentation/networking/netdevices.txt b/Documentation/networking/netdevices.txt index b107733cfdcd..c7ecc7080494 100644 --- a/Documentation/networking/netdevices.txt +++ b/Documentation/networking/netdevices.txt @@ -47,24 +47,24 @@ packets is preferred. struct net_device synchronization rules ======================================= -dev->open: +ndo_open: Synchronization: rtnl_lock() semaphore. Context: process -dev->stop: +ndo_stop: Synchronization: rtnl_lock() semaphore. Context: process Note: netif_running() is guaranteed false -dev->do_ioctl: +ndo_do_ioctl: Synchronization: rtnl_lock() semaphore. Context: process -dev->get_stats: +ndo_get_stats: Synchronization: dev_base_lock rwlock. Context: nominally process, but don't sleep inside an rwlock -dev->hard_start_xmit: +ndo_start_xmit: Synchronization: __netif_tx_lock spinlock. When the driver sets NETIF_F_LLTX in dev->features this will be @@ -86,12 +86,12 @@ dev->hard_start_xmit: o NETDEV_TX_LOCKED Locking failed, please retry quickly. Only valid when NETIF_F_LLTX is set. -dev->tx_timeout: +ndo_tx_timeout: Synchronization: netif_tx_lock spinlock; all TX queues frozen. Context: BHs disabled Notes: netif_queue_stopped() is guaranteed true -dev->set_rx_mode: +ndo_set_rx_mode: Synchronization: netif_addr_lock spinlock. Context: BHs disabled @@ -99,7 +99,7 @@ struct napi_struct synchronization rules ======================================== napi->poll: Synchronization: NAPI_STATE_SCHED bit in napi->state. Device - driver's dev->close method will invoke napi_disable() on + driver's ndo_stop method will invoke napi_disable() on all NAPI instances which will do a sleeping poll on the NAPI_STATE_SCHED napi->state bit, waiting for all pending NAPI activity to cease. -- cgit v1.2.3 From de7aca16fd6c32719b6a7d4480b8f4685f69f7ff Mon Sep 17 00:00:00 2001 From: Ben Hutchings Date: Thu, 5 Apr 2012 14:40:06 +0000 Subject: doc, net: Remove instruction to set net_device::trans_start Commit 08baf561083bc27a953aa087dd8a664bb2b88e8e ('net: txq_trans_update() helper') made it unnecessary for most drivers to set net_device::trans_start (or netdev_queue::trans_start). Signed-off-by: Ben Hutchings Signed-off-by: David S. Miller --- Documentation/networking/driver.txt | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/driver.txt b/Documentation/networking/driver.txt index 83ce06080ceb..2128e4169c5b 100644 --- a/Documentation/networking/driver.txt +++ b/Documentation/networking/driver.txt @@ -58,13 +58,10 @@ Transmit path guidelines: TX_BUFFS_AVAIL(dp) > 0) netif_wake_queue(dp->dev); -2) Do not forget to update netdev->trans_start to jiffies after - each new tx packet is given to the hardware. - -3) An ndo_start_xmit method must not modify the shared parts of a +2) An ndo_start_xmit method must not modify the shared parts of a cloned SKB. -4) Do not forget that once you return 0 from your ndo_start_xmit +3) Do not forget that once you return 0 from your ndo_start_xmit method, it is your driver's responsibility to free up the SKB and in some finite amount of time. -- cgit v1.2.3 From e34fac1c2e9ec531c2d63a5e3aa9a6d0ef36a1d3 Mon Sep 17 00:00:00 2001 From: Ben Hutchings Date: Thu, 5 Apr 2012 14:40:25 +0000 Subject: doc, net: Update ndo_start_xmit return type and values Commit dc1f8bf68b311b1537cb65893430b6796118498a ('netdev: change transmit to limited range type') changed the required return type and 9a1654ba0b50402a6bd03c7b0fe9b0200a5ea7b1 ('net: Optimize hard_start_xmit() return checking') changed the valid numerical return values. Signed-off-by: Ben Hutchings Signed-off-by: David S. Miller --- Documentation/networking/driver.txt | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) (limited to 'Documentation') diff --git a/Documentation/networking/driver.txt b/Documentation/networking/driver.txt index 2128e4169c5b..da59e2884130 100644 --- a/Documentation/networking/driver.txt +++ b/Documentation/networking/driver.txt @@ -2,16 +2,16 @@ Document about softnet driver issues Transmit path guidelines: -1) The ndo_start_xmit method must never return '1' under any - normal circumstances. It is considered a hard error unless +1) The ndo_start_xmit method must not return NETDEV_TX_BUSY under + any normal circumstances. It is considered a hard error unless there is no way your device can tell ahead of time when it's transmit function will become busy. Instead it must maintain the queue properly. For example, for a driver implementing scatter-gather this means: - static int drv_hard_start_xmit(struct sk_buff *skb, - struct net_device *dev) + static netdev_tx_t drv_hard_start_xmit(struct sk_buff *skb, + struct net_device *dev) { struct drv *dp = netdev_priv(dev); @@ -23,7 +23,7 @@ Transmit path guidelines: unlock_tx(dp); printk(KERN_ERR PFX "%s: BUG! Tx Ring full when queue awake!\n", dev->name); - return 1; + return NETDEV_TX_BUSY; } ... queue packet to card ... @@ -35,6 +35,7 @@ Transmit path guidelines: ... unlock_tx(dp); ... + return NETDEV_TX_OK; } And then at the end of your TX reclamation event handling: @@ -61,9 +62,9 @@ Transmit path guidelines: 2) An ndo_start_xmit method must not modify the shared parts of a cloned SKB. -3) Do not forget that once you return 0 from your ndo_start_xmit - method, it is your driver's responsibility to free up the SKB - and in some finite amount of time. +3) Do not forget that once you return NETDEV_TX_OK from your + ndo_start_xmit method, it is your driver's responsibility to free + up the SKB and in some finite amount of time. For example, this means that it is not allowed for your TX mitigation scheme to let TX packets "hang out" in the TX @@ -71,8 +72,9 @@ Transmit path guidelines: This error can deadlock sockets waiting for send buffer room to be freed up. - If you return 1 from the ndo_start_xmit method, you must not keep - any reference to that SKB and you must not attempt to free it up. + If you return NETDEV_TX_BUSY from the ndo_start_xmit method, you + must not keep any reference to that SKB and you must not attempt + to free it up. Probing guidelines: -- cgit v1.2.3