summaryrefslogtreecommitdiffstats
path: root/drivers (follow)
Commit message (Collapse)AuthorAgeFilesLines
* scsi: introduce a result field in struct scsi_requestChristoph Hellwig2017-04-2025-103/+102
| | | | | | | | | | | | | | | This passes on the scsi_cmnd result field to users of passthrough requests. Currently we abuse req->errors for this purpose, but that field will go away in its current form. Note that the old IDE code abuses the errors field in very creative ways and stores all kinds of different values in it. I didn't dare to touch this magic, so the abuses are brought forward 1:1. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* virtio_blk: don't use req->errorsChristoph Hellwig2017-04-201-7/+3
| | | | | | | | | | Remove passing req->errors (which at that point is always 0) to blk_mq_complete_request, and rely on the virtio status code for the serial number passthrough request. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* virtio: fix spelling of virtblk_scsi_request_doneChristoph Hellwig2017-04-201-3/+3
| | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* nvme: make nvme_error_status privateChristoph Hellwig2017-04-203-6/+4
| | | | | | | | | | | | Currently it's used by the lighnvm passthrough ioctl, but we'd like to make it private in preparation of block layer specific error code. Lighnvm already returns the real NVMe status anyway, so I think we can just limit it to returning -EIO for any status set. This will need a careful audit from the lightnvm folks, though. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* nvme: split nvme status from block req->errorsChristoph Hellwig2017-04-207-60/+65
| | | | | | | | | | | | | | | | | We want our own clearly defined error field for NVMe passthrough commands, and the request errors field is going away in its current form. Just store the status and result field in the nvme_request field from hardirq completion context (using a new helper) and then generate a Linux errno for the block layer only when we actually need it. Because we can't overload the status value with a negative error code for cancelled command we now have a flags filed in struct nvme_request that contains a bit for this condition. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* nvme-fc: fix status code handling in nvme_fc_fcpio_doneChristoph Hellwig2017-04-201-8/+8
| | | | | | | | | | | | | nvme_complete_async_event expects the little endian status code including the phase bit, and a new completion handler I plan to introduce will do so as well. Change the status variable into the little endian format with the phase bit used in the NVMe CQE to fix / enable this. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* block: remove the blk_execute_rq return valueChristoph Hellwig2017-04-2012-17/+27
| | | | | | | | | | | | The function only returns -EIO if rq->errors is non-zero, which is not very useful and lets a large number of callers ignore the return value. Just let the callers figure out their error themselves. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* pd: don't check blk_execute_rq return value.Christoph Hellwig2017-04-201-5/+2
| | | | | | | | | The driver never sets req->errors, so blk_execute_rq will always return 0. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* bdi: Drop 'parent' argument from bdi_register[_va]()Jan Kara2017-04-201-1/+1
| | | | | | | | | Drop 'parent' argument of bdi_register() and bdi_register_va(). It is always NULL. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
* fs: Remove SB_I_DYNBDI flagJan Kara2017-04-201-1/+0
| | | | | | | | | Now that all bdi structures filesystems use are properly refcounted, we can remove the SB_I_DYNBDI flag. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
* mtd: Convert to dynamically allocated bdi infrastructureJan Kara2017-04-202-12/+18
| | | | | | | | | | | | | | | | MTD already allocates backing_dev_info dynamically. Convert it to use generic infrastructure for this including proper refcounting. We drop mtd->backing_dev_info as its only use was to pass mtd_bdi pointer from one file into another and if we wanted to keep that in a clean way, we'd have to make mtd hold and drop bdi reference as needed which seems pointless for passing one global pointer... CC: David Woodhouse <dwmw2@infradead.org> CC: Brian Norris <computersforpeace@gmail.com> CC: linux-mtd@lists.infradead.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
* lustre: Convert to separately allocated bdiJan Kara2017-04-202-25/+3
| | | | | | | | | | | | | | Allocate struct backing_dev_info separately instead of embedding it inside superblock. This unifies handling of bdi among users. CC: Oleg Drokin <oleg.drokin@intel.com> CC: Andreas Dilger <andreas.dilger@intel.com> CC: James Simmons <jsimmons@infradead.org> CC: lustre-devel@lists.lustre.org Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com>
* ligtnvm: fix double blk_put_queue on same queueRakesh Pandit2017-04-201-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On an error path in NVM_DEV_CREATE ioctl blk_put_queue is being called twice: one via blk_cleanup_queue and another via put_disk. Straight fix seems to remove queue pointer so that disk_release never ends up caling blk_put_queue again. [ 391.808827] WARNING: CPU: 1 PID: 1250 at lib/refcount.c:128 refcount_sub_and_test+0x70/0x80 [ 391.808830] refcount_t: underflow; use-after-free. [ 391.808832] Modules linked in: nf_conntrack_netbios_ns............ [ 391.809052] CPU: 1 PID: 1250 Comm: nvme Not tainted......... [ 391.809057] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 [ 391.809060] Call Trace: [ 391.809079] dump_stack+0x63/0x86 [ 391.809094] __warn+0xcb/0xf0 [ 391.809103] warn_slowpath_fmt+0x5f/0x80 [ 391.809118] refcount_sub_and_test+0x70/0x80 [ 391.809125] refcount_dec_and_test+0x11/0x20 [ 391.809136] kobject_put+0x1f/0x60 [ 391.809149] blk_put_queue+0x15/0x20 [ 391.809159] disk_release+0xae/0xf0 [ 391.809172] device_release+0x32/0x90 [ 391.809184] kobject_release+0x6a/0x170 [ 391.809196] kobject_put+0x2f/0x60 [ 391.809206] put_disk+0x17/0x20 [ 391.809219] nvm_ioctl_dev_create.isra.16+0x897/0xa30 [ 391.809236] nvm_ctl_ioctl+0x23c/0x4c0 [ 391.809248] do_vfs_ioctl+0xa3/0x5f0 [ 391.809258] SyS_ioctl+0x79/0x90 [ 391.809271] entry_SYSCALL_64_fastpath+0x1a/0xa9 [ 391.809280] RIP: 0033:0x7f5d3ef363c7 [ 391.809286] RSP: 002b:00007ffc72ed8d78 EFLAGS: 00000206 ORIG_RAX: 0000000000000010 [ 391.809296] RAX: ffffffffffffffda RBX: 00007ffc72edb552 RCX: 00007f5d3ef363c7 [ 391.809301] RDX: 00007ffc72ed8d90 RSI: 0000000040804c22 RDI: 0000000000000003 [ 391.809306] RBP: 0000000000000001 R08: 0000000000000020 R09: 0000000000000001 [ 391.809311] R10: 000000000000053f R11: 0000000000000206 R12: 0000000000000000 [ 391.809316] R13: 0000000000000000 R14: 00007ffc72edb58d R15: 00007ffc72edb581 Signed-off-by: Rakesh Pandit <rakesh@tuxera.com> Reviewed-by: Matias Bjørling <matias@cnexlabs.com> Fixes: 7d1ef2f408ab "lightnvm: fix cleanup order of disk on init error" Signed-off-by: Jens Axboe <axboe@fb.com>
* lightnvm: Use blk_init_request_from_bio() instead of open-coding itBart Van Assche2017-04-201-5/+1
| | | | | | | | | | | | | | | | | This patch changes the behavior of the lightnvm driver as follows: * REQ_FAILFAST_MASK is set for read-ahead requests. * If no I/O priority has been set in the bio, the I/O priority is copied from the I/O context. * The rq_disk member is initialized if bio->bi_bdev != NULL. * The bio sector offset is copied into req->__sector instead of retaining the value -1 set by blk_mq_alloc_request(). * req->errors is initialized to zero. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matias Bjørling <m@bjorling.me> Cc: Adam Manzanares <adam.manzanares@wdc.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* null_blk: Use blk_init_request_from_bio() instead of open-coding itBart Van Assche2017-04-201-8/+1
| | | | | | | | | | | | | | | | This patch changes the behavior of the null_blk driver for the LightNVM mode as follows: * REQ_FAILFAST_MASK is set for read-ahead requests. * If no I/O priority has been set in the bio, the I/O priority is copied from the I/O context. * The rq_disk member is initialized if bio->bi_bdev != NULL. * req->errors is initialized to zero. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Matias Bjørling <m@bjorling.me> Cc: Adam Manzanares <adam.manzanares@wdc.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* lightnvm: assume 64-bit lba numbersArnd Bergmann2017-04-191-1/+1
| | | | | | | | | | | | | | | | The driver uses both u64 and sector_t to refer to offsets, and assigns between the two. This causes one harmless warning when sector_t is 32-bit: drivers/lightnvm/pblk-rb.c: In function 'pblk_rb_write_entry_gc': include/linux/lightnvm.h:215:20: error: large integer implicitly truncated to unsigned type [-Werror=overflow] drivers/lightnvm/pblk-rb.c:324:22: note: in expansion of macro 'ADDR_EMPTY' As the driver is already doing this inconsistently, changing the type won't make it worse and is an easy way to avoid the warning. Fixes: a4bd217b4326 ("lightnvm: physical block device (pblk) target") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jens Axboe <axboe@fb.com>
* block: remove the osdblk driverChristoph Hellwig2017-04-193-710/+0
| | | | | | | | | This was just a proof of concept user for the SCSI OSD library, and never had any real users. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Boaz Harrosh <ooo@electrozaur.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: set the max segment size to UINT_MAXJosef Bacik2017-04-191-0/+1
| | | | | | | | | | NBD doesn't care about limiting the segment size, let the user push the largest bio's they want. This allows us to control the request size solely through max_sectors_kb. Signed-off-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* blkfront: add uevent for size changeMarc Olson2017-04-181-0/+3
| | | | | | | | | | | | | | When a blkfront device is resized from dom0, emit a KOBJ_CHANGE uevent to notify the guest about the change. This allows for custom udev rules, such as automatically resizing a filesystem, when an event occurs. With this patch you get these udev KERNEL[577.206230] change /devices/vbd-51728/block/xvdb (block) UDEV [577.226218] change /devices/vbd-51728/block/xvdb (block) Signed-off-by: Marc Olson <marcolso@amazon.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* nbd: add a flag to destroy an nbd device on disconnectJosef Bacik2017-04-171-0/+30
| | | | | | | | | | For ease of management it would be nice for users to specify that the device node for a nbd device is destroyed once it is disconnected and there are no more users. Add a client flag and enable this operation to happen. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: add device refcountingJosef Bacik2017-04-171-18/+86
| | | | | | | | | | In order to support deleting the device on disconnect we need to refcount the actual nbd_device struct. So add the refcounting framework and change how we free the normal devices at rmmod time so we can catch reference leaks. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: add a status netlink commandJosef Bacik2017-04-171-0/+108
| | | | | | | | | Allow users to query the status of existing nbd devices. Right now this only returns whether or not the device is connected, but could be extended in the future to include more information. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: handle dead connectionsJosef Bacik2017-04-171-4/+59
| | | | | | | | | | | | Sometimes we like to upgrade our server without making all of our clients freak out and reconnect. This patch provides a way to specify a dead connection timeout to allow us to pause all requests and wait for new connections to be opened. With this in place I can take down the nbd server for less than the dead connection timeout time and bring it back up and everything resumes gracefully. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: only clear the queue on device teardownJosef Bacik2017-04-171-1/+3
| | | | | | | | | | | | | When running a disconnect torture test I noticed that sometimes we would crash with a negative ref count on our queue. This was because we were ending the same request twice. Turns out we were racing with NBD_CLEAR_SOCK clearing the requests as well as the teardown of the device clearing the requests. So instead make the ioctl only shutdown the sockets and make it so that we only ever run nbd_clear_que from the device teardown. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: multicast dead link notificationsJosef Bacik2017-04-171-12/+77
| | | | | | | | | Provide a mechanism to notify userspace that there's been a link problem on a NBD device. This will allow userspace to re-establish a connection and provide the new socket to the device without disrupting the device. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: add a reconfigure netlink commandJosef Bacik2017-04-171-0/+141
| | | | | | | | | | | | We want to be able to reconnect dead connections to existing block devices, so add a reconfigure netlink command. We will also allow users to change their timeout on the fly, but everything else will require a disconnect and reconnect. You won't be able to add more connections either, simply replace dead connections with new more lively connections. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: add a basic netlink interfaceJosef Bacik2017-04-171-26/+290
| | | | | | | | | | | | | | | | | | | | | The existing ioctl interface for configuring NBD devices is a bit cumbersome and hard to extend. The other problem is we leave a userspace app sitting in it's syscall until the device disconnects, which is less than ideal. This patch introduces a netlink interface for adding and disconnecting nbd devices. This has the benefits of being easily extendable without breaking older userspace applications, and allows us to configure a nbd device without leaving a userspace app sitting waiting for the device to disconnect. With this interface we also gain the ability to configure more devices than are preallocated at insmod time. We also have gained the ability to not specify a particular device and be provided one for us so that userspace doesn't need to find a free device to configure. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: stop using the bdev everywhereJosef Bacik2017-04-171-55/+37
| | | | | | | | | | In preparation for the upcoming netlink interface we need to not rely on already having the bdev for the NBD device we are doing operations on. Instead of passing the bdev around, just use it in places where we know we already have the bdev. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: separate out the config informationJosef Bacik2017-04-171-174/+263
| | | | | | | | | | | | | | | | | | | In order to properly refcount the various aspects of a NBD device we need to separate out the configuration elements of the nbd device. The configuration of a NBD device has a different lifetime from the actual device, so it doesn't make sense to bundle these two concepts. Add a config_refs to keep track of the configuration structure, that way we can be sure that we never access it when we've torn down the device. Add a new nbd_config structure to hold all of the transient configuration information. Finally create this when we open the device so that it is in place when we start to configure the device. This has a nice side-effect of fixing a long standing problem where you could end up with a half-configured nbd device that needed to be "disconnected" in order to be usable again. Now once we close our device the configuration will be discarded. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: handle single path failures gracefullyJosef Bacik2017-04-171-26/+125
| | | | | | | | | | | | | Currently if we have multiple connections and one of them goes down we will tear down the whole device. However there's no reason we need to do this as we could have other connections that are working fine. Deal with this by keeping track of the state of the different connections, and if we lose one we mark it as dead and send all IO destined for that socket to one of the other healthy sockets. Any outstanding requests that were on the dead socket will timeout and be re-submitted properly. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* nbd: put socket in error casesJosef Bacik2017-04-171-2/+7
| | | | | | | | | When adding a new socket we look it up and then try to add it to our configuration. If any of those steps fail we need to make sure we put the socket so we don't leak them. Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* lightnvm: fix some error code in pblk-init.cDan Carpenter2017-04-161-6/+14
| | | | | | | | | | | There were a bunch of places in pblk_lines_init() where we didn't set an error code. And in pblk_writer_init() we accidentally return 1 instead of a correct error code, which would result in a Oops later. Fixes: 11a5d6fdf919 ("lightnvm: physical block device (pblk) target") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* lightnvm: fix some WARN() messagesDan Carpenter2017-04-163-8/+8
| | | | | | | | | WARN_ON() takes a condition, not an error message. I slightly tweaked some conditions so hopefully it's more clear. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* lightnvm: pblk-gc: fix an error pointer dereference in initDan Carpenter2017-04-161-2/+2
| | | | | | | | | | These labels are reversed so we could end up dereferencing an error pointer or leaking. Fixes: 7f347ba6bb3a ("lightnvm: physical block device (pblk) target") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* lightnvm: physical block device (pblk) targetJavier González2017-04-1614-0/+8023
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces pblk, a host-side translation layer for Open-Channel SSDs to expose them like block devices. The translation layer allows data placement decisions, and I/O scheduling to be managed by the host, enabling users to optimize the SSD for their specific workloads. An open-channel SSD has a set of LUNs (parallel units) and a collection of blocks. Each block can be read in any order, but writes must be sequential. Writes may also fail, and if a block requires it, must also be reset before new writes can be applied. To manage the constraints, pblk maintains a logical to physical address (L2P) table, write cache, garbage collection logic, recovery scheme, and logic to rate-limit user I/Os versus garbage collection I/Os. The L2P table is fully-associative and manages sectors at a 4KB granularity. Pblk stores the L2P table in two places, in the out-of-band area of the media and on the last page of a line. In the cause of a power failure, pblk will perform a scan to recover the L2P table. The user data is organized into lines. A line is data striped across blocks and LUNs. The lines enable the host to reduce the amount of metadata to maintain besides the user data and makes it easier to implement RAID or erasure coding in the future. pblk implements multi-tenant support and can be instantiated multiple times on the same drive. Each instance owns a portion of the SSD - both regarding I/O bandwidth and capacity - providing I/O isolation for each case. Finally, pblk also exposes a sysfs interface that allows user-space to peek into the internals of pblk. The interface is available at /dev/block/*/pblk/ where * is the block device name exposed. This work also contains contributions from: Matias Bjørling <matias@cnexlabs.com> Simon A. F. Lund <slund@cnexlabs.com> Young Tack Jin <youngtack.jin@gmail.com> Huaicheng Li <huaicheng@cs.uchicago.edu> Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* lightnvm: convert sprintf into strlcpyJavier González2017-04-161-3/+3
| | | | | | | | | Convert sprintf calls to strlcpy in order to make possible buffer overflow more obvious. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* lightnvm: fix type checks on rrpcJavier González2017-04-161-2/+2
| | | | | | | | sector_t is always unsigned, therefore avoid < 0 checks on it. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* lightnvm: clean unused variableJavier González2017-04-161-3/+0
| | | | | | | | Clean unused variable on lightnvm core. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* lightnvm: make nvm_free staticJavier González2017-04-161-1/+1
| | | | | | | | Prefix the nvm_free static function with a missing static keyword. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* lightnvm: allow to init targets on factory modeJavier González2017-04-162-4/+13
| | | | | | | | | | | | | Target initialization has two responsibilities: creating the target partition and instantiating the target. This patch enables to create a factory partition (e.g., do not trigger recovery on the given target). This is useful for target development and for being able to restore the device state at any moment in time without requiring a full-device erase. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* lightnvm: bad type conversion for nvme control bitsJavier González2017-04-161-1/+1
| | | | | | | | | The NVMe I/O command control bits are 16 bytes, but is interpreted as 32 bytes in the lightnvm user I/O data path. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* lightnvm: fix cleanup order of disk on init errorJavier González2017-04-161-7/+7
| | | | | | | | | Reorder disk allocation such that the disk structure can be put safely. Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* lightnvm: double-clear of dev->lun_map on target init errorJavier González2017-04-161-7/+10
| | | | | | | | | | | | | | | The dev->lun_map bits are cleared twice if an target init error occurs. First in the target clean routine, and then next in the nvm_tgt_create error function. Make sure that it is only cleared once by extending nvm_remove_tgt_devi() with a clear bit, such that clearing of bits can ignored when cleaning up a successful initialized target. Signed-off-by: Javier González <javier@cnexlabs.com> Fix style. Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* lightnvm: don't check for failure from mempool_alloc()NeilBrown2017-04-161-9/+0
| | | | | | | | | | | | mempool_alloc() cannot fail if the gfp flags allow it to sleep, and both GFP_KERNEL and GFP_NOIO allows for sleeping. So rrpc_move_valid_pages() and rrpc_make_rq() don't need to test the return value. Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* lightnvm: enable nvme size compile assertsMatias Bjørling2017-04-161-2/+4
| | | | | | | | | | | The asserts in _nvme_nvm_check_size are not compiled due to the function not begin called. Make sure that it is called, and also fix the wrong sizes of asserts for nvme_nvm_addr_format, and nvme_nvm_bb_tbl, which checked for number of bits instead of bytes. Reported-by: Scott Bauer <scott.bauer@intel.com> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* lightnvm: free reverse device mapJavier González2017-04-161-1/+13
| | | | | | | | Free the reverse mapping table correctly on target tear down Signed-off-by: Javier González <javier@cnexlabs.com> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* lightnvm: submit erases using the I/O pathJavier González2017-04-163-45/+44
| | | | | | | | | | | | | | | Until now erases have been submitted as synchronous commands through a dedicated erase function. In order to enable targets implementing asynchronous erases, refactor the erase path so that it uses the normal async I/O submission functions. If a target requires sync I/O, it can implement it internally. Also, adapt rrpc to use the new erase path. Signed-off-by: Javier González <javier@cnexlabs.com> Fixed spelling error. Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* nvme/lightnvm: Prevent small buffer overflow in nvme_nvm_identifyScott Bauer2017-04-161-1/+1
| | | | | | | | | | | | | | | | | | | | There are two closely named structs in lightnvm: struct nvme_nvm_addr_format and struct nvme_addr_format. The first struct has 4 reserved bytes at the end, the second does not. (gdb) p sizeof(struct nvme_nvm_addr_format) $1 = 16 (gdb) p sizeof(struct nvm_addr_format) $2 = 12 In the nvme_nvm_identify function we memcpy from the larger struct to the smaller struct. We incorrectly pass the length of the larger struct and overflow by 4 bytes, lets not do that. Signed-off-by: Scott Bauer <scott.bauer@intel.com> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* lightnvm: Fix error handlingChristophe JAILLET2017-04-161-2/+4
| | | | | | | | | According to error handling in this function, it is likely that going to 'out' was expected here. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Matias Bjørling <matias@cnexlabs.com> Signed-off-by: Jens Axboe <axboe@fb.com>
* remove the mg_disk driverChristoph Hellwig2017-04-143-1128/+0
| | | | | | | | | | | | This drivers was added in 2008, but as far as a I can tell we never had a single platform that actually registered resources for the platform driver. It's also been unmaintained for a long time and apparently has a ATA mode that can be driven using the IDE/libata subsystem. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>