summaryrefslogtreecommitdiffstats
path: root/drivers/vhost (follow)
Commit message (Collapse)AuthorAgeFilesLines
* vhost/scsi: Fix incorrect usage of get_user_pages_fast write parameterNicholas Bellinger2013-10-251-1/+1
| | | | | | | | | | | | | | | | | | | | | This patch addresses a long-standing bug where the get_user_pages_fast() write parameter used for setting the underlying page table entry permission bits was incorrectly set to write=1 for data_direction=DMA_TO_DEVICE, and passed into get_user_pages_fast() via vhost_scsi_map_iov_to_sgl(). However, this parameter is intended to signal WRITEs to pinned userspace PTEs for the virtio-scsi DMA_FROM_DEVICE -> READ payload case, and *not* for the virtio-scsi DMA_TO_DEVICE -> WRITE payload case. This bug would manifest itself as random process segmentation faults on KVM host after repeated vhost starts + stops and/or with lots of vhost endpoints + LUNs. Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Asias He <asias@redhat.com> Cc: <stable@vger.kernel.org> # 3.6+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* vhost/scsi: Use GFP_ATOMIC with percpu_ida_alloc for obtaining tagNicholas Bellinger2013-10-021-1/+6
| | | | | | | | | | | | | | | Fix GFP_KERNEL -> GFP_ATOMIC usage of percpu_ida_alloc() within vhost_scsi_get_tag(), as this code is expected to be called directly from interrupt context. v2 changes: - Handle possible tag < 0 failure with GFP_ATOMIC Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Asias He <asias@redhat.com> Cc: Kent Overstreet <kmo@daterainc.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* vhost-scsi: whitespace tweakMichael S. Tsirkin2013-09-171-1/+1
| | | | | | Remove space at start of line that sneaked in. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vhost/scsi: use vmalloc for order-10 allocationMichael S. Tsirkin2013-09-171-14/+27
| | | | | | | | | | | | | As vhost scsi device struct is large, if the device is created on a busy system, kzalloc() might fail, so this patch does a fallback to vzalloc(). As vmalloc() adds overhead on data-path, add __GFP_REPEAT to kzalloc() flags to do this fallback only when really needed. Reviewed-by: Asias He <asias@redhat.com> Reported-by: Dan Aloni <alonid@stratoscale.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* vhost: wake up worker outside spin_lockQin Chuanyu2013-09-171-1/+3
| | | | | | | | | | | | | | | | | the wake_up_process func is included by spin_lock/unlock in vhost_work_queue, but it could be done outside the spin_lock. I have test it with kernel 3.0.27 and guest suse11-sp2 using iperf, the num as below. original modified thread_num tp(Gbps) vhost(%) | tp(Gbps) vhost(%) 1 9.59 28.82 | 9.59 27.49 8 9.61 32.92 | 9.62 26.77 64 9.58 46.48 | 9.55 38.99 256 9.6 63.7 | 9.6 52.59 Signed-off-by: Chuanyu Qin <qinchuanyu@huawei.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* Merge branch 'for-next' of ↵Linus Torvalds2013-09-131-33/+103
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "Lots of activity again this round for I/O performance optimizations (per-cpu IDA pre-allocation for vhost + iscsi/target), and the addition of new fabric independent features to target-core (COMPARE_AND_WRITE + EXTENDED_COPY). The main highlights include: - Support for iscsi-target login multiplexing across individual network portals - Generic Per-cpu IDA logic (kent + akpm + clameter) - Conversion of vhost to use per-cpu IDA pre-allocation for descriptors, SGLs and userspace page pointer list - Conversion of iscsi-target + iser-target to use per-cpu IDA pre-allocation for descriptors - Add support for generic COMPARE_AND_WRITE (AtomicTestandSet) emulation for virtual backend drivers - Add support for generic EXTENDED_COPY (CopyOffload) emulation for virtual backend drivers. - Add support for fast memory registration mode to iser-target (Vu) The patches to add COMPARE_AND_WRITE and EXTENDED_COPY support are of particular significance, which make us the first and only open source target to support the full set of VAAI primitives. Currently Linux clients are lacking upstream support to actually utilize these primitives. However, with server side support now in place for folks like MKP + ZAB working on the client, this logic once reserved for the highest end of storage arrays, can now be run in VMs on their laptops" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (50 commits) target/iscsi: Bump versions to v4.1.0 target: Update copyright ownership/year information to 2013 iscsi-target: Bump default TCP listen backlog to 256 target: Fix >= v3.9+ regression in PR APTPL + ALUA metadata write-out iscsi-target; Bump default CmdSN Depth to 64 iscsi-target: Remove unnecessary wait_for_completion in iscsi_get_thread_set iscsi-target: Add thread_set->ts_activate_sem + use common deallocate iscsi-target: Fix race with thread_pre_handler flush_signals + ISCSI_THREAD_SET_DIE target: remove unused including <linux/version.h> iser-target: introduce fast memory registration mode (FRWR) iser-target: generalize rdma memory registration and cleanup iser-target: move rdma wr processing to a shared function target: Enable global EXTENDED_COPY setup/release target: Add Third Party Copy (3PC) bit in INQUIRY response target: Enable EXTENDED_COPY setup in spc_parse_cdb target: Add support for EXTENDED_COPY copy offload emulation target: Avoid non-existent tg_pt_gp_mem in target_alua_state_check target: Add global device list for EXTENDED_COPY target: Make helpers non static for EXTENDED_COPY command setup target: Make spc_parse_naa_6h_vendor_specific non static ...
| * target: Update copyright ownership/year information to 2013Nicholas Bellinger2013-09-111-2/+2
| | | | | | | | | | | | | | Update copyright ownership/year information for target-core, loopback, iscsi-target, tcm_qla2xx, vhost and iser-target. Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * vhost/scsi: Add pre-allocation for tv_cmd SGL + upages memoryNicholas Bellinger2013-09-091-19/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for pre-allocation of per tv_cmd descriptor scatterlist + user-space page pointer memory using se_sess->sess_cmd_map within tcm_vhost_make_nexus() code. This includes sanity checks within vhost_scsi_map_to_sgl() to reject I/O that exceeds these initial hardcoded values, and the necessary cleanup in tcm_vhost_make_nexus() failure path + tcm_vhost_drop_nexus(). v3 changes: - Rebase to v3.11-rc5 code Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Asias He <asias@redhat.com> Cc: Kent Overstreet <kmo@daterainc.com> Reviewed-by: Asias He <asias@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * vhost/scsi: Convert to per-cpu ida_alloc + ida_free command mapNicholas Bellinger2013-09-091-12/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch changes vhost/scsi to use transport_init_session_tags() pre-allocation logic for per-cpu session tag pooling with internal ida_alloc() + ida_free() calls based upon the saved se_cmd->map_tag id. FIXME: Make transport_init_session_tags() number of tags setup configurable per vring client setting via configfs v5 changes: - Convert to percpu_ida.h include v3 changes: - Update to percpu-ida usage - Rebase to v3.11-rc5 code Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Asias He <asias@redhat.com> Cc: Kent Overstreet <kmo@daterainc.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* | vhost_net: correctly limit the max pending buffersJason Wang2013-09-041-11/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | As Michael point out, We used to limit the max pending DMAs to get better cache utilization. But it was not done correctly since it was one done when there's no new buffers submitted from guest. Guest can easily exceeds the limitation by keeping sending packets. So this patch moves the check into main loop. Tests shows about 5%-10% improvement on per cpu throughput for guest tx. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vhost_net: poll vhost queue after marking DMA is doneJason Wang2013-09-041-4/+5
| | | | | | | | | | | | | | | | | | We used to poll vhost queue before making DMA is done, this is racy if vhost thread were waked up before marking DMA is done which can result the signal to be missed. Fix this by always polling the vhost thread before DMA is done. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vhost_net: determine whether or not to use zerocopy at one timeJason Wang2013-09-041-27/+20
| | | | | | | | | | | | | | | | | | | | Currently, even if the packet length is smaller than VHOST_GOODCOPY_LEN, if upend_idx != done_idx we still set zcopy_used to true and rollback this choice later. This could be avoided by determining zerocopy once by checking all conditions at one time before. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vhost: switch to use vhost_add_used_n()Jason Wang2013-09-041-42/+12
| | | | | | | | | | | | | | | | | | Let vhost_add_used() to use vhost_add_used_n() to reduce the code duplication. To avoid the overhead brought by __copy_to_user(). We will use put_user() when one used need to be added. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vhost_net: use vhost_add_used_and_signal_n() in vhost_zerocopy_signal_used()Jason Wang2013-09-041-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | We tend to batch the used adding and signaling in vhost_zerocopy_callback() which may result more than 100 used buffers to be updated in vhost_zerocopy_signal_used() in some cases. So switch to use vhost_add_used_and_signal_n() to avoid multiple calls to vhost_add_used_and_signal(). Which means much less times of used index updating and memory barriers. 2% performance improvement were seen on netperf TCP_RR test. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vhost_net: make vhost_zerocopy_signal_used() return voidJason Wang2013-09-041-3/+2
| | | | | | | | | | | | | | None of its caller use its return value, so let it return void. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vhost: Include linux/uio.h instead of linux/socket.hAsias He2013-08-211-1/+1
|/ | | | | | | | | | | memcpy_fromiovec is moved from net/core/iovec.c to lib/iovec.c. linux/uio.h provides the declaration for memcpy_fromiovec. Include linux/uio.h instead of inux/socket.h for it. Signed-off-by: Asias He <asias@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2013-07-234-44/+26
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull vhost fixes from Michael Tsirkin: "vhost: more fixes for 3.11 This includes some fixes for vhost net and scsi drivers. The test module has already been reworked to avoid rcu usage, but the necessary core changes are missing, we fixed this. Unlikely to affect any real-world users, but it's early in the cycle so, let's merge them" (It was earlier when Michael originally sent the email, but it somehot got missed in the flood, so here it is after -rc2) * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost: Remove custom vhost rcu usage vhost-scsi: Always access vq->private_data under vq mutex vhost-net: Always access vq->private_data under vq mutex
| * vhost: Remove custom vhost rcu usageAsias He2013-07-114-26/+12
| | | | | | | | | | | | | | | | Now, vq->private_data is always accessed under vq mutex. No need to play the vhost rcu trick. Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost-scsi: Always access vq->private_data under vq mutexAsias He2013-07-111-7/+4
| | | | | | | | | | Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost-net: Always access vq->private_data under vq mutexAsias He2013-07-111-11/+10
| | | | | | | | | | Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | Merge branch 'for-next' of ↵Linus Torvalds2013-07-111-15/+22
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "Lots of activity this round on performance improvements in target-core while benchmarking the prototype scsi-mq initiator code with vhost-scsi fabric ports, along with a number of iscsi/iser-target improvements and hardening fixes for exception path cases post v3.10 merge. The highlights include: - Make persistent reservations APTPL buffer allocated on-demand, and drop per t10_reservation buffer. (grover) - Make virtual LUN=0 a NULLIO device, and skip allocation of NULLIO device pages (grover) - Add transport_cmd_check_stop write_pending bit to avoid extra access of ->t_state_lock is WRITE I/O submission fast-path. (nab) - Drop unnecessary CMD_T_DEV_ACTIVE check from transport_lun_remove_cmd to avoid extra access of ->t_state_lock in release fast-path. (nab) - Avoid extra t_state_lock access in __target_execute_cmd fast-path (nab) - Drop unnecessary vhost-scsi wait_for_tasks=true usage + ->t_state_lock access in release fast-path. (nab) - Convert vhost-scsi to use modern se_cmd->cmd_kref TARGET_SCF_ACK_KREF usage (nab) - Add tracepoints for SCSI commands being processed (roland) - Refactoring of iscsi-target handling of ISCSI_OP_NOOP + ISCSI_OP_TEXT to be transport independent (nab) - Add iscsi-target SendTargets=$IQN support for in-band discovery (nab) - Add iser-target support for in-band discovery (nab + Or) - Add iscsi-target demo-mode TPG authentication context support (nab) - Fix isert_put_reject payload buffer post (nab) - Fix iscsit_add_reject* usage for iser (nab) - Fix iscsit_sequence_cmd reject handling for iser (nab) - Fix ISCSI_OP_SCSI_TMFUNC handling for iser (nab) - Fix session reset bug with RDMA_CM_EVENT_DISCONNECTED (nab) The last five iscsi/iser-target items are CC'ed to stable, as they do address issues present in v3.10 code. They are certainly larger than I'd like for stable patch set, but are important to ensure proper REJECT exception handling in iser-target for 3.10.y" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (51 commits) iser-target: Ignore non TEXT + LOGOUT opcodes for discovery target: make queue_tm_rsp() return void target: remove unused codes from enum tcm_tmrsp_table iscsi-target: kstrtou* configfs attribute parameter cleanups iscsi-target: Fix tfc_tpg_auth_cit configfs length overflow iscsi-target: Fix tfc_tpg_nacl_auth_cit configfs length overflow iser-target: Add support for ISCSI_OP_TEXT opcode + payload handling iser-target: Rename sense_buf_[dma,len] to pdu_[dma,len] iser-target: Add vendor_err debug output target: Add (obsolete) checking for PMI/LBA fields in READ CAPACITY(10) target: Return correct sense data for IO past the end of a device target: Add tracepoints for SCSI commands being processed iser-target: Fix session reset bug with RDMA_CM_EVENT_DISCONNECTED iscsi-target: Fix ISCSI_OP_SCSI_TMFUNC handling for iser iscsi-target: Fix iscsit_sequence_cmd reject handling for iser iscsi-target: Fix iscsit_add_reject* usage for iser iser-target: Fix isert_put_reject payload buffer post iscsi-target: missing kfree() on error path iscsi-target: Drop left-over iscsi_conn->bad_hdr target: Make core_scsi3_update_and_write_aptpl return sense_reason_t ...
| * target: make queue_tm_rsp() return voidJoern Engel2013-07-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | The return value wasn't checked by any of the callers. Assuming this is correct behaviour, we can simplify some code by not bothering to generate it. nab: Add srpt_queue_data_in() + srpt_queue_tm_rsp() nops around srpt_queue_response() void return Signed-off-by: Joern Engel <joern@logfs.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * vhost/scsi: Convert to se_cmd->cmd_kref TARGET_SCF_ACK_KREF usageNicholas Bellinger2013-06-201-12/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch coverts vhost/scsi to se_cmd->cmd_kref TARGET_SCF_ACK_KREF usage, instead of assuming that vhost_scsi_free_cmd() is always called before TCM processing is completed in the response fast path. This includes adding vhost_scsi_check_stop_free() -> target_put_sess_cmd() to perform the second se_cmd->cmd_kref put, and moving vhost_scsi_free_cmd() resource release into tcm_vhost_release_cmd() that is invoked once the last se_cmd->cmd_kref put occurs. Cc: Christoph Hellwig <hch@lst.de> Cc: Roland Dreier <roland@kernel.org> Cc: Kent Overstreet <koverstreet@google.com> Cc: Asias He <asias@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: Moussa Ba <moussaba@micron.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * vhost/scsi: Drop unnecessary wait_for_tasks=true usage with ↵Nicholas Bellinger2013-06-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | transport_generic_free_cmd This patch changes vhost_scsi_free_cmd() to call transport_generic_free_cmd() with wait_for_tasks=false in order to avoid the extra se_cmd->t_state_lock access for the wait_for_tasks=true case. This is unnecessary because vhost_scsi_free_cmd() is only ever called by vhost_scsi_complete_cmd_work() after TCM completion handoff, and by vhost_scsi_handle_vq() exception code before TCM submission handoff, so there is never a case where se_cmd is still active from TCM's perspective when transport_generic_free_cmd() is called. Cc: Christoph Hellwig <hch@lst.de> Cc: Roland Dreier <roland@kernel.org> Cc: Kent Overstreet <koverstreet@google.com> Cc: Asias He <asias@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: Moussa Ba <moussaba@micron.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* | vhost/test: update test after vhost cleanupsMichael S. Tsirkin2013-07-071-11/+20
| | | | | | | | | | | | | | We changed the type of dev.vqs field, update test module accordingly. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | vhost: Make local function staticAsias He2013-07-072-3/+3
| | | | | | | | | | | | | | | | | | | | | | $ make C=1 M=drivers/vhost drivers/vhost/net.c:168:5: warning: symbol 'vhost_net_set_ubuf_info' was not declared. Should it be static? drivers/vhost/net.c:194:6: warning: symbol 'vhost_net_vq_reset' was not declared. Should it be static? drivers/vhost/scsi.c:219:6: warning: symbol 'tcm_vhost_done_inflight' was not declared. Should it be static? Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | vhost: Make vhost a separate moduleAsias He2013-07-076-4/+63
| | | | | | | | | | | | | | | | | | | | | | | | Currently, vhost-net and vhost-scsi are sharing the vhost core code. However, vhost-scsi shares the code by including the vhost.c file directly. Making vhost a separate module makes it is easier to share code with other vhost devices. Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | vhost-scsi: Rename struct tcm_vhost_cmd *tv_cmd to *cmdAsias He2013-07-071-71/+71
| | | | | | | | | | | | | | | | This way, we use cmd for struct tcm_vhost_cmd and evt for struct tcm_vhost_cmd. Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | vhost-scsi: Rename struct tcm_vhost_tpg *tv_tpg to *tpgAsias He2013-07-071-61/+61
| | | | | | | | | | Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | vhost-scsi: Make func indention more consistentAsias He2013-07-071-66/+88
| | | | | | | | | | Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | vhost-scsi: Rename struct vhost_scsi *s to *vsAsias He2013-07-071-28/+28
| | | | | | | | | | | | | | vs is used everywhere, make the naming more consistent. Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | vhost-scsi: Remove unnecessary forward struct vhost_scsi declarationAsias He2013-07-071-1/+0
| | | | | | | | | | | | | | It was needed when struct tcm_vhost_tpg is in tcm_vhost.h Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | vhost: Simplify dev->vqs[i] accessAsias He2013-07-071-17/+18
| | | | | | | | | | Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | vhost-net: fix use-after-free in vhost_net_flushMichael S. Tsirkin2013-07-071-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vhost_net_ubuf_put_and_wait has a confusing name: it will actually also free it's argument. Thus since commit 1280c27f8e29acf4af2da914e80ec27c3dbd5c01 "vhost-net: flush outstanding DMAs on memory change" vhost_net_flush tries to use the argument after passing it to vhost_net_ubuf_put_and_wait, this results in use after free. To fix, don't free the argument in vhost_net_ubuf_put_and_wait, add an new API for callers that want to free ubufs. Acked-by: Asias He <asias@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | vhost: fix ubuf_info cleanupMichael S. Tsirkin2013-06-111-15/+7
| | | | | | | | | | | | | | | | | | | | | | vhost_net_clear_ubuf_info didn't clear ubuf_info after kfree, this could trigger double free. Fix this and simplify this code to make it more robust: make sure ubuf info is always freed through vhost_net_clear_ubuf_info. Reported-by: Tommi Rantala <tt.rantala@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vhost: check owner before we overwrite ubuf_infoMichael S. Tsirkin2013-06-113-1/+12
| | | | | | | | | | | | | | | | If device has an owner, we shouldn't touch ubuf_info since it might be in use. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vhost_net: clear msg.control for non-zerocopy case during txJason Wang2013-06-101-1/+2
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we decide not use zero-copy, msg.control should be set to NULL otherwise macvtap/tap may set zerocopy callbacks which may decrease the kref of ubufs wrongly. Bug were introduced by commit cedb9bdce099206290a2bdd02ce47a7b253b6a84 (vhost-net: skip head management if no outstanding). This solves the following warnings: WARNING: at include/linux/kref.h:47 handle_tx+0x477/0x4b0 [vhost_net]() Modules linked in: vhost_net macvtap macvlan tun nfsd exportfs bridge stp llc openvswitch kvm_amd kvm bnx2 megaraid_sas [last unloaded: tun] CPU: 5 PID: 8670 Comm: vhost-8668 Not tainted 3.10.0-rc2+ #1566 Hardware name: Dell Inc. PowerEdge R715/00XHKG, BIOS 1.5.2 04/19/2011 ffffffffa0198323 ffff88007c9ebd08 ffffffff81796b73 ffff88007c9ebd48 ffffffff8103d66b 000000007b773e20 ffff8800779f0000 ffff8800779f43f0 ffff8800779f8418 000000000000015c 0000000000000062 ffff88007c9ebd58 Call Trace: [<ffffffff81796b73>] dump_stack+0x19/0x1e [<ffffffff8103d66b>] warn_slowpath_common+0x6b/0xa0 [<ffffffff8103d6b5>] warn_slowpath_null+0x15/0x20 [<ffffffffa0197627>] handle_tx+0x477/0x4b0 [vhost_net] [<ffffffffa0197690>] handle_tx_kick+0x10/0x20 [vhost_net] [<ffffffffa019541e>] vhost_worker+0xfe/0x1a0 [vhost_net] [<ffffffffa0195320>] ? vhost_attach_cgroups_work+0x30/0x30 [vhost_net] [<ffffffffa0195320>] ? vhost_attach_cgroups_work+0x30/0x30 [vhost_net] [<ffffffff81061f46>] kthread+0xc6/0xd0 [<ffffffff81061e80>] ? kthread_freezable_should_stop+0x70/0x70 [<ffffffff817a1aec>] ret_from_fork+0x7c/0xb0 [<ffffffff81061e80>] ? kthread_freezable_should_stop+0x70/0x70 Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Add missing module license tag to vring helpers.Dave Jones2013-05-081-0/+3
| | | | | | | [ 624.286653] vringh: module license 'unspecified' taints kernel. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds2013-05-074-51/+71
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull more vhost fixes from Michael Tsirkin: "This fixes some minor issues in the patches that have been merged. We also finally drop the workaround disabling event_idx for scsi: it was always questionable, and now we know it's not needed. There's also a memory leak fix" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vhost-scsi: Enable VIRTIO_RING_F_EVENT_IDX vhost: drop virtio_net.h dependency vhost-net: Cleanup vhost_ubuf and vhost_zcopy vhost: Remove vhost_enable_zcopy in vhost.h vhost: Remove comments for hdr in vhost.h vhost: Move VHOST_NET_FEATURES to net.c vhost-net: Free ubuf when vhost_dev_set_owner fails vhost: Export vhost_dev_set_owner
| * vhost-scsi: Enable VIRTIO_RING_F_EVENT_IDXAsias He2013-05-071-7/+1
| | | | | | | | | | | | | | | | | | | | | | | | It was disabled as a workaround. Now userspace bits work fine with it. The broken version was not ever committed to QEMU, I guess the same is true for nlkt. So, let's enable it. Signed-off-by: Asias He <asias@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: drop virtio_net.h dependencyMichael S. Tsirkin2013-05-061-1/+1
| | | | | | | | | | | | | | There's no net specific code in vhost.c anymore, don't include the virtio_net.h header. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost-net: Cleanup vhost_ubuf and vhost_zcopyAsias He2013-05-061-28/+30
| | | | | | | | | | | | | | | | | | - Rename vhost_ubuf to vhost_net_ubuf - Rename vhost_zcopy_mask to vhost_net_zcopy_mask - Make funcs static Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: Remove vhost_enable_zcopy in vhost.hAsias He2013-05-061-3/+0
| | | | | | | | | | | | | | It is net.c specific. Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: Remove comments for hdr in vhost.hAsias He2013-05-061-3/+0
| | | | | | | | | | | | | | It is supposed to be removed when hdr is moved into vhost_net_virtqueue. Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: Move VHOST_NET_FEATURES to net.cAsias He2013-05-062-3/+6
| | | | | | | | | | | | | | | | vhost.h should not depend on device specific marcos like VHOST_NET_F_VIRTIO_NET_HDR and VIRTIO_NET_F_MRG_RXBUF. Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost-net: Free ubuf when vhost_dev_set_owner failsAsias He2013-05-061-6/+32
| | | | | | | | | | Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: Export vhost_dev_set_ownerAsias He2013-05-062-1/+2
| | | | | | | | | | Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | Merge tag 'virtio-next-for-linus' of ↵Linus Torvalds2013-05-024-1/+1020
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull virtio & lguest updates from Rusty Russell: "Lots of virtio work which wasn't quite ready for last merge window. Plus I dived into lguest again, reworking the pagetable code so we can move the switcher page: our fixmaps sometimes take more than 2MB now..." Ugh. Annoying conflicts with the tcm_vhost -> vhost_scsi rename. Hopefully correctly resolved. * tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (57 commits) caif_virtio: Remove bouncing email addresses lguest: improve code readability in lg_cpu_start. virtio-net: fill only rx queues which are being used lguest: map Switcher below fixmap. lguest: cache last cpu we ran on. lguest: map Switcher text whenever we allocate a new pagetable. lguest: don't share Switcher PTE pages between guests. lguest: expost switcher_pages array (as lg_switcher_pages). lguest: extract shadow PTE walking / allocating. lguest: make check_gpte et. al return bool. lguest: assume Switcher text is a single page. lguest: rename switcher_page to switcher_pages. lguest: remove RESERVE_MEM constant. lguest: check vaddr not pgd for Switcher protection. lguest: prepare to make SWITCHER_ADDR a variable. virtio: console: replace EMFILE with EBUSY for already-open port virtio-scsi: reset virtqueue affinity when doing cpu hotplug virtio-scsi: introduce multiqueue support virtio-scsi: push vq lock/unlock into virtscsi_vq_done virtio-scsi: pass struct virtio_scsi to virtqueue completion function ...
| * vringh: host-side implementation of virtio rings.Rusty Russell2013-03-204-0/+1018
| | | | | | | | | | | | | | | | | | | | | | | | Getting use of virtio rings correct is tricky, and a recent patch saw an implementation of in-kernel rings (as separate from userspace). This abstracts the business of dealing with the virtio ring layout from the access (userspace or direct); to do this, we use function pointers, which gcc inlines correctly. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Michael S. Tsirkin <mst@redhat.com>
| * tools/virtio: fix build for 3.8Michael S. Tsirkin2013-03-201-1/+3
| | | | | | | | | | Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>