summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* smb: client: fix mount when dns_resolver key is not availablePaulo Alcantara2023-11-093-7/+29
| | | | | | | | | | | | | | | | | | | There was a wrong assumption that with CONFIG_CIFS_DFS_UPCALL=y there would always be a dns_resolver key set up so we could unconditionally upcall to resolve UNC hostname rather than using the value provided by mount(2). Only require it when performing automount of junctions within a DFS share so users that don't have dns_resolver key still can mount their regular shares with server hostname resolved by mount.cifs(8). Fixes: 348a04a8d113 ("smb: client: get rid of dfs code dep in namespace.c") Cc: stable@vger.kernel.org Tested-by: Eduard Bachmakov <e.bachmakov@gmail.com> Reported-by: Eduard Bachmakov <e.bachmakov@gmail.com> Closes: https://lore.kernel.org/all/CADCRUiNvZuiUZ0VGZZO9HRyPyw6x92kiA7o7Q4tsX5FkZqUkKg@mail.gmail.com/ Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
* smb3: fix caching of ctime on setxattrSteve French2023-11-091-1/+4
| | | | | | | | | | Fixes xfstest generic/728 which had been failing due to incorrect ctime after setxattr and removexattr Update ctime on successful set of xattr Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
* smb3: minor cleanup of session handling codeSteve French2023-11-091-6/+12
| | | | | | | Minor cleanup of style issues found by checkpatch Reviewed-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
* cifs: reconnect work should have reference on server structShyam Prasad N2023-11-092-16/+34
| | | | | | | | | | | | | | | | | | | | The delayed work for reconnect takes server struct as a parameter. But it does so without holding a ref to it. Normally, this may not show a problem as the reconnect work is only cancelled on umount. However, since we now plan to support scaling down of channels, and the scale down can happen from reconnect work itself, we need to fix it. This change takes a reference on the server struct before it is passed to the delayed work. And drops the reference in the delayed work itself. Or if the delayed work is successfully cancelled, by the process that cancels it. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
* cifs: do not pass cifs_sb when trying to add channelsShyam Prasad N2023-11-093-8/+8
| | | | | | | | | | | | | | | The only reason why cifs_sb gets passed today to cifs_try_adding_channels is to pass the local_nls field for the new channels and binding session. However, the ses struct already has local_nls field that is setup during the first cifs_setup_session. So there is no need to pass cifs_sb. This change removes cifs_sb from the arg list for this and the functions that it calls and uses ses->local_nls instead. Cc: stable@vger.kernel.org Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
* cifs: account for primary channel in the interface listShyam Prasad N2023-11-092-0/+34
| | | | | | | | | | | | The refcounting of server interfaces should account for the primary channel too. Although this is not strictly necessary, doing so will account for the primary channel in DebugData. Cc: stable@vger.kernel.org Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
* cifs: distribute channels across interfaces based on speedShyam Prasad N2023-11-093-14/+88
| | | | | | | | | | | | | | | | Today, if the server interfaces RSS capable, we simply choose the fastest interface to setup a channel. This is not a scalable approach, and does not make a lot of attempt to distribute the connections. This change does a weighted distribution of channels across all the available server interfaces, where the weight is a function of the advertised interface speed. Also make sure that we don't mix rdma and non-rdma for channels. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
* cifs: handle cases where a channel is closedShyam Prasad N2023-11-096-7/+43
| | | | | | | | | | | | | | | So far, SMB multichannel could only scale up, but not scale down the number of channels. In this series of patch, we now allow the client to deal with the case of multichannel disabled on the server when the share is mounted. With that change, we now need the ability to scale down the channels. This change allows the client to deal with cases of missing channels more gracefully. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
* smb3: more minor cleanups for session handling routinesSteve French2023-11-091-10/+15
| | | | | | | Some trivial cleanup pointed out by checkpatch Reviewed-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
* smb3: minor RDMA cleanupSteve French2023-11-091-2/+2
| | | | | | | | Some minor smbdirect debug cleanup spotted by checkpatch Cc: Long Li <longli@microsoft.com> Reviewed-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
* cifs: Fix encryption of cleared, but unset rq_iter data buffersDavid Howells2023-11-081-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Each smb_rqst struct contains two things: an array of kvecs (rq_iov) that contains the protocol data for an RPC op and an iterator (rq_iter) that contains the data payload of an RPC op. When an smb_rqst is allocated rq_iter is it always cleared, but we don't set it up unless we're going to use it. The functions that determines the size of the ciphertext buffer that will be needed to encrypt a request, cifs_get_num_sgs(), assumes that rq_iter is always initialised - and employs user_backed_iter() to check that the iterator isn't user-backed. This used to incidentally work, because ->user_backed was set to false because the iterator has never been initialised, but with commit f1b4cb650b9a0eeba206d8f069fcdc532bfbcd74[1] which changes user_backed_iter() to determine this based on the iterator type insted, a warning is now emitted: WARNING: CPU: 7 PID: 4584 at fs/smb/client/cifsglob.h:2165 smb2_get_aead_req+0x3fc/0x420 [cifs] ... RIP: 0010:smb2_get_aead_req+0x3fc/0x420 [cifs] ... crypt_message+0x33e/0x550 [cifs] smb3_init_transform_rq+0x27d/0x3f0 [cifs] smb_send_rqst+0xc7/0x160 [cifs] compound_send_recv+0x3ca/0x9f0 [cifs] cifs_send_recv+0x25/0x30 [cifs] SMB2_tcon+0x38a/0x820 [cifs] cifs_get_smb_ses+0x69c/0xee0 [cifs] cifs_mount_get_session+0x76/0x1d0 [cifs] dfs_mount_share+0x74/0x9d0 [cifs] cifs_mount+0x6e/0x2e0 [cifs] cifs_smb3_do_mount+0x143/0x300 [cifs] smb3_get_tree+0x15e/0x290 [cifs] vfs_get_tree+0x2d/0xe0 do_new_mount+0x124/0x340 __se_sys_mount+0x143/0x1a0 The problem is that rq_iter was never set, so the type is 0 (ie. ITER_UBUF) which causes user_backed_iter() to return true. The code doesn't malfunction because it checks the size of the iterator - which is 0. Fix cifs_get_num_sgs() to ignore rq_iter if its count is 0, thereby bypassing the warnings. It might be better to explicitly initialise rq_iter to a zero-length ITER_BVEC, say, as it can always be reinitialised later. Fixes: d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list") Reported-by: Damian Tometzki <damian@riscv-rocks.de> Closes: https://lore.kernel.org/r/ZUfQo47uo0p2ZsYg@fedora.fritz.box/ Tested-by: Damian Tometzki <damian@riscv-rocks.de> Cc: stable@vger.kernel.org cc: Eric Biggers <ebiggers@kernel.org> cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f1b4cb650b9a0eeba206d8f069fcdc532bfbcd74 [1] Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com>
* Merge tag '6.7-rc-smb3-client-fixes-part1' of ↵Linus Torvalds2023-11-0414-90/+138
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.samba.org/sfrench/cifs-2.6 Pull smb client updates from Steve French: - use after free fixes and deadlock fix - symlink timestamp fix - hashing perf improvement - multichannel fixes - minor debugging improvements - fix creating fifos when using "sfu" mounts - NTLMSSP authentication improvement - minor fixes to include some missing create flags and structures from recently updated protocol documentation * tag '6.7-rc-smb3-client-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6: cifs: force interface update before a fresh session setup cifs: do not reset chan_max if multichannel is not supported at mount cifs: reconnect helper should set reconnect for the right channel smb: client: fix use-after-free in smb2_query_info_compound() smb: client: remove extra @chan_count check in __cifs_put_smb_ses() cifs: add xid to query server interface call cifs: print server capabilities in DebugData smb: use crypto_shash_digest() in symlink_hash() smb: client: fix use-after-free bug in cifs_debug_data_proc_show() smb: client: fix potential deadlock when releasing mids smb3: fix creating FIFOs when mounting with "sfu" mount option Add definition for new smb3.1.1 command type SMB3: clarify some of the unused CreateOption flags cifs: Add client version details to NTLM authenticate message smb3: fix touch -h of symlink
| * cifs: force interface update before a fresh session setupShyam Prasad N2023-11-021-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | During a session reconnect, it is possible that the server moved to another physical server (happens in case of Azure files). So at this time, force a query of server interfaces again (in case of multichannel session), such that the secondary channels connect to the right IP addresses (possibly updated now). Cc: stable@vger.kernel.org Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * cifs: do not reset chan_max if multichannel is not supported at mountShyam Prasad N2023-11-021-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the mount command has specified multichannel as a mount option, but multichannel is found to be unsupported by the server at the time of mount, we set chan_max to 1. Which means that the user needs to remount the share if the server starts supporting multichannel. This change removes this reset. What it means is that if the user specified multichannel or max_channels during mount, and at this time, multichannel is not supported, but the server starts supporting it at a later point, the client will be capable of scaling out the number of channels. Cc: stable@vger.kernel.org Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * cifs: reconnect helper should set reconnect for the right channelShyam Prasad N2023-11-021-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We introduced a helper function to be used by non-cifsd threads to mark the connection for reconnect. For multichannel, when only a particular channel needs to be reconnected, this had a bug. This change fixes that by marking that particular channel for reconnect. Fixes: dca65818c80c ("cifs: use a different reconnect helper for non-cifsd threads") Cc: stable@vger.kernel.org Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * smb: client: fix use-after-free in smb2_query_info_compound()Paulo Alcantara2023-11-021-35/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following UAF was triggered when running fstests generic/072 with KASAN enabled against Windows Server 2022 and mount options 'multichannel,max_channels=2,vers=3.1.1,mfsymlinks,noperm' BUG: KASAN: slab-use-after-free in smb2_query_info_compound+0x423/0x6d0 [cifs] Read of size 8 at addr ffff888014941048 by task xfs_io/27534 CPU: 0 PID: 27534 Comm: xfs_io Not tainted 6.6.0-rc7 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014 Call Trace: dump_stack_lvl+0x4a/0x80 print_report+0xcf/0x650 ? srso_alias_return_thunk+0x5/0x7f ? srso_alias_return_thunk+0x5/0x7f ? __phys_addr+0x46/0x90 kasan_report+0xda/0x110 ? smb2_query_info_compound+0x423/0x6d0 [cifs] ? smb2_query_info_compound+0x423/0x6d0 [cifs] smb2_query_info_compound+0x423/0x6d0 [cifs] ? __pfx_smb2_query_info_compound+0x10/0x10 [cifs] ? srso_alias_return_thunk+0x5/0x7f ? __stack_depot_save+0x39/0x480 ? kasan_save_stack+0x33/0x60 ? kasan_set_track+0x25/0x30 ? ____kasan_slab_free+0x126/0x170 smb2_queryfs+0xc2/0x2c0 [cifs] ? __pfx_smb2_queryfs+0x10/0x10 [cifs] ? __pfx___lock_acquire+0x10/0x10 smb311_queryfs+0x210/0x220 [cifs] ? __pfx_smb311_queryfs+0x10/0x10 [cifs] ? srso_alias_return_thunk+0x5/0x7f ? __lock_acquire+0x480/0x26c0 ? lock_release+0x1ed/0x640 ? srso_alias_return_thunk+0x5/0x7f ? do_raw_spin_unlock+0x9b/0x100 cifs_statfs+0x18c/0x4b0 [cifs] statfs_by_dentry+0x9b/0xf0 fd_statfs+0x4e/0xb0 __do_sys_fstatfs+0x7f/0xe0 ? __pfx___do_sys_fstatfs+0x10/0x10 ? srso_alias_return_thunk+0x5/0x7f ? lockdep_hardirqs_on_prepare+0x136/0x200 ? srso_alias_return_thunk+0x5/0x7f do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Allocated by task 27534: kasan_save_stack+0x33/0x60 kasan_set_track+0x25/0x30 __kasan_kmalloc+0x8f/0xa0 open_cached_dir+0x71b/0x1240 [cifs] smb2_query_info_compound+0x5c3/0x6d0 [cifs] smb2_queryfs+0xc2/0x2c0 [cifs] smb311_queryfs+0x210/0x220 [cifs] cifs_statfs+0x18c/0x4b0 [cifs] statfs_by_dentry+0x9b/0xf0 fd_statfs+0x4e/0xb0 __do_sys_fstatfs+0x7f/0xe0 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Freed by task 27534: kasan_save_stack+0x33/0x60 kasan_set_track+0x25/0x30 kasan_save_free_info+0x2b/0x50 ____kasan_slab_free+0x126/0x170 slab_free_freelist_hook+0xd0/0x1e0 __kmem_cache_free+0x9d/0x1b0 open_cached_dir+0xff5/0x1240 [cifs] smb2_query_info_compound+0x5c3/0x6d0 [cifs] smb2_queryfs+0xc2/0x2c0 [cifs] This is a race between open_cached_dir() and cached_dir_lease_break() where the cache entry for the open directory handle receives a lease break while creating it. And before returning from open_cached_dir(), we put the last reference of the new @cfid because of !@cfid->has_lease. Besides the UAF, while running xfstests a lot of missed lease breaks have been noticed in tests that run several concurrent statfs(2) calls on those cached fids CIFS: VFS: \\w22-root1.gandalf.test No task to wake, unknown frame... CIFS: VFS: \\w22-root1.gandalf.test Cmd: 18 Err: 0x0 Flags: 0x1... CIFS: VFS: \\w22-root1.gandalf.test smb buf 00000000715bfe83 len 108 CIFS: VFS: Dump pending requests: CIFS: VFS: \\w22-root1.gandalf.test No task to wake, unknown frame... CIFS: VFS: \\w22-root1.gandalf.test Cmd: 18 Err: 0x0 Flags: 0x1... CIFS: VFS: \\w22-root1.gandalf.test smb buf 000000005aa7316e len 108 ... To fix both, in open_cached_dir() ensure that @cfid->has_lease is set right before sending out compounded request so that any potential lease break will be get processed by demultiplex thread while we're still caching @cfid. And, if open failed for some reason, re-check @cfid->has_lease to decide whether or not put lease reference. Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * smb: client: remove extra @chan_count check in __cifs_put_smb_ses()Paulo Alcantara2023-11-021-14/+9
| | | | | | | | | | | | | | | | | | If @ses->chan_count <= 1, then for-loop body will not be executed so no need to check it twice. Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * cifs: add xid to query server interface callShyam Prasad N2023-10-311-1/+5
| | | | | | | | | | | | | | | | | | | | We were passing 0 as the xid for the call to query server interfaces. This is not great for debugging. This change adds a real xid. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Reviewed-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * cifs: print server capabilities in DebugDataShyam Prasad N2023-10-311-0/+2
| | | | | | | | | | | | | | | | | | In the output of /proc/fs/cifs/DebugData, we do not print the server->capabilities field today. With this change, we will do that. Signed-off-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * smb: use crypto_shash_digest() in symlink_hash()Eric Biggers2023-10-311-14/+2
| | | | | | | | | | | | | | | | | | Simplify symlink_hash() by using crypto_shash_digest() instead of an init+update+final sequence. This should also improve performance. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * smb: client: fix use-after-free bug in cifs_debug_data_proc_show()Paulo Alcantara2023-10-311-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Skip SMB sessions that are being teared down (e.g. @ses->ses_status == SES_EXITING) in cifs_debug_data_proc_show() to avoid use-after-free in @ses. This fixes the following GPF when reading from /proc/fs/cifs/DebugData while mounting and umounting [ 816.251274] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6d81: 0000 [#1] PREEMPT SMP NOPTI ... [ 816.260138] Call Trace: [ 816.260329] <TASK> [ 816.260499] ? die_addr+0x36/0x90 [ 816.260762] ? exc_general_protection+0x1b3/0x410 [ 816.261126] ? asm_exc_general_protection+0x26/0x30 [ 816.261502] ? cifs_debug_tcon+0xbd/0x240 [cifs] [ 816.261878] ? cifs_debug_tcon+0xab/0x240 [cifs] [ 816.262249] cifs_debug_data_proc_show+0x516/0xdb0 [cifs] [ 816.262689] ? seq_read_iter+0x379/0x470 [ 816.262995] seq_read_iter+0x118/0x470 [ 816.263291] proc_reg_read_iter+0x53/0x90 [ 816.263596] ? srso_alias_return_thunk+0x5/0x7f [ 816.263945] vfs_read+0x201/0x350 [ 816.264211] ksys_read+0x75/0x100 [ 816.264472] do_syscall_64+0x3f/0x90 [ 816.264750] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [ 816.265135] RIP: 0033:0x7fd5e669d381 Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * smb: client: fix potential deadlock when releasing midsPaulo Alcantara2023-10-313-12/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All release_mid() callers seem to hold a reference of @mid so there is no need to call kref_put(&mid->refcount, __release_mid) under @server->mid_lock spinlock. If they don't, then an use-after-free bug would have occurred anyways. By getting rid of such spinlock also fixes a potential deadlock as shown below CPU 0 CPU 1 ------------------------------------------------------------------ cifs_demultiplex_thread() cifs_debug_data_proc_show() release_mid() spin_lock(&server->mid_lock); spin_lock(&cifs_tcp_ses_lock) spin_lock(&server->mid_lock) __release_mid() smb2_find_smb_tcon() spin_lock(&cifs_tcp_ses_lock) *deadlock* Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * smb3: fix creating FIFOs when mounting with "sfu" mount optionSteve French2023-10-313-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes some xfstests including generic/564 and generic/157 The "sfu" mount option can be useful for creating special files (character and block devices in particular) but could not create FIFOs. It did recognize existing empty files with the "system" attribute flag as FIFOs but this is too general, so to support creating FIFOs more safely use a new tag (but the same length as those for char and block devices ie "IntxLNK" and "IntxBLK") "LnxFIFO" to indicate that the file should be treated as a FIFO (when mounted with the "sfu"). For some additional context note that "sfu" followed the way that "Services for Unix" on Windows handled these special files (at least for character and block devices and symlinks), which is different than newer Windows which can handle special files as reparse points (which isn't an option to many servers). Cc: stable@vger.kernel.org Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * Add definition for new smb3.1.1 command typeSteve French2023-10-301-0/+15
| | | | | | | | | | | | | | | | Add structs and defines for new SMB3.1.1 command, server to client notification. See MS-SMB2 section 2.2.44 Signed-off-by: Steve French <stfrench@microsoft.com>
| * SMB3: clarify some of the unused CreateOption flagsSteve French2023-10-301-1/+8
| | | | | | | | | | | | | | | | Update comments to show flags which should be not set (zero). See MS-SMB2 section 2.2.13 Signed-off-by: Steve French <stfrench@microsoft.com>
| * cifs: Add client version details to NTLM authenticate messageMeetakshi Setiya2023-10-232-5/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | The NTLM authenticate message currently sets the NTLMSSP_NEGOTIATE_VERSION flag but does not populate the VERSION structure. This commit fixes this bug by ensuring that the flag is set and the version details are included in the message. Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Reviewed-by: Bharath SM <bharathsm@microsoft.com> Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
| * smb3: fix touch -h of symlinkSteve French2023-10-231-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | For example: touch -h -t 02011200 testfile where testfile is a symlink would not change the timestamp, but touch -t 02011200 testfile does work to change the timestamp of the target Suggested-by: David Howells <dhowells@redhat.com> Reported-by: Micah Veilleux <micah.veilleux@iba-group.com> Closes: https://bugzilla.samba.org/show_bug.cgi?id=14476 Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
* | Merge tag 'efi-next-for-v6.7' of ↵Linus Torvalds2023-11-043-0/+81
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI update from Ard Biesheuvel: "This is the only remaining EFI change, as everything else was taken via -tip this cycle: - implement uid/gid mount options for efivarfs" * tag 'efi-next-for-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efivarfs: Add uid/gid mount options
| * | efivarfs: Add uid/gid mount optionsJiao Zhou2023-10-203-0/+81
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow UEFI variables to be modified by non-root processes in order to run sandboxed code. This doesn't change the behavior of mounting efivarfs unless uid/gid are specified; by default both are set to root. Signed-off-by: Jiao Zhou <jiaozhou@google.com> Acked-by: Matthew Garrett <mgarrett@aurora.tech> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
* | | Merge tag 'x86_microcode_for_v6.7_rc1' of ↵Linus Torvalds2023-11-0421-910/+905
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 microcode loading updates from Borislac Petkov: "Major microcode loader restructuring, cleanup and improvements by Thomas Gleixner: - Restructure the code needed for it and add a temporary initrd mapping on 32-bit so that the loader can access the microcode blobs. This in itself is a preparation for the next major improvement: - Do not load microcode on 32-bit before paging has been enabled. Handling this has caused an endless stream of headaches, issues, ugly code and unnecessary hacks in the past. And there really wasn't any sensible reason to do that in the first place. So switch the 32-bit loading to happen after paging has been enabled and turn the loader code "real purrty" again - Drop mixed microcode steppings loading on Intel - there, a single patch loaded on the whole system is sufficient - Rework late loading to track which CPUs have updated microcode successfully and which haven't, act accordingly - Move late microcode loading on Intel in NMI context in order to guarantee concurrent loading on all threads - Make the late loading CPU-hotplug-safe and have the offlined threads be woken up for the purpose of the update - Add support for a minimum revision which determines whether late microcode loading is safe on a machine and the microcode does not change software visible features which the machine cannot use anyway since feature detection has happened already. Roughly, the minimum revision is the smallest revision number which must be loaded currently on the system so that late updates can be allowed - Other nice leanups, fixess, etc all over the place" * tag 'x86_microcode_for_v6.7_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) x86/microcode/intel: Add a minimum required revision for late loading x86/microcode: Prepare for minimal revision check x86/microcode: Handle "offline" CPUs correctly x86/apic: Provide apic_force_nmi_on_cpu() x86/microcode: Protect against instrumentation x86/microcode: Rendezvous and load in NMI x86/microcode: Replace the all-in-one rendevous handler x86/microcode: Provide new control functions x86/microcode: Add per CPU control field x86/microcode: Add per CPU result state x86/microcode: Sanitize __wait_for_cpus() x86/microcode: Clarify the late load logic x86/microcode: Handle "nosmt" correctly x86/microcode: Clean up mc_cpu_down_prep() x86/microcode: Get rid of the schedule work indirection x86/microcode: Mop up early loading leftovers x86/microcode/amd: Use cached microcode for AP load x86/microcode/amd: Cache builtin/initrd microcode early x86/microcode/amd: Cache builtin microcode too x86/microcode/amd: Use correct per CPU ucode_cpu_info ...
| * | | x86/microcode/intel: Add a minimum required revision for late loadingAshok Raj2023-10-242-5/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In general users, don't have the necessary information to determine whether late loading of a new microcode version is safe and does not modify anything which the currently running kernel uses already, e.g. removal of CPUID bits or behavioural changes of MSRs. To address this issue, Intel has added a "minimum required version" field to a previously reserved field in the microcode header. Microcode updates should only be applied if the current microcode version is equal to, or greater than this minimum required version. Thomas made some suggestions on how meta-data in the microcode file could provide Linux with information to decide if the new microcode is suitable candidate for late loading. But even the "simpler" option requires a lot of metadata and corresponding kernel code to parse it, so the final suggestion was to add the 'minimum required version' field in the header. When microcode changes visible features, microcode will set the minimum required version to its own revision which prevents late loading. Old microcode blobs have the minimum revision field always set to 0, which indicates that there is no information and the kernel considers it unsafe. This is a pure OS software mechanism. The hardware/firmware ignores this header field. For early loading there is no restriction because OS visible features are enumerated after the early load and therefore a change has no effect. The check is always enabled, but by default not enforced. It can be enforced via Kconfig or kernel command line. If enforced, the kernel refuses to late load microcode with a minimum required version field which is zero or when the currently loaded microcode revision is smaller than the minimum required revision. If not enforced the load happens independent of the revision check to stay compatible with the existing behaviour, but it influences the decision whether the kernel is tainted or not. If the check signals that the late load is safe, then the kernel is not tainted. Early loading is not affected by this. [ tglx: Massaged changelog and fixed up the implementation ] Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231002115903.776467264@linutronix.de
| * | | x86/microcode: Prepare for minimal revision checkThomas Gleixner2023-10-246-6/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Applying microcode late can be fatal for the running kernel when the update changes functionality which is in use already in a non-compatible way, e.g. by removing a CPUID bit. There is no way for admins which do not have access to the vendors deep technical support to decide whether late loading of such a microcode is safe or not. Intel has added a new field to the microcode header which tells the minimal microcode revision which is required to be active in the CPU in order to be safe. Provide infrastructure for handling this in the core code and a command line switch which allows to enforce it. If the update is considered safe the kernel is not tainted and the annoying warning message not emitted. If it's enforced and the currently loaded microcode revision is not safe for late loading then the load is aborted. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231017211724.079611170@linutronix.de
| * | | x86/microcode: Handle "offline" CPUs correctlyThomas Gleixner2023-10-244-6/+113
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Offline CPUs need to be parked in a safe loop when microcode update is in progress on the primary CPU. Currently, offline CPUs are parked in mwait_play_dead(), and for Intel CPUs, its not a safe instruction, because the MWAIT instruction can be patched in the new microcode update that can cause instability. - Add a new microcode state 'UCODE_OFFLINE' to report status on per-CPU basis. - Force NMI on the offline CPUs. Wake up offline CPUs while the update is in progress and then return them back to mwait_play_dead() after microcode update is complete. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231002115903.660850472@linutronix.de
| * | | x86/apic: Provide apic_force_nmi_on_cpu()Thomas Gleixner2023-10-245-1/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When SMT siblings are soft-offlined and parked in one of the play_dead() variants they still react on NMI, which is problematic on affected Intel CPUs. The default play_dead() variant uses MWAIT on modern CPUs, which is not guaranteed to be safe when updated concurrently. Right now late loading is prevented when not all SMT siblings are online, but as they still react on NMI, it is possible to bring them out of their park position into a trivial rendezvous handler. Provide a function which allows to do that. I does sanity checks whether the target is in the cpus_booted_once_mask and whether the APIC driver supports it. Mark X2APIC and XAPIC as capable, but exclude 32bit and the UV and NUMACHIP variants as that needs feedback from the relevant experts. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231002115903.603100036@linutronix.de
| * | | x86/microcode: Protect against instrumentationThomas Gleixner2023-10-241-28/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The wait for control loop in which the siblings are waiting for the microcode update on the primary thread must be protected against instrumentation as instrumentation can end up in #INT3, #DB or #PF, which then returns with IRET. That IRET reenables NMI which is the opposite of what the NMI rendezvous is trying to achieve. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231002115903.545969323@linutronix.de
| * | | x86/microcode: Rendezvous and load in NMIThomas Gleixner2023-10-245-5/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | stop_machine() does not prevent the spin-waiting sibling from handling an NMI, which is obviously violating the whole concept of rendezvous. Implement a static branch right in the beginning of the NMI handler which is nopped out except when enabled by the late loading mechanism. The late loader enables the static branch before stop_machine() is invoked. Each CPU has an nmi_enable in its control structure which indicates whether the CPU should go into the update routine. This is required to bridge the gap between enabling the branch and actually being at the point where it is required to enter the loader wait loop. Each CPU which arrives in the stopper thread function sets that flag and issues a self NMI right after that. If the NMI function sees the flag clear, it returns. If it's set it clears the flag and enters the rendezvous. This is safe against a real NMI which hits in between setting the flag and sending the NMI to itself. The real NMI will be swallowed by the microcode update and the self NMI will then let stuff continue. Otherwise this would end up with a spurious NMI. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231002115903.489900814@linutronix.de
| * | | x86/microcode: Replace the all-in-one rendevous handlerThomas Gleixner2023-10-241-42/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | with a new handler which just separates the control flow of primary and secondary CPUs. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231002115903.433704135@linutronix.de
| * | | x86/microcode: Provide new control functionsThomas Gleixner2023-10-241-0/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current all in one code is unreadable and really not suited for adding future features like uniform loading with package or system scope. Provide a set of new control functions which split the handling of the primary and secondary CPUs. These will replace the current rendezvous all in one function in the next step. This is intentionally a separate change because diff makes an complete unreadable mess otherwise. So the flow separates the primary and the secondary CPUs into their own functions which use the control field in the per CPU ucode_ctrl struct. primary() secondary() wait_for_all() wait_for_all() apply_ucode() wait_for_release() release() apply_ucode() Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231002115903.377922731@linutronix.de
| * | | x86/microcode: Add per CPU control fieldThomas Gleixner2023-10-241-2/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a per CPU control field to ucode_ctrl and define constants for it which are going to be used to control the loading state machine. In theory this could be a global control field, but a global control does not cover the following case: 15 primary CPUs load microcode successfully 1 primary CPU fails and returns with an error code With global control the sibling of the failed CPU would either try again or the whole operation would be aborted with the consequence that the 15 siblings do not invoke the apply path and end up with inconsistent software state. The result in dmesg would be inconsistent too. There are two additional fields added and initialized: ctrl_cpu and secondaries. ctrl_cpu is the CPU number of the primary thread for now, but with the upcoming uniform loading at package or system scope this will be one CPU per package or just one CPU. Secondaries hands the control CPU a CPU mask which will be required to release the secondary CPUs out of the wait loop. Preparatory change for implementing a properly split control flow for primary and secondary CPUs. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231002115903.319959519@linutronix.de
| * | | x86/microcode: Add per CPU result stateThomas Gleixner2023-10-242-47/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The microcode rendezvous is purely acting on global state, which does not allow to analyze fails in a coherent way. Introduce per CPU state where the results are written into, which allows to analyze the return codes of the individual CPUs. Initialize the state when walking the cpu_present_mask in the online check to avoid another for_each_cpu() loop. Enhance the result print out with that. The structure is intentionally named ucode_ctrl as it will gain control fields in subsequent changes. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231017211723.632681010@linutronix.de
| * | | x86/microcode: Sanitize __wait_for_cpus()Thomas Gleixner2023-10-241-22/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code is too complicated for no reason: - The return value is pointless as this is a strict boolean. - It's way simpler to count down from num_online_cpus() and check for zero. - The timeout argument is pointless as this is always one second. - Touching the NMI watchdog every 100ns does not make any sense, neither does checking every 100ns. This is really not a hotpath operation. Preload the atomic counter with the number of online CPUs and simplify the whole timeout logic. Delay for one microsecond and touch the NMI watchdog once per millisecond. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231002115903.204251527@linutronix.de
| * | | x86/microcode: Clarify the late load logicThomas Gleixner2023-10-241-22/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | reload_store() is way too complicated. Split the inner workings out and make the following enhancements: - Taint the kernel only when the microcode was actually updated. If. e.g. the rendezvous fails, then nothing happened and there is no reason for tainting. - Return useful error codes Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> Link: https://lore.kernel.org/r/20231002115903.145048840@linutronix.de
| * | | x86/microcode: Handle "nosmt" correctlyThomas Gleixner2023-10-244-31/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On CPUs where microcode loading is not NMI-safe the SMT siblings which are parked in one of the play_dead() variants still react to NMIs. So if an NMI hits while the primary thread updates the microcode the resulting behaviour is undefined. The default play_dead() implementation on modern CPUs is using MWAIT which is not guaranteed to be safe against a microcode update which affects MWAIT. Take the cpus_booted_once_mask into account to detect this case and refuse to load late if the vendor specific driver does not advertise that late loading is NMI safe. AMD stated that this is safe, so mark the AMD driver accordingly. This requirement will be partially lifted in later changes. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231002115903.087472735@linutronix.de
| * | | x86/microcode: Clean up mc_cpu_down_prep()Thomas Gleixner2023-10-241-7/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function has nothing to do with suspend. It's a hotplug callback. Remove the bogus comment. Drop the pointless debug printk. The hotplug core provides tracepoints which track the invocation of those callbacks. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231002115903.028651784@linutronix.de
| * | | x86/microcode: Get rid of the schedule work indirectionThomas Gleixner2023-10-241-19/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Scheduling work on all CPUs to collect the microcode information is just another extra step for no value. Let the CPU hotplug callback registration do it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231017211723.354748138@linutronix.de
| * | | x86/microcode: Mop up early loading leftoversThomas Gleixner2023-10-242-17/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Get rid of the initrd_gone hack which was required to keep find_microcode_in_initrd() functional after init. As find_microcode_in_initrd() is now only used during init, mark it accordingly. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231017211723.298854846@linutronix.de
| * | | x86/microcode/amd: Use cached microcode for AP loadThomas Gleixner2023-10-243-24/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that the microcode cache is initialized before the APs are brought up, there is no point in scanning builtin/initrd microcode during AP loading. Convert the AP loader to utilize the cache, which in turn makes the CPU hotplug callback which applies the microcode after initrd/builtin is gone, obsolete as the early loading during late hotplug operations including the resume path depends now only on the cache. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231017211723.243426023@linutronix.de
| * | | x86/microcode/amd: Cache builtin/initrd microcode earlyThomas Gleixner2023-10-242-17/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no reason to scan builtin/initrd microcode on each AP. Cache the builtin/initrd microcode in an early initcall so that the early AP loader can utilize the cache. The existing fs initcall which invoked save_microcode_in_initrd_amd() is still required to maintain the initrd_gone flag. Rename it accordingly. This will be removed once the AP loader code is converted to use the cache. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231017211723.187566507@linutronix.de
| * | | x86/microcode/amd: Cache builtin microcode tooThomas Gleixner2023-10-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | save_microcode_in_initrd_amd() fails to cache builtin microcode and only scans initrd. Use find_blobs_in_containers() instead which covers both. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231010150702.495139089@linutronix.de
| * | | x86/microcode/amd: Use correct per CPU ucode_cpu_infoThomas Gleixner2023-10-241-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | find_blobs_in_containers() is invoked on every CPU but overwrites unconditionally ucode_cpu_info of CPU0. Fix this by using the proper CPU data and move the assignment into the call site apply_ucode_from_containers() so that the function can be reused. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20231010150702.433454320@linutronix.de