summaryrefslogtreecommitdiffstats
path: root/fs/bcachefs/debug.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
* bcachefs: Improve bch2_fatal_error()Kent Overstreet2024-03-181-1/+1
| | | | | | error messages should always include __func__ Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: kill kvpmalloc()Kent Overstreet2024-03-131-3/+3
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Add gfp flags param to bch2_prt_task_backtrace()Kent Overstreet2024-01-221-1/+1
| | | | | | Fixes: e6a2566f7a00 ("bcachefs: Better journal tracepoints") Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> Reported-by: smatch
* bcachefs: Prep work for variable size btree node buffersKent Overstreet2024-01-211-8/+8
| | | | | | | | | | | | | | | | | bcachefs btree nodes are big - typically 256k - and btree roots are pinned in memory. As we're now up to 18 btrees, we now have significant memory overhead in mostly empty btree roots. And in the future we're going to start enforcing that certain btree node boundaries exist, to solve lock contention issues - analagous to XFS's AGIs. Thus, we need to start allocating smaller btree node buffers when we can. This patch changes code that refers to the filesystem constant c->opts.btree_node_size to refer to the btree node buffer size - btree_buf_bytes() - where appropriate. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Improve would_deadlock trace eventKent Overstreet2024-01-061-1/+1
| | | | | | We now include backtraces for every thread involved in the cycle. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: kill useless return retKent Overstreet2024-01-061-3/+1
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: track transaction durationsKent Overstreet2024-01-061-11/+18
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: optimize __bch2_trans_get(), kill DEBUG_TRANSACTIONSKent Overstreet2024-01-011-9/+9
| | | | | | | | | | | | | - Some tweaks to greatly reduce locking overhead for the list of btree transactions, so that it can always be enabled: leave btree_trans objects on the list when they're on the percpu single item freelist, and only check for duplicates in the same process when CONFIG_BCACHEFS_DEBUG is enabled - don't zero out the full btree_trans() unless we allocated it from the mempool Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: btree_iter -> btree_path_idx_tKent Overstreet2024-01-011-1/+2
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: for_each_btree_key() now declares loop iterKent Overstreet2024-01-011-59/+32
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Rename for_each_btree_key2() -> for_each_btree_key()Kent Overstreet2024-01-011-6/+6
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix bch2_read_btree()Kent Overstreet2024-01-011-2/+4
| | | | | | | | In the debugfs code, we had an incorrect use of drop_locks_do(); on transaction restart we don't want to restart the current loop iteration, since we've already emitted the current key to the buffer for userspace. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_btree_id_str()Kent Overstreet2023-10-311-4/+4
| | | | | | | Since we can run with unknown btree IDs, we can't directly index btree IDs into fixed size arrays. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix copy_to_user() usage in flush_buf()Kent Overstreet2023-10-221-8/+8
| | | | | | | | copy_to_user() returns the number of bytes successfully copied - not an errcode. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Heap allocate btree_transKent Overstreet2023-10-221-17/+17
| | | | | | | | | | We're using more stack than we'd like in a number of functions, and btree_trans is the biggest object that we stack allocate. But we have to do a heap allocatation to initialize it anyways, so there's no real downside to heap allocating the entire thing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix W=12 build errorsKent Overstreet2023-10-221-4/+2
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Break up io.cKent Overstreet2023-10-221-1/+0
| | | | | | | | | More reorganization, this splits up io.c into - io_read.c - io_misc.c - fallocate, fpunch, truncate - io_write.c Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix more lockdep splats in debug.cKent Overstreet2023-10-221-17/+17
| | | | | | | Similar to previous fixes, we can't incur page faults while holding btree locks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: seqmutex; fix a lockdep splatKent Overstreet2023-10-221-11/+35
| | | | | | | | We can't be holding btree_trans_lock while copying to user space, which might incur a page fault. To fix this, convert it to a seqmutex so we can unlock/relock. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: GFP_NOIO -> GFP_NOFSKent Overstreet2023-10-221-2/+2
| | | | | | | | GFP_NOIO dates from the bcache days, when we operated under the block layer. Now, GFP_NOFS is more appropriate, so switch all GFP_NOIO uses to GFP_NOFS. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: bch2_btree_node_ondisk_to_text()Kent Overstreet2023-10-221-0/+119
| | | | | | | Pulling out a helper from cmd_list.c, as the rest is being rewritten in Rust but we're not ready to rewrite lower-level btree code in Rust. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Drop some anonymous structs, unionsKent Overstreet2023-10-221-1/+1
| | | | | | | | | | Rust bindgen doesn't cope well with anonymous structs and unions. This patch drops the fancy anonymous structs & unions in bkey_i that let us use the same helpers for bkey_i and bkey_packed; since bkey_packed is an internal type that's never exposed to outside code, it's only a minor inconvenienc. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: New backtrace utility codeKent Overstreet2023-10-221-1/+1
| | | | Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: debug: Fix some locking bugsKent Overstreet2023-10-221-2/+2
| | | | | | | This fixes a few error paths in debug code that lead to locks failing to be dropped. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Plumb saw_error through to btree_err()Kent Overstreet2023-10-221-2/+2
| | | | | | | | | | The btree node read path has the ability to kick off an asynchronous btree node rewrite if we saw and corrected an error. Previously this was only used for errors that caused one of the replicas to be unusable - this patch plumbs it through to all error paths, so that normal fsck errors can be corrected. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: New bpos_cmp(), bkey_cmp() replacementsKent Overstreet2023-10-221-3/+3
| | | | | | | | | | | | | | | | | This patch introduces - bpos_eq() - bpos_lt() - bpos_le() - bpos_gt() - bpos_ge() and equivalent replacements for bkey_cmp(). Looking at the generated assembly these could probably be improved further, but we already see a significant code size improvement with this patch. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Assorted checkpatch fixesKent Overstreet2023-10-221-1/+1
| | | | | | | | | | | | | | | | | checkpatch.pl gives lots of warnings that we don't want - suggested ignore list: ASSIGN_IN_IF UNSPECIFIED_INT - bcachefs coding style prefers single token type names NEW_TYPEDEFS - typedefs are occasionally good FUNCTION_ARGUMENTS - we prefer to look at functions in .c files (hopefully with docbook documentation), not .h file prototypes MULTISTATEMENT_MACRO_USE_DO_WHILE - we have _many_ x-macros and other macros where we can't do this Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Optimize bch2_trans_init()Kent Overstreet2023-10-221-3/+3
| | | | | | | Now we store the transaction's fn idx in a local variable, instead of redoing the lookup every time we call bch2_trans_init(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Print cycle on unrecoverable deadlockKent Overstreet2023-10-221-21/+1
| | | | | | | | | | | Some lock operations can't fail; a cycle of nofail locks is impossible to recover from. So we want to get rid of these nofail locking operations, but as this is tricky it'll be done incrementally. If such a cycle happens, this patch prints out which codepaths are involved so we know what to work on next. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Improve btree_deadlock debugfs outputKent Overstreet2023-10-221-5/+12
| | | | | | | | This changes bch2_check_for_deadlock() to print the longest chains it finds - when we have a deadlock because the cycle detector isn't finding something, this will let us see what it's missing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Print deadlock cycle in debugfsKent Overstreet2023-10-221-0/+43
| | | | | | | | | | In the event that we're not finished debugging the cycle detector, this adds a new file to debugfs that shows what the cycle detector finds, if anything. By comparing this with btree_transactions, which shows held locks for every btree_transaction, we'll be able to determine if it's the cycle detector that's buggy or something else. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Deadlock cycle detectorKent Overstreet2023-10-221-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We've outgrown our own deadlock avoidance strategy. The btree iterator API provides an interface where the user doesn't need to concern themselves with lock ordering - different btree iterators can be traversed in any order. Without special care, this will lead to deadlocks. Our previous strategy was to define a lock ordering internally, and whenever we attempt to take a lock and trylock() fails, we'd check if the current btree transaction is holding any locks that cause a lock ordering violation. If so, we'd issue a transaction restart, and then bch2_trans_begin() would re-traverse all previously used iterators, but in the correct order. That approach had some issues, though. - Sometimes we'd issue transaction restarts unnecessarily, when no deadlock would have actually occured. Lock ordering restarts have become our primary cause of transaction restarts, on some workloads totally 20% of actual transaction commits. - To avoid deadlock or livelock, we'd often have to take intent locks when we only wanted a read lock: with the lock ordering approach, it is actually illegal to hold _any_ read lock while blocking on an intent lock, and this has been causing us unnecessary lock contention. - It was getting fragile - the various lock ordering rules are not trivial, and we'd been seeing occasional livelock issues related to this machinery. So, since bcachefs is already a relational database masquerading as a filesystem, we're stealing the next traditional database technique and switching to a cycle detector for avoiding deadlocks. When we block taking a btree lock, after adding ourself to the waitlist but before sleeping, we do a DFS of btree transactions waiting on other btree transactions, starting with the current transaction and walking our held locks, and transactions blocking on our held locks. If we find a cycle, we emit a transaction restart. Occasionally (e.g. the btree split path) we can not allow the lock() operation to fail, so if necessary we'll tell another transaction that it has to fail. Result: trans_restart_would_deadlock events are reduced by a factor of 10 to 100, and we'll be able to delete a whole bunch of grotty, fragile code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Track maximum transaction memoryKent Overstreet2023-10-221-0/+3
| | | | | | | | | | | | | This patch - tracks maximum bch2_trans_kmalloc() memory used in btree_transaction_stats - makes it available in debugfs - switches bch2_trans_init() to using that for the amount of memory to preallocate, instead of the parameter passed in This drastically reduces transaction restarts, and means we no longer need to track this in the source code. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Debugfs cleanupKent Overstreet2023-10-221-64/+51
| | | | | | | | | | | | This improves flush_buf() so that it always returns nonzero when we're done reading and ready to return to userspace, and so that it returns the value we want to return to userspace (number of bytes read, if there wasn't an error). In the future we'll be better abstracting this mechanism and pulling it out of bcachefs, and using it to replace seq_file. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Print last line in debugfs/btree_transaction_statsKent Overstreet2023-10-221-2/+5
| | | | | | | We need to turn the flush_buf() thing into a proper API, to replace seq_file. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Track the maximum btree_paths ever allocated by each transactionKent Overstreet2023-10-221-5/+25
| | | | | | | | | | | | We need a way to check if the machinery for handling btree_paths with in a transaction is behaving reasonably, as it often has not been - we've had bugs with transaction path overflows caused by duplicate paths and plenty of other things. This patch tracks, per transaction fn, the most btree paths ever allocated by that transaction and makes it available in debugfs. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Rename lock_held_stats -> btree_transaction_statsKent Overstreet2023-10-221-7/+10
| | | | | | | Going to be adding more things to this in the next patch. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Convert debugfs code to for_each_btree_key2()Kent Overstreet2023-10-221-48/+31
| | | | | | | This fixes a bug where we were leaking a transaction restart error to userspace. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: added lock held time statsDaniel Hill2023-10-221-0/+74
| | | | | | | | | We now record the length of time btree locks are held and expose this in debugfs. Enabled via CONFIG_BCACHEFS_LOCK_TIME_STATS. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Improve an error messageKent Overstreet2023-10-221-0/+77
| | | | | | | When inserting a key type that's not valid for a given btree, we should print out which btree we were inserting into. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Printbuf reworkKent Overstreet2023-10-221-43/+44
| | | | | | | This converts bcachefs to the modern printbuf interface/implementation, synced with the version to be submitted upstream. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix usage of six lock's percpu modeKent Overstreet2023-10-221-0/+5
| | | | | | | | | | | | | | | Six locks have a percpu mode, which we use for interior btree nodes, as well as btree key cache keys for the subvolumes btree. We've been switching locks back and forth between percpu and non percpu mode as needed, but it turns out this is racy - when we're reusing an existing node, other threads could be attempting to lock it while we're switching it between modes. This patch fixes this by never switching 'struct btree' between the two modes, and instead segragating them between two different freed lists. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Start moving debug info from sysfs to debugfsKent Overstreet2023-10-221-13/+163
| | | | | | | | | | | | In sysfs, files can only output at most PAGE_SIZE. This is a problem for debug info that needs to list an arbitrary number of times, and because of this limit some of our debug info has been terser and harder to read than we'd like. This patch moves info about journal pins and cached btree nodes to debugfs, and greatly expands and improves the output we return. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Heap allocate printbufsKent Overstreet2023-10-221-22/+20
| | | | | | | | | | | This patch changes printbufs dynamically allocate and reallocate a buffer as needed. Stack usage has become a bit of a problem, and a major cause of that has been static size string buffers on the stack. The most involved part of this refactoring is that printbufs must now be exited with printbuf_exit(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Fix debugfs -bfloat-failedKent Overstreet2023-10-221-1/+3
| | | | | | | It wasn't updated for snapshots - it's iterating across keys in all snapshots, so needs to be specifying BTREE_ITER_ALL_SNAPSHOTS. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: Add missing bch2_trans_iter_exit() callKent Overstreet2023-10-221-0/+2
| | | | | | | This fixes a bug where the filesystem goes read only when reading from debugfs. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: for_each_btree_node() now returns errors directlyKent Overstreet2023-10-221-1/+1
| | | | | | | | This changes for_each_btree_node() to work like for_each_btree_key(), and to that end bch2_btree_iter_peek_node() and next_node() also return error ptrs. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
* bcachefs: btree_pathKent Overstreet2023-10-221-16/+16
| | | | | | | | | | | | | | | This splits btree_iter into two components: btree_iter is now the externally visible componont, and it points to a btree_path which is now reference counted. This means we no longer have to clone iterators up front if they might be mutated - btree_path can be shared by multiple iterators, and cloned if an iterator would mutate a shared btree_path. This will help us use iterators more efficiently, as well as slimming down the main long lived state in btree_trans, and significantly cleans up the logic for iterator lifetimes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Really don't hold btree locks while btree IOs are in flightKent Overstreet2023-10-221-2/+2
| | | | | | | | | | | | This is something we've attempted to stick to for quite some time, as it helps guarantee filesystem latency - but there's a few remaining paths that this patch fixes. This is also necessary for an upcoming patch to update btree pointers after every btree write - since the btree write completion path will now be doing btree operations. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
* bcachefs: Split out SPOS_MAXKent Overstreet2023-10-221-2/+2
| | | | | | | | | | Internal btree code really wants a POS_MAX with all fields ~0; external code more likely wants the snapshot field to be 0, because when we're passing it to bch2_trans_get_iter() it's used for the snapshot we're operating in, which should be 0 for most btrees that don't use snapshots. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>