summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* ext4: defragmentation code cleanupDmitry Monakhov2013-04-121-2/+9
| | | | | | | | | | | | | | | | - grab_cache_page_write_begin() may not wait on page's writeback since (1d1d1a767206). But it is still reasonable to wait on page's writeback here in order to be on the safe side. - Fix miss typo: pass 'length' instead of 'end' to __block_write_begin() https://bugzilla.kernel.org/show_bug.cgi?id=56241 TESTCASE: git://oss.sgi.com/xfs/cmds/xfstests.git MKFS_OPTIONS="-b1024" ; ./check ext4/304 Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Akira Fujita <a-fujita.rs.jp.nec.com>
* ext4: do not convert to indirect with bigalloc enabledLukas Czerner2013-04-111-0/+4
| | | | | | | | | | With bigalloc feature enabled we do not support indirect addressing at all so we have to prevent extent addressing to indirect addressing conversion in this case. The problem has been introduced with the commit "ext4: support simple conversion of extent-mapped inodes to use i_blocks" Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: move ext4_ind_migrate() into migrate.cLukas Czerner2013-04-113-59/+58
| | | | | | | | | | Move ext4_ind_migrate() into migrate.c file since it makes much more sense and ext4_ext_migrate() is there as well. Also fix tiny style problem - add spaces around "=" in "i=0". Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: fix miscellaneous big endian warningsTheodore Ts'o2013-04-104-15/+17
| | | | | | None of these result in any bug, but they makes sparse complain. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: fix big-endian bug in metadata checksum calculationsDmitry Monakhov2013-04-102-5/+5
| | | | | | Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
* ext4: fix big-endian bug in extent migration codeDmitry Monakhov2013-04-101-1/+1
| | | | | | | Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
* ext4: fix usless declarationsDmitri Monakho2013-04-104-5/+0
| | | | | | | This patch should fix sparse complains about shadow declatations. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: introduce reserved spaceLukas Czerner2013-04-106-23/+141
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently in ENOSPC condition when writing into unwritten space, or punching a hole, we might need to split the extent and grow extent tree. However since we can not allocate any new metadata blocks we'll have to zero out unwritten part of extent or punched out part of extent, or in the worst case return ENOSPC even though use actually does not allocate any space. Also in delalloc path we do reserve metadata and data blocks for the time we're going to write out, however metadata block reservation is very tricky especially since we expect that logical connectivity implies physical connectivity, however that might not be the case and hence we might end up allocating more metadata blocks than previously reserved. So in future, metadata reservation checks should be removed since we can not assure that we do not under reserve. And this is where reserved space comes into the picture. When mounting the file system we slice off a little bit of the file system space (2% or 4096 clusters, whichever is smaller) which can be then used for the cases mentioned above to prevent costly zeroout, or unexpected ENOSPC. The number of reserved clusters can be set via sysfs, however it can never be bigger than number of free clusters in the file system. Note that this patch fixes the failure of xfstest 274 as expected. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
* ext4: improve credit estimate for EXT4_SINGLEDATA_TRANS_BLOCKSJan Kara2013-04-091-2/+4
| | | | | | | | Estimate of 27 credits for allocation of a block in extent based inode is unnecessarily high. We can easily argue 20 is enough. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: speed-up releasing blocks on commitAndrey Sidorov2013-04-091-61/+172
| | | | | | | | | | | | | Improve mb_free_blocks speed by clearing entire range at once instead of iterating over each bit. Freeing block-by-block also makes buddy bitmap subtree flip twice making most of the work a no-op. Very few bits in buddy bitmap require change, e.g. freeing entire group is a 1 bit flip only. As a result, releasing blocks of 60G file now takes 5ms instead of 2.7s. This is especially good for non-preemptive kernels as there is no rescheduling during release. Signed-off-by: Andrey Sidorov <qrxd43@motorola.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: fix free space estimate in ext4_nonda_switch()Eric Whitney2013-04-091-7/+8
| | | | | | | | | | | | | | Values stored in s_freeclusters_counter and s_dirtyclusters_counter are both in cluster units. Remove the cluster to block conversion applied to s_freeclusters_counter causing an inflated estimate of free space because s_dirtyclusters_counter is not similarly converted. Rename free_blocks and dirty_blocks to better reflect the units these variables contain to avoid future confusion. This fix corrects ENOSPC failures for xfstests 127 and 231 on bigalloc file systems. Signed-off-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: fix deadlock with quota featureJan Kara2013-04-091-0/+2
| | | | | | | | | | We didn't mark hidden quota files with S_NOQUOTA flag and thus quota was accounted even for quota files. Thus we could recurse back to quota code when adding new blocks to quota file which can easily deadlock. Mark hidden quota files properly. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: fix incorrect lock ordering for ext4_ind_migrateDmitry Monakhov2013-04-081-7/+5
| | | | | | | | | existing locking ordering: journal-> i_data_sem, but ext4_ind_migrate() grab locks in opposite order which may result in deadlock. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: implementation of a new ioctl called EXT4_IOC_SWAP_BOOTDr. Tilmann Bubeck2013-04-085-26/+248
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new ioctl, EXT4_IOC_SWAP_BOOT which swaps i_blocks and associated attributes (like i_blocks, i_size, i_flags, ...) from the specified inode with inode EXT4_BOOT_LOADER_INO (#5). This is typically used to store a boot loader in a secure part of the filesystem, where it can't be changed by a normal user by accident. The data blocks of the previous boot loader will be associated with the given inode. This usercode program is a simple example of the usage: int main(int argc, char *argv[]) { int fd; int err; if ( argc != 2 ) { printf("usage: ext4-swap-boot-inode FILE-TO-SWAP\n"); exit(1); } fd = open(argv[1], O_WRONLY); if ( fd < 0 ) { perror("open"); exit(1); } err = ioctl(fd, EXT4_IOC_SWAP_BOOT); if ( err < 0 ) { perror("ioctl"); exit(1); } close(fd); exit(0); } [ Modified by Theodore Ts'o to fix a number of bugs in the original code.] Signed-off-by: Dr. Tilmann Bubeck <t.bubeck@reinform.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: print more info in ext4_print_free_blocks()Lukas Czerner2013-04-041-5/+8
| | | | | | Additionally print i_allocated_meta_blocks information as well. Signed-off-by: Lukas Czerner <lczerner@redhat.com>
* ext4: try to prepend extent to the existing oneLukas Czerner2013-04-041-23/+85
| | | | | | | | | | | | | | | | | | | | Currently when inserting extent in ext4_ext_insert_extent() we would only try to to see if we can append new extent to the found extent. If we can not, then we proceed with adding new extent into the extent tree, but then possibly merging it back again. We can avoid this situation by trying to append and prepend new extent to the existing ones. However since the new extent can be on either sides of the existing extent, we have to pick the right extent to try to append/prepend to. This patch adds the conditions to pick the right extent to append/prepend to and adds the actual prepending condition as well. This will also eliminate the need to use "reserved" block for possibly growing extent tree. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: Transfer initialized block to right neighbor if possibleLukas Czerner2013-04-041-46/+89
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently when converting extent to initialized we attempt to transfer initialized block to the left neighbour if possible when certain criteria are met. However we do not attempt to do the same for the right neighbor. This commit adds the possibility to transfer initialized block to the right neighbour if: 1. We're not converting the whole extent 2. Both extents are stored in the same extent tree node 3. Right neighbor is initialized 4. Right neighbor is logically abutting the current one 5. Right neighbor is physically abutting the current one 6. Right neighbor would not overflow the length limit This is basically the same logic as with transferring to the left. This will gain us some performance benefits since it is faster than inserting extent and then merging it. It would also prevent some situation in delalloc patch when we might run out of metadata reservation. This is due to the fact that we would attempt to split the extent first (possibly allocating new metadata block) even though we did not counted for that because it can (and will) be merged again. This commit fix that scenario, because we no longer need to split the extent in such case. Signed-off-by: Lukas Czerner <lczerner@redhat.com>
* ext4: introduce ext4_get_group_number()Lukas Czerner2013-04-044-18/+33
| | | | | | | | | | | | | | | | Currently on many places in ext4 we're using ext4_get_group_no_and_offset() even though we're only interested in knowing the block group of the particular block, not the offset within the block group so we can use more efficient way to compute block group. This patch introduces ext4_get_group_number() which computes block group for a given block much more efficiently. Use this function instead of ext4_get_group_no_and_offset() everywhere where we're only interested in knowing the block group. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: make ext4_block_in_group() much more efficientLukas Czerner2013-04-043-7/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently in when getting the block group number for a particular block in ext4_block_in_group() we're using ext4_get_group_no_and_offset() which uses do_div() to get the block group and the remainer which is offset within the group. We don't need all of that in ext4_block_in_group() as we only need to figure out the group number. This commit changes ext4_block_in_group() to calculate group number directly. This shows as a big improvement with regards to cpu utilization. Measuring fallocate -l 15T on fresh file system with perf showed that 23% of cpu time was spend in the ext4_get_group_no_and_offset(). With this change it completely disappears from the list only bumping the occurrence of ext4_init_block_bitmap() which is the biggest user of ext4_block_in_group() by 4%. As the result of this change on my system the fallocate call was approx. 10% faster. However since there is '-g' option in mkfs which allow us setting different groups size (mostly for developers) I've introduced new per file system flag whether we have a standard block group size or not. The flag is used to determine whether we can use the bit shift optimization or not. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: unregister es_shrinker if mount failedDmitry Monakhov2013-04-041-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise destroyed ext_sb_info will be part of global shinker list and result in the following OOPS: JBD2: corrupted journal superblock JBD2: recovery failed EXT4-fs (dm-2): error loading journal general protection fault: 0000 [#1] SMP Modules linked in: fuse acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel microcode sg button sd_mod crc_t10dif ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_\ mod CPU 1 Pid: 2758, comm: mount Not tainted 3.8.0-rc3+ #136 /DH55TC RIP: 0010:[<ffffffff811bfb2d>] [<ffffffff811bfb2d>] unregister_shrinker+0xad/0xe0 RSP: 0000:ffff88011d5cbcd8 EFLAGS: 00010207 RAX: 6b6b6b6b6b6b6b6b RBX: 6b6b6b6b6b6b6b53 RCX: 0000000000000006 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000246 RBP: ffff88011d5cbce8 R08: 0000000000000002 R09: 0000000000000001 R10: 0000000000000001 R11: 0000000000000000 R12: ffff88011cd3f848 R13: ffff88011cd3f830 R14: ffff88011cd3f000 R15: 0000000000000000 FS: 00007f7b721dd7e0(0000) GS:ffff880121a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007fffa6f75038 CR3: 000000011bc1c000 CR4: 00000000000007e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process mount (pid: 2758, threadinfo ffff88011d5ca000, task ffff880116aacb80) Stack: ffff88011cd3f000 ffffffff8209b6c0 ffff88011d5cbd18 ffffffff812482f1 00000000000003f3 00000000ffffffea ffff880115f4c200 0000000000000000 ffff88011d5cbda8 ffffffff81249381 ffff8801219d8bf8 ffffffff00000000 Call Trace: [<ffffffff812482f1>] deactivate_locked_super+0x91/0xb0 [<ffffffff81249381>] mount_bdev+0x331/0x340 [<ffffffff81376730>] ? ext4_alloc_flex_bg_array+0x180/0x180 [<ffffffff81362035>] ext4_mount+0x15/0x20 [<ffffffff8124869a>] mount_fs+0x9a/0x2e0 [<ffffffff81277e25>] vfs_kern_mount+0xc5/0x170 [<ffffffff81279c02>] do_new_mount+0x172/0x2e0 [<ffffffff8127aa56>] do_mount+0x376/0x380 [<ffffffff8127ab98>] sys_mount+0x138/0x150 [<ffffffff818ffed9>] system_call_fastpath+0x16/0x1b Code: 8b 05 88 04 eb 00 48 3d 90 ff 06 82 48 8d 58 e8 75 19 4c 89 e7 e8 e4 d7 2c 00 48 c7 c7 00 ff 06 82 e8 58 5f ef ff 5b 41 5c c9 c3 <48> 8b 4b 18 48 8b 73 20 48 89 da 31 c0 48 c7 c7 c5 a0 e4 81 e\ 8 RIP [<ffffffff811bfb2d>] unregister_shrinker+0xad/0xe0 RSP <ffff88011d5cbcd8> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
* ext4: fix journal callback list traversalDmitry Monakhov2013-04-043-7/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is incorrect to use list_for_each_entry_safe() for journal callback traversial because ->next may be removed by other task: ->ext4_mb_free_metadata() ->ext4_mb_free_metadata() ->ext4_journal_callback_del() This results in the following issue: WARNING: at lib/list_debug.c:62 __list_del_entry+0x1c0/0x250() Hardware name: list_del corruption. prev->next should be ffff88019a4ec198, but was 6b6b6b6b6b6b6b6b Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button sd_mod crc_t10dif aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod Pid: 16400, comm: jbd2/dm-1-8 Tainted: G W 3.8.0-rc3+ #107 Call Trace: [<ffffffff8106fb0d>] warn_slowpath_common+0xad/0xf0 [<ffffffff8106fc06>] warn_slowpath_fmt+0x46/0x50 [<ffffffff813637e9>] ? ext4_journal_commit_callback+0x99/0xc0 [<ffffffff8148cae0>] __list_del_entry+0x1c0/0x250 [<ffffffff813637bf>] ext4_journal_commit_callback+0x6f/0xc0 [<ffffffff813ca336>] jbd2_journal_commit_transaction+0x23a6/0x2570 [<ffffffff8108aa42>] ? try_to_del_timer_sync+0x82/0xa0 [<ffffffff8108b491>] ? del_timer_sync+0x91/0x1e0 [<ffffffff813d3ecf>] kjournald2+0x19f/0x6a0 [<ffffffff810ad630>] ? wake_up_bit+0x40/0x40 [<ffffffff813d3d30>] ? bit_spin_lock+0x80/0x80 [<ffffffff810ac6be>] kthread+0x10e/0x120 [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70 [<ffffffff818ff6ac>] ret_from_fork+0x7c/0xb0 [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70 This patch fix the issue as follows: - ext4_journal_commit_callback() make list truly traversial safe simply by always starting from list_head - fix race between two ext4_journal_callback_del() and ext4_journal_callback_try_del() Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz> Cc: stable@vger.kernel.com
* jbd2: fix race between jbd2_journal_remove_checkpoint and ->j_commit_callbackDmitry Monakhov2013-04-042-22/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following race is possible: [kjournald2] other_task jbd2_journal_commit_transaction() j_state = T_FINISHED; spin_unlock(&journal->j_list_lock); ->jbd2_journal_remove_checkpoint() ->jbd2_journal_free_transaction(); ->kmem_cache_free(transaction) ->j_commit_callback(journal, transaction); -> USE_AFTER_FREE WARNING: at lib/list_debug.c:62 __list_del_entry+0x1c0/0x250() Hardware name: list_del corruption. prev->next should be ffff88019a4ec198, but was 6b6b6b6b6b6b6b6b Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button sd_mod crc_t10dif aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod Pid: 16400, comm: jbd2/dm-1-8 Tainted: G W 3.8.0-rc3+ #107 Call Trace: [<ffffffff8106fb0d>] warn_slowpath_common+0xad/0xf0 [<ffffffff8106fc06>] warn_slowpath_fmt+0x46/0x50 [<ffffffff813637e9>] ? ext4_journal_commit_callback+0x99/0xc0 [<ffffffff8148cae0>] __list_del_entry+0x1c0/0x250 [<ffffffff813637bf>] ext4_journal_commit_callback+0x6f/0xc0 [<ffffffff813ca336>] jbd2_journal_commit_transaction+0x23a6/0x2570 [<ffffffff8108aa42>] ? try_to_del_timer_sync+0x82/0xa0 [<ffffffff8108b491>] ? del_timer_sync+0x91/0x1e0 [<ffffffff813d3ecf>] kjournald2+0x19f/0x6a0 [<ffffffff810ad630>] ? wake_up_bit+0x40/0x40 [<ffffffff813d3d30>] ? bit_spin_lock+0x80/0x80 [<ffffffff810ac6be>] kthread+0x10e/0x120 [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70 [<ffffffff818ff6ac>] ret_from_fork+0x7c/0xb0 [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70 In order to demonstrace this issue one should mount ext4 with mount -o discard option on SSD disk. This makes callback longer and race window becomes wider. In order to fix this we should mark transaction as finished only after callbacks have completed Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
* ext4: support simple conversion of extent-mapped inodes to use i_blocksTheodore Ts'o2013-04-043-13/+69
| | | | | | | | | | | | | | | | | In order to make it simpler to test the code which support i_blocks/indirect-mapped inodes, support the conversion of inodes which are less than 12 blocks and which are contained in no more than a single extent. The primary intended use of this code is to converting freshly created zero-length files and empty directories. Note that the version of chattr in e2fsprogs 1.42.7 and earlier has a check that prevents the clearing of the extent flag. A simple patch which allows "chattr -e <file>" to work will be checked into the e2fsprogs git repository. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4/jbd2: don't wait (forever) for stale tid caused by wraparoundTheodore Ts'o2013-04-044-4/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the case where an inode has a very stale transaction id (tid) in i_datasync_tid or i_sync_tid, it's possible that after a very large (2**31) number of transactions, that the tid number space might wrap, causing tid_geq()'s calculations to fail. Commit deeeaf13 "jbd2: fix fsync() tid wraparound bug", later modified by commit e7b04ac0 "jbd2: don't wake kjournald unnecessarily", attempted to fix this problem, but it only avoided kjournald spinning forever by fixing the logic in jbd2_log_start_commit(). Unfortunately, in the codepaths in fs/ext4/fsync.c and fs/ext4/inode.c that might call jbd2_log_start_commit() with a stale tid, those functions will subsequently call jbd2_log_wait_commit() with the same stale tid, and then wait for a very long time. To fix this, we replace the calls to jbd2_log_start_commit() and jbd2_log_wait_commit() with a call to a new function, jbd2_complete_transaction(), which will correctly handle stale tid's. As a bonus, jbd2_complete_transaction() will avoid locking j_state_lock for writing unless a commit needs to be started. This should have a small (but probably not measurable) improvement for ext4's scalability. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reported-by: Ben Hutchings <ben@decadent.org.uk> Reported-by: George Barnett <gbarnett@atlassian.com> Cc: stable@vger.kernel.org
* ext4: add might_sleep() annotationsTheodore Ts'o2013-04-042-0/+10
| | | | | Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Lukas Czerner <lczerner@redhat.com>
* ext4: add mutex_is_locked() assertion to ext4_truncate()Theodore Ts'o2013-04-041-0/+7
| | | | | | | [ Added fixup from Lukáš Czerner which only checks the assertion when the inode is not new and is not being freed. ] Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: refactor truncate codeTheodore Ts'o2013-04-034-148/+78
| | | | | | | Move common code in ext4_ind_truncate() and ext4_ext_truncate() into ext4_truncate(). This saves over 60 lines of code. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: refactor punch hole codeTheodore Ts'o2013-04-034-347/+183
| | | | | | | | | | | | | | | | | | | | | Move common code in ext4_ind_punch_hole() and ext4_ext_punch_hole() into ext4_punch_hole(). This saves over 150 lines of code. This also fixes a potential bug when the punch_hole() code is racing against indirect-to-extents or extents-to-indirect migation. We are currently using i_mutex to protect against changes to the inode flag; specifically, the append-only, immutable, and extents inode flags. So we need to take i_mutex before deciding whether to use the extents-specific or indirect-specific punch_hole code. Also, there was a missing call to ext4_inode_block_unlocked_dio() in the indirect punch codepath. This was added in commit 02d262dffcf4c to block DIO readers racing against the punch operation in the codepath for extent-mapped inodes, but it was missing for indirect-block mapped inodes. One of the advantages of refactoring the code is that it makes such oversights much less likely. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: fold ext4_alloc_blocks() in ext4_alloc_branch()Theodore Ts'o2013-04-031-181/+46
| | | | | | | | | | The older code was far more complicated than it needed to be because of how we spliced in the ext4's new multiblock allocator into ext3's indirect block code. By folding ext4_alloc_blocks() into ext4_alloc_branch(), we make the code far more understable, shave off over 130 lines of code and half a kilobyte of compiled object code. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: fold ext4_generic_write_end() into ext4_write_end()Zheng Liu2013-04-031-40/+27
| | | | | | | | | | After collapsing the handling of data ordered and data writeback codepath, ext4_generic_write_end() has only one caller, ext4_write_end(). So we fold it into ext4_write_end(). Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Lukas Czerner <lczerner@redhat.com>
* ext4: collapse handling of data=ordered and data=writeback codepathsTheodore Ts'o2013-04-033-105/+31
| | | | | | | | | The only difference between how we handle data=ordered and data=writeback is a single call to ext4_jbd2_file_inode(). Eliminate code duplication by factoring out redundant the code paths. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Lukas Czerner <lczerner@redhat.com>
* ext4: fix big-endian bugs which could cause fs corruptionsZheng Liu2013-04-032-6/+9
| | | | | | | | | | | | | | | | | | | | | | When an extent was zeroed out, we forgot to do convert from cpu to le16. It could make us hit a BUG_ON when we try to write dirty pages out. So fix it. [ Also fix a bug found by Dmitry Monakhov where we were missing le32_to_cpu() calls in the new indirect punch hole code. There are a number of other big endian warnings found by static code analyzers, but we'll wait for the next merge window to fix them all up. These fixes are designed to be Obviously Correct by code inspection, and easy to demonstrate that it won't make any difference (and hence, won't introduce any bugs) on little endian architectures such as x86. --tytso ] Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reported-by: CAI Qian <caiqian@redhat.com> Reported-by: Christian Kujau <lists@nerdbynature.de> Cc: Dmitry Monakhov <dmonakhov@openvz.org>
* Linux 3.9-rc5v3.9-rc5Linus Torvalds2013-04-011-1/+1
|
* Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dmaLinus Torvalds2013-03-312-4/+20
|\ | | | | | | | | | | | | | | | | | | | | | | Pull slave-dmaengine fixes from Vinod Koul: "Two fixes for slave-dmaengine. The first one is for making slave_id value correct for dw_dmac and the other one fixes the endieness in DT parsing" * 'fixes' of git://git.infradead.org/users/vkoul/slave-dma: dw_dmac: adjust slave_id accordingly to request line base dmaengine: dw_dma: fix endianess for DT xlate function
| * dw_dmac: adjust slave_id accordingly to request line baseAndy Shevchenko2013-03-302-1/+17
| | | | | | | | | | | | | | | | | | | | | | | | On some hardware configurations we have got the request line with the offset. The patch introduces convert_slave_id() helper for that cases. The request line base is came from the driver data provided by the platform_device_id table. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
| * dmaengine: dw_dma: fix endianess for DT xlate functionArnd Bergmann2013-03-301-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | As reported by Wu Fengguang's build robot tracking sparse warnings, the dma_spec arguments in the dw_dma_xlate are already byte swapped on little-endian platforms and must not get swapped again. This code is currently not used anywhere, but will be used in Linux 3.10 when the ARM SPEAr platform starts using the generic DMA DT binding. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com>
* | Merge branch 'v4l_for_linus' of ↵Linus Torvalds2013-03-3111-38/+53
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "For a some fixes for Kernel 3.9: - subsystem build fix when VIDEO_DEV=y, VIDEO_V4L2=m and I2C=m - compilation fix for arm multiarch preventing IR_RX51 to be selected - regression fix at bttv crop logic - s5p-mfc/m5mols/exynos: a few fixes for cameras on exynos hardware" * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] [REGRESSION] bt8xx: Fix too large height in cropcap [media] fix compilation with both V4L2 and I2C as 'm' [media] m5mols: Fix bug in stream on handler [media] s5p-fimc: Do not attempt to disable not enabled media pipeline [media] s5p-mfc: Fix encoder control 15 issue [media] s5p-mfc: Fix frame skip bug [media] s5p-fimc: send valid m2m ctx to fimc_m2m_job_finish [media] exynos-gsc: send valid m2m ctx to gsc_m2m_job_finish [media] fimc-lite: Fix the variable type to avoid possible crash [media] fimc-lite: Initialize 'step' field in fimc_lite_ctrl structure [media] ir: IR_RX51 only works on OMAP2
| * | [media] [REGRESSION] bt8xx: Fix too large height in cropcapHans de Goede2013-03-261-5/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit a1fd287780c8e91fed4957b30c757b0c93021162: "[media] bttv-driver: fix two warnings" cropcap.defrect.height and cropcap.bounds.height for the PAL entry are 32 resp 30 pixels too large, if a userspace app (ie xawtv) actually tries to use the full advertised height, the resulting image is broken in ways only a screenshot can describe. The cause of this is the fix for this warning: drivers/media/pci/bt8xx/bttv-driver.c:308:3: warning: initialized field overwritten [-Woverride-init] In this chunk of the commit: @@ -301,11 +301,10 @@ const struct bttv_tvnorm bttv_tvnorms[] = { /* totalwidth */ 1135, /* sqwidth */ 944, /* vdelay */ 0x20, - /* sheight */ 576, - /* videostart0 */ 23) /* bt878 (and bt848?) can capture another line below active video. */ - .cropcap.bounds.height = (576 + 2) + 0x20 - 2, + /* sheight */ (576 + 2) + 0x20 - 2, + /* videostart0 */ 23) },{ .v4l2_id = V4L2_STD_NTSC_M | V4L2_STD_NTSC_M_KR, .name = "NTSC", Which replaces the overriding of cropcap.bounds.height initialization outside of the CROPCAP macro (which also initializes it), with passing a different sheight value to the CROPCAP macro. There are 2 problems with this warning fix: 1) The sheight value is used twice in the CROPCAP macro, and the old code only changed one resulting value. 2) The old code increased the .cropcap.bounds.height value (and did not touch the .cropcap.defrect.height value at all) by 2, where as the fixed code increases it by 32, as the fixed code passes (576 + 2) + 0x20 - 2 to the CROPCAP macro, but the + 0x20 - 2 is already done by the macro so now is done twice for .cropcap.bounds.height, and also is applied to .cropcap.defrect.height where it should not be applied at all. This patch fixes this by adding an extraheight parameter to the CROPCAP entry and using it for the PAL entry. Cc: stable@kernel.org # For Kernel 3.8 Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * | [media] fix compilation with both V4L2 and I2C as 'm'Mauro Carvalho Chehab2013-03-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When config options are: CONFIG_VIDEO_DEV=y CONFIG_VIDEO_V4L2=m CONFIG_I2C=m Compilation breaks, as reported by: https://bugzilla.kernel.org/show_bug.cgi?id=55681 Before changeset 7b34be71db533f3e0cf93d53cf62d036cdb5418a, no compilation errors occurred. However, the I2C code there at v4l2-device was incorrectly disabled. Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * | [media] m5mols: Fix bug in stream on handlerAndrzej Hajda2013-03-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to improper condition check streaming start for some pixel formats was prevent and the s_stream just reatuned -EINVAL. This fixes regression introduced in commit 5565a2ad47cdd8e697 [media] m5mols: Protect driver data with a mutex. Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * | [media] s5p-fimc: Do not attempt to disable not enabled media pipelineSylwester Nawrocki2013-03-211-20/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes following warnings when all links are being disconnected: [ 20.080000] WARNING: at drivers/media/platform/s5p-fimc/fimc-mdevice.c:1269 __fimc_md_set_camclk+0x208/0x20c() [ 20.090000] Modules linked in: [ 20.095000] [<c001603c>] (unwind_backtrace+0x0/0x13c) from [<c00246dc>] (warn_slowpath_common+0x54/0x64) [ 20.105000] [<c00246dc>] (warn_slowpath_common+0x54/0x64) from [<c0024708>] (warn_slowpath_null+0x1c/0x24) [ 20.115000] [<c0024708>] (warn_slowpath_null+0x1c/0x24) from [<c0323fe8>] (__fimc_md_set_camclk+0x208/0x20c) [ 20.125000] [<c0323fe8>] (__fimc_md_set_camclk+0x208/0x20c) from [<c0324368>] (__fimc_pipeline_close+0x38/0x48) [ 20.135000] [<c0324368>] (__fimc_pipeline_close+0x38/0x48) from [<c0325848>] (fimc_md_link_notify+0x10c/0x198) [ 20.145000] [<c0325848>] (fimc_md_link_notify+0x10c/0x198) from [<c02f9dd4>] (__media_entity_setup_link+0x1c0/0x1e8) [ 20.155000] [<c02f9dd4>] (__media_entity_setup_link+0x1c0/0x1e8) from [<c02f9710>] (media_device_ioctl+0x2c0/0x41c) [ 20.165000] [<c02f9710>] (media_device_ioctl+0x2c0/0x41c) from [<c02f9938>] (media_ioctl+0x30/0x34) [ 20.175000] [<c02f9938>] (media_ioctl+0x30/0x34) from [<c00cf0bc>] (do_vfs_ioctl+0x84/0x5e8) [ 20.185000] [<c00cf0bc>] (do_vfs_ioctl+0x84/0x5e8) from [<c00cf65c>] (sys_ioctl+0x3c/0x60) [ 20.190000] [<c00cf65c>] (sys_ioctl+0x3c/0x60) from [<c000f040>] (ret_fast_syscall+0x0/0x30) Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * | [media] s5p-mfc: Fix encoder control 15 issueArun Kumar K2013-03-211-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mfc-encoder is not working in the latest kernel giving the erorr "Adding control (15) failed". Adding the missing step parameter in this control to fix the issue. Signed-off-by: Arun Kumar K <arun.kk@samsung.com> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * | [media] s5p-mfc: Fix frame skip bugArun Kumar K2013-03-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The issue was seen in VP8 decoding where the last frame was skipped by the driver. This patch gets the correct frame_type value to fix this bug. Signed-off-by: Arun Kumar K <arun.kk@samsung.com> Signed-off-by: Arun Mankuzhi <arun.m@samsung.com> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * | [media] s5p-fimc: send valid m2m ctx to fimc_m2m_job_finishShaik Ameer Basha2013-03-211-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fimc_m2m_job_finish() has to be called with the m2m context for the necessary cleanup while resume. But currently fimc_m2m_job_finish() always passes m2m context as NULL. This patch preserves the context before making it null, for necessary cleanup. Signed-off-by: Shaik Ameer Basha <shaik.ameer@samsung.com> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * | [media] exynos-gsc: send valid m2m ctx to gsc_m2m_job_finishShaik Ameer Basha2013-03-211-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gsc_m2m_job_finish() has to be called with the m2m context for the necessary cleanup while resume. But currently gsc_m2m_job_finish() always passes m2m context as NULL. This patch preserves the context before making it null, for necessary cleanup. Use gsc_m2m_opened() instead gsc_m2m_active() in gsc_resume(). Signed-off-by: Shaik Ameer Basha <shaik.ameer@samsung.com> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * | [media] fimc-lite: Fix the variable type to avoid possible crashShaik Ameer Basha2013-03-211-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | Changing the variable type to 'int' from 'unsigned int'. Driver logic expects the variable type to be 'int'. Signed-off-by: Shaik Ameer Basha <shaik.ameer@samsung.com> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * | [media] fimc-lite: Initialize 'step' field in fimc_lite_ctrl structureShaik Ameer Basha2013-03-211-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | v4l2_ctrl_new() uses check_range() for control range checking. This function expects 'step' value for V4L2_CTRL_TYPE_BOOLEAN type control. If 'step' value doesn't match to '1', it returns -ERANGE error. This patch adds the default .step value to 1. Signed-off-by: Shaik Ameer Basha <shaik.ameer@samsung.com> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * | [media] ir: IR_RX51 only works on OMAP2Arnd Bergmann2013-03-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This driver can be enabled on OMAP1 at the moment, which breaks allyesconfig for that platform. Let's mark it OMAP2PLUS-only in Kconfig, since that is the only thing it builds on. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Timo Kokkonen <timo.t.kokkonen@iki.fi> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * | Merge tag 'v3.9-rc3' into v4l_for_linusMauro Carvalho Chehab2013-03-1910483-271885/+524714
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Linux 3.9-rc3 * tag 'v3.9-rc3': (11231 commits) Linux 3.9-rc3 perf,x86: fix link failure for non-Intel configs perf,x86: fix wrmsr_on_cpu() warning on suspend/resume Btrfs: fix warning of free_extent_map perf,x86: fix kernel crash with PEBS/BTS after suspend/resume ALSA: hda - Fix missing EAPD/GPIO setup for Cirrus codecs sound: sequencer: cap array index in seq_chn_common_event() mfd: twl4030-madc: Remove __exit_p annotation ALSA: hda/ca0132 - Remove extra setting of dsp_state. ALSA: hda/ca0132 - Check download state of DSP. ALSA: hda/ca0132 - Check if dspload_image succeeded. mm/fremap.c: fix possible oops on error path list: Fix double fetch of pointer in hlist_entry_safe() Btrfs: fix warning when creating snapshots Btrfs: return as soon as possible when edquot happens Btrfs: return EIO if we have extent tree corruption btrfs: use rcu_barrier() to wait for bdev puts at unmount Btrfs: remove btrfs_try_spin_lock Btrfs: get better concurrency for snapshot-aware defrag work hwmon: (pmbus/ltc2978) Fix temperature reporting ...
* | \ \ Merge tag 'for-linus-20130331' of git://git.kernel.dk/linux-blockLinus Torvalds2013-03-3121-261/+716
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block fixes from Jens Axboe: "Alright, this time from 10K up in the air. Collection of fixes that have been queued up since the merge window opened, hence postponed until later in the cycle. The pull request contains: - A bunch of fixes for the xen blk front/back driver. - A round of fixes for the new IBM RamSan driver, fixing various nasty issues. - Fixes for multiple drives from Wei Yongjun, bad handling of return values and wrong pointer math. - A fix for loop properly killing partitions when being detached." * tag 'for-linus-20130331' of git://git.kernel.dk/linux-block: (25 commits) mg_disk: fix error return code in mg_probe() rsxx: remove unused variable rsxx: enable error return of rsxx_eeh_save_issued_dmas() block: removes dynamic allocation on stack Block: blk-flush: Fixed indent code style cciss: fix invalid use of sizeof in cciss_find_cfgtables() loop: cleanup partitions when detaching loop device loop: fix error return code in loop_add() mtip32xx: fix error return code in mtip_pci_probe() xen-blkfront: remove frame list from blk_shadow xen-blkfront: pre-allocate pages for requests xen-blkback: don't store dev_bus_addr xen-blkfront: switch from llist to list xen-blkback: fix foreach_grant_safe to handle empty lists xen-blkfront: replace kmalloc and then memcpy with kmemdup xen-blkback: fix dispatch_rw_block_io() error path rsxx: fix missing unlock on error return in rsxx_eeh_remap_dmas() Adding in EEH support to the IBM FlashSystem 70/80 device driver block: IBM RamSan 70/80 error message bug fix. block: IBM RamSan 70/80 branding changes. ...