path: root/fs/ext4
Age | Commit message | Author
2011-01-11 | Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 | Linus Torvalds
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (44 commits)
  ext4: fix trimming starting with block 0 with small blocksize
  ext4: revert buggy trim overflow patch
  ext4: don't pass entire map to check_eofblocks_fl
  ext4: fix memory leak in ext4_free_branches
  ext4: remove ext4_mb_return_to_preallocation()
  ext4: flush the i_completed_io_list during ext4_truncate
  ext4: add error checking to calls to ext4_handle_dirty_metadata()
  ext4: fix trimming of a single group
  ext4: fix uninitialized variable in ext4_register_li_request
  ext4: dynamically allocate the jbd2_inode in ext4_inode_info as necessary
  ext4: drop i_state_flags on architectures with 64-bit longs
  ext4: reorder ext4_inode_info structure elements to remove unneeded padding
  ext4: drop ec_type from the ext4_ext_cache structure
  ext4: use ext4_lblk_t instead of sector_t for logical blocks
  ext4: replace i_delalloc_reserved_flag with EXT4_STATE_DELALLOC_RESERVED
  ext4: fix 32bit overflow in ext4_ext_find_goal()
  ext4: add more error checks to ext4_mkdir()
  ext4: ext4_ext_migrate should use NULL not 0
  ext4: Use ext4_error_file() to print the pathname to the corrupted inode
  ext4: use IS_ERR() to check for errors in ext4_error_file
  ...
2011-01-11 | ext4: fix trimming starting with block 0 with small blocksize | Jan Kara
When s_first_data_block is not zero (which happens e.g. when block size is 1KB) and trim ioctl is called to start trimming from block 0, the math in ext4_get_group_no_and_offset() overflows. The overall result is that ioctl returns EINVAL which is kind of unexpected and we probably don't want userspace tools to bother with internal details of filesystem structure. So just silently increase starting offset (and shorten length) when starting block is below s_first_data_block. CC: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-11 | ext4: revert buggy trim overflow patch | Theodore Ts'o
This reverts commit 4f531501e44: ext4: fix possible overflow in ext4_trim_fs() Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10 | ext4: don't pass entire map to check_eofblocks_fl | Eric Sandeen
Since check_eofblocks_fl() only uses the m_lblk portion of the map structure, we may as well pass that directly, rather than passing the entire map, which IMHO obfuscates what parameters check_eofblocks_fl() cares about. Not a big deal, but seems tidier and less confusing, to me. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10 | ext4: fix memory leak in ext4_free_branches | Theodore Ts'o
Commit 40389687 moved a call to ext4_forget() out of ext4_free_branches and let ext4_free_blocks() handle calling bforget(). But that change unfortunately did not replace the call to ext4_forget() with brelse(), which was needed to drop the in-use count of the indirect block's buffer head, which led to a memory leak when deleting files that used indirect blocks. Fix this. Thanks to Hugh Dickins for pointing this out. Cc: stable@kernel.org Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10 | ext4: remove ext4_mb_return_to_preallocation() | Theodore Ts'o
This function was never implemented, except for a BUG_ON which was tripping when ext4 is run without a journal. The problem is that although the comment asserts that "truncate (which is the only way to free block) discards all preallocations", ext4_free_blocks() is also called in various error recovery paths when blocks have been allocated, but for various reasons, we were not able to use those data blocks (for example, because we ran out of memory while trying to manipulate the extent tree, or some other similar situation). In addition to the fact that this function isn't implemented except for the incorrect BUG_ON, the single caller of this function, ext4_free_blocks(), doesn't use it at all if the journal is enabled. So remove the (stub) function entirely for now. If we decide it's better to add it back, it's only going to be useful with a relatively large number of code changes anyway. Google-Bug-Id: 3236408 Cc: Jiaying Zhang <jiayingz@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10 | ext4: flush the i_completed_io_list during ext4_truncate | Jiaying Zhang
Ted first found the bug when running a 2.6.36 kernel with the dioread_nolock mount option: xfstests #13 complained about a wrong file size during fsck. However, the bug exists in older kernels as well, although it is somewhat harder to trigger. The problem is that ext4_end_io_work() can happen after we have truncated an inode to a smaller size. Then when ext4_end_io_work() calls ext4_convert_unwritten_extents(), we may reallocate some blocks that have been truncated, so the inode size becomes inconsistent with the allocated blocks. The following patch flushes the i_completed_io_list during truncate to reduce the risk that some pending end_io requests are executed later and convert already truncated blocks to initialized. Note that although the fix helps reduce the problem a lot, there may still be a race window between vmtruncate() and ext4_end_io_work(). The fundamental problem is that if vmtruncate() is called without either i_mutex or i_alloc_sem held, it can race with an ongoing write request so that the io_end request is processed later when the corresponding blocks have been truncated. Ted and I have discussed the problem offline and we saw a few ways to fix the race completely:
a) We guarantee that the i_mutex lock and the i_alloc_sem write lock are both held whenever vmtruncate() is called. The i_mutex lock prevents any new write requests from entering writeback and the i_alloc_sem prevents the race from ext4_page_mkwrite(). Currently we hold both locks if vmtruncate() is called from do_truncate(), which is probably the most common case. However, there are places where we may call vmtruncate() without holding either i_mutex or i_alloc_sem. I would like to ask for other people's opinions on what locks are expected to be held before calling vmtruncate(). There seems to be a disagreement among the callers of that function.
b) We change the ext4 write path so that we change the extent tree to contain the newly allocated blocks and update i_size both at the same time --- when the write of the data blocks is completed.
c) We add some additional locking to synchronize vmtruncate() and ext4_end_io_work(). This approach may have performance implications so we need to be careful.
All of the above proposals may require more substantial changes, so we may consider taking the following patch as a bandaid. Signed-off-by: Jiaying Zhang <jiayingz@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10 | ext4: add error checking to calls to ext4_handle_dirty_metadata() | Theodore Ts'o
Call ext4_std_error() in various places when we can't bail out cleanly, so the file system can be marked as in error. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10 | ext4: fix trimming of a single group | Jan Kara
When ext4_trim_fs() is called to trim a part of a single group, the logic will wrongly set last block of the interval to 'len' instead of 'first_block + len'. Thus a shorter interval is possibly trimmed. Fix it. CC: Lukas Czerner <lczerner@redhat.com> Cc: stable@kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10 | ext4: fix uninitialized variable in ext4_register_li_request | Andrew Morton
fs/ext4/super.c: In function 'ext4_register_li_request': fs/ext4/super.c:2936: warning: 'ret' may be used uninitialized in this function It looks buggy to me, too. Cc: Lukas Czerner <lczerner@redhat.com> Cc: stable@kernel.org Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10 | ext4: dynamically allocate the jbd2_inode in ext4_inode_info as necessary | Theodore Ts'o
Replace the jbd2_inode structure (which is 48 bytes) with a pointer and only allocate the jbd2_inode when it is needed --- that is, when the file system has a journal present and the inode has been opened for writing. This allows us to further slim down the ext4_inode_info structure. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10 | ext4: drop i_state_flags on architectures with 64-bit longs | Theodore Ts'o
We can store the dynamic inode state flags in the high bits of EXT4_I(inode)->i_flags, and eliminate i_state_flags. This saves 8 bytes from the size of ext4_inode_info structure, which when multiplied by the number of inodes in the inode cache, can save a lot of memory. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
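The idea of sharing one 64-bit word between persistent flags and dynamic state bits can be sketched as follows. The struct, the shift of 32, and the helper names are assumptions made for illustration. The kernel operates on an unsigned long with atomic bit operations, whereas this sketch uses plain, non-atomic arithmetic on a `uint64_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: bits 0-31 hold persistent flags, bits 32-63 dynamic state. */
#define STATE_SHIFT 32

struct inode_info {             /* hypothetical stand-in for ext4_inode_info */
    uint64_t i_flags;
};

static void set_state(struct inode_info *ei, unsigned bit)
{
    assert(bit < 32);
    ei->i_flags |= UINT64_C(1) << (STATE_SHIFT + bit);
}

static void clear_state(struct inode_info *ei, unsigned bit)
{
    assert(bit < 32);
    ei->i_flags &= ~(UINT64_C(1) << (STATE_SHIFT + bit));
}

static int test_state(const struct inode_info *ei, unsigned bit)
{
    assert(bit < 32);
    return (int)((ei->i_flags >> (STATE_SHIFT + bit)) & 1);
}

/* The low half is untouched by the state helpers above. */
static uint32_t persistent_flags(const struct inode_info *ei)
{
    return (uint32_t)ei->i_flags;
}
```

Because the state bits live above bit 31, setting or clearing them never disturbs the persistent flags in the low half, so no separate field is needed.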
2011-01-10 | ext4: reorder ext4_inode_info structure elements to remove unneeded padding | Theodore Ts'o
By reordering the elements in the ext4_inode_info structure, we can reduce the padding needed on an x86_64 system by 16 bytes. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
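As a small illustration of why member order matters, the two hypothetical structs below hold the same fields. They are not the ext4_inode_info layout, and the exact sizes depend on the ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small members interleaved with large ones: each one drags padding along. */
struct padded {
    uint8_t  a;
    uint64_t b;
    uint8_t  c;
    uint64_t d;
};

/* Same members, largest first: the small ones share a single tail slot. */
struct reordered {
    uint64_t b;
    uint64_t d;
    uint8_t  a;
    uint8_t  c;
};

/* Number of bytes the reordering saves on the current ABI. */
static size_t bytes_saved(void)
{
    assert(sizeof(struct reordered) <= sizeof(struct padded));
    return sizeof(struct padded) - sizeof(struct reordered);
}
```

On the kernel side, a tool such as pahole is the usual way to find such holes, though the commit message does not say how this particular layout was derived.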
2011-01-10 | ext4: drop ec_type from the ext4_ext_cache structure | Theodore Ts'o
We can encode the ec_type information by using ee_len == 0 to denote EXT4_EXT_CACHE_NO, ee_start == 0 to denote EXT4_EXT_CACHE_GAP, and if neither is true, then the cache type must be EXT4_EXT_CACHE_EXTENT. This allows us to reduce the size of ext4_ext_inode by another 8 bytes. (ec_type is 4 bytes, plus another 4 bytes of padding) Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
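The encoding described above can be written as a small classifier. The struct and enum below are hypothetical stand-ins that reuse the field names from the commit message. The kernel's actual field names and types may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum cache_type { CACHE_NO, CACHE_GAP, CACHE_EXTENT };

struct ext_cache {              /* hypothetical stand-in, no type field */
    uint64_t ee_start;          /* physical start block, 0 means "gap" */
    uint32_t ee_block;          /* first logical block covered */
    uint32_t ee_len;            /* number of blocks, 0 means "nothing cached" */
};

/* Derive the cache type from the remaining fields instead of storing it. */
static enum cache_type cache_type(const struct ext_cache *ec)
{
    assert(ec != NULL);

    if (ec->ee_len == 0)
        return CACHE_NO;
    if (ec->ee_start == 0)
        return CACHE_GAP;
    return CACHE_EXTENT;
}
```

The order of the two tests matters: an empty cache entry is reported as CACHE_NO regardless of what ee_start contains.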
2011-01-10 | ext4: use ext4_lblk_t instead of sector_t for logical blocks | Theodore Ts'o
This fixes a number of places where we used sector_t instead of ext4_lblk_t for logical blocks, which for ext4 are still 32-bit data types. No point wasting space in the ext4_inode_info structure, and requiring 64-bit arithmetic on 32-bit systems, when it isn't necessary. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10 | ext4: replace i_delalloc_reserved_flag with EXT4_STATE_DELALLOC_RESERVED | Theodore Ts'o
Remove the short element i_delalloc_reserved_flag from the ext4_inode_info structure and replace it with a new bit in i_state_flags. Since we have an ext4_inode_info for every ext4 inode cached in the inode cache, any savings we can produce here is a very good thing from a memory utilization perspective. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10 | ext4: fix 32bit overflow in ext4_ext_find_goal() | Kazuya Mio
ext4_ext_find_goal() returns an ideal physical block number that the block allocator tries to allocate first. However, if a required file offset is smaller than the existing extent's one, ext4_ext_find_goal() returns a wrong block number because it may overflow at "block - le32_to_cpu(ex->ee_block)". This patch fixes the problem. ext4_ext_find_goal() will also return a wrong block number in case a file offset of the existing extent is too big. In this case, the ideal physical block number is fixed in ext4_mb_initialize_context(), so it's no problem.
reproduce:
  # dd if=/dev/zero of=/mnt/mp1/tmp bs=127M count=1 oflag=sync
  # dd if=/dev/zero of=/mnt/mp1/file bs=512K count=1 seek=1 oflag=sync
  # filefrag -v /mnt/mp1/file
  Filesystem type is: ef53
  File size of /mnt/mp1/file is 1048576 (256 blocks, blocksize 4096)
  ext logical physical expected length flags
    0     128    67456             128 eof
  /mnt/mp1/file: 2 extents found
  # rm -rf /mnt/mp1/tmp
  # echo $((512*4096)) > /sys/fs/ext4/loop0/mb_stream_req
  # dd if=/dev/zero of=/mnt/mp1/file bs=512K count=1 oflag=sync conv=notrunc
result (linux-2.6.37-rc2 + ext4 patch queue):
  # filefrag -v /mnt/mp1/file
  Filesystem type is: ef53
  File size of /mnt/mp1/file is 1048576 (256 blocks, blocksize 4096)
  ext logical physical expected length flags
    0       0    33280             128
    1     128    67456    33407    128 eof
  /mnt/mp1/file: 2 extents found
result (apply this patch):
  # filefrag -v /mnt/mp1/file
  Filesystem type is: ef53
  File size of /mnt/mp1/file is 1048576 (256 blocks, blocksize 4096)
  ext logical physical expected length flags
    0       0    66560             128
    1     128    67456    66687    128 eof
  /mnt/mp1/file: 2 extents found
Signed-off-by: Kazuya Mio <k-mio@sx.jp.nec.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
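The wrap-around can be reproduced with plain unsigned arithmetic. The two functions below are a sketch, not the kernel code: the names are hypothetical, and `goal_checked()` shows one plausible way to avoid the overflow (branching on which offset is larger), which is an assumption about the shape of the fix.

```c
#include <assert.h>
#include <stdint.h>

/*
 * block:    logical block being allocated
 * ex_block: first logical block of the neighbouring extent
 * ex_start: first physical block of that extent
 */
static uint64_t goal_buggy(uint32_t block, uint32_t ex_block, uint64_t ex_start)
{
    /* When block < ex_block the 32-bit subtraction wraps to a huge value. */
    return ex_start + (uint32_t)(block - ex_block);
}

static uint64_t goal_checked(uint32_t block, uint32_t ex_block, uint64_t ex_start)
{
    if (block > ex_block)
        return ex_start + (block - ex_block);

    /* Requested block precedes the extent: step backwards instead. */
    uint32_t back = ex_block - block;

    assert(ex_start >= back);   /* sketch-only guard against going below 0 */
    return ex_start - back;
}
```

Using the numbers from the reproducer (extent at logical 128, physical 67456, new write at logical block 0), the checked variant yields 67328, while the buggy variant yields a block number beyond 4 billion. The allocator is free to pick a different block than the goal, as the 66560 in the patched output shows.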
2011-01-10 | ext4: add more error checks to ext4_mkdir() | Namhyung Kim
Check return value of ext4_journal_get_write_access, ext4_journal_dirty_metadata and ext4_mark_inode_dirty. Move brelse() under 'out_stop' to release bh properly in case of journal error. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10 | ext4: ext4_ext_migrate should use NULL not 0 | Eric Paris
ext4_ext_migrate() calls ext4_new_inode() and passes 0 instead of a pointer to a struct qstr. This patch uses NULL, to make it obvious to the caller that this was a pointer. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10 | ext4: Use ext4_error_file() to print the pathname to the corrupted inode | Theodore Ts'o
Where the file pointer is available, use ext4_error_file() instead of ext4_error_inode(). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10 | ext4: use IS_ERR() to check for errors in ext4_error_file | Dan Carpenter
d_path() returns an ERR_PTR and it doesn't return NULL. This is in ext4_error_file() and no one actually calls ext4_error_file(). Signed-off-by: Dan Carpenter <error27@gmail.com>
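For readers unfamiliar with the error-pointer convention, here is a userspace sketch modeled on the kernel's ERR_PTR()/IS_ERR() helpers. The lowercase names are this sketch's own, and the code assumes the usual convention that the top 4095 address values are reserved for negative errno codes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *err_ptr(long error)
{
    assert(error < 0 && error >= -MAX_ERRNO);
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static long ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True only for pointers in the reserved error range at the top of memory. */
static int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO);
}
```

A NULL test is the wrong check for a function that reports failure this way: an error pointer is non-NULL, so `if (!path)` would let it through, while `is_err(path)` catches it.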
2011-01-10 | ext4: test the correct variable in ext4_init_pageio() | Dan Carpenter
This is a copy and paste error. The intent was to check "io_end_cachep". We tested "io_page_cachep" earlier. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10 | ext2,ext3,ext4: clarify comment for extN_xattr_set_handle | Wang Sheng-Hui
Signed-off-by: Wang Sheng-Hui <crosslonelyover@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10 | ext4: clean up ext4_xattr_list()'s error code checking and return strategy | Theodore Ts'o
Any time you see code that tries to add error codes together, you should want to claw your eyes out... Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-01-10 | ext4: remove warning message from ext4_issue_discard helper | Lukas Czerner
ext4_issue_discard is supposed to be a helper for calling discard; however, if the underlying device does not support discard, it prints a warning message and clears the DISCARD s_mount_opt flag. Since it can be (and is) used by others, it should not do anything and should let the caller handle the error case. This commit removes the warning message and flag setting from ext4_issue_discard and uses them only in the place where they are really needed (release_blocks_on_commit). The FITRIM ioctl should neither set any flags nor print warning messages, so get rid of the warning as well. Signed-off-by: Lukas Czerner <lczerner@redhat.com>
2011-01-10 | ext4: fix possible overflow in ext4_trim_fs() | Lukas Czerner
When determining the last group through ext4_get_group_no_and_offset(), the result may be wrong in cases when range->start and range->len are too big, because it may overflow when summing up those two numbers. Fix that by checking range->len and limiting its value to ext4_blocks_count(). This commit was tested by myself with the expected result. Signed-off-by: Lukas Czerner <lczerner@redhat.com>
2011-01-07 | ext2,3,4: provide simple rcu-walk ACL implementation | Nick Piggin
This simple implementation just checks for no ACLs on the inode, and if so, then the rcu-walk may proceed, otherwise fail it. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07 | fs: provide rcu-walk aware permission i_ops | Nick Piggin
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2011-01-07 | fs: icache RCU free inodes | Nick Piggin
RCU free the struct inode. This will allow:
- Subsequent store-free path walking patch. The inode must be consulted for permissions when walking, so an RCU inode reference is a must.
- sb_inode_list_lock to be moved inside i_lock because sb list walkers who want to take i_lock no longer need to take sb_inode_list_lock to walk the list in the first place. This will simplify and optimize locking.
- Could remove some nested trylock loops in dcache code
- Could potentially simplify things a bit in VM land. Do not need to take the page lock to follow page->mapping.
The downside of this is the performance cost of using RCU. In a simple creat/unlink microbenchmark, performance drops by about 10% due to inability to reuse cache-hot slab objects. As iterations increase and RCU freeing starts kicking over, this increases to about 20%. In cases where inode lifetimes are longer (i.e. many inodes may be allocated during the average life span of a single inode), a lot of this cache reuse is not applicable, so the regression caused by this patch is smaller. The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU, however this adds some complexity to list walking and store-free path walking, so I prefer to implement this at a later date, if it is shown to be a win in real situations. I haven't found a regression in any non-micro benchmark so I doubt it will be a problem. Signed-off-by: Nick Piggin <npiggin@kernel.dk>
2010-12-23 | ext4: fix on-line resizing regression | Theodore Ts'o
https://bugzilla.kernel.org/show_bug.cgi?id=25352 This regression was caused by commit a31437b85: "ext4: use sb_issue_zeroout in setup_new_group_blocks", by accidentally dropping the code which reserved the block group descriptor and inode table blocks. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-12-20 | ext4: Add error checking to kmem_cache_alloc() call in ext4_free_blocks() | Theodore Ts'o
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-12-19 | ext4: Use printf extension %pV | Joe Perches
Using %pV reduces the number of printk calls and eliminates any possible message interleaving from other printk calls. In function __ext4_grp_locked_error also added KERN_CONT to some printks. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-12-19 | ext4: Use vzalloc in ext4_fill_flex_info() | Joe Perches
Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-12-19 | ext4: zero out nanosecond timestamps for small inodes | Eric Sandeen
When nanosecond timestamp resolution isn't supported on an ext4 partition (inode size = 128), stat() appears to be returning uninitialized garbage in the nanosecond component of timestamps. EXT4_INODE_GET_XTIME should zero out tv_nsec when EXT4_FITS_IN_INODE evaluates to false. Reported-by: Jordan Russell <jr-list-2010@quo.to> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
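The intended behaviour can be sketched as a plain function. This is a simplification under assumptions: `get_xtime()` is a hypothetical stand-in for the EXT4_INODE_GET_XTIME macro, the 30-bit nanosecond field is assumed to sit above 2 epoch bits in the extra word, and the epoch bits themselves are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define EPOCH_BITS 2    /* assumed: low 2 bits of the extra word extend the epoch */

/*
 * seconds:       32-bit seconds field stored in the base inode
 * extra:         32-bit extra field, only present in larger inodes
 * fits_in_inode: nonzero if the extra field exists on disk
 */
static void get_xtime(struct timespec *ts, uint32_t seconds, uint32_t extra,
                      int fits_in_inode)
{
    assert(ts != NULL);

    ts->tv_sec = (time_t)(int32_t)seconds;
    if (fits_in_inode)
        ts->tv_nsec = (long)(extra >> EPOCH_BITS);
    else
        ts->tv_nsec = 0;    /* no extra field: never leave tv_nsec unset */
}
```

The else branch is the point of the fix: with 128-byte inodes there is no extra field to decode, so the nanosecond component must be set to zero explicitly instead of being left as whatever was in memory.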
2010-12-19 | ext4: optimize ext4_check_dir_entry() with unlikely() annotations | Theodore Ts'o
This function gets called a lot for large directories, and the answer is almost always "no, no, there's no problem". This means using unlikely() is a good thing. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
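A userspace sketch of the annotation pattern follows, assuming GCC or Clang for `__builtin_expect`. The struct and the individual checks are loosely modeled on a directory-entry sanity check and are hypothetical. They are not the actual ext4_check_dir_entry() logic.

```c
#include <assert.h>
#include <stddef.h>

/* Hint to the compiler that the condition is almost always false. */
#define unlikely(x) __builtin_expect(!!(x), 0)

struct dirent_hdr {             /* hypothetical, simplified directory entry */
    unsigned short rec_len;     /* total record length in bytes */
    unsigned char  name_len;    /* length of the name that follows */
};

/* Returns NULL if the entry looks sane, otherwise a short error string. */
static const char *check_dir_entry(const struct dirent_hdr *de, size_t space_left)
{
    assert(de != NULL);

    if (unlikely(de->rec_len < 12))
        return "rec_len is smaller than minimal";
    if (unlikely(de->rec_len % 4 != 0))
        return "rec_len % 4 != 0";
    if (unlikely(de->rec_len < ((de->name_len + 8 + 3) & ~3)))
        return "rec_len is too small for name_len";
    if (unlikely(de->rec_len > space_left))
        return "directory entry across blocks";
    return NULL;
}
```

The annotations do not change the result. They only let the compiler lay out the code so that the common "no problem" path falls straight through.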
2010-12-19 | ext4: use kmem_cache_zalloc() in ext4_init_io_end() | Jesper Juhl
Take advantage of kmem_cache_zalloc() to remove a memset() call in ext4_init_io_end() and save a few bytes.
Before:
  [jj@dragon linux-2.6]$ size fs/ext4/page-io.o
     text    data     bss     dec     hex filename
     3016       0     624    3640     e38 fs/ext4/page-io.o
After:
  [jj@dragon linux-2.6]$ size fs/ext4/page-io.o
     text    data     bss     dec     hex filename
     3000       0     624    3624     e28 fs/ext4/page-io.o
Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-12-19 | ext4: Remove redundant unlikely() | Tobias Klauser
IS_ERR() already implies unlikely(), so it can be omitted here. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-12-17 | ext4: Use pr_warning_ratelimited() instead of printk_ratelimit() | Theodore Ts'o
printk_ratelimit() is deprecated since it is a global instead of a per-printk ratelimit. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-12-16 | ext4: Fix up comments in inode.c | Theodore Ts'o
This fixes up some broken argument descriptions that Namhyung Kim had originally submitted for ext3. This fixes the comments that were still applicable in ext4. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-12-15 | ext4: Add second mount options field since the s_mount_opt is full up | Theodore Ts'o
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-12-15 | ext4: Move struct ext4_mount_options from ext4.h to super.c | Theodore Ts'o
Move the ext4_mount_options structure definition from ext4.h, since it is only used in super.c. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-12-15 | ext4: Simplify the usage of clear_opt() and set_opt() macros | Theodore Ts'o
Change clear_opt() and set_opt() to take a superblock pointer instead of a pointer to EXT4_SB(sb)->s_mount_opt. This makes it easier for us to support a second mount option field. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
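The benefit of passing the superblock rather than a pointer to one field can be sketched with token-pasting macros. Everything here is illustrative: `struct sb_info`, the `MOUNT_*` bit values, the `EXAMPLE` option, and the `*_opt2` variants are assumptions, and the real macros go through EXT4_SB(sb) rather than touching the struct directly.

```c
#include <assert.h>
#include <stddef.h>

#define MOUNT_DISCARD   0x0001u     /* hypothetical bit values */
#define MOUNT_DELALLOC  0x0002u
#define MOUNT2_EXAMPLE  0x0001u     /* hypothetical option in the second field */

struct sb_info {                    /* hypothetical stand-in for the sb info */
    unsigned int s_mount_opt;
    unsigned int s_mount_opt2;
};

/* The macro, not the caller, decides which field an option lives in. */
#define set_opt(sb, opt)    ((sb)->s_mount_opt |= MOUNT_##opt)
#define clear_opt(sb, opt)  ((sb)->s_mount_opt &= ~MOUNT_##opt)
#define test_opt(sb, opt)   ((sb)->s_mount_opt & MOUNT_##opt)

#define set_opt2(sb, opt)   ((sb)->s_mount_opt2 |= MOUNT2_##opt)
#define clear_opt2(sb, opt) ((sb)->s_mount_opt2 &= ~MOUNT2_##opt)
#define test_opt2(sb, opt)  ((sb)->s_mount_opt2 & MOUNT2_##opt)

/* Example caller: it never names the underlying field. */
static void enable_discard(struct sb_info *sb)
{
    assert(sb != NULL);
    set_opt(sb, DISCARD);
}
```

Because callers only pass the superblock and the option name, adding a second flags word needs new macros but no changes to how the existing call sites are written.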
2010-12-14 | ext4: fix typo which broke '..' detection in ext4_find_entry() | Aaro Koskinen
There should be a check for the NUL character instead of '0'. Fortunately the only thing that cares about this is NFS serving, which is why we didn't notice this in the merge window testing. Reported-by: Phil Carmody <ext-phil.2.carmody@nokia.com> Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
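The difference between the character '0' and the NUL terminator '\0' is easy to miss in review. The sketch below uses an assumed form of the check, and the exact condition in ext4_find_entry() may differ. Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Intended check: name is "." or "..", assuming name is NUL-terminated. */
static int is_dot_or_dotdot(const char *name, size_t namelen)
{
    assert(name != NULL);
    return namelen <= 2 && name[0] == '.' &&
           (name[1] == '.' || name[1] == '\0');
}

/* Same check with the typo: compares against the digit '0' (0x30). */
static int is_dot_or_dotdot_typo(const char *name, size_t namelen)
{
    assert(name != NULL);
    return namelen <= 2 && name[0] == '.' &&
           (name[1] == '.' || name[1] == '0');
}
```

In this sketch the typo makes the function accept the ordinary name ".0" and reject ".", since the second byte is compared with the wrong constant.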
2010-12-14 | ext4: Turn off multiple page-io submission by default | Theodore Ts'o
Jon Nelson has found a test case which causes postgresql to fail with the error:
  psql:t.sql:4: ERROR: invalid page header in block 38269 of relation base/16384/16581
Under memory pressure, it looks like part of a file can end up getting replaced by zeros. Until we can figure out the cause, we'll roll back the change and use block_write_full_page() instead of ext4_bio_write_page(). The new, more efficient writing function can be used via the mount option mblk_io_submit, so we can test and fix the new page I/O code. To reproduce the problem, install postgres 8.4 or 9.0, and pin enough memory such that the system is just at the point of triggering writeback before running the following sql script:
  begin;
  create temporary table foo as select x as a, ARRAY[x] as b FROM generate_series(1, 10000000 ) AS x;
  create index foo_a_idx on foo (a);
  create index foo_b_idx on foo USING GIN (b);
  rollback;
If the temporary table is created on a hard drive partition which is encrypted using dm_crypt, then under memory pressure, approximately 30-40% of the time, pgsql will issue the above failure. This patch should fix this problem, and the problem will come back if the file system is mounted with the mblk_io_submit mount option. Reported-by: Jon Nelson <jnelson@jamponi.net> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-11-19 | ext4: Add EXT4_IOC_TRIM ioctl to handle batched discard | Lukas Czerner
A filesystem-independent ioctl was rejected as not common enough to be in the core vfs ioctl. Since we still need access to this functionality, this commit adds the ext4-specific ioctl EXT4_IOC_TRIM to dispatch ext4_trim_fs(). It takes an fstrim_range structure as an argument. fstrim_range is defined in include/linux/fs.h and its definition is as follows.
  struct fstrim_range {
      __u64 start;
      __u64 len;
      __u64 minlen;
  }
  start  - first Byte to trim
  len    - number of Bytes to trim from start
  minlen - minimum extent length to trim, free extents shorter than this number of Bytes will be ignored. This will be rounded up to fs block size.
After the FITRIM is done, the number of actually discarded Bytes is stored in fstrim_range.len to give the user better insight on how much storage space has been really released for wear-leveling. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
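From userspace, a batched discard request might look like the sketch below. It assumes a Linux system whose `<linux/fs.h>` provides `struct fstrim_range` and the `FITRIM` request code, and that the filesystem accepts the request under that name. The `trim_range()` wrapper is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/fs.h>       /* struct fstrim_range, FITRIM */

/*
 * Ask the filesystem behind fd to discard free extents of at least minlen
 * bytes within [start, start + len). On success returns 0 and stores the
 * number of bytes actually discarded in *trimmed. On failure returns -1
 * with errno set by ioctl() and leaves *trimmed untouched.
 */
static int trim_range(int fd, uint64_t start, uint64_t len, uint64_t minlen,
                      uint64_t *trimmed)
{
    struct fstrim_range range = {
        .start = start,
        .len = len,
        .minlen = minlen,
    };

    assert(trimmed != NULL);

    if (ioctl(fd, FITRIM, &range) < 0)
        return -1;

    *trimmed = range.len;   /* the kernel writes the discarded byte count back */
    return 0;
}
```

A caller would typically open the mount point directory, pass `start = 0` and `len = UINT64_MAX` to cover the whole filesystem, and report `*trimmed` to the user. The call usually requires sufficient privileges and a device that supports discard.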
2010-11-19 | fs: Do not dispatch FITRIM through separate super_operation | Lukas Czerner
There was concern that FITRIM ioctl is not common enough to be included in core vfs ioctl, as Christoph Hellwig pointed out there's no real point in dispatching this out to a separate vector instead of just through ->ioctl. So this commit removes ioctl_fstrim() from vfs ioctl and trim_fs from super_operation structure. Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-11-19 | ext4: ext4_fill_super shouldn't return 0 on corruption | Darrick J. Wong
At the start of ext4_fill_super, ret is set to -EINVAL, and any failure path out of that function returns ret. However, the generic_check_addressable clause sets ret = 0 (if it passes), which means that a subsequent failure (e.g. a group checksum error) returns 0 even though the mount should fail. This causes vfs_kern_mount in turn to think that the mount succeeded, leading to an oops. A simple fix is to avoid using ret for the generic_check_addressable check, which was last changed in commit 30ca22c70e3ef0a96ff84de69cd7e8561b416cb2. Signed-off-by: Darrick J. Wong <djwong@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
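The control-flow bug is a general C pitfall and can be reproduced in a few lines. The functions below are hypothetical miniatures of the pattern, not the real ext4_fill_super(). `check_addressable()` and its -EFBIG return value are assumptions standing in for generic_check_addressable().

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for an intermediate check that returns 0 on success. */
static int check_addressable(int ok)
{
    return ok ? 0 : -EFBIG;
}

static int fill_super_buggy(int addressable_ok, int checksums_ok)
{
    int ret = -EINVAL;              /* default error for all failure paths */

    ret = check_addressable(addressable_ok);    /* overwrites the default */
    if (ret)
        goto failed;
    if (!checksums_ok)
        goto failed;                /* ret is 0 here: failure looks like success */
    return 0;
failed:
    return ret;
}

static int fill_super_fixed(int addressable_ok, int checksums_ok)
{
    int ret = -EINVAL;

    if (check_addressable(addressable_ok))      /* ret is left alone */
        goto failed;
    if (!checksums_ok)
        goto failed;
    return 0;
failed:
    assert(ret != 0);               /* a failure path must never report success */
    return ret;
}
```

With a passing addressability check and a failing checksum, the buggy variant returns 0 while the fixed one returns -EINVAL, which is the difference between a caller believing the mount succeeded and a cleanly rejected mount.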
2010-11-17 | ext4: missing unlock in ext4_clear_request_list() | Dan Carpenter
If the li_request_list was empty then it returned with the lock held. Instead of adding a "goto unlock" I just removed that special case and let it go past the empty list_for_each_safe(). Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-11-17 | ext4: fix setting random pages PageUptodate | Markus Trippelsdorf
ext4_end_bio calls put_page and kmem_cache_free before calling SetPageUptodate(). This can result in setting the PageUptodate bit on random pages and causes the following BUG: BUG: Bad page state in process rm pfn:52e54 page:ffffea0001222260 count:0 mapcount:0 mapping: (null) index:0x0 arch kernel: page flags: 0x4000000000000008(uptodate) Fix the problem by moving put_io_page() after the SetPageUptodate() call. Thanks to Hugh Dickins for analyzing this problem. Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-11-08 | Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 | Linus Torvalds
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: Add new ext4 inode tracepoints
  ext4: Don't call sb_issue_discard() in ext4_free_blocks()
  ext4: do not try to grab the s_umount semaphore in ext4_quota_off
  ext4: fix potential race when freeing ext4_io_page structures
  ext4: handle writeback of inodes which are being freed
  ext4: initialize the percpu counters before replaying the journal
  ext4: "ret" may be used uninitialized in ext4_lazyinit_thread()
  ext4: fix lazyinit hang after removing request