summaryrefslogtreecommitdiff
path: root/fs/btrfs
AgeCommit message (Collapse)Author
2008-09-25Btrfs: Fix split_node to require more empty slots in the node as wellChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Make sure nodes have enough room for a double splitChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Fix the unplug_io_fn to grab a consistent copy of page->mappingChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Fix btrfs_get_extent and get_block corner cases, and disable O_DIRECT readsChris Mason
The generic O_DIRECT code assumes all the bios have the same bdev, which isn't true for multi-device btrfs. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Set nodatasum on the inode when written by a nodatasum mountChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Deal with page == NULL in the btrfs_unplug_io_fnChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Add a special device list for chunk allocationsChris Mason
This allows other code that needs to walk every device in the FS to do so without locking against allocations. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Simplify device selection for mirrored readsChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Make an unplug function that doesn't unplug every spindleChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Remove debugging statements from the invalidatepage callsChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Add 1MB to the min_free in alloc_chunkChris Mason
This properly reflects the first 1MB we skip at the start of the device Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Scale the bdi ra_pages by the number of devices in the FSChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Force page->private removal in btrfs_invalidatepageChris Mason
btrfs_invalidatepage is not allowed to leave pages around on the lru. Any such pages will trigger an oops later on because the VM will see page->private and assume it is a buffer head. This also forces extra flushes of the async work queues before dropping all the pages on the btree inode during unmount. Left over items on the work queues are one possible cause of busy state ranges during truncate_inode_pages. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Set the btree inode i_size to OFFSET_MAXChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Fix chunk allocation when some devices don't have enough room for stripesChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Calculate appropriate chunk sizes for both small and large filesystemsChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Don't drop extent_map cache during releasepage on the btree inodeChris Mason
The btree inode should only have a single extent_map in the cache, it doesn't make sense to ever drop it. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Add support for labels in the super blockChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Check device uuids along with devidsChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Remove bogus max_sector warnings from the extent_io codeChris Mason
It was testing the bio before doing logical->physical mapping, so the test was always wrong. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Avoid 64 bit div for RAID10Chris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Use the extent map cache to find the logical disk block during data ↵Chris Mason
retries The data read retry code needs to find the logical disk block before it can resubmit new bios. But, finding this block isn't allowed to take the fs_mutex because that will deadlock with a number of different callers. This changes the retry code to use the extent map cache instead, but that requires the extent map cache to have the extent we're looking for. This is a problem because btrfs_drop_extent_cache just drops the entire extent instead of the little tiny part it is invalidating. The bulk of the code in this patch changes btrfs_drop_extent_cache to invalidate only a portion of the extent cache, and changes btrfs_get_extent to deal with the results. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Only do async bio submission for pdflushChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Don't wait on tree block writeback before freeing them anymoreChris Mason
This isn't required anymore because we don't reallocate blocks that have already been written in this transaction. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Write bio checksumming outside the FS mutexChris Mason
This significantly improves streaming write performance by allowing concurrency in the data checksumming. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Create a work queue for bio writesChris Mason
This allows checksumming to happen in parallel among many cpus, and keeps us from bogging down pdflush with the checksumming code. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Add RAID10 supportChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Add chunk uuids and update multi-device back referencesChris Mason
Block headers now store the chunk tree uuid Chunk items records the device uuid for each stripes Device extent items record better back refs to the chunk tree Block groups record better back refs to the chunk tree The chunk tree format has also changed. The objectid of BTRFS_CHUNK_ITEM_KEY used to be the logical offset of the chunk. Now it is a chunk tree id, with the logical offset being stored in the offset field of the key. This allows a single chunk tree to record multiple logical address spaces, upping the number of bytes indexed by a chunk tree from 2^64 to 2^128. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: A few updates for 2.6.18 and versions older than 2.6.25Chris Mason
This includes fixing a missing spinlock init call that caused oops on mount for most kernels other than 2.6.25. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Add a min size parameter to btrfs_alloc_extentChris Mason
On huge machines, delayed allocation may try to allocate massive extents. This change allows btrfs_alloc_extent to return something smaller than the caller asked for, and the data allocation routines will loop over the allocations until it fills the whole delayed alloc. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: bio_endio support for linux 2.6.23 and older.Miguel
bio_endio() changed prototype on linux 2.6.24, support older kernels using the older prototype. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: define write_cache_pages for linux kernel <= 2.6.20 insteadMiguel
write_cache_pages doesn't exist in linux 2.6.20, change the #if condition to match that. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Endianess bug fix for v0.13 with kernelsMiguel
Fix for a endianess BUG when using btrfs v0.13 with kernels older than 2.6.23 Problem: Has of v0.13, btrfs-progs is using crc32c.c equivalent to the one found on linux-2.6.23/lib/libcrc32c.c Since crc32c_le() changed in linux-2.6.23, when running btrfs v0.13 with older kernels we have a missmatch between the versions of crc32c_le() from btrfs-progs and libcrc32c in the kernel. This missmatch causes a bug when using btrfs on big endian machines. Solution: btrfs_crc32c() macro that when compiling for kernels older than 2.6.23, does endianess conversion to parameters and return value of crc32c(). This endianess conversion nullifies the differences in implementation of crc32c_le(). If kernel 2.6.23 or better, it calls crc32c(). Signed-off-by: Miguel Sousa Filipe <miguel.filipe@gmail.com> --- Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Fixup a few u64<->pointer casts for 32 bitChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Add extra checks to avoid removing extent_state from pages we can't freeChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Write out all super blocks on commit, and bring back proper barrier ↵Chris Mason
support Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Add O_DIRECT read and write (writes == buffered + cache flush)Chris Mason
This adds basic O_DIRECT read and write support. In the write case, we just do a normal buffered write followed by a cache flush. O_DIRECT + O_SYNC are required to trigger metadata syncs. In the read case, there is a basic btrfs_get_block call for use by the generic O_DIRECT code. This does honor multi-volume mapping rules but it skips all checksumming. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Disable extra debugging checks on tree blocksChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Handle checksumming errors while reading data blocksChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Retry metadata reads in the face of checksum failuresChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Handle data block end_io through the async work queueChris Mason
Before it was done by the bio end_io routine, the work queue code is able to scale much better with faster IO subsystems. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Do metadata checksums for reads via a workqueueChris Mason
Before, metadata checksumming was done by the callers of read_tree_block, which would set EXTENT_CSUM bits in the extent tree to show that a given range of pages was already checksummed and didn't need to be verified again. But, those bits could go away via try_to_releasepage, and the end result was bogus checksum failures on pages that never left the cache. The new code validates checksums when the page is read. It is a little tricky because metadata blocks can span pages and a single read may end up going via multiple bios. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Add additional debugging for metadata checksum failuresChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Change btrfs_map_block to return a structure with mappings for all stripesChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Fix allocation profile initChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Don't allow written blocks from this transaction to be reallocatedChris Mason
When a block is freed, it can be immediately reused if it is from the current transaction. But, an extra check is required to make sure the block had not been written yet. If it were reused after being written, the transid in the block header might match the transid of the next time the block was allocated. The parent node records the transaction ID of the block it is pointing to, and this is used as part of validating the block on reads. So, there can only be one version of a block per transaction. Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Add support for duplicate blocks on a single spindleChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Add support for mirroring across drivesChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Properly dirty buffers in the split corner casesChris Mason
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2008-09-25Btrfs: Verify checksums on tree blocks found without read_tree_blockChris Mason
Checksums were only verified by btrfs_read_tree_block, which meant the functions to probe the page cache for blocks were not validating checksums. Normally this is fine because the buffers will only be in cache if they have already been validated. But, there is a window while the buffer is being read from disk where it could be up to date in the cache but not yet verified. This patch makes sure all buffers go through checksum verification before they are used. This is safer, and it prevents modification of buffers before they go through the csum code. Signed-off-by: Chris Mason <chris.mason@oracle.com>