summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2013-04-20bcache: Correctly check against BIO_MAX_PAGESKent Overstreet
bch_bio_max_sectors() was checking against BIO_MAX_PAGES as if the limit was for the total bytes in the bio, not the number of segments. Signed-off-by: Kent Overstreet <koverstreet@google.com>
2013-04-20bcache: Hack around stuff that clones up to bi_max_vecsKent Overstreet
Signed-off-by: Kent Overstreet <koverstreet@google.com>
2013-04-20bcache: Set ra_pages based on backing device's ra_pagesKent Overstreet
Signed-off-by: Kent Overstreet <koverstreet@google.com>
2013-04-20bcache: Take data offset from the bdev superblock.Kent Overstreet
Add a new superblock version, and consolidate related defines. Signed-off-by: Gabriel de Perthuis <g2p.code+bcache@gmail.com> Signed-off-by: Kent Overstreet <koverstreet@google.com>
2013-04-08bcache: Disable broken btree fuzz testerKent Overstreet
Reported-by: <sasha.levin@oracle.com> Signed-off-by: Kent Overstreet <koverstreet@google.com>
2013-04-08bcache: Fix a format string overflowKent Overstreet
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Kent Overstreet <koverstreet@google.com>
2013-04-08bcache: Fix a minor memory leak on device teardownKent Overstreet
Reported-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Kent Overstreet <koverstreet@google.com>
2013-04-08bcache: Use WARN_ONCE() instead of __WARN()Kent Overstreet
Signed-off-by: Kent Overstreet <koverstreet@google.com>
2013-04-08bcache: Add missing #include <linux/prefetch.h>Geert Uytterhoeven
m68k/allmodconfig: drivers/md/bcache/bset.c: In function ‘bset_search_tree’: drivers/md/bcache/bset.c:727: error: implicit declaration of function ‘prefetch’ drivers/md/bcache/btree.c: In function ‘bch_btree_node_get’: drivers/md/bcache/btree.c:933: error: implicit declaration of function ‘prefetch’ Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Kent Overstreet <koverstreet@google.com>
2013-04-08bcache: Sparse fixesKent Overstreet
Signed-off-by: Kent Overstreet <koverstreet@google.com>
2013-03-28bcache: Don't export utility code, prefix with bch_Kent Overstreet
Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: linux-bcache@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28drbd: fix if(); found by kbuild test robotLars Ellenberg
Recently introduced al_begin_io_nonblock() was returning -EBUSY, even when it should return -EWOULDBLOCK. Impact: A few spurious wake_up() calls in prepare_al_transaction_nonblock(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28drbd: use sched_setscheduler()Philipp Reisner
It was unnoticed for some time that assigning to current->policy is no longer sufficient to set a real time priority for a kernel thread. Reported-by: Charlie Suffin <Charlie.Suffin@stratus.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28drbd: fix for deadlock when using automatic split-brain-recoveryPhilipp Reisner
With an automatic after split-brain recovery policy of "after-sb-1pri call-pri-lost-after-sb", when trying to drbd_set_role() to R_SECONDARY, we run into a deadlock. This was first recognized and supposedly fixed by 2009-06-10 "Fixed a deadlock when using automatic split brain recovery when both nodes are" replacing drbd_set_role() with drbd_change_state() in that code-path, but the first hunk of that patch forgets to remove the drbd_set_role(). We apparently only ever tested the "two primaries" case. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28drbd: add module_put() on error path in drbd_proc_open()Alexey Khoroshilov
If single_open() fails in drbd_proc_open(), module refcount is left incremented. The patch adds module_put() on the error path. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28drbd: fix drbd epoch write count for ahead/behind modeLars Ellenberg
The sanity check when receiving P_BARRIER_ACK does expect all write requests with a given req->epoch to have been either all replicated, or all not replicated. Because req->epoch was assigned before calling maybe_pull_ahead(), this expectation was not met, leading to an off-by-one in the sanity check, and further to a "Protocol Error". Fix: move the call to maybe_pull_ahead() a few lines up, and assign req->epoch only after that. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28drbd: Fix build error when CONFIG_CRYPTO_HMAC is not setPhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28drbd: validate resync_after dependency on attach alreadyLars Ellenberg
We validated resync_after dependencies, if changed via disk-options. But we did not validate them when first created via attach. We also did not check or cleanup dependencies that used to be correct, but now point to meanwhile removed minor devices. If the drbd_resync_after_valid() validation in disk-options tried to follow a dependency chain in this way, this could lead to NULL pointer dereference. Validate resync_after settings in drbd_adm_attach() already, as well as in drbd_adm_disk_opts(), and and only reject dependency loops. Depending on non-existing disks is allowed and equivalent to no dependency. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28drbd: fix memory leakLars Ellenberg
We forgot to free the disk_conf, so for each attach/detach cycle we leaked 336 bytes. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28drbd: only fail empty flushes if no good data is reachableLars Ellenberg
We completed empty flushes (blkdev_issue_flush()) with IO error if we lost the local disk, even if we still have an established replication link to a healthy remote disk. Fix this to only report errors to upper layers, if neither local nor remote data is reachable. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28drbd: Fix disconnect to keep the peer disk state if connection breaks during ↵Philipp Reisner
operation The issue was that if the connection broke while we did the gracefull state change to C_DISCONNECTING (C_TEARDOWN), then we returned a success code from the state engine. (SS_CW_NO_NEED) The result of that is that we missed to call the fence-peer script in such a case. Fixed that by introducing a new error code (SS_OUTDATE_WO_CONN). This one should never reach back into user space. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28drbd: fix spurious warning about bitmap being locked from detachPhilipp Reisner
Introduced in drbd: always write bitmap on detach, the bitmap bulk writeout on detach was indicating it expected exclusive bitmap access. Where I meant to say: expect no more modifications, but testing/counting is still allowed. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28drbd: drop now useless duplicate state request from invalidatePhilipp Reisner
Patch best viewed with git diff --ignore-space-change. Now that we attempt the fallback to local bitmap operation only when disconnected, we can safely drop the extra "silent" state request from both invalidate and invalidate-remote. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28drbd: fix effective error returned when refusing an invalidatePhilipp Reisner
Since commit drbd: Disallow the peer_disk_state to be D_OUTDATED while connected trying to invalidate a disconnected Primary returned an error code that did not really match the situation: "Refusing to be Outdated while Connected" Insert two more specific conditions into is_valid_state(), changing that to "Need access to UpToDate data", respectively "Need a connection to start verify or resync". Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28drbd: move invalidating the whole bitmap out of after_state ch()Philipp Reisner
To avoid other state change requests, after passing through sanitize_state(), to be mistaken for an invalidate, move the "set all bits as out-of-sync" into the invalidate path. Make invalidate and invalidate-remote behave consistently wrt. current connection state (need either an established replication link, or really be disconnected). Also mention that in the documentation. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28drbd: abort start of resync early, if it raced with connection breakagePhilipp Reisner
We've seen a spurious full resync, because a connection breakage raced with drbd_start_resync(, C_SYNC_TARGET), and the resulting state change request intended to start the resync ended up looking like a local invalidate. Fix: Double check the state inside the lock, and don't even request that state change, if we had connection or IO problems. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-28drbd: reset ap_in_flight counter for new connectionsPhilipp Reisner
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-25bcache: Fix for the build fixesKent Overstreet
Commit 82a84eaf7e51ba3da0c36cbc401034a4e943492d left a return 0 in closure_debug_init(). Whoops. Signed-off-by: Kent Overstreet <koverstreet@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-25aoe: get rid of cached bv variable in bufinit()Jens Axboe
Less error prone if we just kill it, it's only used once anyway. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-25bcache: Style/checkpatch fixesKent Overstreet
Took out some nested functions, and fixed some more checkpatch complaints. Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: linux-bcache@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-25bcache: Build fixes from test robotKent Overstreet
config: make ARCH=i386 allmodconfig All error/warnings: drivers/md/bcache/bset.c: In function 'bch_ptr_bad': >> drivers/md/bcache/bset.c:164:2: warning: format '%li' expects argument of type 'long int', but argument 4 has type 'size_t' [-Wformat] -- drivers/md/bcache/debug.c: In function 'bch_pbtree': >> drivers/md/bcache/debug.c:86:4: warning: format '%li' expects argument of type 'long int', but argument 4 has type 'size_t' [-Wformat] -- drivers/md/bcache/btree.c: In function 'bch_btree_read_done': >> drivers/md/bcache/btree.c:245:8: warning: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'size_t' [-Wformat] -- drivers/md/bcache/closure.o: In function `closure_debug_init': >> (.init.text+0x0): multiple definition of `init_module' >> drivers/md/bcache/super.o:super.c:(.init.text+0x0): first defined here Signed-off-by: Kent Overstreet <koverstreet@google.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: linux-bcache@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-24Merge branch 'bcache-for-upstream' of ↵Jens Axboe
http://evilpiepirate.org/git/linux-bcache into for-3.10/drivers
2013-03-23bcache: A block layer cacheKent Overstreet
Does writethrough and writeback caching, handles unclean shutdown, and has a bunch of other nifty features motivated by real world usage. See the wiki at http://bcache.evilpiepirate.org for more. Signed-off-by: Kent Overstreet <koverstreet@google.com>
2013-03-23Export get_random_int()Kent Overstreet
Needed for bcache - need a cheap source of random numbers for perturbing IO sizes, for rate limiting IO to the SSD. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: "Theodore Ts'o" <tytso@mit.edu>
2013-03-22drbd: adjust upper limit for activity log extentsLars Ellenberg
Now that the on-disk activity-log ring buffer size is adjustable, the maximum active set can become larger, and is now limited by the use of 16bit "labels". This increases the maximum working set from 6433 to 65534 extents, each of which covers an area of 4MiB. Which means that if you use the maximum, you'd have to resync more than 250 GiB after an unclean Primary shutdown. With capable backend storage and replication links, this is entirely feasible. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22drbd: try hard to max out the updates per AL transactionLars Ellenberg
There may have been more incoming requests while we where preparing the current transaction. Try to consolidate more updates into this transaction until we make no more progres. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22drbd: move start io accounting before activity log transactionLars Ellenberg
The IO accounting of the drbd "queue depth" was misleading. We only started IO accounting once we already wrote the activity log. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22drbd: consolidate as many updates as possible into one AL transactionLars Ellenberg
Depending on current IO depth, try to consolidate as many updates as possible into one activity log transaction. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22drbd: queue writes on submitter thread, unless they pass the activity log ↵Lars Ellenberg
fastpath Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22drbd: split out some helper functions to drbd_al_begin_ioLars Ellenberg
To make the code easier to follow, use an explicit find_active_resync_extent(), and add a "nonblock" parameter to _al_get(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22drbd: split drbd_al_begin_io into fastpath, prepare, and commitLars Ellenberg
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22drbd: prepare to queue write requests on a submit workerLars Ellenberg
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22drbd: split __drbd_make_request in before and after drbd_al_begin_ioLars Ellenberg
This is in preparation to be able to defer requests that need to wait for an activity log transaction to a submitter workqueue. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22drbd: drbd_al_being_io: short circuit to reduce latencyLars Ellenberg
A request hitting an already "hot" extent should proceed right away, even if some other requests need to wait for pending transactions. Without that short-circuit, several simultaneous make_request contexts race for committing the transaction, possibly penalizing the innocent. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22drbd: Clarify when activity log I/O is delegated to the worker threadLars Ellenberg
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22drbd: read meta data early, base on-disk offsets on super blockLars Ellenberg
We used to calculate all on-disk meta data offsets, and then compare the stored offsets, basically treating them as magic numbers. Now with the activity log striping, the activity log size is no longer fixed. We need to first read the super block, then base the activity log and bitmap offsets on the stored offsets/al stripe settings. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22drbd: mechanically rename la_size to la_size_sectLars Ellenberg
Make it obvious that this value is in units of 512 Byte sectors. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22drbd: use the cached meta_dev_idxLars Ellenberg
Now we have the cached meta_dev_idx member, we can get rid of a few rcu_read_lock() sections and rcu_dereference(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22drbd: prepare for new striped layout of activity logLars Ellenberg
Introduce two new on-disk meta data fields: al_stripes and al_stripe_size_4k The intended use case is activity log on RAID 0 or similar. Logically consecutive transactions will advance their on-disk position by al_stripe_size_4k 4kB (transaction sized) blocks. Right now, these are still asserted to be the backward compatible values al_stripes = 1, al_stripe_size_4k = 8 (which amounts to 32kB). Also introduce a caching member for meta_dev_idx in the in-core structure: even though it is initially passed in in the rcu-protected disk_conf structure, it cannot change without a detach/attach cycle. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-03-22drbd: cleanup ondisk meta data layout calculations and definesLars Ellenberg
Add a comment about our meta data layout variants, and rename a few defines (e.g. MD_RESERVED_SECT -> MD_128MB_SECT) to make it clear that they are short hand for fixed constants, and not arbitrarily to be redefined as one may see fit. Properly pad struct meta_data_on_disk to 4kB, and initialize to zero not only the first 512 Byte, but all of it in drbd_md_sync(). Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>