summaryrefslogtreecommitdiffstats
path: root/drivers/block/drbd/drbd_int.h
diff options
context:
space:
mode:
authorLars Ellenberg <lars.ellenberg@linbit.com>2018-12-20 17:23:42 +0100
committerJens Axboe <axboe@kernel.dk>2018-12-20 17:51:31 +0100
commitf31e583aa2c20892aca3add26957dee6ab80a534 (patch)
treecd57652fc773adee2e259c783a6abe30da88acfd /drivers/block/drbd/drbd_int.h
parentdrbd: skip spurious timeout (ping-timeo) when failing promote (diff)
downloadlinux-f31e583aa2c20892aca3add26957dee6ab80a534.tar.xz
linux-f31e583aa2c20892aca3add26957dee6ab80a534.zip
drbd: introduce P_ZEROES (REQ_OP_WRITE_ZEROES on the "wire")
And also re-enable partial-zero-out + discard aligned. With the introduction of REQ_OP_WRITE_ZEROES, we started to use that for both WRITE_ZEROES and DISCARDS, hoping that WRITE_ZEROES would "do what we want", UNMAP if possible, zero-out the rest. The example scenario is some LVM "thin" backend. While an un-allocated block on dm-thin reads as zeroes, on a dm-thin with "skip_block_zeroing=true", after a partial block write allocated that block, that same block may well map "undefined old garbage" from the backends on LBAs that have not yet been written to. If we cannot distinguish between zero-out and discard on the receiving side, to avoid "undefined old garbage" to pop up randomly at later times on supposedly zero-initialized blocks, we'd need to map all discards to zero-out on the receiving side. But that would potentially do a full alloc on thinly provisioned backends, even when the expectation was to unmap/trim/discard/de-allocate. We need to distinguish on the protocol level, whether we need to guarantee zeroes (and thus use zero-out, potentially doing the mentioned full-alloc), or if we want to put the emphasis on discard, and only do a "best effort zeroing" (by "discarding" blocks aligned to discard-granularity, and zeroing only potential unaligned head and tail clippings to at least *try* to avoid "false positives" in an online-verify later), hoping that someone set skip_block_zeroing=false. For some discussion regarding this on dm-devel, see also https://www.mail-archive.com/dm-devel%40redhat.com/msg07965.html https://www.redhat.com/archives/dm-devel/2018-January/msg00271.html For backward compatibility, P_TRIM means zero-out, unless the DRBD_FF_WZEROES feature flag is agreed upon during handshake. To have upper layers even try to submit WRITE ZEROES requests, we need to announce "efficient zeroout" independently. We need to fixup max_write_zeroes_sectors after blk_queue_stack_limits(): if we can handle "zeroes" efficiently on the protocol, we want to do that, even if our backend does not announce max_write_zeroes_sectors itself. Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Diffstat (limited to 'drivers/block/drbd/drbd_int.h')
-rw-r--r--drivers/block/drbd/drbd_int.h11
1 files changed, 9 insertions, 2 deletions
diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index ab718582a092..000a2f4c0e92 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -430,7 +430,11 @@ enum {
__EE_MAY_SET_IN_SYNC,
/* is this a TRIM aka REQ_OP_DISCARD? */
- __EE_IS_TRIM,
+ __EE_TRIM,
+ /* explicit zero-out requested, or
+ * our lower level cannot handle trim,
+ * and we want to fall back to zeroout instead */
+ __EE_ZEROOUT,
/* In case a barrier failed,
* we need to resubmit without the barrier flag. */
@@ -472,7 +476,8 @@ enum {
};
#define EE_CALL_AL_COMPLETE_IO (1<<__EE_CALL_AL_COMPLETE_IO)
#define EE_MAY_SET_IN_SYNC (1<<__EE_MAY_SET_IN_SYNC)
-#define EE_IS_TRIM (1<<__EE_IS_TRIM)
+#define EE_TRIM (1<<__EE_TRIM)
+#define EE_ZEROOUT (1<<__EE_ZEROOUT)
#define EE_RESUBMITTED (1<<__EE_RESUBMITTED)
#define EE_WAS_ERROR (1<<__EE_WAS_ERROR)
#define EE_HAS_DIGEST (1<<__EE_HAS_DIGEST)
@@ -1556,6 +1561,8 @@ extern void start_resync_timer_fn(struct timer_list *t);
extern void drbd_endio_write_sec_final(struct drbd_peer_request *peer_req);
/* drbd_receiver.c */
+extern int drbd_issue_discard_or_zero_out(struct drbd_device *device,
+ sector_t start, unsigned int nr_sectors, int flags);
extern int drbd_receiver(struct drbd_thread *thi);
extern int drbd_ack_receiver(struct drbd_thread *thi);
extern void drbd_send_ping_wf(struct work_struct *ws);