qemu-irix/block
Eric Blake 4af42e3cf1 block: Perform copy-on-read in loop
Improve our braindead copy-on-read implementation.  Pre-patch,
we have multiple issues:
- we create a bounce buffer and perform a write for the entire
request, even if the active image already has 99% of the
clusters occupied, and really only needs to copy-on-read the
remaining 1% of the clusters
- our bounce buffer is as large as the read request, and can
needlessly exhaust memory by using double the memory of
the request size (the original request plus our bounce buffer),
rather than a capped maximum overhead beyond the original
- if a driver has a max_transfer limit, we are bypassing the
normal code in bdrv_aligned_preadv() that fragments to that
limit, and instead attempt to read the entire buffer from the
driver in one go, which some drivers may assert on
- a client can issue a request of nearly 2G such that
rounding the request out to cluster boundaries results in a
byte count larger than 2G.  While this cannot exceed 32 bits,
it DOES have some follow-on problems:
-- the call to bdrv_driver_pread() can assert for exceeding
BDRV_REQUEST_MAX_BYTES, if the driver is old and lacks
.bdrv_co_preadv
-- if the buffer is all zeroes, the subsequent call to
bdrv_co_do_pwrite_zeroes is a no-op due to a negative size,
which means we did not actually copy on read
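
To make the last bullet concrete, the following sketch works through the
rounding arithmetic. It assumes a 64 KiB cluster size, a request starting at
offset 512, and a 32-bit int; round_to_clusters() and fits_in_int() are
hypothetical helpers written for this illustration, not QEMU functions.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Round the byte range [offset, offset + bytes) outward to cluster
 * boundaries. Hypothetical helper for illustration only.
 * cluster_size must be a power of two.
 */
static void round_to_clusters(int64_t offset, int64_t bytes,
                              int64_t cluster_size,
                              int64_t *cluster_offset,
                              int64_t *cluster_bytes)
{
    int64_t end = offset + bytes;

    assert(cluster_size > 0 && (cluster_size & (cluster_size - 1)) == 0);

    /* Round the start down and the end up to a cluster boundary. */
    *cluster_offset = offset & ~(cluster_size - 1);
    *cluster_bytes = ((end + cluster_size - 1) & ~(cluster_size - 1))
                     - *cluster_offset;
}

/* Returns 1 if v can be stored in a non-negative int, 0 otherwise. */
static int fits_in_int(int64_t v)
{
    return v >= 0 && v <= INT_MAX;
}
```

Under these assumptions, a request of 2147483136 bytes (2G minus 512) at
offset 512 rounds out to exactly 2147483648 bytes. That value fits in 32 bits
but not in a signed 32-bit int, which is how a size can end up negative once
it is passed through an int parameter.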

Fix all of these issues by breaking up the action into a loop,
where each iteration is capped to sane limits.  Also, querying
the allocation status allows us to optimize: when data is
already present in the active layer, we don't need to bounce.
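
The loop described above can be sketched against a toy in-memory image. This
is a simplified model, not the code in block/io.c: ToyImage, toy_is_allocated(),
toy_copy_on_read(), and the CLUSTER and MAX_BOUNCE constants are all
hypothetical, and the caller is assumed to keep the request inside the image.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { CLUSTER = 4, NCLUSTERS = 8, MAX_BOUNCE = 2 * CLUSTER };

typedef struct {
    unsigned char backing[NCLUSTERS * CLUSTER]; /* backing file data */
    unsigned char active[NCLUSTERS * CLUSTER];  /* active layer data */
    int allocated[NCLUSTERS];  /* 1 if the cluster is in the active layer */
    int bounce_writes;         /* number of copy-on-read writes performed */
} ToyImage;

/* Length of the run starting at the cluster-aligned offset that shares
 * one allocation status, capped at limit. */
static int64_t toy_is_allocated(const ToyImage *img, int64_t offset,
                                int64_t limit, int *allocated)
{
    int64_t c = offset / CLUSTER;
    int64_t n = 0;

    assert(c >= 0 && c < NCLUSTERS);
    *allocated = img->allocated[c];
    while (n < limit && c < NCLUSTERS && img->allocated[c] == *allocated) {
        n += CLUSTER;
        c++;
    }
    return n < limit ? n : limit;
}

/* Read bytes at offset into dst, copying unallocated clusters into the
 * active layer in chunks of at most MAX_BOUNCE bytes. */
static void toy_copy_on_read(ToyImage *img, int64_t offset, int64_t bytes,
                             unsigned char *dst)
{
    int64_t cluster_offset = offset - offset % CLUSTER;
    int64_t end = offset + bytes;
    int64_t cluster_end = (end + CLUSTER - 1) / CLUSTER * CLUSTER;
    int64_t cluster_bytes = cluster_end - cluster_offset;
    int64_t skip = offset - cluster_offset; /* head bytes the caller skips */
    int64_t progress = 0;
    unsigned char bounce[MAX_BOUNCE];

    assert(offset >= 0 && bytes >= 0 && end <= NCLUSTERS * CLUSTER);

    while (cluster_bytes > 0) {
        int allocated;
        int64_t limit = cluster_bytes < MAX_BOUNCE ? cluster_bytes : MAX_BOUNCE;
        int64_t pnum = toy_is_allocated(img, cluster_offset, limit, &allocated);
        int64_t avail = pnum - skip;
        int64_t n = avail < bytes - progress ? avail : bytes - progress;
        const unsigned char *src;

        if (allocated) {
            /* Data is already in the active layer: no bounce, no write. */
            src = img->active + cluster_offset;
        } else {
            /* Read whole clusters from backing, then write them back. */
            memcpy(bounce, img->backing + cluster_offset, pnum);
            memcpy(img->active + cluster_offset, bounce, pnum);
            for (int64_t c = cluster_offset / CLUSTER;
                 c < (cluster_offset + pnum) / CLUSTER; c++) {
                img->allocated[c] = 1;
            }
            img->bounce_writes++;
            src = bounce;
        }
        memcpy(dst + progress, src + skip, n);
        progress += n;
        cluster_offset += pnum;
        cluster_bytes -= pnum;
        skip = 0; /* only the first chunk has a partial head */
    }
}
```

Each iteration handles one run of clusters with a uniform allocation status,
so the bounce buffer never exceeds MAX_BOUNCE regardless of request size, and
runs that are already allocated are served without any write.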

Note that the code has a telling comment that copy-on-read
should probably be a filter driver rather than a bolt-on hack
in io.c; but that remains a task for another day.

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit cb2e28780c)
 Conflicts:
	block/io.c
* remove context dep on d855ebcd3
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-12-04 21:36:39 -06:00
Makefile.objs qcow2: add bitmaps extension 2017-07-11 17:44:57 +02:00
accounting.c block: make accounting thread-safe 2017-06-16 07:55:00 +08:00
backup.c Convert error_report() to warn_report() 2017-07-13 13:49:58 +02:00
blkdebug.c block: Add PreallocMode to bdrv_truncate() 2017-07-11 17:45:01 +02:00
blkreplay.c block: change variable names in BlockDriverState 2017-06-26 14:54:46 +02:00
blkverify.c blkverify: Catch bs->exact_filename overflow 2017-06-26 14:54:46 +02:00
block-backend.c block-backend: Allow more "can inactivate" cases 2017-08-23 10:21:55 -05:00
bochs.c block: do not set BDS read_only if copy_on_read enabled 2017-04-24 15:09:33 -04:00
cloop.c block: do not set BDS read_only if copy_on_read enabled 2017-04-24 15:09:33 -04:00
commit.c block: Skip implicit nodes in query-block/blockstats 2017-07-24 15:06:04 +02:00
crypto.c block: Add PreallocMode to bdrv_truncate() 2017-07-11 17:45:01 +02:00
crypto.h qcow: convert QCow to use QCryptoBlock for encryption 2017-07-11 17:44:56 +02:00
curl.c curl: do not do aio_poll when waiting for a free CURLState 2017-05-16 10:34:50 -04:00
dirty-bitmap.c dirty-bitmap: Report BlockDirtyInfo.count in bytes, as documented 2017-07-24 15:06:04 +02:00
dmg-bz2.c
dmg.c block: do not set BDS read_only if copy_on_read enabled 2017-04-24 15:09:33 -04:00
dmg.h
file-posix.c file-posix: Do runtime check for ofd lock API 2017-08-11 14:12:44 +02:00
file-win32.c block: Add PreallocMode to BD.bdrv_truncate() 2017-07-11 17:45:01 +02:00
gluster.c Error reporting patches for 2017-07-13 2017-07-14 09:36:40 +01:00
io.c block: Perform copy-on-read in loop 2017-12-04 21:36:39 -06:00
iscsi-opts.c block/iscsi: statically link qemu_iscsi_opts 2017-01-27 18:07:58 +01:00
iscsi.c Error reporting patches for 2017-07-13 2017-07-14 09:36:40 +01:00
linux-aio.c block: explicitly acquire aiocontext in aio callbacks that need it 2017-02-21 11:39:39 +00:00
mirror.c block/mirror: check backing in bdrv_mirror_top_flush 2017-12-04 20:40:23 -06:00
nbd-client.c nbd-client: avoid read_reply_co entry if send failed 2017-09-28 16:52:37 -05:00
nbd-client.h nbd-client: avoid spurious qio_channel_yield() re-entry 2017-08-23 11:22:15 -05:00
nbd.c nbd: Implement NBD_INFO_BLOCK_SIZE on client 2017-07-14 12:04:42 +02:00
nfs.c block/nfs: fix mutex assertion in nfs_file_close() 2017-08-08 15:19:16 +02:00
null.c block/null: Remove 'filename' option 2017-08-08 15:19:16 +02:00
parallels.c parallels: drop check that bdrv_truncate() is working 2017-08-08 15:19:16 +02:00
qapi.c block/qapi: Remove redundant NULL check to silence Coverity 2017-08-01 18:09:33 +02:00
qcow.c qcow: fix memory leaks related to encryption 2017-07-25 16:33:31 +02:00
qcow2-bitmap.c block/qcow2-bitmap: fix use of uninitialized pointer 2017-09-28 16:51:42 -05:00
qcow2-cache.c qcow2: Remove stale comment 2016-11-25 13:51:30 +01:00
qcow2-cluster.c qcow2: add support for LUKS encryption format 2017-07-11 17:44:56 +02:00
qcow2-refcount.c qcow2: fix null pointer dereference 2017-07-31 13:06:38 +03:00
qcow2-snapshot.c qcow2: Discard/zero clusters by byte count 2017-05-11 14:28:07 +02:00
qcow2.c qcow2: move qcow2_store_persistent_dirty_bitmaps() before cache flushing 2017-09-14 19:29:40 -05:00
qcow2.h block/qcow2: falloc/full preallocating growth 2017-07-11 17:45:02 +02:00
qed-check.c
qed-cluster.c qed: protect table cache with CoMutex 2017-07-17 11:34:11 +08:00
qed-l2-cache.c qed: protect table cache with CoMutex 2017-07-17 11:34:11 +08:00
qed-table.c qed: protect table cache with CoMutex 2017-07-17 11:34:11 +08:00
qed.c qed: protect table cache with CoMutex 2017-07-17 11:34:11 +08:00
qed.h qed: protect table cache with CoMutex 2017-07-17 11:34:11 +08:00
quorum.c quorum: Set sectors-count to 0 when reporting a flush error 2017-08-08 14:37:00 +02:00
raw-format.c block: Add PreallocMode to bdrv_truncate() 2017-07-11 17:45:01 +02:00
rbd.c Error reporting patches for 2017-07-13 2017-07-14 09:36:40 +01:00
replication.c block: Make bdrv_is_allocated_above() byte-based 2017-07-10 13:18:07 +02:00
sheepdog.c sheepdog: add queue_lock 2017-07-17 11:34:20 +08:00
snapshot.c qobject: Use simpler QDict/QList scalar insertion macros 2017-05-09 09:13:51 +02:00
ssh.c ssh: support I/O from any AioContext 2017-07-17 11:34:20 +08:00
stream.c block: Make bdrv_is_allocated_above() byte-based 2017-07-10 13:18:07 +02:00
throttle-groups.c block/throttle-groups.c: allocate RestartData on the heap 2017-09-28 16:49:39 -05:00
trace-events block: move trace probes into bdrv_co_preadv|pwritev 2017-08-07 09:39:35 +01:00
vdi.c vdi: make it thread-safe 2017-07-17 11:28:15 +08:00
vhdx-endian.c
vhdx-log.c block/vhdx: check error return of bdrv_truncate() 2017-08-08 14:37:00 +02:00
vhdx.c block/vhdx: check for offset overflow to bdrv_truncate() 2017-08-08 14:37:00 +02:00
vhdx.h
vmdk.c vmdk: Fix error handling/reporting of vmdk_check 2017-08-08 15:19:16 +02:00
vpc.c vpc: Check failure of bdrv_getlength() 2017-08-11 13:23:40 +02:00
vvfat.c block/vvfat: Fix compiler warning with gcc 7 2017-07-18 15:14:36 +02:00
vxhs.c qobject: Use simpler QDict/QList scalar insertion macros 2017-05-09 09:13:51 +02:00
win32-aio.c block: explicitly acquire aiocontext in aio callbacks that need it 2017-02-21 11:39:39 +00:00
write-threshold.c