qemu-irix

Commit Graph

Author	SHA1	Message	Date
Maxime Coquelin	0bc76c8d08	vhost: restore avail index from vring used index on disconnection vhost_virtqueue_stop() gets avail index value from the backend, except if the backend is not responding. It happens when the backend crashes, and in this case, internal state of the virtio queue is inconsistent, causing packets to corrupt the vring state. With a Linux guest, this results in the following error messages on backend reconnection: [ 22.444905] virtio_net virtio0: output.0:id 0 is not a head! [ 22.446746] net enp0s3: Unexpected TXQ (0) queue failure: -5 [ 22.476360] net enp0s3: Unexpected TXQ (0) queue failure: -5 Fixes: `283e2c2adc` ("net: virtio-net discards TX data after link down") Cc: qemu-stable@nongnu.org Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit `2ae39a113a`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-06 09:55:22 -06:00
Maxime Coquelin	059422ddbc	virtio: Add queue interface to restore avail index from vring used index In case of backend crash, it is not possible to restore internal avail index from the backend value as vhost_get_vring_base callback fails. This patch provides a new interface to restore internal avail index from the vring used index, as done by some vhost-user backend on reconnection. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit `2d4ba6cc74`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-06 09:55:16 -06:00
Max Reitz	d6c99e8ff5	util/stats64: Fix min/max comparisons stat64_min_slow() and stat64_max_slow() compare the wrong way. This makes iotest 136 fail with clang and -m32. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20171114232223.25207-1-mreitz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `26a5db322b`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-06 09:53:22 -06:00
Eric Blake	56a10ff664	nbd/client: Use error_prepend() correctly When using error_prepend(), it is necessary to end with a space in the format string; otherwise, messages come out incorrectly, such as when connecting to a socket that hangs up immediately: can't open device nbd://localhost:10809/: Failed to read dataUnexpected end-of-file before all bytes were read Originally botched in commit `e44ed99d`, then several more instances added in the meantime. Pre-existing and not fixed here: we are inconsistent on capitalization; some of our messages start with lower case, and others start with upper, although the use of error_prepend() is much nicer to read when all fragments consistently start with lower. CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171113152424.25381-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> (cherry picked from commit `cb6b1a3fc3`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-06 09:50:49 -06:00
Jens Freimann	69f562ad9e	net: fix check for number of parameters to -netdev socket Since commit `0f8c289ad` "net: fix -netdev socket,fd= for UDP sockets" we allow more than one parameter for -netdev socket. But now we run into an assert when no parameter at all is specified > qemu-system-x86_64 -netdev socket socket.c:729: net_init_socket: Assertion `sock->has_udp' failed. Fix this by reverting the change of the if condition done in `0f8c289ad`. Cc: Jason Wang <jasowang@redhat.com> Cc: qemu-stable@nongnu.org Fixes: `0f8c289ad5` Reported-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Signed-off-by: Jens Freimann <jfreimann@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit `ff86d57625`) Conflicts: net/socket.c * drop context dep on `0522a959` Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-06 09:49:07 -06:00
Jens Freimann	957bd48acf	net/socket: fix coverity issue This fixes coverity issue CID1005339. Make sure that saddr is not used uninitialized if the mcast parameter is NULL. Cc: qemu-stable@nongnu.org Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Jens Freimann <jfreimann@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> (cherry picked from commit `bb160b571f`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-06 09:43:29 -06:00
Eric Auger	3a82a03a2e	hw/intc/arm_gicv3_its: Don't abort on table save failure The ITS is not fully properly reset at the moment. Caches are not emptied. After a reset, in case we attempt to save the state before the bound devices have registered their MSIs and after the 1st level table has been allocated by the ITS driver (device BASER is valid), the first level entries are still invalid. If the device cache is not empty (devices registered before the reset), vgic_its_save_device_tables fails with -EINVAL. This causes a QEMU abort(). Cc: qemu-stable@nongnu.org Signed-off-by: Eric Auger <eric.auger@redhat.com> Reported-by: wanghaibin <wanghaibin.wang@huawei.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit `8a7348b5d6`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-06 09:42:12 -06:00
Peter Maydell	b637b865ed	translate.c: Fix usermode big-endian AArch32 LDREXD and STREXD For AArch32 LDREXD and STREXD, architecturally the 32-bit word at the lowest address is always Rt and the one at addr+4 is Rt2, even if the CPU is big-endian. Our implementation does these with a single 64-bit store, so if we're big-endian then we need to put the two 32-bit halves together in the opposite order to little-endian, so that they end up in the right places. We were trying to do this with the gen_aa32_frob64() function, but that is not correct for the usermode emulator, because there there is a distinction between "load a 64 bit value" (which does a BE 64-bit access and doesn't need swapping) and "load two 32 bit values as one 64 bit access" (where we still need to do the swapping, like system mode BE32). Fixes: https://bugs.launchpad.net/qemu/+bug/1725267 Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 1509622400-13351-1-git-send-email-peter.maydell@linaro.org (cherry picked from commit `3448d47b31`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-06 09:41:14 -06:00
Greg Kurz	3342fd0286	ppc: fix setting of compat mode While trying to make KVM PR usable again, commit 5dfaa532ae introduced a regression: the current compat_pvr value is passed to KVM instead of the new one. This means that we always pass 0 instead of the max-cpu-compat PVR during the initial machine reset. And at CAS time, we either pass the PVR from the command line or even don't call kvmppc_set_compat() at all, ie, the PCR will not be set as expected. For example if we start a big endian fedora26 guest in power7 compat mode on a POWER8 host, we get this in the guest: $ cat /proc/cpuinfo processor : 0 cpu : POWER7 (architected), altivec supported clock : 4024.000000MHz revision : 2.0 (pvr 004d 0200) timebase : 512000000 platform : pSeries model : IBM pSeries (emulated by qemu) machine : CHRP IBM pSeries (emulated by qemu) MMU : Hash but the guest can still execute POWER8 instructions, and the following program succeeds: int main() { asm("vncipher 0,0,0"); // ISA 2.07 instruction } Let's pass the new compat_pvr to kvmppc_set_compat() and the program fails with SIGILL as expected. Reported-by: Nageswara R Sastry <rnsastry@linux.vnet.ibm.com> Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit `e4f0c6bb1a`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-06 09:38:59 -06:00
Daniel P. Berrange	e0809fcc4b	io: monitor encoutput buffer size from websocket GSource The websocket GSource is monitoring the size of the rawoutput buffer to determine if the channel can accept more writes. The rawoutput buffer, however, is merely a temporary staging buffer before data is copied into the encoutput buffer. Thus its size will always be zero when the GSource runs. This flaw causes the encoutput buffer to grow without bound if the other end of the underlying data channel doesn't read data being sent. This can be seen with VNC if a client is on a slow WAN link and the guest OS is sending many screen updates. A malicious VNC client can act like it is on a slow link by playing a video in the guest and then reading data very slowly, causing QEMU host memory to expand arbitrarily. This issue is assigned CVE-2017-15268, publicly reported in https://bugs.launchpad.net/qemu/+bug/1718964 (cherry picked from commit `a7b20a8efa`) Reviewed-by: Eric Blake <eblake@redhat.com> [Dan: Added extra checks to deal with code refactored in master but not stable 2.10] Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-06 09:38:20 -06:00
Paolo Bonzini	e31942b486	nios2: define tcg_env This should be done by all target and, since commit `53f6672bcf` ("gen-icount: use tcg_ctx.tcg_env instead of cpu_env", 2017-06-30), is causing the NIOS2 target to hang. This is because the test for "should I exit to the main loop" was being done with the correct offset to the icount decrementer, but using TCG temporary 0 (the frame pointer) rather than the env pointer. Cc: qemu-stable@nongnu.org Cc: Marek Vasut <marex@denx.de> Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `17bd9597be`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-06 09:32:04 -06:00
Max Reitz	5aa698ab5f	iotests: Add cluster_size=64k to 125 Apparently it would be a good idea to test that, too. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171009215533.12530-4-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit `4c112a397c`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-05 19:41:16 -06:00
Max Reitz	39475b8805	qcow2: Always execute preallocate() in a coroutine Some qcow2 functions (at least perform_cow()) expect s->lock to be taken. Therefore, if we want to make use of them, we should execute preallocate() (as "preallocate_co") in a coroutine so that we can use the qemu_co_mutex_* functions. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171009215533.12530-3-mreitz@redhat.com Cc: qemu-stable@nongnu.org Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit `572b07bea1`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-05 19:40:38 -06:00
Max Reitz	a25aca75f8	qcow2: Fix unaligned preallocated truncation A qcow2 image file's length is not required to have a length that is a multiple of the cluster size. However, qcow2_refcount_area() expects an aligned value for its @start_offset parameter, so we need to round @old_file_size up to the next cluster boundary. Reported-by: Ping Li <pingl@redhat.com> Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1414049 Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171009215533.12530-2-mreitz@redhat.com Cc: qemu-stable@nongnu.org Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit `e400ad1e1f`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-05 19:40:33 -06:00
Michael Olbrich	64f62e4e90	hw/sd: fix out-of-bounds check for multi block reads The current code checks if the next block exceeds the size of the card. This generates an error while reading the last block of the card. Do the out-of-bounds check when starting to read a new block to fix this. This issue became visible with increased error checking in Linux 4.13. Cc: qemu-stable@nongnu.org Signed-off-by: Michael Olbrich <m.olbrich@pengutronix.de> Reviewed-by: Alistair Francis <alistair.francis@xilinx.com> Message-id: 20170916091611.10241-1-m.olbrich@pengutronix.de Signed-off-by: Peter Maydell <peter.maydell@linaro.org> (cherry picked from commit `8573378e62`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-05 19:39:35 -06:00
Maxime Coquelin	d765c5e577	memory: fix off-by-one error in memory_region_notify_one() This patch fixes an off-by-one error that could lead to the notifyee to receive notifications for ranges it is not registered to. The bug has been spotted by code review. Fixes: `bd2bfa4c52` ("memory: introduce memory_region_notify_one()") Cc: qemu-stable@nongnu.org Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Message-Id: <20171010094247.10173-4-maxime.coquelin@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `b021d1c044`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 22:42:41 -06:00
Peter Xu	ae13e2cfa8	exec: simplify address_space_get_iotlb_entry This patch lets address_space_get_iotlb_entry() use the newly introduced page_mask parameter in flatview_do_translate(). Then we will be sure the IOTLB can be aligned to page mask, also we should nicely support huge pages now when introducing `a764040`. Fixes: `a764040` ("exec: abstract address_space_do_translate()") Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-Id: <20171010094247.10173-3-maxime.coquelin@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `076a93d797`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 22:42:34 -06:00
Peter Xu	c9dbe3e0fc	exec: add page_mask for flatview_do_translate The function is originally used for flatview_space_translate() and what we care about most is (xlat, plen) range. However for iotlb requests, we don't really care about "plen", but the size of the page that "xlat" is located on. While, plen cannot really contain this information. A simple example to show why "plen" is not good for IOTLB translations: E.g., for huge pages, it is possible that guest mapped 1G huge page on device side that used this GPA range: 0x100000000 - 0x13fffffff Then let's say we want to translate one IOVA that finally mapped to GPA 0x13ffffe00 (which is located on this 1G huge page). Then here we'll get: (xlat, plen) = (0x13ffffe00, 0x200) So the IOTLB would be only covering a very small range since from "plen" (which is 0x200 bytes) we cannot tell the size of the page. Actually we can really know that this is a huge page - we just throw the information away in flatview_do_translate(). This patch introduced "page_mask" optional parameter to capture that page mask info. Also, I made "plen" an optional parameter as well, with some comments for the whole function. No functional change yet. Signed-off-by: Peter Xu <peterx@redhat.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Message-Id: <20171010094247.10173-2-maxime.coquelin@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `d5e5fafd11`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 22:42:28 -06:00
Alexey Kardashevskiy	496f97293e	memory: Share special empty FlatView This shares an cached empty FlatView among address spaces. The empty FV is used every time when a root MR renders into a FV without memory sections which happens when MR or its children are not enabled or zero-sized. The empty_view is not NULL to keep the rest of memory API intact; it also has a dispatch tree for the same reason. On POWER8 with 255 CPUs, 255 virtio-net, 40 PCI bridges guest this halves the amount of FlatView's in use (557 -> 260) and dispatch tables (~800000 -> ~370000). In an unrelated experiment with 112 non-virtio devices on x86 ("-M pc"), only 4 FlatViews are alive, and about ~2000 are created at startup. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-16-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `092aa2fc65`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 22:42:02 -06:00
Paolo Bonzini	639701e4f2	memory: seek FlatView sharing candidates among children subregions A container can be used instead of an alias to allow switching between multiple subregions. In this case we cannot directly share the subregions (since they only belong to a single parent), but if the subregions are aliases we can in turn walk those. This is not enough to remove all source of quadratic FlatView creation, but it enables sharing of the PCI bus master FlatViews (and their AddressSpaceDispatch structures) across all PCI devices. For 112 virtio-net-pci devices, boot time is reduced from 25 to 10 seconds and memory consumption from 1.4 to 1 G. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `e673ba9af9`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 22:41:57 -06:00
Paolo Bonzini	5dbd1f7884	memory: trace FlatView creation and destruction Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `02d9651d6a`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 22:41:52 -06:00
Alexey Kardashevskiy	5b5e49ab5f	memory: Create FlatView directly This avoids usual memory_region_transaction_commit() which rebuilds all FVs. On POWER8 with 255 CPUs, 255 virtio-net, 40 PCI bridges guest this brings down the boot time from 25s to 20s and reduces the amount of temporary FVs allocated during machine construction (~800000 -> ~640000) and amount of temporary dispatch trees (~370000 -> ~300000), the total memory footprint goes down (18G -> 17G). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-18-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `202fc01b05`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 22:32:47 -06:00
Alexey Kardashevskiy	a7bb94e784	memory: Get rid of address_space_init_shareable Since FlatViews are shared now and ASes not, this gets rid of address_space_init_shareable(). This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-17-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `b516572f31`) Conflicts: target/arm/cpu.c * drop context deps on `1d2091bc` and `1e577cc7` Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 22:32:11 -06:00
Alexey Kardashevskiy	7dd7f7ef44	memory: Do not allocate FlatView in address_space_init This creates a new AS object without any FlatView as memory_region_transaction_commit() may want to reuse the empty FV. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-14-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `67ace39b25`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 22:04:50 -06:00
Alexey Kardashevskiy	e8c7ea3e75	memory: Share FlatView's and dispatch trees between address spaces This allows sharing flat views between address spaces (AS) when the same root memory region is used when creating a new address space. This is done by walking through all ASes and caching one FlatView per a physical root MR (i.e. not aliased). This removes search for duplicates from address_space_init_shareable() as FlatViews are shared elsewhere and keeping as::ref_count correct seems an unnecessary and useless complication. This should cause no change and memory use or boot time yet. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-13-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `967dc9b119`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 22:04:44 -06:00
Alexey Kardashevskiy	c943efe8b5	memory: Move address_space_update_ioeventfds So it is called (twice) from the same function. This is to make the next patches a bit simpler. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-12-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `0221848764`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 22:04:32 -06:00
Alexey Kardashevskiy	c14ce078b2	memory: Alloc dispatch tree where topology is generated This is to make next patches simpler. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-11-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `9bf561e36c`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 22:04:19 -06:00
Alexey Kardashevskiy	260d3646b0	memory: Store physical root MR in FlatView Address spaces get to keep a root MR (alias or not) but FlatView stores the actual MR as this is going to be used later on to decide whether to share a particular FlatView or not. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-10-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `89c177bbdd`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 22:04:14 -06:00
Alexey Kardashevskiy	08101db63b	memory: Rename mem_begin/mem_commit/mem_add helpers This renames some helpers to reflect better what they do. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-9-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `8629d3fcb7`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 22:04:00 -06:00
Alexey Kardashevskiy	eff5ed4ae9	memory: Cleanup after switching to FlatView We store AddressSpaceDispatch* in FlatView anyway so there is no need to carry it from mem_add() to register_subpage/register_multipage. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-8-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `9950322a59`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 22:03:54 -06:00
Alexey Kardashevskiy	f7774e329b	memory: Switch memory from using AddressSpace to FlatView FlatView's will be shared between AddressSpace's and subpage_t and MemoryRegionSection cannot store AS anymore, hence this change. In particular, for: typedef struct subpage_t { MemoryRegion iomem; - AddressSpace *as; + FlatView *fv; hwaddr base; uint16_t sub_section[]; } subpage_t; struct MemoryRegionSection { MemoryRegion *mr; - AddressSpace *address_space; + FlatView *fv; hwaddr offset_within_region; Int128 size; hwaddr offset_within_address_space; bool readonly; }; This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-7-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `166206845f`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 22:03:46 -06:00
Paolo Bonzini	3568e11940	memory: avoid "resurrection" of dead FlatViews It's possible for address_space_get_flatview() as it currently stands to cause a use-after-free for the returned FlatView, if the reference count is incremented after the FlatView has been replaced by a writer: thread 1 thread 2 RCU thread ------------------------------------------------------------- rcu_read_lock read as->current_map set as->current_map flatview_unref '--> call_rcu flatview_ref [ref=1] rcu_read_unlock flatview_destroy <badness> Since FlatViews are not updated very often, we can just detect the situation using a new atomic op atomic_fetch_inc_nonzero, similar to Linux's atomic_inc_not_zero, which performs the refcount increment only if it hasn't already hit zero. This is similar to Linux commit de09a9771a53 ("CRED: Fix get_task_cred() and task_state() to not resurrect dead credentials", 2010-07-29). Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `447b0d0b9e`) Conflicts: docs/devel/atomics.txt * drop documentation ref to atomic_fetch_xor * prereq for `166206845f` Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 22:03:33 -06:00
Alexey Kardashevskiy	d0136db812	memory: Remove AddressSpace pointer from AddressSpaceDispatch AS in ASD is only used to pass AS from mem_begin() to register_subpage() to store it in MemoryRegionSection, we can do this directly now. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-6-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `c775252378`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 21:55:04 -06:00
Alexey Kardashevskiy	4d2f8abb22	memory: Move AddressSpaceDispatch from AddressSpace to FlatView As we are going to share FlatView's between AddressSpace's, and AddressSpaceDispatch is a structure to perform quick lookup in FlatView, this moves ASD to FlatView. After previously open coded ASD rendering, we can also remove as->next_dispatch as the new FlatView pointer is stored on a stack and set to an AS atomically. flatview_destroy() is executed under RCU instead of address_space_dispatch_free() now. This makes mem_begin/mem_commit to work with ASD and mem_add with FV as later on mem_add will be taking FV as an argument anyway. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-5-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `66a6df1dc6`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 21:54:59 -06:00
Alexey Kardashevskiy	de7e6815b8	memory: Move FlatView allocation to a helper This moves a FlatView allocation and initialization to a helper. While we are here, replace g_new with g_new0 so that we need not bother if we add new fields in the future. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-4-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `cc94cd6d36`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 21:54:44 -06:00
Alexey Kardashevskiy	1b04a15809	memory: Open code FlatView rendering We are going to share FlatView's between AddressSpace's and per-AS memory listeners won't suit the purpose anymore so open code the dispatch tree rendering. Since there is a good chance that dispatch_listener was the only listener, this avoids address_space_update_topology_pass() if there is no registered listeners; this should improve starting time. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-3-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `9a62e24f45`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 21:54:37 -06:00
Alexey Kardashevskiy	6424975ce9	exec: Explicitly export target AS from address_space_translate_internal This adds an AS** parameter to address_space_do_translate() to make it easier for the next patch to share FlatViews. This should cause no behavioural change. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20170921085110.25598-2-aik@ozlabs.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `e76bb18f7e`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 21:54:29 -06:00
Eric Blake	4af42e3cf1	block: Perform copy-on-read in loop Improve our braindead copy-on-read implementation. Pre-patch, we have multiple issues: - we create a bounce buffer and perform a write for the entire request, even if the active image already has 99% of the clusters occupied, and really only needs to copy-on-read the remaining 1% of the clusters - our bounce buffer was as large as the read request, and can needlessly exhaust our memory by using double the memory of the request size (the original request plus our bounce buffer), rather than a capped maximum overhead beyond the original - if a driver has a max_transfer limit, we are bypassing the normal code in bdrv_aligned_preadv() that fragments to that limit, and instead attempt to read the entire buffer from the driver in one go, which some drivers may assert on - a client can request a large request of nearly 2G such that rounding the request out to cluster boundaries results in a byte count larger than 2G. While this cannot exceed 32 bits, it DOES have some follow-on problems: -- the call to bdrv_driver_pread() can assert for exceeding BDRV_REQUEST_MAX_BYTES, if the driver is old and lacks .bdrv_co_preadv -- if the buffer is all zeroes, the subsequent call to bdrv_co_do_pwrite_zeroes is a no-op due to a negative size, which means we did not actually copy on read Fix all of these issues by breaking up the action into a loop, where each iteration is capped to sane limits. Also, querying the allocation status allows us to optimize: when data is already present in the active layer, we don't need to bounce. Note that the code has a telling comment that copy-on-read should probably be a filter driver rather than a bolt-on hack in io.c; but that remains a task for another day. 
CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com> (cherry picked from commit `cb2e28780c`) Conflicts: block/io.c * remove context dep on `d855ebcd3` Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 21:36:39 -06:00
Jim Somerville	26914ce48d	kvmclock: use the updated system_timer_msr Fixes `e2b6c17` (kvmclock: update system_time_msr address forcibly) which makes a call to get the latest value of the address stored in system_timer_msr, but then uses the old address anyway. Signed-off-by: Jim Somerville <Jim.Somerville@windriver.com> Message-Id: <59b67db0bd15a46ab47c3aa657c81a4c11f168ea.1506702472.git.Jim.Somerville@windriver.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> (cherry picked from commit `346b1215b1`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 20:42:20 -06:00
Vladimir Sementsov-Ogievskiy	49958d37e7	block/mirror: check backing in bdrv_mirror_top_flush Backing may be zero after failed bdrv_append in mirror_start_job, which leads to SIGSEGV. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20170929152255.5431-1-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com> (cherry picked from commit `ce960aa906`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 20:40:23 -06:00
Thomas Huth	b234266086	hw/usb/bus: Remove bad object_unparent() from usb_try_create_simple() Valgrind detects an invalid read operation when hot-plugging of an USB device fails: $ valgrind x86_64-softmmu/qemu-system-x86_64 -device usb-ehci -nographic -S ==30598== Memcheck, a memory error detector ==30598== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al. ==30598== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info ==30598== Command: x86_64-softmmu/qemu-system-x86_64 -device usb-ehci -nographic -S ==30598== QEMU 2.10.50 monitor - type 'help' for more information (qemu) device_add usb-tablet (qemu) device_add usb-tablet (qemu) device_add usb-tablet (qemu) device_add usb-tablet (qemu) device_add usb-tablet (qemu) device_add usb-tablet ==30598== Invalid read of size 8 ==30598== at 0x60EF50: object_unparent (object.c:445) ==30598== by 0x580F0D: usb_try_create_simple (bus.c:346) ==30598== by 0x581BEB: usb_claim_port (bus.c:451) ==30598== by 0x582310: usb_qdev_realize (bus.c:257) ==30598== by 0x4CB399: device_set_realized (qdev.c:914) ==30598== by 0x60E26D: property_set_bool (object.c:1886) ==30598== by 0x61235E: object_property_set_qobject (qom-qobject.c:27) ==30598== by 0x61000F: object_property_set_bool (object.c:1162) ==30598== by 0x4567C3: qdev_device_add (qdev-monitor.c:630) ==30598== by 0x456D52: qmp_device_add (qdev-monitor.c:807) ==30598== by 0x470A99: hmp_device_add (hmp.c:1933) ==30598== by 0x3679C3: handle_hmp_command (monitor.c:3123) The object_unparent() here is not necessary anymore since commit `69382d8b3e` ("qdev: Fix object reference leak in case device.realize() fails"), so let's remove it now. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com> Message-id: 1506526106-30971-1-git-send-email-thuth@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> (cherry picked from commit `f3b2bea3c7`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-12-04 20:37:19 -06:00
Daniel Henrique Barboza	62695f60c3	hw/ppc: CAS reset on early device hotplug This patch is a follow-up on the discussions made in patch "hw/ppc: disable hotplug before CAS is completed" that can be found at [1]. At this moment, we do not support CPU/memory hotplug in early boot stages, before CAS. When a hotplug occurs, the event is logged in an internal RTAS event log queue and an IRQ pulse is fired. In regular conditions, the guest handles the interrupt by executing check_exception, fetching the generated hotplug event and enabling the device for use. In early boot, this IRQ isn't caught (SLOF does not handle hotplug events), leaving the event in the RTAS event log queue. If the guest executes check_exception due to another hotplug event, the re-assertion of the IRQ ends up de-queuing the first hotplug event as well. In short, a device hotplugged before CAS is considered coldplugged by SLOF. This leads to device misbehavior and, in some cases, guest kernel Oops when trying to unplug the device. A proper fix would be to turn every device hotplugged before CAS into a coldplugged device. This is not trivial to do with the current code base though - the FDT is written in the guest memory at ppc_spapr_reset and can't be retrieved without adding extra state (fdt_size for example) that will need to be managed and migrated. Adding the hotplugged DT in the middle of CAS negotiation via the updated DT tree works with CPU devs, but panics the guest kernel at boot. Additional analysis would be necessary for LMBs and PCI devices. There are questions to be made in QEMU/SLOF/kernel level about how we can make this change in a sustainable way. With Linux guests, a fix would be the kernel executing check_exception at boot time, de-queueing the events that happened in early boot and processing them. However, even if/when the newer kernels start fetching these events at boot time, we need to take care of older kernels that won't be doing that. 
This patch works around the situation by issuing a CAS reset if a hotplugged device is detected during CAS: - the DRC conditions that warrant a CAS reset are the same as those that trigger a DRC migration - the DRC must have a device attached and the DRC state is not equal to its ready_state. With that in mind, this patch makes use of 'spapr_drc_needed' to determine if a CAS reset is needed. - In the middle of CAS negotiations, the function 'spapr_hotplugged_dev_before_cas' goes through all the DRCs to see if there are any DRCs that require a reset, using spapr_drc_needed. If that happens, it returns '1' in 'spapr_h_cas_compose_response' which will set spapr->cas_reboot to true, causing the machine to reboot. No changes are made for coldplug devices. [1] http://lists.nongnu.org/archive/html/qemu-devel/2017-08/msg02855.html Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> (cherry picked from commit `10f12e6450`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-10-03 17:40:40 -05:00
Michael Roth	7851197b81	Update version for 2.10.1 release Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-10-02 12:42:58 -05:00
Peter Lieven	547435f550	migration: disable auto-converge during bulk block migration auto-converge and block migration currently do not play well together. During block migration the auto-converge logic detects that ram migration makes no progress and thus throttles down the vm until it nearly stalls completely. Avoid this by disabling the throttling logic during the bulk phase of the block migration. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Lieven <pl@kamp.de> Message-Id: <1506421996-12513-1-git-send-email-pl@kamp.de> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> (cherry picked from commit `9ac78b6171`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-09-28 16:55:44 -05:00
Christian Borntraeger	17cd46fbdf	s390x/cpumodel: remove ais from z14 default model-> also for 2.10.1 We disabled ais for 2.10, so let's also remove it from the z14 default model. Fixes: `3f2d07b3b0` ("s390x/ais: for 2.10 stable: disable ais facility") CC: qemu-stable@nongnu.org Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Message-Id: <20170927072030.35737-2-borntraeger@de.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> (cherry picked from commit `9dacc90846`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-09-28 16:55:08 -05:00
Anthony PERARD	6a903482b1	Revert "ACPI: don't call acpi_pcihp_device_plug_cb on xen" This reverts commit `153eba4726`. This patch prevents PCI passthrough hotplug on Xen. Even if the Xen tool stack prepares its own ACPI tables, we still rely on QEMU for hotplug ACPI notifications. The original issue is fixed by the two previous patches: hw/acpi: Limit hotplug to root bus on legacy mode hw/acpi: Move acpi_set_pci_info to pcihp Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit `2bed1ba77f`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-09-28 16:53:20 -05:00
Anthony PERARD	8edf4c6adc	hw/acpi: Move acpi_set_pci_info to pcihp HW part of ACPI PCI hotplug in QEMU depends on ACPI_PCIHP_PROP_BSEL being set on a PCI bus that supports ACPI hotplug. It should work regardless of the source of ACPI tables (QEMU generator/legacy SeaBIOS/Xen). So move ACPI_PCIHP_PROP_BSEL initialization into HW ACPI implementation part from QEMU's ACPI table generator. To do PCI passthrough with Xen, the property ACPI_PCIHP_PROP_BSEL needs to be set, but this was done only when ACPI tables are built which is not needed for a Xen guest. The need for the property starts with commit "pc: pcihp: avoid adding ACPI_PCIHP_PROP_BSEL twice" (`f0c9d64a68`). Adding find_i440fx into stubs so that mips-softmmu target can be built. Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit `ab938ae43f`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-09-28 16:53:11 -05:00
Anthony PERARD	2c3a8cc581	hw/acpi: Limit hotplug to root bus on legacy mode Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> (cherry picked from commit `f5855994fe`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-09-28 16:53:04 -05:00
Stefan Hajnoczi	0691b70a2a	nbd-client: avoid read_reply_co entry if send failed The following segfault is encountered if the NBD server closes the UNIX domain socket immediately after negotiation: Program terminated with signal SIGSEGV, Segmentation fault. #0 aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441 441 QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines, (gdb) bt #0 0x000000d3c01a50f8 in aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441 #1 0x000000d3c012fa90 in nbd_coroutine_end (bs=bs@entry=0xd3c0fec650, request=<optimized out>) at block/nbd-client.c:207 #2 0x000000d3c012fb58 in nbd_client_co_preadv (bs=0xd3c0fec650, offset=0, bytes=<optimized out>, qiov=0x7ffc10a91b20, flags=0) at block/nbd-client.c:237 #3 0x000000d3c0128e63 in bdrv_driver_preadv (bs=bs@entry=0xd3c0fec650, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=0) at block/io.c:836 #4 0x000000d3c012c3e0 in bdrv_aligned_preadv (child=child@entry=0xd3c0ff51d0, req=req@entry=0x7f31885d6e90, offset=offset@entry=0, bytes=bytes@entry=512, align=align@entry=1, qiov=qiov@entry=0x7ffc10a91b20, flags=0) at block/io.c:1086 #5 0x000000d3c012c6b8 in bdrv_co_preadv (child=0xd3c0ff51d0, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=flags@entry=0) at block/io.c:1182 #6 0x000000d3c011cc17 in blk_co_preadv (blk=0xd3c0ff4f80, offset=0, bytes=512, qiov=0x7ffc10a91b20, flags=0) at block/block-backend.c:1032 #7 0x000000d3c011ccec in blk_read_entry (opaque=0x7ffc10a91b40) at block/block-backend.c:1079 #8 0x000000d3c01bbb96 in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:79 #9 0x00007f3196cb8600 in __start_context () at /lib64/libc.so.6 The problem is that nbd_client_init() uses nbd_client_attach_aio_context() -> aio_co_schedule(new_context, client->read_reply_co). Execution of read_reply_co is deferred to a BH which doesn't run until later. 
In the mean time blk_co_preadv() can be called and nbd_coroutine_end() calls aio_wake() on read_reply_co. At this point in time read_reply_co's ctx isn't set because it has never been entered yet. This patch simplifies the nbd_co_send_request() -> nbd_co_receive_reply() -> nbd_coroutine_end() lifecycle to just nbd_co_send_request() -> nbd_co_receive_reply(). The request is "ended" if an error occurs at any point. Callers no longer have to invoke nbd_coroutine_end(). This cleanup also eliminates the segfault because we don't call aio_co_schedule() to wake up s->read_reply_co if sending the request failed. It is only necessary to wake up s->read_reply_co if a reply was received. Note this only happens with UNIX domain sockets on Linux. It doesn't seem possible to reproduce this with TCP sockets. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20170829122745.14309-2-stefanha@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> (cherry picked from commit `3c2d5183f9`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-09-28 16:52:37 -05:00
Alex Bennée	4d824886c8	accel/tcg/cputlb: avoid recursive BQL (fixes #1706296 ) The mmio path (see exec.c:prepare_mmio_access) already protects itself against recursive locking and it makes sense to do the same for io_readx/writex. Otherwise any helper running in the BQL context will assert when it attempts to write to device memory as in the case of the bug report. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> CC: Richard Jones <rjones@redhat.com> CC: Paolo Bonzini <bonzini@gnu.org> CC: qemu-stable@nongnu.org Message-Id: <20170921110625.9500-1-alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> (cherry picked from commit `8b81253332`) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2017-09-28 16:52:09 -05:00

55476 Commits, All Branches