This interface allows a guest to control various platform/device
sensors. Initially, we only implement support necessary to control
sensors that are required for hotplug: DR connector indicators/LEDs,
resource allocation state, and resource isolation state.
See docs/specs/ppc-spapr-hotplug.txt for a complete description of
this interface.
Signed-off-by: Mike Day <ncmike@ncultra.org>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
These interfaces manage the power domains that guest devices are
assigned to and are used to power on/off devices. Currently we
only utilize 1 power domain, the 'live-insertion' domain, which
automates power management of plugged/unplugged devices, essentially
making these calls no-ops, but the RTAS interfaces are still required
by guest hotplug code and PAPR+.
See docs/specs/ppc-spapr-hotplug.txt for a complete description of
these interfaces.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This device emulates a firmware abstraction used by pSeries guests to
manage hotplug/dynamic-reconfiguration of host-bridges, PCI devices,
memory, and CPUs. It is conceptually similar to an SHPC device,
complete with LED indicators to identify individual slots to physical
physical users and indicate when it is safe to remove a device. In
some cases it is also used to manage virtualized resources, such a
memory, CPUs, and physical-host bridges, which in the case of pSeries
guests are virtualized resources where the physical components are
managed by the host.
Guests communicate with these DR Connectors using RTAS calls,
generally by addressing the unique DRC index associated with a
particular connector for a particular resource. For introspection
purposes we expose this state initially as QOM properties, and
in subsequent patches will introduce the RTAS calls that make use of
it. This constitutes to the 'guest' interface.
On the QEMU side we provide an attach/detach interface to associate
or cleanup a DeviceState with a particular sPAPRDRConnector in
response to hotplug/unplug, respectively. This constitutes the
'physical' interface to the DR Connector.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
hw_error() is designed for printing CPU-related error messages
(e.g. it also prints a full CPU register dump). For error messages
that are not directly related to CPU problems, a function like
error_report() should be used instead.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
When specifying a non-existing file with the "-bios" parameter, QEMU
complained that it "could not find LPAR rtas". That's obviously a
copy-n-paste bug from the code which loads the spapr-rtas.bin, it
should complain about a missing firmware file instead.
Additionally the error message was printed with hw_error() - which
also dumps the whole CPU state. However, this does not make much
sense here since the CPU is not running yet and thus the registers
only contain zeroes. So let's use error_report() here instead.
And while we're at it, let's also bail out if the firmware file
had zero length.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Now that 2.4 development has opened, create a new pseries machine type
variant. For now it is identical to the pseries-2.3 machine type, but
a number of new features are coming that will need to set backwards
compatibility options.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The check "liobn & 0xFFFFFFFF00000000ULL" in spapr_tce_find_by_liobn()
is completely useless since liobn is only declared as an uint32_t
parameter. Fix this by using target_ulong instead (this is what most
of the callers of this function are using, too).
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Useful for debugging.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This replaces object_child_foreach() and callback with existing
SPAPR_PCI_LIOBN() and spapr_tce_find_by_liobn() to make the code easier
to read.
This is a mechanical patch so no behaviour change is expected.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
At the moment spapr_tce_find_by_liobn() is used by H_PUT_TCE/...
handlers to find an IOMMU by LIOBN.
We are going to implement Dynamic DMA windows (DDW), new code
will go to a new file and we will use spapr_tce_find_by_liobn()
there too so let's make it public.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This makes find_phb()/find_dev() public and changed its names
to spapr_pci_find_phb()/spapr_pci_find_dev() as they are going to
be used from other parts of QEMU such as VFIO DDW (dynamic DMA window)
or VFIO PCI error injection or VFIO EEH handling - in all these
cases there are RTAS calls which are addressed to BUID+config_addr
in IEEE1275 format.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This is to reduce VIO noise while debugging PCI DMA.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This gets rid of a magic constant describing the default DMA window size
for an emulated PHB.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This introduces a macro which makes up a LIOBN from fixed prefix and
VIO device address (@reg property).
This is to keep LIOBN macros rendering consistent - the same macro for
PCI has been added by the previous patch.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
We are going to have multiple DMA windows per PHB and we want them to
migrate so we need a predictable way of assigning LIOBNs.
This introduces a macro which makes up a LIOBN from fixed prefix,
PHB index (unique PHB id) and window number.
This introduces a SPAPR_PCI_DMA_WINDOW_NUM() to know the window number
from LIOBN. It is used to distinguish the default 32bit windows from
dynamic windows and avoid picking default DMA window properties from
a wrong TCE table.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
PAPR is defined as big endian so TCEs need an adjustment so
does this patch.
This changes code to have ldq_be_phys() in one place.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The existing KVM_CREATE_SPAPR_TCE ioctl only support 4G windows max as
the window size parameter to the kernel ioctl() is 32-bit so
there's no way of expressing a TCE window > 4GB.
We are going to add huge DMA windows support so this will create small
window and unexpectedly fail later.
This disables KVM_CREATE_SPAPR_TCE for windows bigger that 4GB.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
spapr_pci.c contains a number of expressions of the form (uval == -1) or
(uval != -1), where 'uval' is an unsigned value.
This mostly works in practice, because as long as the width of uval is
greater or equal than that of (int), the -1 will be promoted to the
unsigned type, which is the expected outcome.
However, at least for the cases where uval is uint32_t, this would break
on platforms where sizeof(int) > 4 (and a few such do exist), because then
the uint32_t value would be promoted to the larger int type, and never be
equal to -1.
This patch fixes these errors. The fixes for the (uint32_t) cases are
necessary as described above. I've made similar fixes to (uint64_t) and
(hwaddr) cases. Those are strictly theoretical, since I don't know of any
platforms where sizeof(int) > 8, but hey, it's not that hard so we might
as well be strictly C standard compliant.
Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Convert device models "macio-oldworld" and "macio-newworld".
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
PXB does not work with unsupported bioses, but should
not interfere with normal OS operation.
We don't ship them anymore, but it's reasonable
to keep the work-around until we update the bios in qemu.
Fix this by not adding PXB mem/IO chunks to _CRS
if they weren't configured by BIOS.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
The pxb can be attach to and existing numa node by specifying
numa_node option that equals the desired numa nodeid.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
PCI root buses can be attached to a specific NUMA node.
PCI buses are not attached by default to a NUMA node.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
The bios does not index the pxb slot number when
it computes the IRQ because it resides on bus 0
and not on the current bus.
However Qemu routes the irq through bus 0 and adds
the pxb slot to the IRQ computation of the PXB device.
Synchronize between bios and Qemu by canceling
pxb's effect.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
The bios looks for 'etc/extra-pci-roots' to decide if
is going to scan further buses after bus 0 tree.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
PXB is a "light-weight" host bridge whose purpose is to enable
the main host bridge to support multiple PCI root buses
for pc machines.
As oposed to PCI-2-PCI bridge's secondary bus, PXB's bus
is a primary bus and can be associated with a NUMA node
(different from the main host bridge) allowing the guest OS
to recognize the proximity of a pass-through device to
other resources as RAM and CPUs.
The PXB is composed from:
- A primary PCI bus (can be associated with a NUMA node)
Acts like a normal pci bus and from the functionality point
of view is an "expansion" of the bus behind the
main host bridge.
- A pci-2-pci bridge behind the primary PCI bus where the actual
devices will be attached.
- A host-bridge PCI device
Situated on the bus behind the main host bridge, allows
the BIOS to configure the bus number and IO/mem resources.
It does not have its own config/data register for configuration
cycles, this being handled by the main host bridge.
- A host-bridge sysbus to comply with QEMU current design.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Use the newer pci_bus_num to correctly get the root bus number.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
If multiple root buses are used, root bus 0 cannot use all the
pci holes ranges. Remove the IO/mem ranges used by the other
primary buses.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Save the IO/mem/bus numbers ranges assigned to the extra root busses
to be removed from the root bus 0 range.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
If the machine has extra root busses that are snooping to
the i440fx host bridge, we need to add them to
acpi in order to be properly detected by guests.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
PXB buses are assumed to be children of bus 0. Look for them
while scanning the buses.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Because of the PXB hosts we cannot simply query TYPE_PCI_HOST_BRIDGE anymore.
On i386 arch we only have two pci hosts, so we can look only for them.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Refactoring it as a method of PCIBusClass will allow
different implementations for subclasses.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Refactoring it as a method of PCIBusClass will allow
different implementations for subclasses.
Removed the assumption that the root bus does not
have a parent device because is specific only
to the default class implementation.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Commit 68e6b0af7 (acpi: add aml_while() term) added
the definition of aml_while without the actual implementation.
Implement the term.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Add a new API named acpi_send_gpe_event() to send hotplug SCI.
This API can be used by pci, cpu and memory hotplug.
This patch is rebased on master.
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Commit "019a3ed virtio: make features 64bit wide" missed a few changes,
as I've noticed while trying to rebase the virtio-1 branch to latest
master. This patch adds them.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We should validate the vq index against nvqs_with_notifiers. Otherwise we may
try to mask or unmask vector for vqs without notifiers (e.g control vq). This
will lead qemu abort on kvm_irqchip_commit_routes() when trying to boot win8.1
guest.
Fixes 851c2a75a6 ("virtio-pci: speedup MSI-X
masking and unmasking")
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In DSDT FDC0 declares the IO region as IO(Decode16, 0x03F2, 0x03F2, 0x00, 0x04).
Use the same in lpc_ich9 initialization code.
Now the floppy drive is detected correctly on Windows.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
commit 5cb18b3d7b
TPM2 ACPI table support
was missing a file, so build with iasl fails
(build without iasl works since it uses the generated
hex files).
Reported-by: "Daniel P. Berrange" <berrange@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch corrects the Rx buffer size field mask to mask bits 23 to 16
to match Xilinx UG585 documentation.
Signed-off-by: Sai Pavan Boddu <saipava@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Since ich9_lpc_pm_init only requests one irq, so let it just call
qemu_allocate_irq.
Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Since pc_allocate_cpu_irq only requests one irq, so let it just call
qemu_allocate_irq.
Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
valgrind complains about:
==7055== 58 bytes in 1 blocks are definitely lost in loss record 1,471 of 2,192
==7055== at 0x4C2845D: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==7055== by 0x24410F: malloc_and_trace (vl.c:2556)
==7055== by 0x64C770E: g_malloc (in /usr/lib64/libglib-2.0.so.0.3600.3)
==7055== by 0x64DEFD7: g_strndup (in /usr/lib64/libglib-2.0.so.0.3600.3)
==7055== by 0x650181A: g_vasprintf (in /usr/lib64/libglib-2.0.so.0.3600.3)
==7055== by 0x64DF0CC: g_strdup_vprintf (in /usr/lib64/libglib-2.0.so.0.3600.3)
==7055== by 0x64DF188: g_strdup_printf (in /usr/lib64/libglib-2.0.so.0.3600.3)
==7055== by 0x242F81: qemu_find_file (vl.c:2121)
==7055== by 0x217A32: clipper_init (dp264.c:105)
==7055== by 0x2484DA: main (vl.c:4249)
Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
valgrind complains about:
==16447== 48 bytes in 2 blocks are definitely lost in loss record 2,033 of 3,310
==16447== at 0x4C2845D: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==16447== by 0x2E4FD7: malloc_and_trace (vl.c:2546)
==16447== by 0x64C770E: g_malloc (in /usr/lib64/libglib-2.0.so.0.3600.3)
==16447== by 0x53EC3F: qint_from_int (qint.c:33)
==16447== by 0x53B426: qmp_output_type_int (qmp-output-visitor.c:162)
==16447== by 0x539257: visit_type_uint32 (qapi-visit-core.c:147)
==16447== by 0x471D07: property_get_uint32_ptr (object.c:1651)
==16447== by 0x47000C: object_property_get (object.c:822)
==16447== by 0x472428: object_property_get_qobject (qom-qobject.c:37)
==16447== by 0x25701A: build_append_pci_bus_devices (acpi-build.c:520)
==16447== by 0x25902E: build_ssdt (acpi-build.c:1004)
==16447== by 0x25A0A8: acpi_build (acpi-build.c:1420)
Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
valgrind complains about:
==16447== 16 bytes in 2 blocks are definitely lost in loss record 1,304 of 3,310
==16447== at 0x4C2845D: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==16447== by 0x2E4FD7: malloc_and_trace (vl.c:2546)
==16447== by 0x64C770E: g_malloc (in /usr/lib64/libglib-2.0.so.0.3600.3)
==16447== by 0x36FB47: qemu_extend_irqs (irq.c:55)
==16447== by 0x36FBD3: qemu_allocate_irqs (irq.c:64)
==16447== by 0x3B4B44: bmdma_init (pci.c:464)
==16447== by 0x3B547B: pci_piix_init_ports (piix.c:144)
==16447== by 0x3B55D2: pci_piix_ide_realize (piix.c:164)
==16447== by 0x3EAEC6: pci_qdev_realize (pci.c:1790)
==16447== by 0x36C685: device_set_realized (qdev.c:1058)
==16447== by 0x47179E: property_set_bool (object.c:1514)
==16447== by 0x470098: object_property_set (object.c:837)
Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
valgrind complains about:
==16447== 8 bytes in 1 blocks are definitely lost in loss record 552 of 3,310
==16447== at 0x4C2845D: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==16447== by 0x2E4FD7: malloc_and_trace (vl.c:2546)
==16447== by 0x64C770E: g_malloc (in /usr/lib64/libglib-2.0.so.0.3600.3)
==16447== by 0x36FB47: qemu_extend_irqs (irq.c:55)
==16447== by 0x36FBD3: qemu_allocate_irqs (irq.c:64)
==16447== by 0x24E622: pc_init1 (pc_piix.c:287)
==16447== by 0x24E76A: pc_init_pci (pc_piix.c:310)
==16447== by 0x2E9360: main (vl.c:4226)
==16447== 128 bytes in 1 blocks are definitely lost in loss record 2,569 of 3,310
==16447== at 0x4C2845D: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==16447== by 0x2E4FD7: malloc_and_trace (vl.c:2546)
==16447== by 0x64C770E: g_malloc (in /usr/lib64/libglib-2.0.so.0.3600.3)
==16447== by 0x36FB47: qemu_extend_irqs (irq.c:55)
==16447== by 0x36FBD3: qemu_allocate_irqs (irq.c:64)
==16447== by 0x25BEB2: kvm_i8259_init (i8259.c:133)
==16447== by 0x24E1F1: pc_init1 (pc_piix.c:219)
==16447== by 0x24E76A: pc_init_pci (pc_piix.c:310)
==16447== by 0x2E9360: main (vl.c:4226)
Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Use C casts to avoid accessing ICCDevice's qdev field
directly.
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Acked-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Setting the parent bus of a device increases its ref count, which we
ultimately want to level out. However it is only safe to do so after the
last reference to the device in local code, as qom-set or similar operations
might decrease the ref count.
Therefore move the object_unref() from pc_new_cpu() into its callers.
The APIC operations on the last CPU in pc_cpus_init() are still potentially
insecure, but that is beyond the scope of this code movement.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The RQM bit in MSR should be set whenever the guest is supposed to
access the FIFO, and it should be cleared in all other cases. This is
important so the guest can't continue writing/reading the FIFO beyond
the length that it's suppossed to access (see CVE-2015-3456).
Commit e9077462 fixed the CVE by adding code that avoids the buffer
overflow; however it doesn't correct the wrong behaviour of the floppy
controller which should already have cleared RQM.
Currently, RQM stays set all the time and during all phases while a
command is being processed. This is error-prone because the command has
to explicitly clear the flag if it doesn't need data (and indeed, the
two buggy commands that are the culprits for the CVE just forgot to do
that).
This patch clears RQM immediately as soon as all bytes that are expected
have been received. If the the FIFO is used in the next phase, the flag
has to be set explicitly there.
It also clear RQM after receiving all bytes even if the phase transition
immediately sets it again. While it's technically not necessary at the
moment because the state between clearing and setting RQM is not
observable by the guest, this is more explicit and matches how real
hardware works. It will actually become necessary in qemu once
asynchronous code paths are introduced.
This alone should have been enough to fix the CVE, but now we have two
lines of defense - even better.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1432214378-31891-8-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
This commit makes similar improvements as have already been made to the
write function: Instead of relying on a flag in the MSR to distinguish
controller phases, use the explicit phase that we store now. Assertions
of the right MSR flags are added.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1432214378-31891-7-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Factor out a few common lines of code, reformat, improve comments.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1432214378-31891-6-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Instead of relying on a flag in the MSR to distinguish controller phases,
use the explicit phase that we store now. Assertions of the right MSR
flags are added.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1432214378-31891-5-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
The floppy controller spec describes three different controller phases,
which are currently not explicitly modelled in our emulation. Instead,
each phase is represented by a combination of flags in registers.
This patch makes explicit in which phase the controller currently is.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: John Snow <jsnow@redhat.com>
Message-id: 1432214378-31891-4-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
What callers really do with this function is to switch from execution
phase (including data transfers) to result phase where the guest can
read out one or more status bytes from the FIFO (the number depends on
the command).
Rename the function accordingly.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1432214378-31891-3-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
What all callers of fdctrl_reset_fifo() really want to do is to start
the command phase, where writes to the data port initiate a new command.
The function doesn't only clear the FIFO, but also sets up the state so
that a new command can be received. Rename it to reflect this.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1432214378-31891-2-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJVbWZHAAoJEDhwtADrkYZTQ6oQAIv3FRNj0buUSwh8Gwx8NrxJ
DHz/NgdtoRV/olk2ZPYSmKUooTdKFcH1XsR7oYBoznh2m/CgiQXGeJQ7tTTcrWAj
hw4vt4Uq0bMGPNFYNOBgsVC5cejiZo53YfXybRBLAcUJvOlbnPVJ+g7ru8mg3qaC
rX0ZI8Cu0QmQWsGXHuKArVRGSsc+CN4SbXWyI4WmMH3g9OzTVjKVDkZekcU3NbXj
GZq1mJBeVUsDT6wzRLGXe3qxkDAhK9THLoRd999/aQDWSpvUabNtotyZew5JG5Ey
WKkmhSZsi0RgJTuPUcf5i83QKrWVN8EaADn67J5GkrnYYVPLdWuk/PhRgFiiWlVk
mHFxEVyfQL7neTbb8dcwcREHIaSSb4BBF4+dyFk5921Idxs3kzw/5l20SwNi6bQy
y23Tq/tBvbsie89O1gdXipZm3pz9xM9/9FtODjGnnVO9zb9Fi9TAyMMi8RmiS15k
YL649rCgk9cwKOlwsMLSUajXGq0G1cOzou7M5DeDOLw5db5YEMMZkcpmETKNLaXk
DWiX1E1K6PsDHDQkAipi9lmP9izVmpfXS4EWNVLaVCP3ht27VHgsZx4TaAAr8AVD
NdA4wq/GEcUMcJI+erJe3YxZgPofO+Ja8whwyqwhvLM4g6g8gQgy7PNKM+FodXY1
Z5+Ou90aprDpWjPIUHOv
=ySet
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/armbru/tags/pull-monitor-2015-06-02' into staging
Monitor patches
# gpg: Signature made Tue Jun 2 09:16:07 2015 BST using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
* remotes/armbru/tags/pull-monitor-2015-06-02: (21 commits)
monitor: Change return type of monitor_cur_is_qmp() to bool
monitor: Rename monitor_ctrl_mode() to monitor_is_qmp()
monitor: Turn int command_mode into bool in_command_mode
monitor: Drop do_qmp_capabilities()'s superfluous QMP check
monitor: Unbox Monitor member mc and rename to qmp
monitor: Rename monitor_control_read(), monitor_control_event()
monitor: Rename handle_user_command() to handle_hmp_command()
monitor: Limit QError use to command handlers
monitor: Inline monitor_has_error() into its only caller
monitor: Wean monitor_protocol_emitter() off mon->error
monitor: Propagate errors through invalid_qmp_mode()
monitor: Propagate errors through qmp_check_input_obj()
monitor: Propagate errors through qmp_check_client_args()
monitor: Drop unused "new" HMP command interface
monitor: Use trad. command interface for HMP pcie_aer_inject_error
monitor: Use traditional command interface for HMP device_add
monitor: Use traditional command interface for HMP drive_del
monitor: Convert client_migrate_info to QAPI
monitor: Improve and document client_migrate_info protocol error
monitor: Clean up after previous commit
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
... by default. Add a per-device "permissive" mode similar to pciback's
to allow restoring previous behavior (and hence break security again,
i.e. should be used only for trusted guests).
This is part of XSA-131.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>)
Since the next patch will turn all not explicitly described fields
read-only by default, those fields that have guest writable bits need
to be given explicit descriptors.
This is a preparatory patch for XSA-131.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
The adjustments are solely to make the subsequent patches work right
(and hence make the patch set consistent), namely if permissive mode
(introduced by the last patch) gets used (as both reserved registers
and reserved fields must be similarly protected from guest access in
default mode, but the guest should be allowed access to them in
permissive mode).
This is a preparatory patch for XSA-131.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
xen_pt_emu_reg_pcie[]'s PCI_EXP_DEVCAP needs to cover all bits as read-
only to avoid unintended write-back (just a precaution, the field ought
to be read-only in hardware).
This is a preparatory patch for XSA-131.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
This is just to avoid having to adjust that calculation later in
multiple places.
Note that including ->ro_mask in get_throughable_mask()'s calculation
is only an apparent (i.e. benign) behavioral change: For r/o fields it
doesn't matter > whether they get passed through - either the same flag
is also set in emu_mask (then there's no change at all) or the field is
r/o in hardware (and hence a write won't change it anyway).
This is a preparatory patch for XSA-131.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
xen_pt_pmcsr_reg_write() needs an adjustment to deal with the RW1C
nature of the not passed through bit 15 (PCI_PM_CTRL_PME_STATUS).
This is a preparatory patch for XSA-131.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
There's no point in xen_pt_pmcsr_reg_{read,write}() each ORing
PCI_PM_CTRL_STATE_MASK and PCI_PM_CTRL_NO_SOFT_RESET into a local
emu_mask variable - we can have the same effect by setting the field
descriptor's emu_mask member suitably right away. Note that
xen_pt_pmcsr_reg_write() is being retained in order to allow later
patches to be less intrusive.
This is a preparatory patch for XSA-131.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Without this the actual XSA-131 fix would cause the enable bit to not
get set anymore (due to the write back getting suppressed there based
on the OR of emu_mask, ro_mask, and res_mask).
Note that the fiddling with the enable bit shouldn't really be done by
qemu, but making this work right (via libxc and the hypervisor) will
require more extensive changes, which can be postponed until after the
security issue got addressed.
This is a preparatory patch for XSA-131.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Limit error messages resulting from bad guest behavior to avoid allowing
the guest to cause the control domain's disk to fill.
The first message in pci_msix_write() can simply be deleted, as this
is indeed bad guest behavior, but such out of bounds writes don't
really need to be logged.
The second one is more problematic, as there guest behavior may only
appear to be wrong: For one, the old logic didn't take the mask-all bit
into account. And then this shouldn't depend on host device state (i.e.
the host may have masked the entry without the guest having done so).
Plus these writes shouldn't be dropped even when an entry is unmasked.
Instead, if they can't be made take effect right away, they should take
effect on the next unmasking or enabling operation - the specification
explicitly describes such caching behavior. Until we can validly drop
the message (implementing such caching/latching behavior), issue the
message just once per MSI-X table entry.
Note that the log message in pci_msix_read() similar to the one being
removed here is not an issue: "addr" being of unsigned type, and the
maximum size of the MSI-X table being 32k, entry_nr simply can't be
negative and hence the conditonal guarding issuing of the message will
never be true.
This is XSA-130.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
It's being used by the hypervisor. For now simply mimic a device not
capable of masking, and fully emulate any accesses a guest may issue
nevertheless as simple reads/writes without side effects.
This is XSA-129.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
The old logic didn't work as intended when an access spanned multiple
fields (for example a 32-bit access to the location of the MSI Message
Data field with the high 16 bits not being covered by any known field).
Remove it and derive which fields not to write to from the accessed
fields' emulation masks: When they're all ones, there's no point in
doing any host write.
This fixes a secondary issue at once: We obviously shouldn't make any
host write attempt when already the host read failed.
This is XSA-128.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
After introduction of kvm_arch_msi_data_to_gsi, kvm_gsi_direct_mapping
now can be set on ARM. Also kvm_msi_via_irqfd_allowed can be set,
depending on kernel irqfd support, hence enabling VIRTIO-PCI with
vhost back-end.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The masked interrupt status register should be the state of the interrupt
after masking.
There should be a logical AND instead of a logical OR between the
interrupt status and the interrupt mask.
Signed-off-by: Victor CLEMENT <victor.clement@openwide.fr>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: 1433154824-6927-1-git-send-email-victor.clement@openwide.fr
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add a GICv2m device to the virt board to enable MSIs on the generic PCI
host controller. We allocate 64 SPIs in the IRQ space for now (this can
be increased/decreased later) and map the GICv2m right after the GIC in
the memory map.
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Message-id: 1432897270-7780-5-git-send-email-christoffer.dall@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In preparation for adding the GICv2m which requires address specifiers
and is a subnode of the gic, we extend the gic DT definition to specify
the #address-cells and #size-cells properties and add an empty ranges
property properties of the DT node, since this is required to add the
v2m node as a child of the gic node.
Note that we must also expand the irq-map to reference the gic with the
right address-cells as a consequence of this change.
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Message-id: 1432897270-7780-4-git-send-email-christoffer.dall@linaro.org
Suggested-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The ARM GICv2m widget is a little device that handles MSI interrupt
writes to a trigger register and ties them to a range of interrupt lines
wires to the GIC. It has a few status/id registers and the interrupt wires,
and that's about it.
A board instantiates the device by setting the base SPI number and
number SPIs for the frame. The base-spi parameter is indexed in the SPI
number space only, so base-spi == 0, means IRQ number 32. When a device
(the PCI host controller) writes to the trigger register, the payload is
the GIC IRQ number, so we have to subtract 32 from that and then index
into our frame of SPIs.
When instantiating a GICv2m device, tell PCI that we have instantiated
something that can deal with MSIs. We rely on the board actually wiring
up the GICv2m to the PCI host controller.
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Message-id: 1432897270-7780-3-git-send-email-christoffer.dall@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Instead of passing the GIC phandle around between functions, add it to
the VirtBoardInfo just like we do for the clock_phandle. We are about
to add the v2m phandle as well, and it's easier not having to pass
around a bunch of phandles, return multiple values from functions, etc.
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Message-id: 1432897270-7780-2-git-send-email-christoffer.dall@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
All QMP commands use the "new" handler interface (mhandler.cmd_new).
Most HMP commands still use the traditional interface (mhandler.cmd),
but a few use the "new" one. Complicates handle_user_command() for no
gain, so I'm converting these to the traditional interface.
pcie_aer_inject_error's implementation is split into the
hmp_pcie_aer_inject_error() and pcie_aer_inject_error_print(). The
former is a peculiar crossbreed between HMP and QMP handler. On
success, it works like a QMP handler: store QDict through ret_data
parameter, return 0. Printing the QDict is left to
pcie_aer_inject_error_print(). On failure, it works more like an HMP
handler: print error to monitor, return negative number.
To convert to the traditional interface, turn
pcie_aer_inject_error_print() into a command handler wrapping around
hmp_pcie_aer_inject_error(). By convention, this command handler
should be called hmp_pcie_aer_inject_error(), so rename the existing
hmp_pcie_aer_inject_error() to do_pcie_aer_inject_error().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
commit 5cb18b3d7b
TPM2 ACPI table support
was missing a file, so build with iasl fails
(build without iasl works since it uses the generated
hex files).
Reported-by: "Daniel P. Berrange" <berrange@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Based on patch by Nikolay Nikolaev:
Vhost-user will implement the multi queue support in a similar way
to what vhost already has - a separate thread for each queue.
To enable the multi queue functionality - a new command line parameter
"queues" is introduced for the vhost-user netdev.
Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Make features 64bit wide everywhere.
On migration a full 64bit guest_features field is sent if one of the
high bits is set, in addition to the lower 32bit guest_features field
which must stay for compatibility reasons. That way we send the lower
32 feature bits twice, but the code is simpler because we don't have
to split and compose the 64bit features into two 32bit fields.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Needed for virtio features which go from 32bit to 64bit with virtio 1.0
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
set_host_notifier and set_guest_notifiers supported by virtio-mmio now.
Most code copied from virtio-pci.
This makes it possible to use vhost-net with virtio-mmio,
improving performance by about 30%.
The kvm-arm does not yet support irqfd, need to fix the hard-coded part after
kvm-arm gets irqfd support.
Signed-off-by: Ying-Shiuan Pan <yingshiuan.pan@gmail.com>
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Add encoding for ACPI DefIncrement Opcode.
Reviewed-by: Shannon Zhao <zhaoshenglong@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add encoding for ACPI DefShiftRight Opcode.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>