qemu-irix/hw/ppc
Daniel Henrique Barboza 62695f60c3 hw/ppc: CAS reset on early device hotplug
This patch is a follow up on the discussions made in patch
"hw/ppc: disable hotplug before CAS is completed" that can be
found at [1].

At this moment, we do not support CPU/memory hotplug in early
boot stages, before CAS. When a hotplug occurs, the event is logged
in an internal RTAS event log queue and an IRQ pulse is fired. In
regular conditions, the guest handles the interrupt by executing
check_exception, fetching the generated hotplug event and enabling
the device for use.

In early boot, this IRQ isn't caught (SLOF does not handle hotplug
events), leaving the event in the rtas event log queue. If the guest
executes check_exception due to another hotplug event, the re-assertion
of the IRQ ends up de-queuing the first hotplug event as well. In short,
a device hotplugged before CAS is considered coldplugged by SLOF.
This leads to device misbehavior and, in some cases, guest kernel
Ooops when trying to unplug the device.

A proper fix would be to turn every device hotplugged before CAS
as a colplugged device. This is not trivial to do with the current
code base though - the FDT is written in the guest memory at
ppc_spapr_reset and can't be retrieved without adding extra state
(fdt_size for example) that will need to managed and migrated. Adding
the hotplugged DT in the middle of CAS negotiation via the updated DT
tree works with CPU devs, but panics the guest kernel at boot. Additional
analysis would be necessary for LMBs and PCI devices. There are
questions to be made in QEMU/SLOF/kernel level about how we can make
this change in a sustainable way.

With Linux guests, a fix would be the kernel executing check_exception
at boot time, de-queueing the events that happened in early boot and
processing them. However, even if/when the newer kernels start
fetching these events at boot time, we need to take care of older
kernels that won't be doing that.

This patch works around the situation by issuing a CAS reset if a hotplugged
device is detected during CAS:

- the DRC conditions that warrant a CAS reset is the same as those that
triggers a DRC migration - the DRC must have a device attached and
the DRC state is not equal to its ready_state. With that in mind, this
patch makes use of 'spapr_drc_needed' to determine if a CAS reset
is needed.

- In the middle of CAS negotiations, the function
'spapr_hotplugged_dev_before_cas' goes through all the DRCs to see
if there are any DRC that requires a reset, using spapr_drc_needed. If
that happens, returns '1' in 'spapr_h_cas_compose_response' which will set
spapr->cas_reboot to true, causing the machine to reboot.

No changes are made for coldplug devices.

[1] http://lists.nongnu.org/archive/html/qemu-devel/2017-08/msg02855.html

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 10f12e6450)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-10-03 17:40:40 -05:00
..
Makefile.objs ppc/pnv: add initial IPMI sensors for the BMC simulator 2017-04-26 12:41:56 +10:00
e500-ccsr.h
e500.c shutdown: Add source information to SHUTDOWN and RESET 2017-05-23 13:28:17 +02:00
e500.h target-ppc: Eliminate redundant and incorrect function booke206_page_size_to_tlb 2016-07-01 09:57:01 +10:00
e500plat.c dma: do not depend on kvm_enabled() 2016-05-19 16:42:28 +02:00
fdt.c Move target-* CPU file into a target/ folder 2016-12-20 21:52:12 +01:00
mac.h Clean up ill-advised or unusual header guards 2016-07-12 16:20:46 +02:00
mac_newworld.c hw: Use new memory_region_init_{ram, rom, rom_device}() functions 2017-07-14 17:59:42 +01:00
mac_oldworld.c hw: Use new memory_region_init_{ram, rom, rom_device}() functions 2017-07-14 17:59:42 +01:00
mpc8544_guts.c shutdown: Add source information to SHUTDOWN and RESET 2017-05-23 13:28:17 +02:00
mpc8544ds.c
pnv.c Convert error_report() to warn_report() 2017-07-13 13:49:58 +02:00
pnv_bmc.c ppc/pnv: generate an OEM SEL event on shutdown 2017-04-26 12:41:56 +10:00
pnv_core.c pnv-core: use get_uint() for "core-pir" property 2017-06-20 14:31:33 +02:00
pnv_lpc.c ppc/pnv: enable only one LPC bus 2017-04-26 12:41:55 +10:00
pnv_occ.c ppc/pnv: Add OCC model stub with interrupt support 2017-04-26 12:00:42 +10:00
pnv_psi.c xics: introduce macros for ICP/ICS link properties 2017-06-09 12:12:34 +10:00
pnv_xscom.c kvm: move cpu synchronization code 2017-01-19 22:07:46 +01:00
ppc.c shutdown: Add source information to SHUTDOWN and RESET 2017-05-23 13:28:17 +02:00
ppc4xx_devs.c qemu-common: push cpu.h inclusion out of qemu-common.h 2016-05-19 16:42:29 +02:00
ppc4xx_pci.c qdev: Replace cannot_instantiate_with_device_add_yet with !user_creatable 2017-05-17 10:37:00 -03:00
ppc405.h Remove unused function declarations 2016-09-15 15:32:22 +03:00
ppc405_boards.c hw: Use new memory_region_init_{ram, rom, rom_device}() functions 2017-07-14 17:59:42 +01:00
ppc405_uc.c hw: Use new memory_region_init_{ram, rom, rom_device}() functions 2017-07-14 17:59:42 +01:00
ppc440_bamboo.c target-ppc: Add MMU model check for booke machines 2017-02-02 09:30:06 +11:00
ppc_booke.c ppc_booke: drop useless assignment 2017-05-07 09:57:51 +03:00
ppce500_spin.c hw/ppc: QOM'ify ppce500_spin.c 2017-01-31 10:10:13 +11:00
prep.c hw/ppc/prep: Remove superfluous call to soundhw_init() 2017-06-30 14:03:31 +10:00
prep_systemio.c prep: add PReP System I/O 2017-01-31 10:10:13 +11:00
rs6000_mc.c prep: add IBM RS/6000 7020 (40p) memory controller 2017-01-31 10:10:13 +11:00
spapr.c hw/ppc: CAS reset on early device hotplug 2017-10-03 17:40:40 -05:00
spapr_cpu_core.c spapr: prevent QEMU crash when CPU realization fails 2017-06-30 14:03:31 +10:00
spapr_drc.c hw/ppc: CAS reset on early device hotplug 2017-10-03 17:40:40 -05:00
spapr_events.c spapr: Minor cleanups to events handling 2017-07-17 15:07:05 +10:00
spapr_hcall.c spapr: Fix bug in h_signal_sys_reset() 2017-08-09 14:04:28 +10:00
spapr_iommu.c hw/ppc/spapr_iommu: Fix crash when removing the "spapr-tce-table" device 2017-08-22 21:26:46 +10:00
spapr_ovec.c spapr: replace debug printf with trace points 2017-02-22 11:28:28 +11:00
spapr_pci.c spapr_pci: Fix obsolete comment about MSIX encoding in addr/data 2017-07-25 11:14:25 +10:00
spapr_pci_vfio.c Use #include "..." for our own headers, <...> for others 2016-07-12 16:19:16 +02:00
spapr_rng.c spapr_rng: Convert to DEFINE_PROP_LINK 2017-07-14 12:04:43 +02:00
spapr_rtas.c pseries: Correct panic behaviour for pseries machine type 2017-06-08 14:38:18 +10:00
spapr_rtas_ddw.c spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW) 2016-07-05 14:31:08 +10:00
spapr_rtc.c hw/ppc/spapr_rtc: Mark the RTC device with user_creatable = false 2017-08-22 21:26:46 +10:00
spapr_vio.c vmstate: error hint for failed equal checks 2017-06-28 11:18:44 +02:00
trace-events trace-events: fix code style: print 0x before hex numbers 2017-08-01 12:13:07 +01:00
virtex_ml507.c target-ppc: Add MMU model check for booke machines 2017-02-02 09:30:06 +11:00