qemu-irix

Commit Graph

Author	SHA1	Message	Date
David Gibson	b7b0b1f13a	target/ppc: Merge cpu_ppc_set_vhyp() with cpu_ppc_set_papr() cpu_ppc_set_papr() sets up various aspects of CPU state for use with PAPR paravirtualized guests. However, it doesn't set the virtual hypervisor, so callers must also call cpu_ppc_set_vhyp() so that PAPR hypercalls are handled properly. This is a bit silly, so fold setting the virtual hypervisor into cpu_ppc_set_papr(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>	2017-03-01 11:23:39 +11:00
David Gibson	c6404adebf	pseries: Minor cleanups to HPT management hypercalls * Standardize on 'ptex' instead of 'pte_index' for HPTE index variables for consistency and brevity * Avoid variables named 'index'; shadowing index(3) from libc can lead to surprising bugs if the variable is removed, because compiler errors might not appear for remaining references * Clarify index calculations in h_enter() - we have two cases, H_EXACT where the exact HPTE slot is given, and !H_EXACT where we search for an empty slot within the hash bucket. Make the calculation more consistent between the cases. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>	2017-03-01 11:23:39 +11:00
Greg Kurz	6244bb7e58	sysemu: support up to 1024 vCPUs Some systems can already provide more than 255 hardware threads. Bumping the QEMU limit to 1024 seems reasonable: - it has no visible overhead in top; - the limit itself has no effect on hot paths. Cc: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:39 +11:00
Laurent Vivier	2530a1a5cf	spapr: generate DT node names When DT node names for PCI devices are generated by SLOF, they are generated according to the type of the device (for instance, ethernet for virtio-net-pci device). Node name for hotplugged devices is generated by QEMU. This patch adds the mechanic to QEMU to create the node name according to the device type too. The data structure has been roughly copied from OpenBIOS/OpenHackware, node names from SLOF. Example: Hotplugging some PCI cards with QEMU monitor: device_add virtio-tablet-pci device_add virtio-serial-pci device_add virtio-mouse-pci device_add virtio-scsi-pci device_add virtio-gpu-pci device_add ne2k_pci device_add nec-usb-xhci device_add intel-hda What we can see in linux device tree: for dir in /proc/device-tree/pci@800000020000000/@/; do echo $dir cat $dir/name echo done WITHOUT this patch: /proc/device-tree/pci@800000020000000/pci@0/ pci /proc/device-tree/pci@800000020000000/pci@1/ pci /proc/device-tree/pci@800000020000000/pci@2/ pci /proc/device-tree/pci@800000020000000/pci@3/ pci /proc/device-tree/pci@800000020000000/pci@4/ pci /proc/device-tree/pci@800000020000000/pci@5/ pci /proc/device-tree/pci@800000020000000/pci@6/ pci /proc/device-tree/pci@800000020000000/pci@7/ pci WITH this patch: /proc/device-tree/pci@800000020000000/communication-controller@1/ communication-controller /proc/device-tree/pci@800000020000000/display@4/ display /proc/device-tree/pci@800000020000000/ethernet@5/ ethernet /proc/device-tree/pci@800000020000000/input-controller@0/ input-controller /proc/device-tree/pci@800000020000000/mouse@2/ mouse /proc/device-tree/pci@800000020000000/multimedia-device@7/ multimedia-device /proc/device-tree/pci@800000020000000/scsi@3/ scsi /proc/device-tree/pci@800000020000000/usb-xhci@6/ usb-xhci Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-03-01 11:23:39 +11:00
Peter Maydell	28f997a82c	This is the MTTCG pull-request as posted yesterday. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJYsBZfAAoJEPvQ2wlanipElJ0H+QGoStSPeHrvKu7Q07v4F9zM Pvf05gRsaxvXl7UbwmXC4oKhvZf9rVJ6ITk0x/y0WvmK0mHCmNBWkC0nn5UFL5IH cdxetLz21Q+Ghpc36tZvqn2HYwRQFoEznge2LdtBDG0TyVA4jwquHU3HCG2D51zi BaImI6lYW1e4ejjZHw8cEInSxsj/HJZE4pPas2Tkci+uAnrJroErwBVRRcE/y/Tn aupl9TJFs2JdyJFNDibIm0kjB+i+jvCiLgYjbKZ/dR/+GZt73TtiBk/q9ZOFjdmT 7YFPI3F46QbGHoZahtzh0Xt7WMj94SlQgQ9OJ3zmNMfpXrze6Yc78xo/nbQ33U0= =wR0/ -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/stsquad/tags/pull-mttcg-240217-1' into staging This is the MTTCG pull-request as posted yesterday. # gpg: Signature made Fri 24 Feb 2017 11:17:51 GMT # gpg: using RSA key 0xFBD0DB095A9E2A44 # gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" # Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44 * remotes/stsquad/tags/pull-mttcg-240217-1: (24 commits) tcg: enable MTTCG by default for ARM on x86 hosts hw/misc/imx6_src: defer clearing of SRC_SCR reset bits target-arm: ensure all cross vCPUs TLB flushes complete target-arm: don't generate WFE/YIELD calls for MTTCG target-arm/powerctl: defer cpu reset work to CPU context cputlb: introduce tlb_flush__all_cpus[_synced] cputlb: atomically update tlb fields used by tlb_reset_dirty cputlb: add tlb_flush_by_mmuidx async routines cputlb and arm/sparc targets: convert mmuidx flushes from varg to bitmap cputlb: introduce tlb_flush_ async work. cputlb: tweak qemu_ram_addr_from_host_nofail reporting cputlb: add assert_cpu_is_self checks tcg: handle EXCP_ATOMIC exception for system emulation tcg: enable thread-per-vCPU tcg: enable tb_lock() for SoftMMU tcg: remove global exit_request tcg: drop global lock during TCG code execution tcg: rename tcg_current_cpu to tcg_current_rr_cpu tcg: add kick timer for single-threaded vCPU emulation tcg: add options for enabling MTTCG ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-02-25 18:43:52 +00:00
Jan Kiszka	8d04fb55de	tcg: drop global lock during TCG code execution This finally allows TCG to benefit from the iothread introduction: Drop the global mutex while running pure TCG CPU code. Reacquire the lock when entering MMIO or PIO emulation, or when leaving the TCG loop. We have to revert a few optimization for the current TCG threading model, namely kicking the TCG thread in qemu_mutex_lock_iothread and not kicking it in qemu_cpu_kick. We also need to disable RAM block reordering until we have a more efficient locking mechanism at hand. Still, a Linux x86 UP guest and my Musicpal ARM model boot fine here. These numbers demonstrate where we gain something: 20338 jan 20 0 331m 75m 6904 R 99 0.9 0:50.95 qemu-system-arm 20337 jan 20 0 331m 75m 6904 S 20 0.9 0:26.50 qemu-system-arm The guest CPU was fully loaded, but the iothread could still run mostly independent on a second core. Without the patch we don't get beyond 32206 jan 20 0 330m 73m 7036 R 82 0.9 1:06.00 qemu-system-arm 32204 jan 20 0 330m 73m 7036 S 21 0.9 0:17.03 qemu-system-arm We don't benefit significantly, though, when the guest is not fully loading a host CPU. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Message-Id: <1439220437-23957-10-git-send-email-fred.konrad@greensocs.com> [FK: Rebase, fix qemu_devices_reset deadlock, rm address_space_* mutex] Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com> [EGC: fixed iothread lock for cpu-exec IRQ handling] Signed-off-by: Emilio G. Cota <cota@braap.org> [AJB: -smp single-threaded fix, clean commit msg, BQL fixes] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Pranith Kumar <bobby.prani@gmail.com> [PM: target-arm changes] Acked-by: Peter Maydell <peter.maydell@linaro.org>	2017-02-24 10:32:45 +00:00
Peter Maydell	fb6971c110	hw/ppc/ppc405_uc.c: Avoid integer overflows When performing clock calculations, the ppc405_uc code has several places where it multiplies together two 32-bit variables and assigns the result to a 64-bit variable. This doesn't quite do what is intended because C will compute a 32-bit multiply result. Add casts to ensure we don't truncate the result. (Spotted by Coverity, CID 1005504, 1005505.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-22 14:28:53 +11:00
Thomas Huth	df58713396	hw/ppc/spapr: Check for valid page size when hot plugging memory On POWER, the valid page sizes that the guest can use are bound to the CPU and not to the memory region. QEMU already has some fancy logic to find out the right maximum memory size to tell it to the guest during boot (see getrampagesize() in the file target/ppc/kvm.c for more information). However, once we're booted and the guest is using huge pages already, it is currently still possible to hot-plug memory regions that does not support huge pages - which of course does not work on POWER, since the guest thinks that it is possible to use huge pages everywhere. The KVM_RUN ioctl will then abort with -EFAULT, QEMU spills out a not very helpful error message together with a register dump and the user is annoyed that the VM unexpectedly died. To avoid this situation, we should check the page size of hot-plugged DIMMs to see whether it is possible to use it in the current VM. If it does not fit, we can print out a better error message and refuse to add it, so that the VM does not die unexpectely and the user has a second chance to plug a DIMM with a matching memory backend instead. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1419466 Signed-off-by: Thomas Huth <thuth@redhat.com> [dwg: Fix a build error on 32-bit builds with KVM] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-22 14:28:53 +11:00
Igor Mammedov	c5514d0e4b	machine: replace query_hotpluggable_cpus() callback with has_hotpluggable_cpus flag Generic helper machine_query_hotpluggable_cpus() replaced target specific query_hotpluggable_cpus() callbacks so there is no need in it anymore. However inon NULL callback value is used to detect/report hotpluggable cpus support, therefore it can be removed completely. Replace it with MachineClass.has_hotpluggable_cpus boolean which is sufficient for the task. Suggested-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-22 11:28:28 +11:00
Igor Mammedov	f2d672c248	machine: unify [pc_\|spapr_]query_hotpluggable_cpus() callbacks All callbacks FOO_query_hotpluggable_cpus() are practically the same except of setting vcpus_count to different values. Convert them to a generic machine_query_hotpluggable_cpus() callback by moving vcpus_count initialization to per machine specific callback possible_cpu_arch_ids(). Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-22 11:28:28 +11:00
Igor Mammedov	535455fdee	spapr: reuse machine->possible_cpus instead of cores[] Replace SPAPR specific cores[] array with generic machine->possible_cpus and store core objects there. It makes cores bookkeeping similar to x86 cpus and will allow to unify similar code. It would allow to replace cpu_index based NUMA node mapping with iproperty based one (for -device created cores) since possible_cpus carries board defined topology/layout. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-22 11:28:28 +11:00
Laurent Vivier	5b929608b9	spapr: replace debug printf with trace points Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-22 11:28:28 +11:00
Laurent Vivier	f4af7d4438	ppc4xx: replace debug printf with trace points Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-22 11:28:28 +11:00
Laurent Vivier	5283c27fc5	mac99: replace debug printf with trace points Signed-off-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-22 11:28:28 +11:00
Sam Bobroff	fe93e3e6ec	spapr: fix off-by-one error in spapr_ovec_populate_dt() The last byte of the option vector was missing due to an off-by-one error. Without this fix, client architecture support negotiation will fail because the last byte of option vector 5, which contains the MMU support, will be missed. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-22 11:28:27 +11:00
Thomas Huth	802fc7abd0	hw/ppc/pnv: Remove superfluous "qemu" prefix from error strings error_report() already puts a prefix with the program name in front of the error strings, so the "qemu:" prefix is not necessary here anymore. Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-22 11:28:27 +11:00
Igor Mammedov	115debf26c	spapr: make cpu core unplug follow expected hotunplug call flow spapr_core_unplug() were essentially spapr_core_unplug_request() handler that requested CPU removal and registered callback which did actual cpu core removali but it was called from spapr_machine_device_unplug() which is intended for actual object removal. Commit (`cf632463` spapr: Memory hot-unplug support) sort of fixed it introducing spapr_machine_device_unplug_request() and calling spapr_core_unplug() but it hasn't renamed callback and by mistake calls it from spapr_machine_device_unplug(). However spapr_machine_device_unplug() isn't ever called for cpu core since spapr_core_release() doesn't follow expected hotunplug call flow which is: 1: device_del() -> hotplug_handler_unplug_request() -> set destroy_cb() 2: destroy_cb() -> hotplug_handler_unplug() -> object_unparent // actual device removal Fix it by renaming spapr_core_unplug() to spapr_core_unplug_request() which is called from spapr_machine_device_unplug_request() and making spapr_core_release() call hotplug_handler_unplug() which will call spapr_machine_device_unplug() -> spapr_core_unplug() to remove cpu core. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reveiwed-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-22 11:28:27 +11:00
Igor Mammedov	ff9006ddbf	spapr: move spapr_core_[foo]plug() callbacks close to machine code in spapr.c spapr_core_pre_plug/spapr_core_plug/spapr_core_unplug() are managing wiring CPU core into spapr machine state and not internal CPU core state. So move them from spapr_cpu_core.c to spapr.c where other similar (spapr_memory_[foo]plug()) callbacks are located, which also matches x86 target practice. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-22 11:28:27 +11:00
Igor Mammedov	f844616bf6	spapr: cpu core: separate child threads destruction from machine state operations Split off destroying VCPU threads from drc callback spapr_core_release() into new spapr_cpu_core_unrealizefn() which takes care of internal cpu core state cleanup (i.e. VCPU threads) and is called when object_unparent(core) is called. That leaves spapr_core_release() only with board mgmt code, which will be moved to board related file in follow up patch along with the rest on hotplug callbacks. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-22 11:28:27 +11:00
Markus Armbruster	2059839baa	hw: Default -drive to if=ide explicitly where it works Block backends defined with -drive if=ide are meant to be picked up by machine initialization code: a suitable frontend gets created and wired up automatically. if=ide drives not picked up that way can still be used with -device as if they had if=none, but that's unclean and best avoided. Unused ones produce an "Orphaned drive without device" warning. -drive parameter "if" is optional, and the default depends on the machine type. If a machine type doesn't specify a default, the default is "ide". Many machine types default to if=ide, even though they don't actually have an IDE controller. A future patch will change these defaults to something more sensible. To prepare for it, this patch makes default "ide" explicit for the machines that actually pick up if=ide drives: * alpha: clipper * arm/aarch64: spitz borzoi terrier tosa * i386/x86_64: generic-pc-machine (with concrete subtypes pc-q35-* pc-i440fx-* pc-* isapc xenfv) * mips64el: fulong2e * mips/mipsel/mips64el: malta mips * ppc/ppc64: mac99 g3beige prep * sh4/sh4eb: r2d * sparc64: sun4u sun4v Note that ppc64 machine powernv already sets an "ide" default explicitly. Its IDE controller isn't implemented, yet. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <1487153147-11530-2-git-send-email-armbru@redhat.com>	2017-02-21 13:10:53 +01:00
Anton Nefedov	c86f106b85	report guest crash information in GUEST_PANICKED event it's not very convenient to use the crash-information property interface, so provide a CPU class callback to get the guest crash information, and pass that information in the event Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Message-Id: <1487053524-18674-3-git-send-email-den@openvz.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-02-16 15:30:49 +01:00
Thomas Huth	7c6e879733	hw/ppc/pnv: Use error_report instead of hw_error if a ROM file can't be found hw_error() is for CPU related errors only (it dumps the CPU registers and calls abort()!), so using error_report() is the better choice of reporting an error in case we simply did not find a file. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-02 09:30:07 +11:00
Valentin Plotkin	00469dc373	target-ppc: Add MMU model check for booke machines Machines bamboo, e500 and virtex-ml507 assume a certain MMU model, otherwise resulting in unpredictable behavior. Add apropriate checks into *_init functions. Signed-off-by: Valentin Plotkin <caliborn@sdf.org> [regarding virtex parts] Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-02-02 09:30:06 +11:00
Michael S. Tsirkin	25e6a11832	ppc: switch to constants within BUILD_BUG_ON We are switching BUILD_BUG_ON to verify that it's parameter is a compile-time constant, and it turns out that some gcc versions (specifically gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609) are not smart enough to figure it out for expressions involving local variables. This is harmless but means that the check is ineffective for these platforms. To fix, replace the variable with macros. Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> [dwg: Correct a printf format warning] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-01-31 14:04:06 +11:00
Laurent Vivier	42043e4f12	spapr: clock should count only if vm is running This is a port to ppc of the i386 commit: `00f4d64` kvmclock: clock should count only if vm is running We remove timebase_post_load function, and use the VM state change handler to save and restore the guest_timebase (on stop and continue). We keep timebase_pre_save to reduce the clock difference on migration like in: `6053a86` kvmclock: reduce kvmclock difference on migration Time base offset has originally been introduced by commit `98a8b52` spapr: Add support for time base offset migration So while VM is paused, the time is stopped. This allows to have the same result with date (based on Time Base Register) and hwclock (based on "get-time-of-day" RTAS call). Moreover in TCG mode, the Time Base is always paused, so this patch also adjust the behavior between TCG and KVM. VM state field "time_of_the_day_ns" is now useless but we keep it to be able to migrate to older version of the machine. As vmstate_ppc_timebase structure (with timebase_pre_save() and timebase_post_load() functions) was only used by vmstate_spapr, we register the VM state change handler only in ppc_spapr_init(). Signed-off-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-01-31 10:10:14 +11:00
Thomas Huth	d9d6e78ea8	ppc: Remove unused function cpu_ppc601_rtc_init() It is completely unused, thus it can be removed without problems. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-01-31 10:10:14 +11:00
Roman Kapl	0dfe952dc5	ppc: Prevent inifnite loop in decrementer auto-reload. If the DECAR register is set to 0, QEMU tries to reload the decrementer with zero in an inifinite loop. According to PPC documentation, the decrementer is triggered on 1->0 transition, so avoid reloading the decrementer if if is already zero. The problem does not manifest under Linux, but it is valid to set DECAR to zero (and may make sense as part of decrementer initialization when interrupts are disabled). Signed-off-by: Roman Kapl <rka@sysgo.com> [dwg: Fixed style nit] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-01-31 10:10:14 +11:00
David Gibson	f6f242c757	ppc: Add ppc_set_compat_all() Once a compatiblity mode is negotiated with the guest, h_client_architecture_support() uses run_on_cpu() to update each CPU to the new mode. We're going to want this logic somewhere else shortly, so make a helper function to do this global update. We put it in target-ppc/compat.c - it makes as much sense at the CPU level as it does at the machine level. We also move the cpu_synchronize_state() into ppc_set_compat(), since it doesn't really make any sense to call that without synchronizing state. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-01-31 10:10:14 +11:00
David Gibson	152ef803ce	pseries: Rewrite CAS PVR compatibility logic During boot, PAPR guests negotiate CPU model support with the ibm,client-architecture-support mechanism. The logic to implement this in qemu is very convoluted. This cleans it up to be cleaner, using the new ppc_check_compat() call. The new logic for choosing a compatibility mode is: 1. Usually, use the most recent compatibility mode that is a) supported by the guest b) supported by the CPU and c) no later than the maximum allowed (if specified) 2. If no suitable compatibility mode was found, the guest does support this CPU explicitly, and no maximum compatibility mode is specified, then use "raw" mode for the current CPU 3. Otherwise, fail the boot. This differs from the results of the old code: the old code preferred using "raw" mode to a compatibility mode, whereas the new code prefers a compatibility mode if available. Using compatibility mode preferentially means that we're more likely to be able to migrate the guest to a similar but not identical host. Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-01-31 10:10:14 +11:00
Hervé Poussineau	34b9b5575b	prep: add IBM RS/6000 7020 (40p) machine emulation Machine supports both Open Hack'Ware and OpenBIOS. Open Hack'Ware is the default because OpenBIOS is currently unable to boot PReP boot partitions or PReP kernels. Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> [dwg: Correct compile failure with KVM located by Thomas Huth] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-01-31 10:10:13 +11:00
Hervé Poussineau	79623312c6	prep: add IBM RS/6000 7020 (40p) memory controller Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [dwg: Added CONFIG_RS6000_MC to ppc64 or it breaks testcases] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-01-31 10:10:13 +11:00
Hervé Poussineau	d2f8415226	prep: add PReP System I/O This device is a partial duplicate of System I/O device available in hw/ppc/prep.c This new one doesn't have all the Motorola-specific registers. The old one should be deprecated and removed with the 'prep' machine. Partial documentation available at ftp://ftp.software.ibm.com/rs6000/technology/spec/srp1_1.exe section 6.1.5 (I/O Device Mapping) Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-01-31 10:10:13 +11:00
xiaoqiang zhao	0f358a0710	hw/ppc: QOM'ify spapr_vio.c Drop the old and empty SysBus init Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-01-31 10:10:13 +11:00
xiaoqiang zhao	09a7eb978f	hw/ppc: QOM'ify ppce500_spin.c Drop the old SysBus init function and use instance_init Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-01-31 10:10:13 +11:00
xiaoqiang zhao	d0c2b0d089	hw/ppc: QOM'ify e500.c Drop the old SysBus init function and use instance_init Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-01-31 10:10:13 +11:00
David Gibson	12dbeb16d0	ppc: Rewrite ppc_get_compat_smt_threads() To continue consolidation of compatibility mode information, this rewrites the ppc_get_compat_smt_threads() function using the table of compatiblity modes in target-ppc/compat.c. It's not a direct replacement, the new ppc_compat_max_threads() function has simpler semantics - it just returns the number of threads the cpu model has, taking into account any compatiblity mode it is in. This no longer takes into account kvmppc_smt_threads() as the previous version did. That check wasn't useful because we check in ppc_cpu_realizefn() that CPUs aren't instantiated with more threads than kvm allows (or if we didn't things will already be broken and this won't make it any worse). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>	2017-01-31 10:10:13 +11:00
David Gibson	fa325e6cbf	pseries: Add pseries-2.9 machine type Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com>	2017-01-31 10:10:13 +11:00
Hervé Poussineau	5904bca84e	prep: do not use global variable to access nvram Signed-off-by: Hervé Poussineau <hpoussin@reactos.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-01-31 10:10:13 +11:00
Thomas Huth	b99260ebbb	hw/ppc/spapr: Fix boot path of usb-host storage devices When passing through an USB storage device to a pseries guest, it is currently not possible to automatically boot from the device if the "bootindex" property has been specified, too (e.g. when using "-device nec-usb-xhci -device usb-host,hostbus=1,hostaddr=2,bootindex=0" at the command line). The problem is that QEMU builds a device tree path like "/pci@800000020000000/usb@0/usb-host@1" and passes it to SLOF in the /chosen/qemu,boot-list property. SLOF, however, probes the USB device, recognizes that it is a storage device and thus changes its name to "storage", and additionally adds a child node for the SCSI LUN, so the correct boot path in SLOF is something like "/pci@800000020000000/usb@0/storage@1/disk@101000000000000" instead. So when we detect an USB mass storage device with SCSI interface, we've got to adjust the firmware boot-device path properly that SLOF can automatically boot from the device. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1354177 Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-01-31 10:10:13 +11:00
Nicholas Piggin	1c7ad77e56	ppc/spapr: implement H_SIGNAL_SYS_RESET The H_SIGNAL_SYS_RESET hcall allows a guest CPU to raise a system reset exception on CPUs within the same guest -- all CPUs, all-but-self, or a specific CPU (including self). This has not made its way to a PAPR release yet, but we have an hcall number assigned. H_SIGNAL_SYS_RESET = 0x380 Syntax: hcall(uint64 H_SIGNAL_SYS_RESET, int64 target); Generate a system reset NMI on the threads indicated by target. Values for target: -1 = target all online threads including the caller -2 = target all online threads except for the caller All other negative values: reserved Positive values: The thread to be targeted, obtained from the value of the "ibm,ppc-interrupt-server#s" property of the CPU in the OF device tree. Semantics: - Invalid target: return H_Parameter. - Otherwise: Generate a system reset NMI on target thread(s), return H_Success. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2017-01-31 10:10:13 +11:00
David Gibson	d6e166c082	ppc: Rename cpu_version to compat_pvr The 'cpu_version' field in PowerPCCPU is badly named. It's named after the 'cpu-version' device tree property where it is advertised, but that meaning may not be obvious in most places it appears. Worse, it doesn't even really correspond to that device tree property. The property contains either the processor's PVR, or, if the CPU is running in a compatibility mode, a special "logical PVR" representing which mode. Rename the cpu_version field, and a number of related variables to compat_pvr to make this clearer. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Thomas Huth <thuth@redhat.com>	2017-01-31 10:10:13 +11:00
David Gibson	1d1be34d26	ppc: Clean up and QOMify hypercall emulation The pseries machine type is a bit unusual in that it runs a paravirtualized guest. The guest expects to interact with a hypervisor, and qemu emulates the functions of that hypervisor directly, rather than executing hypervisor code within the emulated system. To implement this in TCG, we need to intercept hypercall instructions and direct them to the machine's hypercall handlers, rather than attempting to perform a privilege change within TCG. This is controlled by a global hook - cpu_ppc_hypercall. This cleanup makes the handling a little cleaner and more extensible than a single global variable. Instead, each CPU to have hypercalls intercepted has a pointer set to a QOM object implementing a new virtual hypervisor interface. A method in that interface is called by TCG when it sees a hypercall instruction. It's possible we may want to add other methods in future. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>	2017-01-31 10:10:13 +11:00
David Gibson	5b120785e7	pseries: Make cpu_update during CAS unconditional spapr_h_cas_compose_response() includes a cpu_update parameter which controls whether it includes updated information on the CPUs in the device tree fragment returned from the ibm,client-architecture-support (CAS) call. Providing the updated information is essential when CAS has negotiated compatibility options which require different cpu information to be presented to the guest. However, it should be safe to provide in other cases (it will just override the existing data in the device tree with identical data). This simplifies the code by removing the parameter and always providing the cpu update information. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>	2017-01-31 10:10:13 +11:00
David Gibson	0c86d0fd92	pseries: Always use core objects for CPU construction Currently the pseries machine has two paths for constructing CPUs. On newer machine type versions, which support cpu hotplug, it constructs cpu core objects, which in turn construct CPU threads. For older machine versions it individually constructs the CPU threads. This division is going to make some future changes to the cpu construction harder, so this patch unifies them. Now cpu core objects are always created. This requires some updates to allow core objects to be created without a full complement of threads (since older versions allowed a number of cpus not a multiple of the threads-per-core). Likewise it needs some changes to the cpu core hot/cold plug path so as not to choke on the old machine types without hotplug support. For good measure, we move the cpu construction to its own subfunction, spapr_init_cpus(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org>	2017-01-31 10:10:13 +11:00
Stefan Weil	b12227afb1	hw: Fix typos found by codespell Signed-off-by: Stefan Weil <sw@weilnetz.de> Acked-by: Alistair Francis <alistair.francis@xilinx.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2017-01-24 23:26:52 +03:00
Vincent Palatin	b39466269b	kvm: move cpu synchronization code Move the generic cpu_synchronize_ functions to the common hw_accel.h header, in order to prepare for the addition of a second hardware accelerator. Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Vincent Palatin <vpalatin@chromium.org> Message-Id: <f5c3cffe8d520011df1c2e5437bb814989b48332.1484045952.git.vpalatin@chromium.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-01-19 22:07:46 +01:00
Thomas Huth	fcf5ef2ab5	Move target-* CPU file into a target/ folder We've currently got 18 architectures in QEMU, and thus 18 target-xxx folders in the root folder of the QEMU source tree. More architectures (e.g. RISC-V, AVR) are likely to be included soon, too, so the main folder of the QEMU sources slowly gets quite overcrowded with the target-xxx folders. To disburden the main folder a little bit, let's move the target-xxx folders into a dedicated target/ folder, so that target-xxx/ simply becomes target/xxx/ instead. Acked-by: Laurent Vivier <laurent@vivier.eu> [m68k part] Acked-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> [tricore part] Acked-by: Michael Walle <michael@walle.cc> [lm32 part] Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> [s390x part] Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> [s390x part] Acked-by: Eduardo Habkost <ehabkost@redhat.com> [i386 part] Acked-by: Artyom Tarasenko <atar4qemu@gmail.com> [sparc part] Acked-by: Richard Henderson <rth@twiddle.net> [alpha part] Acked-by: Max Filippov <jcmvbkbc@gmail.com> [xtensa part] Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [ppc part] Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> [cris&microblaze part] Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn> [unicore32 part] Signed-off-by: Thomas Huth <thuth@redhat.com>	2016-12-20 21:52:12 +01:00
Michael Roth	5c0139a8c2	spapr: fix default DRC state for coldplugged LMBs Currently we set the initial isolation/allocation state for DRCs associated with coldplugged LMBs to ISOLATED/UNUSABLE, respectively, under the assumption that the guest will move this state to UNISOLATED/USABLE. In fact, this is only the case for LMBs added via hotplug. For coldplugged LMBs, the guest actually assumes the initial state to be UNISOLATED/USABLE. In practice, this only becomes an issue when we attempt to unplug one of these LMBs, where the guest kernel will issue an rtas-get-sensor-state call to check that the corresponding DRC is in an USABLE state before it will release the LMB back to QEMU. If the returned state is otherwise, the guest will assume no further action is needed, which bypasses the QEMU-side cleanup that occurs during the USABLE->UNUSABLE transition. This results in LMBs and their corresponding pc-dimm devices to stick around indefinitely. This patch fixes the issue by manually setting DRCs associated with cold-plugged LMBs to UNISOLATED/ALLOCATED, but leaving the hotplug state untouched. As it turns out, this is analogous to the handling for cold-plugged CPUs in spapr_core_plug(). Cc: qemu-ppc@nongnu.org Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Bharata B Rao <bharata@linux.vnet.ibm.com> Cc: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-12-01 13:41:00 +11:00
David Gibson	5c4537bded	spapr: Fix 2.7<->2.8 migration of PCI host bridge `daa2369` "spapr_pci: Add a 64-bit MMIO window" subtly broke migration from qemu-2.7 to the current version. It split the device's MMIO window into two pieces for 32-bit and 64-bit MMIO. The patch included backwards compatibility code to convert the old property into the new format. However, the property value was also transferred in the migration stream and compared with a (probably unwise) VMSTATE_EQUAL. So, the "raw" value from 2.7 is compared to the new style converted value from (pre-)2.8 giving a mismatch and migration failure. Along with the actual field that caused the breakage, there are several other ill-advised VMSTATE_EQUAL()s. To fix forwards migration, we read the values in the stream into scratch variables and ignore them, instead of comparing for equality. To fix backwards migration, we populate those scratch variables in pre_save() with adjusted values to match the old behaviour. To permit the eventual possibility of removing this cruft from the stream, we only include these compatibility fields if a new 'pre-2.8-migration' property is set. We clear it on the pseries-2.8 machine type, which obviously can't be migrated backwards, but set it on earlier machine type versions. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>	2016-11-23 12:00:48 +11:00
David Gibson	5a78b821eb	Revert "spapr: Fix migration of PCI host bridges from qemu-2.7" This reverts commit `9b54ca0ba7`. The commit above corrected a migration breakage between qemu-2.7 and qemu-2.8. However it did so by advancing the migration version for the PCI host bridge, which obviously breaks migration backwards to earlier qemu versions. Although it's not totally essential, we'd like to maintain the possibility for backwards migration, so revert the change in preparation for a better fix. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>	2016-11-23 12:00:48 +11:00
David Gibson	146c11f16f	target-ppc: Allow eventual removal of old migration mistakes Until very recently, the vmstate for ppc cpus included some poorly thought out VMSTATE_EQUAL() components, that can easily break migration compatibility, and did so between qemu-2.6 and later versions. A hack was recently added which fixes this migration breakage, but it leaves the unhelpful cruft of these fields in the migration stream. This patch adds a new cpu property allowing these fields to be removed from the stream entirely. For the pseries-2.8 machine type - which comes after the fix - and for all non-pseries machine types - which aren't mature enough to care about cross-version migration - we remove the fields from the stream. For pseries-2.7 and earlier, The migration hack remains in place, allowing backwards and forwards migration with the older machine types. This restricts the migration compatibility cruft to older machine types, and at least opens the possibility of eventually deprecating and removing it entirely. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>	2016-11-23 12:00:48 +11:00
Michael Roth	62ef3760d4	spapr: migration support for CAS-negotiated option vectors With the additional of the OV5_HP_EVT option vector, we now have certain functionality (namely, memory unplug) that checks at run-time for whether or not the guest negotiated the option via CAS. Because we don't currently migrate these negotiated values, we are unable to unplug memory from a guest after it's been migrated until after the guest is rebooted and CAS-negotiation is repeated. This patch fixes this by adding CAS-negotiated options to the migration stream. We do this using a subsection, since the negotiated value of OV5_HP_EVT is the only option currently needed to maintain proper functionality for a running guest. Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-11-23 12:00:48 +11:00
Igor Mammedov	5836d16812	fw_cfg: move FW_CFG_NB_CPUS out of fw_cfg_init1() PC will use this field in other way, so move it outside the common code so PC could set a different value, i.e. all CPUs regardless of where they are coming from (-smp X \| -device cpu...). It's quick and dirty hack as it could be implemented in more generic way in MashineClass. But do it in simple way since only PC is affected so far. Later we can generalize it when another affected target gets support for -device cpu. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Message-Id: <1479212236-183810-3-git-send-email-imammedo@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2016-11-16 12:09:58 -02:00
David Gibson	27d9ffd4b3	ppc/pnv: Fix fatal bug on 32-bit hosts If the pnv machine type is compiled on a 32-bit host, the unsigned long (host) type is 32-bit. This means that the hweight_long() used to calculate the number of allowed cores only considers the low 32 bits of the cores_mask variable, and can thus return 0 in some circumstances. This corrects the bug. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Suggested-by: Richard Henderson <rth@twiddle.net> [clg: replaced hweight_long() by ctpop64() ] Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-11-15 10:08:43 +11:00
Cédric Le Goater	f81e551229	ppc/pnv: fix xscom address translation for POWER9 High addresses can overflow the uint32_t pcba variable after the 8byte shift. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-11-15 10:08:43 +11:00
Cédric Le Goater	ad521238b4	ppc/pnv: add a 'xscom_core_base' field to PnvChipClass The XSCOM addresses for the core registers are encoded in a slightly different way on POWER8 and POWER9. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-11-15 10:08:43 +11:00
David Gibson	9b54ca0ba7	spapr: Fix migration of PCI host bridges from qemu-2.7 `daa2369` "spapr_pci: Add a 64-bit MMIO window" subtly broke migration from qemu-2.7 to the current version. It split the device's MMIO window into two pieces for 32-bit and 64-bit MMIO. The patch included backwards compatibility code to convert the old property into the new format. However, the property value was also transferred in the migration stream and compared with a (probably unwise) VMSTATE_EQUAL. So, the "raw" value from 2.7 is compared to the new style converted value from (pre-)2.8 giving a mismatch and migration failure. Although it would be technically possible to fix this in a way allowing backwards migration, that would leave an ugly legacy around indefinitely. This patch takes the simpler approach of bumping the migration version, dropping the unwise VMSTATE_EQUAL (and some equally unwise ones around it) and ignoring them on an incoming migration. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>	2016-11-15 10:08:42 +11:00
Cédric Le Goater	ec575aa0ae	ppc/pnv: fix compile breakage on old gcc PnvChip is defined twice and this can confuse old compilers : CC ppc64-softmmu/hw/ppc/pnv_xscom.o In file included from qemu.git/hw/ppc/pnv.c:29: qemu.git/include/hw/ppc/pnv.h:60: error: redefinition of typedef ‘PnvChip’ qemu.git/include/hw/ppc/pnv_xscom.h:24: note: previous declaration of ‘PnvChip’ was here make[1]: * [hw/ppc/pnv.o] Error 1 make[1]: * Waiting for unfinished jobs.... Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-11-15 10:05:51 +11:00
David Gibson	8bd9530e13	powernv: CPU compatibility modes don't make sense for powernv powernv has some code (derived from the spapr equivalent) used in device tree generation which depends on the CPU's compatibility mode / logical PVR. However, compatibility modes don't make sense on powernv - at least not as a property controlled by the host - because the guest in powernv has full hypervisor level access to the virtual system, and so owns the PCR (Processor Compatibility Register) which implements compatiblity modes. Note: the new logic doesn't take into account kvmppc_smt_threads() like the old version did. However, if core->nr_threads exceeds kvmppc_smt_threads() then things will already be broken and clamping the value in the device tree isn't going to save us. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Thomas Huth <thuth@redhat.com>	2016-11-15 10:05:51 +11:00
Peter Maydell	6bc56d317f	Base patches for MTTCG enablement. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQExBAABCAAbBQJYF07FFBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D ppoIAI4AxWocso5WIUH6uEHjOAxw9ZNhZ92nF8VtcbvGtN/eh8Qk4jfRX+W/Jl0q D13Rm3m8ynNHqh8YFs+O6i/WSgxHGxKwb75mNr36HDnYnMFluTvRQkvYJUXRyRuL CVtNgy8+q8FbbWo+NiJ5I7gfk2Si4BQfZN0uCLqGuCwqvvA/spN13xUcpeBXEKhL TeDGZBT/atDnT2bRcve8E8g5/0RKjTL9EB0jwfJjHocT5bs+toPe6js9VnZDRNWN ZldcONgEHj3zAj9j7hTkVWFTGPSCx/tt6y6JeORq1oxk0mCCswEk0U9A3hLzLjc/ 94XHsLaEoZ7HNAKtkLc07NYhkQM= =+6Sj -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream-mttcg' into staging Base patches for MTTCG enablement. # gpg: Signature made Mon 31 Oct 2016 14:01:41 GMT # gpg: using RSA key 0xBFFBD25F78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" # Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1 # Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83 * remotes/bonzini/tags/for-upstream-mttcg: tcg: move locking for tb_invalidate_phys_page_range up *_run_on_cpu: introduce run_on_cpu_data type cpus: re-factor out handle_icount_deadline tcg: cpus rm tcg_exec_all() tcg: move tcg_exec_all and helpers above thread fn target-arm/arm-powerctl: wake up sleeping CPUs tcg: protect translation related stuff with tb_lock. translate-all: Add assert_(memory\|tb)_lock annotations linux-user/elfload: ensure mmap_lock() held while setting up tcg: comment on which functions have to be called with tb_lock held cpu-exec: include cpu_index in CPU_LOG_EXEC messages translate-all: add DEBUG_LOCKING asserts translate_all: DEBUG_FLUSH -> DEBUG_TB_FLUSH cpus: make all_vcpus_paused() return bool Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2016-10-31 15:29:12 +00:00
Paolo Bonzini	14e6fe12a7	_run_on_cpu: introduce run_on_cpu_data type This changes the _run_on_cpu APIs (and helpers) to pass data in a run_on_cpu_data type instead of a plain void *. This is because we sometimes want to pass a target address (target_ulong) and this fails on 32 bit hosts emulating 64 bit guests. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20161027151030.20863-24-alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-10-31 15:00:25 +01:00
Peter Maydell	277d44f5a6	trivial patches for 2016-10-28 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABCAAGBQJYE2wfAAoJEHAbT2saaT5ZGYUH/3QWJ4OFWbqGo1YYN5AIAheF v1bQGTh1HGbLk46ajhUvzB0bMHb1FC1KoOruU2wFYuKK/J5zQ+4X9EmaC/fD7hyx nGTcPWAyxKOlqOq3In9ro+xWQNzEhfoypKCQQVC4Y3quzub48wAro8fuFSNXLyBq ERvAsjgj0TrLEHoWtJl2bPYiqSd6KAHZAKPFW3Jw8MmsBcTLmnF2PVW3LBfdcHe7 6vlhqX7lPzVlHRaUsaxRkFxYd2YGisbe3bPRDw2fTxrtOYyEkopQq7xi2Q6Yq5N0 z0yM2oJ7o1QtUOXYa7KBf03WZ7e119HimaUkGLg+0LVhQNbeG3hd3gNwApXa5og= =tYml -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-fetch' into staging trivial patches for 2016-10-28 # gpg: Signature made Fri 28 Oct 2016 16:17:51 BST # gpg: using RSA key 0x701B4F6B1A693E59 # gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>" # gpg: aka "Michael Tokarev <mjt@corpit.ru>" # gpg: aka "Michael Tokarev <mjt@debian.org>" # Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D 4324 457C E0A0 8044 65C5 # Subkey fingerprint: 7B73 BAD6 8BE7 A2C2 8931 4B22 701B 4F6B 1A69 3E59 * remotes/mjt/tags/trivial-patches-fetch: (23 commits) Fix build for less common build directories names clean-up: removed duplicate #includes scripts/clean-includes: added duplicate #include check monitor: deprecate 'default' option qemu-ga: Remove stray 'q' in documentation Makefile: Fix help text for target 'installer' s390: avoid always-true comparison in s390_pci_generate_fid() migration: Remove unneeded NULL check from migrate_fd_error() scripts/hxtool: fix undefined behavour of echo qemu-options.hx: set: fix copy-paste error usb: Change *_exitfn return type from int to void MAINTAINERS: qemu-trivial information colo-compare: remove unused struct CompareChardevProps and 'props' variable milkymist-pfpu: fix potential integer overflow hw/block/nvme: Simplify if-statements a little bit target-lm32: rewrite gen_compare() lm32: milkymist-tmu2: fix integer overflow target-lm32: disable asm logging via LOG_DIS() target-lm32: swap operand of wcsr in LOG_DIS() target-lm32: fix LOG_DIS operand order ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2016-10-31 11:58:30 +00:00
Anand J	814bb12a56	clean-up: removed duplicate #includes Some files contain multiple #includes of the same header file. Removed most of those unnecessary duplicate entries using scripts/clean-includes. Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Anand J <anand.indukala@gmail.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2016-10-28 18:17:24 +03:00
Bharata B Rao	cf63246319	spapr: Memory hot-unplug support Add support to hot remove pc-dimm memory devices. Since we're introducing a machine-level unplug_request hook, we also had handling for CPU unplug there as well to ensure CPU unplug continues to work as it did before. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> * add hooks to CAS/cmdline enablement of hotplug ACR support * add hook for CPU unplug Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-10-28 11:17:35 +11:00
Michael Roth	79b78a6bd4	spapr: use count+index for memory hotplug Commit 0a417869: spapr: Move memory hotplug to RTAS_LOG_V6_HP_ID_DRC_COUNT type dropped per-DRC/per-LMB hotplugs event in favor of a bulk add via a single LMB count value. This was to avoid overrunning the guest EPOW event queue with hotplug events. This works fine, but relies on the guest exhaustively scanning for pluggable LMBs to satisfy the requested count by issuing rtas-get-sensor(DR_ENTITY_SENSE, ...) calls until all the LMBs associated with the DIMM are identified. With newer support for dedicated hotplug event source, this queue exhaustion is no longer as much of an issue due to implementation details on the guest side, but we still try to avoid excessive hotplug events by now supporting both a count and a starting index to avoid unecessary work. This patch makes use of that approach when the capability is available. Cc: bharata@linux.vnet.ibm.com Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-10-28 11:17:35 +11:00
Bharata B Rao	afdbd40356	spapr: Add DRC count indexed hotplug identifier type Add support for DRC count indexed hotplug ID type which is primarily needed for memory hot unplug. This type allows for specifying the number of DRs that should be plugged/unplugged starting from a given DRC index. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> * updated rtas_event_log_v6_hp to reflect count/index field ordering used in PAPR hotplug ACR Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-10-28 11:17:35 +11:00
Michael Roth	f622921430	spapr: add hotplug interrupt machine options This adds machine options of the form: -machine pseries,modern-hotplug-events=true -machine pseries,modern-hotplug-events=false If false, QEMU will force the use of "legacy" style hotplug events, which are surfaced through EPOW events instead of a dedicated hot plug event source, and lack certain features necessary, mainly, for memory unplug support. If true, QEMU will enable support for "modern" dedicated hot plug event source. Note that we will still default to "legacy" style unless the guest advertises support for the "modern" hotplug events via ibm,client-architecture-support hcall during early boot. For pseries-2.7 and earlier we default to false, for newer machine types we default to true. Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-10-28 11:17:35 +11:00
Michael Roth	ffbb1705a3	spapr_events: add support for dedicated hotplug event source Hotplug events were previously delivered using an EPOW interrupt and were queued by linux guests into a circular buffer. For traditional EPOW events like shutdown/resets, this isn't an issue, but for hotplug events there are cases where this buffer can be exhausted, resulting in the loss of hotplug events, resets, etc. Newer-style hotplug event are delivered using a dedicated event source. We enable this in supported guests by adding standard an additional event source in the guest device-tree via /event-sources, and, if the guest advertises support for the newer-style hotplug events, using the corresponding interrupt to signal the available of hotplug/unplug events. Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-10-28 11:17:35 +11:00
Michael Roth	417ece33fc	spapr: improve ibm,architecture-vec-5 property handling ibm,architecture-vec-5 is supposed to encode all option vector 5 bits negotiated between platform/guest. Currently we hardcode this property in the boot-time device tree to advertise a single negotiated capability, "Form 1" NUMA Affinity, regardless of whether or not CAS has been invoked or that capability has actually been negotiated. Improve this by generating ibm,architecture-vec-5 based on the full set of option vector 5 capabilities negotiated via CAS. Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-10-28 09:38:26 +11:00
Michael Roth	6787d27b04	spapr: add option vector handling in CAS-generated resets In some cases, ibm,client-architecture-support calls can fail. This could happen in the current code for situations where the modified device tree segment exceeds the buffer size provided by the guest via the call parameters. In these cases, QEMU will reset, allowing an opportunity to regenerate the device tree from scratch via boot-time handling. There are potentially other scenarios as well, not currently reachable in the current code, but possible in theory, such as cases where device-tree properties or nodes need to be removed. We currently don't handle either of these properly for option vector capabilities however. Instead of carrying the negotiated capability beyond the reset and creating the boot-time device tree accordingly, we start from scratch, generating the same boot-time device tree as we did prior to the CAS-generated and the same device tree updates as we did before. This could (in theory) cause us to get stuck in a reset loop. This hasn't been observed, but depending on the extensiveness of CAS-induced device tree updates in the future, could eventually become an issue. Address this by pulling capability-related device tree updates resulting from CAS calls into a common routine, spapr_dt_cas_updates(), and adding an sPAPROptionVector* parameter that allows us to test for newly-negotiated capabilities. We invoke it as follows: 1) When ibm,client-architecture-support gets called, we call spapr_dt_cas_updates() with the set of capabilities added since the previous call to ibm,client-architecture-support. For the initial boot, or a system reset generated by something other than the CAS call itself, this set will consist of all options supported both the platform and the guest. For calls to ibm,client-architecture-support immediately after a CAS-induced reset, we call spapr_dt_cas_updates() with only the set of capabilities added since the previous call, since the other capabilities will have already been addressed by the boot-time device-tree this time around. In the unlikely event that capabilities are removed since the previous CAS, we will generate a CAS-induced reset. In the unlikely event that we cannot fit the device-tree updates into the buffer provided by the guest, well generate a CAS-induced reset. 2) When a CAS update results in the need to reset the machine and include the updates in the boot-time device tree, we call the spapr_dt_cas_updates() using the full set of negotiated capabilities as part of the reset path. At initial boot, or after a reset generated by something other than the CAS call itself, this set will be empty, resulting in what should be the same boot-time device-tree as we generated prior to this patch. For CAS-induced reset, this routine will be called with the full set of capabilities negotiated by the platform/guest in the previous CAS call, which should result in CAS updates from previous call being accounted for in the initial boot-time device tree. Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [dwg: Changed an int -> bool conversion to be more explicit] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-10-28 09:38:26 +11:00
Michael Roth	facdb8b63b	spapr_hcall: use spapr_ovec_* interfaces for CAS options Currently we access individual bytes of an option vector via ldub_phys() to test for the presence of a particular capability within that byte. Currently this is only done for the "dynamic reconfiguration memory" capability bit. If that bit is present, we pass a boolean value to spapr_h_cas_compose_response() to generate a modified device tree segment with the additional properties required to enable this functionality. As more capability bits are added, will would need to modify the code to add additional option vector accesses and extend the param list for spapr_h_cas_compose_response() to include similar boolean values for these parameters. Avoid this by switching to spapr_ovec_* helpers so we can do all the parsing in one shot and then test for these additional bits within spapr_h_cas_compose_response() directly. Cc: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-10-28 09:38:26 +11:00
Michael Roth	b20b7b7add	spapr_ovec: initial implementation of option vector helpers PAPR guests advertise their capabilities to the platform by passing an ibm,architecture-vec structure via an ibm,client-architecture-support hcall as described by LoPAPR v11, B.6.2.3. during early boot. Using this information, the platform enables the capabilities it supports, then encodes a subset of those enabled capabilities (the 5th option vector of the ibm,architecture-vec structure passed to ibm,client-architecture-support) into the guest device tree via "/chosen/ibm,architecture-vec-5". The logical format of these these option vectors is a bit-vector, where individual bits are addressed/documented based on the byte-wise offset from the beginning of the bit-vector, followed by the bit-wise index starting from the byte-wise offset. Thus the bits of each of these bytes are stored in reverse order. Additionally, the first byte of each option vector is encodes the length of the option vector, so byte offsets begin at 1, and bit offset at 0. This is not very intuitive for the purposes of mapping these bits to a particular documented capability, so this patch introduces a set of abstractions that encapsulate the work of parsing/encoding these options vectors and testing for individual capabilities. Cc: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> [dwg: Tweaked double-include protection to not trigger a checkpatch false positive] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-10-28 09:38:26 +11:00
David Gibson	398a0bd5ae	pseries: Remove spapr_create_fdt_skel() For historical reasons construction of the guest device tree in spapr is divided between spapr_create_fdt_skel() which is called at init time, and spapr_build_fdt() which runs at reset time. Over time, more and more things have needed to be moved to reset time. Previous cleanups mean the only things left in spapr_create_fdt_skel() are the properties of the root node itself. Finish consolidating these two parts of device tree construction, by moving this to the start of spapr_build_fdt(), and removing spapr_create_fdt_skel() entirely. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2016-10-28 09:38:26 +11:00
David Gibson	bf5a6696ba	pseries: Consolidate construction of /vdevice device tree node Construction of the /vdevice node (and its children) is divided between spapr_create_fdt_skel() (at init time), which creates the base node, and spapr_populate_vdevice() (at reset time) which creates the nodes for each individual virtual device. This consolidates both into a single function called from spapr_build_fdt(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2016-10-28 09:38:26 +11:00
David Gibson	fca5f2dc6c	pseries: Move /hypervisor node construction to fdt_build_fdt() Currently the /hypervisor device tree node is constructed in spapr_create_fdt_skel(). As part of consolidating device tree construction to reset time, move it to a function called from spapr_build_fdt(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2016-10-28 09:38:26 +11:00
David Gibson	ffb1e275a6	pseries: Move /event-sources construction to spapr_build_fdt() The /event-sources device tree node is built from spapr_create_fdt_skel(). As part of consolidating device tree construction to reset time, this moves it to spapr_build_fdt(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2016-10-28 09:38:26 +11:00
David Gibson	3f5dabceba	pseries: Consolidate construction of /rtas device tree node For historical reasons construction of the /rtas node in the device tree (amongst others) is split into several places. In particular it's split between spapr_create_fdt_skel(), spapr_build_fdt() and spapr_rtas_device_tree_setup(). In fact, as well as adding the actual RTAS tokens to the device tree, spapr_rtas_device_tree_setup() just adds the ibm,lrdr-capacity property, which despite going in the /rtas node, doesn't have a lot to do with RTAS. This patch consolidates the code constructing /rtas together into a new spapr_dt_rtas() function. spapr_rtas_device_tree_setup() is renamed to spapr_dt_rtas_tokens() and now only adds the token properties. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2016-10-28 09:38:26 +11:00
David Gibson	7c866c6a60	pseries: Consolidate construction of /chosen device tree node For historical reasons, building the /chosen node in the guest device tree is split across several places and includes both parts which write the DT sequentially and others which use random access functions. This patch consolidates construction of the node into one place, using random access functions throughout. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2016-10-28 09:38:26 +11:00
David Gibson	9b9a19080a	pseries: Move construction of /interrupt-controller fdt node Currently the device tree node for the XICS interrupt controller is in spapr_create_fdt_skel(). As part of consolidating device tree construction to reset time, this moves it to a function called from spapr_build_fdt(). In addition we move the actual code into hw/intc/xics_spapr.c with the rest of the PAPR specific interrupt controller code. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2016-10-28 09:38:26 +11:00
David Gibson	2cac78c12a	pseries: Consolidate RTAS loading At each system reset, the pseries machine needs to load RTAS (the runtime portion of the guest firmware) into the VM. This means copying the actual RTAS code into guest memory, and also updating the device tree so that the guest OS and boot firmware can locate it. For historical reasons the copy and update to the device tree were in different parts of the code. This cleanup brings them both together in an spapr_load_rtas() function. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2016-10-28 09:38:26 +11:00
David Gibson	cf6e522390	pseries: Move adding of fdt reserve map entries The flattened device tree passed to pseries guests contains a list of reserved memory areas. Currently we construct this list early in spapr_create_fdt_skel() as we sequentially write the fdt. This will be inconvenient for upcoming cleanups, so this patch moves the reserve map changes to the end of fdt construction. This changes fdt_add_reservemap_entry() calls - which work when writing the fdt sequentially to fdt_add_mem_rsv() calls used when altering the fdt in random access mode. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2016-10-28 09:38:25 +11:00
David Gibson	a19f7fb045	pseries: Make spapr_create_fdt_skel() get information from machine state Currently spapr_create_fdt_skel() takes a bunch of individual parameters for various things it will put in the device tree. Some of these can already be taken directly from sPAPRMachineState. This patch alters it so that all of them can be taken from there, which will allow this code to be moved away from its current caller in future. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2016-10-28 09:38:25 +11:00
David Gibson	cae172ab6d	pseries: Remove rtas_addr and fdt_addr fields from machinestate These values are used only within ppc_spapr_reset(), so just change them to local variables. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>	2016-10-28 09:38:25 +11:00
David Gibson	997b6cfc3d	pseries: Split device tree construction from device tree load spapr_finalize_fdt() both finishes building the device tree for the guest and loads it into guest memory. For future cleanups, it's going to be more convenient to do these two things separately. The loading portion is pretty trivial, so we move it inline into the caller, ppc_spapr_reset(). We also rename spapr_finalize_fdt(), because the current name is going to become inaccurate. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>	2016-10-28 09:38:25 +11:00
Cédric Le Goater	3495b6b610	ppc/pnv: add a ISA bus As Qemu only supports a single instance of the ISA bus, we use the LPC controller of chip 0 to create one and plug in a couple of useful devices, like an UART and RTC. An IPMI BT device, which is also an ISA device, can be defined on the command line to connect an external BMC. That is for later. The PowerNV machine now has a console. Skiboot should load a kernel and jump into it but execution will stop quite early because we lack a model for the native XICS controller for the moment : [ 0.000000] NR_IRQS:512 nr_irqs:512 16 [ 0.000000] XICS: Cannot find a Presentation Controller ! [ 0.000000] ------------[ cut here ]------------ [ 0.000000] WARNING: at arch/powerpc/platforms/powernv/setup.c:81 ... [ 0.000000] NIP [c00000000079d65c] pnv_init_IRQ+0x30/0x44 You can still do a few things under xmon. Based on previous work from : Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [dwg: Trivial fix for a change in the serial_hds_isa_init() interface] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-10-28 09:38:25 +11:00
Benjamin Herrenschmidt	a3980bf517	ppc/pnv: add a LPC controller The LPC (Low Pin Count) interface on a POWER8 is made accessible to the system through the ADU (XSCOM interface). This interface is part of set of units connected together via a local OPB (On-Chip Peripheral Bus) which act as a bridge between the ADU and the off chip LPC endpoints, like external flash modules. The most important units of this OPB are : - OPB Master: contains the ADU slave logic, a set of internal registers and the logic to control the OPB. - LPCHC (LPC HOST Controller): which implements a OPB Slave, a set of internal registers and the LPC HOST Controller to control the LPC interface. Four address spaces are provided to the ADU : - LPC Bus Firmware Memory - LPC Bus Memory - LPC Bus I/O (ISA bus) - and the registers for the OPB Master and the LPC Host Controller On POWER8, an intermediate hop is necessary to reach the OPB, through a unit called the ECCB. OPB commands are simply mangled in ECCB write commands. On POWER9, the OPB master address space can be accessed via MMIO. The logic is same but the code will be simpler as the XSCOM and ECCB hops are not necessary anymore. This version of the LPC controller model doesn't yet implement support for the SerIRQ deserializer present in the Naples version of the chip though some preliminary work is there. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [clg: - updated for qemu-2.7 - ported on latest PowerNV patchset - changed the XSCOM interface to fit new model - QOMified the model - moved the ISA hunks in another patch - removed printf logging - added a couple of UNIMP logging - rewrote commit log ] Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-10-28 09:38:25 +11:00
Cédric Le Goater	24ece07250	ppc/pnv: add XSCOM handlers to PnvCore Now that we are using real HW ids for the cores in PowerNV chips, we can route the XSCOM accesses to them. We just need to attach a specific XSCOM memory region to each core in the appropriate window for the core number. To start with, let's install the DTS (Digital Thermal Sensor) handlers which should return 38°C for each core. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-10-28 09:38:25 +11:00
Cédric Le Goater	967b75230b	ppc/pnv: add XSCOM infrastructure On a real POWER8 system, the Pervasive Interconnect Bus (PIB) serves as a backbone to connect different units of the system. The host firmware connects to the PIB through a bridge unit, the Alter-Display-Unit (ADU), which gives him access to all the chiplets on the PCB network (Pervasive Connect Bus), the PIB acting as the root of this network. XSCOM (serial communication) is the interface to the sideband bus provided by the POWER8 pervasive unit to read and write to chiplets resources. This is needed by the host firmware, OPAL and to a lesser extent, Linux. This is among others how the PCI Host bridges get configured at boot or how the LPC bus is accessed. To represent the ADU of a real system, we introduce a specific AddressSpace to dispatch XSCOM accesses to the targeted chiplets. The translation of an XSCOM address into a PCB register address is slightly different between the P9 and the P8. This is handled before the dispatch using a 8byte alignment for all. To customize the device tree, a QOM InterfaceClass, PnvXScomInterface, is provided with a populate() handler. The chip populates the device tree by simply looping on its children. Therefore, each model needing custom nodes should not forget to declare itself as a child at instantiation time. Based on previous work done by : Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> [dwg: Added cpu parameter to xscom_complete()] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-10-28 09:38:25 +11:00
Cédric Le Goater	d2fd9612ee	ppc/pnv: add a PnvCore object This is largy inspired by sPAPRCPUCore with some simplification, no hotplug for instance. A set of PnvCore objects is added to the PnvChip and the device tree is populated looping on these cores. Real HW cpu ids are now generated depending on the chip cpu model, the chip id and a core mask. The id is propagated to the CPU object, using properties, to set the SPR_PIR (Processor Identification Register) Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-10-28 09:38:25 +11:00
Cédric Le Goater	631adaff31	ppc/pnv: add a PIR handler to PnvChip The Processor Identification Register (PIR) is a register that holds a processor identifier which is used for bus transactions (XSCOM) and for processor differentiation in multiprocessor systems. It also used in the interrupt vector entries (IVE) to identify the thread serving the interrupts. P9 and P8 have some differences in the CPU PIR encoding. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-10-28 09:38:25 +11:00
Cédric Le Goater	397a79e757	ppc/pnv: add a core mask to PnvChip This will be used to build real HW ids for the cores and enforce some limits on the available cores per chip. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-10-28 09:38:25 +11:00
Cédric Le Goater	e997040e3f	ppc/pnv: add a PnvChip object This is is an abstraction of a POWER8 chip which is a set of cores plus other 'units', like the pervasive unit, the interrupt controller, the memory controller, the on-chip microcontroller, etc. The whole can be seen as a socket. It depends on a cpu model and its characteristics: max cores and specific inits are defined in a PnvChipClass. We start with an near empty PnvChip with only a few cpu constants which we will grow in the subsequent patches with the controllers required to run the system. The Chip CFAM (Common FRU Access Module) ID gives the model of the chip and its version number. It is generally the first thing firmwares fetch, available at XSCOM PCB address 0xf000f, to start initialization. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-10-28 09:38:25 +11:00
Benjamin Herrenschmidt	9e933f4a62	ppc/pnv: add skeleton PowerNV platform The goal is to emulate a PowerNV system at the level of the skiboot firmware, which loads the OS and provides some runtime services. Power Systems have a lower firmware (HostBoot) that does low level system initialization, like DRAM training. This is beyond the scope of what qemu will address in a PowerNV guest. No devices yet, not even an interrupt controller. Just to get started, some RAM to load the skiboot firmware, the kernel and initrd. The device tree is fully created in the machine reset op. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [clg: - updated for qemu-2.7 - replaced fprintf by error_report - used a common definition of _FDT macro - removed VMStateDescription as migration is not yet supported - added IBM Copyright statements - reworked kernel_filename handling - merged PnvSystem and sPowerNVMachineState - removed PHANDLE_XICP - added ppc_create_page_sizes_prop helper - removed nmi support - removed kvm support - updated powernv machine to version 2.8 - removed chips and cpus, They will be provided in another patches - added a machine reset routine to initialize the device tree (also) - french has a squelette and english a skeleton. - improved commit log. - reworked prototypes parameters - added a check on the ram size (thanks to Michael Ellerman) - fixed chip-id cell - changed MAX_CPUS to 2048 - simplified memory node creation to one node only - removed machine version - rewrote the device tree creation with the fdt "rw" routines - s/sPowerNVMachineState/PnvMachineState/ - etc.] Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-10-28 09:38:24 +11:00
Michael Roth	4bcfa56ca9	spapr_pci: advertise explicit numa IDs even when there's 1 node With the addition of "numa_node" properties for PHBs we began advertising NUMA affinity in cases where nb_numa_nodes > 1. Since the default on the guest side is to make no assumptions about PHB NUMA affinity (defaulting to -1), there is still a valid use-case for explicitly defining a PHB's NUMA affinity even when there's just one node. In particular, some workloads make faulty assumptions about /sys/bus/pci/<devid>/numa_node being >= 0, warranting the use of this property as a workaround even if there's just 1 PHB or NUMA node. Enable this use-case by always advertising the PHB's NUMA affinity if "numa_node" has been explicitly set. We could achieve this by relaxing the check to simply be nb_numa_nodes > 0, but even safer would be to check numa_info[nodeid].present explicitly, and to fail at start time for cases where it does not exist. This has an additional affect of no longer advertising PHB NUMA affinity unconditionally if nb_numa_nodes > 1 and "numa_node" property is unset/-1, but since the default value on the guest side for each PHB is also -1, the behavior should be the same for that situation. We could still retain the old behavior if desired, but the decision seems arbitrary, so we take the simpler route. Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Cc: Shivaprasad G. Bhat <shivapbh@in.ibm.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-10-28 09:36:58 +11:00
Igor Mammedov	079019f2e3	Increase MAX_CPUMASK_BITS from 255 to 288 so that it would be possible to increase maxcpus limit for x86 target. Keep spapr/virt_arm at limit they used to have 255. Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2016-10-24 17:29:15 -02:00
David Gibson	357d1e3bc7	spapr: Improved placement of PCI host bridges in guest memory map Currently, the MMIO space for accessing PCI on pseries guests begins at 1 TiB in guest address space. Each PCI host bridge (PHB) has a 64 GiB chunk of address space in which it places its outbound PIO and 32-bit and 64-bit MMIO windows. This scheme as several problems: - It limits guest RAM to 1 TiB (though we have a limited fix for this now) - It limits the total MMIO window to 64 GiB. This is not always enough for some of the large nVidia GPGPU cards - Putting all the windows into a single 64 GiB area means that naturally aligning things within there will waste more address space. In addition there was a miscalculation in some of the defaults, which meant that the MMIO windows for each PHB actually slightly overran the 64 GiB region for that PHB. We got away without nasty consequences because the overrun fit within an unused area at the beginning of the next PHB's region, but it's not pretty. This patch implements a new scheme which addresses those problems, and is also closer to what bare metal hardware and pHyp guests generally use. Because some guest versions (including most current distro kernels) can't access PCI MMIO above 64 TiB, we put all the PCI windows between 32 TiB and 64 TiB. This is broken into 1 TiB chunks. The first 1 TiB contains the PIO (64 kiB) and 32-bit MMIO (2 GiB) windows for all of the PHBs. Each subsequent TiB chunk contains a naturally aligned 64-bit MMIO window for one PHB each. This reduces the number of allowed PHBs (without full manual configuration of all the windows) from 256 to 31, but this should still be plenty in practice. We also change some of the default window sizes for manually configured PHBs to saner values. Finally we adjust some tests and libqos so that it correctly uses the new default locations. Ideally it would parse the device tree given to the guest, but that's a more complex problem for another time. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com>	2016-10-16 12:04:15 +11:00
David Gibson	daa2369903	spapr_pci: Add a 64-bit MMIO window On real hardware, and under pHyp, the PCI host bridges on Power machines typically advertise two outbound MMIO windows from the guest's physical memory space to PCI memory space: - A 32-bit window which maps onto 2GiB..4GiB in the PCI address space - A 64-bit window which maps onto a large region somewhere high in PCI address space (traditionally this used an identity mapping from guest physical address to PCI address, but that's not always the case) The qemu implementation in spapr-pci-host-bridge, however, only supports a single outbound MMIO window, however. At least some Linux versions expect the two windows however, so we arranged this window to map onto the PCI memory space from 2 GiB..~64 GiB, then advertised it as two contiguous windows, the "32-bit" window from 2G..4G and the "64-bit" window from 4G..~64G. This approach means, however, that the 64G window is not naturally aligned. In turn this limits the size of the largest BAR we can map (which does have to be naturally aligned) to roughly half of the total window. With some large nVidia GPGPU cards which have huge memory BARs, this is starting to be a problem. This patch adds true support for separate 32-bit and 64-bit outbound MMIO windows to the spapr-pci-host-bridge implementation, each of which can be independently configured. The 32-bit window always maps to 2G.. in PCI space, but the PCI address of the 64-bit window can be configured (it defaults to the same as the guest physical address). So as not to break possible existing configurations, as long as a 64-bit window is not specified, a large single window can be specified. This will appear the same way to the guest as the old approach, although it's now implemented by two contiguous memory regions rather than a single one. For now, this only adds the possibility of 64-bit windows. The default configuration still uses the legacy mode. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com>	2016-10-16 12:03:09 +11:00
David Gibson	2efff1c0dd	spapr: Adjust placement of PCI host bridge to allow > 1TiB RAM Currently the default PCI host bridge for the 'pseries' machine type is constructed with its IO windows in the 1TiB..(1TiB + 64GiB) range in guest memory space. This means that if > 1TiB of guest RAM is specified, the RAM will collide with the PCI IO windows, causing serious problems. Problems won't be obvious until guest RAM goes a bit beyond 1TiB, because there's a little unused space at the bottom of the area reserved for PCI, but essentially this means that > 1TiB of RAM has never worked with the pseries machine type. This patch fixes this by altering the placement of PHBs on large-RAM VMs. Instead of always placing the first PHB at 1TiB, it is placed at the next 1 TiB boundary after the maximum RAM address. Technically, this changes behaviour in a migration-breaking way for existing machines with > 1TiB maximum memory, but since having > 1 TiB memory was broken anyway, this seems like a reasonable trade-off. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com>	2016-10-16 12:03:09 +11:00
David Gibson	6737d9ad79	spapr_pci: Delegate placement of PCI host bridges to machine type The 'spapr-pci-host-bridge' represents the virtual PCI host bridge (PHB) for a PAPR guest. Unlike on x86, it's routine on Power (both bare metal and PAPR guests) to have numerous independent PHBs, each controlling a separate PCI domain. There are two ways of configuring the spapr-pci-host-bridge device: first it can be done fully manually, specifying the locations and sizes of all the IO windows. This gives the most control, but is very awkward with 6 mandatory parameters. Alternatively just an "index" can be specified which essentially selects from an array of predefined PHB locations. The PHB at index 0 is automatically created as the default PHB. The current set of default locations causes some problems for guests with large RAM (> 1 TiB) or PCI devices with very large BARs (e.g. big nVidia GPGPU cards via VFIO). Obviously, for migration we can only change the locations on a new machine type, however. This is awkward, because the placement is currently decided within the spapr-pci-host-bridge code, so it breaks abstraction to look inside the machine type version. So, this patch delegates the "default mode" PHB placement from the spapr-pci-host-bridge device back to the machine type via a public method in sPAPRMachineClass. It's still a bit ugly, but it's about the best we can do. For now, this just changes where the calculation is done. It doesn't change the actual location of the host bridges, or any other behaviour. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com>	2016-10-16 12:03:09 +11:00
Benjamin Herrenschmidt	cc706a5305	ppc/xics: Make the ICSState a list Instead of an array of fixed sized blocks, use a list, as we will need to have sources with variable number of interrupts. SPAPR only uses a single entry. Native will create more. If performance becomes an issue we can add some hashed lookup but for now this will do fine. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [ move the initialization of list to xics_common_initfn, restore xirr_owner after migration and move restoring to icp_post_load] Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com> [ clg: removed the icp_post_load() changes from nikunj patchset v3: http://patchwork.ozlabs.org/patch/646008/ ] Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-10-14 16:31:02 +11:00

1 2 3 4 5 ...

943 Commits