Talospace

Showing posts with the label KVM

Posted by ClassicHasClass on September 22, 2021

Whonix on OpenPOWER

Developer Jeremy Rand wrote in to report his functioning port of Whonix 16 to OpenPOWER. (I should point out that all links in this article are "clearnet.") Whonix is a second operating system based on Kicksecure (a Debian derivative formerly known as "Hardened Debian") that runs within VMs on your existing OS (compare with Tails). All connections within it are forced through Tor, using different paths for different applications; additionally, it uses kloak for keystroke anonymization and secure network time synchronization instead of NTP, has higher quality RNGs, and enables AppArmor and hardened kernel profiles to protect against other types of attacks.

The current release of Whonix is based on Debian bullseye and runs "native" on OpenPOWER KVM-HV using libvirt. Note that ppc64le isn't a top-tier architecture yet, so there are roadbumps: due to a bug in kernel versions prior to 5.14, currently you have to use Debian experimental for the VM, and there may be other glitches temporarily until support is mainstreamed. But if you bought an OpenPOWER workstation for its auditability and transparency, I doubt something like that's going to trip you up much. Detailed installation instructions, including Onion links if you prefer, are on the Raptor wiki.

Posted by ClassicHasClass on March 04, 2021

Juicing QEMU for fun, ??? and profit!

The number of packages and applications natively available for OpenPOWER continues to grow in just about every distro's package manager, and even where a prebuilt package doesn't exist, many more will build from source. But emulation is still going to be a fact of life for Windows-only/x86/x86_64-only (maybe even aarch64-only) binaries we can't rebuild, and KVM only helps us with other Power ISA systems (in fact, it looks like KVM-PR broke and can't boot Mac OS X again, so I guess I'll be diving back into the source), so we need to wring as much speed out of QEMU's emulation engine as possible.

We are fortunate with QEMU in that there is ppc64le support in TCG, the Tiny Code Generator which implements a basic JIT, and the Power ISA TCG backend even emits those tasty newer POWER9 instructions to take better advantage of the processor. Without TCG, QEMU would be dreadfully slow when emulating a foreign architecture. However, unless IBM or some other OpenPOWER hardware developer implements instructions (a la Apple M1) in a future chip that specifically improve emulation of other CPUs (like, I dunno, x86_64), there's very little that can be done to improve the code the Power TCG backend generates and CPU emulation spends most of its time in TCG-generated code.

However, the software MMU that QEMU's CPU emulation uses has pre-compiled portions, and all the devices and components QEMU emulates (like the system bus, video, mass storage, USB, etc.) are also pre-compiled. This gives us an opportunity: with a little extra elbow grease, you can make a link-time-optimized and profile-guided-optimized (LTO-PGO) build of QEMU specific to the particular workload which can run the CPU anywhere from 3-8% faster and video and other devices up to 15% faster depending on the set of devices. While number crunching isn't substantially faster, and the modest CPU improvements don't improve user-mode emulation a great deal, full system emulation's general responsiveness improves and makes using more applications more feasible.

This process is not automated. For Firefox, we make LTO-PGO builds using the internal machinery and our patches for gcc compatibility, which is currently our preferred compiler on OpenPOWER systems. The Firefox build system generates a profiling build first, then automatically collects profiling data with it off a model workload and builds the optimized browser from that profile. QEMU doesn't have that infrastructure right now, but you can do it manually: you configure and compile a profiling build, run your workload with it to create a profile, and then configure and compile an optimized build with the profile thus generated.

I'll give instructions here for both QEMU 5.0 and 5.2, since 5.0 seems to be a bit more performant than 5.2 and has fewer build prerequisites, but 5.2 is more straightforward and we'll do it first. In these examples, I'm optimizing ppc-softmmu so that I can run Mac OS 9, which has never worked properly with KVM-PR; substitute with your desired target, such as x86_64-softmmu. Only do one target at a time, and you will want to do individual builds for each system image — even if you normally use the same executable binary for multiple OSes — because different code paths may be exercised with different workloads and/or configurations.

Let's start with making a profiling build. To do this, we'll add -fprofile-generate to the compiler flags (as well as -flto for LTO). For consistency we'll pass the same set of options to the C compiler, the C++ compiler and the linker (each will ignore options they don't need). In the QEMU source tree,

mkdir build
cd build
../configure --extra-cflags="-O3 -mcpu=power9 -flto -fprofile-generate" \
    --extra-cxxflags="-O3 -mcpu=power9 -flto -fprofile-generate" \
    --extra-ldflags="-flto -fprofile-generate" --target-list=ppc-softmmu
make -j24   # or as appropriate: this is a dual-8 Talos II

Wait for QEMU to build. When it finishes, back up your drive image because you may not be able to shut it down normally and it would suck to damage it inadvertently. With a backup copy saved, run the new QEMU as you ordinarily would on your target workload. For example, my classic script is (assuming you're still in the build directory)

./qemu-system-ppc -M mac99,accel=tcg,via=pmu -m 1536 -boot c \
    -drive id=root,file=classic.img,format=qcow2,l2-cache-size=4M \
    -usb -netdev tap,id=mynet0,ifname=tap0,script=no,downscript=no \
    -device rtl8139,netdev=mynet0 -rtc base=localtime

You should use as close to your normal configuration as possible so that the device drivers you run are factored into the profile.

The first thing you'll notice is that QEMU is now really, really, really slow. Crust-of-the-earth-cooling slow. This is because it's storing all that profile data every time any block of compiled code is executed. As a result you will probably not be able to type or interact with the guest in any meaningful fashion, so let the system boot, grab a cup of a fortifying beverage and wait for it to get as far as it can. For Mac OS 9, it took several minutes to get to the desktop; for OS X 10.4, it took about a quarter of an hour (with a lot of timeouts in a verbose boot) to even start the login window. At some point you will not be able to usefully proceed any further with the guest, but fortunately you backed up your drive image already, so you can simply close the window.
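
Before rebuilding, it may be worth confirming that the instrumented run actually left profile data behind. The following is a minimal sketch rather than part of the procedure itself: count_profiles is a hypothetical helper name, and it assumes gcc's default behaviour of writing the .gcda files into the build tree next to the objects.

```shell
#!/bin/sh
# count_profiles [DIR]: print how many gcc profile data files (*.gcda)
# exist under DIR (default: the current directory).
# Hypothetical helper name; not part of QEMU or gcc.
count_profiles() {
    find "${1:-.}" -type f -name '*.gcda' | wc -l | tr -d ' '
}

# Example: run from the build directory after the profiling session.
n=$(count_profiles .)
if [ "$n" -eq 0 ]; then
    echo "no .gcda files found; the profiling run may not have exited cleanly"
else
    echo "found $n profile data files"
fi
```

If the count is zero, the optimized build will still compile (that is what -Wno-missing-profile allows), but it will not have any profile to work from.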

Go back to the build directory. This time we will tell gcc to build with the generated profile (-fprofile-use), though we will allow it to account for certain changes (-fprofile-correction) and allow compilation to occur even if a profile doesn't exist for a particular target (-Wno-missing-profile) so that it can get through configure cleanly:

make clean   # this doesn't remove the profile .gcda files
../configure \
    --extra-cflags="-O3 -mcpu=power9 -flto -fprofile-correction -fprofile-use -Wno-missing-profile" \
    --extra-cxxflags="-O3 -mcpu=power9 -flto -fprofile-use -fprofile-correction -Wno-missing-profile" \
    --extra-ldflags="-flto -fprofile-use -fprofile-correction -Wno-missing-profile" \
    --target-list=ppc-softmmu
make -j24

Enjoy the new hotness. You should be able to see measurable improvements in the CPU emulation, but more importantly, boot times and responsiveness of the full system emulation should also be improved.
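
Since the two configure invocations differ only in their profile flags, they can be generated from one place. This is a rough sketch under my own naming (pgo_flags and pgo_configure are hypothetical helpers, not anything QEMU ships); it only prints the commands so you can inspect them before running anything.

```shell
#!/bin/sh
# pgo_flags STAGE: print the compiler flags used in this article for the
# given stage ("generate" for the profiling build, "use" for the optimized one).
pgo_flags() {
    base="-O3 -mcpu=power9 -flto"
    case "$1" in
        generate) echo "$base -fprofile-generate" ;;
        use)      echo "$base -fprofile-use -fprofile-correction -Wno-missing-profile" ;;
        *)        echo "usage: pgo_flags generate|use" >&2; return 1 ;;
    esac
}

# pgo_configure STAGE TARGET: print (not run) the matching configure command.
# The linker gets the same flags minus the optimization and CPU options.
pgo_configure() {
    flags=$(pgo_flags "$1") || return 1
    ldflags=${flags#"-O3 -mcpu=power9 "}
    printf '../configure --extra-cflags="%s" --extra-cxxflags="%s" --extra-ldflags="%s" --target-list=%s\n' \
        "$flags" "$flags" "$ldflags" "$2"
}

pgo_configure generate ppc-softmmu
pgo_configure use ppc-softmmu
```

Swap ppc-softmmu for whichever single target you are profiling, such as x86_64-softmmu.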

For 5.0.0, the process is a bit more complicated, but it's a bit quicker, so I found it worth it (and it's what I currently use for Mac OS 9). In the QEMU source tree, configure the build:

./configure --extra-cflags="-O3 -mcpu=power9 -flto -fprofile-generate" \
    --extra-cxxflags="-O3 -mcpu=power9 -flto -fprofile-generate" \
    --extra-ldflags="-flto -fprofile-generate" --target-list=ppc-softmmu
make -j24

Run your profile as before. However, you need to preserve the profile before the rebuild because make clean will clobber it.

tar cvf instrumented.tar `find . -name '*.gcda' -print`
make clean
tar xf instrumented.tar
./configure \
    --extra-cflags="-O3 -mcpu=power9 -flto -fprofile-correction -fprofile-use -Wno-missing-profile" \
    --extra-cxxflags="-O3 -mcpu=power9 -flto -fprofile-use -fprofile-correction -Wno-missing-profile" \
    --extra-ldflags="-flto -fprofile-use -fprofile-correction -Wno-missing-profile" \
    --target-list=ppc-softmmu
make -j24

Life's golden, and just a little bit zippier. It's not always possible to PGO all the things, but here's one where it makes a noticeable difference.
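
The save-and-restore step for 5.0 can also be wrapped up so it is harder to forget. A minimal sketch, assuming a tar that accepts a file list on standard input via -T - (GNU tar does); save_profiles and restore_profiles are hypothetical names of my own. Feeding the list through a pipe instead of backticks also sidesteps trouble with very long listings or odd filenames.

```shell
#!/bin/sh
# save_profiles TREE ARCHIVE: archive every *.gcda file under TREE.
# ARCHIVE should be an absolute path, since we cd into TREE first.
save_profiles() {
    ( cd "$1" && find . -type f -name '*.gcda' -print | tar -cf "$2" -T - )
}

# restore_profiles TREE ARCHIVE: unpack the saved profiles back into TREE.
restore_profiles() {
    ( cd "$1" && tar -xf "$2" )
}

# Typical use from the QEMU 5.0 source tree (shown, not executed here):
#   save_profiles "$PWD" "$PWD/instrumented.tar"
#   make clean
#   restore_profiles "$PWD" "$PWD/instrumented.tar"
```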

Posted by ClassicHasClass on June 22, 2020

It's Talos all the way down

Still can't bear the sticker shock of your very own Talos II, or even an itty bitty Blackbird? Why not do what we all do for the machines we can't own and emulate one instead? (And then decide you like it a lot, and save your pennies?)

QEMU 5.0.0 offers a machine model for the bare-metal PowerNV profile, to which the Raptor systems and other OpenPOWER POWER8 and POWER9 designs intended for Linux (i.e., not PowerVM machines) belong. Using the Talos II firmware image (mostly: one snag to be mentioned), you can boot the machine in QEMU and from there bring up an operating system in emulation. In this article we'll prove it works by bringing up Void Linux for Power (hi, Daniel!) in a variety of configurations. A set-up like this might be enough to test that your software or open-source package builds and runs on OpenPOWER, even if you don't own one yet. In a future article we'll talk about how you can boot your own code on the metal so you can port your favourite OS or build a unikernel.

(For the purposes of this article I'll assume an audience that isn't as familiar with OpenPOWER terminology as our usual readership. Kindly humour me.)

The emulation is imperfect, whether you're emulating it on a real Raptor family system or on an icky PC. While QEMU can emulate an AST2500 (i.e., the ARM-based Baseboard Management Controller, which acts as the service processor and provides the video framebuffer), and QEMU can also emulate a PowerNV system, it doesn't do both at the same time. That means the very lowest levels are actually being simulated here -- you can't watch Raptor's pretty Hostboot display, for example, and only the barest functions of the BMC are simulated enough to allow bring-up, not including the framebuffer. In fact, the hardware profiles we will use here do not in general match a real Raptor system either: we're just virtually plugging in PCI devices that give us necessary functionality, though of course none of the peripheral devices in a Raptor system is Raptor-proprietary. Finally, even though I have tagged this entry with KVM, KVM currently doesn't work right with the QEMU PowerNV machine model even though I'm pretty sure it should be technically possible. Sadly, I tried in vain to do so, could never get KVM-HV to be happy, and ended up kernel panicking the machine with KVM-PR. See if you can triumph where I have failed. In the meantime, naturally you can do everything here on a T2 or Blackbird as well because that's how I did it writing this article, but there is no special acceleration for those systems right now.

The first order of business is the first order of business with any emulator: get the ROMs. Fortunately, no one is going to bust you for pirating a set of these because we're an open platform, remember?

The two pieces required are Skiboot and Petitboot, both of which live in the system's PNOR flash. Skiboot contains OPAL, the OpenPOWER Abstraction Layer. It comes in after the BMC has turned on main power and started the Power CPUs' self-boot engines, which then IPL ("initial program load") Hostboot for the second-stage power-on sequence. When Hostboot completes, it chains into Skiboot, which initializes the PCIe host bus controllers (PHBs) and provides all the basic hardware calls needed by a guest kernel to support the platform. You can think of it as something like an overgrown BIOS. This is the lowest firmware level of an OpenPOWER system that QEMU currently supports emulating.

Skiboot lives only to service a kernel, so it immediately starts one. This initial payload is the bootloader for Petitboot, which is also stored in firmware. Petitboot has a small Linux root (Skiroot) and acts as a boot menu, finding bootable volumes on attached devices or over the network. Having found one (or you select one), it chains into it to start the main OS, and from then on Skiboot will provide platform services via OPAL for this final guest until the system is shut down or restarted. Because it's in firmware, Petitboot is always available, which can come in really handy when you're trying to do system recovery.

The first, best and most dedicated way is to build Skiboot and Petitboot yourself. They are open-source and the process is relatively well documented and automated, and you should know how to do this if you own an OpenPOWER machine anyhow. If you aren't doing this on a real OpenPOWER machine you'll need a cross-compiler, but most Linux distros offer such a package nowadays. Do keep in mind that if it looks like you're building a tiny Linux distro, well, that's because that's exactly what you're doing. The advantage here is you can fool around with the firmware at your leisure, but it requires a bit of an investment in disk space and time.

The second way assumes you have a more casual interest and would prefer to go with something prefab. If you (or, you know, your "friend") have a Raptor-family system, it's possible to extract the necessary components right from the BMC prompt. Log into the BMC over SSH (or via direct serial connection) and type pflash -i. You'll see a list of all the partitions stored in the PNOR flash. The ones we want are PAYLOAD (which contains Skiboot) and BOOTKERNEL (which contains Skiroot and Petitboot). The exact addresses may vary from system to system and firmware to firmware.

root@bmc:~# pflash -P PAYLOAD -r /tmp/pnor.PAYLOAD --skip=4096
Reading to "/tmp/pnor.PAYLOAD" from 0x021a1000..0x022a1000 !
[==================================================] 100%
root@bmc:~# pflash -P BOOTKERNEL -r /tmp/pnor.BOOTKERNEL --skip=4096
Reading to "/tmp/pnor.BOOTKERNEL" from 0x022a1000..0x03821000 !
[==================================================] 100%

We skip the first 4K page to avoid the wrapping around each partition. pnor.PAYLOAD is actually compressed and needs to be uncompressed prior to use, so:

root@bmc:~# cd /tmp
root@bmc:/tmp# xz -d < pnor.PAYLOAD > skiboot.lid

Finally, scp both skiboot.lid and pnor.BOOTKERNEL to your desired system from the BMC.
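
If you are not sure whether a PAYLOAD dump you have lying around has already been decompressed, you can look at its first bytes: xz streams start with the six-byte magic FD 37 7A 58 5A 00. A small sketch, with is_xz being a hypothetical helper name of mine:

```shell
#!/bin/sh
# is_xz FILE: succeed if FILE starts with the six-byte xz magic number
# (FD 37 7A 58 5A 00). Hypothetical helper name.
is_xz() {
    magic=$(head -c 6 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "fd377a585a00" ]
}

# Example: only decompress the payload if it really is xz data.
if [ -f pnor.PAYLOAD ] && is_xz pnor.PAYLOAD; then
    xz -d < pnor.PAYLOAD > skiboot.lid
fi
```

Note that this only works on a dump taken with the first 4K skipped as shown above, since otherwise the partition wrapper comes first.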

Admittedly we just talked at length about the two ways most of you won't get the firmware, so let's talk about the third method and the way most of you will, i.e., you'll just download it. Currently there is an irregularity about Raptor's present Skiboot build for this purpose: it only boots if you are emulating a single POWER8. That's not a typo. If you use it to boot an emulated POWER9, the guest will simply panic, and the guest will go into a bootloop if you are emulating multiple POWER8 CPUs (necessary if you need a larger number of PCIe devices). This is undoubtedly a QEMU deficiency which will be corrected in future releases. In the meantime, if you just care about playing around using a single POWER8 on a terminal, then Raptor's builds (either from BMC flash or downloaded) will suffice. However, if you intend to emulate a POWER9 or SMP POWER8 system, download QEMU's own pre-built skiboot.lid and use that instead.

For Petitboot, we will extract that directly from Raptor's PNOR images. Assuming you didn't get it using the process above, download the current Talos II PNOR image and decompress it. In the shell_upgrade directory you will see the bzip2-compressed PNOR image. Uncompress that, leaving you with a filename like talos-ii-v2.00.pnor. Download my pnorex extractor tool (it's in Perl, because I'm one of those people) and run it on the PNOR image:

% pnorex talos-ii-v2.00.pnor
Version 1 PNOR archive with 33 entries.
Extracting PAYLOAD at offset 8601.
This is a xz format image.
Wrote 1020K successfully.
Extracting BOOTKERNEL at offset 8857.
This is an ELF executable image.
Wrote 22012K successfully.
Extracted 2 partitions successfully.

If you will be using Raptor's Skiboot, then uncompress pnor.PAYLOAD to skiboot.lid as above: xz -d < pnor.PAYLOAD > skiboot.lid

Now, with skiboot.lid (for this first example, either Raptor's or QEMU's) and pnor.BOOTKERNEL in the same folder, grab an ISO you want to boot. I used the prefab one Daniel offers on the Void Linux for Power site since I know it boots fine on OpenPOWER hardware. For our first example, let's boot Void from a CD image on a POWER8 using the serial port. Our QEMU command line:

qemu-system-ppc64 -M powernv8 -m 4G -cpu power8 \
    -nographic \
    -bios ./skiboot.lid \
    -kernel ./pnor.BOOTKERNEL \
    -device ich9-ahci,id=ahci0 \
    -drive id=cd0,media=cdrom,file=void-live-ppc64le-musl-20200411.iso,if=none \
    -device ide-cd,bus=ahci0.0,drive=cd0

This configures a single-processor POWER8 system with 4GB of RAM, no graphics, and an Intel AHCI host controller with a single CD-ROM drive attached. The serial output should go to your terminal. It goes a little like this:

Here we are with Skiboot chaining into Petitboot. You can ignore the errors; there will be a lot of them since the platform is still incomplete. It will take a little bit of time to decompress the kernel (much slower than it would be on a regular system). You will notice a single device attached to the three available PCIe host bridges on the single POWER8 CPU, i.e., the host controller itself. Don't you just love that the vendor code for Intel is 8086?

This is Petitboot. When the bootable choices appear, cursor up to the starred option and press E before it autoboots, because we need to tell Void its console is the on-board serial port (otherwise it uses a VGA console: not sure whose bug that is).

Add console=hvc0 at the end, cursor down to OK and hit RETURN/ENTER a couple times to boot.

A successful login on your emulated baby POWER8. Ta-daa! To rudely pull the plug on the QEMU session, press Ctrl-A, and then X (QEMU: Terminated).

Let's now load out the POWER8. We would like to add a video card, an Ethernet card and a USB controller to our existing system, but POWER8 Turismo chips only offer enough PHBs for three PCI endpoints. How do we solve this problem? Easy: we'll add another processor!

At this point you will require the QEMU Skiboot and should use that where skiboot.lid appears in the remainder of this article. I use tun/tap networking in this example, which assumes you already have tap0 configured and up; change the -netdev setting if you want to use a different means of bridging the NIC. This example keeps the AHCI host controller and still displays debug output on the terminal, but uses the QEMU emulated VGA as a console instead and adds a good old Realtek 8139 NIC with a USB mouse and keyboard attached to a QEMU XHCI USB 3.0 controller.

qemu-system-ppc64 -M powernv8 -cpu power8 -m 4G -smp 2 \
    -serial mon:stdio \
    -device VGA \
    -device ich9-ahci,id=ahci0,bus=pcie.0 \
    -netdev tap,id=nic0,ifname=tap0,script=no,downscript=no \
    -device rtl8139,netdev=nic0,bus=pcie.1 \
    -device qemu-xhci,id=usb0,bus=pcie.2 \
    -device usb-mouse \
    -device usb-kbd \
    -bios ./skiboot.lid \
    -kernel ./pnor.BOOTKERNEL \
    -drive id=cd0,media=cdrom,file=void-live-ppc64le-musl-20200411.iso,if=none \
    -device ide-cd,bus=ahci0.0,drive=cd0

Let's spin this sucker like Superman's cape in a dryer:

The reason I keep the serial output is because the extra CPU adds around an extra minute on this T2 to get to Petitboot. Here, you will notice we now have six PHBs available, three per CPU, so now we have enough virtual PCI slots for the peripherals we require.

Petitboot shows up on both the 2D framebuffer and the serial terminal, and both work. You'll also see it probing the bridged Ethernet tap to see if it can boot that way, proving our Ethernet device is up and working. Whichever you use is where boot messages will go, so we'll use the framebuffer as console and start Void by cursoring up and selecting the starred option (thus also proving our USB devices work too).

Having booted Void, we can now demonstrate the PCI cards in the system, the attached peripherals and the number of CPUs. For the record, the DD2.3 POWER9 I'm typing this on shows its Spectre v2 status as "mitigated" with hardware acceleration.

Starting the Installer, which won't install anything because we haven't configured any storage to install to in our QEMU options. I'll leave that as an exercise to the reader.

If we switch to an emulated POWER9 system, Sforza CPUs support six PCI endpoints, so we get six PHBs. This means a single CPU is more than enough for our basic configuration without adding additional startup time. The QEMU command line to do so merely returns to single processor and changes the machine to powernv9 and the CPU to power9, i.e.,

qemu-system-ppc64 -M powernv9 -cpu power9 -m 4G \
    -serial mon:stdio \
    -device VGA \
    -device ich9-ahci,id=ahci0,bus=pcie.0 \
    -netdev tap,id=nic0,ifname=tap0,script=no,downscript=no \
    -device rtl8139,netdev=nic0,bus=pcie.1 \
    -device qemu-xhci,id=usb0,bus=pcie.2 \
    -device usb-mouse \
    -device usb-kbd \
    -bios ./skiboot.lid \
    -kernel ./pnor.BOOTKERNEL \
    -drive id=cd0,media=cdrom,file=void-live-ppc64le-musl-20200411.iso,if=none \
    -device ide-cd,bus=ahci0.0,drive=cd0

and it runs in the same way, but faster, because the emulation overhead is less. So let's totally do something stupid as our last parlour trick and run a POWER9 configuration with as many sockets as QEMU will let us hold (which right now is four). Note that these are all single-threaded cores, so this is still much less powerful than even a 4-core basic Blackbird.

./qemu-system-ppc64 -M powernv9 -cpu power9 -m 4G -smp 4 \
    -serial mon:stdio \
    -device VGA \
    -device ich9-ahci,id=ahci0,bus=pcie.0 \
    -netdev tap,id=nic0,ifname=tap0,script=no,downscript=no \
    -device rtl8139,netdev=nic0,bus=pcie.1 \
    -device qemu-xhci,id=usb0,bus=pcie.2 \
    -device usb-mouse \
    -device usb-kbd \
    -bios ./skiboot.lid \
    -kernel ./pnor.BOOTKERNEL \
    -drive id=cd0,media=cdrom,file=void-live-ppc64le-musl-20200411.iso,if=none \
    -device ide-cd,bus=ahci0.0,drive=cd0

With four emulated CPUs startup took over seven minutes from start to Petitboot on this dual-8 Talos II, so have patience if you're on a lesser workstation, but it does work:

You can see the watchdog complaining about the length of time OPAL calls are taking now (call 128 resets the XIVE VM interrupt controller on POWER9 chips). But we do have our four cores, and it's not impossibly slow on a beefy enough system (like another POWER9).
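
The PHB arithmetic behind these configurations is just a ceiling division. Here is a back-of-the-envelope sketch using only the figures quoted in this article (three PHBs per emulated POWER8 chip, six per POWER9) and assuming one PCI device per PHB; phbs_per_chip and chips_needed are hypothetical helper names, and other QEMU versions may well differ.

```shell
#!/bin/sh
# phbs_per_chip CPU: PHBs per emulated chip, using the figures quoted in
# this article (three for POWER8, six for POWER9).
phbs_per_chip() {
    case "$1" in
        power8) echo 3 ;;
        power9) echo 6 ;;
        *)      echo "unknown cpu: $1" >&2; return 1 ;;
    esac
}

# chips_needed CPU NDEVICES: smallest number of chips whose PHBs cover
# NDEVICES PCI devices, assuming one device per PHB (ceiling division).
chips_needed() {
    per=$(phbs_per_chip "$1") || return 1
    echo $(( ($2 + $per - 1) / $per ))
}

# VGA + AHCI + NIC + xHCI = four devices, as in the examples above.
echo "POWER8 chips needed: $(chips_needed power8 4)"
echo "POWER9 chips needed: $(chips_needed power9 4)"
```

That matches what we did by hand: two POWER8 chips for the loaded configuration, but only one POWER9.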

Incidentally, while the Power ISA emulation in QEMU allows SMT, it's very basic and not enough to get through the boot-up sequence, or at least not before the heat death of the universe. If you like listening to your cooling fans, see what happens when you try to emulate the biggest baddest dual-22 Talos II by adding -accel tcg,thread=multi -smp 176,threads=4,cores=22,sockets=2 to your QEMU command line. It's not pretty. That's why you should buy an OpenPOWER machine of your own instead of emulating one.
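
For the record, the 176 in that -smp option is simply sockets times cores times threads. A tiny sketch that builds the option from the topology (smp_arg is a hypothetical helper name, not a QEMU tool):

```shell
#!/bin/sh
# smp_arg SOCKETS CORES THREADS: print a QEMU -smp option whose total CPU
# count is the product of the three topology values.
smp_arg() {
    total=$(( $1 * $2 * $3 ))
    echo "-smp $total,threads=$3,cores=$2,sockets=$1"
}

smp_arg 2 22 4   # the dual-22, SMT-4 Talos II from the text
smp_arg 2 8 4    # a dual-8 with SMT-4
```

Whether the guest actually gets through boot with a given topology is another matter entirely, as noted.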

Posted by ClassicHasClass on April 03, 2020

Some updated notes on KVMPPC

A couple quick notes on KVMPPC that came out of the last article on Mac OS 9 in QEMU:

If you get a kernel Oops when unloading kvm_hv, you are bitten by this bug which should be fixed in recent kernel releases. (Thanks, Paul Mackerras.)
However, you don't have to do that anymore to use KVM-PR anyway. With QEMU 4.2 (at least on this Talos II running Fedora 31), if you use qemu-system-ppc64 with -M accel=kvm and explicitly specify a CPU which requires KVM-PR (such as, in our previous articles, a Motorola G4/7410 with -cpu nitro), then it will be used without having to unload KVM-HV. In the examples in the second half of the article, just change qemu-system-ppc to qemu-system-ppc64 and it will now "just work" (otherwise you'll get a weird error trying to start QEMU). That is, of course, assuming you have your kernel in HPT mode instead of radix, which is the default on POWER9.
Likewise, if you specify -cpu host (or power8 on a POWER9), KVM-HV should be used automatically, though you can explicitly specify you want HV with -M accel=kvm,kvm-type=HV as in our AIX on OpenPOWER example. Just don't tell IBM.

If you don't understand what all that meant, read our original article for a brief primer.
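
If you aren't sure which MMU mode your host kernel is in, here is a quick sketch for checking. It assumes your kernel reports an "MMU" line in /proc/cpuinfo reading Radix or Hash, which is what I believe recent powerpc kernels print; mmu_mode is a hypothetical helper name, and on anything else it just says unknown.

```shell
#!/bin/sh
# mmu_mode [FILE]: print "radix", "hash" or "unknown" based on the MMU line
# in a cpuinfo-style file (default /proc/cpuinfo).
mmu_mode() {
    mode=$(sed -n 's/^MMU[[:space:]]*:[[:space:]]*//p' "${1:-/proc/cpuinfo}" 2>/dev/null | head -n 1)
    case "$mode" in
        Radix*) echo radix ;;
        Hash*)  echo hash ;;
        *)      echo unknown ;;
    esac
}

case "$(mmu_mode)" in
    hash)  echo "HPT mode: KVM-PR should be usable" ;;
    radix) echo "radix mode: KVM-PR needs HPT (e.g. boot with disable_radix)" ;;
    *)     echo "could not determine MMU mode" ;;
esac
```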

Posted by ClassicHasClass on March 29, 2020

Getting closer to Classic on Linux on ppc64le

A recurring theme here is my personal opinion that OpenPOWER workstations like the Raptor family (particularly now that the Blackbird is a lower-cost option) are the Power Mac successors Power Mac holdouts should embrace. It's the same processor family, it's an open system, and Raptor machines have that pride of hand-built quality and solid engineering we got accustomed to with our G4 and G5 machines. We're still missing the ability to use KVM-PR to boot Mac OS 9 in QEMU on OpenPOWER systems, but with recent improvements in QEMU and a few third-party assists, I'm now to the point where I can nevertheless use my old Mac OS 9 apps productively like FrameMaker and Photoshop 7. If you run your own QEMU builds with -O3 -mcpu=power9 performance is more than enough for these apps and you can get the work done well. In the screenshot you can see me using Photoshop to make a demo image and saving it to the shared folder, which I then opened up in GNOME's image viewer to prove it transferred over. Et voila.

There are still some pain points left (other than the lack of working KVM-PR in Mac OS 9); sound is still somewhat of a work in progress if it works at all and there is no copy-paste. Many games won't work properly either. However, QEMU Mac OS 9 guests can (with assistance) now support tablets with much more natural mouse movement, using ndrv you can run at multiple convenient resolutions, files can be exchanged over AFP via Netatalk, and if you're willing to put up with a little clock madness QEMU can "sleep" so that you're not sucking CPU cycles in the background.

You'll need an ISO and/or CDROM of Mac OS 9 to get started. I won't provide you this and merely assume you have a perfectly legal source for one, such as ripping the installation CD you already own. (I own several.) Some of these steps I've already gone through in our earlier QEMU articles but they are reproduced here for convenience. These instructions are not OpenPOWER-specific in general and should work for other Linux systems (I'm on Fedora).

Create your disk image, usually qemu-img create -f qcow2 classic.img 40G or some such. We will refer to this image as classic.img in the example commands below.
If not already configured set up tap networking, typically with something like this for tap0, with your username being billg:
sudo ip tuntap add dev tap0 mode tap user billg
sudo ip link set tap0 up promisc on
sudo brctl addif virbr0 tap0   # optional, but handy for libvirt
This will persist until reboot.
Download the custom video driver qemu_vga.ndrv from Github. I put it in ndrv/qemu_vga.ndrv.
Set up your initial boot string. Mine looks like this, assuming you're booting from your Mac OS 9 ISO (in iso/922.iso):
qemu-system-ppc -M mac99,accel=tcg,via=pmu -m 1536 -drive id=root,file=classic.img,format=qcow2,l2-cache-size=4M -usb -netdev tap,id=mynet0,ifname=tap0,script=no,downscript=no -device rtl8139,netdev=mynet0 -rtc base=localtime -L ndrv -prom-env resolutions=640x480,800x600,1024x768,1152x864,1280x800,1440x900,1920x1080 -boot d -cdrom "iso/922.iso"
This gives you the maximum 1.5GB of RAM Mac OS 9 supports. You'll notice we haven't enabled the tablet yet; we'll get to that.
Start up Mac OS 9 and install to your hard disk. When this is done, shut down the emulated Mac (QEMU will close).
Download our "essentials" ISO from the Floodgap gopher server (Firefox will need either OverbiteWX or OverbiteNX). It contains this USB tablet driver, the freeware 1.4 version of USB Overdrive, the Network Time control panel (for talking to RFC 868 and NTP servers) and an RTL8139 Ethernet extension. Uncompress it and bring up your system booting from the hard disk image but with the ISO ready. We'll assume it's in iso/essential.iso:
qemu-system-ppc -M mac99,accel=tcg,via=pmu -m 1536 -boot c -drive id=root,file=classic.img,format=qcow2,l2-cache-size=4M -usb -netdev tap,id=mynet0,ifname=tap0,script=no,downscript=no -device rtl8139,netdev=mynet0 -rtc base=localtime -L ndrv -prom-env resolutions=640x480,800x600,1024x768,1152x864,1280x800,1440x900,1920x1080 -cdrom "iso/essential.iso"
Mac OS 9 should have StuffIt Expander built-in. Uncompress USB Tablet and put it in the System Folder's Startup Items (it's an app, not an INIT or CDEV). Uncompress the RTL8139 extension and put it in System Folder, Extensions. The others are optional but recommended; uncompress and install as they direct. Set your desired screen resolution of the ones available (I use 1440x900 on my 1080p display) and shut down again. Here's our last boot string change, adding the tablet this time:
qemu-system-ppc -M mac99,accel=tcg,via=pmu -m 1536 -boot c -drive id=root,file=classic.img,format=qcow2,l2-cache-size=4M -usb -netdev tap,id=mynet0,ifname=tap0,script=no,downscript=no -device rtl8139,netdev=mynet0 -rtc base=localtime -L ndrv -prom-env resolutions=640x480,800x600,1024x768,1152x864,1280x800,1440x900,1920x1080 -device usb-tablet
Incidentally, if you want to build the USB Tablet app yourself, I was able to do so by downloading the project, resource and source file, converting the .r and .c files to Mac line endings, and opening the project in CodeWarrior 7 on my MDD G4.
Mac OS 9 will now boot up, and once USB Tablet is run, your mouse will suddenly switch to tablet mode. No more mouse grabbing needed. Nice!
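
The three boot strings in the steps above share everything except their last few options, so if you get tired of pasting them, something like the following sketch can generate them. classic_cmd is a hypothetical name of my own, the paths are the ones assumed in the steps, and it only prints the command so you can check it first.

```shell
#!/bin/sh
# classic_cmd STAGE: print (not run) the QEMU command line for one of the
# three stages described above: install, essentials or tablet.
classic_cmd() {
    base='qemu-system-ppc -M mac99,accel=tcg,via=pmu -m 1536'
    base="$base -drive id=root,file=classic.img,format=qcow2,l2-cache-size=4M"
    base="$base -usb -netdev tap,id=mynet0,ifname=tap0,script=no,downscript=no"
    base="$base -device rtl8139,netdev=mynet0 -rtc base=localtime -L ndrv"
    base="$base -prom-env resolutions=640x480,800x600,1024x768,1152x864,1280x800,1440x900,1920x1080"
    case "$1" in
        install)    echo "$base -boot d -cdrom iso/922.iso" ;;
        essentials) echo "$base -boot c -cdrom iso/essential.iso" ;;
        tablet)     echo "$base -boot c -device usb-tablet" ;;
        *)          echo "usage: classic_cmd install|essentials|tablet" >&2; return 1 ;;
    esac
}

classic_cmd tablet
```

Once the output looks right, pipe it to sh to actually start the emulated Mac.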

Since I have GNOME set to respond to some muscle-memory key sequences like Command-Q (to close the window), I created an appmodmap bit for QEMU so that Command-Q would be passed to the emulated Mac and only Ctrl-Command-Q would actually quit QEMU. A command like gsettings set org.gnome.desktop.wm.keybindings close "['<Ctrl><Super>Q']" & should do it too.

The next thing to do is get networking operational, because this is how we'll share files. Regular EtherTalk doesn't work but AFP-over-TCP does and Netatalk supports it. (You could of course install something like DAVE and use samba, but Mac OS 9 has AFP-over-TCP support built in and Netatalk preserves resource forks.)

Configure your network within Mac OS 9 from the TCP/IP control panel. As we have it configured here it should be sufficient to just get an address over DHCP (from QEMU).
Set up Netatalk. Depending on how it's built you may or may not get any of the authentication options to work; mine didn't with the Fedora package even though the emulated Mac could talk to my usual Sawtooth G4 fileserver with a password. I eventually just set up guest access to a single interchange folder that runs as me, because all of my office systems including the T2 are on an independent non-routable network (if you don't have such a setup, consider binding it to an internal interface only). /usr/local/etc/afp.conf thus looks like this, assuming your username is billg:
[Global]
; Global server settings
vol preset = my default values
afp interfaces = enP4p1s0f0
uam list = uams_guest.so
guest account = billg

[my default values]
appledouble = ea
ea = samba

[Homes]
basedir regex = /home

[HFS Work Area]
path = /home/billg/HFS
This way some malicious Mac program (or more likely me fatfingering something) can't do anything more than trash the HFS folder from QEMU. enP4p1s0f0 is the NIC on the local network (from ip addr).
Go to the Chooser. Click AppleShare (notice no servers appear, even if you set AppleTalk to Ethernet) and then the Server IP Address button, and enter your host machine's local address. If you used my example above, connect as guest and you'll get offered a drive called "HFS Work Area." Open that and you will now have file sharing operational.
Finally, let's make it automatically mount. Since it's connected as a guest we need not preserve credentials because there are none to preserve (you could, if you wanted to, in the Mac OS 9 keychain). Make an alias of the network mount and drop it into the Servers folder in your emulated Mac's system folder. Shut down the emulated Mac and restart QEMU; the drive should automount, allowing you to instantly interchange files back and forth.
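
If you want to reuse the guest-only Netatalk configuration from the steps above with a different username or NIC, a small generator saves retyping it. This is only a sketch: the function name make_afp_conf is made up for illustration, and it assumes the same single "HFS Work Area" share under the user's home directory as in my example.

```shell
# Print a guest-only afp.conf with one shared folder.
# Usage: make_afp_conf USER INTERFACE
# (make_afp_conf is a hypothetical helper, not part of Netatalk.)
make_afp_conf() {
    user=$1
    iface=$2
    cat <<EOF
[Global]
; Global server settings
vol preset = my default values
afp interfaces = $iface
uam list = uams_guest.so
guest account = $user

[my default values]
appledouble = ea
ea = samba

[Homes]
basedir regex = /home

[HFS Work Area]
path = /home/$user/HFS
EOF
}

# Print the configuration for review; redirect it to a file once it looks right.
make_afp_conf billg enP4p1s0f0
```

Printing to standard output rather than writing /usr/local/etc/afp.conf directly means you can diff the result against what you already have before replacing anything.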

One more nice thing, which is optional, is to sleep the emulation when not in use. Sleeping from the Finder (i.e., Special, Sleep) doesn't work because the "lid is not closed" [sic], so to do this go to QEMU's Machine menu and check Pause. This does have a couple side effects, namely that after a few minutes the AppleTalk connection will be severed and the time will be wrong, since the clock doesn't resync. To reconnect the AppleTalk connection go to the Apple Menu, Recent Servers and select it; it will reopen.

The time is a little trickier. Mac OS 9 will talk to NTP servers, but it doesn't sync on demand, thus the Network Time CDEV we offered on the Essentials ISO. Not only will this let you fix the time right away (Apple Menu, Control Panels, Network Time and click Set Time) but you can also adjust for post-2002 daylight savings in your locale by defining a new, corrected timezone.

Once we figure out the issues with split-hack and KVM-PR, we should be able to get a dramatic speed boost on OpenPOWER systems. But even as it is, the software runs well and my sizeable investment in classic Mac applications is preserved. Sure, you could do this on a PC or a modern Mac, but I'd rather do it on what every Power Mac wants to be when it grows up (and when virtualization is working should be even better).

Posted by ClassicHasClass on July 30, 2019

Qubes == Dollar$

Well, bitcoins, anyway. I'm of two minds on software bounties personally: there's nothing like money for bringing interest to a new platform and bounties do directly subsidize development, but they tend to attract mercenary types who may not have interest in the platform otherwise and they rarely cover the full actual development cost. Moreover, while they do usually yield software projects that work, at least for whatever the definition of "work" was, in many cases they subsequently bitrot and become unmaintained (or unmaintainable) due to the community lacking the technical expertise they put the bounty up for in the first place. As a relevant example, this happens a lot in the Amiga community, where people just try to throw money at the software gaps; many projects get finished but few have lasting significance (Timberwolf comes to mind but there are others), and these wrinkles clearly distinguish bounties from crowdfunding where a presumably already interested party just needs resources to finish the work they already want to do.

Nevertheless, it's still a step in the right direction, and there is lots of interest in our higher-security OpenPOWER world in running a higher-security operating system. Qubes OS certainly has the chops with its strict(er) security-by-isolation approach and its multiple operating domains. Qubes, however, is based on the Xen hypervisor and not KVM, and they make a cogent case for why, i.e., it doesn't rely on the Linux kernel to do proper isolation and Xen is more self-contained, smaller and thus more auditable (see the PDF specification). Unfortunately, while Xen used to support PowerPC through version 3.2 (so-called "XenPPC"), it doesn't look like work has been done on Power ISA compatibility in almost a decade and it certainly doesn't support the later features exploited by KVM-HV needed for high-throughput on modern Power CPUs.

Some work on getting a KVM-based strategy "good enough" for Power has already been done, and there are some encouraging statements from Qubes developers on what they would consider an acceptable security target. (However, this work was started by Shaun "Mr. Chromium on POWER9" Anastasio, which sort of proves my point that people who are already interested will do the work, bounty or not.) My impression is that there is still a fair amount of work to be done and that brings us to the moolah.

While the "task" has not actually been well-defined in the GitHub issue referenced (it's not actually "deliver Qubes OS that can boot on POWER9 (and the head of John the Baptist);" it reads to me more like "do the systems work to either get KVMPPC up to snuff or deliver a working alternative foundation"), the task is certainly well-funded: 2 BTC, currently US$19,368, and the potential for another matching donation of 2 BTC to equal 4 BTC. Thirty-eight grand is definitely enough money to get anyone's attention, though don't ask me, because I don't know a great deal about Qubes' internals and I'm still trying to do this Firefox JIT thing in my "copious" spare time. But if you do, and you've got the hardware and you've got the need, step right up.

Meanwhile, Shaun struck again and ported BSNES. What was that I said about bounties and people who were already interested?

Posted by ClassicHasClass on November 03, 2018

Making your Talos II into an IBM pSeries

This post has been updated with new information. Thanks to Zhuowei Zhang, the author of the post we reference, for pointing out QEMU did add SMP support for emulated pSeries hardware. Read on.

In our previous series on turning your Talos II into a Power Mac, we spent most of our time with the KVM-PR virtualizer, the "problem state" version of KVMPPC, which is lower performance but has no hardware dependencies and can emulate a great number of historical Power CPUs (including the G3 and G4, which were of most relevance to those articles).

Recently, however, someone pointed me to this blog post on running IBM's proprietary AIX operating system under QEMU and asked about how well this would work on the Talos II. AIX runs on IBM's own POWER hardware and thus affords a good opportunity for exploring KVM-HV, the hardware-assisted hypervisor flavour of KVMPPC, so let's find out.

Parenthetically I should say that I have a very long history with AIX: my first job out of college in 1997 was mostly working on a medium-size PA-RISC university server running HP-UX 10.20, but we also had a number of RS/6000 machines for E-mail running AIX 3.2.5 that I had access to as well. The RS/6000s are, of course, early implementations of the POWER architecture. In 1998, I ended up with an Apple Network Server 500 running AIX 4.1.4 (and later 4.1.5) that became the first floodgap.com until it was decommissioned in 2012. Its replacement was a 2-way SMT-2 IBM POWER6 p520 Express running AIX 6.1 TL.mumble with some hand-rolled patches, and this system still runs floodgap.com and gopher.floodgap.com today. I also have a couple of the oddball PowerPC ThinkPads, a ThinkPad "800" whose SCSI controller fuse got blown by a SCSI2SD upgrade, and a fully functional ThinkPad 860 with a German keyboard running AIX 4.1.5 as well.

I should also add that the licensing situation with AIX on non-IBM hardware is sticky. I may give the lawyers a heart attack with this oversimplification, but the salesdroids I worked with back in the day essentially had the rule that if you own IBM hardware that can run AIX, then you may run it, because you were considered to have an implicit license simply by possessing the hardware. This situation changed after IBM introduced pSeries hardware that was not allowed to run AIX, starting with the original POWER5 OpenPower machines: even though they are IBM hardware, they are not licensed for AIX, even though you allegedly could coerce AIX to run on at least a subset of these machines with some work.

This handwavy "some work" is what QEMU provides. There is enough of a pSeries-like environment to at least boot AIX, though some pieces are still missing and the kernel appears able to detect it's running under QEMU. However, whether it functions or not, it may not be legal to run an AIX installation on an OpenPOWER or PowerNV system like the Talos II even under virtualization because OpenPOWER and non-IBM Power ISA systems are explicitly not licensed for AIX. IBM is unlikely to come after you if you're just playing around with it, but you have been warned.

First of all, make sure your system is able to run QEMU under virtualization. You should be running at least kernel version 4.18 (my Fedora 28 T2 has 4.18.16) and QEMU 3.0. Check that kvm_hv shows up in lsmod to make sure it has loaded. You shouldn't need to make any modifications to it for this tutorial. If it hasn't loaded, try sudo modprobe kvm_hv to make sure the modules are enabled (check the dmesg if you get errors). There shouldn't be any problem if your kernel boots in HPT instead of radix MMU mode as mine does to enable KVM-PR.
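
The checks above can be sketched as a small preflight script. Treat it as an illustration rather than anything official: the function names are invented here, and the version comparison only looks at the major and minor numbers reported by uname -r.

```shell
# Succeed if kernel release $1 (e.g. 4.18.16-200.fc28.ppc64le) is at least
# major.minor $2 (e.g. 4.18). Only major and minor are compared.
version_at_least() {
    have=${1%%-*}              # drop any distro suffix after the first dash
    have_major=${have%%.*}
    rest=${have#*.}
    have_minor=${rest%%.*}
    want_major=${2%%.*}
    want_minor=${2#*.}
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Report whether the running kernel and the kvm_hv module look usable.
kvm_hv_preflight() {
    if version_at_least "$(uname -r)" 4.18; then
        echo "kernel $(uname -r): ok"
    else
        echo "kernel $(uname -r): older than 4.18"
    fi
    if lsmod 2>/dev/null | grep -q '^kvm_hv'; then
        echo "kvm_hv: loaded"
    else
        echo "kvm_hv: not loaded (try: sudo modprobe kvm_hv, then check dmesg)"
    fi
}
```

Run kvm_hv_preflight before starting QEMU; it only reports and never loads the module for you.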

Next, get bootable media. Although I have a set of install discs for AIX 7, the version I have is too old to boot on POWER9 systems (it's intended for when I get around to it with my POWER6), so for this demonstration we'll simply use the diagnostic image that the author of the blog post above uses. Although any of the diagnostic images compatible with POWER9 will work, download the CD72220.iso image to use the patch tool that author offers. This enables you to boot to a limited root shell to snoop around the filesystem. I haven't gotten around to updating the patcher for the more recent images, but this one will suffice for our purpose.

QEMU provides a graphical console and USB keyboard, but just like a real IBM system, only specific IBM-supplied devices are supported as the AIX console terminal (my own POWER6 requires a particular IBM USB keyboard and mouse, naturally provided at a confiscatory markup, to drive a console powered by a GXT145 graphics card). Since QEMU doesn't know how to provide these devices yet, we'll tell QEMU to provide an emulated serial terminal connected to one of the emulated system's VTYs instead, which will "just work." This emulated serial terminal is provided in the terminal session you run QEMU from, not the main QEMU window.

AIX will boot under TCG, the built-in JITted CPU emulation system. This is very slow but will demonstrate the speed differential versus running with hardware assistance. The same command line provided in the original blog post will work here too (I recommend keeping verbose booting enabled if you run with TCG so you can be reassured QEMU hasn't frozen); substitute your ISO filename below:

qemu-system-ppc64 -cpu POWER9 -machine pseries -m 2G -serial mon:stdio -cdrom iso/aix-72220-patched.iso -d guest_errors -prom-env "input-device=/vdevice/vty@71000000" -prom-env "output-device=/vdevice/vty@71000000" -prom-env "boot-command=dev / 0 0 s\" ibm,aix-diagnostics\" property boot cdrom:\ppc\chrp\bootfile.exe -s verbose"

When QEMU starts, just stay in the terminal session and minimize its graphical console; you won't be using it. Booting under TCG takes about seven minutes on my 32 thread (dual 4-core SMT-4) Talos II with QEMU built with -O3 -mcpu=power9. As the original author indicates, the boot will stall for some minutes (about six on my system) at the define_rspc step. You'll also notice four-digit hex codes appearing at the bottom of the terminal session representing the state of the bootloader which any AIX admin will recognize (real IBM hardware and the Apple Network Server display this on a front LCD or LED panel). Once the system prompts you to press 1 and press ENTER, do so, and it will either enter the diagnostics menu or the root shell depending on if you're using the patched ISO or not. This is sufficient to show it basically works but you will already appreciate this is dreadfully slow for any task of substance.

So, kill the QEMU process (or close the graphical console window) and let's bring it up with KVM-HV this time. SMP is supported, so let's give it four cores while we're at it to start with. You can continue to use a verbose boot if you want but this starts up so quickly you'll probably just find the messages annoying. As above, substitute your ISO filename below (if you get an error saying that the KVM type isn't supported and you know that kvm_hv is loaded, try booting it with just accel=kvm):

qemu-system-ppc64 -M accel=kvm,kvm-type=HV -cpu host -smp 4 -machine pseries -m 2G -serial mon:stdio -cdrom iso/aix-72220-patched.iso -d guest_errors -prom-env "input-device=/vdevice/vty@71000000" -prom-env "output-device=/vdevice/vty@71000000" -prom-env "boot-command=dev / 0 0 s\" ibm,aix-diagnostics\" property boot cdrom:\ppc\chrp\bootfile.exe"

Notice that we are using -cpu host. KVM-HV only supports virtualizing the actual CPU itself or the generation immediately before (-cpu power8 thus should work, but not -cpu power7 or before).
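
Since the TCG and KVM-HV command lines differ only in their leading options, one way to keep a single launcher script is to isolate that difference in a helper. This is a sketch with a made-up function name (pseries_accel_args); the options it prints are exactly the ones used in the two command lines above.

```shell
# Print the accelerator-specific options for the emulated pSeries.
# Usage: pseries_accel_args tcg|kvm [cores]
# (hypothetical helper; the options come from the command lines in this post)
pseries_accel_args() {
    case "$1" in
        tcg) echo "-cpu POWER9" ;;
        kvm) echo "-M accel=kvm,kvm-type=HV -cpu host -smp ${2:-4}" ;;
        *)   echo "usage: pseries_accel_args tcg|kvm [cores]" >&2; return 1 ;;
    esac
}

pseries_accel_args tcg
pseries_accel_args kvm 4
```

In a launcher you would put $(pseries_accel_args kvm), unquoted so that it splits into separate words, immediately after qemu-system-ppc64 and follow it with the remaining options that both command lines share.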

Once started, this virtualized boot shoots straight to the "press 1 on console" message in about 50 seconds on my box (!!), and all the way to the diags menu/root shell prompt in just under one minute. Much faster! As you explore the command line, do note that there are many missing binaries in the miniroot the diags disk provides and the terminal emulation (and my delete key: I manually backspaced with CTRL-H) has many glitches. This is to be expected since this disc was never meant to provide a shell environment and the components of the miniroot exist only to support the diagnostics front end. (In addition, it is not possible to actually configure the terminal correctly from the diags menu and therefore do anything useful, probably due to missing support in QEMU. Even if you enter a valid terminal type, the diagnostics front end will continue to complain the terminal was improperly initialized and prevent you from doing anything further.)

Nevertheless, once you get a root shell up, it's interesting to compare lsattr -E -lsys0 on real IBM hardware and on this emulated system. On my POWER6, here are some selected entries (I censored the system ID from the hardware VPD, nothing personal):

ent_capacity 2.00             Entitled processor capacity
frequency    2656000000       System Bus Frequency
fwversion    IBM,EL350_149    Firmware version and revision levels
modelname    IBM,8203-E4A     Machine name
systemid     IBM,{censored}   Hardware system identifier

But some values are definitely different (and occasionally abnormal) on the emulated pSeries system. Some are even missing outright despite having a placeholder. Here are the corresponding ones from our virtualized 4-core box:

ent_capacity 4.00                            Entitled processor capacity
frequency                                    System Bus Frequency
fwversion    SLOF,HEAD                       Firmware version and revision levels
modelname    IBM pSeries (emulated by qemu)  Machine name
systemid                                     Hardware system identifier

The difference in entitled processor capacity is due to our command line options, but the frequency field (which lsattr labels as the system bus frequency) is oddly unreported and the various other identifiers have different values or are unpopulated. This is possibly how the kernel was able to detect it's running under virtualization.

If you're curious what other hardware support is present, lsdev looks like this (with the given command line):

# lsdev
L2cache0   Available       L2 Cache
cd0        Available       N/A
mem0       Available       Memory
pci0       Available       PCI Bus
proc0      Available 00-00 Processor
proc8      Available 00-08 Processor
proc16     Available 00-16 Processor
proc24     Available 00-24 Processor
rcm0       Defined         Rendering Context Manager Subsystem
sys0       Available       System Object
sysplanar0 Available       System Planar
vio0       Available       Virtual I/O Bus
vsa0       Available       LPAR Virtual Serial Adapter
vscsi0     Available       N/A
vty0       Available       Asynchronous Terminal

The (in)famous AIX smit system configuration tool can be made to work from the command line; try something like TERM=vt100 /usr/bin/smitty to start it. As we say in the biz™, "smit happens."™ Use CTRL-L to repaint the screen if needed; if you see key combinations like "Esc+0," press ESC, release it, and then quickly press the second key. Note that this version of smit is missing quite a few screens and not everything does anything.

To bring down the system cleanly, not like it really matters here, just type exit at the shell, eject the virtual CD if you want to (Y or N), and then indicate to halt the system (H). AIX will respond with Halt completed and QEMU will automatically exit.

IBM used to be a lot more interesting with AIX. AIX 4 in particular offered a lot of workstation features and even a few games (my ANS 500 has AIX ports of Quake and Abuse on it), but modern versions are intended as buttoned-down server OSes and any client functionality is either accidental or secondarily grafted on. That said, after AIX 5L it got a lot easier to build stuff on AIX (either with xlc or gcc) and my full-service POWER6 (web, gopher and E-mail) runs a good collection of servers and utilities I ported myself plus all my old binaries I built on the Apple Network Server without comment. AIX is definitely different (and arguably staid and humourless) and its underpinnings such as the ODM may not be immediately familiar, but it's a tough OS that can take punishment and run like a tank, and I have to admit that I do love the jackboots. Despite having my own real hardware, it is fun to see it boot and run on the Talos even if only in a limited sense.

Posted by ClassicHasClass on September 21, 2018

Nested virtualization coming to POWER9

On the KVMPPC mailing list, Paul Mackerras posted for comments a new set of updates to KVM-HV allowing POWER9 systems in radix MMU mode to finally nest virtualization (i.e., run a virtualized POWER9 guest within another virtualized POWER9 guest through KVM-HV). This is not only a big boon to shops that run Power ISA virtual machines in terms of enhanced security and portability, but also offers the potential for improved debugging and development.

As you will no doubt recall from our previous series on turning your Talos into a Power Mac, the Kernel-based Virtual Machine functionality on Power ISA and PowerPC comes in two flavours: KVM-PR, which emulates supervisor instructions in software and thus is slower but more flexible and can be nested, and KVM-HV, which uses hardware hypervisor support in later Power ISA chips and is faster, but cannot emulate most earlier CPUs and previously could not be nested (though a KVM-PR guest can run within a KVM-HV guest, and additional KVM-PR guests within that).

With these patches, nested KVM-HV guests are now possible, and can run at nearly full speed. Let's define the base hypervisor to be at level 0 ("L0"). L0 can use the hardware virtualization support to run a guest at level 1 ("L1"). An L1 guest, however, currently cannot do the same thing, so it can't spawn any additional nested VMs under its own control. The trick with these patches is to add hypercalls to allow an L1 guest to ask the L0 hypervisor to create another guest on its behalf, but set up address translation that the L1 guest can manipulate. The new guest is actually another L1 guest, but it looks like an L2 guest because L0 will in effect translate the fake L2's addressing requests through the L1 guest that requested it using a combination of instruction emulation and paravirtualization. The emulated L2 guest should be able to then turn around and request a new VM itself, and the L0 hypervisor will make another L1 guest that the faux L2 guest can control that acts like an L3 guest, and thus turtles all the way down.

Because it is still inherently KVM-HV, however, it inherits all of its basic limitations such as only supporting the current processor generation and the one immediately preceding it. In addition, the current nested guest implementation relies on radix MMU mode, the default MMU mode of the POWER9 (KVM-PR requires hashed page table MMU mode), meaning it does not support earlier Power ISA generations that only support hashed page tables. The patches are out for comments on the mailing list and hopefully will be incorporated into the Linux kernel tree in the very near future.

Posted by ClassicHasClass on August 29, 2018

Making your Talos II into a Power Mac: dcbz considered harmful (part 2)

In the first part of this article we talked about getting your Talos II prepped to emulate a Power Mac using KVMPPC, the kernel virtualization facility in Linux. Having followed the instructions in that article, you've got your kernel in hash table mode, you've got the KVM-PR kernel module loaded (and patched it if necessary), you installed (or built) QEMU, and you have a blank QEMU disk image ready to go.

For this part, we will assume you have chosen 10.3 Panther, 10.4 Tiger or 10.5 Leopard to install. I will discuss Leopard relatively little other than how to get you started in it; most of what applies to Tiger applies to Leopard as well. I'll briefly discuss booting OS 9 with TCG at the end.

Before starting, since we will use tun/tap networking, make sure the interface is up before booting. On Fedora, I do something like this:

sudo ip tuntap add dev tap0 mode tap user [your username]
sudo ip link set tap0 up promisc on

and, if you use libvirt,

sudo brctl addif virbr0 tap0
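
Because this has to be redone after every reboot, it can be convenient to wrap the tap setup in a helper. The sketch below only prints the commands so you can look them over first; the function name tap_setup_cmds is invented for this example, and the commands themselves are the same ones shown above.

```shell
# Print (but do not run) the commands that bring up a tap interface for QEMU.
# Usage: tap_setup_cmds USER [TAPDEV] [BRIDGE]
# (hypothetical helper; the bridge step is emitted only if BRIDGE is given)
tap_setup_cmds() {
    user=$1
    tap=${2:-tap0}
    bridge=${3:-}
    echo "ip tuntap add dev $tap mode tap user $user"
    echo "ip link set $tap up promisc on"
    if [ -n "$bridge" ]; then
        echo "brctl addif $bridge $tap"
    fi
}

# Example: the libvirt case, attaching tap0 to virbr0.
tap_setup_cmds billg tap0 virbr0
```

Once the output looks right, you can run the printed lines by hand with sudo, or pipe them into a root shell.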

For filesharing you could set up either Samba or Netatalk. I use Netatalk, since I'm more accustomed to AppleTalk and it enables my T2 to serve files over AFP to the other classic Macs here, and it also will work fine with Mac OS 9 if you want to use that at some point.

Let's begin by constructing the command line to boot your emulated Mac from disc and install the OS. Each OS does better currently with certain combinations of emulated CPU and hardware features. In addition, we also need to make sure that the emulator stays within a single core for better performance (you will get random system stalls if it moves over to another core and throughput will be generally impaired), so we need to set affinities appropriately.

We'll go with 10.4 for our example; substitute for your OS of choice where relevant. Start out with

taskset -a -c 0-3 qemu-system-ppc -M

This binds all of QEMU's threads to a single core (recall that the T2's Sforza cores are SMT-4 and each hardware thread appears as a logical CPU, so logical CPUs 0-3 together make up one core). While QEMU spawns more than four threads, encompassing two cores (i.e., 0-7) has no noticeable performance benefit and can sometimes unsettle Mac OS X by making timing loops unpredictable.
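
If you'd rather pin QEMU to a core other than the first one, the range for taskset follows from the SMT width. This sketch assumes logical CPUs are numbered contiguously within each core, which matches the 0-3 and 0-7 ranges above but is worth confirming on your own system (lscpu -e will show the mapping); core_cpus is just an illustrative name.

```shell
# Print the logical CPU range for one physical core.
# Usage: core_cpus CORE [SMT_WIDTH]   (SMT width defaults to 4)
# Assumes contiguous numbering: core N owns CPUs N*SMT .. N*SMT+SMT-1.
core_cpus() {
    core=$1
    smt=${2:-4}
    first=$(( core * smt ))
    last=$(( first + smt - 1 ))
    echo "$first-$last"
}

core_cpus 0   # prints 0-3
core_cpus 1   # prints 4-7
```

The result drops straight into the command line, e.g. taskset -a -c "$(core_cpus 1)" followed by the usual qemu-system-ppc invocation.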

For the -M option, we will specify mac99 and kvm. The OSes differ on what they prefer for the VIA. 10.3 and 10.4 need to run the emulated mac99 with an emulated CUDA chip onboard, or the OS is unable to detect the real-time clock. 10.5, however, requires the later PMU attached to the VIA. So that gets us to

taskset -a -c 0-3 qemu-system-ppc -M mac99,accel=kvm,via=cuda (10.3, 10.4)
taskset -a -c 0-3 qemu-system-ppc -M mac99,accel=kvm,via=pmu (10.5)

All three of these OSes work fine emulating a 7400-series G4. We will use the "Nitro" 7410 (-cpu nitro), which is a bit faster than the G3 (-cpu G3). 10.3 may have some problems with assigning more than 1.5GB of RAM (-m 1536), but 10.4 and 10.5 work fine with 2GB (-m 2048). Don't use more than 2GB of RAM; it will cause various problems. A verbose boot is helpful in case you accidentally did something wrong (-prom-env boot-args=-v). We'll specify our disk image and some tuning parameters (-drive file=[filename].img,format=qcow2,l2-cache-size=4M), and say boot from the CD or DVD (-boot d -cdrom "/dev/cdrom"). Lastly, we'll enable the emulated RTL8139 NIC and USB tablet (-netdev tap,id=mynet0,ifname=tap0,script=no,downscript=no -device rtl8139,netdev=mynet0 -usb -device usb-tablet) and use a sane screen resolution (-g 1024x768x32). For my 10.4 booter, the full command line looks like this (using the filenames I use on this system):

taskset -a -c 0-3 qemu-system-ppc -M mac99,accel=kvm,via=cuda -cpu nitro -m 2048 -prom-env boot-args=-v -boot d -cdrom /dev/cdrom -drive file=tigerhd.img,format=qcow2,l2-cache-size=4M -netdev tap,id=mynet0,ifname=tap0,script=no,downscript=no -device rtl8139,netdev=mynet0 -usb -device usb-tablet -g 1024x768x32

I strongly suggest saving this as a shell script so that you can make any necessary variations. Insert your OS CD or DVD and run the script. It should go into the installer. If it didn't, make sure your filenames are correct, that you have OpenBIOS installed (it comes with QEMU) in a location the emulator can see, and that the KVM kernel modules (both kvm and kvm_pr) are loaded by checking lsmod.
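
As a starting point for such a script, the per-OS choices described above (CUDA versus PMU, and how much RAM to assign) can be captured in two small helpers. This is a sketch under my own naming (mac99_machine and mac99_memory are not QEMU features), and it only prints the beginning of the command line rather than launching anything.

```shell
# Machine options for -M, chosen by Mac OS X version (10.3, 10.4 or 10.5).
# The optional second argument selects the accelerator (kvm or tcg).
mac99_machine() {
    case "$1" in
        10.3|10.4) echo "mac99,accel=${2:-kvm},via=cuda" ;;
        10.5)      echo "mac99,accel=${2:-kvm},via=pmu" ;;
        *)         echo "unsupported OS version: $1" >&2; return 1 ;;
    esac
}

# RAM in MB: 10.3 may misbehave above 1.5GB; 10.4 and 10.5 are fine with 2GB.
mac99_memory() {
    case "$1" in
        10.3) echo 1536 ;;
        *)    echo 2048 ;;
    esac
}

# Print the start of the command line for review; nothing is executed here.
echo "taskset -a -c 0-3 qemu-system-ppc -M $(mac99_machine 10.4) -cpu nitro -m $(mac99_memory 10.4)"
echo "taskset -a -c 0-3 qemu-system-ppc -M $(mac99_machine 10.5 tcg) -cpu nitro -m $(mac99_memory 10.5)"
```

Keeping the accelerator as an argument makes it a one-word change to flip the same script between KVM and TCG.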

Once the installer has booted you can of course directly proceed to installation in KVM, but I actually recommend shutting down the emulated Mac at this point and bringing everything back up in TCG to get the OS installed. To do that, just use the same command line, but change accel=kvm to accel=tcg. As I mentioned in the first part, heavy I/O loads tend to be less performant on KVMPPC, and installing and upgrading an OS is a pretty heavy I/O load, so running it in TCG will complete the task more quickly and more reliably.

If you want to run Software Update to bring your emulated Mac up to date, it's probably best to also do this in TCG. You could also separately download one of the combo installers (such as the one for 10.4.11) and push that to the emulated Mac on your Samba or Netatalk AFP share.

When the OS is installed, remove the CD-ROM from your command line unless you want to keep it, and change the -boot argument to -boot c to boot from the emulated drive image.

Ta-daa!

For best results with video updates, make sure that the display settings inside System Preferences match your physical display. I'm in 32-bit colour, so I made sure that System Preferences was using Millions instead of Thousands of colours. Because of variabilities in timing, you may notice the OS X clock is close but may seem to run somewhat unsynchronized from your host's clock because of how the delay loop might have been calibrated at bootup. This is mostly just a nuisance.

The next step is optional, but hacks KVMPPC to improve performance of the emulated Mac. Right now we're actually fooling the operating system; we're not really a G4. In fact, the closest Power Mac relative to the T2's POWER9 is the G5, i.e., the PowerPC 970, which is essentially a POWER4 with some modifications for workstation duty and a bolted-on AltiVec unit. Even though we told the OS we're a G4, this doesn't change the attributes of the CPU, in particular for this case the specific instructions it does and does not support and how certain others are handled.

With "big POWER" IBM removed some of the PowerPC instructions that were infrequently used or scaled badly, such as dcba and mcrxr. You don't need to know what these do; just know they were used in some software, but as of the G5 ceased to exist in hardware. Additionally, the G5 and later big POWER designs (including the POWER9) also have a 128-byte cache line instead of the 32-byte cache line of the G3 and G4, which is relevant to the dcbz instruction as it zeroes an entire cache line and potentially spills it to memory. OS X has adaptations for dealing with these cases (an illegal instruction handler in the first case that simulates the instructions in software, and modified system routines in the second), but that only happens if OS X knows the machine is a G5. In this case, it doesn't, so these adaptations are never installed.

KVM-PR gets around the dcbz problem on later POWER designs, including the POWER9, by scanning every new code page in a 32-bit guest for the dcbz instruction and replacing it with an illegal one it can detect. (Remember, it's still a legal instruction; it just behaves differently.) When executed it faults and falls back to KVM-PR, which simulates a 32-byte dcbz instruction in software, and returns control to the guest. It's not a surprise that this process is quite slow, especially if it gets called in a loop. Unfortunately Apple does just exactly that for clearing memory and the instruction is a major portion of the OS' built-in implementation of bzero, which is also called by memset. This is a hot routine and needs to run fast. The G5 version knows about the cache line difference and accounts for it; the G4 and G3 versions don't, and we're using the G4 version.
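
To see why the cache line size matters, it helps to work out which bytes a single dcbz would clear. The arithmetic below is only an illustration of the alignment involved (the address 0x1050 is arbitrary and dcbz_range is a made-up name); it is not how KVM-PR implements the simulation.

```shell
# Print the byte range cleared by a dcbz at ADDRESS for a given cache line size.
# The instruction zeroes the whole line containing the address, so the start
# is the address rounded down to a multiple of the line size.
dcbz_range() {
    addr=$(( $1 ))
    line=$2
    start=$(( addr & ~(line - 1) ))
    end=$(( start + line - 1 ))
    printf '0x%08x-0x%08x\n' "$start" "$end"
}

dcbz_range 0x1050 32    # what G3/G4 code expects: 0x00001040-0x0000105f
dcbz_range 0x1050 128   # what a G5 or POWER9 clears: 0x00001000-0x0000107f
```

Code written for a 32-byte line would thus find 96 bytes more than it bargained for wiped out on the newer chips, which is why the instruction has to be intercepted for a G4 guest.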

Apple, however, also helped us out here a little bit by allowing us to guess where the routine is. This and other major components live in a section of memory called the "commpage," which is always located in the top eight pages of the 32-bit addressing space in every process. It is provided by the kernel as an optimization for fast access to important data and common routines. The bzero routine is virtually unchanged from 10.3 to 10.4, and both start with a very unique instruction (cmplwi cr7,r4,32). If we see this instruction in the commpage, we can be confident we have found bzero. And now that we've found it, we can modify it.
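
For the curious, both numbers involved can be worked out by hand. The sketch below assumes the usual 4K page size of 32-bit PowerPC and the standard encoding of cmplwi (the cmpli instruction, primary opcode 10, with the L bit clear); the function names are made up, and the actual patch may match the instruction differently.

```shell
# Base address of the top N pages of a 32-bit address space.
# Usage: top_pages_base PAGES [PAGE_SIZE]   (page size defaults to 4096)
top_pages_base() {
    pages=$1
    page_size=${2:-4096}
    printf '0x%08x\n' $(( 0x100000000 - pages * page_size ))
}

# Encode "cmplwi crF,rA,UIMM" as a 32-bit instruction word:
# opcode 10 in the top 6 bits, then the CR field, L=0, the register, the immediate.
# Usage: encode_cmplwi CRFIELD REG UIMM
encode_cmplwi() {
    printf '0x%08x\n' $(( (10 << 26) | ($1 << 23) | ($2 << 16) | $3 ))
}

top_pages_base 8        # prints 0xffff8000
encode_cmplwi 7 4 32    # prints 0x2b840020
```

So under these assumptions the commpage starts at 0xffff8000, and the leader instruction cmplwi cr7,r4,32 is the single word 0x2b840020.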

Recall I mentioned that KVM-PR must scan each new executable code page for the instruction and change it. We can alter KVM-PR to detect that unique leader instruction if it's mapping in the commpage, and then monkeypatch in a new routine that doesn't use dcbz and thus won't require slow simulation. To make it more reliable, we know where the location should be, so we'll only patch it if it's actually there. As a bonus we'll also map dcba to nop anywhere in an executable section so that it doesn't need a trip to a special handler either. That is what this patch does.

Building KVMPPC with this patch uses the same steps we discussed for building and installing the kernel modules in part 1. This patch also applies with -p1.

Does it make a difference? You bet it does. On my system with Geekbench 32-bit on Mac OS X 10.4.11, it improved the overall benchmark by nearly 200 points over the unpatched version, almost all of it in (no surprise) the memory score.

This consequence of masquerading as a different CPU also carries over into which software you can run. Even though this is a G4, you actually have to run the G5 version of TenFourFox, which doesn't have any of the other illegal instructions that aren't patched (just be patient -- it will take TenFourFox almost a full minute to come up). If your software offers a G5 version, you should run that if you can. The discontinuity leads to amusing discrepancies like this one.

Interestingly, TCG on POWER9 actually had errors during SunSpider that the JIT in TenFourFox under KVMPPC doesn't, and even with the warmup was up to twice as slow as KVM at SunSpider. Go TenFourFox!

You'll find that performance is still fairly pedestrian even with KVMPPC. While the OS typically benchmarks my T2 as a "2.04GHz G4" (TCG usually gets computed as somewhere between "900 MHz" and "1.0GHz"), the actual throughput you get varies greatly on workload. Raw CPU performance is a bit better than my Quad G5 scores running single core in Reduced mode, though the Quad running full tilt easily surpasses it (the emulation overhead is only reduced, not eliminated). The numbers get a lot different in applications depending on how their workload is structured. For example, TenFourFox's G5 JIT in KVMPPC gets about 6800ms in SunSpider compared to around 3800ms on a "real" 1GHz iMac G4. Improving these numbers to get parity, and especially getting QEMU to support SMP, will need to be an area of active future development.

Lastly, I said I'd talk about the best way to run OS 9 on a Talos. Although limited to TCG, it's still pretty snappy, a testament to Mac OS 9's comparatively low system requirements. Mac OS 9 works better with the PMU than the CUDA (or you get problems with the mouse not responding to double clicks reliably) and is limited to 1.5GB of RAM. It also doesn't support the QEMU USB tablet, but it does support the RTL8139 with this driver. To get the driver installed, I just made an ISO image containing it, dropped the driver into the Extensions folder and rebooted. My command line looks like this:

qemu-system-ppc -M mac99,accel=tcg,via=pmu -m 1536 -boot c -drive file=classic.img,format=qcow2,l2-cache-size=4M -usb -netdev tap,id=mynet0,ifname=tap0,script=no,downscript=no -device rtl8139,netdev=mynet0 -rtc base=localtime

Mac OS 9 uses a different real-time clock base, so this has an additional -rtc option. You can use any CPU you want since it's emulated; I just use the default G4 7400 here instead of specifying one.
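The same command line is easier to adjust when it is built up one option at a time. The sketch below does that and prints the resulting command rather than running it. The function name os9_cmd is hypothetical; the image name, tap interface and memory size are simply the ones from the example above.

```sh
# Sketch: assemble the Mac OS 9 command line option by option.
# os9_cmd is a hypothetical helper; it only prints the command.
os9_cmd() {
    img=${1:-classic.img}
    set -- qemu-system-ppc
    # mac99 machine, TCG only (OS 9 does not boot under KVMPPC), PMU instead of CUDA
    set -- "$@" -M mac99,accel=tcg,via=pmu
    # 1.5GB is the most RAM Mac OS 9 will use
    set -- "$@" -m 1536
    set -- "$@" -boot c
    set -- "$@" -drive "file=$img,format=qcow2,l2-cache-size=4M"
    set -- "$@" -usb
    # RTL8139 NIC on an existing tap0 interface
    set -- "$@" -netdev tap,id=mynet0,ifname=tap0,script=no,downscript=no
    set -- "$@" -device rtl8139,netdev=mynet0
    # Mac OS 9 expects the clock in local time
    set -- "$@" -rtc base=localtime
    echo "$@"
}
```

To launch the emulator instead of printing the command, replace the final echo "$@" with just "$@".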

Post questions or things you've discovered in the comments.

Posted by ClassicHasClass on August 26, 2018

Making your Talos II into a Power Mac: KVMPPC for POWER9 (part 1)

UPDATE: On current systems (as of April 2020), see these errata.

Talospace is a spinoff from the TenFourFox Development blog, which for those unfamiliar with it, is a Firefox fork maintained for Power Macs running Mac OS X 10.4 and 10.5. It shouldn't be a surprise that the common architecture was a big plus for me, and it's possible to run OS X with reduced emulation overhead on the processor using the same Kernel-based Virtual Machine (KVM) scheme used for virtualization on other platforms.

Emulation is of course just one of the things us old Mac users would like working properly on the new Power hotness. The other is the damn Command key working like it's supposed to. We'll address that pain point in another "First Person" post coming soooooon.

Anyway, a brief digression before we begin, for those unfamiliar with how KVM works on Power ISA. KVMPPC comes in two flavours, KVM-PR ("PRoblem") and KVM-HV ("HyperVisor"); both work in big and little endian modes. KVM-HV is the more modern of the two and technically the most like hypervisors on other architectures. It uses the hardware support in later Power ISA CPUs, so it's overall faster, particularly when many supervisor-level instructions must be executed. However, it cannot be nested (you can't run a KVM-HV guest inside a KVM-HV guest, though you can run a KVM-PR guest; more on that in a moment), and most importantly, it supports only virtualizing the same processor generation or the one immediately prior. Since no version of OS X ran on a POWER8 (let alone a POWER9), we won't be dealing with it further for the purposes of this article.

That brings us to KVM-PR. Unlike KVM-HV, KVM-PR runs strictly in user mode, or what IBM docs refer to as the "Problem State." It does run as a kernel module, so it's not in userspace, but it does not depend on the hardware which powers KVM-HV and thus only runs user-level instructions. That means it must trap and emulate supervisor-level instructions on behalf of the guest, which is much slower. However, KVM-PR can also emulate other instructions and their desired behaviour, which theoretically allows it to act like any supported Power ISA or PowerPC CPU, including a G3, G4 or G5. Instructions which aren't supported natively are trapped and executed just like supervisor-level instructions, and everything else can still run on the metal. Because it's user mode, it can be nested (a KVM-PR guest can run inside of another KVM-PR guest, as well as inside a KVM-HV guest). KVM-PR was the original method of virtualization on PowerPC Linux, descending from the venerable old Mac-on-Linux project (which had its own peculiar hypercalls), and a specialized form of this method is how OS X runs Classic on 10.4 and earlier. This is the method we will use here.

Let's first talk about whether KVM is the way you want to go. For our Power Mac hardware emulation, we will use QEMU, which can use KVM (and KVMPPC) to accelerate the processor, and QEMU provides the rest of the platform. QEMU provides two platform profiles, g3beige, a Gossamer Beige Power Mac G3, as the name implies, and mac99, essentially a Sawtooth G4. We will only be using the mac99 platform since it provides the best combination of flexibility and compatibility.

QEMU also provides emulated USB devices. The most useful to us is the USB tablet, which allows QEMU to detect when the mouse is within the QEMU window without having to grab it and makes using the emulator a lot more seamless. Unfortunately, the USB tablet is only supported by 10.3 Panther and up. No version of PowerPC Mac OS has support for VirtIO devices yet either, so there is no graphics or disk acceleration. On the other hand, QEMU does provide an emulated RTL8139 network card, for which drivers are available for Mac OS 9 through 10.2 Jaguar and are built into at least 10.3 Panther and up, and with tun/tap it runs with decent throughput. Sound is best described as a work in progress, and graphics work but are basically a dumb framebuffer. Still, this is enough to get the OS off the ground and be useful.

KVMPPC does not work in all situations with QEMU. Most notoriously it does not work for booting Mac OS 9 and Rhapsody, and not 10.0 or 10.1 either, at least from disc. I've done some work on improving this and it gets mostly through the nanokernel startup in OS 9 but doesn't get any further yet. For these operating systems you will currently need to use TCG, QEMU's software CPU emulator, which runs by default if you don't ask for KVM. TCG does have JIT acceleration, and the JIT supports Power ISA, so while it's definitely slower it's at least somewhat better than it sounds. TCG also tends to run a little smoother than KVM since it's all within a user process, but compute-intensive tasks can run up to an order of magnitude slower. TCG is also involved if you run a completely alien inferior architecture like x86.

KVMPPC also tends to be problematic with heavy I/O loads. TCG can be noticeably faster when running installers, for example, or anything that involves substantial emulated disk access. This is probably due to the large amount of supervisor-level code that incurs a speed penalty with KVM-PR. I had better luck and faster install times installing things with QEMU using TCG, then shutting down and rebooting in QEMU with KVM to actually use them.

Finally, KVMPPC only works to mimic certain processors currently. G3 works for every system, and Nitro (G4 7410) works for most of them, but right now that's all. None will boot in KVMPPC with any G4 7450-series processor, and trying to start KVMPPC in 64-bit mode to emulate a G5 currently crashes my Talos. There is also no support for SMP, so our monstrous multi-core beasts will only present one CPU to the emulated OS. The processor you choose doesn't necessarily change the underlying vagaries of the architecture, though, which will be discussed in the next part as well.

Some specific notes on individual versions of Mac OS:

Mac OS 9, Rhapsody, 10.0 Cheetah and 10.1 Puma do not currently boot on KVMPPC, at least not from CD. They also don't support the USB tablet, so you must click in the window to grab the mouse and keyboard, and hit Ctrl-Alt-G to release the grab to do something else. Rhapsody can be notoriously hard to install and requires multiple steps which I won't discuss here. For OS 9 I'll talk about a couple of glitches with QEMU in Part 2, since many of you will still want to run it even though there is no CPU acceleration.
10.2 Jaguar has various problems in KVMPPC, though it does work. Finder windows tend to glitch and not fully load when you double-click folders and devices on the desktop. Classic does not work in 10.2 with KVMPPC and aborts with a bus error. 10.2 also does not support the USB tablet, so you need to grab the mouse as with OS 9.
10.3 Panther and 10.4 Tiger both run well in KVMPPC. Later on we'll talk about a specific optimization to the operating system "commpage" to make them run even better. 10.3 runs better than 10.4, but 10.4 has better compatibility. Both support the USB tablet and have built-in support for the RTL8139 NIC. Classic will boot and run in both, but is noticeably slower than on a real machine (this is true of both TCG and KVM), though Classic is somewhat faster in Panther.
10.5 Leopard appears to work fine in KVMPPC. It supports everything that 10.3 and 10.4 do, though I haven't done the particular commpage optimization for 10.5 yet because I don't use, nor particularly like, Leopard personally. 10.5 obviously does not support Classic.
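The notes above boil down to a small lookup table. As a rough sketch, assuming nothing beyond what is listed there, a helper like the following (guest_profile is a made-up name) reports which accelerator to ask QEMU for and whether the USB tablet is worth enabling for a given guest:

```sh
# Sketch: summarize the per-OS notes as accelerator + USB tablet support.
# guest_profile is a hypothetical helper name.
guest_profile() {
    case "$1" in
        9|rhapsody|10.0|10.1)
            # does not currently boot under KVMPPC; no USB tablet support
            echo "accel=tcg tablet=no" ;;
        10.2)
            # works under KVMPPC with glitches; no USB tablet support
            echo "accel=kvm tablet=no" ;;
        10.3|10.4|10.5)
            echo "accel=kvm tablet=yes" ;;
        *)
            echo "unknown guest: $1" >&2
            return 1 ;;
    esac
}
```

For instance, guest_profile 10.4 prints accel=kvm tablet=yes. Remember that even for the KVM-capable guests, running the installer under TCG first can be faster.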

You'll need to do some preparation to get your Talos II to be an accelerated Mac with KVMPPC (this isn't needed if you're going to use TCG since it's purely userspace). The first is that you need to make sure your T2's MMU is in hash table mode, used by POWER8 and earlier CPU generations. The POWER9 introduces a new MMU mode called radix mode, but without going into the gory technical details, the particular memory mapping characteristics of radix mode mean certain tracts of memory cannot be properly manipulated by KVM-PR. All OSes that support the POWER9 in radix mode will support it in hash table mode. For Linux, just add disable_radix to your kernel command-line arguments. For my Fedora workstation, I just put it into GRUB, regenerated the configuration, and rebooted. If you did this right, dmesg will show a line like this:

[    0.000000] hash-mmu: Initializing hash mmu with SLB

You shouldn't see any mention of radix mode.
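If you want to script that check, a minimal sketch might look like the following. The helper names are hypothetical; the first looks for disable_radix in a kernel command line string, and the second looks for the hash MMU message quoted above in whatever text it is given on standard input.

```sh
# Sketch: verify that the kernel was booted with the hash MMU.
# Both function names are hypothetical.

# True if the given kernel command line contains disable_radix as its own word.
has_disable_radix() {
    case " $1 " in
        *" disable_radix "*) return 0 ;;
        *)                   return 1 ;;
    esac
}

# True if the boot log on stdin contains the hash MMU initialization message.
hash_mmu_active() {
    grep -q 'hash-mmu: Initializing hash mmu'
}
```

On a live system this could be used as has_disable_radix "$(cat /proc/cmdline)" && dmesg | hash_mmu_active && echo "hash MMU in use" (dmesg may need root on some distros). On Fedora, one way to add the argument without hand-editing the GRUB files is sudo grubby --update-kernel=ALL --args=disable_radix, though that is an alternative to the manual edit described above, not what I did.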

The next step, possibly, is to download a copy of the kernel source code. If you have kernel 4.17.x or earlier (as my Fedora 28 system does), you will need to apply patches to the KVMPPC kernel modules in that version to even get it to start. If you have kernel 4.18.x and up, the necessary patches should already be present for basic functionality, but you may still want to get the kernel source for some of the hack optimizations in this post that aren't (and probably won't ever be) included by default.
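A quick way to tell which camp you are in is to compare the running kernel's version against 4.18. This is a small sketch under that one assumption; the function name needs_kvm_pr_patch is made up, and it only looks at the major and minor numbers of a uname -r style string.

```sh
# Sketch: does this kernel release predate 4.18 (and so need the KVM-PR patch)?
# needs_kvm_pr_patch is a hypothetical helper name.
needs_kvm_pr_patch() {
    rel=${1:-$(uname -r)}
    major=${rel%%.*}            # text before the first dot
    rest=${rel#*.}              # text after the first dot
    minor=${rest%%[!0-9]*}      # leading digits of what remains
    [ "$major" -lt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -lt 18 ]; }
}
```

Called with no argument it checks the running kernel, for example needs_kvm_pr_patch && echo "patch required".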

Let's assume for didactic purposes that you do need to patch the KVMPPC kernel module that comes with your distro. We will talk about adding the hacks to it a little later. This will taint your kernel. If your system behaves strangely and you are unable to unload the module, you may need to reboot.

Download and unpack the source archive, and cd into the root of the unpacked source archive.
Download this patch (written by yours truly) into the root of the source archive.
patch -p1 < that_patch.diff
cp /your/kernel/config .config (assuming you're still in the source archive)
make -j24 modules (or if you're one of the lucky scum with more cores, adjust as appropriate; I have two of the 4-core CPUs for 32 threads, but I like to leave a core free)
When the make has run to completion, edit include/generated/utsrelease.h to make sure it matches what appears in uname -r, or your kernel may refuse to load the module.
Regenerate the KVMPPC modules with the matching string: make -j24 SUBDIRS=arch/powerpc/kvm
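The steps above can be strung together roughly as follows. Treat this as a sketch rather than a tested build script: the function names, the patch file name and the config path are placeholders, and it assumes you are already in the root of the unpacked kernel source. The one step that is fiddly by hand, fixing up utsrelease.h, is pulled out into its own helper.

```sh
# Sketch: make the generated version string match the running kernel.
# fix_utsrelease and build_kvmppc are hypothetical helper names.
fix_utsrelease() {
    header=$1
    release=${2:-$(uname -r)}
    # Rewrite only the UTS_RELEASE line, leaving anything else alone.
    sed "s/^#define UTS_RELEASE .*/#define UTS_RELEASE \"$release\"/" \
        "$header" > "$header.tmp" && mv "$header.tmp" "$header"
}

# Usage: build_kvmppc that_patch.diff /your/kernel/config [jobs]
build_kvmppc() {
    patchfile=$1
    config=$2
    jobs=${3:-24}
    patch -p1 < "$patchfile" &&
    cp "$config" .config &&
    make -j"$jobs" modules &&
    fix_utsrelease include/generated/utsrelease.h &&
    make -j"$jobs" SUBDIRS=arch/powerpc/kvm
}
```

Each step only runs if the previous one succeeded, so a failed patch or a missing config stops the build before anything is compiled.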

Now you can load your custom modules:

cd arch/powerpc/kvm && sudo modprobe -r kvm_hv kvm_pr kvm && sudo insmod kvm.ko && sudo insmod kvm-pr.ko

and you should see something like this in dmesg:

[22198.130998] kvm: loading out-of-tree module taints kernel.
[22198.184535] kvm: module verification failed: signature and/or required key missing - tainting kernel

If you actually got an error message, you loaded the wrong thing, or you possibly forgot the patch (earlier KVMPPC versions won't even start KVM-PR on a POWER9).

If you already have the patches, chances are your OS already loaded KVM-PR. You can check this with lsmod. If it didn't, and trying to load it with sudo modprobe kvm_pr doesn't work, you may need to also patch your kernel modules with the steps above. On the other hand, if you see both kvm_pr and kvm_hv listed, do a sudo modprobe -r kvm_hv (unless you really do need it) to limit your system to KVM-PR and help to simplify the remaining steps in this article.
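Here is a minimal sketch of that check, written to read lsmod output on standard input so it can be tried against saved output as well as a live system. The function name kvm_status and its one-word answers are invented for this example.

```sh
# Sketch: report which KVMPPC flavours appear in lsmod output on stdin.
# kvm_status is a hypothetical helper; it prints one of: both, pr, hv, none.
kvm_status() {
    mods=$(awk '$1 == "kvm_pr" || $1 == "kvm_hv" { print $1 }' | sort | tr '\n' ' ')
    case "$mods" in
        "kvm_hv kvm_pr ") echo both ;;   # consider: sudo modprobe -r kvm_hv
        "kvm_pr ")        echo pr ;;     # ready to go
        "kvm_hv ")        echo hv ;;     # try: sudo modprobe kvm_pr
        *)                echo none ;;   # try: sudo modprobe kvm_pr
    esac
}
```

Run it as lsmod | kvm_status. An answer of pr means you can carry on; both means unloading kvm_hv will simplify the remaining steps.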

The third step is to install QEMU. QEMU 3.0 is strongly advised; if your package source doesn't have it, then download and compile from source (and you get to do -O3 -mcpu=power9 anyway for great justice). Although QEMU 2.12 will mostly work for these examples, many bugs and edge cases in the Mac hardware emulation were fixed in 3.0, and some bugs can't be worked around easily any other way. You may also have to remove some command line options from the examples that were not supported in 2.12.
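As a sketch of both halves of that advice: the first helper below checks whether a QEMU version banner reports 3.0 or later, and the second shows one plausible way to configure a source build with those compiler flags. The helper names are hypothetical, and limiting the build to the ppc-softmmu target (which produces qemu-system-ppc) is my assumption to keep compile times down, not a requirement.

```sh
# Sketch: true if the version banner (first line of `qemu-system-ppc --version`)
# reports major version 3 or later. qemu_is_3_or_later is a hypothetical name.
qemu_is_3_or_later() {
    line=$1
    ver=${line#*version }     # e.g. "3.0.0" or "2.12.0 (qemu-2.12.0-4.fc28)"
    major=${ver%%.*}
    [ "$major" -ge 3 ]
}

# Sketch: build from the root of an unpacked QEMU source tree.
# The target list is an assumption; add others if you need them.
build_qemu_ppc() {
    ./configure --target-list=ppc-softmmu --extra-cflags="-O3 -mcpu=power9" &&
    make -j"${1:-24}"
}
```

For example, qemu_is_3_or_later "$(qemu-system-ppc --version | head -n 1)" || echo "consider building QEMU 3.0 from source".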

Create your base disk image according to the QEMU instructions and get out your OS X disc. I saw little value in using a raw disk image, and it was substantially larger than a qcow2 image, so I'd just use qcow2. We'll assume this is your chosen format for the remainder of this series.
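Creating a qcow2 image is a one-liner with qemu-img. The wrapper below is only a sketch: make_disk is a made-up name, the 32G default size is an arbitrary choice for illustration, and the QEMU_IMG variable exists purely so the command can be previewed without creating anything.

```sh
# Sketch: create a qcow2 disk image. make_disk is a hypothetical helper.
# Usage: make_disk <image file> [size]
# Set QEMU_IMG=echo to preview the qemu-img arguments without running it.
make_disk() {
    "${QEMU_IMG:-qemu-img}" create -f qcow2 "$1" "${2:-32G}"
}
```

For example, make_disk tiger.img 32G runs qemu-img create -f qcow2 tiger.img 32G. Since qcow2 images grow as they are written to, a generous size costs little up front.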

In part 2, we'll talk about how to get all this actually booting.

Posted by ClassicHasClass on August 13, 2018

Linux kernel 4.18 available

Version 4.18 of the Linux kernel is now official; Phoronix lists the major hits. There's even some PA-RISC love in there.

For Talos II land, among all the other updates for POWER9, some of the KVMPPC work that enables QEMU to actually boot and run Mac OS X on T2 hardware without using pure software CPU emulation is now in this release (disclaimer: yours truly is a contributor). This requires using KVM-PR instead of the true hypervisor KVM-HV (which also is the subject of substantial updates in this release), but it will now work as long as your Talos II's MMU is set to use hash tables instead of radix mode (putting disable_radix as a kernel command line parameter in your GRUB configuration will do nicely). More details on getting this up and running will be the subject of a future post.

Posted by ClassicHasClass on August 02, 2018

Talos articles from our sister blog

Talospace originally started as a spinoff from the TenFourFox Development blog to move Talos-related content into its own concentrated space. However, a number of relevant articles were posted there earlier; here are permalinks for them:

Unboxing the Talos II
A semi-review of the Raptor Talos II
News items:
- A little Talos of your very own (Talos II Lite announcement)
- Talos take II (original Talos II announcement)
KVM work and general use of the Talos:
- A weekend on the new computer
- A second weekend on the new computer

And, just for historical fun, coverage of the original Talos POWER8 project (preorder announcement, preorder price drop, pledges open, Crowd Supply launch and the end of the campaign).