Thursday, 29 January 2015

Building Xen 4.4.1 on NetRunner Rolling (Manjaro)

------------------------------
This is a post I've had in draft as I progressed through setting up Xen in Netrunner. I managed to get it all built and running, but at the end I ran into a network issue in VMs that I didn't have time to solve. I'm publishing anyway for future reference, and maybe it will be useful to someone. I'll update or link through to any follow-ups I do.
------------------------------

To jump straight to the steps and skip my inevitable ramblings, click here.

MediaServer8 has been offline for about a week now as I work on something of a radical shift: a long-term stalwart of the setup is under consideration for the chop. unRAID might go.

My whole-house integrated media server has been based on unRAID for some time now, with version 6 (still in beta) providing a virtualisation platform on which I've built a TV Server and several virtual HTPCs with passthrough GPUs.

This is a great setup as it consolidates multiple machines into one, reducing running costs and space usage while in theory being easy to maintain.


unRAID offers three flavours of virtualisation: Xen, KVM and Docker. As an early adopter, I built my virtual machines around Xen as that was the first and only technology available in early betas. As KVM and Docker support was introduced, they slowly gained traction to the point where they are much more actively supported both by LimeTech and the community. (It's been stated on the unRAID forums that support for Xen is limited to checking that it works with each new release.)

For the past few months, I've been having stability issues - VMs would freeze or blue screen, the base unRAID OS would crash, logs would fill with indecipherable fault reports and I've had more than one disk red ball requiring immediate and lengthy attention. I'd migrated the underlying filesystems on all my disks from ReiserFS to XFS in the hope of addressing this but to no avail.

Now, as it happens, it may be that these issues are not software related at all but symptoms of a failing motherboard. (I've subsequently tracked these issues to the PSU.) As part of the process described below, I had problems installing operating systems to drives when attached to specific SATA ports (failure to target the drive in the installer, streams of page faults related to the controller when I switched the boot disk to particular ports).

In any case, the combination of poor support for Xen, ongoing errors and crashes and a mild case of upgradeitis caused me to consider a move to the KVM option in unRAID (Docker isn't an option as my systems are based around Windows software, mostly).

This would be a big move for me. There's no easy way to migrate VMs from Xen to KVM so I'd essentially need to rebuild everything. And that, of course, led to the thought, 'if I'm going to be starting from scratch, what else is out there?'

I'd been interested in a group that splintered from unRAID and found a home around blog.ktz.me. This is run by a former prolific contributor to the unRAID community, IronicBadger. In particular, there has been talk there of setting up a fully open system that replicates a lot of the unRAID features in ArchLinux. However, reading some of the articles, it seemed like a lot of work for not much gain and a bit of a maintenance nightmare for someone with rudimentary Linux knowledge.

The discussions have, however, turned me on to ArchLinux, a rolling Linux release that's very configurable. I'd originally toyed with the Manjaro distribution but more recently moved to NetRunner which I've had installed on a VM for some time and really like.

One thing that's bugged me about my setup has been a wasted PCIe slot - my particular motherboard cannot run headless - it doesn't have an integrated GPU and needs one specified or won't boot. Therefore, I have a precious PCIe slot occupied by a card whose sole purpose is to allow unRAID to boot.

My thinking now is that maybe I could install NetRunner as my base OS, install Xen on that and run my existing VMs with little or no modification. This would allow me to have more control (I could upgrade Xen or the OS as I see fit and not be beholden to LimeTech release schedules). I would also gain back an expansion slot. I'm also keen to see if a change of platform will improve the performance of high-res audio playback via my m-audio delta cards - if so I could fold back in my pre-pro VM which I had to migrate to a physical machine because of this issue.

Before attempting this route, however, I thought I'd look further afield, specifically at XenServer and VMware Hypervisor. I spent a few days playing about with both. They are broadly similar in what they do (both are bare-metal hypervisor solutions) and I enjoyed testing them, though I liked the VMware way of doing things a bit better.

Alas, in each case, I had problems when I got to hardware passthrough. A combination of non-server-grade hardware and some oddball and legacy cards (such as older m-audio deltas and Digital Devices Octopus tuners, as well as my expansion chassis) all likely contributed to an 'almost but not quite' experience. Definitely something to explore in the future though.

So with that out of my system, I set about getting Netrunner and Xen up and running.

I'd been a bit worried about setting up Xen having read this post which had some very specific steps but, as it turns out, with Netrunner a lot of that is redundant. I had a few false starts, caused not least by my own stupidity in installing a 32-bit version of Netrunner without realising it. Here are the steps I followed;

Step 0 - System Prep.


I disconnected all hard drives from my system and added a 60GB SSD to act as system drive. I left in all my expansion cards etc. to ensure that any installation detection took place.

I downloaded the latest Netrunner Rolling Release (v 2014.09.01) ISO and, using dd on OS X, made a bootable USB key from it. (The install runs much quicker from USB than from an optical drive.)
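For reference, writing an ISO to a USB key with dd on OS X usually looks something like the sketch below. The ISO file name and the disk number are placeholders rather than values from this build, so check `diskutil list` for the right disk first, because dd will overwrite whatever it is pointed at. The helper only prints the commands so they can be reviewed before anything is run.

```shell
# Print (not run) the commands for writing an ISO to a USB key on OS X.
# Both arguments are placeholders: pass your own ISO path and disk number.
usb_write_commands() {
    local iso="$1" disk_number="$2"
    # Unmount the key's volumes first so dd can open the device.
    echo "diskutil unmountDisk /dev/disk${disk_number}"
    # rdiskN is the raw device node, which is usually much faster than diskN.
    echo "sudo dd if=${iso} of=/dev/rdisk${disk_number} bs=1m"
}

# Example: review the output, then paste the lines into a terminal.
usb_write_commands netrunner-rolling.iso 2
```

Printing instead of executing is deliberate: a wrong disk number is only a typo away from wiping the wrong drive.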

I plugged the USB key into MediaServer8, set it as the boot device in the BIOS and rebooted.


Step 1 - Installation & Post-Install Tasks

The Netrunner installation is a piece of cake. I'd set my boot GPU to one of the Radeon HD 5450 cards in the system so I selected the 'non-free drivers' version as I found it worked best with my particular graphics setup.

The graphical installer asks the usual questions around location, keyboards etc. and requires setting up of system and user names and passwords. I elected to set a root password as well. The entire installation takes about 10 minutes.

Once booted up, it's necessary to update the system. As an Arch-based distro, this is achieved by opening a terminal and typing;

sudo pacman -Syu   

This will update repositories and start downloading updates. It's necessary to run this a few times to catch everything. In my case, I had almost 600 updates which took about 30 mins to download and install.
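The "run it a few times" step can be sketched as a small loop. This is only an illustration: the function name and the limit of five rounds are assumptions of mine, and it relies on `pacman -Qu` printing nothing once no upgrades remain.

```shell
# Repeat "pacman -Syu" until nothing is left to upgrade, or give up
# after a few rounds. The default limit of 5 is an arbitrary assumption.
update_until_clean() {
    local max_rounds="${1:-5}" round=1
    while [ "$round" -le "$max_rounds" ]; do
        sudo pacman -Syu || return 1
        # "pacman -Qu" lists packages that still have upgrades pending;
        # empty output means the system is up to date.
        if [ -z "$(pacman -Qu 2>/dev/null)" ]; then
            echo "up to date after ${round} round(s)"
            return 0
        fi
        round=$((round + 1))
    done
    echo "still pending after ${max_rounds} rounds" >&2
    return 1
}

# Usage: update_until_clean      (or: update_until_clean 3)
```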

While I have a keyboard, mouse and display connected to Netrunner on MediaServer8, I often like to SSH into the server if I'm elsewhere in the house etc.

For some reason, SSHd isn't set up by default so I needed to run these two commands;

sudo systemctl enable sshd  
sudo systemctl start sshd   

I also like to install Screen;

sudo pacman -S screen   

Screen is a utility that can be used to multiplex several virtual consoles, allowing access to multiple separate terminal sessions inside a single terminal window or remote terminal session. Critically, it allows a command line process running on a remote machine to continue even if the client access fails.
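A typical Screen workflow for a long-running job over SSH might look like the sketch below. The session name "xenbuild" is just an example. The check relies on Screen setting the STY environment variable inside its sessions.

```shell
# Typical session (the name "xenbuild" is only an example):
#   screen -S xenbuild      start a named session
#   Ctrl-a d                detach, leaving everything running
#   screen -ls              list sessions
#   screen -r xenbuild      reattach later, e.g. from a new SSH login

# screen sets the STY variable inside every session it creates, so a
# script can check it before kicking off something long-running.
inside_screen() {
    [ -n "${STY:-}" ]
}

if inside_screen; then
    echo "running inside screen session ${STY}"
else
    echo "not inside screen; a dropped SSH connection would kill this job"
fi
```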

That's it. Netrunner has all the build utilities you need to get Xen up and running so it's time to get building...


Step 2 - Building Xen

Xen is not available as a package via pacman. It needs to be compiled. To do this, we use the yaourt tool;

yaourt -Ss xen  

This will query the AUR database and list any sources that contain the term 'xen'. Several options are available at the time of writing ranging from version 4.3 through to untested and uefi versions of the latest 4.5.

I didn't want to be quite on the bleeding edge so I played it safe and went with version 4.4.1-3 which conveniently enough is named 'xen'. Kick off the download and build process like this;

yaourt -S xen  

(note that you shouldn't run yaourt as root or under sudo - it will complain)
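If the build is being scripted, a small guard can catch the root mistake before yaourt does. This is a sketch only; the function names are made up for illustration.

```shell
# Succeeds only for a non-zero (non-root) numeric user ID.
is_unprivileged() {
    [ "$1" -ne 0 ]
}

# Refuse to start the AUR build when running as root or under sudo.
build_xen() {
    if ! is_unprivileged "$(id -u)"; then
        echo "run this as your normal user, not as root or under sudo" >&2
        return 1
    fi
    yaourt -S xen
}

# Usage: build_xen
```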

The entire process took about 30 minutes for me. It's largely self-contained; there are a few confirmations required at the beginning and at the end but, assuming everything is OK, it runs itself, and it's pretty hypnotic to watch!

Once complete, you'll find a new Xen image in your /boot directory. But that's only the start.


Step 3 - Enabling Xen

Once the build process completes, the following is posted to the terminal (truncated);

In order to complete the installation, and enable Xen,
at the very least you must:
1. If using GRUB2, edit your GRUB2 config files as specified at
    https://wiki.archlinux.org/index.php/Xen#Bootloader_Configuration
2. If booting via efi, copy the example /etc/xen/efi-xen.cfg to /boot/xen.cfg
   and edit the contents to match the settings you need.
3. Issue the following commands to allow you to create and start VMs:
    systemctl enable xenstored.service
    systemctl enable xenconsoled.service
4. If you want some domains to automatically start up/shutdown, run the following:
    systemctl enable xendomains.service

For more information refer to the Wiki:

    https://wiki.archlinux.org/index.php/Xen

So there are a few more steps to complete before progressing. Specifically;

-Update Linux Kernel
-Enable Xen services
-Update Grub to enable Xen boot


3.1 - Updating Linux Kernel

The kernel update is required as I found that with the default linux314 kernel, the NetRunner desktop would lock up and SSH access would be denied whenever a VM was active. An updated kernel resolved the issue.

To update the kernel in Netrunner, access Settings->Manjaro Settings Manager->Kernel


Choose the desired kernel, click install and reboot. You can see from the above, I selected 3.18.2-1 which fixed the issue I was having with the system freezing while Xen VMs were running.
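After the reboot it's worth confirming which kernel is actually running, since GRUB may still default to the old one. The sketch below turns the `uname -r` string into the series number used in names like linux314 and linux318. The "-MANJARO" suffix in the comment is an assumed example of what the release string looks like.

```shell
# Turn a kernel release string into the Manjaro-style series number,
# e.g. "3.18.2-1-MANJARO" -> "318" (matching names such as linux318).
kernel_series() {
    echo "$1" | awk -F. '{ print $1 $2 }'
}

running="$(kernel_series "$(uname -r)")"
echo "running kernel series: ${running}"
```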


3.2 - Enable Xen Services

To have Xen services load at startup, issue the following commands as noted in the build output listed above;

systemctl enable xenstored   
systemctl enable xenconsoled   
systemctl enable xendomains   

(also issue systemctl start commands for these services if you're going to use them before a reboot)
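Enabling and starting the three services can be wrapped in one loop. This is a minimal sketch with a made-up function name, and it assumes a root shell (otherwise prefix the systemctl calls with sudo).

```shell
# Enable each service at boot and start it right away.
# Run as root, or change "systemctl" to "sudo systemctl".
enable_and_start() {
    local svc
    for svc in "$@"; do
        systemctl enable "$svc" || return 1
        systemctl start "$svc" || return 1
    done
}

# Usage:
#   enable_and_start xenstored xenconsoled xendomains
#   systemctl is-active xenstored xenconsoled xendomains
```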


3.3 - Enable Xen Boot

As Netrunner boots via Grub(2), it's necessary to now complete the following steps;

First of all, by looking in my /boot directory, I find the newly created Xen image ('xen-4.4.1.gz' - highlighted by me in red). I also see the various files associated with both 314 and 318 kernels.

drwxr-xr-x 3 root root     4096 23.02.2014 10:36 EFI/
drwxr-xr-x 6 root root     4096 22.01.2015 14:46 grub/
drwxr-xr-x 2 root root     4096 06.10.2013 19:13 memtest86+/
drwxr-xr-x 2 root root     4096 06.11.2013 22:23 syslinux/
-rw-r--r-- 1 root root 19080760 21.01.2015 01:08 initramfs-314-x86_64-fallback.img
-rw-r--r-- 1 root root  4604675 21.01.2015 01:07 initramfs-314-x86_64.img
-rw-r--r-- 1 root root 19472695 22.01.2015 14:37 initramfs-318-x86_64-fallback.img
-rw-r--r-- 1 root root  4652521 22.01.2015 14:36 initramfs-318-x86_64.img
-rw-r--r-- 1 root root   648704 12.10.2014 13:05 intel-ucode.img
-rw-r--r-- 1 root root       22 08.01.2015 23:12 linux314-x86_64.kver
-rw-r--r-- 1 root root       21 09.01.2015 19:03 linux318-x86_64.kver
-rw-r--r-- 1 root root  3888784 08.01.2015 23:12 vmlinuz-314-x86_64
-rw-r--r-- 1 root root  4087264 09.01.2015 19:03 vmlinuz-318-x86_64
-rw-r--r-- 1 root root   852886 21.01.2015 01:42 xen-4.4.1.gz

To update GRUB to include a Xen boot option, it's necessary to run grub-mkconfig, but doing so in the current state won't work - no Xen option will be added to Grub. 

In researching this, I found an enlightening post at the bottom of this discussion thread which indicates that for grub-mkconfig to work, there needs to be a config file with a name matching the vmlinuz file, in my case 'vmlinuz-318-x86_64'. Therefore, I need to run the following to generate this file;

zcat /proc/config.gz > /boot/config-318-x86_64  

This adds the required file (highlighted by me in red);


drwxr-xr-x 3 root root     4096 23.02.2014 10:36 EFI/
drwxr-xr-x 6 root root     4096 22.01.2015 14:46 grub/
drwxr-xr-x 2 root root     4096 06.10.2013 19:13 memtest86+/
drwxr-xr-x 2 root root     4096 06.11.2013 22:23 syslinux/
-rw-r--r-- 1 root root   148304 21.01.2015 22:57 config-318-x86_64
-rw-r--r-- 1 root root 19080760 21.01.2015 01:08 initramfs-314-x86_64-fallback.img
-rw-r--r-- 1 root root  4604675 21.01.2015 01:07 initramfs-314-x86_64.img
-rw-r--r-- 1 root root 19472695 22.01.2015 14:37 initramfs-318-x86_64-fallback.img
-rw-r--r-- 1 root root  4652521 22.01.2015 14:36 initramfs-318-x86_64.img
-rw-r--r-- 1 root root   648704 12.10.2014 13:05 intel-ucode.img
-rw-r--r-- 1 root root       22 08.01.2015 23:12 linux314-x86_64.kver
-rw-r--r-- 1 root root       21 09.01.2015 19:03 linux318-x86_64.kver
-rw-r--r-- 1 root root  3888784 08.01.2015 23:12 vmlinuz-314-x86_64
-rw-r--r-- 1 root root  4087264 09.01.2015 19:03 vmlinuz-318-x86_64
-rw-r--r-- 1 root root   852886 21.01.2015 01:42 xen-4.4.1.gz
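The naming rule described above can be sketched as a helper. Two caveats: /proc/config.gz always describes the running kernel, so boot into 318 before generating its config; and as far as I can tell, GRUB's 20_linux_xen script also looks for CONFIG_XEN_DOM0=y inside that file, which is what the second function checks. The function names are made up for illustration. The usage lines pipe through tee because a plain `sudo zcat ... > file` fails for a normal user: the redirect is performed by the unprivileged shell, not by sudo.

```shell
# Map a kernel image name to the config file name GRUB's Xen script
# expects beside it: "vmlinuz-318-x86_64" -> "config-318-x86_64".
config_name_for() {
    local kernel
    kernel="$(basename "$1")"
    echo "config-${kernel#vmlinuz-}"
}

# Succeeds if a kernel config file declares Xen dom0 support.
has_dom0_support() {
    grep -qx 'CONFIG_XEN_DOM0=y' "$1"
}

# Usage, while booted into the 318 kernel:
#   target="/boot/$(config_name_for /boot/vmlinuz-318-x86_64)"
#   zcat /proc/config.gz | sudo tee "$target" > /dev/null
#   has_dom0_support "$target" && echo "kernel can run as dom0"
```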

Now, when I run 

grub-mkconfig -o /boot/grub/grub.cfg  

I get the necessary Xen blocks in /boot/grub/grub.cfg
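A quick way to confirm that the Xen blocks really made it into the generated file is to look for a multiboot line that loads the Xen image. The pattern below is an assumption about how the generated entries look, not something taken from the GRUB documentation, so treat a negative result with some suspicion.

```shell
# Succeeds if the GRUB config contains a multiboot line that loads a Xen
# image. The pattern is an assumption about how the generated entries look.
has_xen_entry() {
    grep -Eq '^[[:space:]]*multiboot2?[[:space:]].*xen' "$1"
}

# Usage:
#   has_xen_entry /boot/grub/grub.cfg && echo "Xen entry found"
```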

However, for me, this is not enough. When I boot with the Xen option, the system gets as far as login and then presents a blank screen. It turns out I need to add the 'nopat' flag to the linux kernel.

To do this, I edited /etc/default/grub to add nopat to the default linux command line as follows (truncated and my highlighting);


GRUB_DEFAULT=saved
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=Netrunner
GRUB_CMDLINE_LINUX_DEFAULT="nopat resume=UUID=7fea5242-e72b-46ca-a9cf-68e0d780f7ff quiet splash"
GRUB_CMDLINE_LINUX=""
<snipped here>

This works and allows me to boot into Xen-enabled Netrunner, but has the side effect of enabling 'nopat' for all boot options. There's a bit more to configuring these command line parameters which is well explained here. Here's an extracted summary;

<quote>


Regarding to the /etc/grub.d/20_linux_xen script, you should define parameters for each image:
  • GRUB_CMDLINE_LINUX_DEFAULT = basic kernel
  • GRUB_CMDLINE_LINUX = basic kernel, included in "recovery"
  • GRUB_CMDLINE_LINUX_XEN_REPLACE_DEFAULT = kernel using Xen hypervisor
  • GRUB_CMDLINE_LINUX_XEN_REPLACE = kernel w/ Xen, incl. in "recovery"
  • GRUB_CMDLINE_XEN_DEFAULT = Xen hypervisor (xen.gz)
  • GRUB_CMDLINE_XEN = Xen hypervisor, incl. in "recovery"
For example, /etc/default/grub:
GRUB_CMDLINE_LINUX_DEFAULT=""
GRUB_CMDLINE_LINUX=""
GRUB_CMDLINE_LINUX_XEN_REPLACE_DEFAULT="xen-pciback.permissive xen-pciback.passthrough=1 xen-pciback.hide=(01:00.0)(01:00.1) 'pci=resource_alignment=01:00.0,01:00.1'"
GRUB_CMDLINE_LINUX_XEN_REPLACE=""
GRUB_CMDLINE_XEN_DEFAULT="'dom0_mem=8192M,max:8192M' xsave=1 iommu=1 dom0_max_vcpus=4 dom0_vcpus_pin"
GRUB_CMDLINE_XEN=""

</quote>

Revisiting this, I therefore ended up with the following (truncated) /etc/default/grub file;


GRUB_DEFAULT=saved
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=Netrunner
GRUB_CMDLINE_LINUX_DEFAULT="resume=UUID=7fea5242-e72b-46ca-a9cf-68e0d780f7ff quiet splash"
GRUB_CMDLINE_LINUX=""

#added by me for xen config
GRUB_CMDLINE_LINUX_XEN_REPLACE_DEFAULT="nopat console=tty"
GRUB_CMDLINE_LINUX_XEN_REPLACE=""
GRUB_CMDLINE_XEN_DEFAULT="'dom0_mem=4096M,max:4096M' xsave=1 iommu=1 dom0_max_vcpus=2 dom0_vcpus_pin"
GRUB_CMDLINE_XEN=""
#end additions

# If you want to enable the save default function, uncomment the following
# line, and set GRUB_DEFAULT to saved.
GRUB_SAVEDEFAULT=true
<snipped here>


The proof's in the pudding, as they say, so here it is. Having applied all the above and rebooted, I could see that Xen is installed and xl is working;

[root@netrunner default]# sudo xl list
Name                                        ID   Mem VCPUs State Time(s)
Domain-0                                     0 32172     8     r-----     204.0

Yay!
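Two things worth checking at this point are whether the dom0 kernel actually received 'nopat' and whether the hypervisor options took effect. The listing above shows Domain-0 with 32172 MB and 8 VCPUs, which is more than the dom0_mem=4096M and dom0_max_vcpus=2 set in /etc/default/grub, so that capture may predate the final config (a guess on my part). Note also that edits to /etc/default/grub only apply after grub-mkconfig has been re-run. The helper names below are made up for illustration.

```shell
# Succeeds if a flag appears as a whole word on the kernel command line.
# Reads /proc/cmdline unless a different file is passed (handy for testing).
cmdline_has() {
    local flag="$1" file="${2:-/proc/cmdline}"
    tr ' ' '\n' < "$file" | grep -qxF -- "$flag"
}

# Print "<memory MB> <vcpus>" for Domain-0 from "xl list" output on stdin.
dom0_resources() {
    awk '$1 == "Domain-0" { print $3, $4 }'
}

# Usage on the Xen-booted host:
#   cmdline_has nopat && echo "nopat is active"
#   sudo xl list | dom0_resources
```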


Step 4 - Setting Up VMs

For me, getting a VM working was straightforward enough. Since I'd been using Xen under unRAID, I just added a second SSD drive to MediaServer8, formatted it as XFS, mounted it via fstab to /mnt/vmpool and copied the contents of my domains directory which I'd previously backed up.
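For anyone repeating the fstab step, the entry is a single line. The sketch below builds one; the all-zero UUID is a placeholder for whatever `blkid` reports for your partition, and the 'defaults' options are an assumption, not necessarily what is in use on MediaServer8.

```shell
# Build an fstab entry for an XFS volume. The UUID is whatever
# "blkid" reports for your partition; the one below is a placeholder.
fstab_line() {
    local uuid="$1" mountpoint="$2"
    printf 'UUID=%s\t%s\txfs\tdefaults\t0\t0\n' "$uuid" "$mountpoint"
}

fstab_line 00000000-0000-0000-0000-000000000000 /mnt/vmpool
# Append the printed line to /etc/fstab, then:
#   sudo mkdir -p /mnt/vmpool && sudo mount /mnt/vmpool
```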

Making one small change to the .cfg file to adjust the image path and disabling PCI passthrough for now, I was able to get the Windows 7 image to boot using;

xl create /mnt/vmpool/TVserver/TVserver.cfg   

Here's my VM config file for reference;


builder = 'hvm'
vcpus = '4'
memory = '12288'
device_model_version="qemu-xen-traditional"

disk = ['file:/mnt/vmpool/TVserver/TVserver.img,hda,w' ]
name = 'TVserver'
vif = [ 'mac=00:16:3e:51:20:4c,bridge=xenbr0,model=e1000' ]
on_poweroff = 'destroy'
on_reboot = 'restart'
on_crash = 'restart'
boot = 'c'
acpi = '1'
apic = '1'
viridian = '1'
xen_platform_pci='1'
sdl = '0'
vnc = '1'
vnclisten = '0.0.0.0'
vncpasswd = ''
vncdisplay = 0
stdvga = '0'
usb = '1'

usbdevice = [ 'tablet' ]


With those VNC settings, I can connect to the VM on the server's IP address at display 0 (TCP port 5900), no password;
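Xen maps vncdisplay N to TCP port 5900 + N, which is what a VNC client needs if it asks for a port rather than a display number. Since the config listens on 0.0.0.0 with an empty password, tunnelling over SSH is one way to avoid leaving that port open to the whole network. The user and host names in the comment are placeholders.

```shell
# TCP port for a Xen VNC display number (port = 5900 + display).
vnc_port() {
    echo $((5900 + $1))
}

vnc_port 0

# Optional: tunnel over SSH instead of exposing an open VNC port.
# "user" and "mediaserver8" are placeholder names.
#   ssh -L 5900:localhost:5900 user@mediaserver8
#   then point the VNC client at localhost:5900
```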





But wait, there's no network :-(

There's a little red X on the network icon and when I check, there's no network device at all listed in Device Manager. Despite this VM working perfectly on Xen in unRAID, something's gone wrong with the NetRunner build.

My system has been offline for a while now and the family are starting to get irate. I'll wait for Xen 4.5 to become available in AUR and try again soon.
