XEN Part III: The Conclusion
Posted by scheidel21 on June 24 2010 13:28:45
So from last time we know that I had compiled XEN and rebooted the server, only to be presented with a frozen boot process. Time to hit Google again. After some research I found...
I found that there is a bug in some of the newer kernels that has something to do with IRQ assignment, the chipset, and legacy support. In the end, adding "acpi_skip_timer_override" to the kernel boot arguments fixed the issue. After another reboot into my new kernel and XEN install, "xm pci-list-assignable-devices" showed that the device I had hidden was available to assign to a domU, and I thought I was off to the races. Not quite. I had not yet added the GPLPV drivers to my Windows VM and was relying on QEMU bridging. Well, guess what: my machine would not start, and after examining the logs it was because qemu could not create a tap device. This turns out to be because the configuration the XEN build process uses for the pvops kernel does not have tun device support turned on, neither as a module nor built into the kernel. I can't believe this isn't turned on in the configuration for the pvops kernel. I go into the pvops folder and run "make menuconfig", add support for tun devices, build the kernel and modules, install them, run depmod, update the initramfs, and reboot yet again. Finally the whole thing runs without a hitch, YAY! And my PCI passthrough works as well.
Now let me say this before I proceed: I misspoke earlier. I actually compiled both the 4.0 and 4.0.1-rc3-pre versions of XEN, and the one I was using after this reboot was 4.0.1, because I had tried it thinking it might resolve my boot issue. I mention this because of what I did next. I now had all the kinks worked out and wanted a clean system, so because I had the VM on a separate partition I decided I would reinstall a fresh copy of Debian and compile XEN fresh, once.
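As an aside, the missing tun support described above is the kind of thing that can be checked before rebooting. Here is a minimal sketch in Python (my own choice of language; the function name tun_support and the sample text are made up for illustration). It assumes the usual kernel .config format, where an enabled option reads CONFIG_TUN=y or CONFIG_TUN=m and a disabled one reads "# CONFIG_TUN is not set".

```python
import re


def tun_support(config_text):
    """Report how CONFIG_TUN is set in the text of a kernel .config file.

    Returns 'built-in' for =y, 'module' for =m, and 'disabled' when the
    option is commented out or missing entirely.
    """
    for line in config_text.splitlines():
        # fullmatch keeps similarly named options from matching by accident
        match = re.fullmatch(r"CONFIG_TUN=([ym])", line.strip())
        if match:
            return "built-in" if match.group(1) == "y" else "module"
    return "disabled"


# Hypothetical excerpt resembling a config without tun support
SAMPLE_CONFIG = """\
CONFIG_BRIDGE=m
# CONFIG_TUN is not set
"""

print("tun support:", tun_support(SAMPLE_CONFIG))
```

To use it on a real tree, read the .config file from the kernel source folder and pass its contents to tun_support. Anything other than "built-in" or "module" means qemu will probably not be able to create its tap device.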
Reinstalling Debian went well, but while installing dependencies I ran into trouble installing e2fslibs-dev. I now believe this was because I had added the backports repository: the installed version of e2fslibs was revision 6, the dev package was revision 3, and dev wouldn't install without revision 3 of the library. No big deal, I compile e2fslibs-dev from the source repositories, which also recompiles all of e2fsprogs. I install them, compile XEN, then try to reboot. The reboot command is not recognized, so I say "what?" I perform an updatedb, then run locate to find reboot: it's now in /usr/lib/klibc/bin instead of /bin. Hmmm, well, let's try a reboot. I run it from its new location and the server reboots OK. That is, until Linux loads: it boots, but then it can't find init in /sbin. At this point I'm like, damn it! So I just reinstall Debian again.
This time I install all the requirements for a XEN build immediately after reinstalling Debian. They ALL install, no problem. I download the XEN 4.0.0 source code and perform my compile. I reboot into the new pvops kernel running on XEN with my acpi_skip_timer_override option as well as my NEW xen-pciback.hide (it used to be pciback.hide before the newer kernels), and make sure VT-d is enabled with vtd=1. The system boots up fine but networking is funky: it's not coming up properly on boot, and I see errors about the NIC not being ready. If I run ifup eth0 and ifup eth1 the interfaces come up and work properly. I then try to run xm info and receive the message "Error: Unable to connect to xend: No such file or directory. Is xend running?" Crud. I run ps aux | grep xen, and it's not running. I check "/etc/init.d/xend status" and get nothing, just a return to the prompt. I run "/etc/init.d/xend start" and it does a little jig and then returns me to the prompt with no errors on the command line. What is going on?
Heck if I know exactly, but I check the logs for xend and I see "ERROR (SrvDaemon:349) Exception starting xend ((111, 'Connection refused'))". I do some research, and the only thing that looks relevant is a report about the 2.6.32 series kernel running on XEN 4.0, which says I should run 4.0.1. But I'm not running 2.6.32, I am running 2.6.31.13. Hey, it can't be any worse than it is now. I go and download the newest source for the testing version of 4.0.1 (at this time XEN 4.0.1-rc4-pre is the current testing release). I don't usually like to run non-stable software on a server, especially a production server, but it is only a minor bug fix release, so what the heck, no guts no glory. I compile this version and install it with the same kernel, and xend now runs fine. What an odd issue. Of course I still have to recompile my pvops kernel with tun support and reinstall that. But I do that, and we are finally up and running again. So remember: if it ain't broke, don't fix it!