I set up my new lab server yesterday. In essence it is a rack-mounted server with a Core i7-2600 processor, 32 GB RAM and 3 TB of (slow) disk. When I moved some of my VMs across from an otherwise identical system (it had a Core i7-920) and tried to start a domU, it repeatedly crashed. The message from the console was a simple question: is xend running?
I couldn’t believe my eyes: identical software now produced segmentation faults? How is that possible? I am using Xen 4.2, kernel 3.1.9-1.4-xen and libvirt-0.9.6-3.3.1.x86_64.
I started troubleshooting with the Xen logs. There was no output in the debug log, but xend.log showed these lines:
...
[2012-02-21 22:36:43 1212] INFO (image:187) buildDomain os=linux dom=1 vcpus=2
[2012-02-21 22:36:43 1212] DEBUG (image:819) domid = 1
[2012-02-21 22:36:43 1212] DEBUG (image:820) memsize = 1024
[2012-02-21 22:36:43 1212] DEBUG (image:821) image = /m/xen/kernels/ol62/vmlinuz
[2012-02-21 22:36:43 1212] DEBUG (image:822) store_evtchn = 1
[2012-02-21 22:36:43 1212] DEBUG (image:823) console_evtchn = 2
[2012-02-21 22:36:43 1212] DEBUG (image:824) cmdline = vnc
[2012-02-21 22:36:43 1212] DEBUG (image:825) ramdisk = /m/xen/kernels/ol62/initrd.img
[2012-02-21 22:36:43 1212] DEBUG (image:826) vcpus = 2
[2012-02-21 22:36:43 1212] DEBUG (image:827) features =
[2012-02-21 22:36:43 1212] DEBUG (image:828) flags = 0
[2012-02-21 22:36:43 1212] DEBUG (image:829) superpages = 0
[2012-02-21 22:36:44 1210] CRITICAL (SrvDaemon:232) Xend died due to signal 11! Restarting it.
[2012-02-21 22:36:44 3288] INFO (SrvDaemon:332) Xend Daemon started
...
The output of xm list showed an unknown domain.
The clue that finally led me to Novell’s bugzilla database was in /var/log/messages:
kernel: [ 230.384375] xend: segfault at 2408d7f6ea8 ip 00007fd098a9d779 sp 00007fd08d7f6c48 error 4 in libxenguest.so.4.2.0[7fd098a82000+25000]
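If you want to pull the useful bits out of a line like this, here is a minimal Python sketch. The names parse_segfault and SEGFAULT_RE are my own inventions for illustration, and the pattern only assumes the message layout seen in the line above. The error-code bits are read according to the usual x86 page-fault convention (bit 0: page was present, bit 1: write access, bit 2: user mode).

```python
import re

# Hypothetical helper: matches the "segfault at" message layout shown above.
SEGFAULT_RE = re.compile(
    r"(?P<proc>\S+): segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<lib>\S+)"
    r"\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Return a dict describing the fault, or None if the line does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    err = int(m.group("err"), 16)
    return {
        "process": m.group("proc"),
        "library": m.group("lib"),
        # Offset of the faulting instruction inside the mapped library.
        "offset": ip - base,
        # x86 page-fault error code bits.
        "page_present": bool(err & 1),
        "write": bool(err & 2),
        "user_mode": bool(err & 4),
    }


SAMPLE = (
    "kernel: [ 230.384375] xend: segfault at 2408d7f6ea8 "
    "ip 00007fd098a9d779 sp 00007fd08d7f6c48 error 4 "
    "in libxenguest.so.4.2.0[7fd098a82000+25000]"
)

info = parse_segfault(SAMPLE)
print(info["process"], info["library"], hex(info["offset"]))
```

For the line above this reports xend faulting in libxenguest.so.4.2.0 at offset 0x1b779, on a user-mode read of a page that was not mapped. The process and library names are what make a bug tracker search productive.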
Now to save you from a long session with your favourite search engine, I would like to point this thread out to you:
In a nutshell, all processors with “xsave” in the processor flags output are affected by this bug. Follow the steps in comment number 31 to add the fixes. It worked just beautifully after that.
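To check quickly whether a box falls into that category, something like the following sketch should do. It assumes that the "processor flags output" is the flags line in /proc/cpuinfo on Linux. The function names are made up for this example.

```python
def cpu_flags(cpuinfo_text):
    """Return the flags from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_xsave(cpuinfo_text):
    # Exact match on the flag name, so "xsaveopt" alone does not count.
    return "xsave" in cpu_flags(cpuinfo_text)


def host_has_xsave(path="/proc/cpuinfo"):
    """Check the running host; the path is an assumption for Linux systems."""
    with open(path) as f:
        return has_xsave(f.read())
```

Calling host_has_xsave() on the machine in question returns True if the flag is listed. Comparing flag sets rather than searching for a substring avoids a false positive from similarly named flags.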
The machine I am using is a rented root server and costs me a measly EUR 59 per month. The nice aspect is that I just switched from my “old” Core i7-920 to the new machine and don’t have old hardware lying around. And as it’s in a hosted data centre, all I need is an Internet connection to access it from anywhere. IMO a great alternative to the big bulky laptop approach.