To all, I hope the weekend was kind to each of you!

 

What I would like to discuss today is an infrequent annoyance, but one that does arise and is troublesome:  a full XenServer file system.  The reason to discuss this publicly is to share my own experience of such a situation, as well as cases I have troubleshot for clients, partners, and beyond.

 

The Situation.  The Symptoms.

Be it a XenServer pool or a stand-alone XenServer, you find that you are able to ping the XenServer in question, but you cannot establish an SSH connection to the XenServer host.

You try to access the XenServer from XenCenter and find that the XenServer in question has the nasty server icon: the one with the red indicator showing that XenCenter cannot communicate with it.

You try to access the server, either physically via a monitor or through iLO/iDRAC/etc., by logging in and receive the following message:

 

 

XSCONSOLE stating: “Login Failed: (‘Critical error – immediate abort’, ‘26’)”.

 

The Cause.

The cause is that the root (/) file system for XenServer is full.  There remains no more disk space for processes, file creation, and so forth.

 

The Solution.

Obviously, one wants to gain access to the root file system in order to clean out any old patch files, syslog files, /tmp directory files, and general debris left over for one reason or another.

The difficulty is that, because the root filesystem is full, a reboot is required before any of that cleanup can take place.  What must be done after the reboot depends on how full the filesystem is.

 

Option One.

Power down/Power up the XenServer in question.  While the system is booting up, a Syslinux line will appear with the following prompt (for only a few seconds):

boot:

Type in menu.c32 and select safe mode.

If you are able to boot into a file system that is not read-only, perform the following to clear out old logs as well as old patch files:

cd /var/patch

ls

(If there are older patches left over, the output will be similar to this)

17cb0ee5-5058-5434-efb5-46daeec1bd99      7ad75843-ecba-6dfc-ce3f-737aef946347
191fe9a5-a40b-fe61-38d0-e8f4dfcb2cbf      88af7eed-73f0-05a2-f8ed-ea672adcfbc2
36f72163-d39c-3987-93c8-3a28f5690400      applied
3a717185-3cee-aaba-0843-b7cd13212901      ee862a67-032c-2453-2424-298bc22205e9
6dd65837-555b-6573-b9b4-4f7f7601d4b1      f43b2434-325f-084c-9073-82bf5c26df0b

Remove all files with the exception of the “applied” directory.
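
For anyone who would rather not delete the files one at a time, the removal step can be sketched as a small shell function. This is only a sketch under assumptions: the function name clean_patch_dir is my own (hypothetical), it assumes a find that supports -mindepth/-maxdepth (GNU find does), and it only touches regular files at the top level of the directory it is given, so the “applied” directory is left alone.

```shell
# Hypothetical helper: remove leftover patch files but keep "applied".
# Assumes GNU find (for -mindepth/-maxdepth).
clean_patch_dir() {
    dir="${1:-/var/patch}"
    # Only regular files directly inside $dir are removed; directories
    # such as "applied" are never matched by -type f.
    find "$dir" -mindepth 1 -maxdepth 1 -type f ! -name applied -exec rm -f {} +
}
```

It would be called as clean_patch_dir /var/patch. Running ls afterwards should show only the “applied” directory.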

 

Next, investigate /var/log/ and perform the following:

cd /var/log

ls

If there is an excess of log files, remove all compressed log files using (very carefully!):

rm -f *.gz
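
Because a wildcard delete is unforgiving, it can help to preview what would be removed before running it. Here is a minimal sketch, assuming GNU find. The function name list_gz_logs is hypothetical, and nothing is deleted by it.

```shell
# Hypothetical helper: list the compressed logs (names ending in .gz)
# in a directory, without deleting anything.
list_gz_logs() {
    dir="${1:-/var/log}"
    # -maxdepth 1 keeps the listing to the directory itself, mirroring
    # a shell glob run from inside that directory.
    find "$dir" -maxdepth 1 -type f -name '*.gz' | sort
}
```

Calling list_gz_logs /var/log prints one path per line. If anything in that list should be kept, move it elsewhere before deleting.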

 

Lastly, reboot as normal.  If the system re-appears in XenCenter, or if you are able to log in to the XenServer in question, execute the following command to confirm that disk space has been freed:

df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             4.0G  3.0G  774M  80% /

 

Option Two.

Option Two is a bit more complicated in terms of steps, but should be easy enough to follow if one has experience with a boot CD.

Download a copy of Knoppix’s Live CD.  Burn it, insert it into the XenServer in question, reboot the XenServer in question, and ensure that the Live CD is the primary boot target.

Once Knoppix is loaded, open a terminal and create a directory called “cleanup”.  Here is an example:

mkdir ~/cleanup

XenServer usually installs to /dev/sda1 on the bare-metal system’s local hard disk.  The goal is to mount this partition, under Knoppix, to the “cleanup” directory so that the steps mentioned in Option One can be duplicated.

Again, from a terminal, run the following:

mount /dev/sda1 ~/cleanup
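
Before deleting anything, it may be worth confirming that the partition just mounted really is the XenServer root, since the disk layout can differ from system to system. A minimal sketch follows. The function name looks_like_xs_root is hypothetical, and the check simply looks for the two directories this article cleans up.

```shell
# Hypothetical helper: sanity-check that a mounted partition looks like
# the XenServer root before anything is deleted from it.
looks_like_xs_root() {
    root="${1:-$HOME/cleanup}"
    # Both directories are the ones cleaned up in Option One.
    [ -d "$root/var/patch" ] && [ -d "$root/var/log" ]
}
```

For example: looks_like_xs_root ~/cleanup && echo "Looks like the XenServer root". If nothing is printed, the wrong partition is probably mounted.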

Once sda1 is mounted in Knoppix’s Live CD environment, the “cleanup” directory essentially becomes the root.  Change directory into cleanup:

cd ~/cleanup/

 

From there, change directory to var/patch as follows:

cd var/patch

As in Option One, execute “ls” and remove every file in the directory, leaving only the “applied” directory in place.

 

Lastly, change to the mounted root partition’s log directory and clean up any excess compressed log files:

cd ../log/

rm *.gz

 

Once complete, execute:

cd ~/

umount ~/cleanup

halt

 

The system will halt.  Remove the Knoppix Live CD and power on the XenServer in question so that it can boot normally.

 

How Can I Prevent This?

A general check of all filesystems can be performed by executing the following on any XenServer:

df -h

In the output, if the Use% value for the root (/) filesystem is high, make a note of it:
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             4.0G  3.0G  774M  80% /
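
The same check can be scripted so that it does not depend on someone remembering to look. The sketch below is an illustration only: the function names fs_use_percent and check_root_usage are hypothetical, and the default threshold of 80 is an arbitrary example value, not a Citrix recommendation.

```shell
# Hypothetical helper: print the Use% of a filesystem as a bare number.
# "df -P" forces the POSIX one-line-per-filesystem output format.
fs_use_percent() {
    df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Warn when the root filesystem reaches a threshold (80 is an arbitrary
# example value).
check_root_usage() {
    threshold="${1:-80}"
    used=$(fs_use_percent /)
    if [ "$used" -ge "$threshold" ]; then
        echo "WARNING: / is ${used}% full"
        return 1
    fi
    echo "OK: / is ${used}% full"
}
```

Running check_root_usage 80 prints a single line and returns a non-zero status once the threshold is reached, which makes it easy to call from a scheduled job or a monitoring script.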

 

A high value implies that one needs to check /etc/logrotate.conf to ensure that log rotation is configured so that logging cannot fill up the /var/log directory (a sub-directory of the / filesystem).
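
As a starting point for that check, the rotation-related settings can be pulled out of the file for a quick review. A minimal sketch follows. The function name show_rotation_settings is hypothetical, and the directives it looks for (daily, weekly, monthly, rotate, size, compress) are standard logrotate directives, not a complete list.

```shell
# Hypothetical helper: show the rotation-related directives in a
# logrotate configuration file, skipping comments and other settings.
show_rotation_settings() {
    conf="${1:-/etc/logrotate.conf}"
    grep -E '^[[:space:]]*(daily|weekly|monthly|rotate|size|compress)' "$conf"
}
```

Run show_rotation_settings with no arguments to inspect /etc/logrotate.conf. Files under /etc/logrotate.d/ can be passed in the same way, one at a time.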

Likewise, the /var/patch directory should be checked after system updates to ensure that old patch files (accumulated over time from updates) are removed.
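
To see at a glance how much space the usual suspects are taking, something along these lines could be run after each round of updates. It is a sketch only, and the function name report_dir_usage is hypothetical.

```shell
# Hypothetical helper: print the size in KiB of each directory given,
# skipping any that do not exist.
report_dir_usage() {
    for dir in "$@"; do
        [ -d "$dir" ] && du -sk "$dir"
    done
    return 0
}
```

For example: report_dir_usage /var/patch /var/log /tmp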

 

Feel free to ask any questions about logrotate.conf, remote logging, and so forth.  I hope this blog helps any Citrix XenServer Administrator avoid this issue.

 

And this is from my virtual desktop to you!

–jkbs

@xenfomation