The ramdisk ‘var’ is full on ESXi host and fix (without maintenance mode or reboot)

I recently encountered an issue where vMotions on a host would fail, the host would disconnect from vCenter, and some other strange errors appeared.

This was an HP host installed with the HP ISO image, but I am not sure whether that is the cause of this issue.

While investigating the logs on the host, I noticed that /var on the ramdisk was full.

Running vdf -h showed that the available space for /var on the ramdisk was 0%.
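To make that check quicker to read, the ramdisk table can be filtered down to the var row. This is only a sketch: the helper name var_ramdisk_line is made up for illustration, and it assumes the ramdisk name is the first column of the vdf -h ramdisk table.

```shell
# Print only the 'var' row of the ramdisk table from vdf-style output on stdin.
# Assumption: the ramdisk name is the first whitespace-separated column.
var_ramdisk_line() {
    awk '$1 == "var"'
}

# Only call vdf where it exists (i.e. on the ESXi host itself).
if command -v vdf >/dev/null 2>&1; then
    vdf -h | var_ramdisk_line
fi
```

If the Use% column of that row shows 100%, the var ramdisk is full.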

Looking in /var/log, I noticed that all log files were symlinks to /scratch except for those in the EMU directory, where some Emulex process seemed to be filling up a log file.
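One way to spot such files is to list only the regular files under /var/log, since the symlinked logs do not take up ramdisk space. The sketch below uses a hypothetical helper, real_files_by_size, and assumes the find, du and sort tools in the ESXi shell behave like their usual BusyBox counterparts.

```shell
# List regular files (not symlinks) under a directory with their size in KiB,
# smallest first. find does not follow symlinks by default, so log files that
# are only links to /scratch are skipped.
real_files_by_size() {
    find "$1" -type f -exec du -k {} \; | sort -n
}

# On the host you would run:
#   real_files_by_size /var/log
```

The largest file ends up on the last line, which in my case would have been the file in the EMU directory.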

After removing the log file /var/log/EMU/mili/mili2d.log and restarting hostd, space was freed up on /var in the ramdisk, but the log file returned and started filling up again.

Googling, I found a suggestion to remove the Emulex VIBs when not using an Emulex HBA, but these hosts did have Emulex HBAs.

After some more research I found a fix that did not need a reboot or maintenance mode (which is great since vMotion stopped working on these hosts):

First, edit /etc/config/EMU/mili/libmili.conf and set MILILogLevel to 0

Next, remove the logfile /var/log/EMU/mili/mili2d.log

And then restart hostd with /etc/init.d/hostd restart

Check with vdf -h and /var should have disk space available on the ramdisk.

Paste these commands on a host that has the issue if you do not want to edit the conf file manually and want to fix the host in one go.

The first line changes the log level from 4 to 0, the second line removes the log file, the third line restarts hostd and the last line outputs disk utilisation so you can check whether space is available again on var on the ramdisk.

sed -i 's/^MILILogLevel 4/MILILogLevel 0/g' /etc/config/EMU/mili/libmili.conf
rm /var/log/EMU/mili/mili2d.log
/etc/init.d/hostd restart
vdf -h
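The sed line above only matches a log level of 4. One reader in the comments found hosts set to some other value, so a slightly more general variant may be useful. This is a sketch, not a tested procedure: the function name set_mili_loglevel_zero is made up, and it assumes the setting is always written as "MILILogLevel" followed by a space and a number at the start of a line.

```shell
# Set MILILogLevel to 0 in the given config file, whatever numeric level it has now.
# Assumption: the setting is written as "MILILogLevel <number>" at the start of a line.
set_mili_loglevel_zero() {
    sed -i 's/^MILILogLevel [0-9][0-9]*/MILILogLevel 0/' "$1"
}

# On the affected host (left as comments so nothing runs by accident):
#   set_mili_loglevel_zero /etc/config/EMU/mili/libmili.conf
#   grep '^MILILogLevel' /etc/config/EMU/mili/libmili.conf   # confirm the new value
#   rm /var/log/EMU/mili/mili2d.log
#   /etc/init.d/hostd restart
#   /etc/init.d/vpxa restart   # only if the host does not reconnect to vCenter
#   vdf -h
```

The vpxa restart is based on a reader's report that the vCenter connection did not come back after restarting hostd alone; I did not need it myself.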

 

6 thoughts on “The ramdisk ‘var’ is full on ESXi host and fix (without maintenance mode or reboot)”

  1. Thanks for posting this, it resolved our issue.
    VMware support says this is fixed in 6.7 Update 2.
    We are running HPE SimpliVity and had this issue with our storage nodes. The compute nodes were fine. We checked the libmili.conf file on the compute nodes and the log level was already set to 0. The SimpliVity nodes were set to some other value.

  2. This is the first solution I found that does not just remove the Emulex driver.
    It worked perfectly, except that I had to restart the vpxa agent too.
    After I restarted hostd, the connection to my vCenter was gone and didn’t come back. I wasn’t brave enough to try this on production ESX hosts, so I can’t tell you whether it would restart any machines or cause other problems.

    Thanks for the post 🙂

  3. Great blog… it really helped me fix an issue on a host that we could not reboot because our vCenter was residing on it. Thanks so much
