Connectivity issue when upgrading Dell R620 to ESXi 5.1 build 914609

While building a couple of new ESX hosts on Dell R620 systems, I used the Dell customized ISO VMware-VMvisor-Installer-5.1.0-799733.x86_64-Dell_Customized_RecoveryCD_A01.iso to install ESXi.

These Dell systems had 4 Broadcom NICs (2 x 1Gb + 2 x 10Gb) and 2 Intel 10Gb NICs.

The install went fine, and I decided to upgrade to the latest patches using esxcli, since the hosts had no access to vCenter. All went well until after the reboot, when I noticed that all Broadcom NICs were missing from my hosts. This was most likely a driver issue, so it was time to investigate.

Serious performance impact on high IO VM with multiple snapshots

Recently I ran into a situation where a customer suffered severe performance issues on a virtualized SQL server. Inside the SQL server we saw high CPU utilization, but the underlying ESX hosts showed only relatively low CPU utilization for this VM.

Debugging the VM performance issue with esxtop showed very high co-stop (%CSTP) values.

According to the vSphere Monitoring and Performance guide, %CSTP is

Percentage of time a resource pool spends in a ready, co-deschedule state.
NOTE You might see this statistic displayed, but it is intended for VMware use only.

Funny how VMware states that this metric is intended for VMware use only 🙂

vCenter 5.1 upgrade removes permissions in vCenter in non AD environment

While upgrading vCenter to 5.1 in an environment where we used local authentication on the vCenter server, we were in for a little surprise.

The original vCenter server had a lot of custom roles and user permissions defined on all kinds of objects in vCenter.

For the upgrade, we decided to install the SSO server on a separate server. Once vCenter was upgraded and registered with the SSO server, we suddenly received a message that users and groups were not found on the SSO server. That kind of made sense: even though we had recreated the users and groups on the SSO server, they had different security IDs. What we did not expect was that the upgrade process would remove all non-existing users and groups from the vCenter database, effectively removing all permissions from vCenter …

ESX hosts not registering on EMC VNX (and fix)

While working on an upgrade to vSphere 5.0U1 in a Cisco UCS environment where the ESX hosts boot from SAN, I noticed that one of the hosts was not registered correctly on the EMC VNX: it showed up as unmanaged. Because the ESX hosts boot from SAN, a host has to be registered manually before it can auto-register, and once it was registered manually, the host was not able to update the registration.

Error 29107 when upgrading to vCenter 5.1 (and fix)

When I tried to upgrade my vCenter 5.0U1 Server to 5.1, all seemed to go well, up until the moment vCenter tried to register with SSO.

I received an error message “Error 29107. The service or solution user is already registered. Check Vm_ssoreg.log in system temporary folder for details”

I checked this log, but it did not really point me in the right direction.

Then I found a post in the 5.1 beta archive that said the unique identifier for a service to register with SSO is the Common Name from its certificate.

Issue with vShield Edge devices due to full root filesystem

In a vSphere environment I am working on we use VMware vShield Edge to do firewalling, NAT and terminate VPNs for customers.

On several occasions we were not able to make config changes to some of our VSE devices when we tried to publish the changes from within vShield Manager. Whenever we tried to publish the changes, vShield Manager reported that it could not reach the vShield Edge device we were trying to configure.

Next to that, we noticed a lot of errors in the vShield Manager System Events tab for this specific Edge device regarding “Multiple heartbeats missed from appliance”.

Another thing we noticed was that the VMware Tools for this specific VSE device did not seem to be running.

We decided to open a case with VMware and were told this is a known issue with the version of vShield we are running (5.0.1) and that it will be fixed in a future version. (It is not fixed in version 5.0.2, which was released recently.)

Update on LUN connectivity issues with Storage vMotion on EMC VNX when using VAAI

A while ago I posted an article on the LUN connectivity issues we experienced with Storage vMotion on EMC VNX when using VAAI.

Today I received an e-mail from EMC saying they are able to reproduce our issues in their lab. This is an important step towards getting these issues resolved, since we can only do limited tests in our production environment. Great news to start the weekend. I will update again when I get more details on this.

EMC WebEx on VAAI support in ESX5 for CX4 on the 28th of June

Today I received a tweet from Chad Sakac, SVP Global SE at EMC, saying that he will be discussing some of the questions and concerns I raised in my blog post Challenges when upgrading environment with EMC CX4 to vSphere 5 and mixed CX4/VNX environment in next week's Chad’s Choice WebEx, and in a blog post on his Virtual Geek blog.

Subjects for this call will be:

Idea: enabling VAAI on ESX5 on a per-array basis

Yesterday I wrote an article on issues when using a combination of CX4 and VNX in a vSphere 5 environment, since according to VMware the CX4 does not support VAAI on ESX5 and the VNX does. A simple solution would be to disable VAAI on all ESX5 hosts, but in that case your VNXes would also lose VAAI. Victor Forde pointed me to a blog post by Chris Wahl titled “Forcing the NMP Plugin for Microsoft Clustering LUNs on vSphere” (good post, by the way).

When reading this article, I realized you can not only use the array vendor and model strings to assign the VAAI filter driver to an array, but you might also be able to use the location (the combination of adapter, target, channel and LUN) for a device to assign the VAAI filter and VAAI Plugin to a specific array.

In short, you would need to remove the default VAAI filter and VAAI plugin rules for vendor=DGC and model=*, and replace them with location-based claim rules that filter on the target identifier of the array that needs to have VAAI enabled.
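To make the idea a bit more concrete, here is a small Python toy model of the matching logic. This is only a sketch and not how ESXi implements claim rules: the `Device` and `ClaimRule` classes, the rule numbers, the adapter name `vmhba1`, the model strings and the target numbers are all made up for illustration. It assumes the CX4 sits behind target 0 and the VNX behind target 1 on the same adapter.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Device:
    """A LUN as seen by the host (all values below are hypothetical)."""
    vendor: str
    model: str
    adapter: str
    channel: int
    target: int
    lun: int


@dataclass(frozen=True)
class ClaimRule:
    """Toy claim rule: a field set to None or "*" matches anything."""
    rule_id: int
    plugin: str
    vendor: Optional[str] = None
    model: Optional[str] = None
    adapter: Optional[str] = None
    channel: Optional[int] = None
    target: Optional[int] = None
    lun: Optional[int] = None

    def matches(self, dev: Device) -> bool:
        pairs = [
            (self.vendor, dev.vendor),
            (self.model, dev.model),
            (self.adapter, dev.adapter),
            (self.channel, dev.channel),
            (self.target, dev.target),
            (self.lun, dev.lun),
        ]
        for wanted, actual in pairs:
            if wanted is None or wanted == "*":
                continue  # wildcard, nothing to compare
            if wanted != actual:
                return False
        return True


def claimed_by(rules: List[ClaimRule], dev: Device) -> Optional[str]:
    """Return the plugin of the lowest-numbered matching rule, or None."""
    for rule in sorted(rules, key=lambda r: r.rule_id):
        if rule.matches(dev):
            return rule.plugin
    return None


# Hypothetical LUNs: same vendor string, different target per array.
cx4_lun = Device("DGC", "RAID 5", "vmhba1", 0, 0, 10)
vnx_lun = Device("DGC", "VRAID", "vmhba1", 0, 1, 10)

# Vendor/model rule: cannot tell the two arrays apart.
vendor_rules = [ClaimRule(65430, "VAAI_FILTER", vendor="DGC", model="*")]

# Location rule: only the target that belongs to the VNX.
location_rules = [
    ClaimRule(5001, "VAAI_FILTER", adapter="vmhba1", channel=0, target=1)
]

for label, rules in (("vendor", vendor_rules), ("location", location_rules)):
    print(label, "CX4:", claimed_by(rules, cx4_lun),
          "VNX:", claimed_by(rules, vnx_lun))
```

In this model the vendor/model rule attaches the filter to both LUNs, while the location rule attaches it only to the LUN behind the VNX target. Whether real claim rules behave the same way for the VAAI filter and plugin classes is exactly the part that still needs testing.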

Unfortunately I don’t have a VNX in my lab (Someone at EMC that wants to trade my VNXe and a case of beer for a VNX? 😉 ) so I am not able to test if this would really do the trick.