In the lab I use to test and play with numerous VMware solutions, I have several nested ESXi servers running. Nested ESXi servers are ESXi servers running as a VM. This is not a supported configuration, but it lets me test and play around with software without having to rebuild my physical lab environment all the time.
So first, a little on the setup of my nested ESXi servers.
The VMs for my nested ESXi servers have 4 NICs.
The first NIC connects to “vESXi Trunk”. This is a port group on my physical ESXi hosts that is configured on a vDS with VLAN type “VLAN Trunking”, so I get all VLANs in my nested ESXi host.
I use this VLAN trunk to present my management network and my VM networks to my nested ESXi servers.
I also have a NIC that connects to my vMotion network, and two NICs that connect to my iSCSI networks. I use two subnets and two VLANs for my iSCSI connections.
In my physical setup I use jumbo frames in these networks, and I did the same in my nested ESXi hosts, and it worked perfectly … Until I upgraded my nested ESXi hosts to vSphere 5.5 … Continue reading →
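As a side note on the jumbo frames mentioned above: a common way to check that a 9000-byte MTU really works end to end is to ping from a VMkernel interface with the don't-fragment bit set and the largest payload that still fits in one packet. The sketch below only does the arithmetic and builds the command string; the target address is a made-up example, and the 20-byte IPv4 header assumes no IP options.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options (assumption)
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def max_ping_payload(mtu: int) -> int:
    """Return the largest ICMP payload that fits in one unfragmented IPv4 packet."""
    payload = mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload < 0:
        raise ValueError(f"MTU {mtu} is too small to carry an ICMP echo")
    return payload


def vmkping_command(target_ip: str, mtu: int = 9000) -> str:
    """Build a vmkping command line: -d sets don't-fragment, -s sets payload size."""
    return f"vmkping -d -s {max_ping_payload(mtu)} {target_ip}"


if __name__ == "__main__":
    # 192.168.10.20 is a hypothetical iSCSI target address, not one from this lab.
    print(vmkping_command("192.168.10.20"))
```

If the ping with the full-size payload fails while a default-size ping succeeds, some hop in the path (vSwitch, port group, physical switch or the target) is probably not configured for jumbo frames.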
When building a couple of new ESX hosts based on Dell R620 systems, I used the Dell customized ISO VMware-VMvisor-Installer-5.1.0-799733.x86_64-Dell_Customized_RecoveryCD_A01.iso to install ESXi.
Those Dell systems had 4 Broadcom NICs (2 x 1Gb + 2 x 10Gb) and 2 Intel 10Gb NICs.
The install went fine, and I decided to upgrade to the latest patches using esxcli since the hosts had no access to vCenter. All went fine until after the reboot, when I noticed all Broadcom NICs were missing from my hosts, most likely due to a driver issue, so time to investigate. Continue reading →
Recently I ran into a situation where a customer suffered severe performance issues on a virtualized SQL server. Inside the SQL server we noticed high CPU utilization, but the underlying ESX hosts only showed relatively low CPU utilization for this VM.
Debugging the VM performance issue with esxtop showed very high co-stop (%CSTP) values.
According to the vSphere Monitoring and Performance guide, %CSTP is:
Percentage of time a resource pool spends in a ready, co-deschedule state.
NOTE You might see this statistic displayed, but it is intended for VMware use only.
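To spot VMs with high co-stop outside of the interactive esxtop screen, the counters can be exported in batch mode (`esxtop -b`) and scanned afterwards. The sketch below is a minimal example under a few assumptions: the export is a CSV with one header row, the co-stop columns can be recognised by the text "% CoStop" in their header, and the threshold you pass in is your own choice rather than an official VMware value.

```python
import csv
import io


def peak_values(csv_text: str, marker: str = "% CoStop") -> dict:
    """Return the highest value seen in every column whose header contains marker.

    The default marker is an assumption about how the export names co-stop columns.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return {}
    columns = [i for i, name in enumerate(header) if marker in name]
    peaks = {header[i]: 0.0 for i in columns}
    for row in reader:
        for i in columns:
            try:
                value = float(row[i])
            except (ValueError, IndexError):
                continue  # skip empty or non-numeric samples
            peaks[header[i]] = max(peaks[header[i]], value)
    return peaks


def flag_high(peaks: dict, threshold: float) -> list:
    """Return the sorted column names whose peak exceeds the given threshold."""
    return sorted(name for name, value in peaks.items() if value > threshold)


if __name__ == "__main__":
    # Synthetic sample data with hypothetical VM names, for illustration only.
    sample = (
        "Time,sql01 % CoStop,sql01 % Ready,web01 % CoStop\n"
        "t1,12.5,3.0,0.4\n"
        "t2,18.0,2.0,0.1\n"
    )
    print(flag_high(peak_values(sample), threshold=3.0))
```

Looking at the peak rather than the average is a deliberate choice here, since short bursts of co-stop are easy to miss in averaged numbers.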
While upgrading vCenter to 5.1 in an environment where we used local authentication on the vCenter server, we were in for a little surprise.
The original vCenter server had a lot of custom roles and user permissions defined, on all kinds of objects in vCenter.
When we did the upgrade, we decided to install the SSO server on a separate server. Once the vCenter upgrade was done and vCenter was registered with the SSO server, we suddenly received a message that users and groups were not found on the SSO server. That kind of made sense: even though we had recreated the users and groups on the SSO server, they had different security IDs. What we did not expect was that the upgrade process would remove all nonexistent users and groups from the vCenter database, effectively removing all permissions from vCenter … Continue reading →
While working on an upgrade to vSphere 5.0U1 on a Cisco UCS environment, where the ESX hosts boot from SAN, I noticed one of the hosts was not registered correctly on the EMC VNX, as it showed up as unmanaged. Because the ESX hosts boot from SAN, the host has to be registered before it can auto register, and when it was registered manually the host was not able to update the registration. Continue reading →
In a vSphere environment I am working on we use VMware vShield Edge to do firewalling, NAT and terminate VPNs for customers.
On several occasions we were not able to make config changes to some of our VSE devices when we tried to publish the changes we made from within vShield Manager. Whenever we tried to publish the changes, we received an error message in vShield Manager saying it could not reach the vShield Edge device we were trying to configure.
Next to that, we noticed a lot of errors in the vShield Manager System Events tab for this specific Edge device regarding “Multiple heartbeats missed from appliance”.
Another thing we noticed was that the VMware Tools for this specific VSE device did not seem to be running.
We decided to open a case at VMware and were told this is a known issue with the version of vShield we are running (5.0.1) and that it will be fixed in a future version. (It is not fixed in version 5.0.2, which was released recently.) Continue reading →
Today I received an e-mail from EMC saying they are able to reproduce our issues in their lab, which is an important step towards getting these issues resolved, since we can only do limited tests in our production environment. Great news to start the weekend. I will update again when I get more details on this.
My first reaction was “Why would Gabe want to disable VAAI on a per array basis in the first place?” so I asked.
His answer was pretty simple and straightforward. He was working on an environment where ESX5 hosts had both EMC CX4s and VNXes connected, and VAAI was not supported on vSphere 5 for CX4, so he had to disable VAAI for the CX4s and wanted to leave it on for the VNXes. Continue reading →
Today I was working on upgrading some hosts in a vSphere 5 environment that is using Cisco Nexus 1000V virtual switches. I imported the extension bundle in Update Manager, created a baseline, and scanned the hosts. After a couple of seconds, I got a message in vCenter telling me the scan failed:
Scan entity <hostname> Host cannot download files from VMware vSphere Update Manager patch store. Check the network connectivity and firewall setup, and check esxupdate logs for details. Continue reading →