vSphere Distributed Switch refused to upgrade to version 6.5

Just after vSphere 6.5 was released I decided to upgrade my lab to 6.5. Most of the upgrade went pretty smoothly, but two of my three distributed switches refused to upgrade. Googling for a solution did not help much, probably because the product had been released just a day before 🙂 When I tried to upgrade, I got a message that the vDS config could not be read. I also noticed I was not able to upgrade these switches to enhanced LACP.

I did find some KB articles about incorrect vCenter database entries for LACP left behind by previous upgrades, so I had a feeling this was related to LACP (which I do not use) … Continue reading

vCenter 6.5 upgrade did not recognise vSphere 6.0 Platform Service Controller version

When I tried to upgrade my lab environment from vCenter 6.0 with an external PSC to vCenter 6.5, I ran into an annoying issue. I tried to upgrade my PSC, but the installer was not able to determine the version of my current PSC. It assumed it was 5.5 and asked me to confirm this, which, of course, I did not. There was no way to tell it that it was really 6.0 …

Continue reading

Invalid credentials message when registering vCenter Server with external Platform Service Controller in vSphere 6

When vSphere 6 was released, I decided to delete the RTM versions of my external Platform Services Controller and vCenter Server appliances and replace them with the GA versions.

Installing the PSC went fine, but when installing the vCenter appliance, I was not able to register it with the PSC. I kept getting the message “Invalid credentials” every time I entered the SSO administrator password. I redeployed the PSC several times, using different passwords, but had no luck registering the VCSA.

Continue reading

Nasty HP software bug hits vSphere 5.1 and 5.5 and helpful info to fix this

Recently I got a call from a customer who was no longer able to log in to his ESXi 5.5 hosts through SSH and could no longer vMotion VMs. It seemed the SSH daemon had died, and trying to start it again did not work.
I was able to log on to one of the hosts (a DL380 G8) and have a look at the vmkernel.log file.
In the log file I saw a line that read:
WARNING: Heap: 3058: Heap_Align(globalCartel-1, 136/136 bytes, 8 align) failed. caller: 0x41802a2ca2fd
Google brought me to VMware KB article 2085618 with the title “ESXi host cannot initiate vMotion or enable services and reports the error: Heap globalCartel-1 already at its maximum size.Cannot expand”, which sounded exactly like our problem. It seems to be caused by a memory leak in the hp-ams service.
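If you want to check a copy of vmkernel.log for the same symptom, a small script can pick out lines like the one above. This is only a minimal sketch: the regular expression is modelled on that single log line, the function names are made up for this example, and I have not verified what the two numbers before "bytes" mean, so they are returned as a plain pair.

```python
import re
from typing import Iterable, Iterator, Optional

# Pattern modelled on the single warning line quoted above (an assumption,
# not an official description of the vmkernel.log format).
HEAP_ALIGN_FAILURE = re.compile(
    r"Heap_Align\((?P<heap>[^,]+),\s*(?P<size_a>\d+)/(?P<size_b>\d+) bytes,\s*"
    r"(?P<align>\d+) align\) failed\.\s*caller:\s*(?P<caller>0x[0-9a-fA-F]+)"
)


def parse_heap_align_failure(line: str) -> Optional[dict]:
    """Return the fields of a Heap_Align failure line, or None if no match."""
    match = HEAP_ALIGN_FAILURE.search(line)
    if match is None:
        return None
    return {
        "heap": match.group("heap").strip(),
        # The two numbers before "bytes", kept as-is without interpretation.
        "sizes": (int(match.group("size_a")), int(match.group("size_b"))),
        "align": int(match.group("align")),
        "caller": match.group("caller"),
    }


def find_heap_align_failures(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed failures from an iterable of log lines."""
    for line in lines:
        parsed = parse_heap_align_failure(line)
        if parsed is not None:
            yield parsed


if __name__ == "__main__":
    sample = (
        "WARNING: Heap: 3058: Heap_Align(globalCartel-1, 136/136 bytes, "
        "8 align) failed. caller: 0x41802a2ca2fd"
    )
    print(parse_heap_align_failure(sample))
```

To scan a whole file, pass an open file handle to `find_heap_align_failures` and count the results per heap name.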

And that’s where the fun started …

Continue reading

Add RecoverPoint 4.1 with SRM RecoverPoint SRA 2.2 fails with error “SRA command ‘discoverArrays’ failed” UPDATED

During the installation and configuration of an SRM solution for a customer, based on EMC RecoverPoint 4.1, I ran into an interesting issue.

When I tried to add the RecoverPoint clusters on both sites using the RecoverPoint SRA 2.2, I received the following error message:

Error

Continue reading

Issue with jumbo frames after upgrading nested ESXi servers in the lab to 5.5 and fix

IMPORTANT UPDATE AT THE END OF THE ARTICLE

In the lab I use to test and play with numerous VMware solutions, I have several nested ESXi servers running. Nested ESXi servers are ESXi servers running as a VM. This is not a supported configuration, but it does help me to test and play around with software without having to rebuild my physical lab environment all the time.

So first, a little on the setup of my nested ESXi servers.

The VMs for my nested ESXi servers have 4 NICs.

The first NIC connects to “vESXi Trunk”. This is a port group on my physical ESXi hosts that is configured on a vDS with VLAN type “VLAN Trunking”, so I get all VLANs in my nested ESXi host:

[Screenshot, 2013-11-25 20:49:39]

I use this VLAN trunk to present my management network and my VM networks to my nested ESXi servers.

I also have a NIC that connects to my vMotion network, and two NICs that connect to my iSCSI networks. I use two subnets and two VLANs for my iSCSI connections.

[Screenshot, 2013-11-25 20:54:20]

In my physical setup I use jumbo frames on these networks. I did the same in my nested ESXi hosts, and it worked perfectly … until I upgraded my nested ESXi hosts to vSphere 5.5 … Continue reading

Connectivity issue when upgrading Dell R620 to ESXi 5.1 build 914609

When building a couple of new ESXi hosts based on Dell R620 systems, I used the Dell customized ISO VMware-VMvisor-Installer-5.1.0-799733.x86_64-Dell_Customized_RecoveryCD_A01.iso to install ESXi.

Those Dell systems had 4 Broadcom NICs (2 x 1Gb + 2 x 10Gb) and 2 Intel 10Gb NICs.

The install went fine, and I decided to upgrade to the latest patches using esxcli, since the hosts had no access to vCenter. All went fine until after the reboot. I noticed all Broadcom NICs were missing from my hosts, most likely due to a driver issue, so it was time to investigate. Continue reading

Serious performance impact on high IO VM with multiple snapshots

Recently I ran into a situation where a customer suffered severe performance issues on a virtualized SQL server. Inside the SQL server we noticed high CPU utilization, but the underlying ESXi hosts showed only relatively low CPU utilization for this VM.

Debugging the VM performance issue with esxtop showed very high co-stop (%CSTP) values.

According to the vSphere Monitoring and Performance guide, %CSTP is:

Percentage of time a resource pool spends in a ready, co-deschedule state.
NOTE You might see this statistic displayed, but it is intended for VMware use only.

Funny how VMware states that this metric is intended for VMware use only 🙂 Continue reading

vCenter 5.1 upgrade removes permissions in vCenter in non AD environment

While upgrading vCenter to 5.1 in an environment where we used local authentication on the vCenter server, we were in for a little surprise.

The original vCenter server had a lot of custom roles and user permissions defined on all kinds of objects in vCenter.

For the upgrade, we decided to install the SSO server on a separate server. When we upgraded vCenter and registered it with the SSO server, we suddenly received a message that users and groups were not found on the SSO server. That kind of made sense: even though we had recreated the users and groups on the SSO server, they had different security IDs. What we did not expect was that the upgrade process would remove all non-existing users and groups from the vCenter database, effectively removing all permissions from vCenter … Continue reading

ESX hosts not registering on EMC VNX (and fix)

While working on an upgrade to vSphere 5.0U1 in a Cisco UCS environment where the ESXi hosts boot from SAN, I noticed one of the hosts was not registered correctly on the EMC VNX: it showed up as unmanaged. Because the ESXi hosts boot from SAN, a host has to be registered before it can auto-register, and once it had been registered manually, the host was not able to update the registration. Continue reading