VCSA Migration – Syslog and Dump Collector

In preparation for an upcoming project it was time to try out the vCenter Server Migration Tool for 6.0 (vCenter Server 6 U2M). I started by reading the vSphere Migration Guide and Release Notes to find out what the prerequisites are and whether any known issues have been reported.

vSphere Migration Guide

Release Notes

In the release notes I found the following known issue.

Release Notes Known Issue - Syslog and Dump Collector

In vCenter Server 5.5 for Windows, the Syslog Collector and Dump Collector are optional components that you deploy separately. In the vCenter Server Appliance 6.x, both are integrated into the appliance.

The pictures below show the difference in the vSphere Web Client. The left picture shows the vCenter Server Windows 5.5 with Syslog and Dump Collector and the right picture shows the vCenter Server Appliance 6.0 with the integrated Syslog and Dump Collector.

As you can see there are no more items for the Syslog and Dump Collector in the vSphere Web Client.

When you migrate from vCenter Server 5.5 for Windows to the vCenter Server Appliance 6.0 and you have the Syslog and/or Dump Collector installed and configured as integrated with vCenter Server, these items still exist in the Web Client after the migration and show an error, even after you configure the services.

syslog_error dump_error

According to the release notes there is no workaround. The only way to avoid this issue is to uninstall the Syslog and/or Dump Collector before migrating to the vCenter Server Appliance.

As mentioned earlier in this post, this only happens when the Syslog and Dump Collector are configured as integrated with vCenter Server. If you have deployed them as standalone instances this issue does not occur.
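The rule above can be written down as a small pre-migration check. This is only a sketch: the `DeployMode` enum, the `components_to_uninstall` function and the component names are made up for illustration and are not part of any VMware tool. You would still have to fill in the deployment mode of each component by hand.

```python
from enum import Enum
from typing import Dict, List


class DeployMode(Enum):
    """Hypothetical labels for how a component is deployed on the Windows vCenter."""
    NOT_INSTALLED = "not installed"
    INTEGRATED = "integrated with vCenter Server"
    STANDALONE = "standalone"


def components_to_uninstall(components: Dict[str, DeployMode]) -> List[str]:
    """Return the components that should be uninstalled before migrating.

    Only components configured as integrated with vCenter Server are affected
    by the known issue; standalone or missing components need no action.
    """
    return sorted(
        name for name, mode in components.items()
        if mode is DeployMode.INTEGRATED
    )


if __name__ == "__main__":
    inventory = {
        "Syslog Collector": DeployMode.INTEGRATED,
        "Dump Collector": DeployMode.STANDALONE,
    }
    print(components_to_uninstall(inventory))
```

With the example inventory, only the Syslog Collector is flagged, because the Dump Collector runs as a standalone instance.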

 

VCSA Upgrade and VM Monitoring

With the release of vCenter Server 6.0 Update 2a and ESXi 6.0 Patch 4 I decided it was time to update the lab environment. Both releases contain a lot of fixes, some specifically for VSAN.

Release notes

vCenter Server 6.0 Update 2a

ESXi 6.0 Patch 4

The update went perfectly on the Platform Services Controllers, but then disaster struck. During the update of the vCenter Server the VM was reset and would no longer boot, showing the error message:

“Error 15: Could not find file”

When I used the Embedded Host Client to check the status of the VM, I found out that the reset had been started by the vpxuser. At first I suspected somebody else had given the VM a restart, but this was not the case.

Because I did not make a VM snapshot before the update (it’s a lab environment so he..) I could not recover the VM to a point before the update.

Luckily we had vSphere Replication configured for the vCenter Server to a different lab environment with point-in-time (PIT) instances, so I could recover the VM to an earlier state. After recovering the vCenter Server and logging on to the Web Client, the cause of the reset became clear.

Virtual Machine Monitoring was enabled for this cluster. Apparently no VMware Tools heartbeats had been received for 120 seconds (low sensitivity) and no storage or network traffic had occurred for a period of 120 seconds (default). This triggered a reset of the VM, which interrupted the update and broke it.
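To make the trigger condition concrete, here is a minimal sketch of the decision as I understand it: a reset only happens when both the heartbeat timer and the I/O timer have expired. The class and function names are made up for illustration, and this is a simplified model, not how vSphere HA is actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmMonitoringSettings:
    """Hypothetical container for the two timers mentioned in this post."""
    failure_interval_s: int = 120   # low sensitivity: seconds without a VMware Tools heartbeat
    io_stats_interval_s: int = 120  # default window checked for storage or network I/O


def should_reset(seconds_since_heartbeat: float,
                 seconds_since_io: float,
                 settings: VmMonitoringSettings = VmMonitoringSettings()) -> bool:
    """Return True when both reset conditions are met (simplified model)."""
    heartbeat_lost = seconds_since_heartbeat >= settings.failure_interval_s
    io_idle = seconds_since_io >= settings.io_stats_interval_s
    # A missing heartbeat alone is not enough; the VM must also look idle.
    return heartbeat_lost and io_idle


if __name__ == "__main__":
    print(should_reset(130, 130))  # both timers expired
    print(should_reset(130, 10))   # heartbeats missing, but recent I/O
```

In this model a VM that stops sending heartbeats while still doing disk or network I/O is left alone, which is why the reset during the update came as a surprise.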

This vCenter Server has been upgraded several times and I never had any issue with VM Monitoring. I have no idea why it happened this time, but to be safe I recommend disabling VM Monitoring for the vCenter Server during an update. And of course, always have a backup of the vCenter Server in case of a failure.

 

 

vROPS 6.4 – New Dashboards

VMware has released vROPS 6.4 which contains several new dashboards to display status and identify problems. The new dashboards can be divided into several categories:

  • Environment and capacity overview dashboards to get a summary of your environments.
  • VM troubleshooting dashboard that helps you diagnose problems in a VM and start solving them.
  • Infrastructure capacity and performance dashboards to view status and see problems across your datacenter.
  • VM and infrastructure configuration dashboards to highlight inconsistencies and violations of VMware best practices in your environment.

In this post I will highlight some of these dashboards. If you want to know more about all the other dashboards I suggest you download and install or upgrade to vROPS 6.4 🙂

Operations Overview

This dashboard provides a general overview of your vSphere environment, such as the number of VMs, clusters, hosts, and datastores. The dashboard also provides top lists of virtual machines with CPU contention, memory contention or disk latency.

vROPS Dashboard - Operations Overview

Capacity Overview

This dashboard provides an overview of the capacity of your vSphere environment, such as total CPU cores, memory and storage capacity. The dashboard also provides utilization graphs for the different resources, containing realtime and trend/forecast data.

vROPS Dashboard - Capacity Overview

Troubleshoot a VM

This dashboard provides general troubleshooting information for a virtual machine such as critical alerts and possible contention.

This is maybe the dashboard most requested by customers, and I am excited that it is now available by default in vROPS!

vROPS Dashboard - Troubleshoot a VM

Heavy Hitter VMs

As the name suggests, this dashboard provides information about the heaviest virtual machines in your vSphere environment, such as the VMs with the highest IOPS and network throughput.

vROPS Dashboard - Heavy Hitter VMs

Cluster Performance

This dashboard provides general performance information for clusters such as critical alerts and possible contention.

vROPS Dashboard - Cluster Performance

ESXi Configuration

This dashboard provides general information about the hardware of the vSphere hosts in your environment such as hardware model, ESXi version and power management setting.

It is not shown in the picture below, but the dashboard also provides an overview of the configuration of all the vSphere hosts. This overview contains information such as CPU sockets, NICs, Power State, CPU Model, etc.

vROPS Dashboard - ESXi Configuration

VM Usage

This dashboard provides general information about virtual machines in your environment such as general virtual machine configuration and graphs about CPU, memory and IOPS demand.

vROPS Dashboard - VM Usage

The addition of these dashboards is very welcome and I think they make vROPS even better to use. Here are some links to vROPS resources.

Release notes

http://pubs.vmware.com/Release_Notes/en/vrops/64/vrops-64-release-notes.html

vROPS Sizing Guidelines

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2147780

Download

https://my.vmware.com/en/web/vmware/info/slug/infrastructure_operations_management/vmware_vrealize_operations/6_4

 

vROPS 6.3 – vSphere Hardening Guide 6.0

As described in my previous post I have upgraded my lab vROPS cluster to vROPS 6.3. After a couple of days I finally had time to look at the updated vROPS policies. One of the things I was most interested in was support for the vSphere 6.0 hardening guide.

With vROPS 6.3 it is possible to generate alerts when a host or vCenter violates rules found in the vSphere 6.0 hardening guide. In previous releases only the vSphere 5.5 hardening guide could be used.

To enable alerts for the vSphere hardening guide you need to perform the following actions:

  • Enable vSphere hardening guide alerts in the VMware vSphere solution
  • Customize a policy to enable the vSphere hardening guide alerts

To enable vSphere hardening guide alerts in the VMware vSphere solution define the monitoring goals: Administration -> Solutions -> VMware vSphere -> Configure -> Define Monitoring Goals.

vrops_enable_hardening_alerts

After this you need to customize your policy to enable the alerts. At this step I had a problem with the vSphere hardening guide alerts. Because I performed an upgrade and did not want to lose any customization on default objects, I chose not to reset the out-of-the-box content during the upgrade.

This resulted in the policies not being updated with the new vSphere hardening guide alerts.

vrops_hardening_guide_55

After some digging I found a VMware KB article explaining that the policies were not updated because of my choice to not reset out of the box content. The only solution is to reset default content in the VMware vSphere Solution. You can do this via Administration -> Solutions -> VMware vSphere -> Reset Default Content.

Keep in mind that this removes all your customizations on default objects such as alert definitions, symptoms, policy definitions and dashboards.

vrops_reset_default_content

A common best practice is to not customize the out-of-the-box content but to clone it or create new objects such as dashboards and policies.

After resetting the default content I could enable the vSphere 6.0 hardening guide alerts in the policy I had created, and alerts were created for the hosts.

vrops_hardening_guide_60

vrops_alert_hardening_guide

 

 

 

vROPS 6.x Blank Dashboard

A few weeks ago I upgraded the vROPS cluster in the lab environment to vROPS 6.3. The cluster is a 3-node cluster with a master, replica and remote collector. The cluster is behind an NSX load balancer to provide a single FQDN to connect to the cluster.

The upgrade went smoothly and all nodes were upgraded without a problem. One important thing to remember is to always update the virtual appliance OS first before upgrading vROPS! If you do not do this, you will break your vROPS instance!

But the problem started when I logged on to the vROPS instance. Some dashboards were working fine but other dashboards would not show any content. The faulty dashboards varied from completely blank to showing only some widgets.

vROPS Blank Dashboard

At first I thought the problems appeared only on custom dashboards I had made, but after looking at some more dashboards it turned out that the out-of-the-box dashboards were affected as well.

At this stage I was thinking I had broken the vROPS cluster and needed to redeploy it. Before going that far, I asked a colleague whether he saw the same problems. He could view the dashboards without any problem.

Because of this I tried to open the dashboards from another browser, in this case Internet Explorer instead of Chrome. The dashboards were working fine with Internet Explorer, so my first suspicion went to Chrome. But my colleague had opened the dashboards in Chrome and they worked fine for him.

Eventually my colleague suggested that I clear the Chrome browser cache so I would not have any old references to the dashboards. And lo and behold, the dashboards were working fine after this!

vROPS Dashboard

TL;DR: if you have any issues with content on vROPS dashboards, clear your browser cache first 🙂