As a seasoned post-sales consultant, I’ve done my fair share of health checks. For vSphere (more on other topics later), this typically involves using the VMware HealthAnalyzer tool to gather information from vCenter Server and ESXi hosts, and then comparing those results against best practice recommendations. I’d like to share some very common items that I run into, and why remediating them should matter to you.
1. Use persistent and remote syslog logging to improve manageability.
Out of the box, ESXi does not log to persistent storage. Therefore, if something happens to your ESXi host and it goes down, you’ve lost one of your most powerful allies in troubleshooting – the log files. Either save your logs on persistent storage attached to the ESXi host or save them to a remote syslog server. Luckily, vCenter Server comes with its own syslog server, so there’s no need to set up another box to perform this function.
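To make the check concrete, here is a rough sketch of how a health-check script might flag hosts that have no remote syslog target. It assumes you have already exported each host's advanced settings into a plain dictionary (the `settings_by_host` structure and the function name are hypothetical, not part of HealthAnalyzer); `Syslog.global.logHost` is the ESXi advanced setting that holds the remote target.

```python
def hosts_missing_remote_syslog(settings_by_host):
    """Return the names of hosts with no remote syslog target configured.

    settings_by_host is a hypothetical export: {host name: {setting: value}}.
    """
    missing = []
    for host, settings in sorted(settings_by_host.items()):
        # An absent or blank Syslog.global.logHost means no remote logging.
        log_host = str(settings.get("Syslog.global.logHost", "")).strip()
        if not log_host:
            missing.append(host)
    return missing


if __name__ == "__main__":
    sample = {
        "esx01": {"Syslog.global.logHost": "udp://syslog01.example.com:514"},
        "esx02": {"Syslog.global.logHost": ""},
    }
    print(hosts_missing_remote_syslog(sample))
```

This only covers the remote half of the recommendation; whether the local log directory sits on persistent storage still has to be verified per host.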
2. Enable the ESXi Shell timeout feature (formerly the Tech Support Mode timeout) and configure it per customer security and manageability requirements.
This is something that I hardly ever see enabled. This feature lets the administrator set a timeout value for idle ESXi Shell sessions. Without it, an intruder who opens a KVM session or gets physical control of your system can simply hit Alt + F1 to gain shell access (since you left an idle ESXi Shell session, they don’t have to log in!).
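As an illustration, a script could audit the timeout settings along these lines. The sketch assumes the same kind of hypothetical per-host settings export, and it treats the values of `UserVars.ESXiShellTimeOut` and `UserVars.ESXiShellInteractiveTimeOut` as seconds, which is worth confirming for your ESXi version. The 900-second ceiling is just a placeholder for whatever your security policy calls for.

```python
# Advanced settings that govern ESXi Shell availability and idle sessions.
TIMEOUT_KEYS = (
    "UserVars.ESXiShellTimeOut",
    "UserVars.ESXiShellInteractiveTimeOut",
)


def shell_timeout_findings(settings_by_host, max_seconds=900):
    """Return {host: [problem, ...]} for hosts with weak shell timeouts.

    max_seconds is a placeholder policy limit, not a VMware recommendation.
    """
    findings = {}
    for host, settings in settings_by_host.items():
        problems = []
        for key in TIMEOUT_KEYS:
            value = int(settings.get(key, 0))
            if value == 0:
                # Zero means the timeout is switched off entirely.
                problems.append(f"{key} is disabled")
            elif value > max_seconds:
                problems.append(
                    f"{key} is {value}s, above the {max_seconds}s limit"
                )
        if problems:
            findings[host] = problems
    return findings
```

A host that never appears in the result has both timeouts set at or below the limit.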
3. Use vCenter Server roles, groups, and permissions to provide appropriate access and authorization to the VMware virtual infrastructure. Avoid using Windows built-in groups (Administrators).
Relying on the built-in Windows groups used to be much more prevalent than what I see today, but it’s still an issue, especially with Single Sign-On adding another wrinkle to the picture. The suggested practice is to remove the local OS administrators group or user on the vCenter Server machine, and instead use LDAP or Active Directory users and groups to dole out permissions within vCenter Server.
4. Use Load-Based Teaming (LBT) to balance virtual machine network traffic across multiple uplinks.
As Jason Nash (@thejasonnash) notes in his vSphere Advanced Networking training from Pluralsight, Load-Based Teaming (LBT) is “a traffic-load-aware teaming policy, allowing it to feature superior load balancing capabilities that help physical NICs in a NIC team avoid getting under or over-utilized and really improve bandwidth efficiency.” Also noted is the fact that LBT is the only teaming policy supported by vSphere that does load balancing across dvUplinks based on load. Jason is a fan and so am I.
5. Allocate space on shared datastores for templates and media/ISOs separately from datastores for virtual machines.
I like this one just because it points to an easy win in the land of storage management for vSphere – put your ISOs and templates on storage that will be quick for sequential reads, but not necessarily so for random or write-heavy workloads. Save your expensive storage for your VMs and put your media on 7.2K NL SAS.
6. Use NTP, Windows Time Service, or another timekeeping utility suitable for the operating system.
It’s absolutely crucial to keep time in sync across devices. This means both your ESXi hosts and your virtual machine guest operating systems. Use NTP on your ESXi hosts to keep them in sync with a trusted time source, and then use whatever means is appropriate for your virtual machines’ guest operating systems to keep them in sync.
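If you want to spot-check drift yourself, something like the following sketch would do. It assumes you have already collected a timestamp from each host or guest alongside a reading from your trusted source; how you collect them is up to you, and the function name and one-second tolerance are illustrative choices rather than a VMware guideline.

```python
from datetime import datetime, timedelta, timezone


def clock_drift_report(reference, observed, tolerance=timedelta(seconds=1)):
    """Return {name: offset in seconds} for clocks outside the tolerance.

    reference is the trusted time; observed maps a host or guest name to
    the time it reported at (roughly) the same moment.
    """
    report = {}
    for name, stamp in observed.items():
        offset = stamp - reference
        # Positive offsets run ahead of the reference, negative ones behind.
        if abs(offset) > tolerance:
            report[name] = offset.total_seconds()
    return report


if __name__ == "__main__":
    ref = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    readings = {
        "esx01": ref + timedelta(milliseconds=200),
        "vm-web01": ref - timedelta(seconds=45),
    }
    print(clock_drift_report(ref, readings))
```

Because the readings are never taken at exactly the same instant, keep the tolerance comfortably above your collection latency.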
7. Verify that virtual machines meet the requirements for vMotion.
Most often I see this one tripped by attached FLP or ISO images, though it can also be lack of shared storage or networks, port group name mismatches across ESXi hosts, and so on. Make sure your cluster is consistent across the board and you won’t have many, if any, problems with vMotion.
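The usual suspects above can be expressed as a simple pre-flight check. This is only a sketch over hand-built inventory records: the `VM` and `Host` classes and the `vmotion_blockers` function are hypothetical stand-ins for whatever your inventory export looks like, and real vMotion compatibility involves more checks than these three.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    name: str
    port_groups: set
    datastores: set
    connected_media: list = field(default_factory=list)  # attached ISO/FLP images


@dataclass
class Host:
    name: str
    port_groups: set
    datastores: set


def vmotion_blockers(vm, target):
    """List the reasons this VM could not be moved to the target host."""
    blockers = [f"connected image: {media}" for media in vm.connected_media]
    # Names are compared exactly, so a case mismatch counts as missing.
    for pg in sorted(vm.port_groups - target.port_groups):
        blockers.append(f"port group missing on {target.name}: {pg}")
    for ds in sorted(vm.datastores - target.datastores):
        blockers.append(f"datastore not visible on {target.name}: {ds}")
    return blockers
```

Running it for every VM against every host in the cluster gives a quick picture of how consistent the cluster really is.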
8. Verify that VMware Tools is installed, running, and up to date for running virtual machines.
This may seem like a no-brainer, but in almost every environment I survey, there are at least some virtual machines without VMware Tools installed or running. The same goes for keeping VMware Tools up to date for your virtual machines. It’s important to monitor this on a regular basis, and I recommend the vCheck script from Alan Renouf (@alanrenouf) to check this periodically, say once a week.
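For a sense of what such a check boils down to, here is a minimal sketch that groups VMs by their Tools status. It is not how vCheck does it; it assumes you already have a mapping of powered-on VM names to the status strings the vSphere API reports (`toolsOk`, `toolsOld`, `toolsNotRunning`, `toolsNotInstalled`), and the function name is made up for the example.

```python
from collections import defaultdict

# Status values that call for follow-up on a running virtual machine.
NEEDS_ATTENTION = ("toolsNotInstalled", "toolsNotRunning", "toolsOld")


def tools_findings(status_by_vm):
    """Group VM names by problematic VMware Tools status.

    status_by_vm is assumed to contain powered-on VMs only, since
    Tools cannot be running on a powered-off machine.
    """
    findings = defaultdict(list)
    for vm, status in sorted(status_by_vm.items()):
        if status in NEEDS_ATTENTION:
            findings[status].append(vm)
    return dict(findings)
```

Scheduling something like this weekly and mailing out the non-empty groups is usually enough to keep the stragglers visible.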
9. Use the latest version of VMXNET that is supported by the guest OS.
In almost every case, it is appropriate to use the latest VMXNET Ethernet controller for your virtual machines. Refer to the VMware Knowledge Base article on choosing a virtual Ethernet adapter for further guidance.