Tuesday, March 25, 2014

Do not forget NTP

I have seen a lot of issues related to NTP and time skew in VMware environments recently.  Any new appliance that has SSO as a dependency needs time skew kept to a minimum.  If you're deploying vCloud Automation Center (vCAC) 6.x, then time needs to be in line with some consistent source, most likely the domain controller.  This also applies to the VMware Big Data Extensions (BDE) appliance, and to the vCloud Networking and Security (vCNS) Manager (aka vShield Manager) appliance if you deployed it in the past with vSphere 5.1.  The errors may not be particularly helpful unless you dig into the logs and see something similar to the following: "Server returned 'request expired' less than 0 seconds after request was issued".

In a Windows world, the servers can be pointed at the domain controller.  On a Linux distro you typically point /etc/ntp.conf at one of the pool.ntp.org servers.  Either way, always make sure UDP port 123 is open.  In a virtualized context, you could be lazy and synchronize with host time from VMware Tools.  Most of the appliances now have this exposed under the "Admin" tab.  But let's say you go through the ESXi host configuration and point the NTP client to the domain controller.  That should fix it, right?
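If you want a quick sanity check of how far a machine has drifted before touching any configuration, a minimal SNTP-style query is enough.  The sketch below is my own illustration and not part of any VMware tooling: the function names are made up, the server name is just a placeholder, and the offset is a rough estimate that compares the server's transmit time with the midpoint of the round trip rather than doing the full NTP calculation.

```python
import socket
import struct
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800


def build_request():
    """Build a 48-byte client request: LI=0, version 3, mode 3 (client)."""
    return b"\x1b" + 47 * b"\0"


def parse_transmit_time(packet):
    """Return the server's transmit timestamp as Unix seconds."""
    if len(packet) < 48:
        raise ValueError("NTP response is shorter than 48 bytes")
    # Bytes 40-47 hold the transmit timestamp: 32-bit seconds, 32-bit fraction.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def clock_offset(server="pool.ntp.org", timeout=5.0):
    """Roughly estimate (server time - local time) in seconds.

    Needs outbound UDP port 123 to be open; raises socket.timeout otherwise.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sent = time.time()
        sock.sendto(build_request(), (server, 123))
        packet, _ = sock.recvfrom(512)
        received = time.time()
    # Assumption: the server stamped the reply about halfway through the trip.
    return parse_transmit_time(packet) - (sent + received) / 2
```

Calling clock_offset() with your domain controller's name instead of the default gives you a number to compare across hosts.  A timeout usually means UDP 123 is blocked somewhere along the way.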

If you're still running into the issue and noticing that your ESXi hosts are not syncing, then you need to read "Synchronizing ESXi/ESX time with a Microsoft Domain Controller", because there are several steps that require going into the ESXi shell to remedy.

In the interest of strong design, please don't take NTP for granted.  Your logs, which can be mind-numbing to troubleshoot to begin with, make no sense if timestamps disagree between systems.  Log management tools like Log Insight, which have basically become mandatory, only help if the events they collect can be lined up in time.  Do you still set all your servers to a specific timezone or have you standardized on UTC?
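On the UTC question, here is one way an application can log in UTC regardless of the server's timezone setting.  This is only a sketch using Python's standard logging module; the helper name and the timestamp layout are my own choices, not something Log Insight requires.

```python
import logging
import time


def make_utc_formatter():
    """Return a log formatter that always writes timestamps in UTC."""
    formatter = logging.Formatter(
        fmt="%(asctime)sZ %(levelname)s %(message)s",
        datefmt="%Y-%m-%dT%H:%M:%S",
    )
    # By default logging uses time.localtime; switch this formatter to UTC.
    formatter.converter = time.gmtime
    return formatter


def make_utc_logger(name="utc-demo"):
    """Return a logger (hypothetical name) that prints UTC-stamped lines."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(make_utc_formatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

With every source writing the same zone, lining up events from different hosts becomes a matter of sorting by timestamp, provided the clocks themselves are in sync.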

