Protect VMware Single Sign-On Servers with vCenter HeartBeat
I recently had a conversation with a friend who is a VMware architect for one of the largest snack companies in the world, a company with a very large VMware infrastructure footprint. He called me up with a dilemma and wanted my opinion on how to address it. He is planning and designing an upgrade to VMware vSphere 5.1, specifically to take advantage of its single sign-on high availability (SSO HA) capabilities. He has several vCenter servers in the environment, so he is using Linked Mode, and he wants to run SSO in multisite mode. My friend is well aware of the limitations and requirements of this implementation, but I will list the caveats here in case you are not familiar with them.
Remember, SSO HA is a new feature introduced with vSphere 5.1, and it still has some maturing to do. You should be aware of the following when designing a solution around it:
- To run SSO in HA mode, you need to front-end the SSO servers with a load balancer.
- The admin role is always held by the first SSO server and does not fail over. If that server goes offline, services already registered to SSO continue to function, but any server or service you restart will not come back online until that first SSO server is restored.
- While that first server is down, you cannot add any new services, and your ability to manage SSO is hindered.
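To make the second and third caveats concrete, here is a toy model in Python. It is only a sketch of the behavior described in the list above, not VMware code: the `SsoCluster` class, its methods, and the service names are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SsoCluster:
    """Toy model of the SSO HA caveats; every name here is hypothetical."""

    primary_online: bool = True
    running: set = field(default_factory=set)   # services currently up
    waiting: set = field(default_factory=set)   # restarted while primary was down

    def register(self, name: str) -> bool:
        # New registrations need the admin role, which only the first node holds.
        if not self.primary_online:
            return False
        self.running.add(name)
        return True

    def restart(self, name: str) -> bool:
        # A service that is already running keeps running, but a restart
        # cannot complete until the first SSO server is back.
        if name not in self.running:
            raise ValueError(f"{name} is not running")
        if self.primary_online:
            return True
        self.running.remove(name)
        self.waiting.add(name)
        return False

    def fail_primary(self) -> None:
        self.primary_online = False

    def restore_primary(self) -> None:
        # Once the first node returns, the stalled services can come back.
        self.primary_online = True
        self.running |= self.waiting
        self.waiting.clear()


if __name__ == "__main__":
    cluster = SsoCluster()
    cluster.register("vcenter-a")
    cluster.register("vcenter-b")
    cluster.fail_primary()
    print("still running:", sorted(cluster.running))
    print("restart succeeded:", cluster.restart("vcenter-a"))
    print("new registration succeeded:", cluster.register("web-client"))
```

The point of the model is that an outage of the first node is quiet at first: nothing breaks until something restarts or something new needs to be registered.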
That said, there is a way to promote your secondary SSO node to primary in the event that the primary is not recoverable. There is a file on the primary that you can copy to the secondary (before the primary fails, of course), and it contains all the information needed to promote the secondary should you need to do so. It's not a very elegant solution, and it is prone to error.
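Because the copy has to exist before the primary fails, it is worth scripting rather than leaving to memory. Below is a minimal Python sketch of a copy-and-verify step. It assumes the secondary's storage is reachable as an ordinary directory, and it deliberately takes the file location as a parameter: the function names are hypothetical, and you would need to supply the actual file path from VMware's documentation.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_promotion_file(source: Path, destination_dir: Path) -> Path:
    """Copy `source` into `destination_dir` and verify the copy.

    `source` stands in for the file on the primary SSO node; its real
    location is not given here and must be supplied by the caller.
    """
    destination_dir.mkdir(parents=True, exist_ok=True)
    target = destination_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves timestamps
    if sha256_of(source) != sha256_of(target):
        target.unlink()
        raise OSError(f"checksum mismatch after copying {source}")
    return target
```

Run on a schedule, a step like this at least removes the "I forgot to copy it" failure mode, though it does nothing about the manual promotion procedure itself.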
It is crucial to fully understand these implications in order to properly design your environment.
As I said, my friend was aware of these limitations and was asking for alternatives. I then suggested that he use vSphere High Availability instead, as that will give him more than adequate protection against host failures and will restart the VMs in a very acceptable timeframe. But he said he wanted to protect against more than just a host failure. Given the size of his infrastructure, his concern is understandable.
I then suggested he use vCenter HeartBeat. I have always been a fan of this technology, and it so happens that it supports and protects SSO. It was something he hadn't thought of, and we ended up designing his environment to use vCenter HeartBeat to protect all vCenter components and plugins against operating system and application failures (among other things), while still letting him use Linked Mode vCenters and multisite SSO. Heck, he could have taken the design further if he wanted to and accounted for disaster recovery as well, designing the environment for failover and failback with HeartBeat protecting the critical components at both his production and DR sites.
I wanted to share this story to highlight that, while VMware continues to innovate and deliver excellent features in each release, sometimes it is a good idea to be cautious. Take a step back, recognize that some of the features you're planning to implement are first-generation, and consider alternative solutions that are just as effective.
I'm very curious to know how many of you are considering SSO in HA mode, whether you have taken these caveats into consideration, and whether you find HeartBeat a viable alternative. Or let me know if vSphere HA is enough in your situation. Please share in the comments section.
Posted by Elias Khnaser on 07/10/2013 at 1:24 PM