vSphere 5.0 High Availability (HA)

Overview


·         vSphere 5.0 uses a new agent called FDM (Fault Domain Manager)


·         No more primary/secondary node concept as in its predecessors


·         New master/slave concept with an automated election process


·         vpxa (vCenter agent) dependency removed



·         HA talks directly to hostd instead of using vpxa as a translator


·         The FDM agent communicates with vCenter to retrieve information about the status of virtual machines, and vCenter displays the protection status of virtual machines


·         HA is no longer dependent on DNS


·         The character limit that HA imposed on the hostname has been lifted (previously 26 characters)


·         If you add ESX/ESXi 4.1 or earlier hosts to a vSphere 5.0 vCenter Server, the new vSphere HA agent (FDM) will be installed on them


Master/Slave concept


·         One of the nodes in your cluster becomes the Master and the rest become Slaves


·         Master responsibilities


o   Monitors the availability of hosts and VMs in the cluster


o   Manages the restart of virtual machines after VM or host failures


o   Maintains a list of the VMs running on each ESXi host


o   Exchanges state with vCenter


o   Monitors the state of the slaves


·         Slave responsibilities


o   Monitors running VMs, sends status updates to the Master, and performs restarts when requested by the Master


o   Monitors Master Node Health


o   If the master should fail, participates in master election


Master-election algorithm


·         Takes 15 to 25 seconds, depending on the reason for the election


·         Elects participating host with the greatest number of mounted datastores.


·         The Managed Object ID (MOID) is used as a tie-breaker.
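The ordering above can be sketched as a simple comparison. A minimal Python illustration; the hostnames are invented, and the assumption that the lexically greater MOID wins the tie is mine (the source only says the MOID decides ties):

```python
# Illustrative sketch of the vSphere HA master-election ordering:
# the host with the most mounted datastores wins, and the Managed
# Object ID (MOID) breaks ties (tie-break direction assumed here).
def elect_master(candidates):
    """candidates: iterable of (hostname, mounted_datastore_count, moid)."""
    return max(candidates, key=lambda c: (c[1], c[2]))

hosts = [("esx01", 4, "host-22"),
         ("esx02", 6, "host-10"),
         ("esx03", 6, "host-31")]
winner = elect_master(hosts)  # esx02 and esx03 tie on datastores; MOID decides
```

Here esx03 wins: it ties with esx02 on datastore count, so the MOID comparison settles the election.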



An election is held when:


·         vSphere HA is enabled initially


·         Master’s host fails or enters maintenance mode


·         A Management Network partition occurs


Heartbeating


·         Two different Heartbeat mechanisms


o   Network heartbeat mechanism


o   Datastore heartbeat mechanism (new – used when the network is unavailable)




·         Network heartbeat mechanism


o   Heartbeats are sent between the slaves and the master every second


o   Elections use UDP; master-slave communication uses TCP


o   When a slave stops receiving heartbeats from the master, it tries to determine whether it is isolated, or whether the master is isolated or has failed


o   Prior to vSphere 5.0, virtual machine restarts were always initiated, even if only the host’s management network was isolated and the virtual machines were still running
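The one-second heartbeat and the "no heartbeats received" condition can be modeled with a small watchdog. A hypothetical sketch only: the 10-second silence threshold mirrors the T10s election-state step in the isolation timeline, but this class is not the FDM implementation:

```python
import time

# Hypothetical slave-side watchdog: the master heartbeats every second,
# and a slave that hears nothing for `timeout` seconds must work out
# whether it is isolated or the master has failed.
class HeartbeatMonitor:
    def __init__(self, timeout=10.0):
        self.timeout = timeout
        self.last_seen = time.monotonic()

    def record_heartbeat(self):
        # Called whenever a network heartbeat arrives from the master.
        self.last_seen = time.monotonic()

    def master_silent(self):
        # True once no heartbeat has been seen for `timeout` seconds.
        return (time.monotonic() - self.last_seen) > self.timeout
```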




·         Datastore heartbeating


o   Adds a new level of resiliency and allows HA to distinguish between a failed host and an isolated or partitioned host


o   Prevents Unnecessary Restarts


o   Two different files are used: a PowerOn file and a host heartbeat (hb) file


o   The “PowerOn” file is used to determine isolation


o   The datastore heartbeat mechanism is only used when the master has lost network connectivity with the slaves


o   Two datastores are automatically selected by vCenter for this mechanism


o   For VMFS datastores, the Master reads the VMFS heartbeat region (using the VMFS locking mechanism)


o   For NFS datastores, the Master monitors a heartbeat file that is periodically touched by the Slaves


o   A heartbeat file is created by each host in the datastore (host-&lt;number&gt;-hb)


o   Virtual machine availability is reported via a file created by each Slave that lists its powered-on VMs (host-&lt;number&gt;-poweron)
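For NFS-style datastores, where the heartbeat file is periodically touched, liveness can be judged from the file’s modification time. A sketch under assumed paths and thresholds; the real FDM intervals are not given here:

```python
import os
import time

# Sketch only: judge a slave's datastore heartbeat from the mtime of its
# host-<moid>-hb file. The staleness threshold is an assumption.
def slave_is_heartbeating(datastore_dir, host_moid, stale_after=15.0):
    """True if the slave's hb file was touched within `stale_after` seconds."""
    hb_file = os.path.join(datastore_dir, f"host-{host_moid}-hb")
    try:
        mtime = os.path.getmtime(hb_file)
    except OSError:          # file missing: no heartbeat at all
        return False
    return (time.time() - mtime) < stale_after
```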

Locking mechanism



·         HA leverages the existing VMFS file system locking mechanism


·         The locking mechanism uses a so-called “heartbeat region”, which is updated as long as a lock on a file exists


·         A host needs at least one open file on the volume to keep updating the heartbeat region


·         A per-host file is created on the designated heartbeating datastores to ensure the heartbeat region keeps being updated


·         HA simply checks whether the heartbeat region has been updated


Isolated vs Partitioned


·         A host is considered either Isolated or Partitioned when it loses network access to the master but has not failed


·         The isolation address is the IP address an ESXi host pings to check whether it is isolated when no heartbeats are received


·         By default, vSphere HA uses the default gateway as the isolation address


·         Isolated


o   Is not receiving heartbeats from the master


o   Is not receiving any election traffic


o   Cannot ping the isolation address


·         Partitioned


o   Is not receiving heartbeats from the master


o   Is receiving election traffic


o   (At some point a new master will be elected, at which point the state will be reported to vCenter)


·         When multiple hosts are isolated but can still communicate with each other over the management network, it is called a network partition.


·         When a network partition exists, a master election process will be initiated.


·         By default the isolation response is triggered after ~30 seconds with vSphere 5.x


Failed Master Host


·         Master Election Initiated


·         New Master Elected


·         The new Master restarts all VMs on the protected list that are in a “not running” state

Failed Slave Host



·         Master checks the network heartbeat


·         Master checks the datastore heartbeat


·         Master restarts the affected VMs


Isolation Responses


·         Power Off


·         Leave Powered On


·         Shut Down


Isolation Detection


·         Slaves will hold a single-server election and check the ping (isolation) address


·         Master will check the ping address


·         Master restarts the affected VMs


Isolation of a slave


·         T0 – Isolation of the host (slave)


·         T10s – Slave enters “election state”


·         T25s – Slave elects itself as master


·         T25s – Slave pings “isolation addresses”


·         T30s – Slave declares itself isolated and “triggers” isolation response


Isolation of a master


·         T0 – Isolation of the host (master)


·         T0 – Master pings “isolation addresses”


·         T5s – Master declares itself isolated and “triggers” isolation response
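The two timelines above can be written down as data. A small sketch restating the steps, with times in seconds after heartbeats are lost:

```python
# The isolation-detection timelines from the two lists above,
# encoded as (seconds, event) pairs.
SLAVE_TIMELINE = [
    (0,  "heartbeats lost"),
    (10, "slave enters election state"),
    (25, "slave elects itself master"),
    (25, "slave pings isolation addresses"),
    (30, "slave declares itself isolated and triggers isolation response"),
]

MASTER_TIMELINE = [
    (0, "heartbeats lost"),
    (0, "master pings isolation addresses"),
    (5, "master declares itself isolated and triggers isolation response"),
]

def seconds_to_declare_isolation(timeline):
    # The last event in each timeline is the isolation declaration.
    return timeline[-1][0]
```

A master declares isolation far sooner (5 seconds) than a slave (30 seconds) because it skips the election phase.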



Master declares a host dead when:


·         Master can’t communicate with it over the network


·         Host is not connected to master


·         Host does not respond to ICMP pings


·         Master observes no storage heartbeats


Results in:


·         Master attempts to restart all VMs from the host


·         Restarts happen on network-reachable hosts, including the master’s own host



Master declares a host partitioned when:


·         Master can’t communicate with it over the network


·         Master can see its storage heartbeats


Results in:


·         One master exists in each partition


·         vCenter reports one master’s view of the cluster


·         Only one master “owns” any one VM


·         A VM running in the “other” partition will be:


o   monitored via the heartbeat datastores


o   restarted if it fails (in the master’s partition)


·         When the partition is resolved, all but one of the masters abdicate



A host is isolated when:


·         It sees no vSphere HA network traffic


·         It cannot ping the isolation addresses


Results in:


·         The host invokes the (improved) isolation response


·         The host first checks whether a master “owns” the VM


·         The response is applied if the VM is owned or its datastore is inaccessible


Master


·         Restarts those VMs that were powered off or that fail later


·         Reports the host as isolated if both master and host can access its heartbeat datastores, otherwise as dead
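The dead / partitioned / isolated distinctions above reduce to three signals the master observes: network heartbeats, ICMP pings, and datastore heartbeats. An illustrative decision function, not the actual FDM logic; the state names are invented for this sketch:

```python
# Illustrative only: classify a slave from the master's viewpoint using
# the three signals described above. State names are invented.
def classify_host(network_hb, ping_ok, datastore_hb):
    """network_hb:   network heartbeats received from the host
       ping_ok:      host responds to ICMP pings
       datastore_hb: master observes the host's datastore heartbeats"""
    if network_hb:
        return "LIVE"
    if datastore_hb:
        # No network contact, but storage heartbeats visible: the host
        # is partitioned or isolated rather than failed.
        return "PARTITIONED_OR_ISOLATED"
    if not ping_ok:
        return "DEAD"   # no network, no ping response, no storage heartbeats
    return "UNKNOWN"    # pingable but not heartbeating; needs further checks
```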


Determine if a slave is alive


·         The master relies on heartbeats issued to the slave’s HB datastores


·         Each FDM opens a file on each of its HB datastores for heartbeating purposes


·         The files contain no information; on VMFS datastores, each file has the minimum allowed file size


·         Files are named X-hb, where X is the (SDK API) moID of the host


·         Master periodically reads heartbeats of all partitioned / isolated slaves


Determine the set of VMs running on a slave


·         An FDM writes a list of its powered-on VMs into a file on each of its HB datastores


·         Master periodically reads the files of all partitioned/isolated slaves


·         Each poweron file contains at most 140 KB of info; on VMFS datastores, actual disk usage is determined by the file sizes supported by the VMFS version


·         The files are named X-poweron, where X is the (SDK API) moID of the host
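Reading a slave’s poweron file could look like the following sketch. The one-VM-per-line format is an assumption made for illustration; the actual FDM file layout is not documented here:

```python
import os

# Sketch only: recover a slave's powered-on VM list from its X-poweron
# file, where X is the host's moID. The file format is assumed.
def read_poweron_list(datastore_dir, host_moid):
    path = os.path.join(datastore_dir, f"{host_moid}-poweron")
    if not os.path.exists(path):
        return []   # no poweron file on this datastore for that host
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]
```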


Protected-vm files are used


·         When recovering from a master failure


·         To determine whether a master is responsible for a given VM


FDMs create a directory (.vSphere-HA) in the root of each relevant datastore


Within it, they create a subdirectory for each cluster using the datastore.
