vSphere 5.0 High Availability (HA)
Overview
· vSphere 5.0 uses a new HA agent called FDM (Fault Domain Manager)
· The primary/secondary node concept of its predecessors is gone
· New master/slave concept with an automated election process
· vpxa (vCenter agent) dependency removed
· HA talks directly to hostd instead of using a translator to talk to vpxa
· The FDM agent communicates with vCenter to retrieve information about the status of virtual machines, and vCenter is used to display the protection status of virtual machines
· HA is no longer dependent on DNS
· The character limit that HA imposed on the hostname has been lifted (previously 26 characters)
· If you add ESX/ESXi 4.1 or prior hosts to a vSphere 5.0 vCenter Server, the new vSphere HA agent (FDM) will be installed
Master/Slave concept
· One of the nodes in your cluster becomes the Master and the rest become Slaves
· Master responsibilities
o Monitors the availability of hosts and VMs in the cluster
o Manages VM restarts after VM/host failures
o Maintains the list of VMs available on each ESXi host
o Restarts failed virtual machines
o Exchanges state with vCenter
o Monitors the state of the slaves
· Slave responsibilities
o Monitors its running VMs, sends their status to the Master, and performs restarts on request from the Master
o Monitors the health of the Master
o Participates in the master election if the Master fails
Master-election algorithm
· Takes 15 to 25 seconds (depending on the reason for the election)
· Elects the participating host with the greatest number of mounted datastores
· The Managed Object ID (moID) is used if there is a tie
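The election rule above can be sketched in a few lines of Python. This is only an illustration, not the FDM implementation: the Host class, the moID values, and the datastore counts are hypothetical, and the tie-break direction (the lexically highest moID wins) is an assumption, since the notes only say that the moID is used on a tie.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    moid: str                # hypothetical example values such as "host-42"
    mounted_datastores: int  # number of datastores the host has mounted


def elect_master(hosts):
    """Pick the host with the most mounted datastores; break ties on moID."""
    if not hosts:
        raise ValueError("no participating hosts")
    # Assumption: on a tie, the lexically highest moID wins.
    return max(hosts, key=lambda h: (h.mounted_datastores, h.moid))


hosts = [Host("host-21", 4), Host("host-35", 6), Host("host-9", 6)]
winner = elect_master(hosts)
print(winner.moid)  # host-9 ("host-9" sorts after "host-35" as a string)
```

Note that a string comparison is not a numeric one, which is why "host-9" beats "host-35" in this sketch.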
An election is held when:
· vSphere HA is initially enabled
· The Master’s host fails or enters maintenance mode
· A management network partition occurs
Heartbeating
· Two different heartbeat mechanisms
o Network heartbeat mechanism
o Datastore heartbeat mechanism (new; used when the network is unavailable)
· Network heartbeat mechanism
o Heartbeats are sent between the slaves and the master every second
o Elections use UDP; master-slave communication uses TCP
o When a slave is not receiving any heartbeats from the master, it tries to determine whether it is isolated or whether the master is isolated or has failed
o Prior to vSphere 5.0, virtual machine restarts were always initiated, even if only the management network of the host was isolated and the virtual machines were still running
· Datastore heartbeating
o Adds a new level of resiliency and allows HA to distinguish between a failed host and an isolated/partitioned host
o Prevents unnecessary restarts
o Two different files are used: the poweron file and the host heartbeat (hb) file
o The poweron file is used to determine isolation
o The datastore heartbeat mechanism is used only when the master has lost network connectivity with the slaves
o Two datastores are automatically selected by vCenter for this mechanism
o For VMFS datastores, the Master reads the VMFS heartbeat region (which relies on the VMFS locking mechanism)
o For NFS datastores, the Master monitors a heartbeat file that is periodically touched by the Slaves
o Each host creates a heartbeat file on the datastore (host-<number>-hb)
o Virtual machine availability is reported by a file, created by each Slave, that lists its powered-on VMs (host-<number>-poweron)
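The NFS case above (a heartbeat file that the slave touches periodically) amounts to a freshness check on the file's modification time. A minimal sketch follows; the function name, the 30-second threshold, and the file names are assumptions made for illustration, and none of them are taken from FDM.

```python
import os
import tempfile
import time


def heartbeat_is_fresh(path, max_age_s=30.0, now=None):
    """Return True if the file at `path` was modified within `max_age_s` seconds."""
    now = time.time() if now is None else now
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        return False  # no heartbeat file at all counts as "not fresh"
    return (now - mtime) <= max_age_s


# Usage with a throwaway file and fixed timestamps so the result is repeatable.
with tempfile.TemporaryDirectory() as d:
    hb = os.path.join(d, "host-42-hb")   # hypothetical file name
    open(hb, "w").close()
    os.utime(hb, (1000.0, 1000.0))       # pretend the slave touched it at t=1000
    fresh = heartbeat_is_fresh(hb, now=1010.0)    # 10 s old
    stale = heartbeat_is_fresh(hb, now=1100.0)    # 100 s old
    missing = heartbeat_is_fresh(os.path.join(d, "host-7-hb"), now=1010.0)

print(fresh, stale, missing)  # True False False
```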
Locking mechanism
· HA leverages the existing VMFS file system locking mechanism
· The locking mechanism uses a so-called “heartbeat region”, which is updated as long as the lock on a file exists
· A host needs to have at least one open file on the volume for its heartbeat region to be updated
· A per-host file is created on the designated heartbeat datastores to ensure the heartbeat region is updated
· HA simply checks whether the heartbeat region has been updated
Isolated vs Partitioned
· A host is considered to be either Isolated or Partitioned when it loses network access to a master but has not failed
· The isolation address is the IP address that an ESXi host pings to check for isolation when no heartbeats are received
· By default, HA uses the default gateway as the isolation address
· Isolated
o Is not receiving heartbeats from the master
o Is not receiving any election traffic
o Cannot ping the isolation address
· Partitioned
o Is not receiving heartbeats from the master
o Is receiving election traffic
o (At some point a new master will be elected, at which point the state will be reported to vCenter)
· When multiple hosts are isolated but can still communicate with each other over the management network, it is called a network partition
· When a network partition exists, a master election is initiated
· By default, the isolation response is triggered after ~30 seconds with vSphere 5.x
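The Isolated and Partitioned criteria above reduce to three yes/no observations made by the host itself. The following sketch encodes them; the enum, the function name, and the UNDETERMINED fallback are illustrative assumptions, because the notes do not say what happens when a host hears nothing but can still ping the isolation address.

```python
from enum import Enum


class HostState(Enum):
    CONNECTED = "connected"
    PARTITIONED = "partitioned"
    ISOLATED = "isolated"
    UNDETERMINED = "undetermined"  # case not covered by the notes (assumption)


def classify_self(master_heartbeats, election_traffic, isolation_ping_ok):
    """Classify a host from its own point of view."""
    if master_heartbeats:
        return HostState.CONNECTED
    if election_traffic:
        # No master heartbeats, but other hosts are still audible.
        return HostState.PARTITIONED
    if not isolation_ping_ok:
        # No heartbeats, no election traffic, isolation address unreachable.
        return HostState.ISOLATED
    return HostState.UNDETERMINED


print(classify_self(False, True, True))    # HostState.PARTITIONED
print(classify_self(False, False, False))  # HostState.ISOLATED
```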
Failed Master Host
· Master election is initiated
· New master is elected
· New master restarts all VMs on the protected list that are in the Not Running state
Failed Slave Host
· Master checks the network heartbeat
· Master checks the datastore heartbeat
· Master restarts the affected VMs
Isolation Responses
· Power Off
· Leave Powered On
· Shut Down
Isolation Detection
· Slaves hold a single-server election and check the ping (isolation) address
· Master checks the ping (isolation) address
· Master restarts the affected VMs
Isolation of a slave
· T0 – Isolation of the host (slave)
· T10s – Slave enters “election state”
· T25s – Slave elects itself as master
· T25s – Slave pings “isolation addresses”
· T30s – Slave declares itself isolated and “triggers” isolation response
Isolation of a master
· T0 – Isolation of the host (master)
· T0 – Master pings “isolation addresses”
· T5s – Master declares itself isolated and “triggers” isolation response
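The slave and master timelines can be written down as data, which makes the difference in reaction time easy to compare. This is just a restatement of the two timelines in Python; the variable and function names are made up for the example.

```python
# (seconds after isolation, event) pairs, copied from the two timelines.
SLAVE_TIMELINE = [
    (0, "isolation of the host"),
    (10, "enters election state"),
    (25, "elects itself as master"),
    (25, "pings isolation addresses"),
    (30, "declares itself isolated and triggers isolation response"),
]
MASTER_TIMELINE = [
    (0, "isolation of the host"),
    (0, "pings isolation addresses"),
    (5, "declares itself isolated and triggers isolation response"),
]


def events_by(timeline, t):
    """Return the events that have happened up to and including second t."""
    return [event for (at, event) in timeline if at <= t]


def time_to_response(timeline):
    """Seconds between isolation and the isolation response."""
    return timeline[-1][0] - timeline[0][0]


print(time_to_response(SLAVE_TIMELINE))   # 30
print(time_to_response(MASTER_TIMELINE))  # 5
print(events_by(SLAVE_TIMELINE, 12))
```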
Master declares a host dead when:
· Master can’t communicate with it over the network
· Host is not connected to master
· Host does not respond to ICMP pings
· Master observes no storage heartbeats
Results in:
· Master attempts to restart all VMs from host
· Restarts on network-reachable hosts and its own host
Master declares a host partitioned when:
· Master can’t communicate with it over the network
· Master can see its storage heartbeats
Results in:
· One master exists in each partition
· VC reports one master’s view of the cluster
· Only one master “owns” any one VM
· A VM running in the “other” partition will be
o monitored via the heartbeat datastores
o restarted if it fails (in master’s partition)
· When the partition is resolved, all but one of the masters abdicate
A host is isolated when:
· It sees no vSphere HA network traffic
· It cannot ping the isolation addresses
Results in:
· Host invokes (improved) isolation response
· Checks first if a master “owns” a VM
· Applied if VM is owned or datastore is inaccessible
Master:
· Restarts those VMs powered off or that fail later
· Reports host isolated if both can access its heartbeat datastores, otherwise dead
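From the master's side, the dead and partitioned criteria above combine network reachability with storage heartbeats. Here is a hedged sketch of that decision; the function and label names are invented for the example, and it deliberately does not try to separate "partitioned" from "isolated", since the notes do not spell out how the master tells those two apart.

```python
def master_view(connected_to_master, responds_to_ping, storage_heartbeats):
    """Classify a host the way the master-side criteria describe it."""
    # "Can't communicate over the network" = not connected AND no ICMP reply.
    unreachable = not connected_to_master and not responds_to_ping
    if not unreachable:
        return "reachable"
    if storage_heartbeats:
        # Host is alive but cut off: partitioned (or isolated).
        return "partitioned_or_isolated"
    # No network contact and no storage heartbeats: declare it dead,
    # which is what leads to the VM restarts.
    return "dead"


print(master_view(False, False, True))   # partitioned_or_isolated
print(master_view(False, False, False))  # dead
```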
Determine if a slave is alive
· Rely on heartbeats issued to slave’s HB datastores
· Each FDM opens a file on each of its HB datastores for heartbeating purposes
· Files contain no information. On VMFS datastores, the file will have the minimum allowed file size
· Files are named X-hb, where X is the (SDK API) moID of the host
· Master periodically reads heartbeats of all partitioned/isolated slaves
Determine the set of VMs running on a slave
· An FDM writes a list of powered-on VMs into a file on each of its HB datastores
· Master periodically reads the files of all partitioned/isolated slaves
· Each poweron file contains at most 140 KB of info. On VMFS datastores, actual disk usage is determined by the file sizes supported by the VMFS version
· They are named X-poweron, where X is the (SDK API) moID of the host
Protected-vm files are used
· When recovering from a master failure
· To determine whether a master is responsible for a given VM
FDMs create a directory (.vSphere-HA) in the root of each relevant datastore.
Within it, they create a subdirectory for each cluster using the datastore.
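Putting the directory layout together with the X-hb and X-poweron naming gives a predictable set of paths per host. The sketch below only builds path strings. The datastore name, the cluster subdirectory name, and the moID are hypothetical, and how FDM actually names the per-cluster subdirectory is not covered in these notes.

```python
from pathlib import PurePosixPath

HA_DIR = ".vSphere-HA"  # directory name given in the notes


def ha_files(datastore_root, cluster_dir, host_moid):
    """Build the per-host heartbeat and poweron file paths on one datastore."""
    base = PurePosixPath(datastore_root) / HA_DIR / cluster_dir
    return {
        "heartbeat": base / f"{host_moid}-hb",
        "poweron": base / f"{host_moid}-poweron",
    }


# Hypothetical datastore, cluster directory, and host moID.
files = ha_files("/vmfs/volumes/datastore1", "cluster-A", "host-42")
print(files["heartbeat"])  # /vmfs/volumes/datastore1/.vSphere-HA/cluster-A/host-42-hb
print(files["poweron"])    # /vmfs/volumes/datastore1/.vSphere-HA/cluster-A/host-42-poweron
```

With a moID of the form host-42, the X-hb pattern yields host-42-hb, which lines up with the host-<number>-hb form mentioned under Heartbeating.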