
Saturday, 6 July 2024

vCenter Server DRS | Concepts | Requirements | Configurations | General Purpose - Episode#7

Think about a cluster built on the server virtualization platform "vSphere". The purpose of this cluster is to pool the resources of all ESXi hosts running within it.

Background

How can we build the cluster?

Of course, you build the cluster through vCenter Server. For more information about vSphere HA and the cluster object, you can read my earlier article, "Understanding vSphere HA".

In short, vSphere HA restarts VMs (on a surviving host in the cluster) if a disastrous situation hits an ESXi host, such as a hardware failure or network isolation.

So, the cluster is a vCenter Server object separate from vSphere HA or DRS. vSphere HA and DRS are services that run on top of the cluster object and maintain resources amongst the ESXi hosts in it.

Definition and Concepts

DRS is short for Distributed Resource Scheduler. It is the service that runs on top of a cluster of ESXi hosts and keeps an eye on each VM's need for resources such as

  1. CPU
  2. Memory
  3. Storage IOPs
  4. Network Bandwidth

DRS is a vCenter Server service that maintains the vCenter Server "Cluster" object. Each cluster has its own DRS configuration.

With vSphere 7.x and above, DRS focuses on VM-centric resource utilization, scoring each VM and classifying it as happy (score above 80) or unhappy (score below 60). The score reflects how well a VM is getting its desired resources (mentioned above) on the ESXi host it is running on in the cluster.



So, if a VM does not have a good score, it is marked as unhappy on its current ESXi host and may need to be moved to the next available (qualifying) ESXi host, depending on the DRS configuration settings. We shall discuss these settings below.
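To make the scoring idea concrete, here is a tiny Python sketch of such a happy/unhappy classification. It uses only the thresholds mentioned above; the scores themselves are made-up examples, since the real scoring is computed internally by DRS.

# Conceptual sketch of DRS-style VM happiness classification.
# Thresholds (>80 happy, <60 unhappy) follow the article; the scores are invented.

def classify_vm(score: float) -> str:
    """Classify a VM's DRS score into a happiness bucket."""
    if score > 80:
        return "happy"      # VM is getting the resources it wants
    if score < 60:
        return "unhappy"    # candidate for migration to another host
    return "neutral"        # in between: DRS keeps watching

vm_scores = {"app01": 92, "db01": 55, "web01": 71}
for vm, score in vm_scores.items():
    print(f"{vm}: score={score} -> {classify_vm(score)}")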

Requirements

DRS is required whenever we want vSphere to take control of resource management and meet the SLAs of the applications running on top of our virtualization platform. Below are the requirements we always need to follow and focus on:

  • Enterprise Plus license for vSphere (this may change in the future, but full DRS functionality currently requires this license)
  • vCenter Server - DRS is a vCenter Server service
  • A cluster object must be configured before DRS is enabled, because DRS always works with a cluster
  • Shared datastores - for compute migration using vMotion amongst the ESXi hosts in a cluster
  • vMotion VMkernel configuration - this is a must; without it, only the Manual DRS level works

Configuration

Keep in mind that DRS focuses on resource management at the cluster level, not on individual ESXi hosts.

So you select the cluster on which you want to configure DRS and then click the Configure tab in the vSphere Client, as shown in the step below.

As you can see in the picture, DRS is already configured, but you can change the configuration by clicking the Edit option available in the same interface, as shown below.


Once you click Edit, the same configuration page appears and can be used to reconfigure the DRS service on the selected cluster object.




On the Edit DRS page you will see numerous configuration settings, as shown below.

So, DRS can be configured at three different modes/levels:

Manual Level

This is the default and first option offered by DRS. It does not require vMotion to be configured, and it only provides recommendations for workload movement based on the VMs' demand for resources.

Partially Automated

This level includes the Manual level's "recommendation only" feature and adds "initial placement", which means that before a workload/VM is powered on, DRS decides which ESXi host in the cluster is best placed to host it. This placement decision is part of this level.

Fully Automated

This level is generally recommended, and it depends on everything mentioned in the requirements section above. It provides the feature sets of the two levels above and automatically moves workloads as the system requires.
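For readers who like to script, the same DRS enablement can also be done against the vSphere API. Below is a minimal pyVmomi sketch; the vCenter address, credentials and the cluster name Cluster01 are placeholders for your own environment, and the snippet is meant as an illustration rather than a production script.

# Sketch: enable DRS in fully automated mode on a cluster using pyVmomi.
# The vCenter host name, credentials and cluster name are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab use only; validate certificates in production
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="********", sslContext=ctx)

# Locate the cluster object in the vCenter inventory
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == "Cluster01")

# Build an incremental cluster spec that enables DRS at the Fully Automated level
drs_config = vim.cluster.DrsConfigInfo(
    enabled=True,
    defaultVmBehavior="fullyAutomated",  # or "manual" / "partiallyAutomated"
)
spec = vim.cluster.ConfigSpecEx(drsConfig=drs_config)
task = cluster.ReconfigureComputeResource_Task(spec, modify=True)
print("DRS reconfiguration task submitted:", task)

Disconnect(si)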

Migration Threshold

This option controls how readily vMotion migrations should be performed when the DRS level is set to Fully Automated. vMotion always has a cost: migrating a VM from one host to another consumes bandwidth between the source and target ESXi hosts.




It is generally recommended to keep the slider in the middle, neither too conservative nor too aggressive.

Conservative

At this end of the threshold, if a VM needs resources, DRS does not respond immediately; instead it waits longer so that the resource spike may settle, which can delay providing resources when the VM needs them.

Aggressive

At this end of the threshold, even a slight spike in a VM's resource utilization triggers a vMotion (VM migration), which can result in vMotion being executed on a massive scale, with VMs moving back and forth all the time.

So, moderation is the best policy!
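If DRS is already enabled and you only want to nudge the migration threshold, a small follow-up to the pyVmomi sketch above could look like this (reusing the cluster object located earlier; in the API the threshold is the vmotionRate field, and 3 is the balanced middle value):

from pyVmomi import vim  # `cluster` is the vim.ClusterComputeResource found in the earlier sketch

# Change only the migration threshold; modify=True applies the spec incrementally,
# so the rest of the DRS configuration stays untouched.
spec = vim.cluster.ConfigSpecEx(
    drsConfig=vim.cluster.DrsConfigInfo(vmotionRate=3)  # 1..5, 3 = moderate
)
cluster.ReconfigureComputeResource_Task(spec, modify=True)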

Predictive DRS

As you know, DRS is by nature a reactive service: it responds only after a VM hits a resource spike. With Predictive DRS, its nature becomes more proactive as well.

If you enable this option (it works together with VMware Aria Operations, formerly vRealize Operations), DRS looks into the predictive analysis from Aria Operations/vROps as well as vCenter Server metrics, and moves VMs ahead of time based on time-based resource utilization patterns. For example, a VM running Active Directory services needs more resources at the start of the working day, when employees walk into the office and start logging in; such a VM should be given good resources from roughly 09:00 to 11:00, depending on the number of logon requests to AD.

Virtual Machine Automation

If you enable this option (check the box), you can add exceptions for VMs in the cluster to exclude them from DRS rules and actions. For example, you can exclude the vCSA VM from DRS in a cluster that hosts management and compute workloads together.

No affinity or anti-affinity rules are then applied to such VMs, even when rules are applied to the rest of the cluster. These VMs are considered to be configured at their own DRS automation level and are treated as exceptions.
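As a rough sketch of scripting such an exception, the vSphere API exposes per-VM DRS overrides. The example below reuses the cluster object from the earlier sketch and assumes a VM named vCSA01 (a placeholder); it simply disables DRS automation for that one VM:

from pyVmomi import vim  # `cluster` as located in the earlier sketch

# Find the vCenter Server Appliance VM inside the cluster (name is a placeholder).
vcsa_vm = next(vm for host in cluster.host for vm in host.vm if vm.name == "vCSA01")

# Per-VM override: exclude this single VM from DRS automation.
override = vim.cluster.DrsVmConfigSpec(
    operation="add",
    info=vim.cluster.DrsVmConfigInfo(key=vcsa_vm, enabled=False),
)
spec = vim.cluster.ConfigSpecEx(drsVmConfigSpec=[override])
cluster.ReconfigureComputeResource_Task(spec, modify=True)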

I will come up with a practical demonstration on my YouTube channel. If you haven't subscribed yet, please do have a look and subscribe.


Saturday, 24 July 2021

vSphere HA | Requirements | Admission Control | General Introduction

 Hello my dear readers, Greetings!

It has been quite a long time; I got engaged in my training deliveries, which is why I couldn't spare time to write a blog post.

Let's start our topic discussion!

vSphere HA is normally recognized by the restart of VMs on a surviving host in a vSphere cluster.

We normally enable vSphere HA on the vCenter Server cluster object, and it is helpful in different situations such as:

  1. ESXi host Hardware issues 
  2. Network disconnectivity among ESXi hosts in a cluster
  3. Shared Storage connectivity or unavailability issues with ESXi hosts
  4. Planned maintenance of ESXi hosts

How does it work?

vSphere HA, despite its name (HA = High Availability), works by restarting VMs on surviving hosts that can accommodate their requirements, as shown in the picture below.



For example, an ESXi host may hit a hardware problem that stops it from working, resulting in the unavailability of its VMs. The interrupted VMs are then taken care of by other available ESXi hosts in the same cluster, which power them on by accessing the same shared datastore.

Such failures could be host hardware faults, network interruptions, storage inaccessibility, etc.

This means we have to fulfil some important requirements for vSphere HA. Let's discuss them.

The basic high-level requirements are as below:
  1. vCenter Server (vpxd)
  2. Fault Domain Manager (FDM, local to every host)
  3. hostd (local to every host)
Let's break these requirements into understandable pieces

Hardware Requirements

  • Minimum 1 shared datastore; 2 shared datastores recommended
  • Minimum 2 and maximum 64 ESXi hosts in a cluster
  • Minimum 1 Ethernet network with a static IP address per host; 2 Ethernet networks with static IP addresses (and multiple gateways) recommended for the ESXi hosts

Software Requirements 

  • vCenter Server - To create cluster object
  • 1 management network common to all ESXi hosts in the cluster
  • vSphere HA enabled on the cluster object
  • Minimum vSphere Essentials Plus Kit license, or a single Standard vCenter Server license
Talking about the high-level requirements, vCenter Server is required to create the cluster object and to push FDM agents to the ESXi hosts that are part of the cluster as member hosts.

The FDM agent is a service that runs locally on each ESXi host in a cluster where vSphere HA is enabled. FDM takes care of all HA-related actions, such as:
  • HA logging
  • Restarting VMs on surviving hosts
  • Election of the master node in the cluster
  • Management of all vSphere HA requirements
The FDM service talks directly to the "hostd" service on each ESXi host.

The basic purpose of "hostd" is to create, delete, start, restart and shut down VMs; in fact, all the actions an ESXi host performs against its VMs are handled by "hostd".
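Since enabling vSphere HA is just another cluster reconfiguration, it can be scripted as well. Here is a minimal pyVmomi sketch (vSphere HA appears as "das" in the API); the vCenter address, credentials and cluster name are placeholders, and only the basic switches are flipped:

# Sketch: enable vSphere HA ("das" in the vSphere API) on a cluster with pyVmomi.
# The vCenter host name, credentials and cluster name are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab use only
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local",
                  pwd="********", sslContext=ctx)

content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.ClusterComputeResource], True)
cluster = next(c for c in view.view if c.name == "Cluster01")

das_config = vim.cluster.DasConfigInfo(
    enabled=True,                  # turn on vSphere HA
    hostMonitoring="enabled",      # let FDM react to host failures and isolation
    admissionControlEnabled=True,  # reserve failover capacity in the cluster
)
spec = vim.cluster.ConfigSpecEx(dasConfig=das_config)
cluster.ReconfigureComputeResource_Task(spec, modify=True)

Disconnect(si)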

vSphere HA Anatomy

When you enable vSphere HA on a cluster, the members of the cluster are divided into two basic roles:
  1. Master Node
  2. Slave / Subordinate Nodes
There is only one master node in a vSphere HA cluster, and the rest are slave/subordinate nodes. The total size of a vSphere cluster can go up to 64 nodes (1 master and 63 slave nodes) in vSphere 6.5/6.7/7.

The master node is responsible for restarting VMs on the available surviving (slave/subordinate) hosts.

The master node is responsible for evenly distributing the workload of restarting VMs across the surviving hosts.

The master node is responsible for informing vCenter Server about the current status of the vSphere HA cluster.

The master node is responsible for keeping track of heartbeats from the slave nodes, either over the network or through the datastores.

How does the master node know which VMs need to be restarted on surviving hosts?

There is a file named "protected list" located on the shared datastores that the master node can access; the file is held/locked by the master node.

This file contains information about Virtual Machines running on their respective hosts.

Another file, named "power-on", is located on the shared datastores and is accessible by all nodes in the cluster, including the master node. Every host updates its timestamp in this file every 5 minutes, marking its connectivity so that network isolation can be detected.

The significance of the "power-on" file is to let the master node know whether a host that has disappeared from the Ethernet network is merely isolated. By checking the latest 5-minute timestamp the host has written to this file, the master node can confirm that the host is still alive through the alternative heartbeat channel (the datastores) rather than the Ethernet network.

The minimum number of alternative heartbeat sources (in the form of accessible datastores) is two. It is highly recommended to choose the heartbeat datastores manually instead of letting vCenter Server choose them automatically for you.
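As a sketch of pinning the heartbeat datastores yourself, the snippet below continues from the HA sketch above. The datastore names are placeholders, and the "userSelectedDs" policy string is my assumption of the API's datastore-heartbeat candidate policy value, so double-check it against your pyVmomi version:

from pyVmomi import vim  # `si` and `cluster` as in the sketch above

# Pin two shared datastores (placeholder names) as the HA heartbeat datastores.
wanted = {"SharedDS01", "SharedDS02"}
hb_datastores = [ds for ds in cluster.datastore if ds.name in wanted]

das_config = vim.cluster.DasConfigInfo(
    heartbeatDatastore=hb_datastores,
    hBDatastoreCandidatePolicy="userSelectedDs",  # use only the datastores listed above
)
cluster.ReconfigureComputeResource_Task(
    vim.cluster.ConfigSpecEx(dasConfig=das_config), modify=True)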

(Design Tip)

Design your vSphere HA network with redundant Ethernet gateways and keep your shared storage network (fabric) physically separate. In case of a network disaster, your vSphere design can then survive or mitigate the situation.

How do the different nodes respond to HA failure scenarios?

Master Node:
The master node is responsible for restarting a failed host's VMs on surviving hosts and for updating the "protected list" file across all the datastores it can access.
  • If the master node itself suffers a failure (hardware issue, network isolation, etc.), the VMs running on that host are evenly distributed amongst the surviving hosts right after the election process. What is the election process?
    • All the slave nodes in the cluster send heartbeats to each other and to the master node, and wait for the master node's heartbeat.
    • If the slave nodes do not receive the master node's heartbeat for 15 seconds, they consider it dead.
    • The slave nodes then initiate a special broadcast, known as election traffic, through which they elect the next master node amongst themselves.
  • This election process runs for the next 15 seconds, right after the slave nodes have waited 15 seconds for the master node's heartbeat.
  • Right after the election process, the newly elected master node takes over the "protected list" file and initiates the placement and restart of the affected VMs that were running on the faulty master node, which takes roughly another 15 seconds.
Conclusion:
When the master node fails, it takes around 45 seconds (15 + 15 + 15) before the affected virtual machines are restarted on the surviving hosts.
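A quick back-of-the-envelope sketch of that timeline, using the 15-second intervals described above (real-world timings vary):

# Conceptual timeline of a master-node failure, using the intervals described above.
PHASES = [
    ("Slaves miss the master's heartbeat", 15),
    ("Election of a new master node", 15),
    ("New master reads the protected list and restarts the VMs", 15),
]

elapsed = 0
for phase, seconds in PHASES:
    elapsed += seconds
    print(f"t+{elapsed:>2}s  {phase}")

print(f"Approximate total before the affected VMs are restarted: {elapsed} seconds")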

Slave nodes:
These are the nodes that take instructions from the master node to restart the affected virtual machines from a failing host.
  • If a slave node suffers a failure (hardware fault and/or network isolation), the master node takes responsibility for restarting that host's VMs on the available hosts in the cluster.
  • The master node makes its decision within 15 seconds and evenly distributes the VMs amongst the surviving hosts in the cluster.
Conclusion:
When a slave node fails, it takes around 15 seconds before its VMs are restarted amongst the surviving hosts.

About Network Isolation
In this state, the affected host (or hosts) cannot contact their gateways, and the master node cannot contact the isolated hosts. That is the reason we choose an alternative to the Ethernet heartbeat in the form of the datastore heartbeat.

This kind of isolation hurts even more if we have not taken care of the Ethernet design, along with redundant access to shared storage.
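On the configuration side, extra isolation addresses (for example a redundant gateway) can be added as HA advanced options. The sketch below reuses the cluster object from the HA example above; the IP address is a placeholder:

from pyVmomi import vim  # `cluster` as located in the HA sketch above

# das.isolationaddress0 gives hosts an extra address to ping before declaring
# themselves isolated; the IP below is a placeholder for a redundant gateway.
das_config = vim.cluster.DasConfigInfo(option=[
    vim.option.OptionValue(key="das.isolationaddress0", value="192.168.10.1"),
])
cluster.ReconfigureComputeResource_Task(
    vim.cluster.ConfigSpecEx(dasConfig=das_config), modify=True)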

Better Ethernet network designs
Choosing a better, physically separated topology for vSphere HA always helps a lot, as you can see in the picture below.



The picture above depicts the recommended approach for system traffic isolation: physical isolation of system traffic can be achieved by provisioning separate virtual switches for the different types of system traffic.

In this picture, though, I have shown two types of traffic sharing the same virtual switch, which illustrates that you can also combine different system traffic types or separate them only logically.

Another important aspect I want to draw your attention to is redundancy, from the most basic component (the vmnic) all the way to the physical switches. This approach can also lessen the impact of a network-level disaster.

Note: You can also use the same DNS instead of separate DNS zones for each network, as shown in the picture above.

Logical (isolation) network
In this scenario, you can work with as few vmnics (physical network cards) as are available, especially in the case of blade chassis, and separate the system traffic (Management, vMotion, vSAN, FT, Replication, etc.) logically using VLANs.

Note: A better network design even protects against disasters like shared storage unavailability, which leads to problems such as APD (All Paths Down).


To be continued (Stay Tuned...!)

 
