Saturday, 6 July 2024

vCenter Server DRS | Concepts | Requirements | Configurations | General Purpose - Episode#7

Think about a cluster built on the server virtualization platform vSphere. The purpose of this cluster is to pool the resources of all the ESXi hosts running within it.

Background

How can we build the cluster?

Of course, you build the cluster through vCenter Server. For more information about vSphere HA and the cluster object, you can read my last article, "Understanding vSphere HA".

In short, vSphere HA restarts VMs on a surviving host in the cluster if something disastrous happens to an ESXi host, such as a hardware failure or network isolation.

So, the cluster is a vCenter Server object that is separate from vSphere HA and DRS. vSphere HA and DRS are services that run on top of the cluster object and try to maintain resources amongst the ESXi hosts in it.

Definition and Concepts

DRS is short for Distributed Resource Scheduler. It is the service that runs on top of a cluster of ESXi hosts and keeps an eye on the VMs' need for resources such as:

  1. CPU
  2. Memory
  3. Storage IOPs
  4. Network Bandwidth

DRS is a vCenter Server service that runs on and maintains the vCenter Server object "Cluster". Each cluster has its own separate DRS configuration.

From vSphere 7.x onwards, DRS focuses on VM-centric resource utilization and scores each VM, dividing VMs into Happy (>80) and Un-Happy (<60). This score reflects how well a VM is getting its desired resources (mentioned above) on the ESXi host it runs on in the cluster.



So, if a VM does not have a good score, it is considered (marked) unhappy on its current ESXi host and needs to be moved to the next available (qualifying) ESXi host, depending on the DRS configuration settings. We shall discuss these settings below.
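To make the scoring idea concrete, here is a minimal sketch in Python. It only buckets a score using the two thresholds quoted above; the function name, the "in-between" label and the sample VM names are my own assumptions for illustration and are not part of any vSphere API.

```python
def classify_vm(score: int) -> str:
    """Bucket a VM score (0-100) using the thresholds quoted in this post."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score > 80:
        return "happy"
    if score < 60:
        return "unhappy"
    # The post does not name the 60-80 range, so this label is an assumption.
    return "in-between"


# Hypothetical VMs and scores, purely for illustration.
scores = {"web01": 92, "db01": 55, "app01": 71}
for name, score in scores.items():
    print(name, score, classify_vm(score))
```

In this toy model only db01 would be a candidate for a move to another host.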

Requirements

DRS is required whenever we want vSphere to take control of resource management and meet the SLA of the applications running on top of our virtualization platform. For that, below are the requirements we always need to meet:

  • Enterprise Plus License for vSphere (may change in future, but for full DRS functionality this license is needed)
  • vCenter Server - DRS is a vCenter Server service
  • Cluster object must be configured before DRS is enabled - because DRS always works with a cluster
  • Shared Datastore - for compute migration using vMotion amongst the ESXi hosts in a cluster
  • vMotion VMkernel Configuration - this is a must; without this configuration only the Manual DRS level works

Configuration

Always keep in mind that DRS manages resources at the level of the cluster, not on an individual ESXi host.

So, select the cluster on which you want to configure DRS and then click the Configure tab of the vSphere Client, as shown in the step below.

In the picture you can see that DRS is already configured, but you can change the configuration by clicking the Edit option available in the same interface, as you can see below.


Once you click Edit, the same configuration page appears, and it can be used to reconfigure the DRS service on the selected cluster object.




In the Edit DRS page you will see numerous configuration settings, as you can see below.

So, DRS is configurable in 3 different modes / levels

Manual Level

This is the first option offered by DRS. It does not require you to configure vMotion, and it only provides "Recommendations" for workload movement based on the workloads' demand for resources.

Partially Automated

This level covers the "Manual Level" feature, which is "Recommendation only", and it also gives you "Initial Placement", which means that before a workload/VM starts up, DRS decides which ESXi host in the cluster is best suited to run it. So this decision capability is part of this level.

Fully Automated

This level is always "Recommended", and it needs all the requirements mentioned in the Requirements section above. It provides the feature sets of the above 2 levels and also automatically moves workloads as and when needed.
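The three levels can be summed up in a few lines of Python. This is only a sketch of the behaviour as described in this post, not vCenter code; the enum and function names are made up for illustration.

```python
from enum import Enum


class Level(Enum):
    MANUAL = 1
    PARTIALLY_AUTOMATED = 2
    FULLY_AUTOMATED = 3


def drs_behaviour(level: Level) -> dict:
    """What DRS does by itself at each level, following the text above."""
    return {
        # Placement at power-on is automatic from Partially Automated upwards.
        "initial_placement": "recommend" if level is Level.MANUAL else "automatic",
        # Moving running VMs is automatic only when Fully Automated.
        "load_balancing": "automatic" if level is Level.FULLY_AUTOMATED else "recommend",
    }


for level in Level:
    print(level.name, drs_behaviour(level))
```

Reading the output top to bottom shows how each level adds one more thing that DRS does without asking you.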

Migration Threshold

This option configures how readily vMotion should be triggered when the DRS level is set to Fully Automated. It matters because vMotion always has a cost: migrating VMs from one host to another consumes bandwidth between the source and target ESXi hosts.




It is always recommended to leave the slider in the middle, neither too Conservative nor too Aggressive.

Conservative

At this threshold level, if a VM needs resources, DRS will not respond immediately. Instead, it waits longer so that the resource spike may settle, which could delay providing resources when the VM needs them.

Aggressive

At this threshold level, even a slight spike in a VM's resource utilization triggers a vMotion (VM migration), which could result in a massive number of vMotions that keep VMs moving back and forth all the time.

So, moderation is the Best policy!!
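One simplified way to picture the slider is as a filter on recommendations. The sketch below assumes a model in which every recommendation carries a priority from 1 (most important) to 5 (least important) and the slider position, 1 (Conservative) to 5 (Aggressive), decides how far down the list DRS acts. The real algorithm is more involved, and all names and data here are hypothetical.

```python
from typing import NamedTuple


class Recommendation(NamedTuple):
    vm: str
    target_host: str
    priority: int  # 1 = most important, 5 = least important (assumed model)


def to_apply(recs, threshold):
    """Keep the recommendations acted on at a given slider position.

    threshold: 1 = most conservative ... 5 = most aggressive.
    """
    if threshold not in range(1, 6):
        raise ValueError("threshold must be between 1 and 5")
    return [r for r in recs if r.priority <= threshold]


# Hypothetical recommendations for illustration only.
recs = [
    Recommendation("db01", "esxi-02", 1),
    Recommendation("web01", "esxi-03", 3),
    Recommendation("app01", "esxi-02", 5),
]
for t in (1, 3, 5):
    print(t, [r.vm for r in to_apply(recs, t)])
```

At position 1 only db01 moves, at position 5 all three VMs move, and the middle position sits between the two, which is the moderation argued for above.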

Predictive DRS

As you know, DRS is by nature a reactive service: it responds only once a VM hits a resource spike. With this option, its nature becomes more proactive as well.

If you enable this option (which works with VMware Aria Operations, formerly vRealize Operations), DRS looks at the predictive analysis provided by VAO/vROps, or at the logs provided by vCenter Server, and moves the VM if needed based on time-based resource utilization. For example, a VM running Active Directory services needs more resources at office start-up time, when employees walk into the office and start logging in. So that VM must be given good resources from 09:00 to 11:00, or depending on the number of logon requests to AD.
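The time-based idea can be sketched with a toy forecast: average the historical CPU demand per hour of the day and flag the hours that are usually busy. This is not how Aria Operations computes its predictions; the function name, the 70% limit and the sample history are assumptions made up for this example.

```python
from collections import defaultdict
from statistics import mean


def busy_hours(samples, limit_pct):
    """Return the hours of day whose average CPU demand reaches limit_pct.

    samples: iterable of (hour_of_day, cpu_percent) observations.
    """
    by_hour = defaultdict(list)
    for hour, cpu in samples:
        by_hour[hour].append(cpu)
    return sorted(h for h, values in by_hour.items() if mean(values) >= limit_pct)


# Hypothetical history for an Active Directory VM over two mornings.
history = [(8, 20), (9, 85), (10, 90), (11, 40),
           (8, 30), (9, 75), (10, 80)]
print(busy_hours(history, 70))
```

With this history the busy hours come out as 9 and 10, so a proactive scheduler would make sure the VM has headroom before 09:00 instead of reacting after the logon storm has started.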

Virtual Machine Automation

If you enable this option (tick the check box), you can add "Exceptions" for VMs in the cluster, removing those VMs from the cluster-wide DRS rules and implications. For example, you can add the vCSA VM so that it is not affected in a cluster that hosts management and compute workloads together.

So no Affinity or Anti-Affinity rule would be applied to such VMs when rules are applied to the whole cluster. In this case, those VMs are configured at their own separate DRS level and are treated as exceptional VMs.
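In other words, a per-VM exception wins over the cluster-wide setting. A minimal sketch of that lookup, with made-up VM names and plain text labels instead of real API values:

```python
def effective_level(vm_name, cluster_level, overrides):
    """Return the per-VM exception if one exists, else the cluster setting."""
    return overrides.get(vm_name, cluster_level)


cluster_level = "Fully Automated"
overrides = {"vcsa01": "Disabled"}  # hypothetical exception for the vCSA VM

for vm in ("vcsa01", "web01"):
    print(vm, "->", effective_level(vm, cluster_level, overrides))
```

Here vcsa01 keeps its own setting while web01 simply follows the cluster.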

I will come up with a practical demonstration on my YouTube channel. If you haven't subscribed yet, please have a look and subscribe.


Thursday, 4 July 2024

vSphere 8 vMotion | Concepts | Requirements | Configurations | General Information - Episode#6

How vMotion Works

Broadly, there are 2 types of workload migration, depending on the power state of the workload (powered on or powered off). If the workload is powered off and you migrate the VM, it is known as Cold Migration. If the workload is powered on and you migrate it, it is known as Hot Migration.

Where do we do hot migrations and what are the benefits? 

  • Migrate a VM while it is powered on and users are connected, without losing connectivity or data.
  • Hardware maintenance of the hypervisor is not a problem while the workload is powered on and connected.
  • This migration can be done automatically by DRS, depending on hardware resource availability.

Let's talk about this in more detail.

If an application represents a business and that business's availability, then it should not become unavailable by any means, whether you update or upgrade hardware/software or do maintenance while 100s or 1000s of people are connected to that application.

If someone says we can achieve this with redundant application interfaces, the answer is "Yes", but not 100%. If a number of connections are served by one interface of the application, and that interface runs on hardware that requires an upgrade or maintenance, then those connections will be lost, resulting in application unavailability or interruption.

So, vMotion is a technology that keeps the virtual machine available even while the underlying hardware is being updated or upgraded, and even during software upgrades of the hypervisors (ESXi hosts), without the users noticing while they stay connected to the very same instance of the VM.

Requirements!

  • Configure VMkernel ports for ESXi host to ESXi host communication
  • 250 Mbps minimum bandwidth required per vMotion.
  • Configure a common (shared storage) datastore for VMs amongst the ESXi hosts
  • ESXi hosts must have a common type of CPU (Family = Intel / AMD)
  • VMs must not be tied to local hardware resources such as a DVD drive, CPU pinning or a local datastore
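The checklist above can be written down as a small pre-flight check. This is only a sketch in plain Python: the Host and VM classes, their fields and the sample data are hypothetical and do not come from any VMware SDK, and the real compatibility check in vCenter looks at far more than this.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    cpu_vendor: str              # "Intel" or "AMD"
    has_vmotion_vmkernel: bool
    datastores: set


@dataclass
class VM:
    name: str
    datastore: str
    local_devices: list = field(default_factory=list)  # e.g. ["host DVD drive"]


def vmotion_blockers(vm, source, target, link_mbps):
    """Return a list of reasons why the checklist above is not met."""
    problems = []
    if not (source.has_vmotion_vmkernel and target.has_vmotion_vmkernel):
        problems.append("vMotion VMkernel port missing")
    if link_mbps < 250:
        problems.append("less than 250 Mbps available")
    if vm.datastore not in target.datastores:
        problems.append("datastore not shared with target host")
    if source.cpu_vendor != target.cpu_vendor:
        problems.append("CPU vendor mismatch")
    if vm.local_devices:
        problems.append("local devices attached: " + ", ".join(vm.local_devices))
    return problems


# Hypothetical hosts and VM for illustration.
esx1 = Host("esxi-01", "Intel", True, {"shared-ds01", "local-ds-esx1"})
esx2 = Host("esxi-02", "Intel", True, {"shared-ds01"})
vm = VM("web01", "shared-ds01", ["host DVD drive"])
print(vmotion_blockers(vm, esx1, esx2, link_mbps=1000))
```

An empty list means every point of the checklist is satisfied. In this example the attached DVD drive is the only blocker.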

If you comply with the above points, then your vMotion configuration is in place and you have paved the path to configure DRS, which is the automated (AI based), VM score based service that identifies VM resource needs.

How to initiate vMotion for a VM?

  • Just right click the VM and from the pop-up menu select migrate
  • Choose "Compute" migration and select the "Target" host.

Step#1 From Pop-up menu choose "Migrate"

Step#2 Choose type of migration "Compute"

Step#3 Choose compute (i.e. Target ESXi host where you want to migrate selected VM)

Step#4 Select the network if the VM was on a virtual standard switch. If the VM is on a distributed switch, there is no need to deal with the network, because the network state and configuration move with the VM (or rather remain the same across ESXi hosts, but only if the target ESXi host is part of the same vDS).

Step#5 Choose the vMotion priority. The default is "Normal" priority, but you can set it to high priority when migrating multiple VMs. This is a system setting and does not need any number to be set (like higher number, higher priority etc.).

If you set high priority for VMs, then resources like CPU and memory are allocated to their migration first, ahead of other VMs.

On the last page, just before the migration is initiated, the wizard tells you whether the VM/VMs can be migrated or not (maybe because of compatibility issues or some other problem).

As soon as you initiate the migration, the VM migration process starts and finishes without the connected users of that workload knowing about it.

So what happens in this migration?

As we mentioned above in "Requirements", a single vMotion requires 250 Mbps. Why is it needed, and what kind of data does it move?

  • The memory occupied by the VM on the source ESXi host is copied to the target ESXi host over the vMotion VMkernel network
  • Because memory pages are updated so quickly, after the complete memory has been copied from the source ESXi host to the target ESXi host, any pages changed in the meantime are tracked in a "bitmap" and copied again, until only minor changes are left
  • Then comes a quick pause to switch over from the source to the target ESXi host, known as "quiesce"
  • During this phase, a Reverse ARP (RARP) / gratuitous ARP is initiated by the target ESXi host, letting the first-hop physical switch update its CAM table so that VM traffic flows to the new host right after the VM switch-over happens.
  • Because the VM data files are located on a single (datastore) storage, the target host's access to it is required, and this is assessed and verified before the migration initiates.
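The memory copy described in the points above can be simulated with simple arithmetic. The sketch below is a toy model only: it assumes a constant link speed and a constant rate at which the VM dirties memory, and the stop_mb and max_rounds values are made-up knobs, not VMware settings.

```python
def precopy_rounds(memory_mb, bandwidth_mbps, dirty_mb_per_s,
                   stop_mb=16.0, max_rounds=20):
    """Simulate iterative memory copy rounds before the switch-over.

    Returns (rounds, remaining_mb), where rounds is a list of
    (megabytes_copied, seconds_taken) and remaining_mb is what is left
    to copy during the final short pause.
    """
    link_mb_per_s = bandwidth_mbps / 8  # megabits per second -> megabytes per second
    remaining = float(memory_mb)
    rounds = []
    for _ in range(max_rounds):
        seconds = remaining / link_mb_per_s
        rounds.append((remaining, seconds))
        # Pages dirtied while this round was on the wire must be sent again,
        # but never more than the whole memory of the VM.
        remaining = min(float(memory_mb), dirty_mb_per_s * seconds)
        if remaining <= stop_mb:
            break
    return rounds, remaining


# Hypothetical VM: 4 GB of memory, a 250 Mbps link, 5 MB/s of changed pages.
rounds, left = precopy_rounds(4096, 250, 5)
for i, (mb, sec) in enumerate(rounds, start=1):
    print(f"round {i}: {mb:.1f} MB in {sec:.1f} s")
print(f"left for the switch-over: {left:.1f} MB")
```

With these made-up numbers the copy settles after four rounds, each much shorter than the one before. If the VM dirties memory faster than the link can carry it, the rounds never shrink, which is one way to see why a minimum bandwidth per vMotion matters.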

We shall discuss "Cross-vCenter vMotion" later. I hope you enjoyed the topic. Stay tuned!
