Download – Aria Operations Windows Server Checker Dashboard

Windows Server 2012 R2 reached end of life in October 2023, which means free updates, bug fixes, and technical support for the OS have ended. Microsoft allows customers to purchase Extended Security Updates until 2026. With that in mind, I created this dashboard to check which environments still have Windows Server 2012 and below with a single click.

Download Dashboard here

https://developer.vmware.com/web/dp/samples?id=6554

User Guide

Select an Environment to do a Windows Check. Select vSphere World to search for all environments.

Get a nice breakdown of all Windows variants in the environment.

Anything in this column is Windows 2012 and below

Anything on this side is Windows 2016 and above
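
To make the split concrete, here is a minimal Python sketch of the same bucketing logic. It is not how the dashboard itself is built; the guest OS strings and the function name windows_bucket are assumptions for illustration only.

```python
import re


def windows_bucket(guest_os: str) -> str:
    """Bucket a guest OS string by Windows Server release year.

    Returns "legacy" for Windows Server 2012 and below, "current" for
    2016 and above, and "other" when no Windows Server year is found.
    The string format is an assumption, not a guaranteed vCenter format.
    """
    match = re.search(r"Windows Server (\d{4})", guest_os)
    if not match:
        return "other"
    return "legacy" if int(match.group(1)) <= 2012 else "current"


inventory = [
    "Microsoft Windows Server 2012 (64-bit)",
    "Microsoft Windows Server 2016 or later (64-bit)",
    "Ubuntu Linux (64-bit)",
]
for os_name in inventory:
    print(os_name, "->", windows_bucket(os_name))
```

Older naming styles that do not follow the "Windows Server YYYY" pattern would fall into "other" here and would need their own rule.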




4 ways to measure VM Uptime

I always tell my customers that Virtual Machine uptime is one of the hardest metrics to measure. It is hard because there are many factors that most people don’t think of. Many people think it is as simple as a VM being powered on or powered off. But what about VMs that are supposed to be powered off? And what if a VM is powered on but shows the blue screen of death, or has no network activity? In my opinion, if a VM is powered on but the operating system is not functional, it should be considered down, because the virtual machine is unusable. In this guide I will explain the possible scenarios and some of the methods I use to measure them using Aria Operations.

VM Down Scenarios

  • Virtual Machine is Powered Off in vCenter but is supposed to be On
  • Virtual Machine is Powered On but Operating System has the Blue Screen of Death
  • Virtual Machine is Powered On but Operating System is in a hung state
  • Virtual Machine is Powered On but has no network activity because of a bad IP configuration or, even worse, no vNIC/network attached.
  • Virtual Machine is Powered On but nobody can access the URL for that application.

The above scenarios are what I consider a Virtual Machine being down or unavailable. Now you can understand why this is one of the hardest metrics to measure.
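
The scenarios above can be sketched as a single decision function. This is only an illustration of the reasoning, not something Aria Operations exposes; the VMState fields are hypothetical names I made up for this example.

```python
from dataclasses import dataclass


@dataclass
class VMState:
    # All field names are hypothetical and exist only for this sketch.
    powered_on: bool
    expected_on: bool = True         # False for VMs that are meant to be off
    os_responding: bool = True       # False for a blue screen or hung OS
    network_active: bool = True      # False for bad IP config or no vNIC
    app_url_reachable: bool = True   # False when users cannot reach the app


def is_down(vm: VMState) -> bool:
    """Return True when the VM should count as down or unavailable."""
    if not vm.expected_on:
        # A VM that is supposed to be off is never counted as down.
        return False
    if not vm.powered_on:
        return True
    return not (vm.os_responding and vm.network_active and vm.app_url_reachable)


print(is_down(VMState(powered_on=True, os_responding=False)))   # True
print(is_down(VMState(powered_on=False, expected_on=False)))    # False
```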

Methods to Measure VM Uptime

Measuring Blue Screen of Death and Hung VMs

One of my colleagues tested many scenarios already so you can read more about it below.

Measuring VMs down by using a Super Metric
This super metric more accurately measures whether a VM is down, based on whether the VM’s OS is responding. Using this method, I can also find VMs that were up more than a certain percentage of the time but are currently down. In the screenshot below, vCenter reports that the VM was powered on 100% of the time for the given period (Power On%). However, my super metric (VM Uptime %) shows that the OS was unavailable at some point.

http://www.vmignite.com/2020/11/download-vm-uptime-dashboard-for-vrops-8-2/
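
To show why the two percentages can disagree, here is a small Python sketch. It is not the super metric formula from the post above; it assumes one (powered_on, os_responding) sample per collection cycle, and the sample data is made up.

```python
def percent(flags):
    """Percentage of True values in an iterable of booleans."""
    flags = list(flags)
    return 100.0 * sum(flags) / len(flags) if flags else 0.0


# One hypothetical sample per collection cycle: (powered_on, os_responding).
samples = [(True, True), (True, True), (True, False), (True, True)]

power_on_pct = percent(p for p, _ in samples)       # what the power state says
uptime_pct = percent(p and o for p, o in samples)   # powered on AND OS responding

print(power_on_pct)  # 100.0
print(uptime_pct)    # 75.0
```

The VM was powered on in every cycle, yet the OS was unresponsive in one of four, so the uptime figure drops to 75%.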

 

Measuring VMs by Ping pack

This method simply allows you to ping any URL or IP address and measure whether it responds.

http://www.vmignite.com/2020/12/2123/
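
If you want to test the idea outside of Aria Operations, a rough Python sketch follows. Note the swap: it uses a TCP connection attempt instead of an ICMP ping, because ICMP usually needs elevated privileges. The function names and the default port are assumptions, and this is not how the Ping pack works internally.

```python
import socket


def tcp_responds(host, port=443, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def availability(targets, probe=tcp_responds):
    """Map each target to True (responding) or False (not responding)."""
    return {target: probe(target) for target in targets}


# Usage with a stand-in probe, so the example runs without network access.
print(availability(["app01", "app02"], probe=lambda host: host == "app01"))
```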

 

 

Measuring VMs down by using vSphere Tags

This method allows you to control which VMs you want to measure by creating one alert and tagging any VM in vCenter with a tag to see if it is down or up. In the post below I kept it simple and based it only on a VM being in a powered off state. If you would like, you can take all the above scenarios, make a super alert that combines them into one, and then control which VMs to monitor by using vSphere tags.

http://www.vmignite.com/2021/05/vrops-how-to-monitor-vm-uptime-using-vsphere-tags/

 



Aria Operations Tips (Shorten collection cycle, Pricing, Policies, Telegraf, Automate Maintenance mode, Super Metrics)

Although I spend endless hours testing features and perfecting day 2 operations, I also spend time reading what other people have written about monitoring. With that being said, below are some blog posts written by others that I think will be useful for my readers.

Changing collecting cycle down from 5 minutes to 20 seconds

https://www.brockpeterson.com/post/aria-operations-saas-near-real-time-monitoring

Pricing VMs

https://www.brockpeterson.com/post/pricing-with-vmware-aria-operations

Policies explained

https://www.brockpeterson.com/post/vmware-aria-operations-policies

Monitoring Services and Processes using Telegraf

https://www.brockpeterson.com/post/monitoring-windows-services-processes-and-linux-processes-with-aria-operations

vROPS Management Pack Builder

https://www.brockpeterson.com/post/vrops-management-pack-builder

How to automate Object Maintenance Mode

https://www.ntpro.nl/blog/archives/3605-How-to-automate-Object-Maintenance-Mode-in-vRealize-Operations-8.html

Super Metrics Creation

https://www.vmwareopsguide.com/miscellaneous/chapter-4-super-metrics/4.4.3-examples/

https://blogs.vmware.com/management/2020/09/my-top-15-vrealize-operations-super-metrics.html

Understanding Metrics

https://www.vmwareopsguide.com/metrics/

https://www.vmwareopsguide.com/downloads/vsphere-metrics/

Blue Screen of Death detection

How to detect Windows Blue Screen of Death using VMware Aria Operations




Master Alerting in less than an hour using VMware Aria Operations

In this video tutorial I will walk you through how to create a multi-condition alert, how to create an alert based on a parent/child object, how to send the alert to email and other third-party tools, how to customize the alert message that reaches your inbox, and my alerting recommendations based on 10 years of experience. After watching the video, make sure to check out my Alerting Do’s and Don’ts post: http://www.vmignite.com/2021/06/vrops-alerting-dos-and-dont/



vROPS – Environment Performance Bench Marking Dashboard (Download)

This must-have dashboard comes in handy when you need to know how your clusters, hosts, and virtual machines are performing at any point in time. For example, if I do load testing on my cluster, how can I monitor the performance of all my Hosts and Virtual Machines, and the overall health of my cluster, during that time? This dashboard shows CPU performance, Memory performance, VM Growth, network usage, disk space growth, and more. Please see the YouTube video below for full instructions on how to use the dashboard and some live examples of how it helped solve one of my customer’s recurring issues.

Download here: https://developer.vmware.com/web/dp/samples?id=8089

Guide and demo on how to use the Dashboard

Installation Instructions:

Download the .json file from the above link and import into your dashboards. Then look for the dashboard named VMignite.com – Environment Benchmarking



The Virtualization Engineer – Explaining what we do to friends and family video

Even after 15 years in the virtualization space, my friends and family still don’t understand what I do when I try to explain that I am a VMware Engineer, AKA Virtualization Engineer. This video is a parody of some of the pain points that we go through. Please feel free to email and share it with your peers, friends, and co-workers to make their day. Also remember to subscribe and like.



vROPS – Master the List View video tutorial

Create a dashboard and report using the most popular widget by far, the list view. Mastering the list view widget can instantly make you an advanced vRealize Operations Manager user. I will also show you expert tips and advice for using the list view. This is my first video on YouTube; remember to subscribe, as there will be many more videos to come.



vROPS 8.10 – How to view and filter by Deleted Objects

New in vROPS 8.10 is the ability to filter deleted objects such as Virtual Machines, Hosts, datastores, etc. In this guide I will walk you through how to view and filter based on deleted objects. I will also show you how to retain up to 60 days of deleted objects.

Pre-requisites

  • The ability to filter based on Deleted Objects is a feature of vROPS 8.10 and above only. Older versions can show the deleted time but cannot show deleted objects only.
  • This only works for List Views

How to view deleted objects only

You can create or edit any list view widget. Expand Settings and select Deleted under Show Objects. The widget will then show only deleted objects, if any exist.

How to keep up to 60-days’ worth of deleted objects

By default vROPS only keeps a few days’ worth of deleted data.  To increase this we will need to go to Administration > Global Settings

Under Deleted Objects set how many hours you would like to keep deleted objects and click on Save.  Maximum is 1440 hours which is equivalent to 60 days’ worth of data. 
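
Since the setting is entered in hours, a quick conversion helps. The sketch below is plain arithmetic based on the 1440-hour maximum mentioned above; the function name is my own and nothing here calls vROPS.

```python
MAX_RETENTION_HOURS = 1440  # maximum stated above, equal to 60 days


def retention_hours(days: int) -> int:
    """Convert a retention period in days to the hours value to enter."""
    hours = days * 24
    if not 0 < hours <= MAX_RETENTION_HOURS:
        raise ValueError(f"{days} days is outside the 1 to 60 day range")
    return hours


print(retention_hours(30))  # 720
print(retention_hours(60))  # 1440
```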



How to automate anything with vROPS

In my opinion, the highest level of monitoring is automation: leveraging monitoring tools to automate your everyday troubleshooting, capacity, alerting, and monitoring needs. In theory, most people think automation means hands-off self-healing, where a problem fixes itself without any user interaction. Realistically, almost all monitoring use cases need final input from a stakeholder to decide the best next steps. Therefore, in most cases, automation in monitoring means automating the process as much as possible so that stakeholders get results quickly and can decide what to do next. In this guide I will walk you through how to automate almost any typical engineering and operations task using vROPS alone. Remember that automation takes time to perfect and repeated testing is key. In the end it will be worth it.

What are the best use cases to Automate?

  1. Troubleshoot any application or VM in minutes! 
  2. Do a complete Health Check of your environment with one click of a button
  3. Manage Capacity in minutes (Growth, inventory, optimization)
  4. Monitor your entire Infrastructure in one pane of glass (vCenter, VMs, Host, Datastores, vSAN, NSX-T, VDI, vIDM, vRNI, Log Insight, vROPS, vRA, etc)

Unrealistic as it sounds, customers who have done PSO engagements with me have already seen me prove that this can be done using vROPS alone. Feel free to reach out to your TAM to set up a free demo with me and see for yourself.

Ways you can automate using vROPS

1. Using Automation Central (Easy)

The Automation Central feature in vROPS allows you to automate tasks such as rebooting VMs, reclaiming resources, and rightsizing VMs on a recurring schedule you choose. Read the guide below on how to use this feature

http://www.vmignite.com/2021/05/vrops-8-4-how-to-use-automation-central-to-schedule-task/

2. Create an alert and attach an Action to it (Intermediate)

This answers the question: when this problem happens, how do I make it fix itself automatically? There are endless use cases you can automate using this technique. It is a three-step process. However, although I show my customers how easy it is to automate anything using this feature, none of them end up doing it. Why? Because real engineers would rather know the problem happened, get alerted on it, and fix the root cause so it doesn’t happen again, rather than mask the problem. For example, if vCenter keeps crashing, rather than automating a reboot every time, why not figure out what is causing the crash so it never happens again?

  • Step 1: Create the alert definition for the problem you want to act on.
  • Step 2: You will then need to attach a Recommendation with an Action to it. You can choose from the many predefined Actions available or import more using the vRO management pack, which provides many more automated Actions
  • Step 3: Next you will need to edit your default policy and enable the automation of this Action to take place each time the alert happens.  See below screenshot of how to Enable the Automation of the Action

3. Know something is healthy or unhealthy using custom properties (Easy)

Rather than checking numerous metrics for a VM, for example, management wants just one simple metric for Hosts, VMs, vCenter, and Datastores that says whether the object is healthy or unhealthy, based on criteria that you define and control. This can be done using the guide I have already written below. It automates as many checks as you want and rolls them up into one simple Good or Bad result. You can also use the same technique for tagging purposes and more.

http://www.vmignite.com/2019/04/vrops-7-5-using-custom-groups-to-create-a-custom-health-metric/
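
The roll-up idea can be sketched in a few lines of Python. This only illustrates the logic of combining many checks into one Good or Bad result; the metric names and thresholds are made up, and the real technique in the guide above uses custom groups inside vROPS.

```python
def health(metrics, checks):
    """Roll any number of checks up into one Good/Bad result.

    metrics: dict of metric name -> value (names here are hypothetical).
    checks:  dict of check name -> function that returns True when healthy.
    Returns the overall result and the list of failed check names.
    """
    failed = [name for name, passes in checks.items() if not passes(metrics)]
    return ("Good" if not failed else "Bad"), failed


checks = {
    "cpu": lambda m: m["cpu_usage_pct"] < 90,
    "memory": lambda m: m["mem_usage_pct"] < 90,
}

print(health({"cpu_usage_pct": 95, "mem_usage_pct": 40}, checks))
print(health({"cpu_usage_pct": 20, "mem_usage_pct": 40}, checks))
```

Returning the failed check names alongside the result keeps the single metric simple for management while still telling the engineer where to look.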

4. Using Super Metrics to create Infinite Metrics (Expert)

If you believe vROPS doesn’t have the metrics you want, or the out-of-the-box metrics are not precise enough for you, you can create a whole new one using a mix of formulas, if-then-else scenarios, and comparisons of different metrics and properties to build a brand-new custom metric or property. For example, you can create a super metric to show how many Windows 2012 servers you have in a vCenter, how many Hosts are down in a vCenter, etc. Below are some example super metrics that I created for customers

You can download more Super Metrics here: https://developer.vmware.com/samples?categories=Sample&keywords=&tags=vRealize%20Ops%20Super%20Metrics&groups=&filters=&sort=dateDesc&page=
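
For readers new to the idea, here is what a "count per vCenter" roll-up computes, written as a Python sketch. This is not super metric syntax and does not run inside vROPS; the host records and field names are hypothetical.

```python
from collections import Counter


def count_per_vcenter(objects, matches):
    """Count the objects that satisfy a condition, grouped by parent vCenter."""
    return Counter(obj["vcenter"] for obj in objects if matches(obj))


# Hypothetical inventory records for illustration.
hosts = [
    {"name": "esx01", "vcenter": "vc-a", "state": "down"},
    {"name": "esx02", "vcenter": "vc-a", "state": "up"},
    {"name": "esx03", "vcenter": "vc-b", "state": "down"},
    {"name": "esx04", "vcenter": "vc-a", "state": "down"},
]

hosts_down = count_per_vcenter(hosts, lambda h: h["state"] == "down")
print(dict(hosts_down))  # {'vc-a': 2, 'vc-b': 1}
```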

5. Creating a Dashboard to solve a Use case (Intermediate)

Yes, a dashboard is automation if it is created with a purpose. Don’t believe me? Check out the dashboards below that I shared to solve the scenarios I mentioned earlier. Anyone can create a dashboard, but it takes a lot of time and testing to create one that solves an entire use case. See the examples below to see what I mean. One of my dashboards can take weeks or months to perfect.



VM Appliance Monitoring and Inventory Dashboard updated

Two must-have dashboards are now updated and better than ever. The Appliance Dashboard is what I personally use to know what VMware products my customer has, how they are performing, how they are configured, and what their IPs are. The Inventory dashboard provides everything you need to know about your environment’s inventory, capacity, hardware, versions, performance, and more.

VM Appliance Monitoring dashboard now monitors more than ever before.

Download here: https://developer.vmware.com/samples?id=7599

User Guide: http://www.vmignite.com/2021/05/vrops-vmware-appliance-monitoring-dashboard/

Monitors the following products

  • vCenter Server Appliance
  • NSX, NSX-T
  • vRA
  • vROPS
  • Log Insight
  • Orchestrator
  • Life Cycle Manager
  • Network Insight (vRNI)
  • VMware SRM
  • vIDM
  • AirWatch
  • Cloud Proxy appliances
  • vSphere Replication (New)
  • HCX (New)
  • Kubernetes (New)
  • Tanzu (New)
  • Dell EMC Appliances (New)

You can now also view historical stats for important VM metrics.

Inventory dashboard

Download here: https://developer.vmware.com/samples?id=5629

User Guide: http://www.vmignite.com/2020/12/download-vrops-complete-360-inventory-dashboard/

Updates include formatting enhancements and additional metrics for VMs, Hosts, vCenter, Clusters, and more.




vROPS – How to monitor Services using no agents and no credentials

In this post I will show you how to monitor over 40 out-of-the-box services using no credentials and no agent. On top of that, I will show you how to create an alert when a service is down and how to create a custom group based on the VMs that run that service. Enabling this feature allows you to monitor Up/Down of critical services such as IIS, SQL, Apache, Active Directory, and more. It also provides other useful metrics such as ports, the number of outbound/inbound connections, and more.

Prerequisites

For best results, meet the following prerequisites:

  • vROPS 8.6 or above (8.1 is the minimum)
  • All VMs running VMware Tools 11.1.0 or above
  • vCenter 6.7 U3g or above
  • ESXi hosts on version 6.5 P05 or above
  • For full requirements read the following https://kb.vmware.com/s/article/78216

How to Enable Credential-Less Service Discovery

From the left menu, click Configure > Application Discovery > Configure Service Discovery option

Expand vCenter and Edit the vCenter

Click on Service Discovery Tab. Then Enable Service Discovery and check the box Enable Application Discovery. Click on Save when done

Expand the vCenter we just edited. You should now see a Service Discovery option below it. Edit the Service Discovery by clicking to the left of it and selecting Edit

Expand Advanced Settings and make sure Credential-less service discovery and application discovery are both Enabled. Then click on Save

Give it at least 30 minutes for it to do some scanning. Now go to Environment > Applications > Manage SDMP Services

Refresh the page. If successful, you would see a status of Credential-less under Authentication Status. If known services are detected, you can expand the VM to see what services are detected.

How to scan services using Credentials

For VMs that failed, make sure they meet the requirements mentioned in the KB article above. You can also try entering credentials manually by selecting the VMs that failed and choosing the Provide Password option

To provide credentials for multiple VMs, the easiest way is to provide them from the Service Discovery page.

How to view what Services are discovered

You can easily see what services are available and discovered by going to Configure > Application Discovery > Service Configuration and then seeing how many of each are discovered.

If you click on one of the services with VMs discovered, it will lead you to a page that lists all the VMs with that service.

How to add your own Service monitoring

There will be services that are not well known and won’t be included out of the box. However, in the following steps we will show you how to add your very own service to have it monitored.

Go to Configure > Application Discovery > Service Configuration

Click on Add Service and fill in the Process Name, Port, and Display Name and click on Save when done

You should now see your Custom Service under Custom Services

How to add additional Service Monitoring metrics

You can also enable service monitoring and collect important metrics such as the amount of CPU and memory consumed by the service plus how much disk I/O the service is generating. We can also collect the number of connections to and from the service, what ports that the service is listening on, version information, and more. See a full list in the link below

https://docs.vmware.com/en/vRealize-Operations/8.2/com.vmware.vcom.core.doc/GUID-3282DF19-194A-421C-B50F-A9AB5FB3D42B.html

To provide additional metrics go to Environment > Applications > Manage SDMP Services

This will bring up a list of VMs. Check all the VMs you want to apply Service Monitoring to and then select Enable Service Monitoring from the dropdown.

Service Monitoring checkbox should now be shown next to the VMs you selected.

Below is a view that I created that shows the metrics that are pulled once successful. Besides up/down, you can see Ports, Incoming/Outgoing Connections, Paths, etc.

How to create a custom Group based on Services

In the following example we will create a new custom Group for any VM that is running SQL. This group can be used for reporting and dashboarding purposes.

We will create a new custom group by going to Environment > Custom Groups > Add

Provide a name and group type. Make sure Keep Group Membership up to date is checked. This ensures that the group gets updated automatically.

Using the previous steps above, we know that our Service is called MS-SQL DB. Therefore select Virtual Machine as our object type and fill Relationship Descendent of is MS-SQL DB.

Click on the Preview button to make sure the results are correct and then click on Close when done. Click on OK to save the Group. It is now ready to be added to a dashboard or report.

How to create an alert when a specific Service is down

In this example we will create an alert when a SQL service is down. This will only work for VMs that have MS-SQL discovered.

To create the alert, go to Configure > Alerts > Alert Definitions

Click on Add to create a new alert definition

Configure the following

  • Provide a name and optional description
  • Choose Base Object Type: Services
  • Expand Advanced Settings and change Alert Type & Subtype to Application: Availability
  • Click on Next when done

Expand Properties and drag the property Status and Type until it shows up on the left side in the same box grouping as shown below.

Configure the following

  • Make sure All is selected on the top dropdown
  • Set the condition to Status Not equal to Up and mark as Critical.
  • Set Type Contains SQL
  • Click on Next when done
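
The two conditions above, combined with All, amount to the following logic. This is just a Python sketch of the decision the alert makes; the dictionary keys status and type are assumptions that mirror the Status and Type properties, not an actual API.

```python
def sql_service_alert(service: dict) -> bool:
    """Return True when the alert should fire for this service object."""
    conditions = [
        service["status"] != "Up",   # Status Not equal to Up
        "SQL" in service["type"],    # Type Contains SQL
    ]
    return all(conditions)           # "All" selected on the top dropdown


print(sql_service_alert({"type": "MS-SQL DB", "status": "Down"}))     # True
print(sql_service_alert({"type": "MS-SQL DB", "status": "Up"}))       # False
print(sql_service_alert({"type": "Apache HTTPD", "status": "Down"}))  # False
```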

Click on Next and Add an optional Recommendation. Then click on Next again

In the Policies section, make sure to enable it on all policies and click on Next

If you would like to forward this out to a third party such as Email, Service Now, and/or Ticketing System you can select it here (Note: this must be preconfigured).

Click on Create when done. You should now receive alerts whenever a SQL service is down.



vROPS – How to create an alert on any disk partition

Customers have told me stories of partitions filling up and causing applications to crash. Some examples are vCenter, vIDM, vROPS, etc. To solve this problem, in this post I will walk you through how to create an alert on any partition and how to filter it to a specific appliance such as vCenter Server. This way you will get an alert on the specific drive partition of the VM before it gets 100% full. Which drive partition to monitor varies from server to server, so this guide is written to help you understand the steps and logic it takes to create such an alert for any VM.

How to get an alert when a vCenter Server partition is out of space

In this guide I will show you how to monitor all vCenter partitions except for the archive partition, since it is OK for that partition to be full.

Make sure the Guest Free Metric % is enabled on all policies. The below link will walk you through this process

http://www.vmignite.com/2021/02/vrops-8-how-to-enable-hidden-metrics-and-properties/

Next you will need to know the name of a vCenter Server. Use my VMware Application Monitoring dashboard to easily locate the names and the product name which we will need to filter later.

To create the alert, go to Configure > Alerts > Alert Definitions

Click on Add to create a new alert definition

Configure the following

  • Provide a name and optional description
  • Choose Base Object Type: Virtual Machine
  • Expand Advanced Settings and change Alert Type & Subtype to Virtualization/Hypervisor: Capacity
  • Click on Next when done

To get only the partitions pertaining to our vCenter Server, we will need to click on Select Specific Object. This will allow us to select a vCenter Server VM.

Search for our VM which we noted earlier and select it and then click on Select

Now expand Metrics > Guest File System. Notice there are a lot of partitions for vCenter Server.

Expand the first one and drag the metric Partition Utilization % to the left side until it shows up.

Since we only want it to alert us when it is 90% full, we will type in 90 in the filter and make sure the greater than sign is chosen. We will also set the criticality to Critical

Repeat the process for each other partition you want monitored leaving out the archive drive. Make sure to drag it into the same box. It should be similar to below

The next step is extremely important. Change the "met when" setting from All to Any. Since we placed every condition in the same box, leaving it on All would require every partition to be over 90% before the alert fires, and we do not want that. We want the alert to fire if Any of the conditions apply.

Now we need one more condition so that the alert applies only to vCenter Server VMs and nothing else. We will add a filter that looks for the product name vCenter Server.

To do this, expand Properties > Summary > Configuration and drag Product Name to the second box on the bottom. Make sure it shows up in the 2nd box as shown in the picture below. Adding it to a 2nd box allows us to create a separate condition. Some VMs won’t have a product name; for those you can filter based on Property > Configuration > Name instead.

Change the dropdown from Equals to Contains and type in vCenter Server in the empty space given. This will eliminate all other VMs from this alert.

Now scroll all the way to the top and make sure All is selected on the very top dropdown. This will ensure conditions of both boxes occur for the alert to flare up. As in one of the 10+ partitions monitored must be over 90% full and the VM is a vCenter Server.
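
The Any/All combination is easy to get wrong, so here is the same logic as a Python sketch. It only models the decision; the dictionary layout, the partition names, and the function name are assumptions for illustration.

```python
THRESHOLD_PCT = 90.0


def vcenter_partition_alert(vm: dict, skip=("/storage/archive",)) -> bool:
    """Fire when ANY monitored partition is over 90% AND the VM is a vCenter Server."""
    monitored = {
        name: pct for name, pct in vm["partitions"].items() if name not in skip
    }
    # First box, set to Any: one full partition is enough.
    any_partition_full = any(pct > THRESHOLD_PCT for pct in monitored.values())
    # Second box: Product Name Contains vCenter Server.
    is_vcenter = "vCenter Server" in vm.get("product_name", "")
    # Top dropdown, set to All: both boxes must be true.
    return any_partition_full and is_vcenter


vcsa = {
    "product_name": "VMware vCenter Server",
    "partitions": {"/storage/log": 93.0, "/storage/db": 40.0, "/storage/archive": 99.0},
}
print(vcenter_partition_alert(vcsa))  # True, because /storage/log is over 90%
```

With All on the first box instead of Any, this example would not fire, since /storage/db is only 40% full.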

Click on Next and Add an optional Recommendation. Then click on Next again

In the Policies section, make sure to enable it on all policies and click on Next

If you would like to forward this out to a third party such as Email, Service Now, and/or Ticketing System you can select it here (Note: this must be preconfigured).

Click on Create when done. For more information on vCenter disk partitions, see the KB article below.

https://kb.vmware.com/s/article/76563

For those who don’t want to create the alert, you can download it below.

Alert-vCenter-Partitions: http://www.vmignite.com/download/2448/

How to create an alert for many partitions

In the link below, Brock Peterson has written a great blog on how to modify the out-of-the-box alert that alerts on all partitions so that it excludes the partitions you don’t want alerted on. He also explains how to monitor NFS mounts.

https://www.brockpeterson.com/post/monitoring-and-not-monitoring-windows-drives-and-linux-filesystems-in-vrops



vROPS – How to monitor and filter data based on Time (6 use cases covered!)

In this super post, I will uncover the vROPS techniques for time-based filtering and monitoring. I will show you how to filter and monitor based on 9am-5pm performance data, get average/max/min performance for a given time and filter on it, see performance data changes over time, see when the power settings last changed on a Host, and more. Before I begin, I would like to give a shout-out to the first two donors of VMignite: Joy and Antonio. Some exclusive dashboards were sent to them for the donation, and this post is dedicated to them. Thank you once again. For those who would like to donate to the site, please click on the donate tab in the menu. Doing so will encourage me to share more vROPS tips and tricks.

Use Cases that will be covered

  • How to monitor performance data based on time
  • How to monitor performance data based on 9am-5pm time settings
  • How to filter performance data based on time
  • How to view when the last time the power settings changed for a Host System.
  • How to view VM Creation date using a list view
  • How to view performance data changes by time

How to monitor performance data based on time

In this example I will get Host CPU usage average and maximum for a given week.

Create a List view and provide a name for the view and select Host System as your subject

Click on Data and expand CPU and double click on Usage (%) until it appears in the Data box twice

Select one of the CPU|Usage (%) metrics and change the Metric label to CPU Avg % (this changes the display name of the Data)

Change the transformation to Average. This changes the value from current CPU usage data to average for a given time.

Select the other CPU|Usage (%) and repeat the above steps and change the metric label to CPU Max % and Transformation to Maximum

Now we need to set the time on the data. Click on Time Settings and set the desired time frame. The default setting is 7 days, but you can change it to a month, 90 days, 6 months, etc.
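
If it helps to see what the Average and Maximum transformations compute over a time window, here is a small Python sketch. It only mimics the idea with made-up samples; it is not how vROPS calculates the values internally.

```python
from datetime import datetime, timedelta
from statistics import mean


def transform(samples, now, window=timedelta(days=7)):
    """Average and maximum of the samples that fall inside the time window.

    samples: list of (timestamp, value) pairs, e.g. Host CPU usage %.
    Returns None when no sample falls inside the window.
    """
    recent = [value for ts, value in samples if now - window <= ts <= now]
    if not recent:
        return None
    return {"avg": mean(recent), "max": max(recent)}


now = datetime(2024, 1, 8)
cpu_usage = [
    (now - timedelta(days=1), 10.0),
    (now - timedelta(days=2), 30.0),
    (now - timedelta(days=10), 95.0),  # outside the 7-day window, ignored
]
print(transform(cpu_usage, now))  # {'avg': 20.0, 'max': 30.0}
```

Widening the window to 30 days would pull in the 95% sample and change both numbers, which is why the Time Settings choice matters.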

How to monitor performance data based on 9am-5pm time settings

In the time settings of the view, click on Advanced and then click on the Business Hours check box

Change the time for your desired business hours. Below are example settings for Monday to Friday 9am-5pm with weekends disabled

Now go back to Data and enable it for which metrics you would like this to apply to. Save your view when you are done
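
Conceptually, the Business Hours option keeps only the samples taken inside your working hours. A minimal Python sketch of that filter, assuming Monday to Friday 9am-5pm and a function name of my own choosing:

```python
from datetime import datetime


def in_business_hours(ts, start_hour=9, end_hour=17, workdays=range(0, 5)):
    """True when ts falls on a workday between start_hour and end_hour.

    weekday() returns 0 for Monday, so range(0, 5) means Monday to Friday.
    The end hour is exclusive: 5:00pm itself is already outside.
    """
    return ts.weekday() in workdays and start_hour <= ts.hour < end_hour


samples = [
    (datetime(2024, 1, 8, 10, 0), 55.0),  # Monday 10am, kept
    (datetime(2024, 1, 8, 22, 0), 5.0),   # Monday 10pm, dropped
    (datetime(2024, 1, 6, 10, 0), 8.0),   # Saturday 10am, dropped
]
business = [value for ts, value in samples if in_business_hours(ts)]
print(business)  # [55.0]
```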

How to filter performance data based on time

Using the above example, we can filter the view to show only Hosts with a CPU Avg % greater than 20%. To do this, click on the Filter tab, select the CPU Usage % metric, change it from Current to Average, and then enter the desired filter of is greater than 20%. Previewing the data, we can now see that it lists only Hosts with a CPU Average % greater than 20%.

How to view when the last time the power settings changed for a Host System

In this guide we will find out the date and time when the Host power settings last changed and how long ago since it changed.

In the list view, change the dropdown to Properties and double click on Runtime > Power State metric till it shows up on the right side 3 times.

Select the 2nd Power State metric and change the Transformation to Timestamp and Absolute Timestamp. This will give us the date and time when the power state last changed.

Select the 3rd Power State metric and change the Transformation to Timestamp and Relative Timestamp. This will show us how long ago it last changed.

Previewing the data, we can see how this looks. Change the metric labels to whatever makes sense to you. Save the view once done.

How to view VM Creation date using a list view

Create a list view with the Subject of Virtual Machine. Add the Property > Configuration > Creation Date (ms) twice, then change the Transformation of one to Timestamp with Absolute Timestamp and the other to Timestamp with Relative Timestamp.
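
The difference between the two timestamp styles is easy to see in code. The sketch below assumes the property is a millisecond epoch value, as the (ms) suffix suggests; the function names and output format are my own and do not match the exact vROPS display.

```python
from datetime import datetime, timedelta, timezone


def absolute_timestamp(ms: int) -> datetime:
    """Convert a millisecond epoch value to a UTC date and time."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def relative_timestamp(ms: int, now: datetime) -> str:
    """Describe how long ago the millisecond epoch value was."""
    delta = now - absolute_timestamp(ms)
    return f"{delta.days} days, {delta.seconds // 3600} hours ago"


created = datetime(2023, 1, 1, tzinfo=timezone.utc)
creation_date_ms = int(created.timestamp() * 1000)
now = created + timedelta(days=3, hours=5)

print(absolute_timestamp(creation_date_ms).isoformat())  # 2023-01-01T00:00:00+00:00
print(relative_timestamp(creation_date_ms, now))         # 3 days, 5 hours ago
```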

You can read more about how time settings work below.

https://blogs.vmware.com/management/2019/04/vrealize-operations-7-5-more-and-more-powerful-views-reports.html

How to view performance data changes by time

Using the trick below, we can see how VM growth changed by month for a given environment. We can also use it to see, for example, how CPU, Memory, and Disk used % changed in the past month for VMs, Hosts, Clusters, and vCenter.

http://www.vmignite.com/2019/12/vrops-7-how-to-get-vm-growth-stats-by-any-given-time/





How to execute the best Monitoring Strategy

It is now the year 2022, and if your company is still taking hours or days to troubleshoot issues, is still more reactive than proactive, has no capacity awareness, and says monitoring tools are not helping, then this post is for you. Everything around us is getting smarter, getting better, and evolving every single day. Your phone can talk to you and give you advice, some cars can self-drive and self-park, and a watch can monitor your exercise routine and health.

An advanced monitoring tool such as vROPS has evolved through the years to do more than typical monitoring. It can make predictions about your environment, provide automation, provide costing and billing, and even provide self-healing capabilities. To skip to the bottom line, I have consulted for many Fortune 500 companies in almost every sector, including the public sector, and I can tell you these are some, if not all, of those companies’ monitoring challenges.

The real monitoring challenges that a company faces every day.

  • Everyone in IT, such as Engineering, IT Support, Helpdesk, and management, can benefit from a monitoring tool, yet they are all confused about where to start and they only use it when they absolutely need it (being reactive).
  • A lot of companies bought a monitoring tool, but barely anyone is using it (shelf-ware)
  • Every company I have been to is constantly adding VMs every month and doesn’t know how to track the growth. Some grow by up to 500 VMs a month on average, yet the number of IT employees doesn’t increase (overworked staff)
  • Some companies are using the alerting features for ticketing and email notifications, but the alerts get disregarded because too many unimportant alerts come through (mismanagement of the product)

What are the Solutions for these challenges?

Companies provide me with their monitoring use cases when I get there, but my bigger personal use cases are the ones I mention above. I want everyone in all IT departments to love and use vROPS to the highest degree … willingly. I teach them how to be proactive. I show them how easy it is to be fully aware of their capacity, how to identify the problems they have today, and how to fix things in minutes when things do break. And most importantly, I change their whole view and mindset when it comes to monitoring. Monitoring is 100% essential for every company out there. If done correctly, saying that monitoring will easily pay for itself is an understatement. Below is the blueprint for success that I have perfected through the years to achieve this.

Step 1: Fully Understand what a powerful Monitoring tool can do

A monitoring tool such as vROPS is so advanced that it can provide automation, compliance, cost and billing, and so much more. For a full list of what it can do, check out the link below. I update it as new features come out, so make sure to check back after every release.

http://www.vmignite.com/2020/03/15-features-that-makes-vrops-the-best-monitoring-tool-period/

Step 2: Learn how to maximize any monitoring tool

Easier said than done. Read my post below to understand today’s biggest use cases that you need to solve. Then implement the examples I’ve provided to solve those use cases.

http://www.vmignite.com/2020/08/how-to-maximize-all-monitoring-tools/

Step 3: Implement the right Alerting Strategy

Perfecting your alerting process is the key to a successful proactive environment. Yet almost all companies get this process wrong and then blame the product. Read my post on how to implement a successful alerting strategy.

http://www.vmignite.com/2021/06/vrops-alerting-dos-and-dont/

Step 4: Showcase the work to all departments once done

Those who have done engagements with me know that I love to showcase my work to all IT staff (operations, engineers, management, directors). If you did your job right, a 30-minute demo is all it takes to get them to use vROPS for the first time willingly. Yes, I said willingly. If you show them something that wows them and is a no-brainer in how it will make their lives easier, they will log in for the first time while they are on the call. I know this because I often get interrupted during the demo because they have access issues (a sign that they are logging in for the very first time).

Step 5: Learn and read more about Metrics and Operation Monitoring

Iwan has shared his vROPs book on a website. If you have any questions on metrics and want to learn more about monitoring in general, visit the link below.

https://www.vmwareopsguide.com/

For quick links on vROPS in general (architecture, management packs, ports, patches, etc.), visit the link below.

http://www.vmignite.com/2020/01/vrops-answering-all-the-major-vrops-questions-best-vrops-references/



vROPS 8.6 – How to create Alert Definitions using conditions

With the release of vROPS 8.6, creating alert definitions is easier than ever before. With the new addition of being able to configure “Conditions” into the Alert itself, there is not much of a need to create Symptom definitions anymore. One thing to note is that you can still combine conditions and symptoms into the same alert definition. In this guide I will create a sample alert definition using this new feature.

The sample alert will fire for any VM that has degraded performance for CPU, Memory, or Disk. If you want to learn more about my recommendations for alerting, read my alerting dos and don’ts post.

Go to Configure > Alerts and click on Alert Definitions to view all the alerts

Click on Add to create a new Alert

Configure the alert and click on Next when done

  • Name: Provide a name for the alert
  • Description: Optionally provide a description
  • Base Object Type: Choose your base object. In this case I want to create alerts based on virtual machines, so I chose Virtual Machine as my base object
  • Criticality: You can choose Critical, Immediate, Warning, etc. I chose Symptom Based, which takes the highest criticality of the child symptoms and conditions.
  • Alert Type and Subtype: Make sure to pick a category that makes sense for the alert itself. This will make it easier to filter alerts later for notifications and for dashboard purposes.
  • Wait Cycle: One wait cycle is 5 minutes, which in my opinion is not enough time to catch realistic scenarios. For example, a VM can occasionally spike in disk, CPU, or memory utilization, and that is OK if it lasts less than 5 minutes. However, if it continues to spike for over 30 minutes, then alert me. That is why I changed the Wait Cycle to 6, which equals 30 minutes (6 x 5 minutes = 30 minutes).
  • Cancel Cycle: I left this at the default of 1 so that the alert is canceled as soon as the triggering conditions are resolved.
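The wait cycle arithmetic above can be sketched in a few lines of Python. This is only a helper for working out the number to type into the Wait Cycle field. It assumes the 5-minute cycle mentioned above, and the function names are hypothetical, not part of any vROPS API.

```python
CYCLE_MINUTES = 5  # one cycle, as stated in the Wait Cycle bullet above


def cycles_to_minutes(cycles: int, cycle_minutes: int = CYCLE_MINUTES) -> int:
    """How long a condition must persist before the alert fires."""
    return cycles * cycle_minutes


def minutes_to_cycles(minutes: int, cycle_minutes: int = CYCLE_MINUTES) -> int:
    """Smallest number of cycles that covers the requested duration (rounds up)."""
    return -(-minutes // cycle_minutes)


print(cycles_to_minutes(6))   # Wait Cycle of 6 -> minutes of sustained breach
print(minutes_to_cycles(30))  # 30 minutes of tolerance -> Wait Cycle value
print(cycles_to_minutes(1))   # default Cancel Cycle of 1 -> minutes until cancel
```

If your collection interval is different from 5 minutes, pass it as the second argument and the same math applies.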

Make sure Conditions is selected and add in the metrics you want to apply a condition to. Just take the metric and drag it to the left as shown below. In this example I took 5 metrics that will trigger my alert, and I set the threshold and criticality for each accordingly. Make sure to place them all in the same box as shown.

Because all the conditions are in the same box, I can set the filter for these conditions in that same box. If I choose All, then all 5 custom conditions have to be met (which most likely will never happen) for this alert to trigger. This is not what I want; I want the alert to trigger if any of the 5 conditions occur. Therefore, I will change the filter to Any.

Now let’s say you don’t want this alert to fire on certain vCenters, VMs, clusters, etc. In this example, I don’t want it to apply to any vCenter that has the word “test” in its name. I will add the property Parent vCenter into a second box and fill in the condition to exclude any vCenter that has the word “test” in it. Very important: I will change the filter on top of both boxes to All, meaning every box must be active for the alert to trigger.

So, let me explain. Since the first box has 5 conditions and is set to Any, only one of the 5 conditions must occur for that box to be active. The second box has only one condition, but it must always be met because I set the top filter to All. If I change the top filter to Any, only one of the boxes needs to be active for the alert to trigger.
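If the Any/All nesting is hard to picture, here is a rough Python model of it. This is my own simplification and not how vROPS evaluates alerts internally. The metric keys, thresholds, and vCenter names are made up for illustration, and I only model 3 of the 5 conditions to keep it short.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Condition = Callable[[Dict[str, Any]], bool]


@dataclass
class Box:
    match: str                   # "any" or "all", the filter inside the box
    conditions: List[Condition]

    def is_active(self, vm: Dict[str, Any]) -> bool:
        results = [cond(vm) for cond in self.conditions]
        return any(results) if self.match == "any" else all(results)


def alert_triggers(vm: Dict[str, Any], boxes: List[Box], top_match: str = "all") -> bool:
    """Combine the boxes using the filter on top of them."""
    states = [box.is_active(vm) for box in boxes]
    return any(states) if top_match == "any" else all(states)


# First box: performance conditions, set to Any (hypothetical keys and thresholds)
performance = Box("any", [
    lambda vm: vm["cpu_usage_pct"] > 90,
    lambda vm: vm["mem_usage_pct"] > 90,
    lambda vm: vm["disk_latency_ms"] > 30,
])
# Second box: exclude any vCenter with "test" in its name
not_test = Box("all", [lambda vm: "test" not in vm["parent_vcenter"].lower()])

prod_vm = {"cpu_usage_pct": 40, "mem_usage_pct": 50,
           "disk_latency_ms": 45, "parent_vcenter": "prod-vc01"}
test_vm = dict(prod_vm, parent_vcenter="Test-vc02")

print(alert_triggers(prod_vm, [performance, not_test]))         # top filter All
print(alert_triggers(test_vm, [performance, not_test]))         # top filter All
print(alert_triggers(test_vm, [performance, not_test], "any"))  # top filter Any
```

With the top filter on All, the VM on the test vCenter is skipped even though its disk latency is high. Switching the top filter to Any would bring it back, which is exactly why the top filter has to stay on All for the exclusion to work.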

Click on Next and it will bring us to Recommendations where you can create or add one or many recommendations as needed.

Click on Next and it will bring us to the Policies page. Make sure to check all the policies shown here to activate the alert.

Click on Next and optionally choose a notification method if you want the alert emailed out or forwarded to a 3rd party system such as a ticketing system. Save the alert for it to apply.

Since we set a 30-minute wait time, wait about 45 minutes and then go to Troubleshoot > Alerts to see if any alerts were triggered.

If you click on each alert, it will show you what triggered it. As you can see in the example below, our VM has high disk latency and it is on a vCenter that doesn’t have the word “test” in its name.

If you click on the dropdown, you can view the details and history of that metric.