Collecting additional metrics using Guest OS diagnostics


Microsoft Azure provides a set of standard host-level metrics. Guest OS metrics are not available by default. You can use guest-level monitoring to collect metrics from your guest virtual machines. These metrics are useful for investigating capacity-related issues that might occur in your Azure environment. When you run the Microsoft Azure API ETL, these metrics are imported into the TrueSight Capacity Optimization database.

  1. Enable the guest-level monitoring for Azure virtual machines
    • For new virtual machines

      1. Log in to the Azure portal.
      2. In the left pane, select Virtual machines.
      3. Click Add to create a virtual machine.
      4. On the Management tab, enable the OS guest diagnostics option.
      5. Add the required information to create the virtual machine and click Review + create.
        The virtual machine is created with the virtual machine agent installed in the guest OS.
        azure_etl_enabling_guest_os_diagnostics_new_vm.jpg
    • For existing virtual machines

      1. Log in to the Azure portal.
      2. In the left pane, select Virtual machines.
        A list of your virtual machines is displayed.
      3. Select the virtual machine for which you want to enable guest-level monitoring.
      4. In the left pane of the virtual machine, in the Monitoring category, click Diagnostic settings.
      5. On the Diagnostic settings page, click Enable guest-level monitoring.
        The Azure diagnostics agent is installed on the virtual machine and the metrics are displayed in the Overview tab.
        azure_etl_enabling_guest_os_diagnostics.jpg
  2. Configure and run the ETL. For more information, see Microsoft Azure - Azure API Extractor.
  3. Verify that the metrics are displayed in the Workspace.

Important

Support for 15-minute data resolution is available only after you apply Cumulative Hotfix 7 or later on Patch 2 (20.02.02) of TrueSight Capacity Optimization 20.02.

To enable data collection at a 15-minute resolution for Linux virtual machines

After you enable the guest-level monitoring for a new or an existing virtual machine, perform the following steps:

Important

  • Perform this process manually for each virtual machine.
  • The process described in this topic does not support bulk change.
  1. Log in to the Azure portal.
  2. Select Monitoring > Diagnostic settings.
    The Diagnostic settings page is displayed.

Diagnostic settings_Original.png

  3. On the Metrics tab, in the Metric Aggregation Intervals field, enter PT15M, as shown in the following image:
    Metrics aggregation interval.png
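
The value PT15M is an ISO 8601 duration that means 15 minutes. The same notation appears in the Windows configuration later in this topic (PT1H, PT1M, PT15S). As a rough illustration only, the following shell sketch converts such a value to seconds so that you can sanity-check it before you enter it. The function name iso_duration_to_seconds is hypothetical and is not part of Azure or TrueSight Capacity Optimization, and the sketch handles only the simple PT<n>H<n>M<n>S form.

```shell
#!/bin/sh
# Hypothetical helper: convert a simple ISO 8601 time duration to seconds.
# Supports only PT<n>H<n>M<n>S with whole numbers that have no leading zeros.
iso_duration_to_seconds() (
  duration="$1"
  case "$duration" in
    PT?*) rest="${duration#PT}" ;;
    *) echo "unsupported duration: $duration" >&2; exit 1 ;;
  esac
  total=0
  # Consume the hour, minute, and second components in order.
  for unit in H:3600 M:60 S:1; do
    letter="${unit%%:*}"
    factor="${unit##*:}"
    case "$rest" in
      *"$letter"*)
        value="${rest%%"$letter"*}"
        rest="${rest#*"$letter"}"
        case "$value" in
          '' | *[!0-9]*) echo "unsupported duration: $duration" >&2; exit 1 ;;
        esac
        total=$(( $total + $value * $factor ))
        ;;
    esac
  done
  # Anything left over means the value was not in the supported form.
  if [ -n "$rest" ]; then
    echo "unsupported duration: $duration" >&2
    exit 1
  fi
  echo "$total"
)
```

For example, iso_duration_to_seconds PT15M prints 900, which corresponds to the 15-minute resolution described above.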

Performance metrics for Linux virtual machines

The following table lists the metrics that are available when the guest-level monitoring option is enabled in Azure for Linux virtual machines:

To enable data collection at a 15-minute resolution for Windows virtual machines

After you enable the guest-level monitoring for a new or an existing virtual machine, perform the following steps in Azure Cloud Shell:

Important

  • Perform this process manually for each virtual machine.
  • The process described in this topic does not support bulk change.


  1. Set the following variables:

    my_resource_group={Resource group name containing your Windows VM and the storage account}
    my_windows_vm={Your Azure Windows VM name}
    my_diagnostic_storage_account={Your Azure storage account for storing VM diagnostic data}
  2. Run the following command to get the resource ID of the virtual machine:

    my_vm_resource_id=$(az vm show -g $my_resource_group -n $my_windows_vm --query "id" -o tsv)
  3. Run the following command to get the default configurations of the diagnostic settings:

    az vm diagnostics get-default-config --is-windows-os \
       | sed "s#__DIAGNOSTIC_STORAGE_ACCOUNT__#$my_diagnostic_storage_account#g" \
       | sed "s#__VM_OR_VMSS_RESOURCE_ID__#$my_vm_resource_id#g"

    The output contains two JSON blocks: the first for the protected settings and the second for the default configuration, as shown in the following example. Make sure that the values in the StorageAccount and resourceId fields match the values that you set in steps 1 and 2.

    {
     "storageAccountName": "__STORAGE_ACCOUNT_NAME__",
     "storageAccountSasToken": "__SAS_TOKEN_WITH_LEADING_QUESTION_MARK__"
    }
    {
     "StorageAccount": "testrgguestdiag955",
     "WadCfg": {
       "DiagnosticMonitorConfiguration": {
         "DiagnosticInfrastructureLogs": {
           "scheduledTransferLogLevelFilter": "Error",
           "scheduledTransferPeriod": "PT1M"
         },
         "Directories": {
           "scheduledTransferPeriod": "PT1M"
         },
         "Metrics": {
           "MetricAggregation": [
             {
               "scheduledTransferPeriod": "PT1H"
             },
             {
               "scheduledTransferPeriod": "PT1M"
             }
            ],
           "resourceId": "/subscriptions/a91a22fa-bc7f-4db9-9e98-bd8e2b4128c8/resourceGroups/testrg/providers/Microsoft.Compute/virtualMachines/MyVM"
         },
         "PerformanceCounters": {
           "PerformanceCounterConfiguration": [
             {
               "annotation": [
                 {
                   "displayName": "CPU utilization",
                   "locale": "en-us"
                 }
                ],
               "counterSpecifier": "\\Processor(_Total)\\% Processor Time",
               "sampleRate": "PT15S",
               "unit": "Percent"
             },
             {
               "annotation": [
                 {
                   "displayName": "CPU privileged time",
                   "locale": "en-us"
                 }
                ],
               "counterSpecifier": "\\Processor(_Total)\\% Privileged Time",
               "sampleRate": "PT15S",
               "unit": "Percent"
             },
             {
               "annotation": [
                 {
                   "displayName": "CPU user time",
                   "locale": "en-us"
                 }
                ],
               "counterSpecifier": "\\Processor(_Total)\\% User Time",
               "sampleRate": "PT15S",
               "unit": "Percent"
             },
             {
               "annotation": [
                 {
                   "displayName": "CPU frequency",
                   "locale": "en-us"
                 }
                ],
               "counterSpecifier": "\\Processor Information(_Total)\\Processor Frequency",
               "sampleRate": "PT15S",
               "unit": "Count"
             },
             {
               "annotation": [
                 {
                   "displayName": "Processes",
                   "locale": "en-us"
                 }
                ],
               "counterSpecifier": "\\System\\Processes",
               "sampleRate": "PT15S",
               "unit": "Count"
             },
             {
               "annotation": [
                 {
                   "displayName": "Threads",
                   "locale": "en-us"
                 }
                ],
               "counterSpecifier": "\\Process(_Total)\\Thread Count",
               "sampleRate": "PT15S",
               "unit": "Count"
             },
             {
               "annotation": [
                 {
                   "displayName": "Handles",
                   "locale": "en-us"
                 }
                ],
               "counterSpecifier": "\\Process(_Total)\\Handle Count",
               "sampleRate": "PT15S",
               "unit": "Count"
             },
             {
               "annotation": [
                 {
                   "displayName": "Memory usage",
                   "locale": "en-us"
                 }
                ],
               "counterSpecifier": "\\Memory\\% Committed Bytes In Use",
               "sampleRate": "PT15S",
               "unit": "Percent"
             },
             {
               "annotation": [
                 {
                   "displayName": "Memory available",
                   "locale": "en-us"
                 }
                ],
               "counterSpecifier": "\\Memory\\Available Bytes",
               "sampleRate": "PT15S",
               "unit": "Bytes"
             },
             {
               "annotation": [
                 {
                   "displayName": "Memory committed",
                   "locale": "en-us"
                 }
                ],
               "counterSpecifier": "\\Memory\\Committed Bytes",
               "sampleRate": "PT15S",
               "unit": "Bytes"
             },
             {
               "annotation": [
                 {
                   "displayName": "Memory commit limit",
                   "locale": "en-us"
                 }
                ],
               "counterSpecifier": "\\Memory\\Commit Limit",
               "sampleRate": "PT15S",
               "unit": "Bytes"
             },
             {
               "annotation": [
                 {
                   "displayName": "Disk active time",
                   "locale": "en-us"
                 }
                ],
               "counterSpecifier": "\\PhysicalDisk(_Total)\\% Disk Time",
               "sampleRate": "PT15S",
               "unit": "Percent"
             },
             {
               "annotation": [
                 {
                   "displayName": "Disk active read time",
                   "locale": "en-us"
                 }
                ],
               "counterSpecifier": "\\PhysicalDisk(_Total)\\% Disk Read Time",
               "sampleRate": "PT15S",
               "unit": "Percent"
             },
             {
               "annotation": [
                 {
                   "displayName": "Disk active write time",
                   "locale": "en-us"
                 }
                ],
               "counterSpecifier": "\\PhysicalDisk(_Total)\\% Disk Write Time",
               "sampleRate": "PT15S",
               "unit": "Percent"
             },
             {
               "annotation": [
                 {
                   "displayName": "Disk operations",
                   "locale": "en-us"
                 }
                ],
               "counterSpecifier": "\\PhysicalDisk(_Total)\\Disk Transfers/sec",
               "sampleRate": "PT15S",
               "unit": "CountPerSecond"
             },
             {
               "annotation": [
                 {
                   "displayName": "Disk read operations",
                   "locale": "en-us"
                 }
                ],
               "counterSpecifier": "\\PhysicalDisk(_Total)\\Disk Reads/sec",
               "sampleRate": "PT15S",
               "unit": "CountPerSecond"
             },
             {
               "annotation": [
                 {
                   "displayName": "Disk write operations",
                   "locale": "en-us"
                 }
                ],
               "counterSpecifier": "\\PhysicalDisk(_Total)\\Disk Writes/sec",
               "sampleRate": "PT15S",
               "unit": "CountPerSecond"
             },
             {
               "annotation": [
                 {
                   "displayName": "Disk speed",
                   "locale": "en-us"
                 }
                ],
               "counterSpecifier": "\\PhysicalDisk(_Total)\\Disk Bytes/sec",
               "sampleRate": "PT15S",
               "unit": "BytesPerSecond"
             },
             {
               "annotation": [
                 {
                   "displayName": "Disk read speed",
                   "locale": "en-us"
                 }
                ],
               "counterSpecifier": "\\PhysicalDisk(_Total)\\Disk Read Bytes/sec",
               "sampleRate": "PT15S",
               "unit": "BytesPerSecond"
             },
             {
               "annotation": [
                 {
                   "displayName": "Disk write speed",
                   "locale": "en-us"
                 }
                ],
               "counterSpecifier": "\\PhysicalDisk(_Total)\\Disk Write Bytes/sec",
               "sampleRate": "PT15S",
               "unit": "BytesPerSecond"
             },
             {
               "annotation": [
                 {
                   "displayName": "Disk free space (percentage)",
                   "locale": "en-us"
                 }
                ],
               "counterSpecifier": "\\LogicalDisk(_Total)\\% Free Space",
               "sampleRate": "PT15S",
               "unit": "Percent"
             }
            ],
           "scheduledTransferPeriod": "PT1M"
         },
         "WindowsEventLog": {
           "DataSource": [
             {
               "name": "Application!*[System[(Level=1 or Level=2)]]"
             },
             {
               "name": "System!*[System[(Level=1 or Level=2)]]"
             }
            ],
           "scheduledTransferPeriod": "PT1M"
         },
         "overallQuotaInMB": 4096
       }
     }
    }
  4. From the output, copy the second code block for default configurations and paste it into a JSON file, for example, default_config.json.
  5. In the JSON file, add a 15-minute (PT15M) entry to the MetricAggregation array, as shown in the following example:

    "MetricAggregation": [
             {
               "scheduledTransferPeriod": "PT1H"
             },
             {
               "scheduledTransferPeriod": "PT15M"
             },
             {
               "scheduledTransferPeriod": "PT1M"
             }
    ]
  6. Run the following commands to acquire a storage SAS token and set the values of the protected settings:

    storage_sastoken=$(az storage account generate-sas \
       --account-name $my_diagnostic_storage_account --expiry 2037-12-31T23:59:00Z \
       --permissions acuw --resource-types co --services bt --https-only --output tsv)

    protected_settings="{'storageAccountName': '$my_diagnostic_storage_account', \
        'storageAccountSasToken': '$storage_sastoken'}"

    Important

    • The storage_sastoken variable holds the storage SAS token that is used for authorization.
    • The protected_settings variable holds the protected settings, built from the storage account name and the SAS token.
  7. Run the following command to apply the changes:

    az vm diagnostics set --settings default_config.json \
       --protected-settings "$protected_settings" \
       --resource-group $my_resource_group --vm-name $my_windows_vm
  8. To verify the changes, on your Windows virtual machine page, navigate to the Metrics tab, and perform the following steps:
    1. In Metric Namespace, select Guest (classic).
    2. In Time granularity, select 15 minutes.

Azure_diagnostic_Win_result.png
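
Steps 4, 5, and 7 in the preceding procedure involve editing default_config.json by hand and then applying it. The following shell sketch shows one possible way to script the edit and to check the file before you run az vm diagnostics set. It is an illustration under stated assumptions, not part of the documented procedure: the function names add_15m_aggregation and check_diag_config are hypothetical, add_15m_aggregation assumes that the jq tool is installed, and it appends the PT15M entry at the end of the MetricAggregation array on the assumption that the order of entries is not significant.

```shell
#!/bin/sh
# Hypothetical helper: add a PT15M entry to MetricAggregation if it is missing.
# Assumes jq is installed and the file has the structure shown in step 3.
add_15m_aggregation() (
  config_file="$1"
  tmp_file="$(mktemp)" || exit 1
  if jq '.WadCfg.DiagnosticMonitorConfiguration.Metrics.MetricAggregation |=
           (if any(.[]; .scheduledTransferPeriod == "PT15M") then .
            else . + [{"scheduledTransferPeriod": "PT15M"}] end)' \
        "$config_file" > "$tmp_file"; then
    # Overwrite in place so that the original file permissions are kept.
    cat "$tmp_file" > "$config_file"
    rm -f "$tmp_file"
  else
    rm -f "$tmp_file"
    exit 1
  fi
)

# Hypothetical pre-flight check before running "az vm diagnostics set".
# It only searches the text of the file; it does not validate the JSON.
check_diag_config() (
  config_file="$1"
  # The placeholders from step 3 must have been replaced.
  if grep -q -e '__DIAGNOSTIC_STORAGE_ACCOUNT__' \
             -e '__VM_OR_VMSS_RESOURCE_ID__' "$config_file"; then
    echo "placeholders were not replaced in $config_file" >&2
    exit 1
  fi
  # A 15-minute entry must be present somewhere in the file.
  if ! grep -q '"scheduledTransferPeriod": *"PT15M"' "$config_file"; then
    echo "no PT15M entry found in $config_file" >&2
    exit 1
  fi
  echo "ok"
)
```

A possible usage is to run add_15m_aggregation default_config.json after step 4 and check_diag_config default_config.json just before step 7. Running the first function twice does not add a duplicate entry.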


Performance metrics for Windows virtual machines

The following tables list the metrics that are available when the guest-level monitoring option is enabled in Azure for Windows virtual machines:

1 - Indicates that the metrics are available in TrueSight Capacity Optimization only when the following steps are performed on your Windows virtual machines.

Steps to create the template


    1. Log in to the Azure portal.
    2. In the left pane, select Virtual machines.
      A list of your virtual machines is displayed.
    3. Select the virtual machine for which you want to get the missing additional metrics. 
    4. In the left pane of the virtual machine, in the Monitoring category, click Diagnostic settings.
    5. On the Diagnostic settings page, click the Performance counters tab and then select the Custom option.
      A list of performance counters appears.
    6. Enter the name of the performance counter, specify the unit, and click Add.
      Repeat this step for each performance counter and then click Save.
    7. From the list of Azure services, click Resource groups.
    8. In the left pane, from the list of resource groups, select the resource group in which your virtual machine resides.
    9. In the resource group settings, select Deployments. Select the latest deployment from the list. 
    10. From the Overview page, in the Deployment details expander, ensure that the name of the virtual machine is correct. This VM must be the VM for which you have added the performance counters.
    11. In the left pane, select Template > Add to library (preview).
    12. Add a name and description for the template, and save it. 

Steps to deploy the template on a virtual machine


    1. Log in to the Azure portal.
    2. From the list of Azure services, select Templates, and edit the template that you have created in the earlier section. 
    3. On the Edit Template page, click Next: ARM Template, update the following fields, and save the template:
      • "name": "<vm_name>/Microsoft.Insights.VMDiagnosticsSettings"
      • "resourceId": "/subscriptions/<subscription_id>/resourceGroups/resourcegroup/providers/Microsoft.Compute/virtualMachines/<vm_name>"
    4. From the Template preview page, click Deploy. The Custom deployment page appears.
    5. Maintain the default fields, choose the Resource group, agree to the terms and conditions, and click Purchase. Note that the Purchase 
      The deployment to the resource group begins. The metrics are available on the virtual machines after the deployment is complete.
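
The resourceId value that you enter in the template follows the same pattern as the resource ID shown in the sample output earlier in this topic. As a minimal sketch, the following shell function assembles that string from its three parts so that you can paste it into the template without typing it by hand. The function name vm_resource_id and its argument order are hypothetical and are used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: build the resource ID of a virtual machine.
# Arguments: subscription ID, resource group name, virtual machine name.
vm_resource_id() {
  if [ "$#" -ne 3 ] || [ -z "$1" ] || [ -z "$2" ] || [ -z "$3" ]; then
    echo "usage: vm_resource_id <subscription_id> <resource_group> <vm_name>" >&2
    return 1
  fi
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Compute/virtualMachines/%s\n' \
    "$1" "$2" "$3"
}
```

For example, calling the function with the subscription ID, resource group (testrg), and virtual machine name (MyVM) from the sample output reproduces the resourceId string shown there.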

2 - Indicates metrics that are set at level 4. These metrics are imported only when the collection level of the ETL is set to ‘Extended’.

 
