VMware vCenter Extractor Service


Use the VMware vCenter Extractor Service ETL to continuously collect the configuration and performance metrics of vCenter servers from your VMware environment. 

When your VMware environment is large, configure multiple ETLs and use the import filter option in the Advanced configuration to split the vCenter entities between them. To determine the number of ETLs that are required to collect data from your VMware environment, see the following sizing guidelines:


Sizing considerations for the VMware vCenter ETL

With BMC Helix Continuous Optimization, you can manage very large vSphere environments containing multiple vCenters.

The vCenter Extractor Service is a "service connector". When sizing the ETL Engines, follow the guidelines for service connectors. If you have a very large vCenter, you might have to deploy multiple extractors against it. 

Specifically for the vCenter Extractor Service, the main drivers for sizing are:

  • Number of VMs
  • Number of managed clusters

Use the following rules:

  • No more than 2,000 VMs per ETL
  • Scheduler heap: 2 GB for the first 2,000 VMs, and 1 GB for each additional 2,000 VMs
  • Data storage on the ETL Engine computer: 10 GB per 2,000 VMs

For heavily populated clusters, the 2,000 VMs in the earlier rules can be equated to a single cluster. If you have many small clusters, note that the number of clusters affects the number of threads used by the vCenter Extractor Service. To ensure timely polling of data, allow enough CPU resources and limit the number of clusters for each vCenter extractor.
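
As a rough illustration of the sizing rules above, the following Python sketch estimates the number of ETLs, the scheduler heap, and the data storage for a given number of VMs. The function name estimate_sizing is hypothetical, and rounding a partially filled block of 2,000 VMs up to a full block is an assumption made for this example.

```python
import math

VMS_PER_ETL = 2000  # rule: no more than 2,000 VMs per ETL


def estimate_sizing(vm_count: int) -> dict:
    """Estimate ETL count, scheduler heap (GB), and storage (GB) for vm_count VMs.

    Assumption: a partially filled block of 2,000 VMs is rounded up.
    """
    if vm_count <= 0:
        raise ValueError("vm_count must be a positive integer")
    blocks = math.ceil(vm_count / VMS_PER_ETL)
    return {
        # One ETL per block of up to 2,000 VMs.
        "etl_count": blocks,
        # 2 GB for the first block, 1 GB for each additional block.
        "scheduler_heap_gb": 2 + (blocks - 1),
        # 10 GB of data storage per block.
        "storage_gb": 10 * blocks,
    }


if __name__ == "__main__":
    print(estimate_sizing(5000))
```

Under these assumptions, an environment with 5,000 VMs needs three ETLs, 4 GB of scheduler heap, and 30 GB of data storage. Treat the result as a starting point and adjust it for the number of clusters in your environment.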

The VMware - vCenter Extractor uses TLS versions 1.1 and 1.2 to connect to the vCenter Server.

Collecting data by using the VMware vCenter Extractor Service ETL

To collect data by using the VMware vCenter Extractor Service ETL, do the following tasks:

I. Complete the preconfiguration tasks.

II. Configure the ETL.

III. Run the ETL.

IV. Verify data collection.

Step I. Complete the preconfiguration tasks

Before you configure and run the ETL to collect data from your VMware environment, ensure that the following preconfiguration tasks are completed:

  • A user account with the read-only role is created to access the vCenter Servers.
  • For environments with a firewall, the firewall access is configured to enable communication between the VMware vCenter and the ETL Engine server.
  • The ETL Engine server has access to the URLs of the web services that are exposed by the vCenter servers.
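
Before you configure the ETL, you might want to confirm from the ETL Engine server that the vCenter web service is reachable. The following Python sketch is one possible preflight check and is not part of the product. The function names sdk_url and can_connect, port 443, and the requirement that the certificate is trusted by the local trust store are all assumptions.

```python
import socket
import ssl


def sdk_url(host_address: str) -> str:
    """Build the vCenter web service URL in the https://<host_address>/sdk format."""
    return f"https://{host_address}/sdk"


def can_connect(host_address: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a handshake using TLS 1.2 or later succeeds.

    Assumptions: the service listens on port 443 and presents a certificate
    that the local trust store accepts; a self-signed certificate returns False.
    """
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    try:
        with socket.create_connection((host_address, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host_address):
                return True
    except (OSError, ssl.SSLError):
        # Covers refused connections, timeouts, and failed handshakes.
        return False


if __name__ == "__main__":
    host = "192.0.2.10"  # placeholder address; replace with your vCenter host
    print(sdk_url(host), can_connect(host))
```

A False result does not identify the cause. Check the firewall rules, the host address, and the certificate before you conclude that the vCenter Server is unavailable.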

Step II. Configure the ETL

You must configure the ETL to connect to VMware for data collection. ETL configuration includes specifying the basic and optional advanced properties. While configuring the basic properties is sufficient, you can optionally configure the advanced properties for additional customization.

A. Configuring the basic properties

Some of the basic properties display default values. You can modify these values if required.

To configure the basic properties:

  1. Navigate to Administration > ETL & System Tasks, and select ETL tasks.
  2. On the ETL tasks page, click Add > Add ETL. The Add ETL page displays the configuration properties. You must configure properties in the following tabs: Run configuration, Entity catalog, and VMware ETL configuration.

    vmware_etl_config_page.png
  3. On the Run configuration tab, configure the following properties:
    1. From the ETL module list, select VMware vCenter Extractor Service. The name of the ETL is displayed in the ETL Task name field. You can edit this field to customize the name.
    2. From the Data type list, select one of the following metric levels:
      • Metrics at Cluster, Resource Pool, Host, Datastore and Virtual Machine level: This is the default selection. Use this metric level to collect data of your entire VMware infrastructure.
      • Metrics at Cluster, Resource Pool, Host, Datastore level: Select this metric level when you do not want to collect data of virtual machines, for example, when you want to support only the Capacity-Aware Placement Advice service. This option saves disk space, reduces I/O load, and lets you manage large VMware environments with a modestly sized BMC Helix Continuous Optimization implementation.
  4. Click the Entity catalog tab, and select one of the following options:
    • Shared Entity Catalog

      Select if other ETLs access the same entities that are used by this ETL.

      • From the Sharing with Entity Catalog list, select an entity catalog name that is shared between ETLs.
    • Private Entity Catalog: Select if you want to use this ETL independently.
      If you are collecting business services data, we recommend that you select Shared Entity Catalog to avoid duplication of entities.
  5. Click the VMware ETL configuration tab, and specify the following details:
    • The web service URL of the vCenter Server in the following format:
      https://<host_address>/sdk
      Where <host_address> is the IP address of the server that hosts the vCenter Server.
    • The name of the user and password (if required) to connect to the vCenter Server
    • Specify whether you want to use the AS time zone
    • Specify whether you want to import the cluster failover threshold metrics
  6. (Optional) Override the default values of properties in the following tabs:

    *Scheduler compatible with this ETL: Remote ETL Engine.

  7. Click Save.
  8. The ETL tasks page shows the details of the newly configured VMware ETL.

    vmware_etl_configured.png

(Optional) B. Configuring the advanced properties

Configure the advanced properties to change the way the ETL works or to collect additional metrics.

To configure the advanced properties:

  1. On the Add ETL page, click Advanced.
  2. Configure the following properties:

    Run configuration

    Property

    Description

    Run configuration name

    Specify the name that you want to assign to this ETL task configuration. The default configuration name is displayed. You can use this name to differentiate between the run configuration settings of ETL tasks.

    Deploy status

    Select the deploy status for the ETL task. For example, you can initially select Test and change it to Production after verifying that the ETL run results are as expected.

    Log level

    Specify the level of details that you want to include in the ETL log file. Select one of the following options:

    • 1 - Light: Select to add the bare minimum activity logs to the log file.
    • 5 - Medium: Select to add the medium-detailed activity logs to the log file.
    • 10 - Verbose: Select to add detailed activity logs to the log file.

    Use log level 5 as a general practice. You can select log level 10 for debugging and troubleshooting purposes.

    Datasets

    Specify the datasets you want to add to the ETL run configuration. 

    1. Click Edit.
    2. Select one (click) or more (shift+click) datasets from the Available datasets list and click >> to move them to the Selected datasets list.
    3. Click Apply.

    The ETL collects data of metrics associated with the datasets that are available in the Selected datasets list.

    Saver period

    Specify the interval after which data is updated in the BMC Helix Continuous Optimization database. The default interval is one hour.

    VMware ETL configuration

    Property

    Description

    Aggregation period

    Specify a period to aggregate the collected data before loading it into the database. The default period is 1 hour.

    Preaggregated statistics import

    Specify whether you want to collect the pre-aggregated statistics for metrics, such as min, max, and count. The default selection is No.

    Hierarchy import period

    Specify the time interval to update the object relationship in the hierarchy. The default interval is 6 hours.

    Cluster/Pool performance data extraction

    Specify whether you want to aggregate the performance data of clusters or pools before loading it into the database. Select one of the following aggregation methods:

    • Aggregate Cluster/Pool performance data starting from Host/VM data (recommended)
    • Extract Cluster/Pool performance data from vCenter

    Compatibility lookup name customization

    Select one of the following criteria for sharing the lookup between ETLs:

    • Default/Recommended: UUID for hosts and virtual machines
    • System names for hosts and virtual machines
    • Host names for hosts and virtual machines

    Use Virtual Machine network name as system name

    Specify whether you want to use the virtual machine network name as a system name.

    Import cluster failover threshold metric

    Specify whether you want the ETL to import the cluster failover threshold metrics.

    Custom detail import

    Specify whether you want to import the detailed custom metrics and configure the following properties. The default selection is No.

    • Detail resolution: Set the resolution to 5 minutes or 15 minutes. The default is 15 minutes.
      • Detailed metrics for entity type: Select if you want to collect the detailed metrics for all the supported entities or the selected entity types.
        • Entity types: Select the required entity types.
      • Detailed metric set: Select Default or Custom.
        • If you select Custom, the Custom detailed metric set is displayed. Click  ➕️ Add to add metrics to the default metric set. Click ➖️ to remove individual metrics from the list.

          You can add metrics, such as BYFS_SIZE, BYFS_FREE, BYFS_USED, and BYFS_USED_SPACE_PCT. The minimum detail resolution for these metrics is 15 minutes. Detailed resolution of BYFS metrics is supported only when VMware tools are installed on the virtual machine from which the data is to be imported.

    Import advanced VM events

    Specify whether you want to import the advanced VM events for in-depth analysis. By default, this option is set to No.

    If you select the EVDAT dataset in the ETL configuration, the basic events are imported by the ETL. The following basic events are imported by default:

    • Host entering maintenance mode
    • Host exiting maintenance mode
    • Host shutdown
    • Host removed from cluster
    • Host added to cluster
    • Host disconnected from cluster
    • Cluster reconfiguration
    • VM unregistered from vCenter
    • VM registered to vCenter  
    • VM relocated to vCenter

    If you select Yes, the following additional advanced events are imported. This might require additional storage space and can affect the ETL performance.

    • VM Power ON
    • VM Power OFF
    • VM Migration
    • VM Reconfiguration
    • DRS VM Migration
    • VM has been renamed

    Import Tags

    Specify whether you want to import tags assigned to vSphere clusters, hosts, and virtual machines.

    • The default selections are Yes and None, which indicate that all the tags are imported.
      If you select No, tags are not imported during the next scheduled ETL run, and the tags that are already imported are not removed.
    • To import only specific tags, select WHITELIST - Import only selected list of tag types, and add the tags that you want to import in the Tag type whitelist field. You can specify multiple tags separated by a semicolon.
    • To exclude specific tags while importing, select BLACKLIST - Do not import selected list of tag types, and add the tags that you want to exclude in the Tag type blacklist field. You can specify multiple tags separated by a semicolon.

    Import filter

    Configure one or both of the filtering properties in this section to include or exclude entities while importing.

    Property

    Description

    Filtering for clusters and standalone hosts

    Specify whether you want to import all or specific clusters and standalone hosts.

    • The default selection is None, which indicates that all clusters and standalone hosts are imported.
    • To import only the specific clusters and standalone hosts, select Whitelist, and specify the names of clusters and standalone hosts to be imported. For example, cl1;cluster24 and host1;esx_host
    • To exclude the specific clusters and standalone hosts while importing, select Blacklist, and specify the names of clusters and standalone hosts to be excluded from importing. For example, cl1;cluster45 and host2;esx_host
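
When you split one vCenter between several ETLs, each ETL needs its own whitelist of clusters. The following Python sketch shows one way to group clusters so that no ETL exceeds 2,000 VMs, by using a first-fit decreasing heuristic. The function name plan_cluster_whitelists and the sample VM counts are hypothetical, and the heuristic is an illustration rather than a method that the product prescribes.

```python
from typing import Dict, List

MAX_VMS_PER_ETL = 2000  # sizing rule: no more than 2,000 VMs per ETL


def plan_cluster_whitelists(vms_per_cluster: Dict[str, int],
                            limit: int = MAX_VMS_PER_ETL) -> List[str]:
    """Group clusters into per-ETL whitelist strings (first-fit decreasing).

    A cluster that is larger than the limit gets an ETL of its own.
    """
    groups: List[List[str]] = []
    totals: List[int] = []
    # Largest clusters first; ties are broken by name for a stable result.
    ordered = sorted(vms_per_cluster.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, count in ordered:
        for i, total in enumerate(totals):
            if total + count <= limit:
                groups[i].append(name)
                totals[i] += count
                break
        else:
            # No existing group has room, so start a new ETL.
            groups.append([name])
            totals.append(count)
    # Cluster names are joined with ';', as in the whitelist examples.
    return [";".join(group) for group in groups]


if __name__ == "__main__":
    sample = {"cl1": 1500, "cluster24": 400, "cl3": 900, "cl4": 1000}
    for number, whitelist in enumerate(plan_cluster_whitelists(sample), start=1):
        print(f"ETL {number}: {whitelist}")
```

Each returned string can be used as the whitelist value of one ETL. Review the grouping against the CPU resources of each ETL Engine, because the number of clusters also affects the number of threads.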

    Filtering option by file path

    • The default selection is None, which indicates that the filtering criteria specified in the Filtering for clusters and standalone hosts option will be used. 
    • Blacklist/Whitelist - Use this property to exclude or include specific entities. While importing entities, you can exclude (blacklist) specific hosts and virtual machines or can include (whitelist) specific virtual machines. You can use the whitelist and blacklist file to distribute the clusters of a single vCenter among multiple vCenter ETLs. Depending on the sizing constraints, you can run the ETLs on the same ETL Engine or different ETL Engines. Regular expressions are not supported in blacklist filters but are supported in the whitelist filters. 

      To exclude specific hosts and virtual machines (blacklist)
      1. In a text file, add the names of virtual machines and hosts that you want to exclude from importing in the following format, and save the file on the ETL Engine Server where the ETL runs:
        {{code language="none"}}
        SYSTEM_TYPE;ENTITY UUID

        {{/code}}

        To find UUIDs, select the required virtual machine or host in the Workspace, and click View lookup. The Lookup value column in the Lookup Details table shows the UUID values. Regular expressions are not supported in blacklist filters.

      2. In the Use file at path box, specify the path to this text file.
      3. After you run the ETL, verify that the specified virtual machines and hosts are not displayed in the hierarchy.

      Example:

      To exclude specific hosts and virtual machines from importing, obtain their UUIDs and add them to a text file as follows:

      vh:vmw;44454c4c-4600-1054-8052-cac04f525231
      vh:vmw;44454c4c-4600-1054-8052-cac04f525232
      gm:vmw;4208badb-6a91-23d1-c6b5-061745b2c8d9
      gm:vmw;4208badb-6a91-23d1-c6b5-061745b2c8d7

      Where vh:vmw is the system type and 44454c4c-4600-1054-8052-cac04f525231 is the UUID of host_1. Similarly, gm:vmw is the system type, and 4208badb-6a91-23d1-c6b5-061745b2c8d9 is the UUID of vm_1, and so on.

      After the ETL runs, these specified virtual machines and hosts are not imported.
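
If you maintain the blacklist from a script, a small helper can catch malformed entries before the file reaches the ETL Engine Server. The following Python sketch is an illustration only. The function name render_blacklist is hypothetical, and restricting the system types to the two values shown in the example (vh:vmw and gm:vmw) is an assumption.

```python
import uuid
from typing import Iterable, Tuple

# System types shown in the example above: vh:vmw (host) and gm:vmw (virtual machine).
KNOWN_SYSTEM_TYPES = {"vh:vmw", "gm:vmw"}


def render_blacklist(entries: Iterable[Tuple[str, str]]) -> str:
    """Return blacklist file content, one SYSTEM_TYPE;UUID pair per line."""
    lines = []
    for system_type, entity_uuid in entries:
        if system_type not in KNOWN_SYSTEM_TYPES:
            raise ValueError(f"unexpected system type: {system_type}")
        # uuid.UUID raises ValueError for malformed input and
        # normalizes the value to lowercase hyphenated text.
        lines.append(f"{system_type};{uuid.UUID(entity_uuid)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    content = render_blacklist([
        ("vh:vmw", "44454c4c-4600-1054-8052-cac04f525231"),
        ("gm:vmw", "4208badb-6a91-23d1-c6b5-061745b2c8d9"),
    ])
    print(content, end="")
```

Write the returned text to the file that you specify in the Use file at path box.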

      To include specific virtual machines (whitelist)
      1. Obtain the DNS names of virtual machines to be imported from the vCenter.
      2. In a text file, add the virtual machine names, and save the file on the ETL Engine Server where the ETL runs:
        1. On the first line, add the following declaration statement: #FORMAT=V2;TYPE=VM_WHITELIST
        2. On the subsequent lines, add the virtual machine names in one of the following ways:
          • Add the DNS names.
          • Add a regular expression. The DNS names of virtual machines that match the pattern in the regular expression are included while importing.
      3. In the Use file at path box, specify the path to this text file.
      4. After you run the ETL, verify that only the specified virtual machines are imported into the hierarchy.

      This filter is applied only on the virtual machines. It does not affect importing of other entities, such as virtual hosts, clusters, and datastores.

      Example 1 (Specifying DNS names)

      To import virtual machines vm_1, vm_2, and vm_3, add their DNS names to the whitelist file as follows:

      #FORMAT=V2;TYPE=VM_WHITELIST
      vm_1.orgname.com
      vm_2.orgname.com
      vm_3.orgname.com

      Individual host names of the virtual machines in the whitelist filters are case-sensitive.

      After you run the ETL, these virtual machines, along with other entities, are imported.

      Example 2 (Using a regular expression)

      To import the virtual machines for which the name starts with vl, use the following regular expression:

      #FORMAT=V2;TYPE=VM_WHITELIST
      ^vl\-[a-z]*\-[0-9]*\.orgname\.com

      After you run the ETL, all the virtual machines for which the DNS name starts with vl will be imported. For example, vl-pun-023.orgname.com and vl-pun-024.orgname.com

      To import virtual machine names in any case, use this regular expression: ^(?i)vl\-[a-z]*\-[0-9]*\.orgname\.com

      Example 3 (Using a regular expression)

      To import all the virtual machines that do not include the text _backup_ in the virtual machine name, use an inverted (negative) regular expression, which matches only the virtual machines whose names do not contain the provided text:

      #FORMAT=V2;TYPE=VM_WHITELIST
      ^((?!.*_backup_.*).)*$
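
Before you deploy a whitelist file, you can preview which virtual machine names its entries would select. The following Python sketch is an approximation and not the ETL's own logic. The function name preview_whitelist is hypothetical, every entry is assumed to be a regular expression that must match the whole DNS name, and Python's re module stands in for the regex engine of the ETL, which might behave differently. Recent Python versions reject an inline (?i) flag that is not at the very start of a pattern, so the sketch uses an ignore_case parameter instead.

```python
import re
from typing import Iterable, List

HEADER = "#FORMAT=V2;TYPE=VM_WHITELIST"


def preview_whitelist(file_text: str, vm_names: Iterable[str],
                      ignore_case: bool = False) -> List[str]:
    """Return the VM names that match at least one entry in the whitelist text.

    Assumption: each entry is a regular expression that must match the
    whole name.
    """
    lines = [line.strip() for line in file_text.splitlines() if line.strip()]
    if not lines or lines[0] != HEADER:
        raise ValueError(f"first line must be {HEADER}")
    flags = re.IGNORECASE if ignore_case else 0
    patterns = [re.compile(entry, flags) for entry in lines[1:]]
    return [name for name in vm_names
            if any(pattern.fullmatch(name) for pattern in patterns)]


if __name__ == "__main__":
    names = [
        "vl-pun-023.orgname.com",
        "VL-PUN-024.orgname.com",
        "db_backup_01.orgname.com",
    ]
    whitelist = HEADER + "\n" + r"^vl\-[a-z]*\-[0-9]*\.orgname\.com" + "\n"
    print(preview_whitelist(whitelist, names))
    print(preview_whitelist(whitelist, names, ignore_case=True))
```

Because the whitelist is case-sensitive, the first call selects only vl-pun-023.orgname.com, and the second call also selects VL-PUN-024.orgname.com. Always confirm the result with a simulation run of the ETL.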

    • Allow/Deny list (new) - Use this property to exclude or include specific entities. While importing entities, you can exclude (deny) or include (allow) specific clusters, hosts, and virtual machines. Regular expressions are supported in both allow list and deny list filters. You can use regular expressions in the system name and type fields. 
      1. To filter the entities, create a text file in the following format, and save the file on the ETL Engine Server where the ETL runs:

        #MODE=ALLOWLIST|DENYLIST
        #TYPE=<list of system types>
        <system name>

        For details on the system types, see System types.

      2. In the Allow/Deny list filter file path textbox, specify the path to this text file.
      3. After you run the ETL, verify that the specified entities are filtered accordingly.

        Example 

        #MODE=ALLOWLIST
        #TYPE=sys:vhc:vmw
        aus-cluster-.*
        #TYPE=sys:gm:vmw,sys:vapp:vmw
        aus-clm-.*

    Important: The Allow/Deny list filtering uses NAME and not the HOSTNAME to filter the entities. Filters applied using the Allow/Deny list (new) option will take precedence over the filters applied using the Filtering for clusters and standalone hosts option. BMC recommends that you use the Allow/Deny list option for filtering the VMware entities.
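
    The Allow/Deny list file has a simple line-based layout, so it can be generated from a script. The following Python sketch builds the file content from a mode and a list of sections. The function name render_filter_file is hypothetical, and the pattern check uses Python's regex dialect, which is an assumption because the ETL might use a different regex engine.

```python
import re
from typing import Iterable, List, Sequence, Tuple

VALID_MODES = {"ALLOWLIST", "DENYLIST"}


def render_filter_file(mode: str,
                       sections: Iterable[Tuple[Sequence[str], Sequence[str]]]) -> str:
    """Return Allow/Deny list file content.

    Each section pairs a list of system types with the name patterns
    that apply to those types.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    lines: List[str] = [f"#MODE={mode}"]
    for system_types, name_patterns in sections:
        lines.append("#TYPE=" + ",".join(system_types))
        for pattern in name_patterns:
            re.compile(pattern)  # syntax check only; raises re.error if invalid
            lines.append(pattern)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_filter_file("ALLOWLIST", [
        (["sys:vhc:vmw"], ["aus-cluster-.*"]),
        (["sys:gm:vmw", "sys:vapp:vmw"], ["aus-clm-.*"]),
    ]), end="")
```

    With the sample input, the sketch reproduces the example file shown earlier in this section.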

    Additional properties

    Property

    Description

    List of properties

    Perform the following steps to specify additional properties for the ETL that act as user inputs during the run. You can specify these values now or you can do so later by accessing the You can manually edit ETL properties from this page link that is displayed for the ETL in the view mode.

    1. Click Add.
    2. In the etl.additional.prop.<propertyname> field, specify an additional property.
    3. Click Apply.

    Repeat these steps to add more properties.

    This ETL supports the following additional properties:

    • extract.vmware.skip.filesystem.list: Use this property to specify the file systems to be excluded for BYFS metrics while importing VMware vCenter data. Provide the list of file systems by using regex with ';' as a delimiter in the property. For example, when you provide the following values: .*/docker/.*;/tmp/.*;/var, the following message is displayed: "Filesystem list to exclude : [.*/docker/.*, /tmp/.*, /var]".
      The excluded filesystems are not considered in the calculation of TOTAL_FS_* metrics, such as TOTAL_FS_USED and TOTAL_FS_UTIL.
      When this property is not added, the following default file systems are used: ".*/kubelet/.*", ".*/docker/.*", ".*/run/runc/.*".
    • extract.vmware.process.config.update.inparallel: For a large environment that includes many virtual machines, add this property and set its value to true to speed up ETL processing.
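
    To see how a value of extract.vmware.skip.filesystem.list splits into individual patterns, consider the following Python sketch. It is an approximation for illustration. The function names parse_skip_list and is_excluded are hypothetical, and matching each pattern against the whole file system name is an assumption about behavior that this page does not specify.

```python
import re
from typing import List

# Defaults quoted in the text above, used when the property is not added.
DEFAULT_SKIP_LIST = [".*/kubelet/.*", ".*/docker/.*", ".*/run/runc/.*"]


def parse_skip_list(property_value: str) -> List[str]:
    """Split the ';'-delimited property value into individual regex strings."""
    return [part for part in property_value.split(";") if part]


def is_excluded(file_system: str, skip_list: List[str]) -> bool:
    """Return True if any pattern matches the whole name (assumed semantics)."""
    return any(re.fullmatch(pattern, file_system) for pattern in skip_list)


if __name__ == "__main__":
    skip_list = parse_skip_list(".*/docker/.*;/tmp/.*;/var")
    for name in ["/var/lib/docker/overlay2", "/tmp/cache", "/var", "/var/log"]:
        print(name, is_excluded(name, skip_list))
```

    Under the whole-name assumption, /var is excluded but /var/log is not. If the ETL applies partial matching instead, the outcome differs, so verify the effect on the TOTAL_FS_* metrics after an ETL run.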

    Loader configuration
    Property
    Description
    Empty dataset behavior
    Specify the action for the loader if it encounters an empty dataset:
    • Warn: Generate a warning about loading an empty dataset.
    • Ignore: Ignore the empty dataset and continue parsing.
    Maximum number of rows for CSV output
    A numeric value to limit the size of the output files.
    Remove domain suffix from datasource name (Only for systems) 
    Select True to remove the domain from the data source name. For example, server.domain.com will be saved as server. The default selection is False.
    Leave domain suffix to system name (Only for systems)
    Select True to keep the domain in the system name. For example: server.domain.com will be saved as is. The default selection is False.
    Skip entity creation (Only for ETL tasks sharing lookup with other tasks)
    Select True if you do not want this ETL to create an entity and want it to discard data from its data source for entities not found in BMC Helix Continuous Optimization. In this case, one of the other ETLs that share the lookup creates the new entity. The default selection is False.
    Scheduling options
    Property
    Description
    Hour mask
    Specify a value to run the task only during particular hours within a day. For example, 0 – 23 or 1, 3, 5 – 12.
    Day of week mask
    Select the days so that the task can be run only on the selected days of the week. To avoid setting this filter, do not select any option for this field.
    Day of month mask
    Specify a value to run the task only on the selected days of a month. For example, 5, 9, 18, 27 – 31.
    Apply mask validation
    Select False to temporarily turn off the mask validation without removing any values. The default selection is True.
    Execute after time
    Specify a value in the hours:minutes format (for example, 05:00 or 16:00) to wait before the task is run. The task run begins only after the specified time has elapsed.
    Enqueueable
    Specify whether you want to ignore the next run command or run it after the current task. Select one of the following options:
    • False: Ignores the next run command when a particular task is already running. This is the default selection.
    • True: Starts the next run command immediately after the current running task is completed.

  3. Click Save.
    The ETL tasks page shows the details of the newly configured VMware ETL.
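
The Hour mask and Day of month mask properties in the Scheduling options accept comma-separated values and ranges. The following Python sketch expands such a mask into the set of values that it covers, which can help you check a mask before you save it. The function name parse_mask is hypothetical, and the parsing rules are inferred from the examples on this page, so the product might accept a different syntax.

```python
from typing import Set


def parse_mask(mask: str) -> Set[int]:
    """Expand a mask such as "1, 3, 5 – 12" into the set of values it covers."""
    values: Set[int] = set()
    # Treat the en dash used in the examples like a plain hyphen.
    for part in mask.replace("–", "-").split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (int(piece) for piece in part.split("-", 1))
            if start > end:
                raise ValueError(f"invalid range: {part}")
            values.update(range(start, end + 1))
        else:
            values.add(int(part))
    return values


if __name__ == "__main__":
    print(sorted(parse_mask("1, 3, 5 – 12")))
```

For the mask 1, 3, 5 – 12, the sketch returns the hours 1, 3, and 5 through 12.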

Step III. Run the ETL

After you configure the ETL, you can run it to collect data. You can run the ETL in the following modes:

A. Simulation mode: Only validates the connection to the data source; does not collect data. Use this mode when you run the ETL for the first time or after you make any changes to the ETL configuration.

B. Production mode: Collects data from the data source.

A. To run the ETL in the simulation mode

To run the ETL in the simulation mode:

  1. Navigate to Administration > ETL & System Tasks, and select ETL tasks.
  2. On the ETL tasks page, click the ETL. The ETL details are displayed.
    etl_details.png

  3. In the Run configurations table, click Edit to modify the ETL configuration settings.
  4. On the Run configuration tab, ensure that the Execute in simulation mode option is set to Yes, and click Save.
  5. Click Run active configuration. A confirmation message about the ETL run job submission is displayed.
  6. On the ETL tasks page, check the ETL run status in the Last exit column.
    OK indicates that the ETL ran without any error. You are ready to run the ETL in the production mode.
  7. If the ETL run status is Warning, Error, or Failed:
    1. On the ETL tasks page, click the view details icon in the last column of the ETL name row.
    2. Check the log and reconfigure the ETL if required.
    3. Run the ETL again.
    4. Repeat these steps until the ETL run status changes to OK.

B. To run the ETL in the production mode

You can run the ETL manually when required or schedule it to run at a specified time.

To run the ETL manually

  1. On the ETL tasks page, click the ETL. The ETL details are displayed.
  2. In the Run configurations table, click Edit to modify the ETL configuration settings. The Edit run configuration page is displayed.
  3. On the Run configuration tab, select No for the Execute in simulation mode option, and click Save.
  4. To run the ETL immediately, click Run active configuration. A confirmation message about the ETL run job submission is displayed.
    When the ETL runs, it collects data from the source and transfers it to the BMC Helix Continuous Optimization database.

To schedule the ETL run in the production mode

By default, the ETL is scheduled to run daily. You can customize this schedule by changing the frequency and period of running the ETL.

To configure the ETL run schedule:

  1. On the ETL tasks page, click the ETL, and click Edit task. The ETL details are displayed.
    aws_api_etl_schedule_run.png
  2. On the Edit task page, do the following, and click Save:
    • Specify a unique name and description for the ETL task.
    • In the Maximum execution time before warning field, specify the duration for which the ETL must run before generating warnings or alerts, if any.
    • Select a predefined or custom frequency for starting the ETL run. The default selection is Predefined.
    • Select the task group to which you want to assign the ETL task.
  3. Click Schedule. A message confirming the scheduling job submission is displayed.
    When the ETL runs as scheduled, it collects data from the source and transfers it to the BMC Helix Continuous Optimization database.

Step IV. Verify data collection

Verify that the ETL ran successfully and the VMware data is refreshed in the Workspace.

To verify whether the ETL ran successfully

  1. Click Administration > ETL and System Tasks > ETL tasks.
  2. In the Last exec time column corresponding to the ETL name, verify that the current date and time are displayed.
  3. In the Last exit column corresponding to the ETL name, verify that the status is OK.
    In case of WARNING or ERROR, click the view details icon in the last column of the ETL name row to review the log files.

To verify that the VMware data is refreshed:

  1. In the Workspace tab, expand (Domain_name_for VMware) > Systems.
  2. In the left pane, verify that the hierarchy displays the new and updated VMware instances in your environment.
  3. Click a VMware virtual machine instance, and click the Metrics tab in the right pane.
  4. Check if the Last Activity column in the Configuration data and Performance metrics tables displays the current date.

The following image shows sample metrics data. To learn more about these metrics, see Lookup information and metrics for VMware ETLs.

vmware_etl_hierarchy.png

 
