Generic - Columnar Database extractor


Use this ETL to import generic data using the Columnar SQL Extractor.

Collecting data by using the Generic - Columnar database ETL

To collect data by using the Generic - Columnar Database ETL, do the following tasks:

I. Complete the preconfiguration tasks.

II. Configure the ETL.

Step I. Complete the preconfiguration tasks

  • Ensure that you can reach your database server from the ETL engine machine by running the ping <your_db_server_ip> command.
  • Ensure that you can remotely access your database server using telnet from the ETL engine machine, by running the telnet <your_db_server_ip> <your_db_server_port> command.
  • Connect to your database using a software client.
  • If the username is part of a Windows domain, verify that the username is specified in the YOURDOMAIN\username format. Verify that your settings match to the settings that are used to connect to the same database with a software client.

Step II. Configure the ETL

You must configure the ETL to connect to the database for data collection. ETL configuration includes specifying the basic and optional advanced properties. While configuring the basic properties is sufficient, you can optionally configure the advanced properties for additional customization.

A. Configuring the basic properties

Some of the basic properties display default values. You can modify these values if required.

To configure the basic properties:

  1. In the Helix Capacity Optimization Console, navigate to Administration ETL & System Tasks, and select ETL tasks.
  2. On the ETL tasks page, click Add > Add ETL. The Add ETL page displays the configuration properties. You must configure properties in the following tabs: Run configuration, Entity catalog, SQL query, and Connection parameters.
  3. On the Run Configuration tab, select Generic - Columnar Database Extractor from the ETL Module list. The name of the ETL is displayed in the ETL task name field. You can edit this field to customize the name.

    gen_col_db_config_page.png

  4. Click the Entity catalog tab, and select one of the following options:
    • Shared Entity Catalog: Select if the other ETLs access the same entities that are used by this ETL.
      • From the Sharing with Entity Catalog list, select the entity catalog name that is shared between ETLs.
    • Private Entity Catalog: Select if you want to use this ETL independently.
  5. Click the SQL query tab, and configure the following properties:

    The [confluence_table-plus] macro is a standalone macro and it cannot be used inline. Click on this message for details.

  6. Click the Connection properties tab configure the properties.

    Property

    Description

    Datasource type

    Select one of the following options:

    • Select existing datasource:
      • Datasource: From the Datasource list, select a datasource.
    • Custom datasource:
      • Database type: Select a database type from the drop down list.
        • Oracle
        • Microsoft SQL Server/Sybase
        • Other database
      • Database user: Type the user name to connect to the database.
      • Database password: Type the user name to connect to the database.
      • Database TNS name (Applicable to Oracle): Type in the database TNS name.
      • Data source name (Applicable to Microsoft SQL Server/Sybase): Specify the data source name.
      • Database name (Applicable to Microsoft SQL Server/Sybase): Type in the database name.
      • Connection URL (Applicable to Other database): Enter the connection URL. If you select Other database in the Database type property, then you must type a value.
        To connect to other database type, verify that the correct driver is installed on the  server running this ETL. For an ETL in Perl code, install an appropriate DBD driver. For an ETL in Java code, install an appropriate JDBC driver.
  7. (Optional) Override the default values of properties in the following tabs:

    The [confluence_table-plus] macro is a standalone macro and it cannot be used inline. Click on this message for details.

  8. Click Save.
    The ETL tasks page shows the details of the newly configured Generic - CSV columnar file parser ETL.

(Optional) B. Configuring the advanced properties

You can configure the advanced properties to change the way the ETL works.

To configure the advanced properties:

  1. On the Add ETL page, click Advanced.
  2. Configure the following properties:

    The [expand] macro is a standalone macro and it cannot be used inline. Click on this message for details.

    The [expand] macro is a standalone macro and it cannot be used inline. Click on this message for details.

    Format customization

    Property

    Description

    Timestamp format

    Specify a format that should be used by the ETL if the user tables or CSV files use an unsupported format. <YYYY-MM-DD HH:MM:SS> is the supported format. Depending on the type of ETL used, you might need to specify a custom format.

    Percentage format

    Reflects the format in which percentage metrics are imported.

    • If the percentage metric is out of 1 then, select 0 to 1
    • If the percentage metric is out of 100 then, select 0 to 100
    Additional properties

    Property

    Description

    List of properties

    Specify additional properties for the ETL that act as user inputs during run. You can specify these values now or you can do so later by accessing the "You can manually edit ETL properties from this page" link that is displayed for the ETL in the view mode.

    1. Click Add.
    2. In the etl.additional.prop.n field, specify an additional property.
    3. Click Apply.
      Repeat this task to add more properties.

    The [expand] macro is a standalone macro and it cannot be used inline. Click on this message for details.

    The [expand] macro is a standalone macro and it cannot be used inline. Click on this message for details.

  3. Click Save.
    The ETL tasks page shows the details of the newly configured Generic - Columnar Database ETL.

Step III. Run the ETL

After you configure the ETL, you can run it to collect data. You can run the ETL in the following modes:

A. Simulation mode: Only validates connection to the data source, does not collect data. Use this mode when you want to run the ETL for the first time or after you make any changes to the ETL configuration.

B. Production mode: Collects data from the data source.

A. To run the ETL in the simulation mode

To run the ETL in the simulation mode:

  1. Navigate to Administration ETL & System Tasks, and select ETL tasks.
  2. On the ETL tasks page, click the ETL. The ETL details are displayed.
    etl_details.png
     
  3. In the Run configurations table, click Edit edit_this_run_configuration.png to modify the ETL configuration settings.
  4. On the Run configuration tab, ensure that the Execute in simulation mode option is set to Yes, and click Save.
  5. Click Run active configuration. A confirmation message about the ETL run job submission is displayed.
  6. On the ETL tasks page, check the ETL run status in the Last exit column.
    OK Indicates that the ETL ran without any error. You are ready to run the ETL in the production mode.
  7.  If the ETL run status is Warning, Error, or Failed:
    1. On the ETL tasks page, clickclick to view details.pngin the last column of the ETL name row.
    2. Check the log and reconfigure the ETL if required.
    3. Run the ETL again.
    4. Repeat these steps until the ETL run status changes to OK.

B. To run the ETL in the production mode

You can run the ETL manually when required or schedule it to run at a specified time.

To run the ETL manually

  1. On the ETL tasks page, click the ETL. The ETL details are displayed.
  2. In the Run configurations table, click Edit edit_this_run_configuration.png to modify the ETL configuration settings. The Edit run configuration page is displayed.
  3. On the Run configuration tab, select No for the Execute in simulation mode option, and click Save.
  4. To run the ETL immediately, click Run active configuration. A confirmation message about the ETL run job submission is displayed.
    When the ETL runs, it collects data from the source and transfers it to the BMC Helix Continuous Optimization database.

To schedule the ETL run in the production mode

By default, the ETL is scheduled to run daily. You can customize this schedule by changing the frequency and period of running the ETL.

To configure the ETL run schedule:

  1. On the ETL tasks page, click the ETL, and click Edit task. The ETL details are displayed.
    aws_api_etl_schedule_run.png
  2. On the Edit task page, do the following, and click Save:
    • Specify a unique name and description for the ETL task.
    • In the Maximum execution time before warning field, specify the duration for which the ETL must run before generating warnings or alerts, if any.
    • Select a predefined or custom frequency for starting the ETL run. The default selection is Predefined.
    • Select the task group to which you want to assign the ETL task.
  3. Click Schedule. A message confirming the scheduling job submission is displayed.
    When the ETL runs as scheduled, it collects data from the source and transfers it to the BMC Helix Continuous Optimization database.

Step IV. Verify data collection

Verify that the ETL ran successfully and check whether the Generic - Columnar Database data is refreshed in the Workspace.

To verify whether the ETL ran successfully

  1. Click Administration > ETL and System Tasks > ETL tasks.
  2. In the Last exec time column corresponding to the ETL name, verify that the current date and time are displayed.
  3. In the Last exit column corresponding to the ETL name, verify that the status is OK.
    In case of WARNING or ERROR, click click to view details.png in the last column of the ETL name row to review the log files.

To verify that the database data is refreshed:

  1. In the Helix Capacity Optimization Console, click Workspace.
  2. Expand Generic - Columnar Database extractor > Systems.
  3. In the left pane, verify that the hierarchy displays the database instances.
  4. Click an instance, and click the Metrics tab in the right pane.
  5. Check if the Last Activity column in the Configuration metrics and Performance metrics tables displays the current date.


Configuring the ETL

Below is an example query that can be used to extract generic data. It can be used as a template to create custom extraction queries:

SELECT d.ts as TS, d.interval as DURATION, d.id as DS_SYSNM, d.name as SYSNM, d.fsname as SUBOBJNM, d.avgvalue as BYFS_FREE, d.size as BYFS_SIZE
from my_table d and d.ts > ?

You can use the :PARL placeholder parameter one or multiple times in a query; it is replaced by the last saved last counter value.

#Example 1
SELECT d.ts as TS, d.interval as DURATION, d.id as DS_SYSNM, d.name as SYSNM, d.fsname as SUBOBJNM, d.avgvalue as BYFS_FREE, d.size as BYFS_SIZE
from my_table d and d.ts > :PARL

#Example 2
SELECT d.ts as TS, d.ats as ACTIVITYDATE, d.interval as DURATION, d.id as DS_SYSNM, d.name as SYSNM, d.fsname as SUBOBJNM, d.avgvalue as BYFS_FREE, d.size as BYFS_SIZE from my_table d and d.ts > :PARL OR d.ats > :PARL

To import system metrics, configure the SYSDAT dataset to import system metrics. The 'TS' column is used as last counter. You can configure the default last counter in Advanced mode; if not configured, the default is 7 days from the ETL start date.

To import business driver data, configure the WKLDAT dataset as shown below:

SELECT d.ts as TS, d.interval as DURATION, d.id as DS_WKLDNM, d.name as WKLDNM, d.pagesinperiod as WEB_HITS, d.accountsrefresh as TOTAL_EVENTS
from my_table d and d.ts > ?

The recommended minimum and maximum sampling interval in BMC Helix Continuous Optimization is 5 minutes and 5 days.

Supported datasets

This ETL supports vertical datasets: SYSDATWKLDAT.

Metric filtering is performed within the extraction query. Only one dataset per extraction is supported.


 

Tip: For faster searching, add an asterisk to the end of your partial query. Example: cert*