Configuring DME - CSV Data Mart Extractor
The Moviri DME - CSV Data Mart Extractor creates CSV files from Data Mart data. The output CSV is made available on the Remote ETL Engine where the ETL runs, under the path specified in the ETL configuration.
Run Configuration
| Property | Default | Description |
|---|---|---|
| Run configuration name | Default | The default name of the configuration |
| Deploy status | Production | Choose between Production and Test |
| Description | | Description of the configuration |
| Log Level | 1 - Light | Set the level of detail for log collection: 1 - Light, 5 - Medium, 10 - Verbose |
| ETL module | Name of ETL | The chosen ETL |
| Module Description | | A description of the ETL and a link to support for it |
| Execute in simulation mode | Yes | Set to Yes to test the ETL without extracting data; set to No to run it and extract the data |
| Datasets | STOFS | Select the only available dataset (STOFS) |
Entity Catalog
Click the Entity catalog tab, and select one of the following options:
- Shared Entity Catalog: From the Sharing with Entity Catalog list, select the entity catalog name that is shared between ETLs.
- Private Entity Catalog: Select this option if you want to generate the hierarchy independently of other ETLs and do not want to include entity lookups from other ETLs.
Object Relationship
Click the Object relationship tab, and select the "Leave all new entities in 'Newly Discovered'" option. Because this ETL does not import any data into the workspace, this option has no effect.
Connection
When creating the Access Key and Access Secret Key, you must assign the Administrators group and the Administrators and Capacity Administrators roles.
| Property | Required | Definition |
|---|---|---|
| BMC Helix Continuous Optimization URL | Yes | The URL of the BHCO instance, for example https://my-host.onbmc.com |
| Access Key | Yes | See Setting up access keys for programmatic access in the BMC Documentation |
| Access Secret Key | Yes | See Setting up access keys for programmatic access in the BMC Documentation |
Proxy settings
This section needs to be configured only if the Remote ETL Engine needs to connect to BHCO through a proxy.
| Property | Required | Definition |
|---|---|---|
| Proxy Host | No | The address of the proxy, for example https://my-enterprise-proxy.com |
| Proxy port | No | The port number on which the proxy is listening, for example 8080 |
| Proxy username | No | Username for proxy authentication, if needed |
| Proxy password | No | Password for proxy authentication, if needed |
| Proxy timeout | No | Connection timeout in milliseconds |
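The proxy properties above map naturally onto an HTTP client's proxy configuration. The following is a minimal sketch (not the connector's actual code; host, port, and timeout values are the illustrative ones from the table) of how they could be assembled into a requests-style proxies mapping. Note that Proxy timeout is expressed in milliseconds, while many HTTP clients expect seconds:

```python
def build_proxy_config(host, port, username=None, password=None, timeout_ms=None):
    """Assemble a requests-style proxies mapping and a timeout in seconds.

    Illustrative sketch only: property names follow the configuration
    table above, not the connector's internal implementation.
    """
    if username and password:
        # Embed credentials in the proxy URL: scheme://user:pass@host:port
        scheme, bare_host = host.split("://", 1)
        proxy_url = f"{scheme}://{username}:{password}@{bare_host}:{port}"
    else:
        proxy_url = f"{host}:{port}"
    proxies = {"http": proxy_url, "https": proxy_url}
    # Convert the configured millisecond timeout to seconds.
    timeout_s = timeout_ms / 1000 if timeout_ms is not None else None
    return proxies, timeout_s

proxies, timeout = build_proxy_config(
    "https://my-enterprise-proxy.com", 8080, timeout_ms=30000)
```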
Data mart and CSV configuration
| Property | Required | Definition |
|---|---|---|
| Data Mart Identifier | Yes | The unique name of the data mart used as input of the ETL. It starts with ER_V_ or BUF_. You can find this information in the Data Mart details, as shown in the picture below. |
| Columns to extract | No | The Data Mart columns to extract into the CSV, comma- or semicolon-separated. If empty, all columns are extracted. |
| Output file name | Yes | Path of the CSV file that the ETL generates. The path must be writable by the cpit user, for example /tmp/my-output-csv-file.csv |
| Write CSV header | Yes | If Yes, the ETL generates a CSV header containing the selected column names. |
| CSV separator | Yes | The character to use as the CSV column separator. The default is a comma (,). |
Data Mart details
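To illustrate how the Columns to extract, Write CSV header, and CSV separator settings interact, here is a minimal sketch using in-memory rows in place of real Data Mart data. This is not the connector's implementation; row and column names are invented for illustration:

```python
import csv
import io

def extract_csv(rows, columns_to_extract="", separator=",", write_header=True):
    """Render selected Data Mart columns as CSV text.

    rows: list of dicts, one per Data Mart row.
    columns_to_extract: comma- or semicolon-separated column names;
    an empty string means all columns are extracted.
    """
    if columns_to_extract.strip():
        # Accept either comma or semicolon as the list delimiter.
        cols = [c.strip()
                for c in columns_to_extract.replace(";", ",").split(",")
                if c.strip()]
    else:
        cols = list(rows[0].keys())  # all columns, in Data Mart order
    out = io.StringIO()
    writer = csv.writer(out, delimiter=separator, lineterminator="\n")
    if write_header:
        writer.writerow(cols)
    for row in rows:
        writer.writerow([row[c] for c in cols])
    return out.getvalue()

rows = [{"NAME": "vm01", "CPU": 4}, {"NAME": "vm02", "CPU": 8}]
text = extract_csv(rows, "NAME;CPU", separator=";")
```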
Split the output into multiple CSV files
If needed, the ETL can also be configured to split the output into several CSV files, grouping rows by the value of a given Data Mart column. The following example introduces this use case.
Suppose you use this Data Mart, with identifier ER_V_SPLIT_CSV, as input of the ETL. The ETL can be configured to generate one CSV file per distinct ENTTYPENAME value in the Data Mart by specifying the Output file name field of the configuration as follows:
/tmp/my-custom-path/{enttypename}.csv
This configuration creates one CSV named "Virtual Machine - VMware.csv", containing the first three rows of the ER_V_SPLIT_CSV Data Mart, and one CSV named "Hadoop Resource Pool.csv", containing the last row of the Data Mart.
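The split behavior described above can be pictured with a short sketch. The sample rows are invented for illustration, and the real grouping is performed by the connector, not by this code:

```python
import csv
import os
import tempfile

def split_csv(rows, columns, path_template, split_column):
    """Write one CSV per distinct value of split_column.

    path_template contains a placeholder named after the split column,
    e.g. "/tmp/my-custom-path/{enttypename}.csv".
    """
    groups = {}
    for row in rows:
        groups.setdefault(row[split_column], []).append(row)
    written = []
    for value, group in groups.items():
        # The column value becomes the file name via the template placeholder.
        path = path_template.format(**{split_column.lower(): value})
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(columns)
            for row in group:
                writer.writerow([row[c] for c in columns])
        written.append(path)
    return written

rows = [
    {"ENTTYPENAME": "Virtual Machine - VMware", "NAME": "vm01"},
    {"ENTTYPENAME": "Virtual Machine - VMware", "NAME": "vm02"},
    {"ENTTYPENAME": "Hadoop Resource Pool", "NAME": "pool01"},
]
outdir = tempfile.mkdtemp()
files = split_csv(rows, ["ENTTYPENAME", "NAME"],
                  os.path.join(outdir, "{enttypename}.csv"), "ENTTYPENAME")
```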
Run the ETL
After configuring the ETL, run it to collect data. You can run the ETL in these modes:
- Simulation mode: Validates connection to the data source without generating the CSV file. Use this mode when running the ETL for the first time or after changing its configuration.
- Production mode: Collects data from the data source and generates the CSV file.
1. Running the ETL in simulation mode
To run the ETL in simulation mode:
- In the console, navigate to Administration > ETL & System Tasks, and select ETL tasks.
- On the ETL tasks page, click the ETL to display its details.
- In the Run configurations table, click the pencil icon to modify the ETL settings.
- On the Run configuration tab, set Execute in simulation mode to Yes, and click Save.
- Click Run active configuration. A confirmation message appears.
- On the ETL tasks page, check the ETL run status in the Last exit column.
OK means the ETL ran without errors. You can now run the ETL in production mode. If the ETL run status shows Warning, Error, or Failed:
- On the ETL tasks page, click the pencil icon in the ETL name row's last column.
- Check the log and adjust the ETL configuration if needed.
- Run the ETL again.
- Repeat these steps until the ETL run status is OK.
2. Running the ETL in production mode
Run the ETL manually when needed or schedule it to run at a set time.
Running the ETL manually
- On the ETL tasks page, click the ETL to display its details.
- In the Run configurations table, click the pencil icon to modify the ETL settings. The Edit run configuration page appears.
- On the Run configuration tab, select No for the Execute in simulation mode option, and click Save.
- To run the ETL immediately, click Run active configuration. A confirmation message appears.
The ETL collects data from the source and stores it into the CSV file.
Scheduling the ETL run
By default, the ETL is scheduled to run daily. You can customize this schedule by changing the frequency and period of running the ETL.
To configure the ETL run schedule:
- On the ETL tasks page, click the ETL, and click Edit Task. The ETL details are displayed.
- On the Edit task page, do the following, and click Save:
- Specify a unique name and description for the ETL task.
- In the Maximum execution time before warning field, specify the duration for which the ETL must run before generating warnings or alerts, if any.
- Select a predefined or custom frequency for starting the ETL run. The default selection is Predefined.
- Select the task group and the scheduler to which you want to assign the ETL task.
- Click Schedule. A message confirming the scheduling job submission is displayed.
When the ETL runs as scheduled, it collects data from the source and stores it into the CSV file.
Verify data collection
Confirm the ETL ran successfully and that the Moviri DME - CSV Data Mart Extractor data is available in the output CSV file.
To verify whether the ETL ran successfully:
- In the console, click Administration > ETL & System Tasks > ETL tasks.
- In the Last exec time column corresponding to the ETL name, verify that the current date and time are displayed.
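On the Remote ETL Engine, you can also confirm that the output file exists and was freshly written. A minimal sketch, assuming the illustrative output path from the configuration above (the sample file written here stands in for the real output):

```python
import csv
import os
import tempfile
import time

def check_output(path, max_age_hours=24):
    """Report whether the CSV exists, is recent, and how many lines it has."""
    if not os.path.isfile(path):
        return {"exists": False}
    age_h = (time.time() - os.path.getmtime(path)) / 3600
    with open(path, newline="") as f:
        lines = sum(1 for _ in csv.reader(f))
    return {"exists": True, "fresh": age_h <= max_age_hours, "lines": lines}

# Example with a freshly written sample file (path is illustrative):
sample = os.path.join(tempfile.mkdtemp(), "my-output-csv-file.csv")
with open(sample, "w", newline="") as f:
    csv.writer(f).writerows([["NAME", "CPU"], ["vm01", "4"]])
result = check_output(sample)
```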