Working with ETLs

Navigate to Administration > ETL & System Tasks > ETL tasks to view the list of all configured ETL tasks grouped by task group. You can manage or run ETL tasks and task chains.

Overview

Similar to the Maintaining System tasks page, each row in the table represents an ETL task, and displays details including the last execution results. For more information on the structure of the summary table, refer to Managing ETL and System tasks.

The Last exit value column shows one of the following statuses:

  • OK - The ETL task completed successfully.
  • WARNING - This status can have many causes, which often depend on the nature of the ETL task itself. For example, an ETL that collects data from files or databases might not find any data to extract. In this case, it generates a warning message such as Empty directory [...] for extraction or, for a database ETL, Loader: Dataset is empty. Another common warning occurs when the extracted data is partially corrupted. The ETL process discards that data, continues processing, and displays a warning message.
  • FAILED - The ETL task did not complete processing. Failures can have different causes. For example, a network failure or a badly configured parameter could prevent an ETL from connecting to a database or mounting a network folder.

Important

Service ETL tasks do not exit, hence they do not have a Last exit value.

To investigate execution issues, analyze the log file. For more information, see Managing tasks.

You can use the Task Commands form to start or reschedule any task displayed in the list. This functionality is described in the Task management section.

Viewing, editing, and deleting an ETL task

The ETL tasks page under Administration > ETL & System Tasks > ETL tasks displays detailed information about all ETLs, summarizing the status of their last run. For more information, see Understanding the ETL task summary table.

You can also edit the properties of an ETL task from this page. For more information, see Editing an ETL task.

Deleting a task

To delete an ETL task, do the following steps:

  1. From the ETL tasks list, select the ETL task that you want to delete.
  2. Hover over Maintenance, and select one of the following options:
    • Delete - Select to delete the ETL task.
    • Delete imported data - Select to delete the entities imported by the ETL including details, such as entity catalog, last counters, and lookups.

Restoring entity relationships

You can restore the relationship of an entity that is broken accidentally or manually, such as the removal of a system from a domain. This feature is available in all the ETLs that are configured to import the object relationship data.

To restore the entity relationships, do the following steps:

  1. From the ETL tasks list, select the ETL task for which you want to recover the relationships.
  2. Hover over Maintenance, and select Restore relations. A confirmation dialog box appears.

  3. Click Confirm to proceed.
  4. Click Show processing details to view the progress of relationship processing.

    The processing might take some time. Proceed to the next step only after the processing is completed.

  5. In the Workspace, refresh the ETL data and verify that the entity relationships are restored.

This operation deletes any relation that is no longer defined by the ETL in the configured "Destination domain."

Reimporting data by running the ETL in data repair mode

You can run the ETL in data repair mode to overwrite the existing samples produced by the ETL. This operation can be helpful in cases where you have imported incorrect data and want to overwrite it. 

Before you begin

Before proceeding with the data repair activity, verify that you have correctly configured the ETL to get the right data portion by performing the following:

  1. Review and edit the last counters
  2. Verify the configuration is enabled to get samples
  3. Verify the availability of input files for file parsing

This operation is applicable only for batch ETLs. 

To run the ETL in data repair mode, do the following steps:

  1. From the ETL tasks list, select the ETL task that you want to run in data repair mode.
  2. Hover over Maintenance, and select Run data in repair mode.
    Click Confirm in the Confirm the operation dialog box to run the ETL in data repair mode. 
  3. To verify whether the ETL ran successfully, in the Last exec time column corresponding to the ETL name, verify that the current date and time are displayed.
  4. In the Last exit column corresponding to the ETL name, verify that the status is OK.

Important

This action runs the ETL in data repair mode. Samples produced by the ETL execution overwrite the previous samples for the selected timeframe. If the data used for the repair is missing samples, the original samples for those points are also overwritten during the data repair operation.
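The overwrite behavior can be pictured with a small Python model. This is only one reading of the note above, not the product's implementation: the `repair` function, the integer timestamps, and the dictionary storage are all assumptions made for illustration.

```python
def repair(stored: dict, repaired: dict, start: int, end: int) -> dict:
    """Replace every stored sample in [start, end) with the repaired samples.

    Hypothetical model: samples are a mapping of timestamp -> value.
    """
    # Drop all original samples that fall inside the repair timeframe.
    kept = {ts: v for ts, v in stored.items() if not (start <= ts < end)}
    # Add whatever the repair run produced for that timeframe.
    kept.update({ts: v for ts, v in repaired.items() if start <= ts < end})
    return kept


stored = {1: 10.0, 2: 11.0, 3: 12.0, 4: 13.0}
repaired = {2: 20.0}   # the sample for timestamp 3 is missing from the rerun
result = repair(stored, repaired, start=2, end=4)
# result keeps timestamps 1 and 4, replaces 2, and no longer contains 3.
```

In this model, the original sample at timestamp 3 is lost because the repair data has nothing for it, which is why the input data should be checked for gaps before you run the repair.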

Viewing and editing the last counter value

Click Lastcounter to view the Status detail table. It lists the timestamp, the result of the last run, and the value of the lastcounter parameter for each data source. Click Edit Lastcounter to change the lastcounter value manually. The lastcounter and lookup entries are created only when the ETL task is in production mode.

Recommendation

It is strongly recommended that you limit the amount of data imported in a single run. If you need to recover historical data, do it in small chunks. For instance, import data for a few days at a time, and then import the subsequent days in further chunks.
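If you plan a chunked recovery, it can help to list the time windows in advance. The following Python sketch splits a date range into consecutive windows of a few days. The `date_chunks` helper and the three-day default are assumptions chosen for the example. How each window maps to the lastcounter value or to the extraction settings depends on the ETL and is not shown here.

```python
from datetime import date, timedelta


def date_chunks(start: date, end: date, days: int = 3):
    """Yield (chunk_start, chunk_end) pairs covering [start, end) in order."""
    if days < 1:
        raise ValueError("days must be at least 1")
    step = timedelta(days=days)
    current = start
    while current < end:
        # The last window is shortened so that it never passes the end date.
        chunk_end = min(current + step, end)
        yield current, chunk_end
        current = chunk_end


chunks = list(date_chunks(date(2024, 1, 1), date(2024, 1, 8), days=3))
# Three windows: Jan 1 to Jan 4, Jan 4 to Jan 7, Jan 7 to Jan 8 (end dates exclusive).
```

Importing one window at a time, and verifying the result before moving on, keeps each run small and makes a failed import easier to isolate.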

Working with ETL entity catalogs

Click Entity catalog in the ETL Task details page to view information about its lookup tables. You can view the list of systems, business drivers and domains imported by the ETL, including data sources and the name they will have in BMC Helix Continuous Optimization.

Deleting an entity catalog record

Each row in the Systems and Business drivers tables displays the mapping between the data source name and the BMC Helix Continuous Optimization name. Select the check boxes in the first column to select rows.

Click the Delete drop-down and select one of the following:

  • Delete selected lookup entries of ETL 
  • Delete selected lookup entries of all ETLs
  • Delete selected lookup entries of all ETLs and dismiss entities: Deletes the lookup references and the selected resources. If the selected resources are not shared with other ETL tasks, this action changes their status to "dismissed". For more information, see Lifecycle and status of entities and domains.

When an ETL task encounters an entity (or domain) in the data source, it checks its own lookup tables to find a configured target. If no target is found, the object is treated as a new object and the ETL task performs the following actions:

  1. Creates a new entity (or domain) with a name identical to the one found in the data source.
  2. Adds an entry into the ETL task lookup tables to track the new association.
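The lookup behavior described above can be sketched in a few lines of Python. This is an illustrative model only, not the product's implementation: the `EntityCatalog` class, its `resolve` method, and the ID numbering are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class EntityCatalog:
    """Hypothetical model of an ETL task's lookup table (not a product API)."""
    lookup: dict = field(default_factory=dict)   # source name -> entity ID
    names: dict = field(default_factory=dict)    # entity ID -> entity name
    next_id: int = 301                           # arbitrary starting ID

    def resolve(self, source_name: str) -> int:
        """Return the entity ID for a source name, creating the entity if unknown."""
        if source_name in self.lookup:
            return self.lookup[source_name]
        # No target found: create an entity named after the source object...
        entity_id = self.next_id
        self.next_id += 1
        self.names[entity_id] = source_name
        # ...and record the association in the lookup table.
        self.lookup[source_name] = entity_id
        return entity_id


catalog = EntityCatalog()
first = catalog.resolve("sys1")    # unknown: a new entity is created
second = catalog.resolve("sys1")   # known: the existing entity is reused
```

Because the lookup is keyed by the source name, renaming the entity later does not break the association. Only the entity name changes, not the lookup entry.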

Additional information

New entities appear in the lookup tables and in the All Systems and Business Drivers > Newly discovered page of the Workspace section. For details, see Lifecycle and status of entities and domains.

Adding a lookup table record

You can manually add a record to the Systems, Business Drivers, or Domains lookup tables. Click Add system lookup, Add business driver lookup, or Add domain lookup, and enter the required details in the popup that is displayed.

Field           Description
Lookup field    Select the method that the data source uses to refer to the entity.
Lookup value    Type the lookup value.
System          Select an existing entity from the drop-down list.

Click Add system, Add business driver, or Add domain as applicable.

Sharing an entity catalog

You can also configure an ETL to share the entity catalog of another ETL. To do so, follow these steps:

  1. Edit the Run configuration of the ETL.
  2. In the Edit run configuration page that appears, expand Entity catalog.
  3. Select Shared Entity Catalog and select an entity catalog from Sharing with Entity Catalog.
  4. Click Save.

Additional information

When two ETL tasks share the entity catalog, both of them should be able to load the same entity. Whenever a new entity is defined, one of the two ETLs will load it first, in no particular order.

Issues might occur if you set up the entity catalog after the first data import. An ETL task could automatically create a new entity and import its data when it should have appended the data to an existing entity. If this happens, you must perform an entity catalog reconciliation.

Note

The manual reconciliation of an entity in BMC Helix Continuous Optimization is discouraged. If manual reconciliation is performed incorrectly, it may disrupt the system. Also, the reconciliation process cannot be undone. It is strongly advised that you run an ETL task in simulation mode before executing it for the first time, to facilitate solving lookup duplication issues beforehand. For details, see Entity name resolution and Preventing duplication issues.

Lookup duplication example

The following example depicts a situation in which a lookup reconciliation is necessary.

Consider an ETL task, ETL_A, which accesses a data source, dsA, that collects data for two systems: sys1 and sys2. ETL_A runs every day, and has been running for some time.

After its first run, it created two new entities in BMC Helix Continuous Optimization, sys1 and sys2. You later renamed these entities as ny_sys1 and ny_sys2 to match your BMC Helix Continuous Optimization naming policy.

The lookup table of ETL_A contains the following mappings, where 301 and 302 are the unique IDs for those BMC Helix Continuous Optimization entities.

ID     Source name    System
301    sys1           ny_sys1
302    sys2           ny_sys2

In your IT infrastructure, there is also another data source, dsB, which stores data for two systems, sys2 (the same as before) and sys3, but collects a different set of metrics from dsA.

If you create a new ETL task, ETL_B, which imports dsB data from sys2 and sys3 into BMC Helix Continuous Optimization and let ETL_B perform an automatic lookup, its lookup table will look like the following:

ID     Source name    System
303    sys2           sys2
304    sys3           sys3

The BMC Helix Continuous Optimization Data Warehouse now has two new systems. This is a problem because sys2 already exists, but ETL_B did not know it.

In this case, ETL_B should share the lookup table of ETL_A in order to assign data to the correct system in BMC Helix Continuous Optimization, that is, ny_sys2.
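The difference between the two outcomes can be reproduced with a short Python sketch. It is a simplified model built for this example, not product code: the `resolve` function, the plain dictionaries, and the ID counter are assumptions. The IDs follow the tables above only because the counter starts at 301.

```python
from itertools import count

ids = count(301)   # hypothetical ID generator for the whole warehouse
entities = {}      # entity ID -> entity name in the warehouse


def resolve(lookup, source_name):
    """Reuse the mapped entity, or create one and record the new mapping."""
    if source_name not in lookup:
        entity_id = next(ids)
        entities[entity_id] = source_name
        lookup[source_name] = entity_id
    return lookup[source_name]


# ETL_A imports sys1 and sys2; the entities are then renamed.
catalog_a = {}
for name in ("sys1", "sys2"):
    resolve(catalog_a, name)
entities[301], entities[302] = "ny_sys1", "ny_sys2"

# Case 1: ETL_B uses its own lookup table, so sys2 is created a second time.
catalog_b = {}
own = [resolve(catalog_b, name) for name in ("sys2", "sys3")]      # [303, 304]

# Case 2: ETL_B shares the catalog of ETL_A, so sys2 maps to ny_sys2 (302).
# sys3 gets 305 here only because case 1 already consumed 303 and 304.
shared = [resolve(catalog_a, name) for name in ("sys2", "sys3")]   # [302, 305]
```

In the first case the warehouse ends up with two entities for the same physical system (302 and 303). In the second case only sys3 is new.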

Lookup reconciliation

If a lookup duplication problem occurs, you can resolve it. To learn how, see Adding and managing entity catalogs.

Preventing duplication issues

To avoid these problems, the correct procedure for creating a new ETL task is:

  1. Create the new ETL task with simulation mode turned on and the maximum log level (10).
  2. Manually run the ETL task and check its execution log to find out if it created any new entities. You can use this information to understand if the automatic lookup process is safe and if you need to use shared lookup from another ETL.
  3. If you notice an issue, you can also manually add a line in the lookup table.
  4. Toggle simulation mode off.
  5. Run the ETL task to import new data.

The following topics help you work with and understand ETL modules.
