Creating and managing custom ETLs

The ETL Development Kit (EDK) is a command-line utility for creating and managing custom ETLs. You can write a custom ETL module in Perl or Java, and it can be a simple extractor, a database extractor, or a parser. The utility runs on Windows and Linux systems.

Using the EDK, you can achieve the following goals: 

  • Create custom ETL modules in the Perl or Java programming language.
  • Modify existing ETL packages and repackage them.
  • Activate and deactivate ETL packages.

To create, modify, and activate custom ETL packages

The following procedure illustrates how to create a database extractor in Java.

1. Download the utility

Download the utility from the Home tab.

  1. Log in to BMC Helix Capacity Optimization as an administrator.
  2. In the Home tab, select Tools, and click Download ETL Development kit.
    The coetl-<version>.zip file is downloaded to your system.
  3. Create a directory (for example, coetl) and copy the downloaded file to the directory.
  4. Extract the contents of the file.
2. Create an ETL project or modify an existing project

Specify the type of ETL (extractor, database extractor, or parser) and the programming language (Perl or Java).

  1. At the shell prompt, change to the directory where you extracted the coetl utility files.

    Tip

    You can run the coetl command to view the syntax of the available commands, such as create, compile, package, activate, import, and deactivate.

  2. (Linux) Run the following command to grant execute permission on the coetl.sh script:
    chmod +x coetl.sh
  3. Run the following command:
    • (Windows) .\coetl.cmd --create [extractor|dbextractor|parser] [java|perl] <project_name> <path_to_save_project>
    • (Linux) ./coetl.sh --create [extractor|dbextractor|parser] [java|perl] <project_name> <path_to_save_project>

    Example: Command to create a database ETL in Java

    • (Windows) .\coetl.cmd --create dbextractor java MonitoringDatabase development\MonitoringDatabaseETLProject
    • (Linux) ./coetl.sh --create dbextractor java MonitoringDatabase development/MonitoringDatabaseETLProject

    The utility creates the project folder at the specified path.
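The following sketch is not part of the EDK. It assumes a hypothetical wrapper function, create_etl_project, that rejects an unsupported ETL type or language before it calls coetl.sh with the --create syntax shown above. The DRY_RUN variable is also an assumption made for illustration: when it is set to 1, the function prints the command instead of running it.

```shell
# Hypothetical helper (not shipped with the EDK): validate the arguments,
# then call coetl.sh --create with the syntax documented above.
create_etl_project() {
    if [ "$#" -ne 4 ]; then
        echo "usage: create_etl_project <type> <language> <project_name> <path>" >&2
        return 2
    fi
    etl_type=$1
    language=$2
    project_name=$3
    project_path=$4

    # Only the three documented ETL types are accepted.
    case $etl_type in
        extractor|dbextractor|parser) ;;
        *) echo "unsupported ETL type: $etl_type" >&2; return 1 ;;
    esac
    # Only the two documented languages are accepted.
    case $language in
        java|perl) ;;
        *) echo "unsupported language: $language" >&2; return 1 ;;
    esac

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command without running it.
        echo "./coetl.sh --create $etl_type $language $project_name $project_path"
    else
        ./coetl.sh --create "$etl_type" "$language" "$project_name" "$project_path"
    fi
}
```

Run the function from the directory that contains coetl.sh, for example: create_etl_project dbextractor java MonitoringDatabase development/MonitoringDatabaseETLProject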

To modify an existing custom ETL and repackage it, export its package and import it into the EDK.

  1. Export the required ETL package.
    1. Log in to the BMC Helix Capacity Optimization console.
    2. Navigate to Administration > ETL & System Tasks, and select ETL tasks.
    3. Select the required ETL package and export it. For details, see Importing and exporting ETL packages.
  2. Import the package into the EDK.
    1. At the shell prompt, change to the directory where you extracted the coetl utility files.
    2. Run the following command to import the ETL package:
      • (Windows) .\coetl.cmd --import <path_of_exported_package> <path_to_save_project>
      • (Linux) ./coetl.sh --import <path_of_exported_package> <path_to_save_project>

Example:

      • (Windows) .\coetl.cmd --import C:\Downloads\ExportedETLProject.etl.pkg development\exported_project
      • (Linux) ./coetl.sh --import Downloads/ExportedETLProject.etl.pkg development/exported_project

The utility creates the project folder for the imported ETL package at the specified path.

The project folder contains the following directory structure and files:

Folder or file       Description
.metadata            Contains metadata that is used by the coetl utility.
bin                  Provides a location to store compiled binary files (Java compiled code).
src                  Contains the source code.
lib                  Contains the libraries that are required for compiling the project.
module.properties    Contains customizable ETL module properties, such as ETL name, list of supported databases, and configuration settings.
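
As a quick sanity check after a create or import, the sketch below lists any of the entries named above that are missing from a project folder. The function name check_project_layout is hypothetical, and the check assumes that every project contains all five entries, which the description above suggests but does not state separately for Perl projects.

```shell
# Hypothetical helper: print each expected entry that is missing from
# the project folder, one per line. Returns 1 if anything is missing.
check_project_layout() {
    project_dir=$1
    missing=0
    for entry in .metadata bin src lib module.properties; do
        # -e matches files and directories alike.
        if [ ! -e "$project_dir/$entry" ]; then
            echo "$entry"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, check_project_layout development/MonitoringDatabaseETLProject prints nothing and returns 0 when the folder is complete.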


3. (optional) Customize the ETL properties

The ETL project includes the default properties. Edit the <project folder>/module.properties file to configure the ETL properties, such as name, description, associated datasets, and additional properties. The following table explains how to edit some of these properties:

  • Module name
    Property: etl.module.description
    Example: etl.module.description=MonitoringDatabase Extractor module
  • Associated datasets
    Property: general.dataset.idlist, set to a semicolon-separated list of dataset IDs
    Example: general.dataset.idlist=1;2
  • Additional properties
    Syntax: property name|description|type [String,Numeric,Boolean]|default value (optional)|required [true,false]
    Example (two required String properties):
      additional.config.properties.number=2
      additional.config.property.1=extract.system.types.list|System types to extract|String|0;16;25;28|true
      additional.config.property.2=extract.domain.names|Domain names to extract|String|Dom1|true

The following excerpt shows these properties in a module.properties file:

#ETL module description
etl.module.description=MonitoringDatabase Extractor module

#Mandatory semicolon separated list of datasets IDs that ETL supports
general.dataset.idlist=1;2

#Additional configuration properties supported by ETL module
#
#Property definition syntax:
#       property name|description|type [String,Numeric,Boolean]|default value (optional)|required [true,false]
#
# Example - (adding a mandatory String property 'extract.system.types.list' with default value of '0;16;25;28':
#
#additional.config.properties.number=1
#additional.config.property.1=extract.system.types.list|System types to extract|String|0;16;25;28|true
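
To make the property definition syntax concrete, the sketch below splits one definition line into its five pipe-separated fields and checks the type and required values against the documented choices. The function name parse_property_def is hypothetical. The sketch expects a full key=value line and assumes that an omitted default value is written as an empty field (two adjacent pipes), which the syntax description does not confirm.

```shell
# Hypothetical helper: split "key=name|description|type|default|required"
# into its fields and validate the type and required values.
parse_property_def() {
    # Read the five fields from the text after the first "=".
    IFS='|' read -r name description type default required <<EOF
${1#*=}
EOF
    case $type in
        String|Numeric|Boolean) ;;
        *) echo "invalid type: $type" >&2; return 1 ;;
    esac
    case $required in
        true|false) ;;
        *) echo "invalid required flag: $required" >&2; return 1 ;;
    esac
    printf 'name=%s\n' "$name"
    printf 'description=%s\n' "$description"
    printf 'type=%s\n' "$type"
    printf 'default=%s\n' "$default"
    printf 'required=%s\n' "$required"
}
```

Because the fields are split only on the pipe character, a default value such as 0;16;25;28 is kept intact.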


4. Customize the project in an IDE

Import the project that you created into an integrated development environment (IDE) such as Eclipse. The following steps are for the Eclipse IDE.

  1. Open the Eclipse IDE.
  2. Create a new project and provide a project name.
  3. Select the ETL project path that you created in step 2 and customize as needed.
  4. Click Finish.

After the import, the project displays the generated code, for example, the Java extractor code.

5. (Only for Java) Compile the Java code

Compile the Java code to create a Java ETL package. You can use the compiler that is embedded with the EDK. When you run the compile command, the generated compiled resources are copied to the bin directory of the project folder.

To use the embedded compiler, run the following command:

  • (Windows) .\coetl.cmd --compile <path_to_saved_project>
  • (Linux) ./coetl.sh --compile <path_to_saved_project>

Example:

  • (Windows) .\coetl.cmd --compile development\MonitoringDatabaseETLProject
  • (Linux) ./coetl.sh --compile development/MonitoringDatabaseETLProject

You can also use an external build tool such as Maven to compile the Java code. In that case, configure the build tool (the IDE or the external tool) to generate the compiled Java classes and place them in the bin directory of the project folder.

6. Create an ETL package

To create an ETL package from the ETL project that you created, run the following command:

  • (Windows) .\coetl.cmd --package <project_path>
  • (Linux) ./coetl.sh --package <project_path>

Example:

  • (Windows) .\coetl.cmd --package development\MonitoringDatabaseETLProject
  • (Linux) ./coetl.sh --package development/MonitoringDatabaseETLProject

The ETL package (.pkg file) is created in the output folder of the project.

7. Activate the ETL package

Activate the ETL package to view the custom module in the list of ETL modules that are available to the BMC Helix Capacity Optimization administrators when creating a new ETL instance. Before activating the package, you must generate the API key that is used to authenticate the user when connecting to the Helix Capacity Optimization Console, and copy this key to the EDK root folder. For details on generating the API key, see Generating an API key for programmatic access.

  1. Navigate to the <project_path>/output folder. 
  2. Run the following command to activate the package:
    • (Windows) .\coetl.cmd --activate <project_path>\output\<project_name>.etl.pkg
    • (Linux) ./coetl.sh --activate <project_path>/output/<project_name>.etl.pkg

Example:

    • (Windows) .\coetl.cmd --activate development\MonitoringDatabaseETLProject\output\ETLProject.etl.pkg
    • (Linux) ./coetl.sh --activate development/MonitoringDatabaseETLProject/output/ETLProject.etl.pkg

The ETL module is named according to this format: "<project_name> <parser|dbextractor|extractor> module". For example, MonitoringDatabase DBExtractor module.

If you rename the generated key file or save it in a folder other than the EDK root folder, specify the complete path of the key file:

    • (Windows) .\coetl.cmd --api-key <api_key_path>\<api_key>.key --activate <project_path>\output\<project_name>.etl.pkg
    • (Linux) ./coetl.sh --api-key <api_key_path>/<api_key>.key --activate <project_path>/output/<project_name>.etl.pkg

Example:

    • (Windows) .\coetl.cmd --activate development\MonitoringDatabaseETLProject\output\ETLProject.etl.pkg --api-key development\MonitoringDatabaseETLProject\edk_vl-aus-bco-dv02.key
    • (Linux) ./coetl.sh --activate development/MonitoringDatabaseETLProject/output/ETLProject.etl.pkg --api-key development/MonitoringDatabaseETLProject/edk_vl-aus-bco-dv02.key
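
Steps 5 through 7 can be chained in one script. The sketch below is an illustration for a Java project on Linux, not an EDK feature. The function names run_step and build_and_activate and the DRY_RUN variable are hypothetical. The sketch builds the package path from the <project_path>/output/<project_name>.etl.pkg syntax shown above, and it assumes that coetl.sh returns a nonzero exit status when a command fails, which this page does not confirm.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, run it otherwise.
run_step() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper: compile, package, and activate a Java ETL project.
# Arguments: <project_path> <project_name> [<api_key_file>]
build_and_activate() {
    project_path=$1
    project_name=$2
    api_key=${3:-}
    package="$project_path/output/$project_name.etl.pkg"

    # Stop at the first failing step (assumes nonzero exit status on failure).
    run_step ./coetl.sh --compile "$project_path" || return 1
    run_step ./coetl.sh --package "$project_path" || return 1

    if [ -n "$api_key" ]; then
        # Key file renamed or stored outside the EDK root folder.
        run_step ./coetl.sh --api-key "$api_key" --activate "$package"
    else
        run_step ./coetl.sh --activate "$package"
    fi
}
```

With DRY_RUN set to 1, the function only prints the three commands, which is a convenient way to review them before running anything against the console.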
8. Configure the ETL and run the active configuration

After activation, the custom module is displayed in the list of ETL modules that are available to the administrators when creating a new ETL instance.

  1. Log in to the BMC Helix Capacity Optimization console.
  2. Navigate to Administration > ETL & System Tasks, and select ETL tasks.
  3. On the ETL tasks page, click Add > Add ETL.
  4. On the Run Configuration tab, verify that the custom ETL module is displayed in the ETL Module list. 
  5. Select the custom ETL module, configure the ETL properties, and click Save. The ETL tasks page shows the details of the newly configured ETL.
  6. Click the ETL and click Run active configuration. A confirmation message about the ETL run job submission is displayed.
  7. Verify the ETL logs and data collection in the Workspace.

To deactivate custom ETL packages

When the deployed ETL package is no longer needed, you can deactivate it by using the EDK. You can reactivate it when needed.

After deactivation, the custom module is removed from the list of ETL modules that are available to the administrators for creating a new ETL instance. Any ETL instance that was created using this custom module is still displayed in the list of ETLs, but you won't be able to run the ETL until you activate it again.

  1. Navigate to the <project_path>/output folder.
  2. Run the following command to deactivate the package:
    • (Windows) .\coetl.cmd --deactivate <project_path>\output\<project_name>.etl.pkg
    • (Linux) ./coetl.sh --deactivate <project_path>/output/<project_name>.etl.pkg

Example:

    • (Windows) .\coetl.cmd --deactivate development\MonitoringDatabaseETLProject\output\ETLProject.etl.pkg
    • (Linux) ./coetl.sh --deactivate development/MonitoringDatabaseETLProject/output/ETLProject.etl.pkg

Similar to the activation command, if you rename the generated key file or save it in a folder different from the EDK root folder, specify the API key while deactivating the package.
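
This page does not show the deactivation command together with an API key. The sketch below assumes that the --api-key option is placed as in the activation syntax, before the action option. The function name deactivate_command is hypothetical, and it only prints the command so that you can review it before running it.

```shell
# Hypothetical helper: build the deactivate command and add --api-key
# only when a key file path is given.
# Assumption: --api-key precedes --deactivate, as in the activation syntax.
deactivate_command() {
    package=$1
    api_key=${2:-}
    if [ -n "$api_key" ]; then
        echo "./coetl.sh --api-key $api_key --deactivate $package"
    else
        echo "./coetl.sh --deactivate $package"
    fi
}
```

For example, deactivate_command <project_path>/output/<project_name>.etl.pkg <api_key_path>/<api_key>.key prints the Linux command with both paths filled in.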
