Creating and managing custom ETLs
Use the ETL Development Kit (EDK), a command-line utility, to create and manage custom ETLs. A custom ETL module can be written in Perl or Java and can be a simple extractor, a database extractor, or a parser. The utility runs on Windows and Linux systems.
Using the EDK, you can achieve the following goals:
- Create custom ETL modules in the Perl or Java programming language.
- Modify existing ETL packages and repackage them.
- Activate and deactivate ETL packages.
To create, modify, and activate custom ETL packages
The following procedure illustrates how to create a database extractor in Java.
1. Download the utility
Download the utility from the BMC Helix Capacity Optimization console.
- Log in to BMC Helix Capacity Optimization as an administrator.
- In the Home tab, select Tools, and click Download ETL Development kit.
The coetl-<version>.zip file is downloaded to your system.
- Create a directory (for example, coetl) and copy the downloaded file to the directory.
- Extract the contents of the file.
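On Linux, the directory setup and extraction can be scripted. The following is a minimal sketch, assuming that the unzip tool is installed. The function name extract_edk is hypothetical, and the archive name depends on the version that you downloaded.

```shell
#!/bin/sh
# Hypothetical helper: copy the downloaded EDK archive into a working
# directory and extract it there. Assumes the unzip tool is available.
extract_edk() {
  archive=$1          # path to the downloaded coetl-<version>.zip file
  target=${2:-coetl}  # working directory (example name from the procedure)

  if [ ! -f "$archive" ]; then
    echo "Archive not found: $archive" >&2
    return 1
  fi

  mkdir -p "$target" || return 1
  cp "$archive" "$target"/ || return 1
  # Extract in a subshell so the caller's working directory is unchanged.
  ( cd "$target" && unzip -o -q "$(basename "$archive")" )
}
```

For example, call extract_edk with the path of the downloaded archive as the first argument and, optionally, the working directory as the second argument.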
2. Create an ETL project or modify an existing project
Specify the type of ETL (extractor, database extractor, or parser) and the programming language (Perl or Java).
At the shell prompt, change to the directory where you extracted the coetl utility files.
- (Linux) Run this command to grant execute permission on the coetl.sh script:
chmod +x coetl.sh
Run the following command to create the project:
- (Windows) .\coetl.cmd --create [extractor|dbextractor|parser] [java|perl] <project_name> <path_to_save_project>
- (Linux) ./coetl.sh --create [extractor|dbextractor|parser] [java|perl] <project_name> <path_to_save_project>
Example: Command to create a database ETL in Java
- (Windows) .\coetl.cmd --create dbextractor java MonitoringDatabase development\MonitoringDatabaseETLProject
- (Linux) ./coetl.sh --create dbextractor java MonitoringDatabase development/MonitoringDatabaseETLProject
The utility creates the project folder at the specified path.
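A typo in the ETL type or language is easy to make on the command line. The following sketch wraps the documented --create command and checks the arguments against the values listed above before calling the utility. The function name create_etl_project and the COETL variable are hypothetical and are not part of the EDK.

```shell
#!/bin/sh
# Path to the EDK launcher; override COETL if the script lives elsewhere.
COETL=${COETL:-./coetl.sh}

# Hypothetical wrapper: validate the ETL type and language, then call
# the documented --create command.
create_etl_project() {
  etl_type=$1
  language=$2
  project_name=$3
  project_path=$4

  case $etl_type in
    extractor|dbextractor|parser) ;;
    *) echo "Unsupported ETL type: $etl_type" >&2; return 2 ;;
  esac
  case $language in
    java|perl) ;;
    *) echo "Unsupported language: $language" >&2; return 2 ;;
  esac
  if [ -z "$project_name" ] || [ -z "$project_path" ]; then
    echo "Usage: create_etl_project <type> <language> <name> <path>" >&2
    return 2
  fi

  "$COETL" --create "$etl_type" "$language" "$project_name" "$project_path"
}
```

Calling create_etl_project with the arguments dbextractor, java, MonitoringDatabase, and development/MonitoringDatabaseETLProject runs the same command as the Linux example above.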
To modify an existing custom ETL project instead, export its package, import it into the EDK, and repackage it after you make your changes.
- Export the required ETL package.
- Log in to the BMC Helix Capacity Optimization console.
- Navigate to Administration > ETL & System Tasks, and select ETL tasks.
- Select the required ETL package and export it. For details, see Importing-and-exporting-ETL-packages.
- Import the package into the EDK.
- At the shell prompt, change to the directory where you extracted the coetl utility files.
- Run the following command to import the ETL package:
- (Windows) .\coetl.cmd --import <path_of_exported_package> <path_to_save_project>
- (Linux) ./coetl.sh --import <path_of_exported_package> <path_to_save_project>
Example:
- (Windows) .\coetl.cmd --import C:\Downloads\ExportedETLProject.etl.pkg development\exported_project
- (Linux) ./coetl.sh --import Downloads/ExportedETLProject.etl.pkg development/exported_project
The utility creates the project folder for the imported ETL package at the specified path.
The project folder contains the following directory structure and files:
Folder or file | Description |
---|---|
.metadata | Contains metadata that is used by the coetl utility. |
bin | Stores the compiled binary files (compiled Java code). |
src | Contains the source code. |
lib | Contains the libraries that are required for compiling the project. |
module.properties | Contains customizable ETL module properties, such as ETL name, list of supported databases, and configuration settings. |
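To confirm that a created or imported project is complete, you can check for the entries listed in the table. This is a minimal sketch; the function name check_project_layout is hypothetical, and the check tests only that each entry exists, without assuming whether it is a file or a folder.

```shell
#!/bin/sh
# Hypothetical check: report entries from the table above that are missing
# from a project folder. Returns non-zero if anything is missing.
check_project_layout() {
  project=$1
  missing=0
  for entry in .metadata bin src lib module.properties; do
    if [ ! -e "$project/$entry" ]; then
      echo "Missing: $project/$entry" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, run check_project_layout development/MonitoringDatabaseETLProject after the create or import command completes.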
3. (optional) Customize the ETL properties
The ETL project includes the default properties. Edit the <project folder>/module.properties file to configure the ETL properties, such as name, description, associated datasets, and additional properties. The following table explains how to edit some of these properties:
To configure | Update this property | Example |
---|---|---|
Module name | etl.module.description | etl.module.description=MonitoringDatabase Extractor module |
Associated datasets | general.dataset.idlist with a list of dataset IDs | general.dataset.idlist=1;2 |
Additional properties | Syntax for updating: property name|description|type [String,Numeric,Boolean]|default value (optional)|required [true,false] | You can add two required String properties as follows: additional.config.properties.number=2 |
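If you prefer to script simple edits instead of opening the file in an editor, the following sketch updates or appends a single property. It assumes that each property is on one line in key=value form with no spaces around the equal sign. The function name set_property is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: set key=value in a properties file, replacing the
# existing line for that key or appending a new line if the key is absent.
# Assumes one key=value pair per line, with no spaces around "=".
# Note: awk -v interprets backslash escapes in the value.
set_property() {
  file=$1
  key=$2
  value=$3

  tmp=$(mktemp) || return 1
  awk -v k="$key" -v v="$value" '
    BEGIN { FS = "="; done = 0 }
    $1 == k { print k "=" v; done = 1; next }
    { print }
    END { if (!done) print k "=" v }
  ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }

  # Write back in place so the original file permissions are kept.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

For example, calling set_property with the path of the module.properties file, the key general.dataset.idlist, and the value 1;2 produces the dataset list shown in the table.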
4. Customize the project in an IDE
Import the project that you created into an integrated development environment (IDE), such as Eclipse. The following steps are for the Eclipse IDE.
- Open the Eclipse IDE.
- Create a new project and provide a project name.
- Select the ETL project path that you created in step 2 and customize as needed.
- Click Finish.
Example: The imported project displays the Java extractor code.
5. (Only for Java) Compile the Java code
Compile the Java code to create a Java ETL package. You can use the compiler that is embedded in the EDK. When you run the compile command, the compiled resources are copied to the bin directory of the project folder.
To use the embedded compiler, run the following command:
- (Windows) .\coetl.cmd --compile <path_to_saved_project>
- (Linux) ./coetl.sh --compile <path_to_saved_project>
Example:
- (Windows) .\coetl.cmd --compile development\MonitoringDatabaseETLProject
- (Linux) ./coetl.sh --compile development/MonitoringDatabaseETLProject
You can also use an external build tool such as Maven to compile the Java code. In that case, configure the build tool (development IDE or external tool) to produce the compiled Java classes and add them to the bin directory of the project folder.
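When an external tool does the compilation, the remaining step is to place the class files in the bin directory. The following sketch copies a build output directory into the project. It assumes that your build writes classes to a directory such as target/classes, which is the Maven default; the function name sync_classes_to_bin is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: copy externally compiled classes into the bin
# directory of an EDK project folder.
sync_classes_to_bin() {
  classes_dir=$1   # build output, for example target/classes with Maven
  project=$2       # EDK project folder

  if [ ! -d "$classes_dir" ]; then
    echo "No compiled output at: $classes_dir" >&2
    return 1
  fi

  mkdir -p "$project/bin" || return 1
  # Copy the directory contents, preserving the package folder structure.
  cp -R "$classes_dir"/. "$project/bin"/
}
```

For example, after a Maven build, call sync_classes_to_bin with target/classes and the project folder path, and then continue with the packaging step.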
6. Create an ETL package
To create an ETL package from the ETL project that you created, run the following command:
- (Windows) .\coetl.cmd --package <project_path>
- (Linux) ./coetl.sh --package <project_path>
Example:
- (Windows) .\coetl.cmd --package development\MonitoringDatabaseETLProject
- (Linux) ./coetl.sh --package development/MonitoringDatabaseETLProject
The ETL package (.pkg file) is created in the output folder.
7. Activate the ETL package
Activate the ETL package so that the custom module appears in the list of ETL modules that are available to BMC Helix Capacity Optimization administrators when they create a new ETL instance. Before activating the package, generate the API key that authenticates the user when connecting to the BMC Helix Capacity Optimization console, and copy the key to the EDK root folder. For details on generating the API key, see Generating-an-API-key-for-programmatic-access.
- Navigate to the <project_path>/output folder.
- Run the following command to activate the package:
- (Windows) .\coetl.cmd --activate <project_path>\output\<project_name>.etl.pkg
- (Linux) ./coetl.sh --activate <project_path>/output/<project_name>.etl.pkg
Example:
- (Windows) .\coetl.cmd --activate development\MonitoringDatabaseETLProject\output\ETLProject.etl.pkg
- (Linux) ./coetl.sh --activate development/MonitoringDatabaseETLProject/output/ETLProject.etl.pkg
The ETL module is named according to this format: "<project_name> <parser|dbextractor|extractor> module". For example, MonitoringDatabase DBExtractor module.
If you rename the generated key file or save it in a folder different from the EDK root folder, ensure that you specify the complete path of the key file:
- (Windows) .\coetl.cmd --api-key <api_key_path>\<api_key>.key --activate <project_path>\output\<project_name>.etl.pkg
- (Linux) ./coetl.sh --api-key <api_key_path>/<api_key>.key --activate <project_path>/output/<project_name>.etl.pkg
Example:
- (Windows) .\coetl.cmd --activate development\MonitoringDatabaseETLProject\output\ETLProject.etl.pkg --api-key development\MonitoringDatabaseETLProject\edk_vl-aus-bco-dv02.key
- (Linux) ./coetl.sh --activate development/MonitoringDatabaseETLProject/output/ETLProject.etl.pkg --api-key development/MonitoringDatabaseETLProject/edk_vl-aus-bco-dv02.key
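During development, the compile, package, and activate steps are often repeated together. The following sketch chains the three documented commands and stops at the first failure. The function name build_and_activate and the COETL variable are hypothetical, and the package path follows the <project_path>/output/<project_name>.etl.pkg pattern shown above.

```shell
#!/bin/sh
# Path to the EDK launcher; override COETL if the script lives elsewhere.
COETL=${COETL:-./coetl.sh}

# Hypothetical helper: compile, package, and activate a Java ETL project,
# stopping at the first failing step. Only documented options are used.
build_and_activate() {
  project_path=$1
  project_name=$2
  api_key=$3   # optional; needed only if the key is not in the EDK root folder

  package="$project_path/output/$project_name.etl.pkg"

  "$COETL" --compile "$project_path" || return 1
  "$COETL" --package "$project_path" || return 1

  if [ -n "$api_key" ]; then
    "$COETL" --api-key "$api_key" --activate "$package"
  else
    "$COETL" --activate "$package"
  fi
}
```

Pass the project path and project name as the first two arguments. Add the key file path as a third argument only when the key is renamed or stored outside the EDK root folder.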
8. Configure the ETL and run the active configuration
After activation, the custom module is displayed in the list of ETL modules that are available to the administrators when creating a new ETL instance.
- Log in to the BMC Helix Capacity Optimization console.
- Navigate to Administration > ETL & System Tasks, and select ETL tasks.
- On the ETL tasks page, click Add > Add ETL.
- On the Run Configuration tab, verify that the custom ETL module is displayed in the ETL Module list.
- Select the custom ETL module and configure the ETL properties, and click Save. The ETL tasks page shows the details of the newly configured ETL.
- Click the ETL and click Run active configuration. A confirmation message about the ETL run job submission is displayed.
- Verify the ETL logs and data collection in the Workspace.
To deactivate custom ETL packages
When the deployed ETL package is no longer needed, you can deactivate it using the EDK. You can reactivate it when needed.
After deactivation, the custom module is removed from the list of ETL modules that are available to the administrators for creating a new ETL instance. Any ETL instance that was created using this custom module is still displayed in the list of ETLs, but you cannot run that ETL until you activate the package again.
- Navigate to the <project_path>/output folder.
- Run the following command to deactivate the package:
- (Windows) .\coetl.cmd --deactivate <project_path>\output\<project_name>.etl.pkg
- (Linux) ./coetl.sh --deactivate <project_path>/output/<project_name>.etl.pkg
Example:
- (Windows) .\coetl.cmd --deactivate development\MonitoringDatabaseETLProject\output\ETLProject.etl.pkg
- (Linux) ./coetl.sh --deactivate development/MonitoringDatabaseETLProject/output/ETLProject.etl.pkg
As with the activation command, if you rename the generated key file or save it in a folder different from the EDK root folder, specify the API key when you deactivate the package.
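The deactivation command with an API key is not shown above. The following sketch assumes that the --api-key option combines with --deactivate in the same way that it combines with --activate; verify this against your EDK version. The function name deactivate_etl_package and the COETL variable are hypothetical.

```shell
#!/bin/sh
# Path to the EDK launcher; override COETL if the script lives elsewhere.
COETL=${COETL:-./coetl.sh}

# Hypothetical helper: deactivate a package, passing the key file path
# only when one is given.
# Assumption: --api-key is accepted together with --deactivate, mirroring
# the documented activation syntax.
deactivate_etl_package() {
  package=$1
  api_key=$2   # optional path to the .key file

  if [ -n "$api_key" ]; then
    "$COETL" --api-key "$api_key" --deactivate "$package"
  else
    "$COETL" --deactivate "$package"
  fi
}
```

Pass the package path as the first argument and, if needed, the key file path as the second argument.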