Maintaining ETL tasks
ETL task characteristics
Every ETL task has the following characteristics:
- It is modular, so that the whole process is defined by composing modules through configuration.
- It is extensible, so that adding support for a new data format is simple.
- It is source independent; that is, all transformation operations act on data represented in a standard format.
- It is sequential, so that only new data is loaded every time an ETL task is run.
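One common way to get the "only new data" behavior is to remember the position of the last record loaded and skip everything at or before it on the next run. The following is a minimal sketch of that idea, not a description of how BMC Helix Capacity Optimization implements it: the function `select_new_records`, the `ts` field, and the sample records are all hypothetical.

```python
def select_new_records(records, last_loaded_ts):
    """Return the records newer than last_loaded_ts and the updated marker.

    Hypothetical helper: each record is a dict with a numeric "ts" field.
    """
    new_records = [r for r in records if r["ts"] > last_loaded_ts]
    # If nothing is new, the marker stays where it was.
    new_marker = max((r["ts"] for r in new_records), default=last_loaded_ts)
    return new_records, new_marker


# First run: nothing has been loaded yet, so both records are new.
source = [{"ts": 1, "value": 10}, {"ts": 2, "value": 20}]
first_batch, marker = select_new_records(source, last_loaded_ts=0)

# Second run: one record was added at the source; only that one is selected.
source.append({"ts": 3, "value": 30})
second_batch, marker = select_new_records(source, marker)
```

In a real task the marker would have to be persisted between runs; here it is kept in a variable for brevity.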
ETL task composition
ETL tasks are composed of multiple modules that handle specific operations:
- An extractor module, which connects to the source and extracts new data.
- One or more transformer modules, which apply in-memory transformations to the data to convert it to the internal format.
- One or more loader modules, which load data into the destination (by default, BMC Helix Capacity Optimization Data Warehouse and a CSV file).
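The composition described above can be sketched as a small pipeline: one extractor, a chain of transformers, and a set of loaders. This is an illustrative sketch only and does not use the BMC Helix Capacity Optimization ETL Development API. The names `run_etl`, `extract_rows`, `to_internal_format`, and `ListLoader` are hypothetical, as is the `metric=value` input format, and the in-memory loader stands in for the real destinations.

```python
from typing import Callable, Iterable, Iterator, List

Row = dict  # hypothetical stand-in for one data record


def extract_rows(source: Iterable[str]) -> Iterator[Row]:
    """Extractor: reads raw 'metric=value' lines from the source."""
    for line in source:
        name, _, value = line.partition("=")
        yield {"metric": name.strip(), "value": value.strip()}


def to_internal_format(row: Row) -> Row:
    """Transformer: converts a raw row into a (hypothetical) internal format."""
    return {"metric": row["metric"].upper(), "value": float(row["value"])}


class ListLoader:
    """Loader: collects rows in memory in place of a real destination."""

    def __init__(self) -> None:
        self.rows: List[Row] = []

    def load(self, row: Row) -> None:
        self.rows.append(row)


def run_etl(
    source: Iterable[str],
    extractor: Callable[[Iterable[str]], Iterator[Row]],
    transformers: List[Callable[[Row], Row]],
    loaders: List[ListLoader],
) -> int:
    """Run one extractor, then every transformer in order, then every loader."""
    count = 0
    for row in extractor(source):
        for transform in transformers:
            row = transform(row)
        for loader in loaders:
            loader.load(row)
        count += 1
    return count


# Two loaders receive the same transformed rows, mirroring the two
# default destinations mentioned above.
warehouse, csv_copy = ListLoader(), ListLoader()
loaded = run_etl(
    ["cpu_util = 0.42", "mem_util = 0.73"],
    extract_rows,
    [to_internal_format],
    [warehouse, csv_copy],
)
```

Because the modules are passed in as arguments, swapping the extractor or adding a transformer changes only the call to `run_etl`, which is the kind of composition through configuration that the characteristics list refers to.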
Extractor modules
There are various extractor modules bundled with BMC Helix Capacity Optimization, organized in macro-categories:
- ETL tasks for third-party management platforms (for example, HP OpenView).
- ETL tasks that can collect data using OS-native tools and commands (for example, UNIX SAR).
- ETL tasks that can import standard log formats (for example, NCSA logs).
- Open ETL tasks, which can import the BMC Helix Capacity Optimization Open ETL data format defined in Importing data from custom sources through data formats.
- Custom ETL tasks, that you can develop using the BMC Helix Capacity Optimization ETL Development API, as described in Developing.
All available ETL tasks are displayed in the summary table. For more information, refer to Managing ETL and System tasks.