Maintaining ETL tasks
An ETL (Extract, Transform and Load) task is a process that extracts data from a source, such as a database or data file, transforms it into an appropriate format (if necessary), and then loads it into the TrueSight Capacity Optimization Data Warehouse, automatically cataloging new entities.
The ETL tasks page enables you to manage the tasks responsible for extracting data from all the available sources and feeding it into the TrueSight Capacity Optimization Data Warehouse.
ETL tasks feed TrueSight Capacity Optimization with data for systems and business drivers. Data is extracted from one or more sources, converted into an internal format, loaded into the TrueSight Capacity Optimization Data Warehouse, and then cataloged.
The ETL engine is a core part of TrueSight Capacity Optimization and its management is restricted to administrators. You can assign individual users the roles required to administer ETL tasks by allowing the accounts to perform the WEB_ADMIN_EDIT activities. You can also use Access Groups and Task Groups to further restrict access to ETL tasks. For more information about users, roles, and access groups, refer to Managing access control.
ETL task characteristics
Every ETL task has the following characteristics:
- It is modular, so that the whole process is defined by composing modules through configuration.
- It is extensible, so that the collection of a new data format is simple.
- It is source independent; that is, all transformation operations act on data represented in a standard format.
- It is sequential, so that only new data is loaded every time an ETL task is run.
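The sequential behavior described above can be illustrated with a minimal sketch. This is not the product's implementation: the `IncrementalExtractor` class, the `ts` field, and the `last_seen` watermark are all hypothetical names chosen for the example, which assumes each source record carries a monotonically increasing timestamp.

```python
from dataclasses import dataclass


@dataclass
class IncrementalExtractor:
    """Hypothetical extractor that remembers the newest timestamp it has seen."""

    last_seen: int = 0  # watermark; 0 means nothing has been extracted yet

    def extract(self, source_rows):
        # Keep only rows that are newer than the stored watermark.
        new_rows = [row for row in source_rows if row["ts"] > self.last_seen]
        if new_rows:
            # Advance the watermark so the next run skips these rows.
            self.last_seen = max(row["ts"] for row in new_rows)
        return new_rows


# Illustrative data: two rows exist on the first run, one more arrives later.
source = [{"ts": 1, "value": 10}, {"ts": 2, "value": 20}]
extractor = IncrementalExtractor()
first_run = extractor.extract(source)

source.append({"ts": 3, "value": 30})
second_run = extractor.extract(source)
```

In this sketch the first run returns both existing rows, and the second run returns only the row added afterwards, because the watermark has already moved past the earlier timestamps.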
ETL task composition
ETL tasks are composed of multiple modules that handle specific operations:
- An extractor module, which connects to the source and extracts new data.
- One or more transformer modules, which apply in-memory transformations to data in order to convert it to the internal format.
- One or more loader modules, which load data into the destination (by default, the TrueSight Capacity Optimization Data Warehouse and a CSV file).
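The composition listed above (one extractor, one or more transformers, one or more loaders) can be sketched as follows. This is a simplified illustration rather than the TrueSight Capacity Optimization ETL Development API: every function and class name, the `host` and `cpu_pct` input columns, and the `CPU_UTIL` metric label are assumptions made up for the example. An in-memory list stands in for the Data Warehouse.

```python
import csv
import io


def extract(source_text):
    """Extractor: read raw records from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(source_text)))


def to_internal_format(rows):
    """Transformer: convert raw records to a made-up internal format."""
    return [
        {
            "entity": row["host"].lower(),
            "metric": "CPU_UTIL",
            "value": float(row["cpu_pct"]) / 100.0,
        }
        for row in rows
    ]


def drop_invalid(rows):
    """Transformer: discard utilization values outside the range 0..1."""
    return [row for row in rows if 0.0 <= row["value"] <= 1.0]


class ListLoader:
    """Loader: in-memory stand-in for the warehouse destination."""

    def __init__(self):
        self.rows = []

    def load(self, rows):
        self.rows.extend(rows)


class CsvLoader:
    """Loader: writes the same rows as CSV text to an in-memory buffer."""

    def __init__(self):
        self.buffer = io.StringIO()

    def load(self, rows):
        writer = csv.DictWriter(self.buffer, fieldnames=["entity", "metric", "value"])
        writer.writeheader()
        writer.writerows(rows)


def run_task(source_text, transformers, loaders):
    """Compose the modules: extract once, transform in order, load everywhere."""
    rows = extract(source_text)
    for transform in transformers:
        rows = transform(rows)
    for loader in loaders:
        loader.load(rows)
    return rows


raw = "host,cpu_pct\nWEB01,42\nDB01,250\n"
warehouse = ListLoader()
csv_out = CsvLoader()
result = run_task(raw, [to_internal_format, drop_invalid], [warehouse, csv_out])
```

Because the transformers and loaders are passed in as lists, a different task can be assembled by changing only those lists, which mirrors the idea of defining the whole process by composing modules through configuration.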
There are various extractor modules bundled with TrueSight Capacity Optimization, organized in macro-categories:
- ETL tasks for third-party management platforms (for example, HP OpenView).
- ETL tasks that can collect data using OS-native tools and commands (for example, UNIX SAR).
- ETL tasks that can import standard log formats (for example, NCSA logs).
- Open ETL tasks, which can import the TrueSight Capacity Optimization Open ETL data format defined in Importing data from custom sources through data formats.
- Custom ETL tasks, which you can develop using the TrueSight Capacity Optimization ETL Development API, as described in Developing.
All available ETL tasks are displayed in the summary table. For more information, refer to Managing ETL and System tasks.