Generic - Custom structure format CSV parser


Use the Custom structure format CSV parser to import custom structure data inside a buffer table. You can import any data in buffer tables without requiring you to create metrics as these tables are custom-defined. For details, see Collecting-data-for-custom-structure-tables.

Before you begin

Before you integrate BMC Helix Continuous Optimization with the Custom structure format CSV parser, refer to Collecting-data-for-custom-structure-tables.

To collect data by using the Custom structure format CSV parser

To collect data by using the Custom structure format CSV parser, do the following tasks:

  1. Navigate to Administration > ETL & SYSTEM TASKS > ETL tasks.
  2. In the ETL tasks page, click Add > Add ETL under the Last run tab.
  3. In the Add ETL page, set values for the following properties under each expandable tab.

    Important

    Basic properties are displayed by default on the Add ETL page. These are the most common properties that you can set for an ETL, and it is acceptable to leave the default selections for each as is.

    Basic properties

    Property

    Description

    Run configuration

    ETL task name

    The default name is already filled out for you.

    ETL module

    Select Generic - Custom structure format CSV parser.

    Module description

    A link that points you to the technical documentation for this ETL.

    Execute in simulation mode

    By default, Yes is selected. Use this to validate the connectivity between the ETL engine and the target and to make sure that the ETL does not have any other configuration issues. This option is useful while testing a new ETL task.

    Set this property to No after the initial testing. 

    Datasets

    1. Click Edit.
    2. Select one (click) or more (shift+click) datasets that you want to include from Available datasets and click >> to move them to Selected datasets.
    3. Click Apply.

    Important: Make sure that you select the CST (Custom Structure Tables) dataset to import the buffer table data, as only the CST dataset is supported. For details, see Overview-of-datasets-in-an-ETL-task and Dataset-reference-for-ETL-tasks.

    Custom structure data

    Truncate table before load

    Select the appropriate option:

    • Yes: Truncate the table before the data is loaded.
    • No: Do not truncate the table before loading.

    Buffer table name

    Specify the buffer table identifier that you received while creating the buffer table by using the data mart API. For details, see Datamart-API-endpoints.

    Important: The BUF_ prefix is automatically added to the table name if it is missing.

    Key columns

    Specify the key column information. The key columns are required if primary keys are not set already and you want to apply the Delete or Update behavior. 

    Column types

    Enter the column type of your choice.

    Behavior

    Select one of the following options:

    • Append: Add data to the buffer table
    • Delete: Delete the rows (or records) from the buffer table
    • Update: Update existing data in the buffer table
    • Truncate: Delete rows from the buffer table and reset the primary key counter to zero

    File location

    File location

    Select any one of the following methods to retrieve the CSV file. For details on the file format, see Input file format.

    • Local directory: Specify a path on your local computer where the CSV file resides.
    • Windows share: Specify the Windows share path name where the CSV file resides.
    • FTP: Specify the FTP path name where the CSV file resides.
    • SCP: Specify the SCP path name where the CSV file resides.
    • SFTP: Specify the SFTP path name where the CSV file resides.

    Important: The headers in the input file take precedence over the ETL configuration.

    Directory

    Path of the directory that contains the CSV file. 

    Important: By default, this ETL supports semicolon-separated CSV files. To use a different separator, on the Additional properties tab, add the extract.default.separator property and set the value as the required separator. For details, see the Advanced tab. 

    Network Share Path (Displayed for Windows share)

    The full UNC (Universal Naming Convention) address. For example: //hostname/sharedfolder

    Files to copy (with wildcards) 

    (Displayed for SCP)

    Before parsing, the SFTP and SCP commands must make a local temporary copy of the files; this setting specifies which files in the remote directory are imported.

    File list pattern

    A regular expression that defines which data files are read. The default value is (?<!done)$, which tells the ETL to read every file whose name does not end with the string "done". For example, a file named my_file_source.done is skipped.
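    As an illustration, the default pattern can be exercised with Python's re module; the filenames below are hypothetical examples, not values from the product:

    ```python
    import re

    # Default pattern: match any filename that does NOT end with "done"
    # (a negative lookbehind anchored at the end of the string).
    pattern = re.compile(r"(?<!done)$")

    files = ["metrics.csv", "my_file_source.done", "data_done"]  # hypothetical names
    to_parse = [f for f in files if pattern.search(f)]
    print(to_parse)  # → ['metrics.csv']
    ```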

    Recurse into subdirs?

    Select Yes or No. When set to Yes, BMC Helix Continuous Optimization also inspects the subdirectories of the target directories.

    After parse operation

    Choose what to do after the CSV file has been imported. The available options are:

    • Do nothing: Do nothing after import.
    • Append suffix to parsed file: Append a suffix you add here to the imported CSV file. For example, _done or _imported.
      • Parsed files suffix: The suffix that will be appended to parsed files. The default value is done.
    • Archive parsed file in directory: Archive the parsed file in the specified directory.
      • Archive directory (local): The default archive directory path is filled out for you. For example, %BASE/../repository/imprepository
      • Compress archived files: Select True or False.
    • Archive bad files in directory: Archive erroneous files in the specified directory.
      • Archive directory (local): The default archive directory path is filled out for you. For example, %BASE/../repository/imprepository
      • Compress archived files: Select True or False.

    Remote host (Displayed for FTP, SFTP, SCP)

    Enter the name or address of the remote host to connect to.

    Username (Displayed for Windows share, FTP, SFTP, SCP)

    Enter the username to connect to the file location server.

    Password required (Displayed for Windows share, FTP, SFTP, SCP)

    Select Yes or No.

    Password (Displayed for Windows share, FTP, SFTP, SCP)

    Enter a password to connect to the file location server. Applicable if you selected Yes for Password required.

    ETL task properties

    Task group

    Select a task group to classify this ETL into.

    Running on scheduler

    Select a scheduler for running the ETL. For cloud ETLs, use the scheduler that is preconfigured in Helix. For on-premises ETLs, use the scheduler that runs on the Remote ETL Engine.

    Maximum execution time before warning

    The number of hours, minutes, or days to run the ETL before generating warnings, if any.

    Frequency

    Select the frequency of ETL execution. Available options are:

    • Predefined: Select a Predefined frequency from Each Day, Each Week or Each Month.
    • Custom: Enter a Custom frequency (time interval) as the number of minutes, hours, days, or weeks to run the ETL.

    Start timestamp: hour\minute (Applies to Predefined frequency)

    The HH:MM start timestamp to add to the ETL execution running on a Predefined frequency.

    Important

    To view or configure Advanced properties, click Advanced. You do not need to set or modify these properties unless you want to change how the ETL works. These properties are for advanced users and scenarios only.

    Advanced properties

    Property

    Description

    Run configuration

    Run configuration name

    Default name is already filled out for you.

    Deploy status

    Select Production.

    Description

    (Optional) Enter a brief description.

    Log level

    Select how detailed you want the log to be:

    • 1 - Light: Add bare minimum activity logs to the log file.
    • 5 - Medium: Add medium-detailed activity logs to the log file.
    • 10 - Verbose: Add detailed activity logs to the log file.

    File location

    Subdirectories to exclude (separated by ';' ) (Local directory)

    Names of subdirectories to exclude from parsing.

    Input file external validator (Local directory, Windows share, FTP)

    Select any one of the following options:

    • No external validation: Do not use external validation of the CSV file structure.
    • Use external validation script: Use the following script to validate the CSV file:
      • Script to execute: Specify the validation script to use to validate the input file.

    Additional properties

    List of properties

    1. Click Add.
    2. Add an additional property in the etl.additional.prop.n box.
    3. Click Apply.
      Repeat this task to add more properties.
       
      This ETL supports semicolon-separated CSV files by default. Use the extract.default.separator property to set a different separator, such as a comma.
      When you run the ETL at log level 5 and set the separator by using this property, the following log message confirms that the default separator has been successfully updated:
       Using configured separator [<separator>]
      where <separator> represents the value you specified in the property, such as a comma or any other character, indicating that the ETL is using your chosen separator rather than the default one.

    Loader configuration

    Empty dataset behavior

    Select one of the following actions if the loader encounters an empty dataset:

    • Warn: Generate a warning about the empty dataset.
    • Ignore: Ignore the empty dataset and continue parsing.

    Maximum number of rows for CSV output

    A number that limits the size of the output files.

    Remove domain suffix from data source name

    (Only for systems) If set to True, the domain name is removed from the data source name. For example, server.domain.com will be saved as server.

    Leave domain suffix to system name

    (Only for systems) If set to True, the domain name is maintained in the system name. For example: server.domain.com will be saved as it is.

    Skip entity creation

    (Only for ETL tasks sharing lookup with other tasks) If set to True, this ETL does not create an entity, and discards data from its data source for entities not found in BMC Helix Continuous Optimization. It uses one of the other ETLs that share lookup to create the new entity.

    Scheduling options

    Hour mask

    Specify a value to execute the task only during particular hours within the day. For example, 0–23 or 1, 3, 5–12.

    Day of week mask

    Select the days so that the task can be executed only during the selected days of the week. To avoid setting this filter, do not select any option for this field.

    Day of month mask

    Specify a value to execute the task only during particular days within a month. For example, 5, 9, 18, 27–31.

    Apply mask validation

    By default this property is set to True. Set it to False if you want to disable the preceding Scheduling options that you specified. Setting it to False is useful if you want to temporarily turn off the mask validation without removing any values.

    Execute after time

    Specify a value in the hours:minutes format (for example, 05:00 or 16:00) to wait before the task must be executed. This means that once the task is scheduled, the task execution starts only after the specified time passes.

    Enqueueable

    Select one of the following options:

    • False (Default): If a new execution command arrives while the task is already running, the command is ignored.
    • True: If a new execution command arrives while the task is already running, the command is placed in a queue and executed as soon as the current execution ends.
  4. Click Save.
    You return to the Last run tab under the ETL tasks page.
  5. Validate the results in simulation mode: In the ETL tasks table under ETL tasks > Last run, locate your ETL (ETL task name) and click the run icon to run the ETL.
    After you run the ETL, the Last exit column in the ETL tasks table will display one of the following values:
    • OK: The ETL executed without any error in simulation mode.
    • WARNING: The ETL execution returned some warnings in simulation mode. Check the ETL log.
    • ERROR: The ETL execution returned errors and was unsuccessful. Edit the active Run configuration and try again.
  6. Switch the ETL to production mode by performing the following steps:
    1. In the ETL tasks table under ETL tasks > Last run, click the ETL under the Name column.
    2. In the Run configurations table on the ETL details page, click the edit icon to edit the active run configuration.
    3. In the Edit run configuration page, expand Run configuration and set Execute in simulation mode to No.
    4. Click Save.
  7. Locate the ETL in the ETL tasks table and click the run icon to run it, or schedule an ETL run.
    After you run the ETL or schedule it for a run, it extracts the data from the source and transfers it to the BMC Helix Continuous Optimization database.

Input file format

The input file follows the open ETL structure.

Here is a sample of the supported file structure:

#TABLENAME=BUF_table_test
#BEHAVIOUR=APPEND
#KEYCOLUMN=VAL1;VAL2
#COLUMNTYPES=NUMBER;NUMBER;VARCHAR(50)
VAL1;VAL2;NAME
1;2;value1
1;3;value2
1;4;value3
6;6;valueUpdated

Select one of the APPEND, UPDATE, or TRUNCATE behaviors to import data in the buffer table. If data is present in the input file and you select the TRUNCATE behavior, the table will be emptied out before loading new data. This is the same as selecting APPEND behavior and enabling the Truncate table before load option in the ETL configuration.
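The sample file above can be produced programmatically; here is a minimal Python sketch that writes the header directives followed by the semicolon-separated header row and data rows (the table name, columns, and values are hypothetical examples):

```python
# Minimal sketch: write an input file in the format shown above.
# Table name, columns, and rows are hypothetical examples.
header_directives = [
    "#TABLENAME=BUF_table_test",
    "#BEHAVIOUR=APPEND",
    "#KEYCOLUMN=VAL1;VAL2",
    "#COLUMNTYPES=NUMBER;NUMBER;VARCHAR(50)",
]
columns = ["VAL1", "VAL2", "NAME"]
rows = [(1, 2, "value1"), (1, 3, "value2")]

with open("custom_structure_input.csv", "w", encoding="utf-8") as f:
    for line in header_directives:
        f.write(line + "\n")
    f.write(";".join(columns) + "\n")
    for row in rows:
        f.write(";".join(str(v) for v in row) + "\n")
```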

The specified buffer table name will be checked in the BMC Helix Continuous Optimization schema. If the table is not present, create the table by using the data mart API. For details, see Datamart-API-endpoints.