Recovering data using the vis files parser

This section provides information about how to configure a recovery ETL. A recovery ETL is used when you want to recover data and import it into BMC TrueSight Capacity Optimization. For more information about the scripts that can be used for recovery, see Creating recovery scripts for TrueSight Capacity Optimization and Gateway Server.

To configure the Recovery ETL

The following steps describe the content of the tabs and configuration settings; items that are available only in advanced mode are marked as (Advanced). Click Advanced to view the advanced settings available for this ETL.

To configure the recovery ETL, follow these steps:

  1. Navigate to Administration > ETL & SYSTEM TASKS > ETL tasks.
  2. In the ETL tasks page, select Add > Add ETL under the Last run tab.
  3. Enter the following details in the Run configuration tab.
    • Run Configuration name: Enter a name for the ETL. For instance, Gateway Server Data Recovery.
    • Environment: Select environment as Production or Test.
    • Description: Brief description for the ETL.
    • Log Level: Select the level of detail for the execution log (for example, 1 - Light, 5 - Medium, 10 - Verbose).
    • Execute in simulation mode: Select yes if you want to test the data import; the ETL will not store actual data in the data warehouse.
    • ETL module: Select BMC - TrueSight Capacity Optimization Gateway VIS files parser.
    • (Advanced) Datasets: Select the datasets from the list.
    • Platform: Select the platform that the ETL will recover. It should be the same platform as the data that you want to recover.
    • (Advanced) Enable platform filtering: Select either yes or no. By default, yes is selected.
    • Data type: Select the data types you want to recover.
  4. In the Entity catalog tab, select SHARED to share the lookup with the ETL that is importing online data. If you have a master lookup table that is handling all the Gateway Server ETLs, share it with that ETL.

    You must have one recovery ETL instance per BMC TrueSight Capacity Optimization domain/Gateway Server visualization platform.
  5. In the Object relationships tab, select the existing domain that you are trying to recover.
  6. In the (Advanced) Metric filter tab, select the metrics that should be loaded for each of the selected datasets.
  7. In the Gateway Server Vis parser tab, specify the following details:
    • Gateway Server Version: Select the version you are trying to recover.
    • Extractor mode: Select via file.
    • (Advanced) Lookup names customization: Select the name customization for hosts and virtual machines.
    • (Advanced) Use network name as system name for VMware virtual machines: Select whether to use the network name or the system name for VMware virtual machines.
    • (Advanced) Data recovery mode: Select True to activate data recovery mode.
  8. In the File location tab, select the transfer method and required details. The selected directory must be shared between the recovery script and ETL.

    BMC Capacity Optimization can access data files by means of:

    • Local directory
    • Windows share

      Note

      The Windows share option uses sudo mount to mount the file system on the BMC Capacity Optimization server. Only sudo is supported as an su-type command alternative; BMC Capacity Optimization does not support sesu. Also note that the cpit user must be able to use sudo to gain root privileges, because mount requires root access (see the example sudoers entry after this list).

    • FTP
    • SFTP
    • SCP
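
    For illustration only, here is a minimal sketch of a sudoers rule that would let an account named cpit (an assumed user name; substitute the actual BMC Capacity Optimization service account) run the mount and umount commands as root without a password:

      # Hypothetical /etc/sudoers.d entry; adjust the user name and command paths for your system
      cpit ALL=(root) NOPASSWD: /bin/mount, /bin/umount

    The exact command paths and password policy depend on your environment; review any sudo change with your system administrator.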

    Each connection method has slightly different parameters; the connection modes for which a parameter is available are listed in parentheses:

    • File location: The method that will be used to retrieve the file.
    • Directory (local directory, FTP, SFTP, SCP): The directory containing the file.
    • Directory UNC full path (Windows share): The network path to the directory, consisting of three parts: server name, share name and (optional) file path, combined using backslashes; for example, \\server_name\share_name\directory_path.
    • Files to copy (with wildcards) (SFTP, SCP): Before parsing, the sftp and scp commands need to make a local temporary copy of the files; this setting specifies which files in the remote directory should be imported.
    • File list pattern: A regular expression that defines which data files should be read; the default value is (?<!done)$, which matches every file whose name does not end with the string done (for example, a file named my_file_source.done is skipped); see the sketch after this list.
    • Recurse into subdirs?: Set this parameter to "yes" if you want BMC Capacity Optimization to also inspect the target subdirectories.
    • After parse operation: Choose what to do after a file has been imported; the available options are:
      • Suffix to the parsed file.
      • Archive the parsed file in a directory.
      • Do nothing.
    • Parsed file suffix: The suffix that will be appended to parsed files; the default is .done.
    • Archive directory: The directory where the parsed file will be archived; default is $CPITBASE/repository/imprepository.
    • Remote host (FTP, SFTP, SCP): The network address of the target server.
    • Username (Windows share, FTP, SFTP, SCP): The authentication username.
    • Password required (Windows share, FTP, SFTP, SCP): Specify if a password is needed ("yes" or "no").
    • Password (Windows share, FTP, SFTP, SCP): The authentication password.
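
    To illustrate how the File list pattern works, the following Python sketch (the file names are hypothetical) shows that the default expression (?<!done)$ matches names that do not end with the string done:

      import re

      # Default File list pattern: a negative lookbehind that rejects names ending in "done"
      pattern = re.compile(r'(?<!done)$')

      for name in ['vis_data_20230731.tar.gz', 'my_file_source.done']:  # hypothetical file names
          matched = pattern.search(name) is not None
          print(name, '-> parsed' if matched else '-> skipped')

      # vis_data_20230731.tar.gz -> parsed
      # my_file_source.done -> skipped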

    The available file retrieval methods work in different ways:

    • Windows share, local directory and FTP:
      1. Access a remote directory.
      2. Get the list of files.
      3. Parse each file whose name corresponds to the File list pattern.
      4. Depending on the After parse operation setting, rename the file appending the chosen suffix (Parsed file suffix), move it to a local archive directory (Archive directory), or do nothing.
    • SCP and SFTP:
      1. Create a local temporary directory in $CPITBASE/etl/temp.
      2. Copy the remote files that match the Files to copy (with wildcards) pattern to the new directory.
      3. Of all the copied files, only parse those that correspond to the File list pattern.
      4. Delete the temporary directory and its contents.

    In the latter case, it is your responsibility to name the files so that no data is mistakenly duplicated. You can include the creation date in the file name and use it as a discriminant.

    An example source file could be an Apache log that resides on a remote server:

    1. BMC Capacity Optimization accesses the remote directory /var/log/apache2 using scp.
    2. You can instruct scp to only copy the files that match the name pattern specified in the Files to copy (with wildcards) parameter:

      *_access.log*
      

      that is, all the files whose name contains the string _access.log

    3. You can also instruct BMC Capacity Optimization to only parse the logs that were created the previous day, using the File list pattern setting:

      *.log.%YESTY%YESTM%YESTD$
      

      where %YESTY, %YESTM and %YESTD are macros that represent, respectively, yesterday's year, month and day (i.e. yesterday's date).
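
      As a rough illustration, the following Python sketch expands yesterday's date into such a pattern; the zero-padded format shown here is an assumption for illustration, not the product's documented expansion:

        from datetime import date, timedelta

        # Hypothetical expansion of the %YESTY / %YESTM / %YESTD macros
        yesterday = date.today() - timedelta(days=1)
        macros = {
            '%YESTY': f'{yesterday.year:04d}',   # yesterday's year,  e.g. 2023
            '%YESTM': f'{yesterday.month:02d}',  # yesterday's month, e.g. 07
            '%YESTD': f'{yesterday.day:02d}',    # yesterday's day,   e.g. 31
        }

        pattern = '*.log.%YESTY%YESTM%YESTD$'
        for macro, value in macros.items():
            pattern = pattern.replace(macro, value)

        print(pattern)  # for example: *.log.20230731$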


  9. In the (Advanced) Loader configuration tab, specify the following details.
    • Empty dataset behaviour: What to do when no data is available (available options are Abort or Ignore).
    • ETL log file name: Name of the file that contains the ETL execution log; the default value is $CPITBASE/log/%AYEAR%AMONTH%ADAY%AHOUR%MINUTES%SRCID.
    • Maximum number of rows for CSV output: A number which limits the size of the output files.
    • CSV loader output file name: Name of the file generated by the CSV loader; the default value is $CPITBASE/output/%AYEAR%AMONTH%ADAY%AHOUR%ZPROG%DSCD%SRCID.
    • BCO loader output file name: Name of the file generated by the BMC Capacity Optimization loader; the default value is $CPITBASE/output/%AYEAR%AMONTH%ADAY%AHOUR%ZPROG%DSCD%SRCID.
    • Detail mode: Available options are Standard, Raw also, or Raw only.
    • Reduce priority: Available options are Normal or High.
    • Remove domain suffix from data source name: (Only for systems) if true, the domain name is removed from the data source name; e.g. server.domain.com will be saved as server.
    • Leave domain suffix to system name: (Only for systems) if true, the domain name is maintained in the system name; for example, server.domain.com will be saved as such.
    • Update grouping object definition: If this option is selected, the ETL will be allowed to update the grouping object definition for a metric loaded by an ETL.
    • Skip entity creation: (Only for ETL tasks sharing lookup with other tasks) If this option is selected, this ETL does not create an entity, and discards data from its data source for entities not found in BMC Capacity Optimization. It uses one of the other ETLs that share lookup to create the new entity.
  10. In the (Advanced) Scheduling options tab, specify the following details to schedule the recovery ETL (a sketch of how the masks constrain execution appears after this list).
    • Hour mask: Specify the hours of the day during which the recovery ETL is allowed to run.
    • Day of week mask: Select the days of the week on which the recovery ETL is allowed to run.
    • Day of month mask: Specify the days of the month on which the recovery ETL is allowed to run.
    • Apply mask validation: Select true or false to enable or disable validation of the scheduling masks.
    • Execute after time: Specify the time after which the recovery ETL can be executed.
    • Enqueueable: Select whether the recovery ETL can be queued for execution if it cannot run immediately.
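
    As a rough sketch of how these masks constrain execution (the set-based representation below is an assumption for illustration, not the product's internal format), a run is allowed only when the current time satisfies every mask:

      from datetime import datetime

      # Hypothetical mask values: allowed hours, weekdays (0 = Monday), and days of the month
      hour_mask = {1, 2, 3}                  # run only between 01:00 and 03:59
      day_of_week_mask = {5, 6}              # run only on Saturday and Sunday
      day_of_month_mask = set(range(1, 32))  # any day of the month

      def run_allowed(now: datetime) -> bool:
          """Return True when the timestamp satisfies all scheduling masks."""
          return (now.hour in hour_mask
                  and now.weekday() in day_of_week_mask
                  and now.day in day_of_month_mask)

      print(run_allowed(datetime.now()))
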
  11. In the ETL task properties tab, specify the following details.
    • Task group: Select the task group from the Task group drop-down list.
    • Running on scheduler: By default, Primary Scheduler is selected.
    • Maximum execution time before warning: Set the maximum execution time before a warning is raised, in minutes, hours, or days.
    • Frequency: By default, Predefined is selected as the frequency. If you select Custom, you must also set Custom frequency and Custom start timestamp.
    • Predefined frequency: Select the predefined frequency: day, week, or month.
    • Start timestamp: Specify the start timestamp in hours and minutes.
  12. Select Save to apply the changes.