Data collection

Collectors in BMC Helix Intelligent Integrations extract data from sources depending on the following parameters:

  • Data collection schedule
  • Data collection window
  • Data latency

You can set these parameters when you configure a collector for an integration or when you modify the collector later.


Data collection schedule

Collectors extract data based on a schedule, which is known as the data collection schedule. This schedule indicates how often to run the extraction cycle. 

For example, if you set the data collection schedule to 5 mins, the collector runs every 5 mins, as shown in the following image:

You can run the collector in one of the following ways: 

  • Constantly by specifying the schedule in minutes, hours, or days
  • Periodically by specifying the schedule using a cron expression, which is a string of five subexpressions (also called fields) that each describe one detail of the schedule. The fields are separated by white space, and each field can contain any of its allowed values, combined using the characters allowed for that field. Cron expressions are useful when data becomes available at the source with a delay, or when you want to avoid having all extractions query the source at the same time.

    You can specify a cron expression in the following format: Minutes Hours Day of Month Month Day of Week

    The following table shows the allowed values for different fields:

    Field          Allowed values
    Minutes        0-59
    Hours          0-23 (24-hour clock format)
    Day of Month   1-31
    Month          1-12
    Day of Week    0-6 or Sun-Sat

    For example, if you specify 10 14 3 3 *, data is collected at 14:10 on the third day of March.
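The field order and allowed values above can be checked with a few lines of Python. This is only an illustrative sketch: `parse_cron`, `FIELDS`, and `DAY_NAMES` are hypothetical names and not part of BMC Helix Intelligent Integrations, and the sketch handles only `*` and single values, not ranges, lists, or steps.

```python
# Hypothetical helper for illustration; not a BMC Helix Intelligent Integrations API.
# Field order and ranges follow the table above.
FIELDS = [
    ("minutes", 0, 59),
    ("hours", 0, 23),
    ("day_of_month", 1, 31),
    ("month", 1, 12),
    ("day_of_week", 0, 6),
]
DAY_NAMES = {"sun": 0, "mon": 1, "tue": 2, "wed": 3, "thu": 4, "fri": 5, "sat": 6}


def parse_cron(expression):
    """Split a five-field cron expression and validate simple values.

    Only '*' and single values are supported in this sketch.
    """
    parts = expression.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    parsed = {}
    for (name, low, high), token in zip(FIELDS, parts):
        if token == "*":
            parsed[name] = None  # None stands for "any value"
            continue
        if name == "day_of_week" and token.lower() in DAY_NAMES:
            value = DAY_NAMES[token.lower()]
        else:
            value = int(token)  # raises ValueError for unsupported syntax
        if not low <= value <= high:
            raise ValueError(f"{name} must be in {low}-{high}, got {value}")
        parsed[name] = value
    return parsed


schedule = parse_cron("10 14 3 3 *")
print(schedule)
```

For the expression 10 14 3 3 *, the parsed fields are minute 10, hour 14, day of month 3, and month 3, with any day of the week.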

Data collection window

The data collection schedule alone does not specify the range of time over which data should be extracted, because data sources typically use selection criteria such as "before this time" and "between these times". Another parameter, the data collection window, enables you to specify the range of time (window) over which data should be extracted.

For example, if you set the data collection schedule to 5 mins and the data collection window to 5 mins, and the current time is 00:32, data is extracted from 00:27 to 00:32, as shown in the following image:

We recommend that you align, or snap, the extraction time (shown as Now in the preceding image) to one of the major time intervals. Snapping makes it much easier during troubleshooting to see which extracted data sets correspond to which intervals of time. The major intervals depend on the extraction cycle: with 15-min extractions, they are 00:00, 00:15, 00:30, 00:45, and so on; with 1-min extractions, they are 00:00, 00:01, 00:02, 00:03, and so on; and with 5-min extractions, they are 00:00, 00:05, 00:10, and so on, as shown on the preceding timeline.

For example, if we align the extraction time to 00:30, the data time window would be 00:25 to 00:30, as shown in the following image:

In the preceding example, if an extraction cycle starts just after 00:30, BMC Helix Intelligent Integrations first snaps back to 00:30 and then applies the data collection window (5 mins) to arrive at a time window of 00:25-00:30. It then requests data for this time window from the source.
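The snapping arithmetic can be sketched in Python as follows. The function names `snap_to_interval` and `collection_window` are hypothetical and chosen for illustration only, and the sketch assumes that major intervals are counted from midnight, as in the 00:00, 00:05, 00:10 examples. It is not the product's implementation.

```python
from datetime import datetime, timedelta


def snap_to_interval(moment, interval):
    """Round `moment` down to the latest multiple of `interval` since midnight."""
    midnight = moment.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = moment - midnight
    # timedelta // timedelta gives the number of whole intervals elapsed.
    return midnight + (elapsed // interval) * interval


def collection_window(moment, schedule, window):
    """Return (start, end) of the time window requested from the source."""
    end = snap_to_interval(moment, schedule)
    return end - window, end


# An extraction cycle that starts just after 00:30 (the date is arbitrary).
started = datetime(2024, 1, 1, 0, 30, 7)
start, end = collection_window(
    started, schedule=timedelta(minutes=5), window=timedelta(minutes=5)
)
print(start.strftime("%H:%M"), end.strftime("%H:%M"))
```

With a 5-min schedule and a 5-min window, a cycle that starts a few seconds after 00:30 yields the window 00:25 to 00:30, which matches the example above.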

Data latency

The data latency parameter lets you control how far back the query time bounds are shifted. This parameter is useful when data becomes available at the source slowly or late.

In the following example, data latency is set to ~2 mins. An extraction that runs just after 00:30 snaps back to 00:30, the latency shifts the end of the window back to 00:28, and the resulting extraction window is 00:23-00:28.
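As a rough sketch of how the three parameters combine, the following Python computes the query bounds for this example. The function `query_bounds` is a hypothetical name, and the order of operations (snap first, then shift both bounds back by the latency) is an assumption inferred from the 00:23-00:28 result above, not a documented algorithm.

```python
from datetime import datetime, timedelta


def query_bounds(started, schedule, window, latency):
    """Return (start, end) after snapping and applying data latency.

    Assumption: the snapped time is shifted back by `latency`, and the
    window is then measured backward from that shifted time.
    """
    midnight = started.replace(hour=0, minute=0, second=0, microsecond=0)
    # Snap down to the latest schedule boundary since midnight.
    snapped = midnight + ((started - midnight) // schedule) * schedule
    end = snapped - latency
    return end - window, end


# Run just after 00:30 with a 5-min schedule, 5-min window, and 2-min latency.
start, end = query_bounds(
    datetime(2024, 1, 1, 0, 30, 5),
    schedule=timedelta(minutes=5),
    window=timedelta(minutes=5),
    latency=timedelta(minutes=2),
)
print(start.strftime("%H:%M"), end.strftime("%H:%M"))
```

Setting the latency to zero in this sketch gives the unshifted 00:25 to 00:30 window from the data collection window example.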
