
The "Moviri – Splunk Generic Extractor" connector imports almost any Business Driver or System metric contained in your Splunk instance that is not specifically mapped by the other Splunk connectors provided by Moviri.
It works in a similar fashion to the built-in 'Generic' connectors available in BMC TrueSight Capacity Optimization. Its main use case is the analysis of custom metrics from a capacity management perspective, for example:

  • Baselining
  • Historical analysis, seasonality identification
  • Trending and forecasting
  • Correlation with other metrics (especially infrastructure utilization) already present in BMC TrueSight Capacity Optimization (or themselves imported from Splunk) to enable capacity modeling and what-if scenarios


In order to do so it must be provided with:

  • A Splunk search query for retrieving results
  • A mapping of the query results into the BMC TrueSight Capacity Optimization data model


At each execution, the connector:

  • Executes the provided search query (this step can be skipped for a Splunk saved search)
  • Retrieves the result set of the query
  • Transforms the result set according to the specified mapping, producing BMC TrueSight Capacity Optimization Datasets
  • Loads the Datasets into CO
  • Stores the most recent timestamp, to be used as the lower time boundary in the next execution
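
The execution cycle above can be sketched as follows. This is only an illustration, not the connector's implementation: `run_once`, `fake_search` and the in-memory stand-ins for Splunk and CO are hypothetical names, and treating the stored timestamp as an exclusive lower bound is an assumption.

```python
from datetime import datetime
from typing import Callable, Dict, List, Optional

Row = Dict[str, str]
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"  # format used by the sample tables on this page


def parse_ts(text: str) -> datetime:
    return datetime.strptime(text, TS_FORMAT)


def run_once(
    search: Callable[[Optional[datetime]], List[Row]],
    transform: Callable[[List[Row]], List[Row]],
    load: Callable[[List[Row]], None],
    last_ts: Optional[datetime],
    ts_column: str = "_time",
) -> Optional[datetime]:
    """Run one hypothetical execution and return the new lower time boundary."""
    rows = search(last_ts)        # execute the query and retrieve its result set
    if not rows:
        return last_ts            # nothing new: keep the previous boundary
    load(transform(rows))         # apply the mapping, then load the datasets
    return max(parse_ts(r[ts_column]) for r in rows)  # most recent timestamp seen


# In-memory stand-ins for Splunk and CO (illustrative only).
SPLUNK_ROWS = [
    {"_time": "2013-06-05T00:00:00.000+0200", "Logins_on_sysA": "28"},
    {"_time": "2013-06-06T00:00:00.000+0200", "Logins_on_sysA": "24"},
]
loaded: List[Row] = []


def fake_search(since: Optional[datetime]) -> List[Row]:
    # Assumption: the stored timestamp is an exclusive lower bound.
    return [r for r in SPLUNK_ROWS if since is None or parse_ts(r["_time"]) > since]


first = run_once(fake_search, lambda rows: rows, loaded.extend, None)
second = run_once(fake_search, lambda rows: rows, loaded.extend, first)
```

In this sketch the second run finds no records newer than the stored timestamp, so nothing is loaded twice and the boundary is left unchanged.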

How to specify the search query for retrieving results

Two options are available for specifying the search query:

  1. A Splunk "saved search": a ready-to-use search saved on the Splunk instance, which can retain the results of previous executions for a configurable retention period and/or be scheduled for periodic execution. Additionally, saved searches can be "accelerated" for faster execution. "Moviri Integration for BMC TrueSight Capacity Optimization – Splunk (Generic)" can exploit Splunk saved search functionality to:
    • define queries on the Splunk side, thus avoiding the management and maintenance of search query syntax on the BMC TrueSight Capacity Optimization side
    • rely on already available results produced by previous on-demand or scheduled executions, and only optionally execute the saved search if the need arises
    • take advantage of saved search acceleration
  2. A search query text: the search is saved in the connector configuration properties and passed to the Splunk instance at each connector execution
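
For saved searches, the "Saved Search Mode" property described in the configuration table offers three independent options. The sketch below shows one plausible way they could combine. The option texts and defaults come from this page, but the class, the function, and above all the precedence between the options are assumptions, not the connector's documented logic.

```python
from dataclasses import dataclass


@dataclass
class SavedSearchMode:
    # Defaults follow the "Default" column of the configuration table.
    only_if_scheduled: bool = True      # "Import data only if search is scheduled"
    use_existing_results: bool = True   # "Import data from existing results"
    execute_if_needed: bool = False     # "Execute search to look for new data"


def plan(mode: SavedSearchMode, is_scheduled: bool,
         has_existing_results: bool, results_are_recent: bool) -> str:
    """Return 'skip', 'import_existing' or 'execute' (hypothetical action names)."""
    if mode.only_if_scheduled and not is_scheduled:
        return "skip"                   # unscheduled search: extract nothing
    if mode.use_existing_results and has_existing_results:
        if results_are_recent or not mode.execute_if_needed:
            return "import_existing"
    if mode.execute_if_needed:
        return "execute"                # results unusable or not recent enough
    return "skip"


default_mode = SavedSearchMode()
with_execution = SavedSearchMode(execute_if_needed=True)
```

With the default options, a scheduled search with retained results is imported from those results, while an unscheduled search yields no data.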

 
In both cases search results must meet the following requirements:

  • Results must come in a tabular format comprising multiple fields
  • Each record of the results table must refer to the same aggregated period of time (e.g. one record for each hour, one record for each day…)
  • A timestamp field must be present in order to identify the beginning of the time period (e.g. "2013-05-24 13:00:00" for hourly granularity, "2013-06-03 00:00:00" for daily granularity). Splunk has an implicit "_time" field for each event it processes, so the natural choice is to use it as the timestamp.
  • At least one value field must be present containing the data series to be transferred

The table below presents a minimal result set that conforms to the above requirements: the daily number of logins on a system.

| _time | Logins_on_sysA |
|---|---|
| 2013-06-05T00:00:00.000+0200 | 28 |
| 2013-06-06T00:00:00.000+0200 | 24 |
| 2013-06-07T00:00:00.000+0200 | 25 |
| 2013-06-08T00:00:00.000+0200 | 1 |
| 2013-06-09T00:00:00.000+0200 | 1 |
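
The requirements above can be checked before configuring the connector. The following is a minimal sketch, assuming the result set has been exported as a list of dictionaries with timestamps in the format shown in the table. `check_result_set` is a hypothetical helper, not part of the connector, and it ignores daylight-saving shifts.

```python
from datetime import datetime, timedelta
from typing import Dict, List

TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def check_result_set(rows: List[Dict[str, str]], ts_column: str = "_time",
                     step: timedelta = timedelta(days=1)) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    if not rows:
        return ["result set is empty"]
    columns = set(rows[0])
    if any(set(r) != columns for r in rows):
        return ["records do not share the same fields"]       # not tabular
    if ts_column not in columns:
        return [f"missing timestamp field '{ts_column}'"]
    problems = []
    if len(columns) < 2:
        problems.append("no value field besides the timestamp")
    # All records must refer to the same aggregation period: the gap between
    # consecutive distinct timestamps must be a multiple of the expected step.
    stamps = sorted({datetime.strptime(r[ts_column], TS_FORMAT) for r in rows})
    if any((b - a) % step for a, b in zip(stamps, stamps[1:])):
        problems.append("timestamps are not aligned to the aggregation period")
    return problems


MINIMAL = [
    {"_time": "2013-06-05T00:00:00.000+0200", "Logins_on_sysA": "28"},
    {"_time": "2013-06-06T00:00:00.000+0200", "Logins_on_sysA": "24"},
    {"_time": "2013-06-07T00:00:00.000+0200", "Logins_on_sysA": "25"},
]
problems = check_result_set(MINIMAL)
```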

 

How to select the appropriate dataset

When creating an ETL task that uses the 'Moviri – Splunk Generic Extractor', the first configuration step is associating one or more datasets to the ETL task.

  • Select 'WKLDAT' if the task is going to import 'Business Drivers'
  • Select 'SYSDAT' if the task is going to import 'System' data
  • Select 'APPDAT' if the task is going to import Application configuration. This case is not covered in this document; please refer to support for more information if required

 


How to map search query results to BMC TrueSight Capacity Optimization data model

The most important configuration of the connector is the definition of how to map each time series, extracted from Splunk, to

  • Entities (either Business Drivers or Systems);
  • Metrics (also referred to as resources);
  • Metric Subobjects (also referred to as subresources).

These mappings are very similar to the ones required by the built-in Generic - Database extractor:

  • Entities:
    • For Business Drivers (WKLDAT dataset) this is the equivalent of the DS_WKLDNM column, representing the "Business driver lookup identifier"
    • For Systems (SYSDAT dataset) this is the equivalent of the DS_SYSNM column, representing the "System lookup identifier"
  • Metrics (also referred to as resources)
    • Equivalent of the OBJNM column
  • Metric Subobjects (also referred to as subresources)
    • Equivalent of the SUBOBJNM column
    • Required when the metric is not of type 'GLOBAL', for example for every metric that has a sub-category dimension

For any time series, the connector allows the following mapping options for the three dimensions mentioned above:

  • Entities
    • Use the name of the Splunk search query column containing the series values
    • Input a fixed string
    • Use values from another Splunk search query column to identify the entity name
  • Metric (or resource)
    • Select among some proposed generic metrics (applicable to Business Driver entities)
    • Input a specific metric (Advanced)
  • Metric Subobject (or subresource), when the metric is not of type GLOBAL
    • Input a fixed string
    • Use values from another Splunk search query column to identify the subobject
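
One way to reason about these options is to write a mapping down as a small data structure and validate it. The sketch below is purely illustrative: the class names, the `kind` labels and the `metric_is_global` flag are hypothetical and do not correspond to the connector's internal model.

```python
from dataclasses import dataclass
from typing import List, Optional

ENTITY_KINDS = {"column_name", "fixed", "query_column"}
SUBOBJECT_KINDS = {"fixed", "query_column"}


@dataclass(frozen=True)
class NameSource:
    kind: str                    # one of the kinds listed above
    value: Optional[str] = None  # fixed string or query column; unused for "column_name"


@dataclass(frozen=True)
class SeriesMapping:
    value_column: str            # column holding the series values
    entity: NameSource
    metric: str                  # e.g. "TOTAL_EVENTS"
    metric_is_global: bool = True
    subobject: Optional[NameSource] = None

    def problems(self) -> List[str]:
        found = []
        if self.entity.kind not in ENTITY_KINDS:
            found.append("unknown entity option")
        elif self.entity.kind != "column_name" and not self.entity.value:
            found.append("entity needs a fixed string or a query column")
        if not self.metric_is_global:
            # A subobject is required only when the metric is not GLOBAL.
            if self.subobject is None:
                found.append("non-GLOBAL metric needs a subobject")
            elif self.subobject.kind not in SUBOBJECT_KINDS or not self.subobject.value:
                found.append("subobject must be a fixed string or a query column")
        return found


global_mapping = SeriesMapping("svcA", NameSource("column_name"), "TOTAL_EVENTS")
incomplete = SeriesMapping("Avg Parallelism", NameSource("fixed", "Active Jobs"),
                           "BYSET_EVENTS_CURRENT", metric_is_global=False)
```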

To facilitate the application of these principles, the remainder of this section presents some examples. Each example includes the Splunk search query result set, the Splunk-to-BMC TrueSight Capacity Optimization mappings, the resulting BMC TrueSight Capacity Optimization series and a screenshot of the relevant configuration properties from the "Splunk – Query and Mapping" ETL task configuration tab.


EXAMPLE 1: number of daily logins on two services

 

| _time | svcA | svcB |
|---|---|---|
| 2013-06-05T00:00:00.000+0200 | 28 | 12 |
| 2013-06-06T00:00:00.000+0200 | 24 | 32 |
| 2013-06-07T00:00:00.000+0200 | 25 | 11 |
| 2013-06-08T00:00:00.000+0200 | 1 | 4 |
| 2013-06-09T00:00:00.000+0200 | 1 | 3 |


In this example, the 'Saved Search' query on Splunk returns the number of daily logins on two services ('svcA' and 'svcB') in two different columns, and we need to import these metrics into CO as two separate business drivers.

  • Select 'WKLDAT' as dataset, as this data is related to Business Drivers
  • As 'Entity', we specify to use the name of the column (so that business drivers 'svcA' and 'svcB' will be created)
  • As 'Metric', the number of daily logins can be mapped to "a count of events/items/operations over time" (TOTAL_EVENTS), since it fits that metric's description
  • As 'Metric subresource', nothing needs to be specified: the selected metric is global and does not require a 'subobject'


This is the final mapping:

| Query Column | Mapping | Resulting Entity | Resulting Metric | Resulting Subresource |
|---|---|---|---|---|
| svcA | Entity: use the name of the query column; Metric: "a count of events/items/operations over time"; Subresource: no need to specify as the metric is global | svcA | TOTAL_EVENTS | GLOBAL |
| svcB | Entity: use the name of the query column; Metric: "a count of events/items/operations over time"; Subresource: no need to specify as the metric is global | svcB | TOTAL_EVENTS | GLOBAL |
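
The effect of this mapping can be illustrated with a short sketch that turns the wide result set into one series per value column. `map_by_column_name` is a hypothetical helper written for this page; it only mimics the outcome of the mapping and is not the connector's code.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]
SeriesKey = Tuple[str, str, str]   # (entity, metric, subresource)
Sample = Tuple[str, float]         # (timestamp, value)


def map_by_column_name(rows: List[Row], value_columns: List[str],
                       metric: str = "TOTAL_EVENTS",
                       ts_column: str = "_time") -> Dict[SeriesKey, List[Sample]]:
    """Use each value column's name as the entity; the metric is global."""
    series: Dict[SeriesKey, List[Sample]] = {}
    for row in rows:
        for column in value_columns:
            key = (column, metric, "GLOBAL")
            series.setdefault(key, []).append((row[ts_column], float(row[column])))
    return series


ROWS = [
    {"_time": "2013-06-05T00:00:00.000+0200", "svcA": "28", "svcB": "12"},
    {"_time": "2013-06-06T00:00:00.000+0200", "svcA": "24", "svcB": "32"},
]
# "Value Columns to import" is a semicolon-separated list.
series = map_by_column_name(ROWS, "svcA;svcB".split(";"))
```

Each key of `series` corresponds to one row of the "resulting series" columns in the table above.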



EXAMPLE 2: number of daily logins on monitored services

| _time | Services | Logins |
|---|---|---|
| 2013-06-05T00:00:00.000+0200 | svcB | 12 |
| 2013-06-06T00:00:00.000+0200 | svcB | 32 |
| 2013-06-07T00:00:00.000+0200 | svcB | 11 |
| 2013-06-08T00:00:00.000+0200 | svcB | 4 |
| 2013-06-09T00:00:00.000+0200 | svcB | 3 |
| 2013-06-05T00:00:00.000+0200 | svcA | 28 |
| 2013-06-06T00:00:00.000+0200 | svcA | 24 |
| 2013-06-07T00:00:00.000+0200 | svcA | 25 |
| 2013-06-08T00:00:00.000+0200 | svcA | 1 |
| 2013-06-09T00:00:00.000+0200 | svcA | 1 |
| 2013-06-05T00:00:00.000+0200 | svcC | 45 |
| 2013-06-06T00:00:00.000+0200 | svcC | 56 |
| 2013-06-07T00:00:00.000+0200 | svcC | 44 |
| 2013-06-08T00:00:00.000+0200 | svcC | 67 |
| 2013-06-09T00:00:00.000+0200 | svcC | 87 |

In this example, the query on Splunk returns the number of daily logins on three services ('svcA', 'svcB' and 'svcC'), identifying the service in the 'Services' column, and we need to import these metrics into CO as three separate business drivers.

  • Select 'WKLDAT' as dataset, as this data is related to Business Drivers
  • As 'Entity', we specify to use the values in the 'Services' column (so that business drivers 'svcA', 'svcB' and 'svcC' will be created)
  • As 'Metric', the number of daily logins can be mapped to "a count of events/items/operations over time" (TOTAL_EVENTS), since it fits that metric's description
  • As 'Metric subresource', nothing needs to be specified: the selected metric is global and does not require a 'subobject'

This is the final mapping:

| Query Column | Mapping | Resulting Entity | Resulting Metric | Resulting Subresource |
|---|---|---|---|---|
| Logins | Entity: use values from query column "Services"; Metric: select "a count of events/items/operations over time"; Subresource: no need to specify as the metric is global | svcA | TOTAL_EVENTS | GLOBAL |
| | | svcB | TOTAL_EVENTS | GLOBAL |
| | | svcC | TOTAL_EVENTS | GLOBAL |

 


Note that in this case, as new Services appear in the query results, new Entities (Business Drivers) will be created in CO.
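
The following sketch illustrates why new entities appear automatically when the entity name is read from a query column. `map_by_entity_column` is a hypothetical helper, and the 'svcD' record is invented solely to show the effect.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]
SeriesKey = Tuple[str, str, str]   # (entity, metric, subresource)


def map_by_entity_column(rows: List[Row], value_column: str, entity_column: str,
                         metric: str = "TOTAL_EVENTS",
                         ts_column: str = "_time") -> Dict[SeriesKey, List[Tuple[str, float]]]:
    """Read the entity name from `entity_column` of each record."""
    series: Dict[SeriesKey, List[Tuple[str, float]]] = {}
    for row in rows:
        key = (row[entity_column], metric, "GLOBAL")
        series.setdefault(key, []).append((row[ts_column], float(row[value_column])))
    return series


ROWS = [
    {"_time": "2013-06-05T00:00:00.000+0200", "Services": "svcB", "Logins": "12"},
    {"_time": "2013-06-05T00:00:00.000+0200", "Services": "svcA", "Logins": "28"},
    {"_time": "2013-06-05T00:00:00.000+0200", "Services": "svcC", "Logins": "45"},
]
entities = sorted(key[0] for key in map_by_entity_column(ROWS, "Logins", "Services"))

# A hypothetical record for a service that was not present before.
NEW_ROW = {"_time": "2013-06-10T00:00:00.000+0200", "Services": "svcD", "Logins": "7"}
entities_after = sorted(
    key[0] for key in map_by_entity_column(ROWS + [NEW_ROW], "Logins", "Services")
)
```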



 

 
EXAMPLE 3: data volumes managed by a data warehouse system, split by its sub-procedures (steps)

 

| _time | step | numOf LogLines | Avg Parallelism | Execution Time (s) | Tot Samples Processed |
|---|---|---|---|---|---|
| 2013-07-05T21:00:00.000+0200 | step1 | 12 | 1 | 3,854 | 136 |
| 2013-07-05T21:00:00.000+0200 | step2 | 4 | 1 | 3,017 | 899 |
| 2013-07-05T21:00:00.000+0200 | step3 | 2 | 8 | 1,485 | 64 |
| 2013-07-05T21:00:00.000+0200 | step4 | 1 | 8 | 0,312 | 1 |
| 2013-07-05T21:00:00.000+0200 | step5 | 8 | 3 | 1,083 | 65 |
| 2013-07-05T21:00:00.000+0200 | step6 | 4 | 3 | 1,507 | 170 |

This is the mapping:

 

| Query Column | Mapping | Resulting Entity | Resulting Metric | Resulting Subresource |
|---|---|---|---|---|
| Avg Parallelism | Entity: use "Active Jobs"; Metric: "a number of concurrent/standing/open items split by sub-category"; Subresource: use values from column "step" | Active Jobs | BYSET_EVENTS_CURRENT | step1 |
| | | Active Jobs | BYSET_EVENTS_CURRENT | step2 |
| | | Active Jobs | BYSET_EVENTS_CURRENT | step3 |
| | | Active Jobs | BYSET_EVENTS_CURRENT | step4 |
| | | Active Jobs | BYSET_EVENTS_CURRENT | step5 |
| Tot Samples Processed | Entity: use "Processed Samples"; Metric: "a count of events/items/operations over time split by sub-category"; Subresource: use values from column "step" | Processed Samples | BYSET_EVENTS | step1 |
| | | Processed Samples | BYSET_EVENTS | step2 |
| | | Processed Samples | BYSET_EVENTS | step3 |
| | | Processed Samples | BYSET_EVENTS | step4 |
| | | Processed Samples | BYSET_EVENTS | step5 |


Note that in this case, as new Steps appear in the query results, new subresources will be attached to the existing BMC TrueSight Capacity Optimization Entities.
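
A minimal sketch of this behavior, limited to the "Avg Parallelism" column and to the first rows of the result set, is shown below. `map_with_subobject` is a hypothetical helper: the entity is a fixed string and the subresource is read from the "step" column, so a new step adds a series without adding an entity.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]
SeriesKey = Tuple[str, str, str]   # (entity, metric, subresource)


def map_with_subobject(rows: List[Row], value_column: str, entity: str, metric: str,
                       subobject_column: str,
                       ts_column: str = "_time") -> Dict[SeriesKey, List[Tuple[str, float]]]:
    """Fixed entity name; subresource taken from `subobject_column`."""
    series: Dict[SeriesKey, List[Tuple[str, float]]] = {}
    for row in rows:
        key = (entity, metric, row[subobject_column])
        series.setdefault(key, []).append((row[ts_column], float(row[value_column])))
    return series


TS = "2013-07-05T21:00:00.000+0200"
ROWS = [
    {"_time": TS, "step": "step1", "Avg Parallelism": "1"},
    {"_time": TS, "step": "step2", "Avg Parallelism": "1"},
    {"_time": TS, "step": "step3", "Avg Parallelism": "8"},
]
before = map_with_subobject(ROWS, "Avg Parallelism", "Active Jobs",
                            "BYSET_EVENTS_CURRENT", "step")

# Adding the record for another step creates a new subresource only.
after = map_with_subobject(ROWS + [{"_time": TS, "step": "step4", "Avg Parallelism": "8"}],
                           "Avg Parallelism", "Active Jobs", "BYSET_EVENTS_CURRENT", "step")
entities_after = {key[0] for key in after}
```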



 

 
EXAMPLE 4: CPU utilization by host

 

| _time | host | numSamples | pctUser | pctSystem | pctIowait | pctUtil |
|---|---|---|---|---|---|---|
| 2013-06-09T10:00:00.000+0200 | movvm123 | 120 | 0,0111 | 0,0127 | 0,0047 | 0,0238 |
| 2013-06-09T11:00:00.000+0200 | movvm123 | 120 | 0,0193 | 0,0147 | 0,0017 | 0,0340 |
| 2013-06-09T12:00:00.000+0200 | movvm123 | 120 | 0,0178 | 0,0158 | 0,0021 | 0,0336 |
| 2013-06-09T13:00:00.000+0200 | movvm123 | 120 | 0,0210 | 0,0158 | 0,0047 | 0,0368 |
| 2013-06-09T14:00:00.000+0200 | movvm123 | 120 | 0,0088 | 0,0148 | 0,0107 | 0,0236 |
| 2013-06-09T15:00:00.000+0200 | movvm123 | 120 | 0,0123 | 0,0146 | 0,0024 | 0,0269 |
| 2013-06-09T16:00:00.000+0200 | movvm123 | 120 | 0,0123 | 0,0139 | 0,0128 | 0,0262 |
| 2013-06-09T17:00:00.000+0200 | movvm123 | 120 | 0,0097 | 0,0118 | 0,0022 | 0,0215 |
| 2013-06-09T18:00:00.000+0200 | movvm123 | 120 | 0,0125 | 0,0158 | 0,0023 | 0,0283 |
| 2013-06-09T19:00:00.000+0200 | movvm123 | 120 | 0,0107 | 0,0139 | 0,0030 | 0,0246 |
| 2013-06-09T20:00:00.000+0200 | movvm123 | 120 | 0,0098 | 0,0135 | 0,0026 | 0,0233 |

 

This is the mapping:

| Query Column | Mapping | Resulting Entity | Resulting Metric | Resulting Subresource |
|---|---|---|---|---|
| pctUser | Entity: use values from column "host"; Metric: specified BMC TrueSight Capacity Optimization metric "CPU_UTIL_USER"; Subresource: Fixed=GLOBAL | movvm123 | CPU_UTIL_USER | GLOBAL |
| pctSystem | Entity: use values from column "host"; Metric: specified BMC TrueSight Capacity Optimization metric "CPU_UTIL_SYSTEM"; Subresource: Fixed=GLOBAL | movvm123 | CPU_UTIL_SYSTEM | GLOBAL |
| pctUtil | Entity: use values from column "host"; Metric: specified BMC TrueSight Capacity Optimization metric "CPU_UTIL"; Subresource: Fixed=GLOBAL | movvm123 | CPU_UTIL | GLOBAL |
| pctIowait | Entity: use values from column "host"; Metric: specified BMC TrueSight Capacity Optimization metric "CPU_UTIL_WAIO"; Subresource: Fixed=GLOBAL | movvm123 | CPU_UTIL_WAIO | GLOBAL |


Note that in this case, as new hosts appear in the query results, new BMC TrueSight Capacity Optimization Entities will be created.
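
The sketch below reproduces this mapping for the first two rows of the result set. It is illustrative only: `map_system_metrics` and `to_float` are hypothetical helpers, and converting the decimal commas shown in the table is an assumption about how the values are rendered, not a documented connector feature.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]
SeriesKey = Tuple[str, str, str]   # (entity, metric, subresource)

# Query column -> metric entered through "Specify Metric (Advanced)".
COLUMN_TO_METRIC = {
    "pctUser": "CPU_UTIL_USER",
    "pctSystem": "CPU_UTIL_SYSTEM",
    "pctUtil": "CPU_UTIL",
    "pctIowait": "CPU_UTIL_WAIO",
}


def to_float(text: str) -> float:
    # Assumption: values use a decimal comma, as displayed in the table.
    return float(text.replace(",", "."))


def map_system_metrics(rows: List[Row], entity_column: str = "host",
                       ts_column: str = "_time") -> Dict[SeriesKey, List[Tuple[str, float]]]:
    """Entity from the host column, one metric per value column, fixed GLOBAL subresource."""
    series: Dict[SeriesKey, List[Tuple[str, float]]] = {}
    for row in rows:
        for column, metric in COLUMN_TO_METRIC.items():
            key = (row[entity_column], metric, "GLOBAL")
            series.setdefault(key, []).append((row[ts_column], to_float(row[column])))
    return series


ROWS = [
    {"_time": "2013-06-09T10:00:00.000+0200", "host": "movvm123", "numSamples": "120",
     "pctUser": "0,0111", "pctSystem": "0,0127", "pctIowait": "0,0047", "pctUtil": "0,0238"},
    {"_time": "2013-06-09T11:00:00.000+0200", "host": "movvm123", "numSamples": "120",
     "pctUser": "0,0193", "pctSystem": "0,0147", "pctIowait": "0,0017", "pctUtil": "0,0340"},
]
series = map_system_metrics(ROWS)
```

The "numSamples" column is not listed among the value columns, so it produces no series.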



 

Full list of configuration properties

The following settings are specific to the connector "Moviri – Splunk Generic Extractor"; they are presented in the "Splunk – Query and Mapping" configuration panel.

| Property Name | Condition | Type | Required? | Default | Description |
|---|---|---|---|---|---|
| Search Type | | Selection | Yes | | Specifies whether to use a Splunk Saved Search or to manually input the Search Text |
| Query Text | Search Type = "Input Text" | String | No | | The Splunk search query, conforming to Splunk syntax (refer to the Splunk documentation about searches and search syntax: http://docs.splunk.com/Documentation/Splunk/latest/Search/Aboutsearch) and producing a result set that meets the requirements listed in "How to specify the search query for retrieving results" |
| Saved Search App | Search Type = "Splunk Saved Search" | String | No | | The Splunk app the saved search belongs to. It is optional. |
| Saved Search Name | Search Type = "Splunk Saved Search" | String | Yes | | The Splunk Saved Search name |
| Saved Search Mode | Search Type = "Splunk Saved Search" | String | Yes | "Import data only if search is scheduled" and "Import data from existing results" | Three independent behaviors that affect how data is extracted from the saved search; each one can be enabled separately. "Import data only if search is scheduled": checks whether the saved search is scheduled on the Splunk instance; if it is not, the connector does not extract any data. "Import data from existing results": looks for existing results and, if found, imports data from them. "Execute search to look for new data": allows the saved search to be executed if existing results cannot be used or do not contain recent data (up to the latest day or latest hour, according to the time granularity). |
| Timestamp column | | String | Yes | _time | The name of the result set column containing the records' timestamps |
| Value Columns to import | | String (max column length: 28) | Yes | | Semicolon-separated list of the result set columns containing the values of the data series to be imported |


The following properties are repeated for each value column specified in "Value Columns to import":

| Property Name | Condition | Type | Required? | Default | Description |
|---|---|---|---|---|---|
| Use <<columnX>> as BCO Entity Name? | | Selection | Yes | Yes | If set to yes, tells the connector to use the value column name as the BMC TrueSight Capacity Optimization Entity Name |
| BCO Entity Name | Use <<columnX>> as Entity Name? = "No" | Selection | Yes | | Specifies whether to use as BMC TrueSight Capacity Optimization Entity Name a "Fixed" string or the values taken from another result set column ("Based on Query Column") |
| Entity Name Value = | Entity Name = "Fixed" | String (max length 28) | Yes | | The BMC TrueSight Capacity Optimization Entity Name |
| Query Column for Entity Name = | Entity Name = "Based on Query Column" | String (max length 28) | Yes | | The query column from which to read the BMC TrueSight Capacity Optimization Entity Name |
| BCO Metric: <<columnX>> represents | | Selection | Yes | | Specifies which BMC TrueSight Capacity Optimization metric to use to map the data series. A textual description is provided for commonly used Business Driver metrics (see Table 1, BMC TrueSight Capacity Optimization Metrics descriptions). An option is also present to manually input the BMC TrueSight Capacity Optimization metric name. |
| BCO Metric = | Metric: <<columnX>> represents = "Specify Metric (Advanced)" | String | Yes | | A valid BMC TrueSight Capacity Optimization metric that represents the data series to be imported |
| Subobject (sub-category) Name | "Metric: <<columnX>> represents" contains a sub-category or is equal to "Specify Metric (Advanced)" | Selection | Yes | | Specifies whether to use as subresource name a "Fixed" string or the values taken from another result set column ("Based on Query Column") |
| Subobject Name Value = | Subobject Name = "Fixed" | String | Yes | | The subobject (subresource) name |
| Query Column for Subobject Name = | Subobject Name = "Based on Query Column" | String (max length 28) | Yes | | The query column from which to read the subobject (subresource) |
| Number of events/operations (weight of the response time) | "Metric: <<columnX>> represents" refers to a response time or is equal to "Specify Metric (Advanced)" | String | Yes | | Metrics referring to response times (or custom metrics) need a weight in order to compute averages more correctly. This property specifies where to read the weight: "Not specified", "Fixed" or "Based on Query Column" |
| Weight Value = | Number of events/operations (weight of the response time) = "Fixed" | Integer | Yes | | The value for the weight |
| Query Column for Weight = | Number of events/operations (weight of the response time) = "Based on Query Column" | String | Yes | | The query column from which to read the weight |
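
To see why response time metrics benefit from a weight, consider averaging two periods with very different numbers of operations. The sketch below uses invented numbers and a hypothetical `weighted_mean` helper; how the weight is used inside BMC TrueSight Capacity Optimization is not described on this page, so this only illustrates the general principle.

```python
from typing import List, Tuple


def weighted_mean(samples: List[Tuple[float, float]]) -> float:
    """Average response times, weighting each one by its number of operations."""
    total_weight = sum(weight for _, weight in samples)
    if total_weight == 0:
        raise ValueError("total weight must be positive")
    return sum(value * weight for value, weight in samples) / total_weight


# (response time in seconds, number of operations) for two hypothetical hours.
SAMPLES = [(0.2, 1000), (2.0, 10)]

naive = sum(value for value, _ in SAMPLES) / len(SAMPLES)
weighted = weighted_mean(SAMPLES)
```

Here the unweighted average is dominated by the slow hour even though it served only 10 of the 1010 operations, while the weighted average stays close to the 0.2 s experienced by most operations.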

Table 1: BMC TrueSight Capacity Optimization Metrics descriptions

| Description | Corresponding BMC TrueSight Capacity Optimization Metric |
|---|---|
| a count of events/items/operations over time | TOTAL_EVENTS |
| a number of concurrent/standing/open items/customers... | EVENTS_CURRENT |
| the number of users in a system | USERS_CURRENT |
| a rate of events/items/operations over time (events/s) | EVENT_RATE |
| a response time | EVENT_RESPONSE_TIME |
| a count of events/items/operations over time split by a sub-category | BYSET_EVENTS |
| a number of concurrent/standing/open items split by a sub-category | BYSET_EVENTS_CURRENT |
| the number of users in a system split by a sub-category | BYSET_USERS_CURRENT |
| a rate of events/items/operations over time (events/s) split by a sub-category | BYSET_EVENT_RATE |
| a response time split by a sub-category | BYSET_RESPONSE_TIME |
| Specify Metric (Advanced) | |
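
The description-to-metric correspondence above can be kept at hand as a simple lookup, for example when preparing mappings in bulk. The dictionary below copies the table as-is. The `requires_subobject` helper is hypothetical and relies on the observation that, in this table, every metric split by a sub-category has a name starting with "BYSET_"; that rule is an assumption and is not meant to cover metrics entered through "Specify Metric (Advanced)".

```python
METRIC_BY_DESCRIPTION = {
    "a count of events/items/operations over time": "TOTAL_EVENTS",
    "a number of concurrent/standing/open items/customers...": "EVENTS_CURRENT",
    "the number of users in a system": "USERS_CURRENT",
    "a rate of events/items/operations over time (events/s)": "EVENT_RATE",
    "a response time": "EVENT_RESPONSE_TIME",
    "a count of events/items/operations over time split by a sub-category": "BYSET_EVENTS",
    "a number of concurrent/standing/open items split by a sub-category": "BYSET_EVENTS_CURRENT",
    "the number of users in a system split by a sub-category": "BYSET_USERS_CURRENT",
    "a rate of events/items/operations over time (events/s) split by a sub-category": "BYSET_EVENT_RATE",
    "a response time split by a sub-category": "BYSET_RESPONSE_TIME",
}


def requires_subobject(metric: str) -> bool:
    # Assumption: among the proposed metrics, only "BYSET_" ones need a subobject.
    return metric.startswith("BYSET_")


logins_metric = METRIC_BY_DESCRIPTION["a count of events/items/operations over time"]
split_metrics = sorted(m for m in METRIC_BY_DESCRIPTION.values() if requires_subobject(m))
```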

 