“Moviri Integrator for TrueSight Capacity Optimization – Cloudera” is an additional component of the BMC TrueSight Capacity Optimization product. It extracts data from Cloudera Enterprise, Cloudera's Hadoop distribution, which is composed of CDH (Cloudera's Distribution including Apache Hadoop) and Cloudera Manager. Relevant capacity metrics are loaded into BMC TrueSight Capacity Optimization, which provides advanced analytics over the extracted data in the form of an interactive dashboard, the Hadoop View.
The integration supports the extraction of both performance and configuration data across different components of CDH and can be configured via parameters that allow entity filtering and many other settings. Furthermore, the connector is able to replicate relationships and logical dependencies among entities such as clusters, resource pools, services and nodes.
The documentation is targeted at BMC TrueSight Capacity Optimization administrators, in charge of configuring and monitoring the integration between BMC TrueSight Capacity Optimization and Cloudera.
The Moviri – Cloudera Extractor requires that Cloudera Manager continuously and correctly monitors the entities supported by the integration (the full list is available below). Failing to meet this requirement will result in gaps in data coverage.
The ETL module is made available in the form of an additional component, which you may download from the BMC electronic distribution site (EPD) or retrieve from your content media.
To install the connector in the form of a TrueSight Capacity Optimization additional package, refer to the instructions in "Performing system maintenance tasks".
The connector included in "Moviri Integrator for TrueSight Capacity Optimization – Cloudera" uses the Cloudera Java API v6 to communicate with Cloudera Manager. This API is always enabled and no additional configuration is required.
Please note that only SELECT statements are used by the connector, preventing any accidental change to the environments.
The connector requires a read-only user with permissions on all the clusters that should be accessed.
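As an illustration of what read-only access to Cloudera Manager can look like, the following minimal sketch builds a v6 API URL, a GET-only request with HTTP basic authentication, and a TLS context. It is not the connector's own code, which uses the Java API. The host name, user name, helper names and the `clusters` path are assumptions chosen for the example; the default port 7180 and the two "ignore validation" switches mirror the connection properties listed in the configuration table.

```python
import base64
import ssl
import urllib.request

API_VERSION = "v6"  # API version named in this document


def cm_api_url(host, port=7180, use_tls=False, path="clusters"):
    """Build a Cloudera Manager REST URL (hypothetical helper)."""
    scheme = "https" if use_tls else "http"
    return f"{scheme}://{host}:{port}/api/{API_VERSION}/{path.lstrip('/')}"


def read_only_request(url, user, password):
    """Prepare a GET request with HTTP basic auth; nothing is sent here."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


def build_tls_context(ignore_certificate=False, ignore_common_name=False):
    """Map the two 'ignore validation' options onto an SSLContext."""
    context = ssl.create_default_context()
    if ignore_certificate or ignore_common_name:
        # Host name checking must be disabled before verification is relaxed.
        context.check_hostname = False
    if ignore_certificate:
        context.verify_mode = ssl.CERT_NONE
    return context


# Assumed example values, not taken from a real environment.
url = cm_api_url("cm.example.com", use_tls=True)
request = read_only_request(url, "readonly_user", "secret")
context = build_tls_context(ignore_common_name=True)
```

Actually sending the request would be done with `urllib.request.urlopen(request, context=context, timeout=20)`, where 20 seconds corresponds to the default connection timeout in the table.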
The following table shows the connector-specific properties; all other generic properties are documented here.
Cloudera Manager Connection
Cloudera server hostname
|Cloudera Port||Number||Yes||7180||Cloudera connection port|
|Spark Port||Number||Yes||18080||Spark connection port|
|Connection Timeout||Number||No||20||Connection timeout in seconds|
|Use Encryption (TLS)||Boolean||Yes||false||Use encryption|
|Ignore certificate validation||Boolean||Yes||false||Ignore validation of TLS certificate|
|Ignore common name validation||Boolean||Yes||false||Ignore validation of TLS common name|
|Warn if version is unsupported||Boolean||Yes||false||Warn in the event the Cloudera Manager version is unsupported|
|Data Granularity||Multiple||Yes||10 minutes||Granularity of data to be imported|
Import data at node level
Import data at pool level
|Import hbase||Boolean||Yes||true||Import data about HBASE service|
|Import spark||Boolean||Yes||true||Import data about Spark service|
|Substitute any dot char in pools names with this char||Char||No||-||Because the dot is a special character for the Loader component, replacing it is recommended|
Time Interval Settings
|Default Last Counter (YYYY-MM-DD HH24:MI:SS Z)||Date||Yes||Default last counter value|
|Relocate data to timezone (e.g. America/New_York, leave empty to use BCO timezone)||String||No||Timezone to which imported samples are relocated|
|Limit extraction to date (YYYY-MM-DD HH24:MI:SS)||Maximum date to be considered while extracting data|
|Max days to import in a single run (0 for no limit)||Maximum days to collect in a single ETL run|
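One plausible reading of how the time interval settings combine is sketched below: extraction starts at the last counter and stops at the earliest of the current time, the optional limit date, and the start plus the maximum number of days. This is an assumption made for illustration, not a description of the connector's internal logic; the function names are hypothetical, and all timestamps are treated as timezone-aware.

```python
from datetime import datetime, timedelta, timezone

LAST_COUNTER_FORMAT = "%Y-%m-%d %H:%M:%S %z"  # YYYY-MM-DD HH24:MI:SS Z


def parse_last_counter(text):
    """Parse a last counter value such as '2020-01-01 00:00:00 +0000'."""
    return datetime.strptime(text, LAST_COUNTER_FORMAT)


def extraction_window(last_counter, now, limit_date=None, max_days=0):
    """Return (start, end) for a single run under the assumed rules."""
    end = now
    if limit_date is not None:
        end = min(end, limit_date)
    if max_days > 0:  # 0 means no limit
        end = min(end, last_counter + timedelta(days=max_days))
    # Never return a window that ends before it starts.
    return last_counter, max(end, last_counter)


start = parse_last_counter("2020-01-01 00:00:00 +0000")
now = datetime(2020, 1, 31, tzinfo=timezone.utc)
window = extraction_window(start, now, max_days=7)
```

With these example values, a seven-day cap means a run that is 30 days behind imports only the first week and catches up over subsequent runs.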
The following image shows the list of options in the ETL configuration menu, including the advanced entries.
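The dot substitution option for pool names can be pictured with the short sketch below. The function name and the sample pool names are assumptions made for the example; the default replacement character `-` is the one listed in the table.

```python
def substitute_dots(pool_name, replacement="-"):
    """Replace every dot in a pool name with a single replacement character."""
    if len(replacement) != 1:
        raise ValueError("replacement must be exactly one character")
    return pool_name.replace(".", replacement)


# Hypothetical pool names used only for illustration.
default_pool = substitute_dots("root.default")
user_pool = substitute_dots("root.users.alice", replacement="_")
```

Here `default_pool` becomes `root-default` and `user_pool` becomes `root_users_alice`, so the resulting names no longer contain the character that the Loader component treats as special.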
The following entities are supported:
In addition to standard system performance metrics, data related to the following Hadoop specific services is gathered:
The connector is able to replicate relationships and logical dependencies among these entities. In particular, all the available Clusters are attached to the root of the hierarchy, and each Cluster contains its own Nodes and Pools.
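The hierarchy described above can be modeled with a small data structure, sketched here under stated assumptions: the class names, the sample entity names and the path rendering are all hypothetical and serve only to show Clusters at the root, each with its own Nodes and Pools.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cluster:
    name: str
    nodes: List[str] = field(default_factory=list)
    pools: List[str] = field(default_factory=list)


@dataclass
class Hierarchy:
    clusters: List[Cluster] = field(default_factory=list)

    def paths(self):
        """List every entity as a path from the root (illustrative format)."""
        result = []
        for cluster in self.clusters:
            result.append(f"/{cluster.name}")
            result.extend(f"/{cluster.name}/{node}" for node in cluster.nodes)
            result.extend(f"/{cluster.name}/{pool}" for pool in cluster.pools)
        return result


# Hypothetical sample data.
hierarchy = Hierarchy([
    Cluster("cluster1", nodes=["node01", "node02"], pools=["root-default"]),
    Cluster("cluster2", nodes=["node03"]),
])
entity_paths = hierarchy.paths()
```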
Services' data is available among the above entities' metrics, according to the following table.
For ETL troubleshooting, please refer to the official BMC documentation available here.