Moviri - Ambari Extractor
The integration supports the extraction of both performance and configuration data across different Hadoop components and can be configured via parameters that allow entity filtering and many other settings. Furthermore, the connector is able to replicate relationships and logical dependencies among entities such as clusters, resource pools, services and nodes.
The documentation is targeted at BMC Helix Continuous Optimization administrators in charge of configuring and monitoring the integration between BMC Helix Continuous Optimization and Ambari.
Moviri Integrator for BMC Helix Continuous Optimization - Ambari is compatible with BMC Helix Continuous Optimization 19.11 and onward.
Supported versions of data source software
- Supported Hadoop distribution is Hortonworks Data Platform (HDP): version 2.2
- Supported Ambari: version 1.6 to 3.1.4
The integration, though not officially supported, is expected to work on custom deployments of Hadoop components that leverage Ambari as their management and monitoring service.
Supported configurations of data source software
Moviri – Ambari Extractor requires that Ambari continuously and correctly monitor the entities supported by the integration (the full list is available below). Failing to meet this requirement will cause gaps in data coverage.
Downloading the additional package
The connector is made available in the form of an additional component, which you may download from the BMC electronic distribution site (EPD) or retrieve from your content media.
Installing the additional package
To install the connector in the form of a BMC Helix Continuous Optimization additional package, refer to the Performing system maintenance tasks instructions.
Datasource Check and Configuration
Preparing to connect to the data source software
The connector included in "Moviri Integrator for BMC Helix Continuous Optimization – Ambari" uses the Ambari REST API to communicate with Ambari. The API is always enabled, and no additional configuration is required.
The connector first retrieves the cluster version through the REST API and then uses that version to access the cluster.
Please note that the connector uses only the GET method, which prevents any accidental change to the environments.
The connector requires a regular Ambari user (admin privileges are not required) with read-only permissions on all the clusters that should be accessed.
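As an illustration of the read-only access described above, the following Python sketch builds a GET request against the Ambari REST API and reads cluster names and versions from a response body. It is not the connector's code: the `/api/v1/clusters` path, HTTP basic authentication, the response layout, the helper names, the host name and the credentials are all assumptions made for this example.

```python
import base64
import json
import urllib.request


def build_clusters_request(host, port, user, password):
    """Build a read-only GET request for the clusters endpoint (assumed path)."""
    url = f"http://{host}:{port}/api/v1/clusters"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Basic {token}"},
        method="GET",  # the connector only ever issues GET requests
    )


def cluster_versions(payload):
    """Map cluster name -> version from a clusters response body (assumed layout)."""
    data = json.loads(payload)
    return {
        item["Clusters"]["cluster_name"]: item["Clusters"].get("version")
        for item in data.get("items", [])
    }


# Illustrative response body, written by hand for this example.
SAMPLE = '{"items": [{"Clusters": {"cluster_name": "prod", "version": "HDP-2.2"}}]}'

if __name__ == "__main__":
    # Hypothetical host and read-only user.
    req = build_clusters_request("ambari.example.com", 8080, "readonly", "secret")
    print(req.get_method(), req.full_url)
    print(cluster_versions(SAMPLE))
    # To send the request for real: urllib.request.urlopen(req, timeout=20)
```

The timeout in the last comment mirrors the default Connection Timeout of 20 seconds listed in the configuration attributes.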
Connector configuration attributes
The following table shows the properties specific to this connector; all other generic properties are documented here.
| Property | Type | Required | Default | Description |
|---|---|---|---|---|
| | | | | Ambari server hostname |
| Spark Hostname | String | Yes | | Spark server hostname |
| Ambari Port | Number | Yes | 8080 | Ambari connection port |
| Spark Port | Number | Yes | 18080 | Spark connection port |
| Connection Timeout | Number | No | 20 | Connection timeout in seconds |
| | | | | Import data at node level |
| | | | | Import data at pool level |
| HDFS data import | | | | Import data about HDFS service |
| YARN data import | | | | Import data about YARN service |
| HBASE data import | Boolean | Yes | true | Import data about HBASE service |
| SPARK data import | Boolean | Yes | true | Import data about SPARK service |
| Cluster regexp whitelist, semicolon separated | String | No | | List of clusters to be imported, semicolon separated. Regexp is supported. |
| Cluster regexp blacklist, semicolon separated | String | No | | List of clusters not to be imported, semicolon separated. Regexp is supported. This setting overrides whitelist in case of conflict. |
| Host regexp whitelist, semicolon separated | String | No | | List of hosts to be imported, semicolon separated. Regexp is supported. Setting this field disables aggregation at cluster level. |
| Host regexp blacklist, semicolon separated | String | No | | List of hosts not to be imported, semicolon separated. Regexp is supported. This setting overrides whitelist in case of conflict. Setting this field disables aggregation at cluster level. |
| Maximum pool exploration depth | Number | No | | A limit to the exploration of nested pools. |
| Substitute any dot char in pools names with this char | Char | No | - | Because the dot is a special char for the Loader component, it's suggested to change it. |
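The whitelist and blacklist semantics can be sketched in Python as follows. This is only an illustration of the documented rules (semicolon-separated regexps, blacklist wins on conflict), not the connector's implementation. The function names are hypothetical, and two behaviors are assumptions: patterns must match the whole name, and an empty whitelist imports everything. The last helper shows the dot substitution applied to pool names.

```python
import re


def parse_patterns(setting):
    """Split a semicolon-separated setting into compiled regexps."""
    return [re.compile(p.strip()) for p in (setting or "").split(";") if p.strip()]


def is_imported(name, whitelist="", blacklist=""):
    """Decide whether a cluster or host name should be imported."""
    allow = parse_patterns(whitelist)
    deny = parse_patterns(blacklist)
    # The blacklist overrides the whitelist in case of conflict.
    if any(p.fullmatch(name) for p in deny):
        return False
    # Assumption: an empty whitelist means "import everything".
    return not allow or any(p.fullmatch(name) for p in allow)


def sanitize_pool_name(pool, replacement="-"):
    """Replace dots in a pool name, since the dot is special for the Loader."""
    return pool.replace(".", replacement)


if __name__ == "__main__":
    for cluster in ("prod-east", "prod-west", "dev"):
        print(cluster, is_imported(cluster, "prod-.*;test", "prod-west"))
    print(sanitize_pool_name("root.default.etl"))
```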
Time Interval Settings
- Maximum days to extract for execution: Each ETL run will not extract more than the specified number of days.
- Date limit not to extract beyond (YYYY-MM-DD HH24:MI:SS): Maximum date to be considered while extracting data.
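To make the interaction of the two time interval settings concrete, here is a small Python sketch that computes the window a single run would cover. It is one possible reading of the settings, not the connector's actual logic: the function name, the notion of a "last extracted" timestamp and the clamping order are assumptions. The format string is the Python equivalent of YYYY-MM-DD HH24:MI:SS.

```python
from datetime import datetime, timedelta

DATE_FORMAT = "%Y-%m-%d %H:%M:%S"  # Python equivalent of YYYY-MM-DD HH24:MI:SS


def extraction_window(last_extracted, now, max_days, date_limit=None):
    """Return (start, end) for one run, honoring both time interval settings."""
    # A run never covers more than max_days and never reaches past "now".
    end = min(now, last_extracted + timedelta(days=max_days))
    if date_limit:
        # Never extract beyond the configured date limit.
        end = min(end, datetime.strptime(date_limit, DATE_FORMAT))
    # Guard against an inverted window when the limit precedes the start.
    return last_extracted, max(end, last_extracted)


if __name__ == "__main__":
    start, end = extraction_window(
        datetime(2024, 1, 1), datetime(2024, 1, 20), max_days=7
    )
    print(start, end)
```

Under these assumptions, a run that is far behind catches up in steps of at most `max_days` rather than in one large extraction.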
The following image shows the list of options in the ETL configuration menu, including the advanced entries.
The following entities are supported:
- Hadoop Cluster
- Hadoop Resource Pool
- Hadoop Node
In addition to standard system performance metrics, data related to the following Hadoop specific services is gathered:
The connector is able to replicate relationships and logical dependencies among these entities. In particular, all the available Clusters are attached to the root of the hierarchy, and each Cluster contains its own Nodes, Resource Managers and Services.
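The hierarchy described above can be pictured as a set of parent-child relationships. The Python sketch below builds such a set from a toy model; the class names, the `ROOT` label and the entity names are hypothetical, and nested resource pools are cut off at a depth limit in the spirit of the "Maximum pool exploration depth" setting. How the connector represents these relationships internally is not documented here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

ROOT = "ROOT"  # hypothetical label for the root of the hierarchy


@dataclass
class Pool:
    name: str
    children: List["Pool"] = field(default_factory=list)


@dataclass
class Cluster:
    name: str
    nodes: List[str] = field(default_factory=list)
    services: List[str] = field(default_factory=list)
    pools: List[Pool] = field(default_factory=list)


def pool_edges(parent, pools, max_depth, depth=1) -> List[Tuple[str, str]]:
    """Collect (parent, pool) edges, ignoring pools nested deeper than max_depth."""
    if depth > max_depth:
        return []
    edges = []
    for pool in pools:
        edges.append((parent, pool.name))
        edges += pool_edges(pool.name, pool.children, max_depth, depth + 1)
    return edges


def relationships(clusters, max_pool_depth) -> List[Tuple[str, str]]:
    """Attach every cluster to the root, then its nodes, services and pools."""
    edges = []
    for cluster in clusters:
        edges.append((ROOT, cluster.name))
        edges += [(cluster.name, node) for node in cluster.nodes]
        edges += [(cluster.name, service) for service in cluster.services]
        edges += pool_edges(cluster.name, cluster.pools, max_pool_depth)
    return edges


if __name__ == "__main__":
    nested = Pool("default", [Pool("default-etl", [Pool("default-etl-nightly")])])
    demo = Cluster("c1", nodes=["n1"], services=["HDFS"], pools=[nested])
    for edge in relationships([demo], max_pool_depth=2):
        print(edge)
```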
Services' data is available among the above entities' metrics, according to the following table.
ETL runs fine but data is partially or totally missing
Data is probably missing in the data source. Check in the Ambari web frontend whether data is available; if it is not, the following image is shown. In that case, consider enabling data collection.