
    “Moviri Integrator for TrueSight Capacity Optimization – Ambari” allows extracting data from Hadoop deployments through the open-source component Ambari. Relevant capacity metrics are loaded into BMC TrueSight Capacity Optimization, which provides advanced analytics over the extracted data in the form of an interactive dashboard, the Hadoop View.

    The integration supports the extraction of both performance and configuration data across different Hadoop components and can be configured via parameters that allow entity filtering, among many other settings. Furthermore, the connector is able to replicate relationships and logical dependencies among entities such as clusters, resource pools, services, and nodes.

    The documentation is targeted at BMC TrueSight Capacity Optimization administrators who are in charge of configuring and monitoring the integration between BMC TrueSight Capacity Optimization and Ambari.

    Requirements

    Supported versions of data source software

    • Supported Hadoop distribution: Hortonworks Data Platform (HDP) versions 2.2 and 2.5¹
    • Supported Ambari: versions 1.6 to 2.4¹, REST API v1

    ¹ HDP version 2.5 and Ambari version 2.5 are supported only if you apply Feature Pack 1 (10.7.01) of TrueSight Capacity Optimization 10.7.

     

    Although not officially supported, the integration is expected to work with custom deployments of Hadoop components that leverage Ambari as their management and monitoring service.

    Supported configurations of data source software

    Moviri – Ambari Extractor requires that Ambari continuously and correctly monitors the various entities supported by the integration (the full list is available below). Failure to meet this requirement will cause gaps in data coverage.

    Installation

    Downloading the additional package

    The integrator is made available in the form of an additional component, which you may download from the BMC electronic distribution site (EPD) or retrieve from your content media.

    Installing the additional package

    To install the connector as a TrueSight Capacity Optimization additional package, refer to the Performing system maintenance tasks instructions.

     

    Datasource Check and Configuration

    Preparing to connect to the data source software

    The connector included in "Moviri Integrator for TrueSight Capacity Optimization – Ambari" uses the Ambari REST API v1 to communicate with Ambari. The API is always enabled and no additional configuration is required.
    Please note that the connector only uses the GET method, preventing any accidental changes to the environment.
    The connector requires a regular Ambari user (admin privileges are not required) with read-only permissions on all the clusters that should be accessed.
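
    As a quick connectivity check, the API access described above can be exercised with a few lines of Python. This is only a sketch: the hostname, port, and credentials are placeholders, and it uses the same read-only GET access the connector relies on.

        # Minimal connectivity check against the Ambari REST API v1.
        # Hostname, port, and credentials below are placeholders (assumptions).
        import requests

        AMBARI_URL = "http://ambari.example.com:8080/api/v1"   # default Ambari port is 8080
        AUTH = ("readonly_user", "password")                    # a read-only account is sufficient

        # The connector only ever issues GET requests; listing the clusters
        # visible to the user is enough to validate credentials and permissions.
        response = requests.get(AMBARI_URL + "/clusters", auth=AUTH, timeout=20)
        response.raise_for_status()

        for item in response.json().get("items", []):
            print(item["Clusters"]["cluster_name"])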

    Connector configuration attributes

    The following table shows the properties specific to this connector; all the other generic properties are documented here.

     

    Property Name | Value Type | Required? | Default | Description

    Ambari Connection
    Ambari Hostname | String | Yes | | Ambari server hostname
    Spark Hostname | String | Yes | | Spark server hostname
    Ambari Port | Number | Yes | 8080 | Ambari connection port
    Spark Port | Number | Yes | 18080 | Spark connection port
    User | String | Yes | | Username
    Password | String | Yes | | Password
    Connection Timeout | Number | No | 20 | Connection timeout in seconds

    Data Selection
    Import nodes | Boolean | Yes | true | Import data at node level
    Import pools | Boolean | Yes | true | Import data at pool level
    HDFS data import | Boolean | Yes | true | Import data about the HDFS service
    YARN data import | Boolean | Yes | true | Import data about the YARN service
    HBASE data import | Boolean | Yes | true | Import data about the HBASE service
    SPARK data import | Boolean | Yes | true | Import data about the SPARK service
    Cluster regexp whitelist, semicolon separated | String | No | | List of clusters to be imported, semicolon separated. Regexp is supported.
    Cluster regexp blacklist, semicolon separated | String | No | | List of clusters not to be imported, semicolon separated. Regexp is supported. This setting overrides the whitelist in case of conflict (see the filtering sketch after this table).
    Host regexp whitelist, semicolon separated | String | No | | List of hosts to be imported, semicolon separated. Regexp is supported. Setting this field disables aggregation at cluster level.
    Host regexp blacklist, semicolon separated | String | No | | List of hosts not to be imported, semicolon separated. Regexp is supported. This setting overrides the whitelist in case of conflict. Setting this field disables aggregation at cluster level.
    Maximum pool exploration depth | Number | No | | A limit to the exploration of nested pools.
    Substitute any dot char in pools names with this char | Char | No | - | Because the dot is a special character for the Loader component, it is recommended to substitute it.

    Time Interval Settings
    Maximum days to extract for execution | Number | No | 7 | Each ETL run will not extract more than the specified number of days.
    Date limit not to extract beyond (YYYY-MM-DD HH24:MI:SS) | Date | No | | Maximum date to be considered while extracting data.
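
    The whitelist and blacklist properties above accept semicolon-separated regular expressions, with the blacklist overriding the whitelist in case of conflict. The following sketch illustrates the intended semantics only; it is not the connector's code, and full-string matching is an assumption.

        # Illustrative sketch of the whitelist/blacklist semantics described above.
        # Not the connector's implementation; full-string regexp matching is assumed.
        import re

        def is_imported(name, whitelist="", blacklist=""):
            def matches(patterns):
                return any(re.fullmatch(p, name) for p in patterns.split(";") if p)
            if matches(blacklist):      # the blacklist overrides the whitelist
                return False
            if whitelist:               # an empty whitelist means "import everything"
                return matches(whitelist)
            return True

        # Example: import production clusters except a hypothetical sandbox one.
        print(is_imported("prod-hdp-01", whitelist="prod-.*", blacklist=".*sandbox.*"))   # True
        print(is_imported("prod-sandbox", whitelist="prod-.*", blacklist=".*sandbox.*"))  # False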

     

    The following image shows the list of options in the ETL configuration menu, including the advanced entries.

    Supported entities

    The following entities are supported:

    • Hadoop Cluster
    • Hadoop Resource Pool
    • Hadoop Node

    In addition to standard system performance metrics, data related to the following Hadoop specific services is gathered:

    • HDFS
    • SPARK
    • YARN
    • HBASE

    Hierarchy

    The connector is able to replicate relationships and logical dependencies among these entities. In particular, all the available Clusters are attached to the root of the hierarchy, and each Cluster contains its own Nodes, Resource Pools, and Services.
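
    For illustration, the cluster-to-node relationships that the connector replicates correspond to information the Ambari REST API already exposes. A hedged sketch (placeholder hostname and credentials) of how such a hierarchy can be retrieved:

        # Illustrative only: list the cluster -> host relationships that the
        # connector mirrors in the TrueSight hierarchy. Hostname and credentials
        # are placeholders.
        import requests

        AMBARI_URL = "http://ambari.example.com:8080/api/v1"
        AUTH = ("readonly_user", "password")

        clusters = requests.get(AMBARI_URL + "/clusters", auth=AUTH, timeout=20).json()
        for cluster in clusters.get("items", []):
            name = cluster["Clusters"]["cluster_name"]
            hosts = requests.get(AMBARI_URL + "/clusters/" + name + "/hosts",
                                 auth=AUTH, timeout=20).json()
            print(name, [h["Hosts"]["host_name"] for h in hosts.get("items", [])])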

    Service-specific data is available among the metrics of the above entities, according to the following table.

     

    Entity | HDFS | YARN | HBASE | SPARK
    Cluster | X | X | X | X
    Pool | | X | |
    Node | X | | |

    Troubleshooting

    For ETL troubleshooting, please refer to the official BMC documentation available here.

    Known issues

    Issue | Resolution
    ETL runs fine but data is partially or totally missing | Data is probably missing in the data source. Check from the Ambari web frontend whether data is available; if it is not, the following image is shown. In that case, consider enabling data collection.
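
    As an alternative to the web frontend, the same check can be performed through the REST API. This is only a sketch: the cluster name, host name, and credentials are placeholders, and the cpu metric is used purely as an example.

        # Hedged sketch: verify via the Ambari REST API whether metrics are being
        # collected for a host. Cluster/host names and credentials are placeholders.
        import requests

        AMBARI_URL = "http://ambari.example.com:8080/api/v1"
        AUTH = ("readonly_user", "password")
        CLUSTER, HOST = "prod-hdp-01", "datanode01.example.com"

        url = AMBARI_URL + "/clusters/" + CLUSTER + "/hosts/" + HOST
        data = requests.get(url, params={"fields": "metrics/cpu"}, auth=AUTH, timeout=20).json()

        # If no "metrics" key is returned, Ambari is not collecting data for this
        # host, and the connector will have nothing to import for it.
        print("metrics available" if data.get("metrics") else "no metrics collected")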