    “Moviri Integrator for TrueSight Capacity Optimization – Cloudera” is an additional component of the BMC TrueSight Capacity Optimization product. It extracts data from Cloudera Enterprise, the Cloudera Hadoop distribution composed of CDH (Cloudera Data Hub) and Cloudera Manager. Relevant capacity metrics are loaded into BMC TrueSight Capacity Optimization, which provides advanced analytics over the extracted data in the form of an interactive dashboard, the Hadoop View.

    The integration supports the extraction of both performance and configuration data across the different components of CDH, and can be configured through parameters that control entity filtering and many other settings. Furthermore, the connector is able to replicate relationships and logical dependencies among entities such as clusters, resource pools, services and nodes.

    This documentation is intended for BMC TrueSight Capacity Optimization administrators who are in charge of configuring and monitoring the integration between BMC TrueSight Capacity Optimization and Cloudera.

    Requirements

    Supported versions of data source software

    • Supported Cloudera Data Hub and Cloudera Manager versions: 5.1 to 5.10 (see note 1)
    • The integration supports Cloudera Manager as bundled in both the Cloudera Enterprise and Cloudera Express products

    Note 1: Cloudera Data Hub and Cloudera Manager versions 5.9 and 5.10 are supported only if you apply Feature Pack 1 (10.7.01) of TrueSight Capacity Optimization 10.7.

    Supported configurations of data source software

    The Moviri – Cloudera Extractor requires that Cloudera Manager continuously and correctly monitors the entities supported by the integration (the full list is available below). Any gap in this monitoring will cause a gap in data coverage.

    Installation

    Downloading the additional package

    The ETL module is made available as an additional component, which you can download from the BMC electronic distribution site (EPD) or retrieve from your content media.

    Installing the additional package

    To install the connector as a TrueSight Capacity Optimization additional package, refer to the "Performing system maintenance tasks" instructions.

     

    Datasource Check and Configuration

    Preparing to connect to the data source software

    The connector included in "Moviri Integrator for TrueSight Capacity Optimization – Cloudera" uses the Cloudera Java API v6 to communicate with Cloudera Manager. This API is always enabled, so no additional configuration is required.
    Note that the connector uses only SELECT statements, which prevents any accidental change to the environments.
    The connector requires a read-only user with permissions on all the clusters that should be accessed.
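
    Before configuring the connector, it can be useful to check that the read-only user is able to see the expected clusters. The sketch below only builds the request. It assumes that Cloudera Manager exposes version 6 of its REST API under `/api/v6` on the Cloudera port and that it accepts HTTP basic authentication. The helper name `build_clusters_request`, the host name and the credentials are hypothetical placeholders, and this is not the connector's own code.

```python
import base64
import urllib.request


def build_clusters_request(host, port=7180, user="", password="", use_tls=False):
    """Build (but do not send) a GET request that lists the visible clusters.

    Assumption: the API is reachable at /api/v6/clusters; adjust the path if
    your Cloudera Manager version exposes a different API version.
    """
    scheme = "https" if use_tls else "http"
    url = f"{scheme}://{host}:{port}/api/v6/clusters"
    # HTTP basic authentication: base64("user:password")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical usage (sending requires network access to Cloudera Manager):
#   req = build_clusters_request("cm.example.com", user="readonly", password="secret")
#   with urllib.request.urlopen(req, timeout=20) as resp:
#       print(resp.read().decode("utf-8"))
```

    If the response lists every cluster that the ETL should import, the user has sufficient visibility. The 20-second timeout in the usage comment mirrors the default Connection Timeout of the connector.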

    Connector configuration attributes

    The following table shows the properties that are specific to the connector; all other generic properties are documented here.

    Cloudera Manager Connection

    | Property Name | Value Type | Required? | Default | Description |
    |---|---|---|---|---|
    | Hostname | String | Yes | | Cloudera server hostname |
    | Cloudera Port | Number | Yes | 7180 | Cloudera connection port |
    | Spark Port | Number | Yes | 18080 | Spark connection port |
    | User | String | Yes | | Username |
    | Password | String | Yes | | Password |
    | Connection Timeout | Number | No | 20 | Connection timeout in seconds |
    | Use Encryption (TLS) | Boolean | Yes | false | Use encryption |
    | Ignore certificate validation | Boolean | Yes | false | Ignore validation of the TLS certificate |
    | Ignore common name validation | Boolean | Yes | false | Ignore validation of the TLS common name |
    | Warn if version is unsupported | Boolean | Yes | false | Warn if the Cloudera Manager version is unsupported |

    Data Selection

    | Property Name | Value Type | Required? | Default | Description |
    |---|---|---|---|---|
    | Data Granularity | Multiple | Yes | 10 minutes | Granularity of data to be imported |
    | Import nodes | Boolean | Yes | true | Import data at node level |
    | Import pools | Boolean | Yes | true | Import data at pool level |
    | Import hbase | Boolean | Yes | true | Import data about the HBASE service |
    | Import spark | Boolean | Yes | true | Import data about the Spark service |
    | Substitute any dot char in pools names with this char | Char | No | - | Because the dot is a special character for the Loader component, replacing it is recommended |

    Time Interval Settings

    | Property Name | Value Type | Required? | Default | Description |
    |---|---|---|---|---|
    | Default Last Counter (YYYY-MM-DD HH24:MI:SS Z) | Date | Yes | | Default last counter value |
    | Relocate data to timezone (e.g. America/New_York, leave empty to use BCO timezone) | String | No | | Timezone to which imported samples are relocated |
    | Limit extraction to date (YYYY-MM-DD HH24:MI:SS) | Date | No | | Maximum date to be considered while extracting data |
    | Max days to import in a single run (0 for no limit) | Number | No | | Maximum days to collect in a single ETL run |
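
    To make the effect of some of these settings more concrete, the sketch below illustrates one plausible reading of the dot substitution in pool names and of how the last counter, the extraction limit date and the maximum number of days per run could bound a single extraction. The function names `sanitize_pool_name` and `extraction_window` are hypothetical, and the logic is an assumption derived from the property descriptions, not the connector's actual implementation.

```python
from datetime import datetime, timedelta


def sanitize_pool_name(name, replacement="-"):
    """Replace every dot in a pool name with the configured character."""
    return name.replace(".", replacement)


def extraction_window(last_counter, now, limit_date=None, max_days=0):
    """Return the (start, end) interval one run would cover, or None.

    Assumed semantics (not confirmed by the product documentation):
    - extraction starts at the last counter;
    - it never goes past `limit_date`, when one is set;
    - it covers at most `max_days` days per run (0 means no limit).
    """
    end = now
    if limit_date is not None:
        end = min(end, limit_date)
    if max_days > 0:
        end = min(end, last_counter + timedelta(days=max_days))
    if end <= last_counter:
        return None  # nothing left to import
    return last_counter, end


# Hypothetical usage with made-up dates and pool name:
print(sanitize_pool_name("root.default"))
print(extraction_window(datetime(2017, 1, 1), datetime(2017, 1, 10), max_days=2))
```

    Under these assumptions, an ETL that is several days behind catches up in steps of at most the configured number of days per run, instead of requesting the whole backlog at once.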

    The following image shows the list of options in the ETL configuration menu, including the advanced entries.

    Supported entities

    The following entities are supported:

    • Hadoop Cluster
    • Hadoop Resource Pool
    • Hadoop Node

    In addition to standard system performance metrics, data related to the following Hadoop-specific services is gathered:

    • HDFS
    • SPARK
    • YARN
    • HBASE
    • MAP REDUCE

    Hierarchy

    The connector is able to replicate relationships and logical dependencies among these entities. In particular, all the available Clusters are attached to the root of the hierarchy, and each Cluster contains its own Nodes and Pools.
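
    The shape of this hierarchy can be illustrated with a small sketch. The dictionary layout and the function name `build_hierarchy` are hypothetical and chosen only for illustration; the entity type labels are the ones listed under Supported entities.

```python
def build_hierarchy(clusters):
    """Build a root -> cluster -> (nodes, pools) tree.

    `clusters` maps a cluster name to a dict with optional "nodes" and
    "pools" lists. This input layout is an assumption made for the example.
    """
    root = {"name": "root", "type": "root", "children": []}
    for cluster_name, members in sorted(clusters.items()):
        cluster = {"name": cluster_name, "type": "Hadoop Cluster", "children": []}
        # Each cluster owns its nodes and its resource pools.
        for node in members.get("nodes", []):
            cluster["children"].append(
                {"name": node, "type": "Hadoop Node", "children": []}
            )
        for pool in members.get("pools", []):
            cluster["children"].append(
                {"name": pool, "type": "Hadoop Resource Pool", "children": []}
            )
        root["children"].append(cluster)
    return root


# Hypothetical usage with made-up entity names:
tree = build_hierarchy(
    {"cluster-a": {"nodes": ["node1", "node2"], "pools": ["root-default"]}}
)
for cluster in tree["children"]:
    print(cluster["name"], [child["name"] for child in cluster["children"]])
```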

    Service data is available among the metrics of the above entities, as shown in the following table.

     

    | Entity | HDFS | YARN | HBASE | MAP REDUCE | SPARK |
    |---|---|---|---|---|---|
    | Cluster | X | X | X | X | X |
    | Pool | | X | | | |
    | Node | X | | | | |

    Troubleshooting

    For ETL troubleshooting, refer to the official BMC documentation available here.