Monitoring configuration best-practice process

To install monitors that collect performance data and enable event management, system administrators should create monitors incrementally in a development environment and test them before moving to a production environment. Each workflow illustrates the process steps discussed in the table that follows.

Perform steps 1 through 12 in a test or development environment.
Perform steps 13 through 21 in production.

Before you begin

This process assumes you have already planned the deployment, installed the Infrastructure Management system, and configured Infrastructure Management components.

For information about setting up this environment, see Staging-Integration-Service-host-deployment-and-policy-management-for-development-test-and-production-best-practices.

Tip

Review the Monitoring-configuration-best-practice-reference before proceeding with this process.

Monitoring configuration guidelines

Keep the following guidelines in mind during the configuration process:

Implement the configuration settings in the order outlined in this topic according to the process workflow. You can deviate from this best practice if you want; however, it is easier to understand the implementation and stay organized if you follow the process provided, especially during the initial implementation.

Start with a small number of agents in production to minimize risk.
Create, test, and validate before moving to production. Do not edit in production unless you find a problem in production that requires editing.

Monitoring is not applied to policy-managed agents until the policies in production are enabled. This is the point where you “go live” in production with monitoring. Backing out before then is easy. Backing out afterwards can be difficult, depending upon the situation. (Massive unintended data collection into the BMC TrueSight Infrastructure Management Server due to poor policy configuration is an example of a potentially difficult situation.)

Although a single policy can include configuration settings from both categories (infrastructure and monitoring), it is a best practice to keep infrastructure configuration and monitoring configuration in separate policies. Keeping them separate helps you keep policies better organized, and in most environments it reduces the number of places in which you have to manage the infrastructure configuration settings.
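
The last guideline can be checked mechanically once you have an inventory of your policies. The following is a minimal sketch, assuming a hypothetical in-memory representation in which each policy name maps to the categories of the settings it contains. The function name and the data format are illustrative only and are not a Central Monitoring Administration export format or API.

```python
def find_mixed_policies(policies):
    """Return the names of policies that mix both setting categories.

    policies: dict mapping policy name -> list of setting categories,
    where each category is the string "infrastructure" or "monitoring".
    This representation is an assumption made for illustration.
    """
    mixed = []
    for name, categories in policies.items():
        # A policy is "mixed" when it contains settings of both categories.
        if {"infrastructure", "monitoring"} <= set(categories):
            mixed.append(name)
    return sorted(mixed)
```

A policy that appears in the result is a candidate for splitting into one infrastructure policy and one monitoring policy.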

Monitoring configuration steps performed in a test or development environment

Step	Task	Reference
1	Establish a policy naming convention and policy precedence scheme before you begin creating policies.	Monitoring-policy-naming-standards-and-definition-best-practices Policy-precedence-best-practices
2	Create separate staging policies in Central Monitoring Administration for the Infrastructure Management development, test, and production environments assigned to the appropriate Integration Service instances. Tip: Develop a clear strategy for assigning the PATROL Agents to each Integration Service. The Infrastructure Management Server does not auto-balance the load between PATROL Agents and Integration Services so the initial assignment is important. Although at least one Integration Service must exist per network, within the network a convention based on name or function, or simply round-robin assignment is acceptable as long as you are consistent and keep track in order to avoid overloading any one Integration Service.	Creating-a-staging-policy Staging-Integration-Service-host-deployment-and-policy-management-for-development-test-and-production-best-practices
3	Create PATROL Agent/KM deployable packages for test PATROL Agents and KMs in the Central Monitoring Administration repository, but do not deploy them to production at this point in the process. Note: If previous versions of PATROL Agents already exist, configure the packages to install into the same directory as the existing PATROL Agent. Best practices: Connect a subset of the PATROL Agents to the respective Integration Services. To connect PATROL Agents to an Integration Service, set the configuration variables on the BMC PATROL Agent when you add the BMC PATROL Agent package through Central Monitoring Administration. This enables the BMC PATROL Agents to automatically initiate the connection to the Integration Service. Starting with a subset of agents is important to avoid overloading the BMC TrueSight Infrastructure Management Server or Integration Service. A reasonable subset is fewer than 100 PATROL Agents. Repeat this process until the maximum recommended number of agents has been added to the Integration Service (per the scaling guidelines). If two different BMC PATROL Agents monitor the same target, both agents must be tied to the same Integration Service. Otherwise, the monitors from the different PATROL Agents might not be properly reconciled on the BMC TrueSight Infrastructure Management Server.	Creating-and-editing-component-installation-packages Central-Monitoring-Administration-repository-best-practices
4	Deploy the PATROL Agent or KM deployable package to the development or test managed servers, and run the PATROL packages silent installer on the test managed machines.	Downloading-and-installing-an-installation-package
5	Validate that the PATROL deployable package installations were successful.	none
6	Validate and test the PATROL Agents in Central Monitoring Administration.	Viewing-the-status-of-Infrastructure-Management-components-in-Central-Monitoring-Administration
7	Configure global server thresholds for the monitoring solutions (KMs) in Central Monitoring Administration. Best practice: Identify and configure the thresholds to set at the PATROL Agent and monitoring solution level. See Threshold considerations.	Managing-global-thresholds Global-thresholds-and-policy-application-best-practices
8	In Central Monitoring Administration, create monitoring policies to be tested in the development environment. Best practices: Configure the monitoring solutions to collect only the data that is needed in the Infrastructure Management console or for event generation from thresholds. Collecting more data than this creates unwanted overhead on the Integration Service and BMC TrueSight Infrastructure Management Server. Ensure that only required instances are being discovered. Each monitoring solution may have different options for controlling this. Disable discovery of instances that are short-lived (for example, instances that are created and then deleted within the span of one to two days). Ensure that all monitoring solutions used for data collection are preloaded, because collection with Infrastructure Management must not require console interaction.	Creating-or-editing-a-monitoring-policy Monitoring-parameter-configuration-best-practices Policy-precedence-best-practices
9	(Optional) If you are creating blackout policies, create time frames for the monitoring solutions (KMs) in Central Monitoring Administration.	Creating-a-time-frame Time-frame-and-blackout-policy-creation-best-practices
10	(Optional) Create blackout policies for the monitoring solutions (KMs) in Central Monitoring Administration.	Creating-a-blackout-policy Time-frame-and-blackout-policy-creation-best-practices
11	Enable the policies for the monitoring solutions (KMs) in Central Monitoring Administration.	Enabling-or-disabling-a-monitoring-policy
12	Test and validate that data is collected according to the policies you defined. Resolve any issues. Note: Integrate data collection and event sources in specific groups one at a time, and then observe the performance of the Infrastructure Management components after each group is integrated. For example, integrate all monitoring for a specific application and then observe and tune the Infrastructure Management components for performance, if needed, before integrating additional collection. Use the performance self-monitoring in the Infrastructure Management Server. As each group is integrated, review the configuration report to verify that the number of devices, instances, and parameters does not exceed the plan. Tune as needed.	Managing-and-monitoring-events
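
The tip in step 2 and the best practices in step 3 describe an assignment scheme without prescribing an implementation. The following is a minimal sketch of one consistent approach: round-robin assignment that keeps all agents monitoring the same target on the same Integration Service and refuses to exceed a per-service limit. The function name, the agent, target, and service names, and the `capacity` value are hypothetical; take the real limit from the scaling guidelines. The sketch does not call any product API.

```python
from collections import defaultdict
from itertools import cycle


def assign_agents(agent_targets, services, capacity):
    """Assign agents to Integration Services round-robin, grouped by target.

    agent_targets: dict mapping agent name -> monitored target (hypothetical data).
    services: list of Integration Service names (hypothetical).
    capacity: assumed maximum number of agents per Integration Service.
    Returns a dict mapping service name -> list of agent names.
    """
    # Group agents by target so agents that share a target stay together.
    groups = defaultdict(list)
    for agent, target in sorted(agent_targets.items()):
        groups[target].append(agent)

    assignment = {service: [] for service in services}
    rotation = cycle(services)
    for target in sorted(groups):
        group = groups[target]
        # Try each service at most once, starting from the next in rotation.
        for _ in range(len(services)):
            service = next(rotation)
            if len(assignment[service]) + len(group) <= capacity:
                assignment[service].extend(group)
                break
        else:
            raise ValueError(f"No Integration Service has room for target {target!r}")
    return assignment
```

Keeping the returned mapping under version control is one way to "keep track" of the assignment, as the tip recommends.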

Monitoring configuration steps performed in a production environment

Step	Task	Reference
13	Move the validated policies from test to production leveraging the export/import utility.	Exporting-and-importing-blackout-and-monitoring-policies
14	Deploy the PATROL packages to a subset of production machines.
15	Deploy the PATROL Agent or KM deployable package to the production managed servers, and run the PATROL packages silent installer on the production managed machines.	Downloading-and-installing-an-installation-package
16	Validate that the PATROL deployable package installations were successful.	none
17	Validate and test the PATROL Agents in Central Monitoring Administration.	none
18	Enable the policies in production.	Enabling-or-disabling-a-monitoring-policy
19	Validate agents and data collection in production. Resolve any issues.	Viewing-the-status-of-Infrastructure-Management-components-in-Central-Monitoring-Administration
20	Deploy remaining agents in batches.	none
21	Between each batch of PATROL Agents and Integration Services deployed and configured, ensure that the Infrastructure Management Server and Integration Services are performing well and can still manage the load. Performance diagnostics are available in the operator console for the Infrastructure Management Server and the respective remote agent nodes where the Integration Service is running. Ensure that the scalability limitations of the Integration Services are not exceeded.	Performance-benchmarks-and-tuning
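
Steps 20 and 21 amount to a gated rollout: deploy a batch, confirm that the servers still cope with the load, and only then continue. The following is a minimal sketch of that loop. The `deploy` and `is_healthy` callables are hypothetical hooks that you would supply yourself, for example by wrapping your own deployment procedure and a manual or scripted review of the performance diagnostics. Nothing here is a product API.

```python
def rollout_in_batches(agents, batch_size, deploy, is_healthy):
    """Deploy agents batch by batch, stopping when the health check fails.

    agents: list of agent names (hypothetical data).
    batch_size: number of agents per batch; step 3 suggests fewer than 100.
    deploy: caller-supplied callable that receives one batch (a list).
    is_healthy: caller-supplied callable that receives all agents deployed
        so far and returns True if it is safe to continue.
    Returns the list of agents that were deployed.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    deployed = []
    for start in range(0, len(agents), batch_size):
        batch = agents[start:start + batch_size]
        deploy(batch)
        deployed.extend(batch)
        # Gate the next batch on server and Integration Service health.
        if not is_healthy(deployed):
            break
    return deployed
```

If the loop stops early, the returned list shows exactly which agents are live, which narrows down what to tune or back out before resuming.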

Additional monitoring sources and capabilities

Monitoring capability	Reference
Define manual application models based on groups and devices, or implement BMC TrueSight App Visibility Manager to enable automatic application models from which you can monitor the performance and health of active or synthetic applications, perform diagnostics, and trace application transactions.	Use-case-Monitoring-the-infrastructure-in-the-context-of-an-application
Use third-party adapters to provide a mechanism for external applications to funnel data into Infrastructure Management. Data adapters facilitate the synchronization of performance data collected by specific monitoring solutions into Infrastructure Management for further analysis.	Event-integration-with-supported-third-party-products Data-integration-with-supported-third-party-products
Define impact service models to monitor when higher-level entities, such as applications, technical services, business services, and organizations are impacted, and how they are impacted when lower-level IT infrastructure entities, such as servers, network devices, and application systems are affected by some condition.	Defining-service-models Configuring-business-services-and-other-CIs-to-appear-in-the-Applications-page