The following topics describe the flow of event and performance data in Infrastructure Management and provide guidance on the deployment, configuration, and use of those components to achieve a scalable environment.
The Integration Service can consume and forward both performance data and events. The following diagram illustrates how the Integration Service nodes fit into the Infrastructure Management architecture.
The Integration Service accepts streaming of PATROL data and events using a common connection port. The default port is 3183. This includes all the data points and events from PATROL for parameters that you select. After the events arrive at the Integration Service, they are separated and follow a unique path to one of the following based on configuration:
For additional details about the ports used, see Network ports.
You can optionally install the following components and configure them on the Integration Service host depending on whether or not they are required in the environment. Before installing any of these additional components, consider scalability and additional resources that you might require.
Event management cell—The event management cell is the event management process installed locally on the same server with the Integration Service. BMC recommends that you install the event management cell on all of the Integration Service host computers.
The cell is not required for forwarding events to the BMC TrueSight Infrastructure Management Server; therefore, the cell does not have to be installed with the Integration Service.
BMC Event Adapters—BMC Event Adapters work with the event management cell to consume non-PATROL events; for example, SNMP traps. BMC recommends that significant non-PATROL event collection be dedicated to other event management cells. The default event adapter classes, rules, and files are installed with the cell that is installed with the Integration Service.
PATROL Agent and Knowledge Module (KM)— The PATROL Agent and Knowledge Module (KM) monitor the Integration Service host processes.
The Infrastructure Management architecture supports buffering of BMC PATROL performance data and events at the PATROL Agents in case there is a network connectivity issue or if the Integration Service cannot be reached. When the PATROL Agent reconnects to an Integration Service process, the buffered data is sent. This capability is not intended to support buffering for very large amounts of data. It is intended to support a few minutes of lost connectivity, not hours or days. Testing has shown that the process can support up to 30 minutes of data collected by the PATROL Agents across 1000 managed servers.
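The buffering behavior described above can be pictured with a minimal sketch. This is not the PATROL Agent implementation: the class name `AgentBuffer`, the `capacity` limit, and the choice to drop the oldest items when the buffer is full are all assumptions made for illustration. The source states only that buffering is meant for short outages.

```python
from collections import deque


class AgentBuffer:
    """Hypothetical model of an agent that buffers data while disconnected."""

    def __init__(self, capacity):
        self.connected = True
        self.sent = []  # stands in for data delivered to the Integration Service
        # Assumption: a bounded buffer that discards the oldest item when full.
        self.pending = deque(maxlen=capacity)

    def record(self, item):
        """Send the item if connected; otherwise hold it in the buffer."""
        if self.connected:
            self.sent.append(item)
        else:
            self.pending.append(item)

    def reconnect(self):
        """Restore the connection and send buffered items in arrival order."""
        self.connected = True
        while self.pending:
            self.sent.append(self.pending.popleft())


buffer = AgentBuffer(capacity=3)
buffer.record("p1")
buffer.connected = False          # simulate lost connectivity
for point in ("p2", "p3", "p4", "p5"):
    buffer.record(point)
buffer.reconnect()
delivered = buffer.sent
```

In this sketch the outage produced more items than the buffer holds, so the oldest buffered item is lost. That illustrates why the capability suits a few minutes of lost connectivity rather than hours or days.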
The Integration Service processes are generally stateless, meaning the following:
The Integration Service acts as a proxy to receive and forward both data and events that are sent to it from the PATROL Agents. It also receives PATROL Agent and Knowledge Module (KM) configuration data from Central Monitoring Administration and passes that data to the PATROL Agents.
BMC PATROL Agents collect performance data and generate events for availability metrics. Assuming version 9.6 or higher of the Infrastructure Management Server and the Integration Service are in use, both performance data and events from BMC PATROL are streamed through the Integration Service hosts as follows:
BMC PATROL streams raw performance data, including all of the data points that you decide to send, to the Infrastructure Management Server. Unlike in previous versions, the data is not summarized.
At least one remote Integration Service host must be deployed for all environments.
Install the Integration Service and event management cell in pairs so that each Integration Service process has a corresponding event management cell installed on the same host computer. In this configuration, events are propagated from the Integration Service to the event management cell running on the same host. The option to install an event management cell is available when you install the Integration Service.
Maintain the event flow path so that all events from any PATROL Agent are always processed through the same event management cells (including cell HA pairs). This ensures event processing continuity where automated processing of one event is dependent on one or more other events from the same agent. An example of this type of processing is the automated closure of critical events that is triggered by “OK” events for the same object that was in a state of critical alarm. If you do not maintain the same event flow path per agent through the same event management cells, correlation of all events from the same agent is not possible because the necessary events are not received and available in the same cells.
Some environments might require more than two Integration Service hosts in a cluster or more than two Integration Service hosts defined for each agent that sends the data (events and performance) through a third-party load balancer to the Integration Service hosts. This is acceptable as long as all events from any one agent always flow through the same high availability (HA) cell pair and the event processing continuity is maintained. For example, if four Integration Service nodes are clustered, then the nodes in the cluster must not have cells configured on them. Instead, the cell must be on other systems (in an HA pair) so that the event path remains the same for all events coming from the agents that the cluster handles. For further information about HA deployments, see Infrastructure Management high availability deployment and best practices.
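The need for a stable event path can be illustrated with a small sketch of the automated-closure example. The `Cell` class, the `deliver` helper, and the event tuples are hypothetical and do not represent the event management cell's actual rule engine. The sketch only shows that an "OK" event can close a critical event when both reach the same cell, and cannot when they reach different cells.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """Hypothetical stand-in for an event management cell."""
    name: str
    # Open critical events, keyed by (agent, monitored object).
    open_critical: set = field(default_factory=set)

    def receive(self, agent, obj, severity):
        key = (agent, obj)
        if severity == "CRITICAL":
            self.open_critical.add(key)
        elif severity == "OK":
            # Closure succeeds only if this cell holds the critical event.
            self.open_critical.discard(key)


def deliver(events, pick_cell):
    """Send each event to the cell chosen by pick_cell(agent, index)."""
    for index, (agent, obj, severity) in enumerate(events):
        pick_cell(agent, index).receive(agent, obj, severity)


events = [("agent1", "disk_C", "CRITICAL"), ("agent1", "disk_C", "OK")]

# Stable path: every event from the agent goes to the same cell.
cell_a, cell_b = Cell("cell_a"), Cell("cell_b")
deliver(events, lambda agent, index: cell_a)
stable_open = cell_a.open_critical | cell_b.open_critical

# Unstable path: successive events alternate between two cells.
cell_c, cell_d = Cell("cell_c"), Cell("cell_d")
alternating = [cell_c, cell_d]
deliver(events, lambda agent, index: alternating[index % 2])
unstable_open = cell_c.open_critical | cell_d.open_critical
```

With the stable path, no critical event remains open. With the alternating path, the critical event for `disk_C` stays open because the "OK" event arrived at a cell that never received it.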
Include multiple event sources other than PATROL
Support more than a few users
A medium or large environment involving more than 100 managed servers
The event management cells allow you to further process events (event enrichment, filtering, correlation, deduplication, auto closure, and so on) before sending them on to the Infrastructure Management Server. This type of event processing must be avoided on the Infrastructure Management Server as much as possible. Event processing in the Infrastructure Management Servers must be controlled and limited to the following:
Event presentation of actionable events only
Collection of events for Probable Cause Analysis
Events used in service modeling
Events sent to the Infrastructure Management Servers must be closely controlled and limited for the following reasons:
Event presentation in the Infrastructure Management Server must not be cluttered with unactionable events that distract or otherwise reduce the efficiency of end users.
The capability to view PATROL performance data in Infrastructure Management without having to forward and store the data in the database is likely to decrease the number of parameters that trend in the Infrastructure Management Server for most environments. This might increase the number of events propagated from PATROL for parameters that do not require baselines but do require static thresholds, which in turn increases the load on the event management cell in the Infrastructure Management Server.
PATROL events are approximately twice the size, in bytes, of events generated in the Infrastructure Management Server. A larger volume of PATROL events increases the memory consumption of the event management cell on the Infrastructure Management Server and also increases the Infrastructure Management Server startup time. The overall startup time for an Infrastructure Management Server at full capacity ranges from 15 to 20 minutes.
Automated event-to-monitor association slightly increases the load on the event management cell that is embedded in the Infrastructure Management Server.
The following are additional best practices for deployment and configuration of Integration Services, event management cells, and event and performance data collection.
|Integration Service and Integration Service host deployment|
|Event management cell deployment and configuration|
|Event collection and processing|
|Performance data collection|
Configuration of the performance and event data that is sent from the PATROL Agents to the BMC TrueSight Infrastructure Management Server is defined in policies, which are automatically applied to the required PATROL Agents. The PATROL Agent assignment is defined in each policy based on selection criteria. The details of agent selection criteria per policy are discussed in Staging Integration Service host deployment and policy management for development, test, and production best practices. BMC PATROL events and performance data are completely controlled at the PATROL Agent based on these policies. This means that, for each parameter, you can send data only, events only, both data and events, or neither. You can change these configuration settings at any time without having to rebuild any configurations or restart any processes.
First, a PATROL Agent reads the tag information from the pconfig variable, /AgentSetup/Identification/Tags/Tag/tagName, where tagName is the name of the tag. The PATROL Agent then sends the information to the Integration Service, which passes the information to Central Monitoring Administration. Central Monitoring Administration evaluates which policies match the tags or the agent properties, determines the final configuration to be applied, and sends the configuration information to the agent.
A PATROL Agent initiates a configuration request after certain events, such as agent installation, agent restart, agent auto-connection with Integration Service, or changing a tag on the agent. If no policy matches the agent conditions, the agent does not receive configuration information. The agent does not receive the configuration until a matching policy is created.
If a policy is created or updated, changes are pushed from Central Monitoring Administration, via the Integration Service, to PATROL Agents.
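The policy evaluation described above can be sketched as follows. This is an illustration only, not the Central Monitoring Administration logic: the function `resolve_configuration`, the `tag` and `config` keys, and the sample setting names are hypothetical. The sketch matches on tags only (not on other agent properties), and the rule that later policies in the list override earlier ones is an assumption, because the source does not describe how overlapping policies are combined.

```python
def resolve_configuration(agent_tags, policies):
    """Return the merged configuration for an agent, or None if nothing matches.

    agent_tags: set of tag names reported by the agent.
    policies:   list of dicts with hypothetical keys "tag" and "config".
    """
    matching = [policy for policy in policies if policy["tag"] in agent_tags]
    if not matching:
        # No matching policy: the agent receives no configuration.
        return None
    final = {}
    for policy in matching:
        # Assumption: later policies in the list override earlier ones.
        final.update(policy["config"])
    return final


policies = [
    {"tag": "linux", "config": {"cpu_poll_seconds": 60}},
    {"tag": "database", "config": {"cpu_poll_seconds": 30,
                                   "monitor_tablespace": True}},
]

web_config = resolve_configuration({"linux"}, policies)
db_config = resolve_configuration({"linux", "database"}, policies)
unmatched = resolve_configuration({"windows"}, policies)
```

Here `unmatched` is `None`, which corresponds to an agent that receives no configuration until a matching policy is created.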
The monitoring solutions configuration is stored under the /ConfigData pconfig branch. The pconfig variables that the PATROL Agent receives from Central Monitoring Administration are applied with the REPLACE request. For the configuration under /ConfigData, only the difference between the received configuration and the configuration that the agent already contains is applied. If no configuration is received for a particular class, that class is considered deleted and is removed from /ConfigData. Configuration under /AgentSetup is applied directly.
For /AgentSetup configurations, the variables under the /ConfigData pconfig branch take precedence if there are conflicts.
You should not manually update any variables and values under /ConfigData. The variables and values are only for internal use.
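The difference-based handling of /ConfigData described above can be sketched as a dictionary comparison. This is a simplified model, not the agent's actual REPLACE processing: the function names, the class names `CLASS_A` through `CLASS_C`, and the `threshold` setting are hypothetical, and the sketch does not model /AgentSetup or the precedence rule for conflicts.

```python
def diff_config_data(current, received):
    """Return (to_set, to_delete) that turns `current` into `received`.

    Both arguments map a class name to its configuration.
    """
    # Classes that are new or whose configuration changed.
    to_set = {
        name: config
        for name, config in received.items()
        if name not in current or current[name] != config
    }
    # Classes missing from the received configuration count as deleted.
    to_delete = [name for name in current if name not in received]
    return to_set, to_delete


def apply_config_data(current, received):
    """Apply only the difference and return the updated configuration."""
    to_set, to_delete = diff_config_data(current, received)
    updated = dict(current)
    updated.update(to_set)
    for name in to_delete:
        del updated[name]
    return updated


current = {"CLASS_A": {"threshold": 90}, "CLASS_B": {"threshold": 80}}
received = {"CLASS_A": {"threshold": 95}, "CLASS_C": {"threshold": 70}}

to_set, to_delete = diff_config_data(current, received)
updated = apply_config_data(current, received)
```

In this example `CLASS_A` is updated, `CLASS_C` is added, and `CLASS_B` is removed because it was not part of the received configuration.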
Solution Administrators configure BMC PATROL Agents in Central Monitoring Administration, which is available from the TrueSight console. For information, see .