Perform Probable Cause Analysis on events generated by the PATROL Agent

Mix Technologies is a large enterprise company in the Silicon space. It has the following deployment:

  • 5000 servers in the IT infrastructure
  • 1500 servers in a virtual environment using VMware
  • 500 servers in a public cloud environment

Mix Technologies monitors its network devices through SNMP events. It also uses deep-dive network topology tools. The rest of the application infrastructure and servers are monitored using application performance and traditional monitoring tools. The help desk personnel and application owners are also responsible for monitoring and managing the servers in the private cloud.

Roles required

Many user roles are involved in the deployment, operation, and management of Infrastructure Management. Your company may use the roles as described below, consolidate them into fewer roles, or divide them into roles with more granular responsibilities, and it may use other titles for these roles.

The following role is required to complete this use case:

  • Roger - Distributed Service Operations User

Roger handles the following responsibilities:

  • Maintaining the ongoing performance and availability of production systems with a focus on server infrastructure
  • Performing administrative functions on servers and monitoring tools
  • Monitoring performance and resolving availability, performance, and capacity problems

Viewing events generated by the PATROL Agent and performing Probable Cause Analysis on such events

When a critical condition arises in the PATROL environment, the PATROL Agent generates a critical event. Roger wants this event to impact the corresponding configuration item (CI) in Infrastructure Management. After the critical condition is resolved, PATROL generates an OK event, and Roger wants the corresponding change in event status to be reflected in Infrastructure Management. Roger wants to view not only intelligent events but also events generated by the PATROL Agent. He then wants to perform Probable Cause Analysis on all events, drill down to the root cause, and troubleshoot the problem area. To do this, Roger must:

  1. Add an Integration Service
  2. Configure the PATROL Agent and assign it to an Integration Service
  3. Set up thresholds for the PATROL Agent
  4. View the PATROL Agent and the events generated by the PATROL Agent
  5. Perform Probable Cause Analysis on these events to determine the most likely cause of each event
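
The event flow behind these steps can be pictured with a small Python model. This is only an illustrative sketch, not the Infrastructure Management or PATROL API: the class names, the `evaluate` and `candidate_causes` helpers, the metric names, the threshold values, and the 30-minute time window are all assumptions made for the example. It shows a threshold breach producing a critical event that changes the status of a CI, an OK event clearing that status, and a naive time-window filter standing in for probable cause analysis (the product's actual analysis is not described on this page).

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Event:
    ci: str          # name of the configuration item the event belongs to
    metric: str      # hypothetical metric name
    severity: str    # "CRITICAL" or "OK"
    time: datetime


@dataclass
class ConfigurationItem:
    name: str
    open_events: dict = field(default_factory=dict)  # metric -> open critical Event

    @property
    def status(self) -> str:
        # The CI is impacted as long as at least one critical event is open.
        return "CRITICAL" if self.open_events else "OK"

    def apply(self, event: Event) -> None:
        if event.severity == "CRITICAL":
            self.open_events[event.metric] = event
        else:
            # An OK event clears the open critical event for the same metric.
            self.open_events.pop(event.metric, None)


def evaluate(ci_name, metric, value, threshold, now):
    """Turn a metric sample into a CRITICAL or OK event (assumed threshold rule)."""
    severity = "CRITICAL" if value > threshold else "OK"
    return Event(ci_name, metric, severity, now)


def candidate_causes(target, events, window=timedelta(minutes=30)):
    """Return critical events raised within `window` before `target` (assumption)."""
    return [
        e for e in events
        if e is not target
        and e.severity == "CRITICAL"
        and timedelta(0) <= target.time - e.time <= window
    ]


t0 = datetime(2024, 1, 1, 9, 0)
server = ConfigurationItem("server01")

# A sample above the threshold raises a critical event and impacts the CI.
disk = evaluate("server01", "disk_busy_pct", 97.0, 90.0, t0)
server.apply(disk)
status_after_critical = server.status

# A later event on another CI; the disk event precedes it within the window.
app_down = Event("app01", "service_state", "CRITICAL", t0 + timedelta(minutes=5))
candidates = candidate_causes(app_down, [disk, app_down])

# Once the condition is resolved, the OK event restores the CI status.
recovered = evaluate("server01", "disk_busy_pct", 40.0, 90.0, t0 + timedelta(minutes=20))
server.apply(recovered)
status_after_ok = server.status
```

In this model, `status_after_critical` is `"CRITICAL"`, `status_after_ok` is `"OK"`, and `candidates` contains only the disk event. A real analysis would also need to consider how the two CIs are related, which this sketch deliberately leaves out.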


Comments

  1. Abhay Bhagat

    Can I get a detailed, real-world use case for PCA that can be demonstrated to customers and that shows its value to an admin or user of TrueSight?

    Dec 13, 2018 06:06
    1. Sanjay Prahlad

      Hello Abhay, apologies for the delayed response.

      More details are available at "Probable Cause Analysis", and how-to instructions are at "Performing probable cause analysis on an event from the TrueSight console".

      Does this help? Let us know if you need more info.

      Feb 26, 2019 10:41
      1. Abhay Bhagat

        Dear Sanjay, thanks for your reply. What you shared is what I already do. I need an actual use case that shows concretely what happens when I have two monitors, for example one for high Windows disk read/write time and one for an application service that went down: will the disk event show up as a probable cause? Similarly, there should be a few points, written down in code, that define what a probable cause is and how it can really be showcased. That is the information I am looking for. Thanks in advance.

        Mar 01, 2019 09:19