Event triage and remediation

This page describes the event triage and remediation use cases and the business value they provide.

Process overview

IT operations are responsible for keeping IT, and therefore the business, running. Monitoring and performance tools are needed to verify that the IT environment is functioning and to generate events when something goes wrong. When an event occurs that might concern the organization, IT team members must investigate it to determine whether it is important and, if it needs addressing, start the process to fix it.

Many IT organizations turn service-impacting events into incidents in the service desk, where they can be tracked and fixed. Typically, the assigned engineer gathers information, diagnoses the issue and, once the problem is understood, fixes it and restores service.

This process is labor-intensive, so many organizations have turned to IT process automation to streamline operations by implementing the following use cases.

Key use cases

  • Collecting additional diagnostics and enriching events
  • Using a standard automation architecture to initiate incident and change management processes
  • Executing standardized triage and remediation tasks
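
The use cases above can be sketched as a small workflow. This is only an illustration, not the API of any particular automation product: the names Event, Incident, collect_diagnostics, REMEDIATIONS, and triage are hypothetical, the severity levels are an assumed way to decide whether an event is important, and the service desk is modeled as a plain in-memory list.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Event:
    """A monitoring event (hypothetical shape, for illustration only)."""
    source: str
    kind: str
    severity: str  # assumed levels: "info", "warning", "critical"
    diagnostics: Dict[str, str] = field(default_factory=dict)


@dataclass
class Incident:
    """A service desk record created from an event."""
    event: Event
    status: str = "open"
    notes: List[str] = field(default_factory=list)


def collect_diagnostics(event: Event) -> Event:
    # Use case 1: enrich the event. A real workflow would query the
    # affected host or monitoring tool; here we only record the source.
    event.diagnostics["collected_from"] = event.source
    return event


# Use case 3: standardized fixes for known event kinds (hypothetical).
# Each callable returns True when the fix succeeded.
REMEDIATIONS: Dict[str, Callable[[Event], bool]] = {
    "disk_full": lambda event: True,  # stand-in for a cleanup task
}


def triage(event: Event, service_desk: List[Incident]) -> Optional[Incident]:
    """Enrich an event, open an incident, and attempt a known fix."""
    if event.severity == "info":
        return None  # assumed rule: informational events need no action

    collect_diagnostics(event)

    # Use case 2: initiate the incident process in the service desk.
    incident = Incident(event)
    service_desk.append(incident)

    fix = REMEDIATIONS.get(event.kind)
    if fix is not None and fix(event):
        incident.status = "resolved"
        incident.notes.append("auto-remediated")
    else:
        incident.notes.append("assigned to engineer")
    return incident
```

In this sketch, a critical disk_full event is enriched, recorded as an incident, and resolved automatically, while an event of an unknown kind stays open for an engineer. Every event that passes the severity check is documented in the service desk whether or not a fix is available.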

Business values

  • Cost
    • IT operations do not have to manually collect diagnostic information
    • IT operations do not have to manually create incidents
    • IT operations can automate fixes for known events, reducing system downtime
  • Risk
    • Less risk of long IT service outages
    • Less risk of an event/incident going unnoticed
  • Governance
    • ITIL incident and change management processes are followed
    • Incidents, changes, and outages are documented in the service desk