BMC Helix Control-M Incident Response policy
If service is disrupted in your production environment, the BMC Operations team restores service as quickly as possible. After service has been fully restored, BMC provides a Major Incident Report (MIR) document in the following situations:
- The disruption resulted in an outage (as defined in the BMC Helix Control-M Availability policy); or
- The disruption resulted in a significant service degradation (Severity 1) as deemed by BMC Operations.
If the MIR document does not clearly define probable preventive actions, the customer may request a detailed Root Cause Analysis (RCA) document.
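The criteria above can be read as a simple decision rule. The following Python sketch is illustrative only and is not part of any BMC product or API; the `Disruption` class, its field names, and the `mir_expected` function are hypothetical names introduced here to make the rule explicit.

```python
from dataclasses import dataclass


@dataclass
class Disruption:
    """Hypothetical record of a service disruption (field names are assumptions)."""
    production: bool   # True if the disruption affected a production environment
    outage: bool       # True if it met the Availability policy's definition of an outage
    severity_1: bool   # True if BMC Operations deemed it a significant degradation


def mir_expected(d: Disruption) -> bool:
    """Return True when the policy text indicates an MIR document is provided."""
    # MIR documents apply to production disruptions that were either an
    # outage or a Severity 1 degradation.
    return d.production and (d.outage or d.severity_1)
```

Under this reading, a production outage qualifies, while a lower-severity degradation or any disruption outside production does not.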
Major Incident Report documents
BMC will provide an MIR document within three business days of the conclusion of an incident that meets the criteria defined above. MIR documents include the following information:
- Symptoms observed by the BMC incident response team
- Steps taken to mitigate impact or restore service
- Preventive actions (when available at the time of MIR document delivery; see Root Cause Analysis documents for additional details about preventive actions)
MIR documents are not provided for non-production environments. Additionally, the investigation, determination, and mitigation of a root cause for an incident are not part of the MIR document.
Root Cause Analysis documents
Upon customer request, and after the MIR document has been delivered, BMC will provide an RCA document within 20 business days of the conclusion of a major incident. The RCA document includes the following information:
- Timing and duration of the event
- Details describing the action taken to restore service
- Technical details uncovered during the review of available log data (including an assessment of the root cause)
- Recommendations for preventive action
BMC is committed to providing an assessment of the root cause, but an RCA document might not identify an actual root cause, because it is not always possible to determine or define a fundamental root cause. RCA documents are not provided for non-production environment incidents.
In some complex outage scenarios, when not enough diagnostic data is available or when BMC is unable to reproduce an issue, BMC might need to enable additional logging in the production environment to capture diagnostic data.
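To estimate when a document might be due under the delivery windows stated in this policy (three business days for an MIR document, 20 business days for an RCA document), the following Python sketch counts business days from the date an incident concluded. It rests on assumptions that the policy does not state: business days are taken to be Monday through Friday, public holidays are ignored, and counting starts on the day after the incident concludes. The function and constant names are hypothetical.

```python
from datetime import date, timedelta

MIR_BUSINESS_DAYS = 3    # from the policy: MIR within three business days
RCA_BUSINESS_DAYS = 20   # from the policy: RCA within 20 business days


def add_business_days(start: date, days: int) -> date:
    """Return the date that falls `days` business days after `start`.

    Assumption: business days are Monday to Friday and holidays are ignored.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday through Friday
            remaining -= 1
    return current


# Example: an incident that concluded on Friday, 1 March 2024.
concluded = date(2024, 3, 1)
mir_due = add_business_days(concluded, MIR_BUSINESS_DAYS)  # Wednesday, 6 March 2024
rca_due = add_business_days(concluded, RCA_BUSINESS_DAYS)  # Friday, 29 March 2024
```

The actual due dates depend on how BMC defines a business day and on any applicable holidays, so treat the result as an estimate only.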