Risks overview


As an operator or a site reliability engineer (SRE), it's critical that you are able to observe the business services in your organization to monitor their overall health.

A vulnerability is a flaw in a system that can compromise security, and many new, critical vulnerabilities affect services daily.

IT personnel often face challenges in understanding and prioritizing these risks due to the complex nature of the vulnerabilities and a lack of the required security expertise. The lengthy, manual remediation process can also impact service health.

When infrastructure-based changes are implemented in an organization, their direct impact on service health and performance is not easily identifiable. Change risk assessment is often a manual and time-consuming activity, and inadequate risk assessment can lead to service outages and further disruption to the business.

BMC Helix AIOps provides a comprehensive set of risk monitoring capabilities.

Vulnerabilities

A vulnerability is a flaw or weakness in a system that can compromise security. Tens of thousands of vulnerabilities, many with high or critical severity, affect services daily. IT personnel often find it difficult and time-consuming to understand a vulnerability and to assess and prioritize its risks. The process of creating remediation content is lengthy and manual, with low throughput and a high margin for error, and it can be delayed if the SecOps or DevOps team is occupied with other tasks.

As an operator or a site reliability engineer (SRE), you need a robust vulnerability management solution so that you can monitor the vulnerabilities affecting your services, investigate the associated risks, and quickly prioritize remediation to restore the health of the impacted services.

The Vulnerabilities page provides relevant information about the services used by your organization in one place. You can view the following information:

  • The top impacted services, based on the Risk score assigned to them (a numerical value between 0 and 10)
  • The top remediation owners (the users or user groups that own vulnerabilities), based on the number of vulnerabilities assigned to them
  • The top vulnerabilities, based on their Severity (Critical, High, Medium, or Low)
  • The details of each vulnerability, including the option to generate remediation content for it
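
The three "top" rankings in the list above can be illustrated with a minimal Python sketch. This is not how BMC Helix AIOps computes or stores these values. The `Vulnerability` class, the function names, the owner names, and all sample data except the TrainsApp score and the Log4j vulnerability name from the scenario below are hypothetical and exist only for illustration.

```python
from collections import Counter
from dataclasses import dataclass

# Severity levels named on the Vulnerabilities page, most severe first.
SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


@dataclass
class Vulnerability:
    """Hypothetical record; the field names are assumptions, not a product schema."""
    name: str
    severity: str   # one of the keys in SEVERITY_ORDER
    owner: str      # user or user group that owns the vulnerability
    service: str    # impacted service


def top_services(risk_scores: dict[str, float], n: int = 3) -> list[tuple[str, float]]:
    """Rank services by Risk score (0 to 10), highest first."""
    return sorted(risk_scores.items(), key=lambda item: item[1], reverse=True)[:n]


def top_owners(vulns: list[Vulnerability], n: int = 3) -> list[tuple[str, int]]:
    """Rank remediation owners by the number of vulnerabilities assigned to them."""
    return Counter(v.owner for v in vulns).most_common(n)


def top_vulnerabilities(vulns: list[Vulnerability], n: int = 3) -> list[Vulnerability]:
    """Rank vulnerabilities by severity, most severe first."""
    return sorted(vulns, key=lambda v: SEVERITY_ORDER[v.severity])[:n]


# Illustrative data only.
risk_scores = {"TrainsApp": 9.1, "PaymentsApp": 6.4, "NotifyApp": 3.2}
vulns = [
    Vulnerability("Hypothetical header issue", "Low", "devops-team", "NotifyApp"),
    Vulnerability("Apache Log4j SEoL (<= 1.x)", "Critical", "secops-team", "TrainsApp"),
    Vulnerability("Hypothetical TLS issue", "Medium", "secops-team", "PaymentsApp"),
]
```

With this sample data, `top_services(risk_scores, 1)` returns the TrainsApp entry, and `top_owners(vulns, 1)` returns the hypothetical `secops-team` with a count of 2.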

Scenario

The Apex Global IT Train Ticketing System is a microservices-based architecture that provides a portal for booking and managing train reservations.

Bruce is a site reliability engineer at Apex Global IT and is responsible for monitoring the overall health of all the services used for the train ticketing system. He uses BMC Helix AIOps for his monitoring.

The Vulnerabilities tab on the Risks page on the console shows the top impacted services, the top vulnerabilities impacting the services in his organization, and the top remediation owners. Today, he observes that the TrainsApp service, typically used by travelers to book tickets, is impacted and has a Risk score of 9.1.

He clicks the service name to open the service details and observes that there are 169 critical vulnerabilities affecting the service. The most critical vulnerability affecting the TrainsApp service is Apache Log4j SEoL (<= 1.x). He clicks the vulnerability name to view the vulnerability details, such as severity, CVE-ID, CVSS score, impacted services, and the number of impacted assets.

Bruce has enabled BMC HelixGPT, which generates a vulnerability summary in a human-readable format that is easy to understand. For this vulnerability, BMC HelixGPT generates the following summary:

A critical vulnerability exists in the Apache Log4j version less than or equal to 1.x. Since it is no longer maintained by the vendor, there will be no new security patches released. This leaves the system exposed to potential security vulnerabilities. It is strongly recommended to upgrade to a newer, supported version of Apache Log4j to ensure proper maintenance and security updates. The vulnerability has a CVSS Score of 10, indicating critical severity.

Bruce also leverages BMC HelixGPT to generate the best action recommendations for remediating the vulnerability.

Scenario_Vuln summary_251.png

Based on this information, Bruce can then take corrective measures to reduce the risks associated with open vulnerabilities.

With these capabilities, Bruce helps ensure that the services in his organization do the following:

  • Remain available and healthy at all times
  • Perform at an optimal level
  • Have low downtime and minimal impact on the business


BMC HelixGPT-based summary and best action recommendations

BMC Helix AIOps connects with BMC HelixGPT and uses its generative AI capabilities to help operators or SREs understand a vulnerability faster by providing a human-readable, AI-generated summary. The summary gives a synopsis of the vulnerability and explains its complete context.

vulnerability_summary_251.png

Important

To enable BMC HelixGPT, contact BMC Support.

If BMC HelixGPT is not enabled, the vulnerability summary is the vulnerability description received from the scanning systems configured in BMC Helix Automation Console.

Vuln_summary_no HelixGPT.png

Best action recommendations

By using the generative AI capabilities, BMC HelixGPT provides a step-by-step action plan for remediating a vulnerability. These remediation steps are called best action recommendations and can be used by the operators or SREs to resolve the vulnerability. Best action recommendations help close vulnerabilities faster and improve the mean time to resolve (MTTR).
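
As a rough illustration of the MTTR metric mentioned above, the following Python sketch computes the mean time to resolve as the average of the time between when each vulnerability was opened and when it was resolved. This is one common way to define the metric, not necessarily how BMC Helix AIOps calculates it. The function name and the timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def mean_time_to_resolve(records: list[tuple[datetime, datetime]]) -> timedelta:
    """Average of (resolved - opened) over all resolved records.

    Each record is an (opened, resolved) pair. The record shape is an
    assumption made for this sketch.
    """
    if not records:
        raise ValueError("at least one resolved record is required")
    durations = [resolved - opened for opened, resolved in records]
    # sum() needs a timedelta start value because durations are timedeltas.
    return sum(durations, timedelta()) / len(durations)


# Illustrative data only: one vulnerability resolved in 2 hours, one in 4 hours.
records = [
    (datetime(2025, 1, 6, 9, 0), datetime(2025, 1, 6, 11, 0)),
    (datetime(2025, 1, 7, 9, 0), datetime(2025, 1, 7, 13, 0)),
]
mttr = mean_time_to_resolve(records)
```

A lower value over time suggests that vulnerabilities are being closed faster.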

BMC HelixGPT generates these recommendations by evaluating information received from the scanning systems configured in BMC Helix Automation Console.

With the remediation steps, a code wizard provides sample scripts that can be used for performing the recommended step in Ansible or .

HelixGPT_remediation_script_251.png

By leveraging the capabilities of BMC HelixGPT, operators can improve operational efficiency, derive insights from all connected sources, and reduce manual errors by implementing automation to resolve vulnerabilities faster.

If BMC HelixGPT is not enabled, you cannot automatically generate remediation content for the vulnerability. To enable BMC HelixGPT, contact BMC Support.

Change Risk Advisory

As a change manager, you must understand how a change to your infrastructure affects the business before you implement it. BMC Helix AIOps connects with BMC HelixGPT to retrieve change request information from BMC Helix ITSM and display it in the context of the services. When infrastructure-based changes are implemented in an organization, their direct impact on service health and performance is not easily identifiable. Change risk assessment is often a manual and time-consuming activity, and inadequate risk assessment can lead to service outages and further disruption to the business.

BMC Helix AIOps connects with BMC Helix ITSM to display change requests created in BMC Helix ITSM over a predefined period. When you investigate a change request, its status, AI-generated risk level, severity, and impacted services are displayed in addition to the request details. The number of open situations, the situations that occurred in the past due to similar change requests, and the status indicate the current health of the service.

By connecting with BMC HelixGPT, BMC Helix AIOps generates insights from historical change requests, which help you make informed decisions before you implement changes.

Change Risk Advisor_251.png

 
