Configuring incident correlation to detect similar incident clusters


After BMC Helix ITSM Insights is activated, Service Desk managers can use the Real-time incident correlation workspace to detect similar incident clusters and view emerging hotspots.

The system uses a set of default fields and settings for the Real-time incident correlation workspace. As a tenant administrator, you can change the incident correlation configuration based on your requirements. 

If your admin has updated your permissions in the hierarchical groups in BMC Helix ITSM, you must update the Real-time incident correlation configuration settings to view the clusters relevant to you.

If you have set up custom priority values, you must update the Real-time incident correlation configuration settings (except Similarity threshold) to view the updated custom priority details in the Real-time incident correlation dashboard. The algorithm takes at least six hours to display the newly added custom priority values in the Real-time incident correlation workspace.

Warning

Updating the similarity threshold value or the stop word file triggers the deletion of existing clusters.

Best practice for configuring changes

When you change the incident correlation configuration, all existing clusters are removed from the dashboard and the system performs the analysis again. This action might impact the analysis being carried out by any Service Desk managers or agents who are using the Real-time incident correlation dashboard for analysis.

We recommend making incident correlation configuration changes during off-hours to minimize the impact.

Out-of-the-box configuration for incident correlation

BMC Helix ITSM Insights uses a set of default fields and settings to display the clusters in the Real-time incident correlation dashboard.

The following table describes this out-of-the-box configuration for incident correlation:

Fields

Default value

Default fields used by the system for incident correlation

  • Assignee
  • Assignee - Company (Assigned Support Company)
  • Assignee Support group (Assigned Group)
  • CI (HPD_CI)
  • Calculated priority (Priority)
  • City
  • Closed Date
  • Communication coordinator - Company (SV_ComCoord_SupportCompany)
  • Communication coordinator - Support group (SV_ComCoordSGP)
  • Company
  • Customer site (Site)
  • HPD_CI_ReconID
  • Impact
  • Incident Number
  • Incident type (Service Type)
  • InstanceId
  • Last Resolved Date
  • Major incident manager - Company (SV_MIM_Company)
  • Major incident manager - Support group (SV_MIM_SGP)
  • Operational category 1 (Categorization Tier 1)
  • Operational category 2 (Categorization Tier 2)
  • Operational category 3 (Categorization Tier 3)
  • Product Name
  • Product category 1 (Product Categorization Tier 1)
  • Product category 2 (Product Categorization Tier 2)
  • Product category 3 (Product Categorization Tier 3)
  • Region
  • Reported Date
  • Service (Service CI)
  • ServiceCI_ReconID
  • Site group (Site Group)
  • Status
  • Status_Reason_Hidden (Status_Reason)
  • Submit Date
  • Submitter
  • Summary (Description)
  • Total Time Spent
  • Urgency

The maximum number of days a cluster can stay open

7 days

Similarity threshold

7

Minimum number of incidents that a cluster should have to be visible in the dashboard

5

Warning

A high volume of data in the Description (Detailed Description) field might cause performance issues while generating clusters in ITSM Insights.

Starting with version 23.3.00, the Description (Detailed Description) field is no longer a mandatory field. If you are already using this field to generate clusters, you can exclude it from the dataset manually.

Best practice
We recommend excluding the Description (Detailed Description) field from the configuration to improve the performance and turnaround time of generating clusters.

To update the configuration

  1. In BMC Helix ITSM Insights, click the Settings icon.
    The Settings page is displayed.
  2. Select Real-time incident correlation > Configure.
    The Real-time incident correlation configuration page is displayed.
  3. In the Data set section, you can view the data fields being used by the system for the configuration. The fields that you select here appear as filter criteria in the Real-time incident correlation dashboard filter.

    Tip

    The fields that appear in BMC Helix ITSM display their field labels, system names (in brackets), and often an additional description (in English only) in the data set. Therefore, when you choose among similar fields in the data set for creating clusters, we recommend that you select the field that displays its label, system name, and description. For example, when choosing between CI and CI(HPD_CI), we recommend that you select CI(HPD_CI) because it displays the CI label, the HPD_CI system name, and its description.

  4. In the Create clusters section, specify the parameters based on which incident data is grouped.
    See Create clusters for more details.
  5. In the Advanced cluster settings (Machine Learning) section, specify the details for generating clusters.
    See Advanced ML for more details.
  6. Upload the stop word file.
    See Stop words for more details.
  7. (Optional) Remove personally identifiable information from incident details during clustering.
    See Personally identifiable information for more details.
  8. In the Trending and major incident configuration section, specify the criteria for detecting major incidents in the clusters.
    See Major incident configuration for more details.
  9. In the Notification & email section, enable the notification feature, and specify the recipient and criteria to receive notifications.
    See Notification for more details.
  10. Click Save.

To configure the cluster groups

  1. For the first level of grouping, select up to two fields to group the incidents at the top level for clustering. Only categorization fields, such as service, CI, and company, are available for selection.

    Best practice
    We recommend grouping by Assignee - Company (Assigned Support Company), Assignee Support group (Assigned Group) and Company to create clusters with all incidents related to your company and assigned group. Availability of all incidents of your assigned group and company helps you manage their relationships effectively.

  2. Select up to five additional field names for matching incidents to be grouped into a cluster. Only text fields are available for selection. 

To configure advanced machine learning

  1. Specify the maximum period that a cluster would stay open from the time an incident is last updated.
    This window can range from hours to days. The default value is 7, which means that clusters that are more than seven days old are automatically deleted. You can set this value to a maximum of 30 days.
  2. Specify the similarity threshold in the slider.
    Similarity threshold determines how similar the incident descriptions are in relation to the description of the original incident, which is the first incident of a cluster. The similarity threshold can be a value between 1 and 10; the default value is 7. The higher the value you select, the more stringent the similarity test, and therefore the clusters formed are more cohesive and smaller.

    View example of similarity threshold

    Similarity threshold value

    Observation

    image2022-12-19_12-55-55.png

    A lower similarity threshold value performs a lenient test to match the similarity of incidents for clustering.
    image2022-12-19_12-58-39.png

    image2022-12-19_12-56-56.png

    A higher similarity threshold value performs a stringent test to match the similarity of incidents for clustering.
    image2022-12-19_12-59-54.png

    In most cases, it is observed that the number of incidents in the cluster decreases as you select a higher value of similarity threshold.

    Best practice
    We recommend keeping the similarity threshold at its default value of 7 to generate optimal results.

  3. Specify the minimum number of incidents that a cluster should have, to be shown in the dashboard.
    The algorithm checks this condition before generating clusters. After a cluster is created, it remains open until the last incident within it is closed.

    image-2024-9-11_12-54-23.png
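
How the similarity threshold and the minimum cluster size interact can be illustrated with a toy sketch. It is not the algorithm that BMC Helix ITSM Insights uses. The token-overlap (Jaccard) score, the mapping of the 1 to 10 slider onto a 0.1 to 1.0 cutoff, and the function names are assumptions made purely for illustration. The sketch only mirrors the behavior described above, where each incident is compared with the first incident of a cluster.

```python
def jaccard(a: str, b: str) -> float:
    """Token-overlap score between two summaries (a stand-in similarity measure)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def leader_clusters(summaries, threshold, min_incidents):
    """Group each summary with the first cluster whose original incident is similar enough."""
    cutoff = threshold / 10  # assumed mapping of the 1-10 slider onto 0.1-1.0
    clusters = []  # each cluster is a list; element 0 is the original incident
    for summary in summaries:
        for cluster in clusters:
            if jaccard(summary, cluster[0]) >= cutoff:
                cluster.append(summary)
                break
        else:
            # No existing cluster matched, so this incident starts a new one.
            clusters.append([summary])
    # Only clusters that reach the minimum size would be shown in the dashboard.
    return [c for c in clusters if len(c) >= min_incidents]


summaries = [
    "VPN connection timeout error",
    "VPN connection timeout error on laptop",
    "VPN timeout",
    "Printer out of toner",
]

lenient = leader_clusters(summaries, threshold=5, min_incidents=2)
strict = leader_clusters(summaries, threshold=6, min_incidents=2)
print(len(lenient[0]), len(strict[0]))
```

With these sample summaries, a threshold of 5 groups three VPN incidents into one cluster, while a threshold of 6 keeps only two, which matches the observation that higher values produce smaller, more cohesive clusters.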

To configure stop words

You can use a regular expression to define stop word patterns, such as a combination of words and sentences, which the algorithm can either remove or extract based on your preference while clustering. 

In version 23.3.04 and later, you must upload stop word files in YAML (YAML Ain't Markup Language) format; you can no longer upload them in TXT format. However, existing stop word files in TXT format from previous release versions are still supported.

You can download the sample .YAML stop word file and include the following details in it:

  • List of stop words
  • Prefix and postfix notations by using wildcards
  • Patterns of stop words by using regular expressions based on your use case.

The following template examples show how you can define stop word patterns by using regular expressions in a YAML file.

Tip

If incidents contain template-based details, we recommend using a template-based stop word file that includes regular expressions for removing or extracting stop words, as shown in the examples.
However, for incidents that contain simple stop words without any template, such as other, then, and if, you can define the words in the stop_words section of the YAML stop word file for extraction or removal.

Example 1: Using regular expression to remove words and sentences from getting clustered

While generating clusters in the Real-time incident correlation dashboard, you can define patterns using regular expressions to remove words and sentences from incident details.
This example shows how you can remove words and sentences from the template-based incident.

Template-based Incident details

Reported by: John Smith
Address: 123 Main Street, New York, NY 10001
Email: joe@example.com
Phone: (555) 123-4567
Date of Birth: 07/15/1988
Social Security Number: 123-45-6789
Problem Summary: User unable to connect to the corporate VPN using IP address 192.168.1.101. The VPN access page https://vpn.example.com shows a timeout error after entering credentials.

Template-based Stop word file in YAML

The following stop word file is used to remove the irrelevant details from the incident details while generating clusters:

# Regex section contains regular expressions used for matching patterns in text.
# These can be used for tasks like text extraction and removal.
regex:
  removal:
    # Match email addresses
    - '\b[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}\b'
    # Match URLs
    - '\b((https?|ftp):\/\/[^\s\/$.?#].[^\s]*)\b'
    # Match phone numbers (US format)
    - '\b(?:\+1)?\s?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b'
    # Match phone number (Indian format)
    - '\b(?:91[-.\s]?)?\d{5}[-.\s]?\d{5}\b'
    # Match IP addresses
    - '\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b'
    # Match dates of birth (MM/DD/YYYY)
    - '\b(0[1-9]|1[0-2])\/(0[1-9]|[12][0-9]|3[01])\/\d{4}\b'
    # Match Social Security numbers (US format)
    - '\b\d{3}-\d{2}-\d{4}\b'

# Wildcards section contains patterns with wildcard characters for flexible matching.
# These are used in scenarios where exact matching is not required, allowing for variability.
wildcards:
  # Match any word that ends with .com
  - '%.com'
# Note: All values in each section must be in between single quotes 
# (unless any of the stop words, regex, or wildcards contain single quotes; enclose such strings within double quotes)

Output

The email addresses, URLs, phone numbers (US and Indian format), IP addresses, date of birth, and Social Security number details are removed from incident details before generating clusters.
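
To try the removal patterns outside the product, the following minimal Python sketch applies them to the sample incident. It assumes that each pattern is applied as a plain substitution with an empty string and that Python's `re` module behaves like the regular expression engine used by the product; neither is confirmed here, and the `remove_patterns` helper is hypothetical.

```python
import re

# Removal patterns copied from the stop word file above.
REMOVAL_PATTERNS = [
    r'\b[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}\b',  # email addresses
    r'\b((https?|ftp):\/\/[^\s\/$.?#].[^\s]*)\b',            # URLs
    r'\b(?:\+1)?\s?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b',   # phone numbers (US)
    r'\b(?:91[-.\s]?)?\d{5}[-.\s]?\d{5}\b',                  # phone numbers (India)
    r'\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}'
    r'(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b',             # IP addresses
    r'\b(0[1-9]|1[0-2])\/(0[1-9]|[12][0-9]|3[01])\/\d{4}\b', # dates (MM/DD/YYYY)
    r'\b\d{3}-\d{2}-\d{4}\b',                                # Social Security numbers
]

INCIDENT = """Reported by: John Smith
Address: 123 Main Street, New York, NY 10001
Email: joe@example.com
Phone: (555) 123-4567
Date of Birth: 07/15/1988
Social Security Number: 123-45-6789
Problem Summary: User unable to connect to the corporate VPN using IP address 192.168.1.101. The VPN access page https://vpn.example.com shows a timeout error after entering credentials.
"""


def remove_patterns(text: str, patterns) -> str:
    """Apply each pattern in order and delete whatever it matches (assumed behavior)."""
    for pattern in patterns:
        text = re.sub(pattern, '', text)
    return text


cleaned = remove_patterns(INCIDENT, REMOVAL_PATTERNS)
print(cleaned)
```

Under these assumptions, the US phone pattern leaves the opening parenthesis of (555) behind, because `\b` cannot match between a space and a parenthesis. Checking patterns against sample text in this way can reveal such edge cases before you upload the file.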

Example 2: Using regular expression to extract specific word and sentences pattern for clustering

While generating clusters in the Real-time incident correlation dashboard, you can define patterns using regular expressions to extract certain words and sentences from incident details.
This example shows how you can extract the details after "My Requests" in the DESCRIPTION section of the template-based incident.

 

Template-based incident details

Customer Info:
ID: m756871
Name: ABC (ABC-DEMO.COM)
Email: ABC@xyz.com
Business: CACA
Business Group: CACA Asia Pacific
Enterprise: Agricultural Supply Chain
Phone: 011-2689 6767
Manager: QWERRTY
Region: Asia Pacific
Country: 
City: 

Form Name: TCE - Requests

DESCRIPTION
Create a personalized description to help you locate this ticket in “My Requests”:: Matrix Execution required by 12/6

REQUEST INFORMATION
What do you need help with today?: I need to make a request related to Master Data

Stop word file in YAML

# Stopwords section contains common words that should be excluded from text processing.
# These words are typically considered insignificant for the purpose of analysis.
stop_words:
  - 'Pacific'
  - 'Impact'

# Regex section contains regular expressions used for matching patterns in text.
# These can be used for tasks like validation, searching, or text extraction.
regex:
  removal:
  extraction:
    - '(?<=::).*$'

# Wildcards section contains patterns with wildcard characters for flexible matching.
# These are used in scenarios where exact matching is not required, allowing for variability.
wildcards:
  # Match any word that starts with Parameter
  - 'Parameter%'
  # Match any word that has prod in between
  - '%prod%'
  # Match any word that ends with .com
  - '%.com'

Output

The Matrix Execution required by 12/6 value from the incident is used for clustering.
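
The extraction pattern can be checked in the same way. This minimal Python sketch assumes that the pattern is evaluated line by line (emulated here with `re.MULTILINE`) and that surrounding whitespace is trimmed from the extracted value; both are assumptions rather than documented product behavior.

```python
import re

# Extraction pattern copied from the stop word file above.
EXTRACTION_PATTERN = r'(?<=::).*$'

INCIDENT = """DESCRIPTION
Create a personalized description to help you locate this ticket in “My Requests”:: Matrix Execution required by 12/6

REQUEST INFORMATION
What do you need help with today?: I need to make a request related to Master Data
"""

# The lookbehind (?<=::) anchors the match right after the double colon,
# and .*$ captures the rest of that line.
match = re.search(EXTRACTION_PATTERN, INCIDENT, flags=re.MULTILINE)
extracted = match.group(0).strip() if match else ""
print(extracted)
```

Only the line that contains a double colon matches; the single colon in the REQUEST INFORMATION line does not satisfy the lookbehind.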

  1. Download the sample .YAML stop word file for reference.
    Review the sample stop word file to understand the specified format and create the stop word file for uploading.
  2. Define stop words and their patterns using regular expressions in a .YAML file, and validate the file by using any YAML validator.

    Important

    While creating a stop word file, you must adhere to the YAML format shown in the examples. See Stop word file examples for creating stop word patterns by using regular expressions.
    You can validate your regular expressions at https://regex101.com/ and your YAML file at https://yamlchecker.com/.

  3. Upload the .YAML file that contains your stop words for the recurrent job.
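
As a complement to the online validators, the regular expressions can also be checked locally before the upload. The following minimal sketch assumes that the stop word file has already been parsed into a Python dictionary with the same sections as the examples, and that Python's `re` syntax is close enough to the engine used by the product. The `find_invalid_patterns` helper is hypothetical.

```python
import re


def find_invalid_patterns(stop_word_config: dict) -> list:
    """Return (section, pattern, error) tuples for patterns that do not compile."""
    invalid = []
    regex_section = stop_word_config.get("regex") or {}
    for action in ("removal", "extraction"):
        # An empty YAML key is parsed as None, so fall back to an empty list.
        for pattern in regex_section.get(action) or []:
            try:
                re.compile(pattern)
            except re.error as exc:
                invalid.append((action, pattern, str(exc)))
    return invalid


# Dictionary that mirrors the structure of the YAML examples.
# The second extraction pattern is deliberately broken for illustration.
config = {
    "stop_words": ["Pacific", "Impact"],
    "regex": {
        "removal": None,
        "extraction": [r"(?<=::).*$", r"(unclosed"],
    },
    "wildcards": ["%.com"],
}

problems = find_invalid_patterns(config)
for action, pattern, error in problems:
    print(f"{action}: {pattern!r} is invalid ({error})")
```

A pattern that compiles is not necessarily correct, so testing it against sample incident text is still worthwhile.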

Important

  • Every time you upload a new stop word file, it overrides the old file and removes the existing clusters. The last updated YAML file is used for creating clusters.
  • While generating cluster labels in the dashboard from the relevant incidents, the algorithm compares the incident description words with the stop word library. Therefore, the cluster labels do not contain words mentioned in the stop word library.

View the use of % in stop words

The following table describes the usage of % in stop words:

Incident summary

Stop word

Description

ITSMInsights is running low on memory

ITSM%

Removes the stop word ITSM and the characters following it. 
In this case, ITSMInsights is removed from the resulting cluster label.

ITSMInsights is running low on memory

%Ins%

Removes the stop word Ins and the characters preceding and following it.
In this case, ITSMInsights is removed from the resulting cluster label.

ITSMInsights is running low on memory

%Insights

Removes the stop word Insights and the characters preceding it.
In this case, ITSMInsights is removed from the resulting cluster label.
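
The three rows above can be reproduced with a short Python sketch. It assumes that % stands for zero or more non-space characters, that a wildcard always removes the whole word it matches, and that matching is case-sensitive. These semantics are inferred from the table, and the helper names are hypothetical.

```python
import re


def wildcard_to_regex(stop_word: str) -> re.Pattern:
    """Translate a %-style wildcard into a regex that matches whole words."""
    # Escape the literal parts; every % becomes "zero or more non-space characters".
    parts = [re.escape(part) for part in stop_word.split('%')]
    # The lookarounds make sure the match covers a complete whitespace-delimited word.
    return re.compile(r'(?<!\S)' + r'\S*'.join(parts) + r'(?!\S)')


def remove_stop_words(summary: str, stop_words) -> str:
    """Remove every word that matches one of the wildcard stop words."""
    for stop_word in stop_words:
        summary = wildcard_to_regex(stop_word).sub('', summary)
    # Collapse the whitespace left behind by the removed words.
    return ' '.join(summary.split())


summary = "ITSMInsights is running low on memory"
for stop_word in ("ITSM%", "%Ins%", "%Insights"):
    print(stop_word, "->", remove_stop_words(summary, [stop_word]))
```

All three wildcards remove ITSMInsights and leave "is running low on memory", which matches the Description column.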

To configure personally identifiable information

You can enable the Remove Personally Identifiable Information (PII) toggle to exclude personally identifiable information in the incident details from clustering.

The following types of personally identifiable information (PII) are removed from incidents:

  • Name
  • Phone number
  • Email 
  • City name
  • Credit card details
  • IP address
  • Address
  • US Passport number
  • Social security number
  • US driver license number
     

Important

  • The algorithm works best at removing PII in English. It might fail to detect and remove PII in other languages because they are not supported.
  • When you enable the Remove Personally Identifiable Information (PII) toggle, the existing clusters are removed, and new clusters are generated.

To configure trend and major incident settings

Enter the following details to configure the trend and major incidents in clusters:

Configuration setting 

Description

Measure trend over last hour(s)

Specify the number of hours for which the trend must be calculated. By default, the trend is calculated for the last two hours.

Flag clusters for possible major incidents when 

  • # of incidents in cluster reaches 
  • # of incidents in trend window increases by

The application flags clusters as possible major incident candidate clusters in the following cases:

  • The number of incoming incidents in the cluster exceeds the specified value. By default, when the number of incoming incidents exceeds 50, the cluster is marked as a possible major incident cluster.
    OR
  • The number of new incidents in the last trend window exceeds the specified value. By default, the application flags a cluster as a possible major incident cluster when the number of incidents in the trend window increases by 25.
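
The flagging rule is a simple OR of the two conditions, as the following minimal sketch shows. The function and parameter names are hypothetical, and the strict comparison follows the word "exceeds" in the description; because the setting labels say "reaches" and "increases by", the exact boundary behavior is an assumption.

```python
def is_possible_major_incident(total_incidents: int,
                               trend_window_increase: int,
                               total_limit: int = 50,
                               trend_limit: int = 25) -> bool:
    """Flag a cluster when either configured limit is exceeded (defaults: 50 and 25)."""
    return total_incidents > total_limit or trend_window_increase > trend_limit


# A large cluster is flagged even if it is no longer growing.
print(is_possible_major_incident(total_incidents=51, trend_window_increase=0))
# A small cluster is flagged if it grows quickly within the trend window.
print(is_possible_major_incident(total_incidents=30, trend_window_increase=26))
# A cluster that exceeds neither limit is not flagged.
print(is_possible_major_incident(total_incidents=30, trend_window_increase=5))
```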


To configure notification and email settings

Early notification helps major incident managers assess the impact of the incidents on the overall business even when they are not actively monitoring the dashboard. You can set up notifications for emerging, potential major incidents in Real-time incident correlation clusters. Based on your business requirements, you can add other users (all incident assignees, the major incident manager of the incident cluster, or any other user) as recipients of the notification.

  1. On the Real-time incident correlation configuration page, in the Notification & email section, turn on the Enable notification for possible major incidents toggle.
  2. Perform the following actions based on your requirement:

    Field

    Description

    Notify affected incidents assignees

    Select this field for notifying all unique assignees of the affected incidents present in the cluster.
    For example, if a cluster contains 100 incidents, the algorithm finds the unique assignees of the 100 incidents and sends them a notification.

    Notify major incident managers of Affected support companies

    Select this field for notifying the major incident managers of the affected support group companies.
    Every incident in the cluster is associated with a company (and a contact company, if applicable) that is mapped to multiple support groups. Every support group has multiple major incident managers. The algorithm finds the major incident managers for all incidents present in the cluster based on the support groups of the incidents' company. The algorithm then sends the notification to those major incident managers.

    Notify major incident managers of Affected incident support groups

    Select this field for notifying the potential major incident managers of the support group associated with each incident present in the cluster.

    Add recipient

    Enter the user name (and support group, if applicable) of the recipients who should receive the notification.

    Important

    You must select or enter a value in at least one of the 3 fields before you can save your changes.

  3. Click Save.


Recipients can select their preferred locale and mode of receiving notification on the CTM:People form. For more details, see Configuring notifications for people records.

 

 
