How emerging issues are identified in BMC Helix ITSM Insights
Algorithms used in incident correlation
As incidents are created or updated in BMC Helix ITSM Insights, they are compared with existing incidents. Similarity is based on user-configurable grouping fields, text fields, and a sliding time window.
BMC Helix ITSM Insights uses the following machine learning algorithms and techniques to automatically identify clusters of incoming incidents:
DistilBERT language model
To find contextual relationships in sentences, BMC Helix ITSM Insights uses the pretrained DistilBERT model, a lighter version of BERT. BERT is a machine learning technique for natural language processing pre-training developed by Google. The model can be fine-tuned for a specific task (embeddings, classification, entity recognition, question answering, or similarity) with a specific data set to produce accurate predictions. It provides semantic similarity instead of the syntax and keyword-based approach of the previous generation of NLP and search technologies. For example, given two incidents with summary text fields such as “My VPN does not work” and “Cannot connect to VPN”, DistilBERT detects that they are quite similar, even though only one word, “VPN”, matches between the two sentences syntactically.
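To see why a purely syntactic comparison falls short on the VPN example above, the following illustration (not product code) scores the two summaries by word overlap alone. The `jaccard_overlap` helper is a common textbook measure introduced here for illustration; a semantic model such as DistilBERT would instead score these two sentences as highly similar.

```python
def jaccard_overlap(text_a: str, text_b: str) -> float:
    """Fraction of shared words between two texts (syntactic similarity)."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    return len(words_a & words_b) / len(words_a | words_b)

# Only "vpn" is shared out of eight distinct words: 1/8 = 0.125.
score = jaccard_overlap("My VPN does not work", "Cannot connect to VPN")
print(score)  # 0.125
```

A keyword-based approach would therefore treat these incidents as mostly unrelated, which is exactly the gap that embedding-based semantic similarity closes.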
Cosine similarity algorithm
Cosine similarity is a metric used to measure how similar two vectors are. The similarity score ranges from 0 to 1, with 0 being the lowest (least similar) and 1 being the highest (most similar). Based on the similarity score, incidents are clustered together.
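The metric itself is the cosine of the angle between two embedding vectors. A minimal sketch, using toy vectors rather than real DistilBERT embeddings:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Vectors pointing the same way score ~1.0 (most similar);
# orthogonal vectors score 0.0 (least similar).
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))  # ~1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

In practice the vectors compared would be the sentence embeddings produced by DistilBERT for the incident text fields.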
Real-time iterative grouping algorithm
As incidents flow in, the real-time iterative grouping algorithm continuously updates the clusters of incidents and stores the results in Elasticsearch. A caption is generated for each incident cluster. When a cluster is stale and has no further updates within a configured time period, or when all incidents in the cluster are resolved, the cluster is closed. Clusters are also closed automatically after 30 days. For each cluster, a graph shows the growth rate of incidents over time, grouped by priority.
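The grouping loop can be sketched as follows. This is an assumption-laden illustration, not the product algorithm: the threshold value, the `Cluster` structure, and the similarity callback are all hypothetical, and Elasticsearch persistence and caption generation are omitted.

```python
# Hedged sketch of real-time iterative grouping (not product code).
# An incoming incident joins the most similar open cluster when the
# score meets a threshold; otherwise it starts a new cluster.
from dataclasses import dataclass, field

SIMILARITY_THRESHOLD = 0.8     # assumed value; configurable in practice
STALE_AFTER = 30 * 24 * 3600   # clusters close after 30 days of no updates

@dataclass
class Cluster:
    incident_texts: list = field(default_factory=list)
    last_update: float = 0.0
    is_open: bool = True

def assign_incident(clusters, text, similarity, now):
    """Add an incident to the best-matching open cluster, or open a new one."""
    best, best_score = None, 0.0
    for cluster in clusters:
        if not cluster.is_open:
            continue
        score = max(similarity(text, t) for t in cluster.incident_texts)
        if score > best_score:
            best, best_score = cluster, score
    if best is not None and best_score >= SIMILARITY_THRESHOLD:
        best.incident_texts.append(text)
        best.last_update = now
        return best
    new_cluster = Cluster([text], now)
    clusters.append(new_cluster)
    return new_cluster

def close_stale(clusters, now):
    """Close clusters that have received no updates within the window."""
    for cluster in clusters:
        if cluster.is_open and now - cluster.last_update > STALE_AFTER:
            cluster.is_open = False
```

The `similarity` parameter stands in for the DistilBERT-plus-cosine-similarity scoring described above.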
How the cluster name is derived
The following techniques are applied to extract and pre-process incident text data to derive the cluster name:
| Technique | Description |
|---|---|
| Tokenization | Phrases or sentences are split into smaller units, such as individual words or terms. Each of these smaller units is called a token. |
| Stemming | Words are reduced to their root forms by removing affixes. Example: for words such as failing, fails, failed, and failure, the root form is fail. |
| Lemmatization | Inflectional and related forms of a word are reduced to a common base form. |
The algorithm then finds the most important and most frequent words in the text of the incidents in each cluster and derives a three-part name for each cluster.
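The pipeline above can be sketched end to end. This is a toy illustration, not the product implementation: the stopword list and the suffix-stripping stemmer are simplified stand-ins (a real system would use a proper stemmer such as Porter's), and "most important" is approximated here by raw frequency.

```python
# Hedged sketch: tokenize, drop stopwords, stem, then take the three
# most frequent stems as the three-part cluster name.
from collections import Counter

STOPWORDS = {"the", "a", "an", "is", "to", "my", "not", "does", "cannot"}
SUFFIXES = ("ing", "ure", "ed", "s")  # toy stemmer, not Porter stemming

def stem(word: str) -> str:
    """Strip a known suffix to approximate the root form (e.g. failing -> fail)."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def cluster_name(incident_texts, parts=3):
    tokens = [
        stem(word)
        for text in incident_texts
        for word in text.lower().split()   # tokenization
        if word not in STOPWORDS
    ]
    top = Counter(tokens).most_common(parts)
    return " ".join(token for token, _ in top)

print(cluster_name([
    "VPN connection failing",
    "VPN keeps failing",
    "Cannot connect to VPN",
]))
```

For this toy cluster, "vpn" and "fail" dominate the counts, so they lead the derived three-part name.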