Configuring one-time job settings for proactive problem management

As a problem coordinator, you can configure one-time clustering jobs for proactive problem management and run these jobs. You can configure a maximum of three jobs. When you configure a one-time job, the algorithm runs once to cluster the incidents based on the selected criteria.
By default, a maximum of 200,000 incidents are considered when a one-time Proactive problem management job is created. However, you can increase the limit through customization.

Important

In version 23.3.01 and later, Major Incident is added as a required system field in the data set. After upgrading to 23.3.01, you must rerun your existing jobs to effect the change and generate clusters with major incident details.

To configure a one-time job

Click Manage Job.
On the Proactive problem management settings page, click configure one-time job.
In the General section, enter the job name and select the language for the incident text processing.
Even if your incident text contains multiple languages, preprocessing is based on the selected language.
In the Data Set section, you can specify the data fields on which you can create clusters and filter data. Select the fields on which you want to create clusters.

See Data set filters for more details.
In the Data range section, specify the date range that the job should use to search incident data.
See Data range for more details.
In the Create clusters section, specify the parameters by which the data is grouped.
See Create clusters for more details.
In the Advanced machine learning section, specify the number of clusters displayed in the dashboard.
See Advanced machine learning for more details.
Upload a stop word file.
See Stop words for more details.
(Optional) Remove personally identifiable information from incident details during clustering.
See Personally identifiable information for more details.
In the Resolution insights section, enter the parameters of Resolution insights.
See Resolution insights for more details.
Click Run now.

A job might take several minutes to complete, depending on the incident data to be processed. Refresh the jobs table to check if the job is completed.

When the job run is completed, the Jobs table displays the job status. If the job run was not successful, the Job message column displays the reason for the failure. After the job runs successfully, you can select it from the Jobs list in the dashboard to view the clusters.

Important

Depending on the number of records or incidents and the kind of incident data, the number of clusters on the dashboard could be less than the number you have specified.
Running a job multiple times with the same parameters may generate a different number of clusters. You may notice a slight difference in the number of clusters generated under the following conditions:
- When you use Group by fields and let the system generate the ideal number of clusters.
- When you use Group by fields and provide a cluster value.
The algorithm groups incidents that have similar words in their description and avoids generating one-word cluster labels in the dashboard. Therefore, if an incident has a one-word description, the algorithm finds an existing cluster that has incidents with that word in its description sentence and adds the incident to that cluster.

To configure Data set filters

To view the required system fields, select the Show fields required by the system check box. Except for Submitter, system fields cannot be removed. Your administrator might hide some fields from the data set to comply with privacy regulations.

Tip

The fields that appear in BMC Helix ITSM display their field labels, system names (in brackets), and often an additional description (in English only) in the data set. Therefore, when you choose among similar fields in the data set for creating clusters, we recommend that you select the field that displays its label, system name, and description. For example, when choosing between CI and CI(HPD_CI), we recommend that you select CI(HPD_CI) because it displays the CI label, the HPD_CI system name, and its description.

You can specify filters on the fields to further refine your data set.

Click the required filter category and select one of the following options:
Option	Description
Equals to	Select this option to include values in the filter.
Not equals to	Select this option to exclude values from the filter.
Search and select the field value that you want to include or exclude.
Click Apply filters.
Note
- Searching for a string without a wildcard (%) is not supported in a filter that has a text field. We recommend using a wildcard (%) for a search in such filters.
- The Equals to and Not equals to options appear only in fields with a character menu, such as Service CI.

To configure data range filter

In the Data range date field, enter the date range within which you want to search for incidents.

Best practice
Define your date range based on the problem management process and review cycles for problem identification in your organization. Typically, the date range is the previous 1-4 weeks of data.

To configure the cluster groups

For the first level of grouping, select up to three fields in Group by (max 3) to group the incidents at the top level for clustering.
You can select only the categorization fields that are selected in the Data set section, such as Status and Priority.

For matching incidents to be grouped into a cluster, select up to five additional field names in Inputs for machine learning.
You can select only the text fields that are selected in the Data set section. If no field is selected in Group by (max 3), it is mandatory to select at least one field in Inputs for machine learning for clustering.
Summary is the default text field used for clustering. If more than one text field is provided, these fields are concatenated into one field.
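
As a rough illustration of the two selections described above, the following Python sketch buckets incidents by the selected Group by fields and concatenates the selected text fields into one string per incident. The field names (status, priority, summary, notes) and the prepare_for_clustering helper are hypothetical; the sketch mirrors only the documented limits and the concatenation step, not the product's clustering algorithm.

```python
from collections import defaultdict

MAX_GROUP_BY = 3      # limit stated for Group by (max 3)
MAX_TEXT_INPUTS = 5   # limit stated for Inputs for machine learning


def prepare_for_clustering(incidents, group_by, text_fields=("summary",)):
    """Bucket incidents by categorical fields and join their text fields.

    Hypothetical helper for illustration only.
    """
    if len(group_by) > MAX_GROUP_BY:
        raise ValueError("Select at most three Group by fields")
    if len(text_fields) > MAX_TEXT_INPUTS:
        raise ValueError("Select at most five text fields")
    if not group_by and not text_fields:
        raise ValueError("Select at least one text field when Group by is empty")

    buckets = defaultdict(list)
    for incident in incidents:
        # First level: one bucket per combination of categorical values.
        key = tuple(incident.get(field, "") for field in group_by)
        # Text inputs: several text fields are concatenated into one field.
        text = " ".join(incident.get(field, "") for field in text_fields).strip()
        buckets[key].append(text)
    return dict(buckets)


incidents = [
    {"status": "Assigned", "priority": "High", "summary": "VPN timeout", "notes": "after login"},
    {"status": "Assigned", "priority": "High", "summary": "VPN down", "notes": ""},
    {"status": "Closed", "priority": "Low", "summary": "Printer jam", "notes": "tray 2"},
]
groups = prepare_for_clustering(incidents, ["status", "priority"], ["summary", "notes"])
```

In this sketch, each bucket holds the concatenated text that a text-clustering step would then work on.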

View how existing Group by configurations appear after the 23.3.03 update

Group by configuration in existing job	Group by configuration after 23.3.03 update
Only Machine learning is selected in Group by (level 1).	The Machine learning option is removed from Group by (level 1). The Inputs for machine learning field appears by default.
A categorical field is selected in level 1; a text field is selected as input for machine learning in level 2.	Group by (level 2) is removed. The Inputs for machine learning field appears by default.
A categorical field is selected in level 1; a categorical field is selected in level 2.	Fields in Group by (level 2) appear in Group by (max 3). You can select up to 3 fields in Group by (max 3). Summary (Description) is selected by default for machine learning.

Important

If you do not select any text field in Inputs for machine learning, the algorithm groups clusters based on categorical fields.
If you select text fields in Inputs for machine learning, the algorithm uses machine learning to group clusters based on text fields.

To configure advanced machine learning

To allow the system to set the number of clusters, click the Let the system set the number of clusters check box.

Best practice
When selecting the number of clusters, we recommend selecting the Let the system find no. of clusters check box rather than setting the number of clusters yourself. The system automatically selects an optimal number of clusters. However, when you know the number of clusters from prior execution runs or domain knowledge, you can specify a value to improve the response time. We recommend setting 20-30 clusters for optimal incident monitoring.

To configure stop words

You can use regular expressions to define stop word patterns, such as a combination of words and sentences, which the algorithm can either remove or extract based on your preference while clustering.

In version 23.3.04 and later, new jobs support stop word files only in YAML (YAML Ain't Markup Language) format. However, jobs created in earlier versions still support stop word files in TXT format.
You can download the sample .YAML stop word file and include the following details in it:

List of stop words
Prefix and postfix notations by using wildcards
Patterns of stop words by using regular expressions based on your use case.

The following examples show how you can define stop word patterns using regular expressions in a YAML file:

Tip

If incidents contain template-based details, we recommend using a template-based stop word file that includes regular expressions for removing or extracting stop words, as shown in the examples.
However, for incidents that contain simple stop words without any template, such as other, then, and if, you can define the words in the stop_words section of the YAML stop word file for extraction or removal.

Example 1: Using regular expressions to remove words and sentences from clustering

While generating clusters in the Proactive problem management dashboard, you can define patterns by using regular expressions to remove words and sentences from incident details.
This example displays how you can remove words and sentences from the template-based incident.

Template-based Incident details

Reported by: John Smith
Address: 123 Main Street, New York, NY 10001
Email: joe@example.com
Phone: (555) 123-4567
Date of Birth: 07/15/1988
Social Security Number: 123-45-6789
Problem Summary: User unable to connect to the corporate VPN using IP address 192.168.1.101. The VPN access page https://vpn.example.com shows a timeout error after entering credentials.

Template-based Stop word file in YAML

The following stop word file is used to remove the irrelevant details from the incident details while generating clusters:

# Regex section contains regular expressions used for matching patterns in text.
# These can be used for tasks like text extraction and removal.
regex:
  removal:
    # Match email addresses
    - '\b[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}\b'
    # Match URLs
    - '\b((https?|ftp):\/\/[^\s\/$.?#].[^\s]*)\b'
    # Match phone numbers (US format)
    - '\b(?:\+1)?\s?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b'
    # Match phone number (Indian format)
    - '\b(?:91[-.\s]?)?\d{5}[-.\s]?\d{5}\b'
    # Match IP addresses
    - '\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b'
    # Match dates of birth (MM/DD/YYYY)
    - '\b(0[1-9]|1[0-2])\/(0[1-9]|[12][0-9]|3[01])\/\d{4}\b'
    # Match Social Security numbers (US format)
    - '\b\d{3}-\d{2}-\d{4}\b'

# Wildcards section contains patterns with wildcard characters for flexible matching.
# These are used in scenarios where exact matching is not required, allowing for variability.
wildcards:
  # Match any word that ends with .com
  - '%.com'
# Note: All values in each section must be in between single quotes
# (If any of the stop words, regex, or wildcards contain single quotes; enclose such strings within double quotes)

Output

The email addresses, URLs, phone numbers (US and Indian formats), IP addresses, dates of birth, and Social Security numbers are removed from the incident details before clusters are generated.
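
To try removal patterns like these before uploading a file, the following Python sketch applies them to part of the sample incident. It is only an approximation: it assumes Python's re module behaves like the product's regular expression engine, which this page does not state. The US phone pattern is written with \(? and \)? for the optional parentheses, and the remove_patterns helper is hypothetical.

```python
import re

# Patterns mirroring the removal section of the sample stop word file.
REMOVAL_PATTERNS = [
    r"\b[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}\b",      # email addresses
    r"\b((https?|ftp):\/\/[^\s\/$.?#].[^\s]*)\b",               # URLs
    r"\b(?:\+1)?\s?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b",      # phone numbers (US)
    r"\b(?:91[-.\s]?)?\d{5}[-.\s]?\d{5}\b",                     # phone numbers (India)
    r"\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}"
    r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b",                # IP addresses
    r"\b(0[1-9]|1[0-2])\/(0[1-9]|[12][0-9]|3[01])\/\d{4}\b",    # dates (MM/DD/YYYY)
    r"\b\d{3}-\d{2}-\d{4}\b",                                   # Social Security numbers
]


def remove_patterns(text, patterns=REMOVAL_PATTERNS):
    """Delete every match of every pattern from the text (hypothetical helper)."""
    for pattern in patterns:
        text = re.sub(pattern, "", text)
    return text


incident = (
    "Email: joe@example.com\n"
    "Phone: (555) 123-4567\n"
    "Date of Birth: 07/15/1988\n"
    "Social Security Number: 123-45-6789\n"
    "Problem Summary: User unable to connect to the corporate VPN using IP address "
    "192.168.1.101. The VPN access page https://vpn.example.com shows a timeout error "
    "after entering credentials."
)
cleaned = remove_patterns(incident)
print(cleaned)
```

Printing the cleaned text makes it easy to spot fragments that a pattern leaves behind, which is a prompt to adjust the pattern before uploading the file.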

Example 2: Using regular expressions to extract specific word and sentence patterns for clustering

While generating clusters in the Proactive problem management dashboard, you can define patterns by using regular expressions to extract certain words and sentences from incident details.
This example shows how you can extract the text that follows the double colon (::) after "My Requests" in the DESCRIPTION section of the template-based incident.

Template-based incident details

Customer Info:
ID: m756871
Name: ABC (ABC-DEMO.COM)
Email: ABC@xyz.com
Business: CACA
Business Group: CACA Asia Pacific
Enterprise: Agricultural Supply Chain
Phone: 011-2689 6767
Manager: QWERRTY
Region: Asia Pacific
Country:
City:

Form Name: TCE - Requests

DESCRIPTION
Create a personalized description to help you locate this ticket in “My Requests”:: Matrix Execution required by 12/6

REQUEST INFORMATION
What do you need help with today?: I need to make a request related to Master Data

Stop word file in YAML

# Stopwords section contains common words that should be excluded from text processing.
# These words are typically considered insignificant for the purpose of analysis.
stop_words:
  - 'Pacific'
  - 'Impact'

# Regex section contains regular expressions used for matching patterns in text.
# These can be used for tasks like validation, searching, or text extraction.
regex:
  removal:
  extraction:
    - '(?<=::).*$'

# Wildcards section contains patterns with wildcard characters for flexible matching.
# These are used in scenarios where exact matching is not required, allowing for variability.
wildcards:
  # Match any word that starts with Parameter
  - 'Parameter%'
  # Match any word that has prod in between
  - '%prod%'
  # Match any word that ends with .com
  - '%.com'

Output

The Matrix Execution required by 12/6 value from the incident is used for clustering.
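
The extraction pattern can be checked in the same way. The following Python sketch applies the pattern from the sample file to the DESCRIPTION and REQUEST INFORMATION lines of the incident. Two assumptions go beyond this page: the pattern is evaluated per line (re.MULTILINE here), and surrounding whitespace is trimmed from the extracted text. The extract_for_clustering helper is hypothetical.

```python
import re

# Extraction pattern from the sample stop word file: everything after "::".
EXTRACTION_PATTERN = r"(?<=::).*$"

incident = (
    "DESCRIPTION\n"
    "Create a personalized description to help you locate this ticket in "
    "\u201cMy Requests\u201d:: Matrix Execution required by 12/6\n"
    "\n"
    "REQUEST INFORMATION\n"
    "What do you need help with today?: I need to make a request related to Master Data"
)


def extract_for_clustering(text, pattern=EXTRACTION_PATTERN):
    """Return the non-empty, trimmed matches of the pattern (hypothetical helper)."""
    # Assumption: $ should match at the end of each line, hence re.MULTILINE.
    matches = re.findall(pattern, text, flags=re.MULTILINE)
    return [match.strip() for match in matches if match.strip()]


extracted = extract_for_clustering(incident)
print(extracted)
```

The line under REQUEST INFORMATION contains a single colon only, so the lookbehind for :: does not match it and that line is left out.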

Download the sample .YAML stop word file for reference.
Review the sample stop word file to understand the specified format and create the stop word file for uploading.
Define stop words and their patterns using regular expressions in a .YAML file, and validate the file using any YAML validator.
Important
While creating a stop word file, you must adhere to the YAML format mentioned in the examples. See Stop word file examples for creating stop word patterns using regular expressions.
You can validate your regular expressions and YAML file in https://regex101.com/ and https://yamlchecker.com/ respectively.
Upload the .YAML file that contains your stop words for the one-time job.
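
As a complement to the online validators, a quick local check can catch regular expressions that do not compile. The following Python sketch takes the stop word file content as an already parsed dictionary (for example, loaded with a YAML parser of your choice) and reports invalid patterns. The find_invalid_patterns helper is hypothetical, and a pattern that compiles with Python's re module is not guaranteed to behave the same way in the product.

```python
import re


def find_invalid_patterns(stop_word_config):
    """Return (section, pattern, error) for each regex that fails to compile."""
    problems = []
    regex_section = stop_word_config.get("regex") or {}
    for section in ("removal", "extraction"):
        # An empty section such as "removal:" parses to None, so fall back to [].
        for pattern in regex_section.get(section) or []:
            try:
                re.compile(pattern)
            except re.error as exc:
                problems.append((section, pattern, str(exc)))
    return problems


# Dictionary mirroring the structure of the sample files; "(unclosed" is a
# deliberately broken pattern added for this illustration.
config = {
    "stop_words": ["Pacific", "Impact"],
    "regex": {
        "removal": [r"\b\d{3}-\d{2}-\d{4}\b", r"(unclosed"],
        "extraction": [r"(?<=::).*$"],
    },
    "wildcards": ["%.com"],
}
problems = find_invalid_patterns(config)
for section, pattern, error in problems:
    print(f"{section}: {pattern!r} is invalid: {error}")
```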

Important

You can continue using the stop word file that was uploaded in .TXT format in the previous releases. However, in version 23.3.04 and later, you must use a YAML file to define stop words and their patterns.
Every time you upload a new stop word file, it overrides the old file. The last updated YAML file is used for creating clusters.
While generating cluster labels in the dashboard from the relevant incidents, the algorithm compares the incident description words with the stop word library. Therefore, the cluster labels do not contain words mentioned in the stop word library.

Proactive problem management has a built-in library of stop words for every supported language. The algorithm refers to the library and your preferred stop words while processing incident information.

View default stop words of English language

from
subject
re
edu
use
need
able
bmc
com
abc
however
ourselves
alone
us
would
already
most
off
back
and
none
because
where
first
it
nevertheless
too
each
whereupon
wherein
as
this
whatever
always
serious
then
‘ve
not
he
them
n't
even
thru
anyway
above
eight
'm
or
besides
hereby
than
during
being
never
therein
has
does
hereupon
whereafter
is
becomes
ca
get
seemed
nowhere
nobody
rather
whenever
yourselves
few
may
elsewhere
but
my
again
will
're
more
once
'm
otherwise
anything
various
had
together
within
via
are
fifty
afterwards
mine
how
many
thereupon
should
himself
everything
against
sixty
perhaps
although
's
along
except
his
whether
anywhere
must
one
their
s'
for
no
someone
upon
meanwhile
might
here
namely
indeed
under
n‘t
almost
least
forty
we
everyone
toward
before
if
show
about
please
another
through
've
to
unless
side
also
move
any
can
just
two
throughout
three
could
me
across
whence
else
five
amongst
mostly
wherever
four
hence
front
anyone
herself
whereby
somehow
whose
ever
go
was
themselves
since
when
noone
's
with
same
did
n’t
per
yourself
am
become
beside
well
why
they
'll
regarding
ours
her
give
our
made
‘re
thereby
an
much
full
ten
take
're
out
such
therefore
over
still
seem
'd
former
latter
next
everywhere
‘d
quite
becoming
amount
sometime
twenty
doing
last
name
’ll
whom
these
every
latterly
make
seeming
among
part
that
twelve
either
of
myself
say
hundred
thereafter
those
at
you
down
several
herein
been
beyond
nine
him
towards
what
onto
do
both
sometimes
thence
moreover
the
due
beforehand
empty
fifteen
thus
anyhow
have
whereas
eleven
done
yours
she
your
whoever
who
only
own
somewhere
formerly
between
others
neither
below
up
its
hers
be
whither
were
into
yet
became
‘m
less
until
all
see
used
seems
now
enough
often
call
without
on
a
something
further
put
though
after
while
bottom
six
nothing
hereafter
behind
which
very
so
top
using
whole
i
there
really
in
’d
nor
third
cannot
around
itself
by
keep
other
’ve
‘ll
some

View the use of % in stop words

The following table describes the usage of % in stop words:

Incident summary	Stop word	Description
ITSMInsights is running low on memory	ITSM%	Removes the stop word ITSM and the characters following it. In this case, ITSMInsights is removed from the resulting cluster label.
ITSMInsights is running low on memory	%Ins%	Removes the stop word Ins and the characters preceding and following it. In this case, ITSMInsights is removed from the resulting cluster label.
ITSMInsights is running low on memory	%Insights	Removes the stop word Insights and the characters preceding it. In this case, ITSMInsights is removed from the resulting cluster label.
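
The behavior in the table can be approximated with a short Python sketch that translates % into a regular expression. It assumes that % stands for zero or more non-whitespace characters and that matching is case-sensitive; neither detail is stated on this page, and the helper names are hypothetical.

```python
import re


def wildcard_to_regex(stop_word):
    """Translate a stop word with % wildcards into a whole-word regex.

    Assumption: % stands for zero or more non-whitespace characters.
    """
    parts = (re.escape(part) for part in stop_word.split("%"))
    # (?<!\S) and (?!\S) anchor the match to whole whitespace-delimited words.
    return re.compile(r"(?<!\S)" + r"\S*".join(parts) + r"(?!\S)")


def remove_stop_word(summary, stop_word):
    """Remove every word that matches the wildcard stop word."""
    cleaned = wildcard_to_regex(stop_word).sub("", summary)
    return " ".join(cleaned.split())  # collapse leftover whitespace


summary = "ITSMInsights is running low on memory"
results = {word: remove_stop_word(summary, word) for word in ("ITSM%", "%Ins%", "%Insights")}
for word, cleaned in results.items():
    print(f"{word!r}: {cleaned}")
```

Under these assumptions, all three stop words remove ITSMInsights from the summary, which matches the rows in the table.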

To configure Personally identifiable information

Enable the Remove Personally Identifiable Information (PII) toggle to remove personally identifiable information from the incident details before they are clustered.

The following types of PII are removed from incidents:

Name
Phone number
Email
City name
Credit card details
IP address
Address
US Passport number
Social security number
US driver license number

Important

The algorithm works best for PII in English. Because other languages are not supported, it might fail to detect and remove PII in other languages.
When you enable the Remove Personally Identifiable Information (PII) toggle, the existing clusters are removed, and new clusters are generated.

To configure Resolution insights

Click the Enable toggle key, and select the following parameters to derive accurate resolution notes for incidents.
Select the source field name of the incident in ITSM from which the algorithm derives the resolution note.
By default, Resolution note (Resolution) is selected.
Tip
If you want to select a custom textual field as the source field, you must first add it to the data set.
Select the minimum number of incidents required in a cluster for the algorithm to generate the resolution insight details.
By default, at least five incidents must be present in a cluster to generate the resolution insights details.
Select the maximum number of resolution insights clusters to be displayed in the drill-down view.
A maximum of 25 resolution insights clusters can be displayed in the drill-down view.

Important
When you run a job with resolution insights enabled, the job runs in two iterations. The first iteration of the job run generates Proactive problem management clusters in the dashboard and the second iteration generates the resolution insights clusters. We recommend you wait until the second job run iteration is complete before viewing the Grouped by resolution insights tab.

To edit a one-time job

To edit a one-time job, click the edit icon in the Actions column, and make the necessary changes to the job.

The changes take effect in the next job run.

To delete a one-time job

To delete a one-time job, click the delete icon in the Actions column.

What happens when a job is deleted?

When you delete a job, the job definitions, that is, the data fields and filters applied are also deleted. Also, all job runs associated with that job are deleted.

BMC Helix ITSM Insights 23.3
