Working with full-text search

As an application business analyst, you can enable full-text search (FTS) to improve search results in applications such as BMC Helix ITSM, BMC Helix Business Workflows, or BMC Helix Digital Workplace. FTS indexes content before any search takes place, so when a user performs a search, only the index is searched instead of the entire text.

The following table describes the process that occurs when you perform a full-text search for knowledge articles:

Description	System Component
The search engine performs the search by using the keywords from the summary of the published articles. Search component workflows are triggered; they build a search qualification and send a query to the application full-text search engine.	Application
The full-text search engine is invoked to perform the search.	BMC Helix Innovation Studio full-text search engine
The search engine performs the search by using an index that is generated when knowledge articles are indexed. The search engine uses its own relevance algorithm to order the search results.	BMC Helix Innovation Studio full-text search engine
The application finishes processing the search results. It also enforces row-level security by removing records that should not be visible to the current user.	BMC Helix Innovation Studio post processing
Search results are displayed based on the relevance score or weight returned by the full-text search.	Application
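
The last two rows of the table can be illustrated with a minimal Python sketch. It is not the BMC Helix Innovation Studio implementation; the Hit class, the group-based permission model, and the post_process function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    article_id: str
    score: float               # relevance score returned by the search engine
    allowed_groups: frozenset  # hypothetical permission model: groups that may see the record


def post_process(hits, user_groups):
    """Drop hits the current user may not see, then order by descending relevance."""
    visible = [h for h in hits if h.allowed_groups & user_groups]
    return sorted(visible, key=lambda h: h.score, reverse=True)


hits = [
    Hit("KA-1", 1.7, frozenset({"support"})),
    Hit("KA-2", 3.1, frozenset({"hr"})),
    Hit("KA-3", 2.4, frozenset({"support", "hr"})),
]

results = post_process(hits, frozenset({"support"}))
print([h.article_id for h in results])
```

In this sketch, KA-2 has the highest score but is removed because the user does not belong to a group that may see it. The remaining hits are then ordered by score.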

An index of knowledge articles is created and updated as information is created or modified, and searches run against this index.

Relevance scoring in the search engine

This section describes the features related to relevance scoring in the search engine.

Indexing

When you add or update any information, the search engine updates the index. This re-indexing involves analysis of the information.

ISKM treats a record definition or an entry as a document, and a field of the record that is marked for indexing as a field within that document. For example, a case record definition is a document, and Description and Notes are fields in the document. Fields marked for 'MFS only' or 'FTS and MFS' indexing become the fields that the search engine indexes within the document. However, attachments linked to a record definition are not available as an out-of-the-box data source. You must create a data connection if you plan to use text from attachments.

The search engine builds a field-level index.

When indexing a field, the search engine performs the following tasks:

Extracts keywords and calculates the number of occurrences per field and per record definition. This is called Term Frequency.
Uses root words. This is called Stemming.
Provides a dictionary for similar words or synonyms.
Provides a list of words which must be ignored. These words are called Stop Words.
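
The indexing tasks above can be sketched in Python as follows. This is a simplified illustration, not the engine's actual analyzer: the stop word list, the synonym dictionary, and the crude suffix-stripping stem function are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict

STOP_WORDS = {"the", "a", "an", "is", "to", "of", "and"}  # illustrative list only
SYNONYMS = {"laptop": "notebook"}                          # illustrative dictionary only


def stem(word):
    """Crude suffix stripping that stands in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def analyze(text):
    """Tokenize, drop stop words, map synonyms to one term, and stem."""
    terms = []
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        if token in STOP_WORDS:
            continue
        terms.append(stem(SYNONYMS.get(token, token)))
    return terms


def index_document(index, doc_id, fields):
    """Store a per-field term frequency table for one document."""
    for field_name, text in fields.items():
        index[field_name][doc_id] = Counter(analyze(text))


index = defaultdict(dict)  # field name -> document id -> term frequencies
index_document(index, "case-1", {
    "Description": "The printer is printing blank pages after printing a test page",
    "Notes": "Restarted the laptop",
})
print(index["Description"]["case-1"])
print(index["Notes"]["case-1"])
```

The index is keyed by field first, which mirrors the field-level index described earlier: each field of a document keeps its own term frequencies.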

Searching

Based on the search words provided by a user, the search engine uses the existing index to find matching or similar documents. It evaluates how relevant each document is to the search terms and assigns a score to each document. The score is determined by the three factors listed in the following table:

Factor	Description
How often do the search terms appear in the document	The more frequently the term is found, the higher the score. For example, a field containing five mentions of the same term is more likely to be relevant than a field containing just one mention.
How often do the search terms appear across the documents in the collection	The more frequently the term is found across the collection, the lower the score. Common terms such as go and find contribute less to relevance than uncommon terms such as MongoDB or Outlook.
How long is the field in which the search terms appear	The shorter the field, the higher the score. If a term appears in a shorter field such as Title or Keyword, it is more likely to describe the whole document than if it appears in a body field.

If multiple fields are indexed in the same document, the field-level scores are aggregated into a single score for each document.
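
The three factors and the aggregation across fields can be sketched in Python as follows. The formula is a generic TF-IDF style score with length normalization, chosen as an assumption for illustration; the source does not state the exact formula the engine uses. The function names and the sample document frequencies are hypothetical.

```python
import math


def field_score(query_terms, field_terms, doc_freq, total_docs):
    """Score one field: term frequency times rarity, damped by field length."""
    if not field_terms:
        return 0.0
    length_norm = 1.0 / math.sqrt(len(field_terms))  # shorter field -> higher score
    score = 0.0
    for term in query_terms:
        tf = field_terms.count(term)                 # occurrences in this field
        if tf == 0:
            continue
        # Terms found in many documents get a lower weight.
        idf = 1.0 + math.log(total_docs / (1 + doc_freq.get(term, 0)))
        score += math.sqrt(tf) * idf * length_norm
    return score


def document_score(query_terms, fields, doc_freq, total_docs):
    """Aggregate the field-level scores into one score per document."""
    return sum(
        field_score(query_terms, terms, doc_freq, total_docs)
        for terms in fields.values()
    )


# Hypothetical collection statistics: number of documents containing each term.
doc_freq = {"outlook": 2, "find": 80}
total_docs = 100

document = {
    "Title": ["outlook", "crash"],
    "Body": ["outlook", "crash", "when", "i", "find", "mail"],
}
print(document_score(["outlook"], document, doc_freq, total_docs))
print(document_score(["find"], document, doc_freq, total_docs))
```

With these sample values, a search for the rare term outlook scores higher than a search for the common term find, and the match in the short Title field contributes more than the match in the longer Body field.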

Other factors also contribute to the score, such as term proximity for phrase queries and term similarity for fuzzy queries. However, ISKM does not use phrase queries by default and does not support fuzzy queries.

Summarizing the search engine relevance

The following considerations determine the relevancy of results and, consequently, the order of the documents in the result list:

Documents that contain all or most of the search words appear at the top of the search result list.
Matches on rare words score higher than matches on common words.
Short documents score higher than long documents for the same match.
Documents that mention the search terms multiple times score higher.