Phased rollout

 

This version of the software is currently available only to early adopter SaaS customers as the first step in our phased rollout.

Installing and configuring the cognitive search data crawler for BMC Helix ITSM: Knowledge Management articles

To include knowledge articles from BMC Helix ITSM: Knowledge Management in cognitive search, you must install and configure the BMC crawler utility. The BMC crawler utility is an improved crawler that crawls only BMC Helix ITSM: Knowledge Management articles.

After crawling, the knowledge articles in BMC Helix ITSM: Knowledge Management are uploaded to the IBM Watson Discovery collection. For more details about the data flow, see Leveraging cognitive search in your application.

Tip

BMC Helix Innovation Studio cognitive search supports BMC Helix Business Workflows knowledge articles and does not require a crawler to include knowledge articles in cognitive search.

Important

  • If you have previously crawled the BMC Helix ITSM: Knowledge Management articles by using the IBM Watson Discovery Data Crawler, you must perform one of the following tasks:
    • Create a new collection and re-crawl by using the BMC crawler.
    • Configure the BMC crawler utility so that the utility deletes the older collection, creates a new collection, and re-crawls the knowledge articles.

After re-crawling, use the updated display templates. The older display templates do not work with the BMC crawler.

Before you begin

Ensure that you have completed the following tasks:

Product | Task
System configuration
  • Ensure that you have a Linux, UNIX, or Windows machine on which you can run the BMC crawler utility.
  • Ensure that Java 11 or later is installed on that machine.

IBM Watson Discovery

  • Have the following details of the IBM Watson Discovery instance to which you want to upload the BMC Helix ITSM: Knowledge Management articles:
    • Identity and Access Management (IAM) API key
    • Endpoint URL

You can get the IAM API key and the endpoint URL by logging in to IBM Cloud.



BMC Helix ITSM: Knowledge Management


  • Have the Action Request System host name and port. (You do not need to run the utility on the AR System server.)
  • Have the Remedy administrator credentials.
  • To ensure that end users can access the articles in BMC Helix ITSM: Knowledge Management, set the visibility conditions for the knowledge articles. For information about configuring the visibility for knowledge articles, see Managing knowledge article visibility.
  • To ensure that the articles are indexed, register the file system path. For more information, see Registering file system paths.

    The files in the registered path are mapped to the RKM:VF_FileSystem_Manageable_Join vendor form in BMC Helix ITSM: Knowledge Management.
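
The Java prerequisite listed above can be checked from a shell before you download the utility. The following is a minimal sketch, assuming a POSIX-style shell with java on the PATH. The function names java_major_version and check_java_for_crawler are illustrative and are not part of the BMC crawler utility.

```shell
# Illustrative helper names; not part of the BMC crawler utility.

# Print the major version for a Java version string.
# Handles the legacy "1.x" scheme (Java 8 and earlier) and the newer scheme.
java_major_version() {
  local v="$1"
  v="${v#1.}"                      # "1.8.0_292" becomes "8.0_292"
  printf '%s\n' "${v%%[!0-9]*}"    # keep only the leading digits
}

# Return 0 if the java found on the PATH is version 11 or later.
check_java_for_crawler() {
  command -v java >/dev/null 2>&1 || { echo "java not found on PATH" >&2; return 1; }
  local raw major
  # "java -version" writes to stderr; the version is the quoted string after "version".
  raw="$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)"
  major="$(java_major_version "$raw")"
  [ -n "$major" ] && [ "$major" -ge 11 ]
}
```

For example, running check_java_for_crawler && echo "Java is new enough" prints the message only when the installed Java meets the requirement.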

Process for installing and configuring the BMC crawler utility

The following image illustrates the end-to-end process to install and configure the BMC crawler utility:

Task 1: To set up your system and download the BMC crawler utility

  1. As an administrator, download the BMC crawler utility file.
    1. Save the crawler utility on your local machine.
    2. Extract the .zip file of the utility.

      The extracted folder includes the following files:

      • bmccrawler.properties
      • bmcCrawler-versionNumber.jar
  2. Set the Java environment variables on your system so that you can run the BMC crawler utility.

    1. Open the command prompt.
    2. Perform one of the following steps:

      • For Linux, run the following commands:

        Commands to set environment variables
        export JAVA_HOME=/opt/jdk
        export PATH=/opt/jdk/jre/bin:$PATH
        export PATH=$JAVA_HOME/bin:$PATH
      • For Windows, set the JAVA_HOME variable.

    1. Navigate to the folder where you downloaded the BMC crawler utility. 
    2. Open the extracted folder and then open the bmccrawler.properties file. 
    3. Specify the following parameter values:

      Mandatory parameters
      Parameter | Description | Example value
      discoveryApiKey

      Specify the API key of the IBM Watson Discovery instance.

      zABCd4EFgHijKLmn1O1pqrStuu0vWXYzaBCDEfGHIjk_

      discoveryEndPoint

      Specify the endpoint URL of the IBM Watson Discovery instance.

      https://gateway-/location.watsonplatform.net/discovery/api

      remedyHostName

      Specify the host name of the Action Request System.

      abc-tenantname-1234

      remedyPort

      Specify the port number of the Action Request System.

      46262

      remedyUser

      Specify the AR System administrator user name.

      Administrator

      remedyPassword

      • If you are running the BMC crawler utility to encrypt the password, specify the password in plain text.
      • If you are running the BMC crawler utility after encrypting the password, specify the encrypted password.

      Example values:
      • Plain text password: password123
      • Encrypted password: 1234567891106a28b29c843d98e5fg1hi59j0k12345678953592258d3a981312b1

      formName

      Specify the BMC Helix ITSM: Knowledge Management form name.

      Important: You can specify only one form at a time. Comma-separated form names are not valid.

      RKM:HowToTemplate_Manageable_Join
      fieldNamesForParagraphs

      Specify the field names whose values are appended to the Text field of the IBM Watson Discovery collection.

      During a cognitive search, matching paragraphs are derived from the Text field.

      Important:

      • For file system articles, specify only one attachment field. If you specify multiple attachment fields, only the first one is considered. 
      • For database tables, do not specify attachment fields.
      Article_Keywords,ArticleTitle
      fieldNamesForDetails

      Specify the field names that should be added as metadata fields in the IBM Watson Discovery collection.

      During a cognitive search, the metadata fields are used to display the knowledge article details.

      Important:

      • For file system articles, specify only one attachment field. If you specify multiple attachment fields, only the first one is considered. 
      • For database tables, do not specify attachment fields.
      DocID,ArticleTitle
      discoveryConfigurationName
      • If you do not want to specify the value for discoveryConfigurationFilePath, specify the name of the IBM Watson Discovery configuration.
      • If you specify the value for discoveryConfigurationFilePath, the configuration name is taken from the JSON file.
      HowTo-Config
      discoveryCollectionName

      Specify the name of the IBM Watson Discovery collection.

      HowTo-Collection

      customerId

      Specify the customer ID that you want to use for labeling data for General Data Protection Regulation (GDPR).

      BMC

      Optional parameters with default values
      chunkSize

      Specify the batch size for querying entries from the BMC Helix ITSM: Knowledge Management form.


      Default value: 1000
      threadPoolSize

      Specify the number of knowledge articles that will be crawled simultaneously.


      Default value—10

      Maximum valid value—50

      deleteCollectionBeforeCrawl

      Specify whether you want to delete the collection before you start crawling.

      Default value: no

      Valid values: yes, no

      qualification

      Specify a valid AR System qualification to include BMC Helix ITSM: Knowledge Management articles.

      Default value—'1=1'
      language

      Specify the language of the IBM Watson Discovery collection.

      Valid values: en, es, de, fr, it, ja, ko, pt, nl, zh-CN

      Default value: en
      discoveryEnvironmentSize

      Specify the IBM Watson Discovery plan.

      Default value: LT (for the Lite plan)

      Valid values: LT, S (S for all other plans)
      fieldNameForGettingModifiedRecord

      Specify the field name that represents the modified date of a form in BMC Helix ITSM: Knowledge Management.

      Default value—Modified Date
      Optional parameters without default values
      discoveryConfigurationFilePath

      (After you download and modify the out-of-the-box files) Specify the path of the JSON file that you modified.

      C:\\Path\\Demo_Discovery_Collection_Configuration_HowTo_.json

      titleFieldName

      Specify the field name that will be considered as the title for the knowledge article in the search results.

      Title

      documentUniqueID

      Specify the document ID to be used in IBM Watson Discovery.

      Important:

      • If the parameter value is blank, the BMC Helix ITSM: Knowledge Management article request ID is set as the unique ID in IBM Watson Discovery. If there are multiple versions of the knowledge article, all the versions of the article might be displayed to the end user.
      • If you want to use the BMC Helix ITSM: Knowledge Management knowledge base ID as the unique ID in IBM Watson Discovery, you must set the value to DocID. If there are multiple versions of the knowledge article, only the latest version of the article is displayed to the end user.
      docID
  3. Restrict the bmcCrawler.properties file so that only authorized administrators can access and modify it.

    Best Practice

    We recommend that you grant access to the bmcCrawler.properties file only to the user who owns it. Do not provide access to any other users.

    • To restrict the file on Windows, perform one of the following steps:
      • Restrict file permissions by using Windows Explorer.
      • Open the command prompt and apply Discretionary Access Control Lists (DACLs) to the file.
    • To restrict the file on Linux, open the command prompt and run the chmod 700 command as specified in The chmod command description.
      For example, use one of the following commands:
        • chmod 700 bmcCrawler.properties
        • chmod u+rwx,go-rwx bmcCrawler.properties
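
Before you run the crawler, it can help to confirm that every mandatory parameter from the table above has a value. The following is a minimal sketch, assuming that the properties file contains one name=value pair per line. The function name missing_crawler_params is illustrative and is not part of the BMC crawler utility.

```shell
# Print each mandatory parameter that is missing or empty in the given file.
# Assumes one "name=value" pair per line (plain Java properties syntax).
# The parameter names are the ones listed under "Mandatory parameters".
missing_crawler_params() {
  local file="$1" name
  for name in discoveryApiKey discoveryEndPoint remedyHostName remedyPort \
              remedyUser remedyPassword formName fieldNamesForParagraphs \
              fieldNamesForDetails discoveryConfigurationName \
              discoveryCollectionName customerId; do
    # A parameter counts as set when "name=" is followed by a non-space character.
    grep -Eq "^[[:space:]]*${name}[[:space:]]*=[[:space:]]*[^[:space:]]" "$file" \
      || printf '%s\n' "$name"
  done
}
```

If missing_crawler_params bmcCrawler.properties prints nothing, every mandatory parameter has a value. The check does not validate the values themselves.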

Task 2: To download the out-of-the-box files

BMC provides the following files out-of-the-box:

  • Stop words file: Text file with a list of words that you can filter out from the data collection.
  • Discovery Collection configuration file: JSON file that defines the format in which crawled knowledge articles are uploaded to the IBM Watson Discovery collection.

After downloading, you can modify the stop words list or the enrichments of the knowledge articles. 

  1. Download the sample stop words file. 
  2. Download the Discovery Collection configuration file.
  3. (Optional) If required, modify the stop words list. 
  4. Modify the enrichments section in the Discovery Collection configuration file according to the BMC Helix ITSM: Knowledge Management form that you want to crawl. 
  5. Save the modified file with a different name. 
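
If you modify the stop words list, a small filter can keep the file tidy. The following is a minimal sketch, assuming that the stop words file contains one word per line and that lowercasing the words is acceptable for your collection. The function name normalize_stopwords and the file names in the usage note are hypothetical.

```shell
# Normalize a stop words list read from standard input:
# lowercase each line, trim surrounding whitespace, drop blank lines, remove duplicates.
# Assumes one stop word per line.
normalize_stopwords() {
  tr '[:upper:]' '[:lower:]' \
    | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d' \
    | sort -u
}
```

For example, normalize_stopwords < stopwords.txt > stopwords-custom.txt writes the cleaned list to a new file and leaves the downloaded sample unchanged.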

Task 3: To encrypt the password

  1. Run the BMC crawler for the first time to encrypt the Remedy password. The password is encrypted by using the AES cipher in GCM mode with a 256-bit key.

    1. Open command prompt. 
    2. Run the following command:

      Example of command to encrypt the password
      java -jar -DbmcCrawlerPropertyFile=.\bmcCrawler.properties bmcCrawler-20.2.0.jar -encpassword

      The encrypted password is displayed in the command prompt. 

    3. Note the encrypted password for future reference. 

  2. After the password is encrypted, the key.txt file is generated in the directory from which you ran the BMC crawler utility. On Microsoft Windows, you must restrict this file to administrators only. On Linux, the file is restricted automatically, so you do not need to restrict it manually.

    1. Navigate to the location of the key.txt file.
    2. Perform one of the following steps:
      • Restrict file permissions by using Windows Explorer.
      • Open the command prompt and apply Discretionary Access Control Lists (DACLs) to the file.
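
On Linux or UNIX, you can verify that key.txt and bmcCrawler.properties are accessible only to their owner. The following is a minimal sketch that inspects the mode string printed by ls. The function name is_owner_only is illustrative and is not part of the BMC crawler utility.

```shell
# Return 0 if group and others have no permissions on the given file.
is_owner_only() {
  # Characters 5-10 of the "ls -l" mode string hold the group and other bits,
  # for example "-rw-------" for a file that only the owner can read and write.
  [ "$(ls -ld -- "$1" | cut -c5-10)" = "------" ]
}
```

For example, is_owner_only key.txt || echo "key.txt is accessible to other users" prints a warning only when the file is readable, writable, or executable by the group or by others.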

Task 4: To run the crawler for a second time and forcefully pause the crawler for stop words

Perform this task if you want to upload and activate the stop words.

Best practice

We recommend that you wait until the stop words file status becomes active, which takes several minutes. To check the status, perform the steps in Task 6: To check the status of the stop words file.

  1. Open command prompt. 
  2. Run the following command:

    Example of command to pause the crawler for stop words
    java -jar -DbmcCrawlerPropertyFile=.\bmcCrawler.properties bmcCrawler-20.2.0.jar -stopToCreateStopwordList

    The BMC crawler displays a message that the crawler has stopped. You must then go to the IBM Watson Discovery UI to upload and activate the stop words file.

Task 5: To upload the stop words file

  1. Log in to IBM Watson Discovery.
  2. Navigate to the data collection to which you want to upload the BMC Helix ITSM: Knowledge Management articles. 
  3. Click the Search settings tab. 
  4. In the Stopwords section, click Upload, as shown in the following image:

For more information about uploading stop words, see Defining stop words in IBM documentation.

Task 6: To check the status of the stop words file

After you upload the stop words file, it takes several minutes for the stop words status to become active. To check the status of the file, perform the following steps:

  1. In IBM Watson Discovery, open the data collection that was created, that is, the data collection to which you want to upload the articles.
  2. Copy the Collection Id and the Environment Id, as shown in the following image:
  3. Open command prompt. 
  4. Run the following command:

    Example of command to check the stop words file status
    curl -u "apikey:{apikey}" -X GET "https://gateway.watsonplatform.net/discovery/api/v1/environments/{environment_id}/collections/{collection_id}/word_lists/stopwords?version=2019-04-30"

    If the file is in pending status, as shown in the following example, wait for the status to change to active:

    {"status":"pending","type":"stopwords"}

    After the file is active, the status is displayed as shown in the following example:

    {"status":"active","type":"stopwords"}
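
Instead of repeating the status request by hand, you can poll until the status changes. The following is a minimal sketch that reuses the request shown above and assumes that the response has the same shape as the two examples. The function names and the API_KEY, ENVIRONMENT_ID, and COLLECTION_ID variables are hypothetical and must be set by you; the host in the URL might differ for your IBM Watson Discovery instance.

```shell
# Extract the "status" value from a response read on standard input,
# for example {"status":"pending","type":"stopwords"}.
stopwords_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Poll the stop words status every 30 seconds, for at most 20 attempts.
# API_KEY, ENVIRONMENT_ID, and COLLECTION_ID are hypothetical variable names.
wait_for_stopwords() {
  local url status attempt=0
  url="https://gateway.watsonplatform.net/discovery/api/v1/environments/${ENVIRONMENT_ID}/collections/${COLLECTION_ID}/word_lists/stopwords?version=2019-04-30"
  while [ "$attempt" -lt 20 ]; do
    status="$(curl -s -u "apikey:${API_KEY}" -X GET "$url" | stopwords_status)"
    echo "stop words status: ${status:-unknown}"
    [ "$status" = "active" ] && return 0
    attempt=$((attempt + 1))
    sleep 30
  done
  return 1   # the status did not become active within the attempts allowed
}
```

The attempt limit and the 30-second interval are arbitrary choices for this sketch; adjust them to how long activation takes in your environment.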

Task 7: To run the crawler

Run the BMC crawler for a third time to crawl the BMC Helix ITSM: Knowledge Management articles and upload them to the IBM Watson Discovery collection.

  1. Open the command prompt. 
  2. Run the following command:

    Example command to run BMC crawler
    java -jar -DbmcCrawlerPropertyFile=.\bmcCrawler.properties bmcCrawler-20.2.0.jar

    A log file is generated every time you run the BMC crawler and is saved in the directory from which you ran the crawler.
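
The three crawler runs in this topic differ only in the trailing option. The following is a minimal sketch of a helper that prints the command for each run, using only the options shown in the examples above. The function name crawler_command and its mode names are illustrative. The default path uses ./ as on Linux or UNIX, whereas the examples above use the Windows form .\.

```shell
# Print the java command for one of the three runs described in this topic.
# Usage: crawler_command MODE [PROPERTIES_FILE] [JAR_FILE]
# MODE is encrypt, stopwords, or crawl (illustrative names).
crawler_command() {
  local mode="$1"
  local props="${2:-./bmcCrawler.properties}"
  local jar="${3:-bmcCrawler-20.2.0.jar}"
  local flag
  case "$mode" in
    encrypt)   flag=" -encpassword" ;;               # first run: encrypt the password
    stopwords) flag=" -stopToCreateStopwordList" ;;  # second run: pause for stop words
    crawl)     flag="" ;;                            # third run: crawl the articles
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  printf 'java -jar -DbmcCrawlerPropertyFile=%s %s%s\n' "$props" "$jar" "$flag"
}
```

For example, crawler_command encrypt prints the command for the first run so that you can review it before you execute it. Replace the JAR file name with the version that you downloaded.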

Tip

After crawling articles from BMC Helix ITSM: Knowledge Management, you can train and test the search result relevancy by using IBM Watson Discovery tooling methods. For more information, see Improving result relevance with the tooling in IBM documentation.

Where to go from here

Defining search data sets

