This documentation supports the 20.08 version of BMC Helix Platform. 

Installing and configuring the cognitive search data crawler for Remedy Knowledge Management articles

To include knowledge articles from Remedy Knowledge Management in cognitive search, you must install and configure the BMC crawler utility. This utility is an improved crawler that crawls only Remedy Knowledge Management articles.

After crawling, the knowledge articles in Remedy Knowledge Management are uploaded to the IBM Watson Discovery collection. 

Tip

BMC Helix Platform cognitive search supports BMC Helix Business Workflows knowledge articles directly; you do not need a crawler to include those articles in cognitive search.

Important

  • If you have previously crawled the Remedy Knowledge Management articles by using the IBM Watson Discovery Data Crawler, you must perform one of the following tasks:
    • Create a new collection and re-crawl by using the BMC crawler utility.
    • Configure the BMC crawler utility so that the utility deletes the older collection, creates a new collection, and re-crawls the knowledge articles.

After re-crawling, use the updated display templates. The older display templates do not work with the BMC crawler.

Before you begin

Ensure that you have completed the following tasks:

System configuration

  • Ensure that you have a Linux, UNIX, or Windows machine on which you can run the BMC crawler utility.
  • Ensure that Java 11 or later is installed on that machine.

IBM Watson Discovery

  • Have the following details of the IBM Watson Discovery instance to which you want to upload the Remedy Knowledge Management articles:
    • Identity and Access Management (IAM) API key
    • Endpoint URL

You can get the IAM API key and the endpoint URL by logging in to IBM Cloud.

Remedy Knowledge Management


  • Have the Remedy Action Request System hostname and port. (You do not need to run the utility on Remedy AR System server.)
  • Have the Remedy administrator credentials.
  • To ensure that end users can access the articles in Remedy Knowledge Management, set the visibility conditions for the knowledge articles. For information about configuring the visibility for knowledge articles, see Managing knowledge article visibility.
  • To ensure that the articles are indexed, register the file system path. For more information, see Registering file system paths.

    The files in the registered path are mapped to the RKM:VF_FileSystem_Manageable_Join vendor form in Remedy Knowledge Management.
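
The Java 11 prerequisite listed above can be checked before you install anything. The following is a minimal sketch, not part of the BMC utility: the function names are hypothetical, and the parsing assumes the usual Java version strings (the legacy "1.8.0_292" scheme and the current "11.0.2" scheme).

```shell
#!/bin/sh
# Extract the major version from a Java version string.
# Handles the legacy scheme ("1.8.0_292" gives 8) and the
# current scheme ("11.0.2" gives 11, "17" gives 17).
java_major_version() {
    version=$1
    case $version in
        1.*) version=${version#1.} ;;   # legacy "1.x" scheme: drop the "1." prefix
    esac
    # Keep only the leading digits
    echo "${version%%[!0-9]*}"
}

# Succeed only if the given version string is Java 11 or later.
java_is_supported() {
    major=$(java_major_version "$1")
    [ -n "$major" ] && [ "$major" -ge 11 ]
}

# Usage (assumes java is on the PATH; java -version prints to stderr):
#   installed=$(java -version 2>&1 | sed -n '1s/.*"\(.*\)".*/\1/p')
#   java_is_supported "$installed" && echo "Java $installed is supported"
```

On Windows, the same check can be done by running java -version in the command prompt and reading the version number manually.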

Process for installing and configuring the BMC crawler utility

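The end-to-end process to install and configure the BMC crawler utility consists of the following tasks, which are described in the sections that follow:

  1. Set up your system and download the BMC crawler utility.
  2. Download the out-of-the-box files.
  3. Encrypt the Remedy password.
  4. Run the crawler a second time and pause it for stop words.
  5. Upload the stop words file.
  6. Check the status of the stop words file.
  7. Run the crawler.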

Task 1: To set up your system and download the BMC crawler utility

  1.  Download the BMC crawler utility.
    1. As an administrator, download the BMC crawler utility.
    2. Save the crawler utility on your local machine. 
    3. Extract the .zip file of the utility. 

      The extracted folder includes the following files:

      • bmcCrawler.properties
      • bmcCrawler-20.2.0.jar
  2.  Set the Java environment variables.

    You must set the Java environment variables in your system to run the BMC crawler utility.

    1. Open the command prompt.
    2. Perform one of the following steps:

      • For Linux, run the following commands, replacing /opt/jdk with your Java installation directory:

        Commands to set environment variables
        export JAVA_HOME=/opt/jdk
        export PATH=/opt/jdk/jre/bin:$PATH
        export PATH=$JAVA_HOME/bin:$PATH
      • For Windows, set the JAVA_HOME variable.

  3.  Configure the bmcCrawler.properties file.
    1. Navigate to the folder where you downloaded the BMC crawler utility. 
    2. Open the extracted folder and then open the bmcCrawler.properties file. 
    3. Specify the following parameter values:

      Mandatory parameters

      discoveryApiKey
        Description: Specify the API key of the IBM Watson Discovery instance.
        Example value: zABCd4EFgHijKLmn1O1pqrStuu0vWXYzaBCDEfGHIjk_

      discoveryEndPoint
        Description: Specify the endpoint URL of the IBM Watson Discovery instance.
        Example value: https://gateway-<location>.watsonplatform.net/discovery/api

      remedyHostName
        Description: Specify the host name of the Remedy Action Request System server.
        Example value: abc-tenantname-1234

      remedyPort
        Description: Specify the port number of the Remedy Action Request System server.
        Example value: 46262

      remedyUser
        Description: Specify the Remedy AR System administrator user name.
        Example value: Administrator

      remedyPassword
        Description:
        • (If you are running the BMC crawler utility to encrypt the password) Specify the password in plain text.
        • (If you are running the BMC crawler utility after encrypting the password) Specify the encrypted password.
        Example values:
        • Plain text password: password123
        • Encrypted password: 1234567891106a28b29c843d98e5fg1hi59j0k12345678953592258d3a981312b1

      formName
        Description: Specify the Remedy Knowledge Management form name.
        Note: You can specify only one form at a time. Comma-separated form names are not valid.
        Example value: RKM:HowToTemplate_Manageable_Join

      fieldNamesForParagraphs
        Description: Specify the field names whose values are appended to the Text field of the IBM Watson Discovery collection. During a cognitive search, matching paragraphs are derived from the Text field.
        Notes:
        • For file system articles, specify only one attachment field. If you specify multiple attachment fields, only the first one is considered.
        • For database tables, do not specify attachment fields.
        Example value: Article_Keywords,ArticleTitle

      fieldNamesForDetails
        Description: Specify the field names that should be added as metadata fields in the IBM Watson Discovery collection. During a cognitive search, the metadata fields are used to display the knowledge article details.
        Notes:
        • For file system articles, specify only one attachment field. If you specify multiple attachment fields, only the first one is considered.
        • For database tables, do not specify attachment fields.
        Example value: DocID,ArticleTitle

      discoveryConfigurationName
        Description:
        • If you do not want to specify the value for discoveryConfigurationFilePath, specify the name of the IBM Watson Discovery configuration.
        • If you specify the value for discoveryConfigurationFilePath, the configuration name is taken from the JSON file.
        Example value: HowTo-Config

      discoveryCollectionName
        Description: Specify the name of the IBM Watson Discovery collection.
        Example value: HowTo-Collection

      customerId
        Description: Specify the customer ID that you want to use for labeling data for General Data Protection Regulation (GDPR).
        Example value: BMC

      Optional parameters with default values

      chunkSize
        Description: Specify the batch size for querying entries from the Remedy Knowledge Management form.
        Default value: 1000

      threadPoolSize
        Description: Specify the number of knowledge articles that will be crawled simultaneously.
        Default value: 10
        Maximum valid value: 50

      deleteCollectionBeforeCrawl
        Description: Specify whether you want to delete the collection before you start crawling.
        Default value: no
        Valid values: yes, no

      qualification
        Description: Specify a valid Remedy AR System qualification to include Remedy Knowledge Management articles.
        Default value: '1=1'

      language
        Description: Specify the language of the IBM Watson Discovery collection.
        Valid values: en, es, de, fr, it, ja, ko, pt, nl, zh-CN
        Default value: en

      discoveryEnvironmentSize
        Description: Specify the IBM Watson Discovery plan.
        Default value: LT (for the Lite plan)
        Valid values: LT (for the Lite plan), S (for all other plans)

      fieldNameForGettingModifiedRecord
        Description: Specify the field name that represents the modified date of a form in Remedy Knowledge Management.
        Default value: Modified Date

      Optional parameters without default values

      discoveryConfigurationFilePath
        Description: (After you download and modify the out-of-the-box files) Specify the path of the JSON file that you modified.
        Example value: C:\\Path\\Demo_Discovery_Collection_Configuration_HowTo_.json

      titleFieldName
        Description: Specify the field name that will be considered as the title for the knowledge article in the search results.
        Example value: Title
  4.  Restrict the properties file to administrators.

    After downloading the BMC crawler utility, you must restrict the bmcCrawler.properties file so that only authorized administrators can access and modify this file.

    Best Practice

    We recommend that you grant access to the bmcCrawler.properties file only to the user who owns it. Do not grant access to any other user.

    • To restrict the file on Windows, perform one of the following steps:
      • Restrict file permissions by using Windows Explorer.
      • Open the command prompt and apply Discretionary Access Control Lists (DACLs) to the file.
    • To restrict the file on Linux, open the command prompt and run the chmod command with mode 700, as specified in the chmod command description.
      For example, use one of the following commands:
        • chmod 700 bmcCrawler.properties
        • chmod u+rwx,go-rwx bmcCrawler.properties
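
Steps 3 and 4 above can be sketched together for Linux. The following example writes a properties file that contains only the mandatory parameters and then restricts it to its owner. It rests on several assumptions: the file is taken to use the standard Java key=value properties syntax (the page does not show the file's syntax), every value is an example value from this page and must be replaced, and the endpoint host is borrowed from the curl example in Task 6. Running it overwrites any bmcCrawler.properties file in the current directory.

```shell
#!/bin/sh
# Write a sample properties file with the mandatory parameters.
# ASSUMPTION: standard Java properties syntax (key=value, # for comments).
# All values are example values from this page; replace every one of them.
props=bmcCrawler.properties

cat > "$props" <<'EOF'
# IBM Watson Discovery connection
discoveryApiKey=zABCd4EFgHijKLmn1O1pqrStuu0vWXYzaBCDEfGHIjk_
discoveryEndPoint=https://gateway.watsonplatform.net/discovery/api
discoveryConfigurationName=HowTo-Config
discoveryCollectionName=HowTo-Collection
# Remedy AR System connection (plain text password for the first run only)
remedyHostName=abc-tenantname-1234
remedyPort=46262
remedyUser=Administrator
remedyPassword=password123
# Form and fields to crawl
formName=RKM:HowToTemplate_Manageable_Join
fieldNamesForParagraphs=Article_Keywords,ArticleTitle
fieldNamesForDetails=DocID,ArticleTitle
customerId=BMC
EOF

# Restrict the file to its owner, as described in step 4.
chmod u+rwx,go-rwx "$props"

# Show the resulting permissions; the mode column should read -rwx------
ls -l "$props"
```

After you encrypt the password in Task 3, replace the plain text remedyPassword value with the encrypted one.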

Task 2: To download the out-of-the-box files

BMC provides the following files out-of-the-box:

  • Stop words file: Text file with a list of words that you can filter out from the data collection.
  • Discovery Collection configuration file: JSON file that defines the format in which crawled knowledge articles are uploaded to the IBM Watson Discovery collection.

After downloading, you can modify the stop words list or the enrichments of the knowledge articles. 

  1. Download the sample stop words file. 
  2. Download the Discovery Collection configuration file.
  3. (Optional) If required, modify the stop words list. 
  4. Modify the enrichments section in the Discovery Collection configuration file according to the Remedy Knowledge Management form that you want to crawl. 
  5. Save the modified file with a different name. 
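
Modifying the stop words list (step 3) can be scripted if you maintain your own additions. The following sketch assumes that the stop words file holds one word per line; the function name and the file names in the usage comment are hypothetical. It copies the sample list, appends extra words, and removes blank lines and duplicates.

```shell
#!/bin/sh
# Merge extra stop words into a copy of the sample file.
# ASSUMPTION: the stop words file contains one word per line.
# Arguments: sample file, output file, then any number of extra words.
merge_stopwords() {
    sample=$1
    output=$2
    shift 2        # the remaining arguments are the extra words
    {
        cat "$sample"
        echo       # guard against a sample file without a trailing newline
        for word in "$@"; do
            printf '%s\n' "$word"
        done
    } | sed '/^[[:space:]]*$/d' | sort -u > "$output"
}

# Usage:
#   merge_stopwords stopwords_sample.txt stopwords_custom.txt remedy bmc
```

Writing to a separate output file leaves the downloaded sample untouched, so you can regenerate the list later.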

Task 3: To encrypt the password

  1.  Run the crawler for the first time to encrypt the Remedy password.

    The password is encrypted by using the AES cipher in GCM mode with a 256-bit key.

    1. Open the command prompt. 
    2. Run the following command:

      Example of command to encrypt the password
      java -jar -DbmcCrawlerPropertyFile=.\bmcCrawler.properties bmcCrawler-20.2.0.jar -encpassword

      The encrypted password is displayed in the command prompt. 

    3. Note the encrypted password for future reference. 

  2.  (For Microsoft Windows only) Restrict the key.txt file to administrators.

    After the password is encrypted, the key.txt file is generated in the directory from which you ran the BMC crawler utility. On Microsoft Windows, you must restrict this file to administrators only. On Linux, the file is restricted automatically, so you do not have to restrict it manually.

    1. Navigate to the location of the key.txt file.
    2. Perform one of the following steps:
      • Restrict file permissions by using Windows Explorer.
      • Open the command prompt and apply Discretionary Access Control Lists (DACLs) to the file.

Task 4: To run the crawler for a second time and forcefully pause the crawler for stop words

Perform this task if you want to upload and activate the stop words.

Best practice

We recommend that you wait until the stop words file status becomes active, which takes several minutes. To check the status, perform the steps in Task 6.

  1. Open the command prompt. 
  2. Run the following command:

    Example of command to pause the crawler for stop words
    java -jar -DbmcCrawlerPropertyFile=.\bmcCrawler.properties bmcCrawler-20.2.0.jar -stopToCreateStopwordList

    The BMC crawler displays a message that the crawler has stopped. You must then go to the IBM Watson Discovery UI to upload and activate the stop words file.

Task 5: To upload the stop words file

  1. Log in to IBM Watson Discovery.
  2. Navigate to the data collection to which you want to upload the Remedy Knowledge Management articles. 
  3. Click the Search settings tab. 
  4. In the Stopwords section, click Upload.

For more information about uploading stop words, see Defining stop words in IBM documentation.

Task 6: To check the status of the stop words file

After you upload the stop words file, it takes several minutes for the stop words status to become active. To check the status of the file, perform the following steps:

  1. In IBM Watson Discovery, open the data collection to which you want to upload the articles.
  2. Copy the Collection Id and the Environment Id.
  3. Open the command prompt.
  4. Run the following command:

    Example of command to check the stop words file status
    curl -u "apikey:{apikey}" -X GET "https://gateway.watsonplatform.net/discovery/api/v1/environments/{environment_id}/collections/{collection_id}/word_lists/stopwords?version=2019-04-30"

    If the file is in pending status, as shown in the following example, wait for the status to change to active:

    {"status":"pending","type":"stopwords"}

    After the file is active, the status is displayed as shown in the following example:

    {"status":"active","type":"stopwords"}
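
Instead of rerunning the command by hand, the status check above can be wrapped in a polling loop. The following is a sketch only: the function names and the POLL_INTERVAL and POLL_MAX variables are hypothetical, the request is the same one shown in the curl example, and the JSON is parsed with sed on the assumption that the response has the simple shape shown in the examples.

```shell
#!/bin/sh
# Extract the "status" value from the JSON response on stdin.
stopwords_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Poll the stop words endpoint until the status is "active".
# Arguments: API key, endpoint URL, environment ID, collection ID.
# Hypothetical tuning variables: POLL_INTERVAL (seconds between
# attempts, default 30) and POLL_MAX (number of attempts, default 30).
wait_for_stopwords() {
    url="$2/v1/environments/$3/collections/$4/word_lists/stopwords?version=2019-04-30"
    attempt=0
    while [ "$attempt" -lt "${POLL_MAX:-30}" ]; do
        status=$(curl -s -u "apikey:$1" -X GET "$url" | stopwords_status)
        echo "Stop words status: ${status:-unknown}"
        [ "$status" = "active" ] && return 0
        attempt=$((attempt + 1))
        sleep "${POLL_INTERVAL:-30}"
    done
    return 1   # still not active after POLL_MAX attempts
}

# Usage:
#   wait_for_stopwords "$API_KEY" https://gateway.watsonplatform.net/discovery/api \
#       "$ENVIRONMENT_ID" "$COLLECTION_ID" && echo "Ready to crawl"
```

The loop gives up after a fixed number of attempts so that an upload that never becomes active does not block a script indefinitely.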

Task 7: To run the crawler

You run the BMC crawler for a third time to crawl Remedy Knowledge Management articles so that they are uploaded to the IBM Watson Discovery collection.

  1. Open the command prompt. 
  2. Run the following command:

    Example command to run BMC crawler
    java -jar -DbmcCrawlerPropertyFile=.\bmcCrawler.properties bmcCrawler-20.2.0.jar

    A log file is generated every time you run the BMC crawler and is saved in the same directory from where you ran the crawler. 
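
The three crawler runs in Tasks 3, 4, and 7 differ only in the final flag, so on Linux they can share one small wrapper. This is a sketch under assumptions: the run_crawler function name is hypothetical, and the ./ relative path is the Linux counterpart of the .\ path that the Windows examples use. The flags and file names are the ones shown on this page.

```shell
#!/bin/sh
# Run the BMC crawler from the directory that contains the extracted files.
# Usage: run_crawler [mode-flag]
#   run_crawler -encpassword               # Task 3: encrypt the password
#   run_crawler -stopToCreateStopwordList  # Task 4: pause for stop words
#   run_crawler                            # Task 7: crawl the articles
run_crawler() {
    # ASSUMPTION: Linux-style relative path to the properties file.
    java -jar -DbmcCrawlerPropertyFile=./bmcCrawler.properties \
        bmcCrawler-20.2.0.jar "$@"
}
```

Because the log file and key.txt are written to the directory from which the crawler runs, call the wrapper from the same directory each time.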

Tip

After crawling articles from Remedy Knowledge Management, you can train and test the search result relevancy by using IBM Watson Discovery tooling methods. For more information, see Improving result relevance with the tooling in IBM documentation.

Where to go from here

Defining search data sets

