Migrating IBM Watson Discovery training data from V1 to V2


Users of IBM Watson Discovery V1 may already have training data in their V1 instance. After they migrate to IBM Watson Discovery V2, they may want to retain that training data. To keep access to the features they used in IBM Watson Discovery V1, users need all the relevant information that is available in that version. However, IBM does not offer a copy-and-paste process to move the data from V1 to V2. As a result, users need to migrate the training data from the beginning to keep using it with IBM Watson Discovery V2.

After the source data in the V1 instance, such as record definitions, articles, and other content ingested through the BMC crawler utility or any supported data source, has been ingested again into the V2 instance, the user needs to restore all the training data of the V1 instance and migrate it to the V2 instance.

Before you begin

Make sure that you have completed the following tasks:

Product: System configuration

Task: Make sure that you have:

  • A Linux, Unix, or Windows machine on which you can run the BMC crawler utility.
  • Java version 11 or later.

Product: IBM Watson Discovery

Task:

  • Have the following details of the IBM Watson Discovery instance to which you want to upload the BMC Helix ITSM: Knowledge Management articles:
    • Identity and Access Management (IAM) API key
    • Endpoint URL
    • IBM Watson Discovery V2

You can get the IAM API key and the endpoint URL by logging in to IBM Cloud.

 

Important

  • Ensure that the plain-text questions file, <filename>.txt, contains more than 51 questions.
  • Ensure that the <filename>.properties file exists in the instance.
  • Ensure that you have configured the following mandatory attributes in the properties file:
    • questionsFile=<Filename.txt>
    • trainingFile=<Filename.csv>
    • version=2023-03-31
    • apikey=<watson_discovery_v2_apikey>
    • url=<watson_discovery_v2_url>
    • projectId=<project_id>
    • collectionId=<collection_id>
  • Ensure that you have configured the following optional attributes in the properties file:
    • Desired confidence rank to use for training (default value is 0.01)
      Example: desireConfidenceKey=<desireConfidenceKey>
    • Number of examples to use for training (default value is 10)
      Example: desireExampleKey=<desireExampleKey>
    • Number of results that Discovery returns from the query API request (default value is 10)
      Example: resultsCountKey=<resultsCountKey>
    • Field names, separated by commas (default values are id, confidence, extracted_metadata, text, metadata, and title)
      Example: fieldNamesKey=<fieldNames1,fieldName2...>
    • Number of characters returned in the "TEXT" field (default value is 400)
      Example: resultsCharsKey=<resultsCharsKey>
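
The checks listed under Important can be scripted before you run the utility. The following Python sketch is illustrative only and is not part of the BMC utility. It assumes that the properties file uses plain key=value lines (no line continuations or ":" separators) and that the questions file holds one question per line, which this topic does not confirm. All function names are hypothetical.

```python
# Hypothetical pre-flight checks; not part of the BMC utility.
MANDATORY_KEYS = [
    "questionsFile", "trainingFile", "version",
    "apikey", "url", "projectId", "collectionId",
]
MIN_QUESTIONS = 52  # the file must contain more than 51 questions


def parse_properties(text):
    """Parse simple key=value lines; skip blank lines and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_mandatory_keys(props):
    """Return the mandatory attributes that are absent or empty."""
    return [key for key in MANDATORY_KEYS if not props.get(key)]


def count_questions(text):
    """Count non-empty lines (assumption: one question per line)."""
    return sum(1 for line in text.splitlines() if line.strip())


def has_enough_questions(text):
    """True if the questions text meets the documented minimum."""
    return count_questions(text) >= MIN_QUESTIONS
```

To use the sketch, read the contents of the properties file and pass them to parse_properties, then pass the result to missing_mandatory_keys. An empty list means that all mandatory attributes are set. Read the file named by questionsFile and pass its contents to has_enough_questions.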

Task 1: To migrate IBM Watson Discovery V1 support data to IBM Watson Discovery V2

  1. Ensure that the Java archive and its files are in the same folder, and then type: java -DibmtrainingPropFile=<sample-ibmtraining>.properties -jar IBMTraining-99.00.00.jar
  2. Use the Java utility to convert the questions file to a list of objects and run the IBM query API search process.
  3. The system stores the results in the generated *.csv file, with the following data:
    • Question.
    • Document ID.
    • Passage text: first 400 characters.
    • Training rank, based on the IBM confidence score. It is used to generate the default training examples.
  4. After a successful run, you can access the V2 instance and customise the training data.
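
Before you adjust the training data in the V2 instance, it can help to review the generated file. The following Python sketch is a hedged illustration and is not part of the BMC utility. It assumes that the *.csv file holds four columns in the order listed in step 3 (question, document ID, passage text, training rank), which this topic does not confirm, so adjust the column order to match the actual file. The function name and the min_rank threshold are hypothetical.

```python
import csv
import io
from collections import defaultdict


def group_results_by_question(csv_text, min_rank=0.0):
    """Group rows by question, keeping rows whose rank is >= min_rank.

    Assumed column order (not confirmed by the source):
    question, document ID, passage text, training rank.
    """
    grouped = defaultdict(list)
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 4:
            continue  # skip blank or malformed rows
        question, doc_id, passage, rank = row[:4]
        try:
            rank_value = float(rank)
        except ValueError:
            continue  # for example, a header row, if the file has one
        if rank_value >= min_rank:
            grouped[question].append((doc_id, rank_value, passage))
    # Within each question, order rows by rank value, highest first
    for rows in grouped.values():
        rows.sort(key=lambda r: r[1], reverse=True)
    return dict(grouped)
```

Read the generated file as text and pass it to group_results_by_question. Each question maps to a list of (document ID, rank, passage text) tuples that you can compare with the training examples shown in the V2 instance.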


 
