Training and testing the cognitive service for a custom application
You must specify the percentage of data that can be used for training the cognitive service. By default, 80% of the data set is used for training and 20% is used as test data.
Process for training and testing cognitive service
The following image explains the tasks that an administrator must perform to train and test the cognitive service by using a CSV data set or application data:
The following table describes the steps to train and test the cognitive service:
Create data sets | |||
---|---|---|---|
Type of data set | Task | Description | Reference |
CSV | Create a CSV data set | Create the CSV data set according to the defined structure and guidelines. | |
Upload the CSV file | Upload the CSV data and specify the percentage of rows that you want to use as training data and test data. The system randomly splits the CSV file into training data set and test data set. | ||
Application data | (Optional) Upload seed data | When you start training the cognitive service, you might not yet have application data. In that case, the seed data acts as startup data for machine learning. The cognitive service learns from the seed data first and then picks up the application data. Important: The seed data is provided in a CSV file. | |
Specify the application data that you want to use for training | You must select the fields in the record definitions whose data is used for training and testing the cognitive service. To ensure that the training data does not exceed the limit specified by IBM Watson, you can define a condition to further filter the data. Example: You select the Service request record definition and the Summary and Category fields in that record definition. You also define a condition such as Status = Open and Priority = High so that only the data that matches these conditions is used for training. | ||
Train the cognitive service | |||
CSV | Train the cognitive service | Select the CSV training data set that you uploaded earlier to train the cognitive service. | |
Application data | Select the application data that you specified earlier to train the cognitive service. | ||
Evaluate the cognitive service training | |||
CSV and Application data | Evaluate whether the cognitive service is trained correctly. | After you train the cognitive service, you can evaluate the cognitive service training for auto-categorization or auto-assignment. For auto-assignment, the cognitive service returns the login IDs of the assignee. When you create assignment training data sets, you must ensure that the assignee belongs to the Agents group in the Foundation library. |
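The field selection and filter condition from the application data example in the table can be sketched as follows. This is only an illustration under stated assumptions: the in-memory records, the `select_training_rows` helper, and the `is_open_high` predicate are hypothetical and are not part of BMC Helix Innovation Studio, where the condition is defined in the UI.

```python
# Hypothetical records standing in for rows of the "Service request" record definition.
records = [
    {"Summary": "Cannot log in to VPN", "Category": "Network", "Status": "Open", "Priority": "High"},
    {"Summary": "Request new laptop", "Category": "Hardware", "Status": "Closed", "Priority": "Low"},
    {"Summary": "Email server down", "Category": "Email", "Status": "Open", "Priority": "High"},
    {"Summary": "Password reset", "Category": "Access", "Status": "Open", "Priority": "Low"},
]


def select_training_rows(records, text_field, category_field, condition):
    """Keep only records that match the condition, reduced to the selected fields."""
    return [
        {text_field: r[text_field], category_field: r[category_field]}
        for r in records
        if condition(r)
    ]


def is_open_high(record):
    """Illustrative equivalent of the condition: Status = Open and Priority = High."""
    return record["Status"] == "Open" and record["Priority"] == "High"


rows = select_training_rows(records, "Summary", "Category", is_open_high)
for row in rows:
    print(row)
```

Narrowing the condition reduces the number of rows sent for training, which is how the condition helps keep the data within the provider's limit.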
Before you begin
Before you train or test the cognitive service by using the CSV file or by using application data, make sure that you have completed the following tasks:
- To train or test the IBM Watson or BMC Native (Google) classification, administrators must create an application configuration UI.
For more information, see Enabling-a-custom-application-for-cognitive-service.
- Based on the classification service provider that you select, perform one of the following tasks. To learn more about classification service providers, see Centralized-configuration.
- To communicate with IBM Watson Assistant, add the Watson credentials in BMC Helix Innovation Studio.
For more information, see Configuring-cognitive-service-for-custom-applications-by-using-IBM-Watson-activated-by-BMC.
- To communicate with Google Cloud Platform and use the BMC Native (Google) classification service, add the service account credentials in BMC Helix Innovation Studio.
For more information, see Configuring-cognitive-service-for-a-custom-application-by-using-BMC-Native-Google-classification.
- If you want to use application data for training data sets, identify the record definition and fields that you want to use for training and testing.
To upload a CSV file
After creating the CSV data set, perform the following steps to upload the CSV file:
- Log in to BMC Helix Innovation Studio and navigate to the Administration tab.
- Click the configuration that you created for training the cognitive service.
For example, select My application > Cognitive Training.
Based on the value defined in the Classification-Service-Provider setting, one of the following tabs is displayed:
Value | Tab |
---|---|
WATSON | Auto-classification Training and Evaluation - IBM Watson |
NATIVE | Auto-classification training and evaluation - BMC Native (Google) |
For information about the Classification-Service-Provider setting, see Configuration-settings-C-D.
- From the selected tab, in the Data Sets section, click New, and select CSV Data Set.
The following image is an example of uploading CSV data sets:
- Fill out the Data Set Name and Description fields.
- In Training Type, the option is populated automatically based on the value defined in the Classification-Service-Provider setting.
You cannot change this option.
- In CSV File, select the CSV training data set file that you created earlier.
- From the Locale list, select the locale of the training data set.
- In Training Data, select the percentage of the CSV data that you want to use as training data.
- In Testing Data, the percentage of CSV data that you want to use as test data is automatically calculated according to the Training Data percentage.
- Click Save.
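The relationship between the Training Data and Testing Data percentages, and the random split that the system performs on the CSV rows, can be sketched roughly as follows. The `split_rows` function, its `seed` parameter, and the rounding rule are assumptions made for illustration. The actual splitting logic of the product is not documented here.

```python
import random


def split_rows(rows, training_percent, seed=None):
    """Randomly split rows into training and test sets by percentage.

    The testing percentage is derived from the training percentage,
    mirroring how the Testing Data field is calculated automatically.
    """
    if not 0 <= training_percent <= 100:
        raise ValueError("training_percent must be between 0 and 100")
    testing_percent = 100 - training_percent

    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seed is only for repeatable demos

    # Assumption: the cut point is rounded to the nearest whole row.
    cut = round(len(shuffled) * training_percent / 100)
    return shuffled[:cut], shuffled[cut:], testing_percent


# With the default 80/20 split, 10 rows yield 8 training rows and 2 test rows.
training_rows, test_rows, testing_percent = split_rows(range(10), 80, seed=42)
print(len(training_rows), len(test_rows), testing_percent)
```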
To use application data in a training data set
Perform the following tasks when you want to use application data to train and test the cognitive service:
To upload seed data
- Log in to BMC Helix Innovation Studio and navigate to the Administration tab.
- Click the configuration that you created for training the cognitive service.
For example, select My application > Cognitive Training.
- Click Configure Training Data Sets.
- On the Training Data Sets section, click New, and select Platform Data Set.
- Fill out the Data Set Name and Description fields.
- In Training Type, the option is populated automatically based on the value defined in the Classification-Service-Provider setting.
You cannot change this option.
- In CSV File, select the CSV data set file that you created earlier.
- From the Locale list, select the locale of the training data set.
- In Record Definition Name, select the record definition that you want to use to provide data to the cognitive service.
- In Text Fields, click Add/Remove Text Fields and select one or more text fields that contain the text values for the cognitive service to classify. If you select more than one text field, the values are concatenated.
- In Category Fields, click Add/Remove Category Fields and select one or more category fields to classify the values in the text fields. If you select more than one category field, the values are concatenated.
- Click Save.
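The concatenation of multiple selected fields described in the steps above can be pictured with a short sketch. The `build_example` helper, the space separator, and the `Description` and `Subcategory` field names are hypothetical. The separator that the platform actually uses is not stated here, and the sketch assumes that category fields are joined the same way as text fields.

```python
def build_example(record, text_fields, category_fields, separator=" "):
    """Join the selected text fields and category fields into one (text, label) pair."""
    text = separator.join(str(record[name]) for name in text_fields)
    label = separator.join(str(record[name]) for name in category_fields)
    return text, label


# Hypothetical record with two text fields and two category fields selected.
record = {
    "Summary": "VPN down",
    "Description": "Cannot connect from home",
    "Category": "Network",
    "Subcategory": "VPN",
}

text, label = build_example(
    record,
    text_fields=["Summary", "Description"],
    category_fields=["Category", "Subcategory"],
)
print(text)
print(label)
```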
To specify application data that you want to use for training and testing
- Log in to BMC Helix Innovation Studio and navigate to the Administration tab.
- Click the configuration that you created for training the cognitive service.
For example, select My application > Cognitive Training.
- Click Configure Training Data Sets.
- On the Training Data Sets section, select one of the following options:
- If you have not uploaded seed data, click New, and select Platform Data Set.
- If you have uploaded seed data, click the name of the data set in which you uploaded the seed data.
Modify the data set in which you have uploaded the seed data.
- Specify the application data that you want to use for training.
The new training data set is displayed in the Auto-classification Training and Evaluation section. An administrator can delete the training data set or create a copy of the existing training data set.
- Click Save.
To train and test the cognitive service
After you have uploaded the CSV data set or selected the application data, you can train and test the cognitive service.
- Log in to BMC Helix Innovation Studio and navigate to the Administration tab.
- Click the configuration that you created for training the cognitive service.
For example, select My application > Cognitive Training.
- In the Auto-classification Training and Evaluation section, select the training data set that you want to use for training, and click Train and Test.
The data set is randomly split into training data set and test data set based on the percentage that you specified earlier.
The status of the training data set changes to Training. When training completes successfully, the status changes to Trained.
Where to go from here
To understand how to evaluate the cognitive service test results, see Evaluating-the-cognitive-service-test-results.