Phased rollout


This version of the software is currently available only to early adopter SaaS customers as the first step in our phased rollout.

Leveraging machine learning metrics to improve chatbot predictability

BMC Helix Cognitive Automation provides chatbot capabilities for an application so that the chatbot responds correctly to end-user queries, understands the chat context, and performs tasks on behalf of end users. You must train IBM Watson Assistant to work with your data by creating intents, entities, and dialog in IBM Watson Assistant; these serve as training data for the chatbot. Before implementing the intents, entities, and dialog, you can test the accuracy of BMC Helix Virtual Agent to ensure that the chatbot correctly understands the user intent and correctly auto-categorizes service requests. When the chatbot does not respond as desired, the test results pinpoint the problem area so that you can correct the chatbot training data. The tests are particularly important when you implement a new training data set or make major changes to an existing data set.

Benefits of testing BMC Helix Virtual Agent

Testing BMC Helix Virtual Agent has the following benefits:

  • Helps to evaluate the chatbot on the basis of standard machine learning algorithms. 
  • Helps identify the exact problem area so that you can correct the data sets to improve the performance of BMC Helix Virtual Agent.
  • Provides a history of the test results. 

Test metrics provided after testing BMC Helix Virtual Agent

You can test BMC Helix Virtual Agent to get test metrics, including precision and recall.

Higher precision and recall indicate higher viability of the data sets. For more information about how test metrics are calculated, see FAQs.
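As a rough illustration of what precision and recall measure for intent classification, the following Python sketch computes both per intent from pairs of expected and predicted intents. It uses the standard textbook definitions (precision = TP / (TP + FP), recall = TP / (TP + FN)); it is not the product's own calculation, and the function name, the sample pairs, and the "Increase RAM" intent name are assumptions made for the example.

```python
from collections import Counter

def per_intent_precision_recall(pairs):
    """Return {intent: (precision, recall)} from (expected, predicted) pairs."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for expected, predicted in pairs:
        if expected == predicted:
            tp[expected] += 1
        else:
            fp[predicted] += 1  # the predicted intent was wrong
            fn[expected] += 1   # the expected intent was missed
    scores = {}
    for intent in set(tp) | set(fp) | set(fn):
        p_den = tp[intent] + fp[intent]
        r_den = tp[intent] + fn[intent]
        precision = tp[intent] / p_den if p_den else 0.0
        recall = tp[intent] / r_den if r_den else 0.0
        scores[intent] = (precision, recall)
    return scores

# Hypothetical test outcomes: (expected intent, intent the chatbot predicted)
pairs = [
    ("PTO request", "PTO request"),
    ("PTO request", "Show time card"),
    ("Show time card", "Show time card"),
    ("Increase RAM", "Increase RAM"),
]
scores = per_intent_precision_recall(pairs)
for intent, (precision, recall) in sorted(scores.items()):
    print(f"{intent}: precision={precision:.2f}, recall={recall:.2f}")
```

In this sample, one PTO utterance is mistaken for Show time card, which lowers the recall of PTO request and the precision of Show time card.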

Additional information about machine learning metrics

The following blogs provide more information about machine learning metrics and macro versus micro average of precision. BMC does not endorse the information in these external links. The information provided in these links should be used for reference purposes only.

  • GreyAtom Blog, Performance Metrics for Classification problems in Machine Learning.
  • Text Mining, Analytics, & More Blog, Computing Precision and Recall for Multi-Class Classification Problems.
  • Data Science Stack Exchange Question and Answer Site, Micro Average vs Macro average Performance in a Multiclass classification setting.
  • Rushdi Shams Blog, Micro and Macro-average of Precision, Recall and F-Score.
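The difference between macro and micro averaging can be sketched in a few lines of Python. The example assumes the general definitions: the macro average is the mean of the per-intent precision values, and the micro average pools the counts across intents before dividing. The function name and the counts are hypothetical, and the sketch does not indicate which average the product reports.

```python
def macro_micro_precision(counts):
    """counts maps intent -> (true_positives, false_positives)."""
    per_intent = [
        tp / (tp + fp) if (tp + fp) else 0.0
        for tp, fp in counts.values()
    ]
    # Macro: every intent counts equally, regardless of how many examples it has.
    macro = sum(per_intent) / len(per_intent)
    # Micro: pool all predictions, so frequent intents dominate the result.
    total_tp = sum(tp for tp, _ in counts.values())
    total_fp = sum(fp for _, fp in counts.values())
    micro = total_tp / (total_tp + total_fp) if (total_tp + total_fp) else 0.0
    return macro, micro

# Hypothetical counts: a frequent intent that performs well and a rare one that does not.
counts = {"PTO request": (90, 10), "Increase RAM": (1, 9)}
macro, micro = macro_micro_precision(counts)
print(f"macro={macro:.3f}, micro={micro:.3f}")
```

With imbalanced intents such as these, the macro average exposes the poorly performing rare intent, whereas the micro average mostly reflects the frequent one.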

Scenario of testing BMC Helix Virtual Agent

Scenario: An organization uses BMC Helix Virtual Agent to create service requests on behalf of the end user. The administrator has created intents, entities, and dialogs in IBM Watson Assistant, so that when an end user chats and requests more RAM for a laptop, the chatbot categorizes the user's intent and creates a service request on behalf of the user. These intents, entities, and dialogs serve as training data for BMC Helix Virtual Agent. You want to test whether the chatbot, trained on this data set, correctly understands the users' intent.

Example of test data: You must create a CSV file of test data that contains examples of natural-language text and the user's intent for each text. This test data set is used to check whether variations of the natural-language dialog, such as "apply for vacation", "need time-off", and so on, are recognized as PTO requests.
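A test data file of this kind could be generated with a short Python script such as the following sketch. The column names (utterance, expected_intent), the row order, and the "Increase RAM" intent name are assumptions for illustration only; check the product documentation for the exact CSV layout that the test expects.

```python
import csv
import io

# Hypothetical layout: one natural-language utterance per row,
# followed by the intent the chatbot is expected to recognize.
rows = [
    ("apply for vacation", "PTO request"),
    ("need time-off", "PTO request"),
    ("increase RAM in my laptop", "Increase RAM"),
]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["utterance", "expected_intent"])  # assumed header names
writer.writerows(rows)

test_data_csv = buffer.getvalue()
print(test_data_csv)
```

To produce a file instead of a string, the same writer can be pointed at a file opened with open("test_data.csv", "w", newline="").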

Example of test results: The test results CSV file shows that the variation "need time-off" is incorrectly recognized as Show time card. You can add more entities of this variation in IBM Watson Assistant. You can also evaluate the score of each test metric.
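Misclassified variations like this one can be picked out of a results file programmatically. The following Python sketch assumes a results CSV with utterance, expected_intent, and predicted_intent columns; these column names and the sample rows are hypothetical, and the actual results file may be laid out differently.

```python
import csv
import io

# Hypothetical results file content, inlined so the sketch is self-contained.
results_csv = """utterance,expected_intent,predicted_intent
apply for vacation,PTO request,PTO request
need time-off,PTO request,Show time card
"""

def misclassified(csv_text):
    """Return the rows whose predicted intent differs from the expected intent."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if row["expected_intent"] != row["predicted_intent"]
    ]

errors = misclassified(results_csv)
for row in errors:
    print(
        f'{row["utterance"]!r}: expected {row["expected_intent"]}, '
        f'got {row["predicted_intent"]}'
    )
```

Each row that is returned points to a variation that may need more training data in IBM Watson Assistant.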

Where to go from here

Testing the chatbot training data to improve predictability

Related topic

Leveraging machine learning metrics to improve cognitive service data sets
