This topic describes high-level use cases for the BMC TrueSight IT Data Analytics (IT Data Analytics) product.

Problem isolation and root cause analysis

A major use case for the product is isolating problems quickly and troubleshooting their root cause.

Problem

Applications and infrastructure produce large volumes of IT data such as logs, events, and performance metrics. IT organizations must manually collect, read, and correlate this data to identify application issues. This process is error prone and makes resolving application issues time-consuming. Most of the time, users report that a specific operation is failing, that the system appears to be slow, or that an operation is consistently failing for a specific user. Because the system has multiple components, investigating the log files of each component is a challenge.

Solution

IT Data Analytics allows IT organizations to collect and index large volumes of data, in real time, in a single location. IT Data Analytics partly automates the data-collection process with the help of features such as the collection profile and content pack. These features speed up the configuration required for collecting data in the product and simplify the troubleshooting process. The product enables you to perform simple ad hoc searches and to use search commands to perform advanced searches. In this way, you can find answers to your questions about IT data and find the root cause of application failures.

You can view all component logs across the application stack for a specific time period to find errors in other components.
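
The idea of viewing all component logs for one time period can be illustrated outside the product. The following is a minimal Python sketch, not the product's implementation: the function name merged_window and the in-memory log layout are assumptions made for illustration. It merges per-component log entries, each already sorted by time, into a single timeline restricted to a time window.

```python
import heapq
from datetime import datetime


def merged_window(logs_by_component, start, end):
    """Merge per-component logs into one timeline within [start, end].

    logs_by_component maps a component name to a list of
    (timestamp, message) tuples that is already sorted by timestamp.
    This layout is hypothetical and used only for illustration.
    """
    streams = (
        [(ts, name, msg) for ts, msg in entries if start <= ts <= end]
        for name, entries in logs_by_component.items()
    )
    # heapq.merge combines the already-sorted streams into one sorted stream.
    return list(heapq.merge(*streams))


logs = {
    "web": [
        (datetime(2024, 1, 1, 10, 0, 5), "ERROR 500 on /checkout"),
        (datetime(2024, 1, 1, 10, 2, 0), "request served"),
    ],
    "db": [
        (datetime(2024, 1, 1, 10, 0, 3), "ERROR connection timeout"),
        (datetime(2024, 1, 1, 11, 0, 0), "vacuum finished"),
    ],
}

timeline = merged_window(
    logs, datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 5)
)
for ts, component, message in timeline:
    print(ts.isoformat(), component, message)
```

In this sample data, the database error appears in the merged timeline just before the web error, which is the kind of ordering that helps you trace a failure to another component.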

The Views tab shows you application metrics trends over time. For example, you can monitor the response time, transaction duration, or server inactivity over a period of time. Using the Views tab in this way can help you narrow down the area to monitor during performance problem troubleshooting. 

By integrating IT Data Analytics with BMC ProactiveNet Performance Management, you can send related metrics from IT Data Analytics to BMC ProactiveNet Performance Management to correlate cause with effect before you apply the fix.

Benefit

The product simplifies your troubleshooting process by decreasing the number of steps required. You can narrow down problems by performing effective searches on the indexed data. You can use saved searches to schedule notifications that detect problems ahead of time. You can use workspaces to share the troubleshooting steps with other users. In this way, the product helps you reduce the mean time to resolution (MTTR) when solving critical problems.

Proactive applications and systems monitoring

This product enables you to proactively monitor the applications and systems in your environment.

Problem

A variety of performance and availability data, such as startup, abnormal termination, and performance data, is written into log files. Keeping track of process events such as starting and stopping can be difficult because of the large number of machines involved. It is also important to find out whether a process has stopped responding or has gone down. In addition, it is difficult to monitor how different application metrics behave over a period of time.

Solution

IT Data Analytics collects and organizes IT data such as performance metrics, logs, events, and abnormalities from a variety of data sources such as applications, databases, and servers. In this way, IT Data Analytics enables IT operations to monitor any application failures by executing predefined saved searches and sending notifications. The product detects such failures based on notification conditions that you define. The product has a mechanism to detect whether a server has gone down or has stopped responding within a certain period of time.

For example, say you have data collected and indexed for a particular server, "abchost.com." You create a saved search with the search query (COLLECTOR_NAME=abchost.com). Based on the saved search, you add a notification with an equal to (=) condition that compares the result count with zero. Now, suppose you execute this search query every 30 minutes. If the result count is zero, you know that the server is not processing log data and the application has not started successfully. In this way, notifications about dependent system failures can be sent, based on the notification conditions specified, before the end user reports abnormal behavior.
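
The logic of this example can be sketched in Python. This is an illustration under stated assumptions, not the product's API: the NotificationCondition class, the count_matches and check_saved_search functions, and the in-memory record layout are all hypothetical. Only the COLLECTOR_NAME field, the abchost.com value, the equal to (=) condition, and the 30-minute interval come from the example above.

```python
import operator
from dataclasses import dataclass
from datetime import datetime, timedelta

# Comparators a notification condition might support (assumed set).
COMPARATORS = {"=": operator.eq, ">": operator.gt, "<": operator.lt}


@dataclass
class NotificationCondition:
    """Hypothetical condition evaluated against a search's result count."""
    comparator: str
    threshold: int

    def is_met(self, result_count):
        return COMPARATORS[self.comparator](result_count, self.threshold)


def count_matches(records, field, value, window_end, window):
    """Count records whose field equals value within the window before window_end."""
    window_start = window_end - window
    return sum(
        1
        for record in records
        if record.get(field) == value
        and window_start < record["timestamp"] <= window_end
    )


def check_saved_search(records, now):
    """Return True when a notification should be sent for abchost.com."""
    count = count_matches(
        records, "COLLECTOR_NAME", "abchost.com", now, timedelta(minutes=30)
    )
    # Equal to (=) condition on a result count of zero, as in the example.
    return NotificationCondition("=", 0).is_met(count)


now = datetime(2024, 1, 1, 12, 0)
stale = [{"COLLECTOR_NAME": "abchost.com", "timestamp": datetime(2024, 1, 1, 10, 0)}]
if check_saved_search(stale, now):
    print("No data from abchost.com in the last 30 minutes; send notification")
```

Running the check on a schedule, rather than waiting for a user report, is what makes the monitoring proactive: an empty result window is treated as a signal in its own right.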

Viewing the data trend over a period of time helps you simplify your troubleshooting. For example, an increase in application latency over time suggests that the load on the application has increased over time.

The product allows you to view a trend analysis chart in two ways:

  • By adding views based on search queries that use advanced search commands such as timechart
  • By viewing the summarization charts that are available on the Search tab
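
To illustrate what a time-based trend chart computes, the following minimal Python sketch groups metric samples into fixed time buckets and averages each bucket. It is not the implementation or syntax of the timechart search command; the bucket_average function and the sample latency values are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def bucket_average(samples, bucket):
    """Average (timestamp, value) samples per fixed-size time bucket.

    Returns a dict that maps each bucket start time to the mean value,
    in ascending time order.
    """
    totals = defaultdict(lambda: [0.0, 0])  # bucket start -> [sum, count]
    for ts, value in samples:
        # Integer number of whole buckets since the epoch.
        index = (ts - EPOCH) // bucket
        start = EPOCH + index * bucket
        totals[start][0] += value
        totals[start][1] += 1
    return {start: total / n for start, (total, n) in sorted(totals.items())}


# Hypothetical application latency samples in milliseconds.
samples = [
    (datetime(2024, 1, 1, 10, 5), 100.0),
    (datetime(2024, 1, 1, 10, 20), 200.0),
    (datetime(2024, 1, 1, 10, 40), 400.0),
]

trend = bucket_average(samples, timedelta(minutes=30))
for start, mean in trend.items():
    print(start.strftime("%H:%M"), mean)
```

With these sample values, the average rises from the first 30-minute bucket to the second, which is the kind of upward latency trend described above.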

Benefit

The product features help you reduce application downtime, respond proactively to failures, and easily understand how application parameters trend over time.