This topic describes high-level use cases for the TrueSight IT Data Analytics product.
Problem isolation and root cause analysis
One of the major use cases for the product is isolating problems quickly and identifying their root cause.
Applications and infrastructure produce large volumes of IT data such as logs, events, and performance metrics. IT organizations must manually collect, read, and correlate this data to identify application issues. This manual process is error prone and makes resolving application issues time-consuming. Most of the time, users report that a specific operation is failing, that the system appears to be slow, or that an operation is consistently failing for a specific user. Because the system has multiple components, investigating the log files of each component becomes a challenge.
TrueSight IT Data Analytics enables IT organizations to collect and index large volumes of data, in real time, in a single location. TrueSight IT Data Analytics partly automates the data-collection process with the help of features such as the collection profile and content pack. These features speed up the configuration required for collecting data into the product and simplify the troubleshooting process. The product enables you to perform simple ad hoc searches and to use search commands to perform advanced searches. In this way, the product helps you find answers to your questions about IT data and find the root cause of application failures.
You can view all component logs across the application stack for a specific time period to find errors in other components.
The Dashboards tab shows you application metrics trends over time. For example, you can monitor the response time, transaction duration, or server inactivity over a period of time. Using the Dashboards tab in this way can help you narrow down the area to monitor during performance problem troubleshooting.
When you integrate TrueSight IT Data Analytics with TrueSight Infrastructure Management, you can send related metrics from TrueSight IT Data Analytics to TrueSight Infrastructure Management to correlate cause with effect before you apply the fix.
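The cross-component view described above (all component logs for a specific time period) can be illustrated with a minimal Python sketch. It does not use the product or any of its APIs. The in-memory `RECORDS` list, the component names, and the helper functions `records_in_window` and `components_with_errors` are all hypothetical and exist only to show the idea of lining up logs from several components around the time of a failure.

```python
from datetime import datetime, timedelta

# Hypothetical records: (timestamp, component, message).
# In practice this data would come from the indexed logs, not from a list.
RECORDS = [
    (datetime(2024, 1, 1, 10, 0, 5), "web", "GET /checkout 500"),
    (datetime(2024, 1, 1, 10, 0, 3), "db", "ERROR connection pool exhausted"),
    (datetime(2024, 1, 1, 9, 30, 0), "app", "INFO cache warmed"),
    (datetime(2024, 1, 1, 10, 0, 4), "app", "ERROR timeout calling db"),
]


def records_in_window(records, center, radius):
    """Return the records within +/- radius of center, oldest first."""
    start, end = center - radius, center + radius
    selected = [r for r in records if start <= r[0] <= end]
    return sorted(selected, key=lambda r: r[0])


def components_with_errors(records):
    """Return components that logged an ERROR, in order of appearance."""
    seen = []
    for _, component, message in records:
        if "ERROR" in message and component not in seen:
            seen.append(component)
    return seen


# Look 30 seconds either side of the time at which the failure was reported.
failure_time = datetime(2024, 1, 1, 10, 0, 5)
window = records_in_window(RECORDS, failure_time, timedelta(seconds=30))
for timestamp, component, message in window:
    print(timestamp.isoformat(), component, message)
print("Components with errors:", components_with_errors(window))
```

Sorting the records by time puts the earliest error first, which is often a reasonable starting point when looking for the component in which a failure originated.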
The product simplifies troubleshooting by reducing the number of steps required. You can narrow down problems by performing effective searches on the indexed data. You can use saved searches to schedule notifications that detect problems ahead of time. Thus, the product helps you reduce the mean time to resolution (MTTR) when solving critical problems.
Proactive monitoring of applications and systems
This product enables you to proactively monitor the applications and systems in your environment.
A variety of performance and availability information, such as startup, abnormal termination, and performance data, is written to log files. Keeping track of events such as processes starting and stopping can be difficult because of the large number of machines involved. It is also important to find out whether a process has stopped responding or has gone down. In addition, it is difficult to monitor how different application metrics behave over a period of time.
TrueSight IT Data Analytics collects and organizes IT data such as performance metrics, logs, events, and abnormalities from a variety of data sources such as applications, databases, and servers. In this way, TrueSight IT Data Analytics enables IT operations to monitor any application failures by executing predefined saved searches and sending notifications. The product detects such failures based on notification conditions that you define. The product has a mechanism to detect whether a server has gone down or has stopped responding within a certain period of time.
For example, say you have data collected and indexed for a particular server, "abchost.com". You create a saved search with the search query (COLLECTOR_NAME=abchost.com). Based on the saved search, you add a notification with an equal to (=) condition that checks whether the result count is zero. Now, suppose you execute this search query every 30 minutes. A result count of zero indicates that the server is not processing log data and that the application might not have started successfully. In this way, notifications about dependent system failures can be sent, based on the notification conditions specified, before the end user reports abnormal behavior.
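The logic of this example can be sketched in Python. This is only an illustration of the zero-result check, not the product's saved search or notification mechanism. The event dictionaries, the `timestamp` field, and the functions `count_matches` and `should_notify` are hypothetical; only the `COLLECTOR_NAME` field and the 30-minute interval come from the example above.

```python
from datetime import datetime, timedelta


def count_matches(events, collector_name, since, until):
    """Count events from one collector with a timestamp in [since, until)."""
    return sum(
        1
        for e in events
        if e["COLLECTOR_NAME"] == collector_name
        and since <= e["timestamp"] < until
    )


def should_notify(result_count, threshold=0):
    """Equal to (=) condition: fire when the count equals the threshold."""
    return result_count == threshold


# Hypothetical indexed events; the last abchost.com event is 50 minutes old.
now = datetime(2024, 1, 1, 12, 0)
interval = timedelta(minutes=30)
events = [
    {"COLLECTOR_NAME": "abchost.com", "timestamp": datetime(2024, 1, 1, 11, 10)},
    {"COLLECTOR_NAME": "otherhost.com", "timestamp": datetime(2024, 1, 1, 11, 45)},
]

# One scheduled run: search the last 30 minutes, then test the condition.
count = count_matches(events, "abchost.com", now - interval, now)
notify = should_notify(count)
if notify:
    print("No data from abchost.com in the last 30 minutes")
```

Restricting each run to the most recent interval matters here: without a time range, older events from the same server would keep the count above zero and the condition would never fire.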
Viewing the data trend over a period of time helps you simplify your troubleshooting. For example, a steady increase in application latency can suggest that the load on the application has increased.
The product enables you to view a trend analysis chart in two ways:
- By adding dashboards based on search queries that use advanced search commands such as the timechart search command
- By viewing the summarization charts that are available on the Search tab
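As a rough illustration of what a trend chart aggregates, the following Python sketch counts events in fixed-width time buckets. It is not the timechart search command and does not reflect its syntax. The function `bucket_counts`, the sample timestamps, and the 10-minute span are assumptions made for this example.

```python
from collections import Counter
from datetime import datetime, timedelta


def bucket_counts(timestamps, start, span):
    """Count timestamps per fixed-width bucket, keyed by bucket start time."""
    counts = Counter()
    for ts in timestamps:
        index = (ts - start) // span  # timedelta // timedelta yields an int
        counts[start + index * span] += 1
    return dict(sorted(counts.items()))


# Hypothetical error timestamps taken from indexed log data.
start = datetime(2024, 1, 1, 10, 0)
error_times = [
    datetime(2024, 1, 1, 10, 1),
    datetime(2024, 1, 1, 10, 5),
    datetime(2024, 1, 1, 10, 12),
    datetime(2024, 1, 1, 10, 25),
    datetime(2024, 1, 1, 10, 27),
    datetime(2024, 1, 1, 10, 29),
]

trend = bucket_counts(error_times, start, timedelta(minutes=10))
for bucket_start, count in trend.items():
    # Print a simple text bar per bucket to make the trend visible.
    print(bucket_start.strftime("%H:%M"), "#" * count)
```

Plotting the per-bucket counts over time gives the kind of trend view that a dashboard or summarization chart presents.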
These product features help you reduce application downtime, respond proactively to failures, and easily understand how application parameters trend over time.