This topic lists common issues, probable causes, and suggestions for troubleshooting the BMC TrueSight IT Data Analytics (IT Data Analytics) product.

Category | Scenario | Probable causes with solutions (if any)
Accessing the product

Unable to access the product from the Start menu

You might not be able to start the product if:

  • During installation, you chose not to start the product services immediately after installation.
  • The ports that you specified during installation are already in use.
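
To check the second cause, you can test whether something is already listening on a port. The following is a minimal sketch, not a product utility: the function name port_in_use is hypothetical, and it only probes TCP on the host you pass in.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

For example, port_in_use(9797) reports whether a local process is already listening on port 9797, the port used in the console URL examples in this topic. Run the check while the product services are stopped; otherwise the product itself may be the listener.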

Solutions:

Saved searches

Unable to edit a saved search (or data pattern)

You cannot edit the following items:

  • Artifacts that are imported by using a content pack
  • Shared saved searches that are created by another user

Solution: If you want to edit an object (saved searches or data patterns) that was initially imported by using a content pack, you can clone the object and then modify it per your requirements.

System response time

The Collection Station went down, and after the IT Data Analytics server was restarted, the system became very slow

When you restart the IT Data Analytics server, all of the data collectors try to catch up and send the old pending data into the Collection Station for indexing.

Based on the number of data collectors, this process can take some time (a few minutes to a few hours) to complete.

System response time

The Indexer remained down for two days over the weekend, and after the IT Data Analytics server was restarted, the system became very slow

This issue can occur if the Collection Station cached a lot of data over the weekend. When you restart the IT Data Analytics server, the Collection Station pushes all cached data together into the Indexer.

Solution: Choose one of the following options:

  1. Stop the Collection Station and clean up the %BMC_ITDA_HOME%\station\collection\data folder.
    Note, however, that cleaning up the folder can cause the cached data to be lost.
  2. Leave the system in the current state until all of the data is sent to the Indexer.
Product component shows red

Some of the Collection Agents are showing up as red on the Administration > Hosts page and do not change to green, even though all of the servers are running.

This issue might occur if the Collection Station is down when the Collection Agents start.

Solution: Restart the Collection Agents after the Collection Station is up and running.

Product component shows red

The Configuration Database and Indexer are showing as green on the Administration > Components page, but some of the other components are down (on the Linux operating system).

The components might not have been started in the recommended order.

Solution: Restart the services in the correct order. For more information, see Starting or stopping product services.

Search

Unable to search for indexed data.

This can happen in two scenarios:

  • Search component is unable to connect to the Indexer: In this scenario, you might not be able to search data.
    You might get the following errors:
    • Error on the Search tab:

      Could not connect to the Indexer. Go to Administration > Components to see if the Indexer is up and running or contact your Administrator for support.

    • Error in the itda.log file located at %BMC_ITDA_HOME%\logs:

      org.elasticsearch.transport.ConnectTransportException: [Blackout][inet[ipaddress]] connect_timeout[30s]

  • Collection Station is unable to connect to the Indexer: In this scenario, you might not be able to collect data.
    You can see the following error in the collection.log file located at %BMC_ITDA_HOME%\station\collection\logs:

    org.elasticsearch.transport.ConnectTransportException: [Blackout][inet[ipaddress]] connect_timeout[30s]

Solution:

  • If the Search component is unable to connect to the Indexer: Perform the following steps:
    1. Add the following properties in the searchserviceCustomConfig.properties file located at %BMC_ITDA_HOME%\custom\conf\server:
      • indexing.network.bind_host: Specifies the host name or IP address of the Search component that is accessible to the Indexers.
      • indexing.network.publish_host: Specifies the fully qualified host name of the computer where the Search component is installed.
    2. Restart the Search component. For more information, see Starting or stopping product services.
  • If the Collection Station is unable to connect to the Indexer: Perform the following steps:
    1. Add the following properties in the agent.properties file located at %BMC_ITDA_HOME%\station\collection\custom\conf:
      • indexing.network.bind_host: Specifies the host name or IP address of the Collection Station that is accessible to the Indexers.
      • indexing.network.publish_host: Specifies the fully qualified host name of the computer where the Collection Station is installed.
    2. Restart the Collection Station. For more information, see Starting or stopping product services.
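
Adding the properties by hand is the documented route. If you need to repeat the change on several hosts, the edit can be sketched in a few lines. The sketch below assumes that the custom file uses simple key=value lines. The function name upsert_properties and the host values in the usage note are hypothetical. Back up the file before you change it.

```python
from pathlib import Path


def upsert_properties(path: Path, updates: dict[str, str]) -> None:
    """Set key=value pairs in a properties-style file, keeping all other lines."""
    lines = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
    seen = set()
    result = []
    for line in lines:
        stripped = line.strip()
        key = stripped.split("=", 1)[0].strip()
        if "=" in stripped and not stripped.startswith("#") and key in updates:
            # Replace the value of an existing, uncommented entry.
            result.append(f"{key}={updates[key]}")
            seen.add(key)
        else:
            result.append(line)
    # Append the keys that were not already present in the file.
    result.extend(f"{k}={v}" for k, v in updates.items() if k not in seen)
    path.write_text("\n".join(result) + "\n", encoding="utf-8")
```

A call might look like upsert_properties(Path("agent.properties"), {"indexing.network.bind_host": "10.0.0.5", "indexing.network.publish_host": "station01.example.com"}), where both values are placeholders for your own environment. Restart the affected component afterward, as described in the steps above.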
Data collection

The Collection Station on a Windows computer does not start or work properly after it stopped abruptly, and you see the following exceptions in the collection.log file:

  • BadCheckpointException
  • IllegalStateException

This issue is rare and might occur when the Collection Station stops abruptly and the %BMC_ITDA_HOME%\station\collection\data\c*\flume-checkpoint(1)\checkpoint file becomes corrupted.

You can find the exact name of the corrupted checkpoint file in the collection.log file located at %BMC_ITDA_HOME%\station\collection\logs. In this file, search for the line containing the IllegalStateException error.

Error example:

java.lang.IllegalStateException: Destination file: C:\Tasks\HA\MultipleServers\station1\data\c1\flume-checkpoint1\checkpoint unexpectedly exists
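
If the collection.log file is large, you can extract the checkpoint path with a short script. This is a rough sketch that assumes the message appears on a single log line in the form shown in the error example. The function name find_corrupted_checkpoints and the regular expression are illustrative and not part of the product.

```python
import re

# Assumed single-line form of the message shown in the error example.
CHECKPOINT_ERROR = re.compile(
    r"IllegalStateException:\s*Destination file:\s*(?P<path>.+?)\s+unexpectedly exists"
)


def find_corrupted_checkpoints(log_text: str) -> list[str]:
    """Return the distinct checkpoint paths named in matching log lines."""
    paths = []
    for match in CHECKPOINT_ERROR.finditer(log_text):
        path = match.group("path")
        if path not in paths:
            paths.append(path)
    return paths
```

Pass the function the text of the collection.log file. An empty list means that no line matched the assumed format, not necessarily that the checkpoint files are intact.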
 

Workarounds:

  • If data loss is acceptable: Delete the data directory located at %BMC_ITDA_HOME%\station\collection\ and restart the Collection Station.
  • If data loss is unacceptable:
    1. Stop the Collection Station. For more information, see Starting or stopping product services.
    2. Perform a backup of the %BMC_ITDA_HOME%\station\collection\data\c*\flume-checkpoint(2)\checkpoint file if the error occurs for the flume-checkpoint(1) file.
      (On the other hand, if the error occurs for the flume-checkpoint(2) file, then perform a backup of the %BMC_ITDA_HOME%\station\collection\data\c*\flume-checkpoint(1)\checkpoint file).
    3. Replace the corrupted checkpoint file with the backup file.
    4. Restart the Collection Station.
Search

Data entry time on the Search tab is ahead of the time at which the notification was generated

By default, there is a delay of 90 seconds between data collection and reporting of search results to the product. Therefore, during notification creation, when you select one of the search duration options and apply a condition related to the number of results, you can expect a delay of 90 seconds.

Solution: You can change the 90-second time lag by modifying the value of one of the following properties in the searchserviceCustomConfig.properties file. For more information, see Modifying the configuration files.

  • notification.search.exec.offset.sec: Change the value of this property if you set the search duration to Last execution to current execution.

  • notification.search.relative.exec.offset.sec: Change the value of this property if you set the search duration to any option other than Last execution to current execution (for example, Last 60 minutes or Last 6 hours).
Data collection

Data collector has been created, but the results cannot be seen

After the data collector is created, it might take some time (approximately 1 minute) for the first poll to happen. The first poll is used to make the data collector ready for data collection. The data is fetched only from the second poll.

Expected time delay (to see the first set of data for a search) = (Time for first poll) + (Poll interval set for the data collector)
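
As a worked example of this formula, the sketch below adds the two terms. The function name is hypothetical, and the 300-second poll interval is an illustrative value, not a product default. The 60-second first poll reflects the approximate figure given above.

```python
def expected_first_data_delay(first_poll_seconds: float, poll_interval_seconds: float) -> float:
    """Expected delay before the first data appears in a search, in seconds."""
    # (Time for first poll) + (Poll interval set for the data collector)
    return first_poll_seconds + poll_interval_seconds


# About 1 minute for the first poll plus an assumed 5-minute poll interval.
delay = expected_first_data_delay(60, 300)
```

With these inputs, the first search results can be expected roughly 360 seconds (6 minutes) after the data collector is created.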

Search

Data is being generated in the files for monitoring, but no data can be seen when performing a search

This issue might occur if the time zone specified during data-collector creation is set incorrectly.

Solution: Ensure that the time zone is set correctly when you create data collectors. 

Product component shows red

Status of the Collection Station appears red on the Administration > Components tab.

This issue might occur in two scenarios:

  1. The computer on which the Collection Station is installed has multiple IP addresses, and during installation you provided an IP address (bind address) that the Console Server cannot connect to.
    Solution: To resolve this issue, you must add the following properties in the Collection Station's agent.properties file (custom file):
    • httpBindAddress=0.0.0.0
    • payload.bindaddress=0.0.0.0
    For more information, see Modifying the configuration files.
  2. The host name specified while registering the self-signed certificate does not match the host name of the computer where the Collection Station is installed. To find the correct host name, open the %BMC_ITDA_HOME%\logs\itda.log file and search for the following line:

    com.sun.jersey.api.client.ClientHandlerException: javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException: No name matching <Host-Name> found

    where <Host-Name> refers to the host name of the Collection Station.
Product component shows red

Status of the Search component appears red on the Administration > Components tab.

The host name specified while registering the self-signed certificate does not match the host name of the computer where the Search component is installed. To find the correct host name, open the %BMC_ITDA_HOME%\logs\itda.log file and search for the following line:

com.sun.jersey.api.client.ClientHandlerException: javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException: No name matching <Host-Name> found

where <Host-Name> refers to the host name of the Search component.
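
To pull the host name out of the itda.log file for either component, a regular expression is usually enough. The sketch below assumes the usual JDK wording of this message, "No name matching <host> found", on a single log line. The function name hosts_with_name_mismatch is hypothetical.

```python
import re

# Assumed JDK message form: "CertificateException: No name matching <host> found"
NAME_MISMATCH = re.compile(
    r"CertificateException:\s*No name matching (?P<host>\S+) found"
)


def hosts_with_name_mismatch(log_text: str) -> list[str]:
    """Return the distinct host names reported in certificate name-mismatch errors."""
    matches = (m.group("host") for m in NAME_MISMATCH.finditer(log_text))
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return list(dict.fromkeys(matches))
```

Each returned name is a host that the client tried to reach but could not match against the certificate. Compare it with the host name that was used when the self-signed certificate was registered.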
Other errors

You see the following error on the product user interface:

Error fetching data from backend

This can happen if the Configuration Database service is down.

Solution: Perform the following steps:

  1. Ensure that the Configuration Database service is up and running.
  2. If you continue to get this error, open the %BMC_ITDA_HOME%\logs\services\configdb.log file and check for occurrences of OutOfMemoryError. If this error is present in the log, increase the Java heap size (wrapper.java.initmemory property value) to 1024 MB. For more information, see Component configuration recommendations.
Data collection

Some data got lost when the Collection Station (or Collection Agent) went down briefly

By default, the number of days for which data must be collected and indexed (Read from Past (#days) function) is set to zero. As a result, in the process of data collection, if the Collection Station (or Collection Agent) goes down, you can experience data loss. When the Collection Station (or Collection Agent) is up again, it starts collecting data from that point onwards and the time for which it remained down is ignored.

You can change the Read from Past (#days) default value for the Monitor Windows events and Monitor using external configuration data collectors.
Data collection

The polling status for the following data collectors shows red (unsuccessful polling).

You see the Invalid value error when you select one of the preceding data collectors and click Last 10 Polls Status of Data Collector.

Example: collection-station_Host1.bmc.com:Invalid value: -473379094

This issue might occur in the following scenarios:

This issue occurs because the number of concurrent SSH connections allowed to the target host is less than the number of data collectors that you want to create. The number of concurrent SSH connections determines the number of data collectors that you can create for collecting data from the same target host.

Solution: Open the /etc/ssh/sshd_config file and increase the value of the MaxSessions parameter. This solution applies only to OpenSSH version 5.1 and later.
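
Before you change the file, you can check the current limit against the number of data collectors that you plan to create. The sketch below is illustrative only. It reads the first uncommented MaxSessions keyword, ignores Match blocks, and falls back to the OpenSSH default of 10 when the keyword is absent. The function names are hypothetical.

```python
OPENSSH_DEFAULT_MAX_SESSIONS = 10  # OpenSSH default when the keyword is not set


def max_sessions(sshd_config_text: str) -> int:
    """Return the first MaxSessions value in sshd_config text, or the default."""
    for raw in sshd_config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        parts = line.split()
        # sshd_config keywords are case-insensitive.
        if len(parts) >= 2 and parts[0].lower() == "maxsessions":
            return int(parts[1])
    return OPENSSH_DEFAULT_MAX_SESSIONS


def allows_collectors(sshd_config_text: str, collectors: int) -> bool:
    """Apply the rule above: the limit must not be less than the collector count."""
    return max_sessions(sshd_config_text) >= collectors
```

For example, with a file that contains "MaxSessions 10", allows_collectors(text, 15) returns False, which suggests raising the value before creating 15 data collectors for that target host. A changed sshd_config file normally takes effect only after the SSH daemon reloads its configuration.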

Data collection

At the time of creating a data collector, when you try to filter the relevant data patterns, you might see the following error:

Collection Station not reachable.

This issue might occur if the system response time of the server on which you are trying to create the data collector is slow.

Solution: Edit the olaengineCustomConfig.properties file located at %BMC_ITDA_HOME%\custom\conf\server\ and then add the station.response.timeout property with a value greater than 120.

Example: station.response.timeout=180

This property determines the duration of time (in seconds) for which the Console Server waits to receive a response from the Collection Station.

Saved searches

Unable to use a saved search shared by another user to create views or notifications.

You cannot use saved searches shared by other users for creating views or notifications.

Solution: Clone the saved search and then use the cloned instance to create views or notifications. For more information, see Managing saved searches.

Upgrade

Unable to see the updated UI available for the upgraded version of the product.

To find out if you are still accessing the older UI, navigate to Administration > Components and check if you can see the Version column that is available in the new version of the product.

This might occur if the browser where you are accessing the product continues to access the cached content.

Solution: Press Ctrl+F5 to reload the browser page while ignoring the cached content. If reloading the browser page does not work, then append ?version=1_1 to the browser URL, before the hash sign (#) as follows:

http://hostName:port/console/?version=1_1#LoginPage

Examples:

  • http://host1.bmc.com:9797/console/?version=1_1#LoginPage
  • http://host2.bmc.com:9797/console/?version=1_1#Search
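
The position of the parameter matters: it must come before the hash sign (#), because browsers do not send the part after the hash sign to the server. The following sketch builds such a URL with the Python standard library. The function name with_version_param is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_version_param(url: str, version: str = "1_1") -> str:
    """Return url with version=<version> in the query string, before the fragment."""
    parts = urlsplit(url)
    # Keep existing query parameters, but drop any earlier version value.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "version"]
    query.append(("version", version))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Calling with_version_param("http://host1.bmc.com:9797/console/#LoginPage") produces the first example URL shown above.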
Data collection

You are experiencing some data loss and you see the following error in the collection.log file located at %BMC_ITDA_HOME%\station\collection\logs:

ElasticsearchTimeoutException: Timeout waiting for task

This error might occur even when all of the Indexers in the cluster are up and functioning normally, for example, because of a poor network connection or because the system on which the Indexers (or the Collection Station) reside has become slow.

Workaround: Increase the value of the indexing.request.timeoutmillis property. For example, if the property has its default value of 5000, you can double it to 10000. For more information, see Component configuration recommendations.

Data collection

Some data collected by the Receive over TCP/UDP data collector is not getting indexed, and occasionally you find the following message in the collection.log file:

Buffer is full, write cannot proceed

This might occur when the rate at which the sender sends data via the TCP port is greater than the rate at which the Receive over TCP/UDP data collector indexes data. This indicates that the data collector is dropping records and needs to be tuned.

Solution: To allow for indexing higher volumes of data per day on a single data collector, you must add the following properties with appropriate values in the agent.properties file. For more information, see Modifying the configuration files.

  • collection.reader.batch.size: The total batch size (number of messages) that will be indexed by a single data collector.

  • collection.reader.portreader.eventbuffer.maxsize: The maximum number of messages that will be waiting to be indexed by a single data collector.

Note that the following property values were used for indexing up to 100 GB of data in the lab environment.

  • collection.reader.batch.size=8000
  • collection.reader.portreader.eventbuffer.maxsize=204800