Troubleshooting operating system and network issues
This topic lists the operating system and network issues in TrueSight Infrastructure Management.
- The upgrade fails because the registry contains the previous installation data
- The upgrade fails because the TrueSight directory is locked
- The upgrade fails due to a missing default gateway
- The Integration Service (IS) appears disconnected in the TrueSight console
- Event propagation to TrueSight Presentation Server stops
- The upgrade in high availability stops because of port issues
- The upgrade from version 11.3.03 to 11.3.04 in high availability deployment stops abruptly
- The installation on the secondary cluster node displays a validation error
- The TrueSight Infrastructure Management server fails to start on Linux
- The Cell does not start because the publishing server takes over the 1828 port
- The installation fails due to a CSH shell issue
- The JBoss Wildfly services process fails to start
- The pserver process appears as not running on Windows
- Not receiving email alerts when Oracle database is down
Related topics
TrueSight Infrastructure Management Server system requirements
Administrator console system requirements
Remote cell system requirements
Network port schematics for Infrastructure Management
Network ports for Infrastructure Management
Network ports for a high-availability deployment of Infrastructure Management
Troubleshooting an Infrastructure Management deployment
To check open ports on the Windows server
- To display all open ports, at the command prompt, run the following command:
netstat
- To obtain a list of all listening ports, run the following command:
netstat -an | find /i "listening"
- To check whether a specific port is open, filter the output by using the find command.
For example, to check whether port 1900 is open, run the following command:
netstat -an | find /i "1900"
To check the open ports on the Linux server
Perform the following tasks:
- Open a Linux terminal application.
- Run the following commands:
- To display all open TCP and UDP ports, use the ss command.
- To list all ports, use the netstat command.
- To list open files and ports, use the lsof command.
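As a cross-platform complement to the commands above, the following minimal sketch tests whether a TCP port accepts connections. The function name is_port_open and the default timeout are illustrative assumptions, not part of the product.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, is_port_open("localhost", 1900) reports whether something is listening on port 1900 on the local server. A False result does not distinguish between a closed port and a port blocked by a firewall.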
The upgrade fails because the registry contains the previous installation data
Issue
The TrueSight Infrastructure Management upgrade fails because the registry contains the previous installation data.
Cause
This issue can occur if the previous installation was removed or copied.
Resolution
Clean up the keys from the location shown in the following image:
The upgrade fails because the TrueSight directory is locked
Issue
The TrueSight Infrastructure Management upgrade fails because the <installDirectory>\pw directory is locked by a process. The directory remains locked even after you disable the TrueSight Infrastructure Management services and restart the server.
The following error message appears in the tsim_server_install log file:
Cause
This issue might be caused by one of the following reasons:
- An ITDA collector is still running even if all the TrueSight Infrastructure Management processes are stopped. The ITDA Collector is configured to capture the publishing server errors.
- Some other software is accessing the installation directory.
- A wrapper.exe process or a java process can cause the lock.
Resolution
- First, try to rename the TrueSight folder. Provide the admin permission, and then rename the TrueSight folder back to its original name.
Contact your IT admin for help in identifying the file in the TrueSight directory that is locked and the process or tool that is locking the file.
The IT admin can use the Task Manager or Microsoft Process Explorer to identify the process. You can also use the free LockHunter utility.
- If you are using IT Data Analytics, ensure that you stop the ITDA collectors before starting the upgrade.
- If a wrapper.exe or a java process has caused the lock, stop the processes.
- Disable the antivirus and DLP settings on the TrueSight Infrastructure Management server.
- Verify whether any storage software or backup software is accessing the TrueSight installation directory. If so, disable the process or restrict the software from accessing the installation directory during the upgrade.
The upgrade fails due to a missing default gateway
Issue
In the TSIMPreInstall.log file, the preinstallation status of TrueSight Infrastructure Management displays the following message:
Configuration: Default Gateway
Actual Value:
Recommended Configuration:
Pre Install Status: Not Ok
Cause
This issue occurs because the server has two network interface cards (NICs).
Resolution
- Perform the following actions:
- Run the ipconfig /all command to see whether the default gateway is displayed in the output.
- If the default gateway is not displayed, contact the system administrator to fix this issue.
- If a second NIC card is present, disable it.
- Ensure that the DNS is getting resolved on the TrueSight Infrastructure Management server.
If the above resolutions do not work, contact the system administrator and refresh the network configuration on the server.
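To find every failed check in TSIMPreInstall.log, not only the default gateway, the log can be scanned with a short script. The following sketch assumes that each check is logged as a "Configuration:" line followed later by a "Pre Install Status:" line, as in the excerpt above. The function name and the "Ok" status in the usage example are hypothetical.

```python
from typing import List


def failed_preinstall_checks(log_text: str) -> List[str]:
    """Return the configuration names whose Pre Install Status is 'Not Ok'."""
    failed = []
    current = None  # Name from the most recent "Configuration:" line.
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith("Configuration:"):
            current = line.split(":", 1)[1].strip()
        elif line.startswith("Pre Install Status:"):
            status = line.split(":", 1)[1].strip()
            if current is not None and status.lower() == "not ok":
                failed.append(current)
            current = None
    return failed
```

For the excerpt shown in the Issue section, the function returns a list that contains only Default Gateway.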
The Integration Service (IS) appears disconnected in the TrueSight console
Issue
After adding Integration Service from the TrueSight console, the Integration Service appears as disconnected in the TrueSight console.
The following error is displayed in the pronet_cntl.log file:
INFO pronet_cntl [Thread-86] 600002 Re-establishing connection to agent with ID 10005
WARN pronet_cntl [AgentConnector-10005] 101762 AgentConnector: Having problem connecting to Agent (AgentId:10005 AgentName:<hostname> AgentIP:<IP_Address>)
java.net.SocketException: Socket closed
at java.net.SocketInputStream.read(SocketInputStream.java:204)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at com.proactivenet.ipc.TCPMessageReader.readMessage(TCPMessageReader.java:72)
at com.proactivenet.agent.controller.AgentConnector.connectToAgent(AgentConnector.java:128)
at com.proactivenet.agent.controller.AgentConnector.run(AgentConnector.java:517)
INFO: pronet_cntl [AgentConnector-10005] 101574 Close the connection to agent 10005
WARN: pronet_cntl [AgentConnector-10005] 101764 Terminating this AgentConnector, another AgentConnector is trying to connect.
The following error is displayed in the TrueSightAgent_<hostname>-151_12124_3183.log file in the <IS-Install_Path>\pw\pronto\logs directory:
ERROR 11/02 09:03:13 Library [AC_Listener] 102660 Unable to write TCP message.
ERROR 11/02 09:03:13 Agent [AC_Listener] 102099 Exception:
javax.net.ssl.SSLHandshakeException: Remote host closed connection during handshake
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1002)
at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:757)
at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:123)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at java.io.DataOutputStream.flush(DataOutputStream.java:123)
at com.proactivenet.ipc.TCPMessageWriter.writeMessage(TCPMessageWriter.java:76)
at com.proactivenet.agent.Listener.run(SessionHandler.java:2469)
Caused by: java.io.EOFException: SSL peer shut down incorrectly
at sun.security.ssl.InputRecord.read(InputRecord.java:505)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
Cause
This issue occurs because TLS is enabled on only one of the two components: either the TrueSight Infrastructure Management server or the Integration Service, but not both.
Resolution
- Ensure that you have performed and verified the steps in the following topic:
Configuring TrueSight Infrastructure Management to enable TLS 1.2
- In the custom/conf/pronet.conf file, set the connection type parameter to ssltcp as shown below:
pronet.apps.agent.conntype=ssltcp
- Restart the TrueSight Infrastructure Management server.
- While adding the Integration Service in the TrueSight console, ensure that you select the Direct access using SSL TCP/IP connection option as shown in the following image:
Event propagation to TrueSight Presentation Server stops
Issue
Event propagation from TrueSight Infrastructure Management to TrueSight Presentation Server stops a few hours after the TrueSight Infrastructure Management server starts.
Cause
This issue occurs because the port 1900 is not enabled.
Resolution
- Ensure that a telnet connection to port 1900 works from TrueSight Infrastructure Management to TrueSight Presentation Server.
- Run the netstat command on the TrueSight Presentation Server to ensure that there are connections on port 1900.
If there are no connections, investigate whether a firewall rule blocks the port after a few hours.
- If there is a firewall issue, work with the network administrator to correct it. You can also add a new rule to the Windows firewall to keep the connection to port 1900 always enabled.
After completing these steps, verify if the events start flowing from TrueSight Infrastructure Management to the TrueSight Presentation Server.
The upgrade in high availability stops because of port issues
Issue
If you disable HTTP port 7001 and enable HTTP port 7002, the TrueSight Infrastructure Management upgrade process in a high availability deployment stops and the following error is displayed:
Error parsing HTTP port
The following message is displayed in the BPPM_SERVER_HOME/pw/pronto/logs/ServerComponentAvailability.log file:
Resolution
In the affected environment, the following ports were set in the pw/pronto/conf/ha.conf file:
pronet.master.server.port.https=7002
pronet.master.server.port.http=7001
The HTTP port was commented out in the httpd.conf file:
#Listen 7001
The httpd-ssl.conf file had port 7002 in listen mode:
Listen 7002
Workaround:
- Stop the primary TrueSight Infrastructure Management server.
- Add the following properties in the pronet.conf file ($BPPM_SERVER_HOME/pw/custom/conf/pronet.conf):
webserver.apache.port.http=7002
webserver.apache.port.https=7002
- Start the primary TrueSight Infrastructure Management server and verify by using the following command:
pw lic list
- Perform the cell sync. Refer to the following documentation:
- Stop the secondary TrueSight Infrastructure Management server.
- Add the following properties in the pronet.conf file ($BPPM_SERVER_HOME/pw/custom/conf/pronet.conf):
webserver.apache.port.http=7002
webserver.apache.port.https=7002
- Start the secondary TrueSight Infrastructure Management server and verify by using the following command:
pw lic list
- Confirm the high availability status by using the following command:
pw ha status
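Before restarting each server, it can help to confirm that both port properties are present in pronet.conf. The following sketch assumes that the file uses simple key=value lines with # for comments. The function names are hypothetical, and the sketch does not validate the port values against ha.conf or the Apache configuration.

```python
from typing import Dict, List

REQUIRED_KEYS = ("webserver.apache.port.http", "webserver.apache.port.https")


def parse_properties(text: str) -> Dict[str, str]:
    """Parse key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def missing_port_overrides(text: str) -> List[str]:
    """Return the required port keys that are absent or not numeric."""
    props = parse_properties(text)
    return [key for key in REQUIRED_KEYS if not props.get(key, "").isdigit()]
```

An empty list means that both properties are set to numeric values. Any key in the returned list still needs to be added or corrected.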
The upgrade from version 11.3.03 to 11.3.04 in high availability deployment stops abruptly
Issue
This issue occurs when the secondary TrueSight Infrastructure Management server is being upgraded in high availability deployments.
The following error is displayed:
The following message is displayed in the tsim_server_install_log.txt file:
INFO,com.bmc.install.product.bppm.server.postinstall.ServerPostInstallConfigUtilities,
LOG EVENT {Description=[ <SERVER_INSTALL> /data/bmc/TrueSight/pw/tmp/Active/activemq-rar.rar does not exist. Creating the directory - /data/bmc/TrueSight/pw/tmp/Active]}
SEVERE,com.bmc.install.product.base.project.runner.ProjectRunner,
LOG EVENT {Description=[InstallationTask [com.bmc.install.product.bppm.BPPMServerServerPostInstallInstallationTask] uncaught exception thrown, cancelling install.],Detail=[java.lang.UnsatisfiedLinkError: no nio in java.library.path]}
Error occurred while executing InstallationTask: com.bmc.install.product.bppm.BPPMServerServerPostInstallInstallationTask: Error occurred while executing com.bmc.install.product.bppm.BPPMServerServerPostInstallInstallationTask]},
Throwable=[java.lang.RuntimeException: Error occurred while executing com.bmc.install.product.bppm.BPPMServerServerPostInstallInstallationTask
com.bmc.install.product.base.project.runner.ProjectRunner.runTask(ProjectRunner.java:3725)
com.bmc.install.product.base.project.runner.ProjectRunner.handleNextProjectTask(ProjectRunner.java:3254)
com.bmc.install.product.base.project.runner.ProjectRunner.nextSelected(ProjectRunner.java:1576)
com.bmc.install.product.base.project.runner.ProjectRunnerPanelDelegate$3.run(ProjectRunnerPanelDelegate.java:264)],
Throwable=[java.lang.UnsatisfiedLinkError: no nio in java.library.path
java.lang.ClassLoader.loadLibrary(ClassLoader.java:1860)
java.lang.Runtime.loadLibrary0(Runtime.java:870)
java.lang.System.loadLibrary(System.java:1124)
sun.nio.fs.UnixCopyFile$2.run(UnixCopyFile.java:612)
sun.nio.fs.UnixCopyFile$2.run(UnixCopyFile.java:609)
java.security.AccessController.doPrivileged(Native Method)
sun.nio.fs.UnixCopyFile.<clinit>(UnixCopyFile.java:609)
sun.nio.fs.UnixFileSystemProvider.copy(UnixFileSystemProvider.java:253)
java.nio.file.Files.copy(Files.java:1274)
com.bmc.install.product.bppm.server.postinstall.ServerPostInstallConfigUtilities.genericConfigurationTasks(ServerPostInstallConfigUtilities.java:1334)
com.bmc.install.product.bppm.BPPMServerServerPostInstallInstallationTask.execute(BPPMServerServerPostInstallInstallationTask.java:560)
com.bmc.install.task.InstallationTask.run(InstallationTask.java:93)
java.lang.Thread.run(Thread.java:748)]
During post installation, the updated activemq-rar.rar file is not copied from the TrueSight/pw/wildfly/standalone/deployments directory to the TrueSight/pw/tmp/Active directory.
Cause
Although the exact cause is unknown, the Java API for copying files fails in specific environments.
Resolution
- Revert to the server snapshot or the file system backup.
- Back up the \TSIMServer\Disk1\setup.jar file. For example, you can rename it as setup_11.3.04.jar.
- Go to ftp://ftp.bmc.com/pub/000376800 and download the setup_version5.jar file.
- Copy the setup_version5.jar file to \TSIMServer\Disk1 by using the operating system copy command, cp.
Do not copy the file by using the Java API.
- Rename the setup_version5.jar file to setup.jar and run the following command:
chmod 755 setup.jar
Refer to the BMC documentation for the installation procedure.
The installation on the secondary cluster node displays a validation error
Issue
The installer displays the following messages:
Cause
This issue occurs because the description registry entry is missing from the HKEY_LOCAL_MACHINE\Cluster\Resources registry key.
Resolution
Add the description registry entry:
- Open the registry editor.
- Navigate to HKEY_LOCAL_MACHINE\Cluster\Resources.
- In the registry key where PnService appears as the name, add the description registry entry as a string with the value <BPPM INSTALL DIR>.
- Restart the installer.
The TrueSight Infrastructure Management server fails to start on Linux
Issue
When you use the command pw system start to start the TrueSight Infrastructure Management Server, the following error appears:
Failed to establish JMS connection
Check /tmp/startdwhouse.log for errors
Cause
The IPv6 protocol is not enabled on the server.
When you start TrueSight Infrastructure Management, the following error message appears:
Case 1
TESTSERVER :/opt/bmc/tsim/pw/pronto/logs $ pw sys start
stopping all BMC TrueSight processes...
--------------------------stopdbsrv begin-------------------------------
stopping server
SQL AnyWhere Server storm_sla18978 is not running on sla18978
--------------------------stopdbsrv end-------------------------------
All BMC TrueSight processes stopped successfully.
/usr/bin/nohup: ignoring input and appending output to 'nohup.out'
mcell
SQLAnyWhere Server
Services
Jboss bind address: 0.0.0.0
Waiting for DB to accept connections...ok.
Waiting for JMS service to initialize...
Failed to establish JMS connection
Check /tmp/startdwhouse.log for Errors
The startdwhouse.log file displays the following messages:
Case 2
TESTSERVER:/opt/bmc/tsim/pw/pronto/logs $ cat /tmp/startdwhouse.log
Trying to connect to port 8093 for up to 300 sec, checking every 1 sec.
InetAddress: /0.0.0.0
InetAddress: /0.0.0.0
InetAddress: /0.0.0.
Resolution
Solution 1
Enable the IPv6 protocol on the server.
Solution 2
- Switch TrueSight Infrastructure Management to use the IPv4 protocol by making the following change:
Edit the pw/wildfly/bin/standalone.conf file and change the following entry:
From
JAVA_OPTS="-Dservices=DUMMY -Xms256m -Xmx3072m -XX:MetaspaceSize=96M -XX:MaxMetaspaceSize=256m -Djava.net.preferIPv4Stack=false -Djava.net.preferIPv6Addresses=true"
To
JAVA_OPTS="-Dservices=DUMMY -Xms256m -Xmx3072m -XX:MetaspaceSize=96M -XX:MaxMetaspaceSize=256m -Djava.net.preferIPv4Stack=true"
- Restart TrueSight Infrastructure Management.
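The edit in Solution 2 can be expressed as a small text transformation. The following sketch assumes that the JAVA_OPTS entry is on a single line, and it removes the preferIPv6Addresses flag because that flag is absent from the "To" entry above. The function name prefer_ipv4 is hypothetical. Back up standalone.conf before applying any scripted change.

```python
import re


def prefer_ipv4(java_opts_line: str) -> str:
    """Return the JAVA_OPTS line rewritten to prefer the IPv4 stack."""
    # Remove the IPv6 preference flag together with the whitespace before it.
    line = re.sub(r"\s*-Djava\.net\.preferIPv6Addresses=\w+", "", java_opts_line)
    # Set preferIPv4Stack to true, whatever its previous value was.
    line = re.sub(
        r"-Djava\.net\.preferIPv4Stack=\w+",
        "-Djava.net.preferIPv4Stack=true",
        line,
    )
    return line
```

Applied to the "From" entry, the function produces the "To" entry. Lines that contain neither flag are returned unchanged.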
The Cell does not start because the publishing server takes over the 1828 port
Issue
In TrueSight Infrastructure Management, the publishing server takes over the 1828 port and starts to listen on it. Because of this, the Cell does not start.
Cause
This issue occurs because the entry for the gateway.imcomm parameter in the mcell.dir file for the secondary node is incorrect.
Resolution
- Stop TrueSight Infrastructure Management by using the following command:
pw system stop
- Enable pserver traces and start only pserver.
If it still starts to listen on port 1828, look for the following message in the pserver.trace file:
WARN [main] (PsGateway.java:185) - Cell publish gateway name was not set by install. Using the default CellPublishGatewayName: gw_ps_pncell_test12345
- Change the reference of the gateway name of the primary node in the mcell.dir file:
From
gateway.imcomm gw_ps_pncell_test12344 mc test12345:1839
To
gateway.imcomm gw_ps_pncell_test12345 mc test12345:1839
- Restart the Cell.
- Restart TrueSight Infrastructure Management by using the following command:
pw system start
pserver starts to listen on port 1839 instead of port 1828.
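To spot a mismatched gateway name before restarting, the gateway.imcomm entries in mcell.dir can be compared with the CellPublishGatewayName reported in pserver.trace. The following sketch assumes whitespace-separated fields with the gateway name in the second field, as in the entries above. The function name is hypothetical, and entries for other, unrelated gateways would also be reported.

```python
from typing import List


def mismatched_gateways(mcell_dir_text: str, expected_name: str) -> List[str]:
    """Return gateway.imcomm names in mcell.dir that differ from expected_name."""
    mismatches = []
    for raw in mcell_dir_text.splitlines():
        fields = raw.split()
        # A gateway entry looks like: gateway.imcomm <name> <key> <host>:<port>
        if len(fields) >= 2 and fields[0] == "gateway.imcomm":
            if fields[1] != expected_name:
                mismatches.append(fields[1])
    return mismatches
```

With the "From" entry and the expected name gw_ps_pncell_test12345, the function returns a list that contains gw_ps_pncell_test12344. With the corrected "To" entry, it returns an empty list.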
The installation fails due to a CSH shell issue
Issue
Even though the CSH shell is installed, the TrueSight Infrastructure Management installation fails and the following message is displayed:
CSH shell is not installed on the system.
********************* Check starts: validateCShellInstallation **********************]}
INFO,com.bmc.install.product.bppm.server.preinstall.BPPMServerServerPreInstall,
LOG EVENT {Description=[ <SERVER_INSTALL> cshellValidationScript: /tmp/cShellValidationScript.sh],Detail=[ Script file found]}
INFO,com.bmc.install.product.bppm.util.BPPMCommandExecuter,
LOG EVENT {Description=[ <SERVER_INSTALL> Executing Command: /tmp/cShellValidationScript.sh],Detail=[]}
WARNING,com.bmc.install.product.bppm.util.BPPMCommandExecuter,
LOG EVENT {Description=[ <SERVER_INSTALL> Unable to execute command successfully. Command: [/tmp/cShellValidationScript.sh] with return code: -1, Output: Exception running command: Cannot run program "/tmp/cShellValidationScript.sh": error=13, Permission denied
, OutputLines: [Exception running command: Cannot run program "/tmp/cShellValidationScript.sh": error=13, Permission denied]],Detail=[]}
WARNING,com.bmc.install.task.InstallationStateHelper,
LOG EVENT {Description=[BMC TrueSight Infrastructure Management Server warned],Detail=[SERVER]}
INFO,com.bmc.install.product.bppm.server.preinstall.BPPMServerServerPreInstall,
LOG EVENT {Description=[ <SERVER_INSTALL> command output = Exception running command: Cannot run program "/tmp/cShellValidationScript.sh": error=13, Permission denied
]}
INFO,com.bmc.install.product.bppm.server.preinstall.BPPMServerServerPreInstall,
LOG EVENT {Description=[ <SERVER_INSTALL> Failed to execute c-shell installation validation command, return code: -1],Detail=[]}
Cause
This issue occurs when the /tmp file system is mounted with the noexec option.
Resolution
- Run the following command to check how /tmp is mounted:
findmnt -l | grep tmp
- Mount /tmp with the exec option:
mount -o remount,exec /tmp
- Start the installation again.
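The noexec check can also be scripted as a preinstallation test. The following sketch assumes mount table text in the /proc/mounts layout (device, mount point, file system type, options), and the function name is hypothetical. If /tmp is not a separate mount point, the function returns False and the options of the parent file system are not inspected.

```python
def is_mounted_noexec(mounts_text: str, mount_point: str = "/tmp") -> bool:
    """Return True if mount_point is listed with the noexec option."""
    result = False
    for raw in mounts_text.splitlines():
        fields = raw.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # If the mount point is listed more than once, the last entry wins.
            result = "noexec" in fields[3].split(",")
    return result
```

On a Linux server, the function can be called with the contents of /proc/mounts, for example is_mounted_noexec(open("/proc/mounts").read()).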
The JBoss Wildfly services process fails to start
Issue
While starting the TrueSight Infrastructure Management server, the JBoss Wildfly services process appears in the Not Running state. A manual attempt to start the process fails.
All log files in the <Install_Dir>\TrueSight\pw\wildfly\standalone\log directory are zero KB in size.
Cause
The default port used by JBoss Wildfly, port 9990, is used by another application.
Resolution
- Stop the other application to free the port 9990.
Restart all TrueSight Infrastructure Management services by using the following commands:
- pw sys stop
- pw sys start
The services process starts successfully.
The pserver process appears as not running on Windows
Issue
The <TrueSight Infrastructure Management_Install_Directory>/pw/server/tmp/<PSSERVER>/pserver.trace file shows the following messages:
Case 1
WARN [InitializeAtriumCMDB] (PublishServersMgr.java:106) - PServer ps_pncell_<servername> missed 5 heartbeats. Therefore assuming it is no longer running
WARN [ActiveMQ Connection Executor: tcp://localhost/127.0.0.1:8093@61383] (JmsComm.java:171) - JMS connection lost. I'll retry to establish connection.
ERROR [ActiveMQ Connection Executor: tcp://localhost/127.0.0.1:8093@61383] (JmsComm.java:459) - Unable to clean up queueConnection: Unable to close queueConnection: javax.jms.JMSException: Connection reset
ERROR [ActiveMQ Connection Executor: tcp://localhost/127.0.0.1:8093@61383] (JmsComm.java:470) - Unable to clean up topicConnection: Unable to set ExceptionListener to null: org.apache.activemq.ConnectionFailedException: The JMS connection has failed: Connection reset
The <TrueSight Infrastructure Management_Install_Directory>/pw/pronto/logs/PSProcess.log file shows the following messages:
DEBUG getPsServiceName() BMC TrueSight Operations Management Publishing Server ps_pncell_<servername>
DEBUG getPsServiceName() Method Ends...
DEBUG CheckForPublishingServerStatus() Publishing Server is running!
DEBUG stopPublishingServer() Invoking GetServerPath() method to get the server path
DEBUG stopPublishingServer() Start-up string to stop the Publishing server
DEBUG stopPublishingServer() "D:\Program Files\BMC Software\TrueSight\pw\server\bin\pscontrol.bat" -f stop
Cause
Case 1: Windows operating system cache issue.
Case 2: Antivirus software issue.
Resolution
Case 1
Perform the following actions:
- From the Services window, verify whether the service for the BMC TrueSight Publishing server is running.
- Verify whether the java_ps.exe process appears as running in the Task Manager.
If the BMC TrueSight Publishing server appears as not running in either of the above locations (for example, the service appears as running in the Services window, but the java_ps.exe process does not appear as running in the Task Manager), perform the following actions:
- Stop the Publishing server service from the Services window and ensure there is no process running with java_ps.exe.
- On the command line, run the following command on TrueSight Infrastructure Management:
pw p s pserver
Case 2
Check whether antivirus software is installed and is scanning the TrueSight folder. You can verify this from the Windows Event Viewer.
If the antivirus software is scanning the TrueSight folder, exclude the folder from scanning and start the Publishing server by using the following command:
pw p s pserver
Not receiving email alerts when Oracle database is down
Issue
INFO: ServerComponentAvailability [HA-ServerComponentsAvailability-Monitor] 600002 Database is unavailable for consecutive 12 attempt(s) Details: Database is unavailable for the duration:11 min If using an Oracle database, contact your Oracle Database Administrator immediately to rectify your database connectivity issue.
ERROR : ServerComponentAvailability [HA-ServerComponentsAvailability-Monitor] <html>Hello,<br><br><br><b>Issue:</b><br><p>The TrueSight server running on <b>"dmbas-esxvm-533"</b> server. Error in sending the email. Missing Resource String
Resolution
- Navigate to <TSIM_HOME>pw\apps3rdparty\javamail\ and back up the mail.jar file to a location outside the TSIM_HOME directory.
- Unzip the to a temporary folder.
- Copy the extracted mail-1.3.3.jar file to ..\TrueSight\pw\apps3rdparty\javamail on the TSIM Server.
- Rename mail-1.3.3.jar in ..\TrueSight\pw\apps3rdparty\javamail to mail.jar.
- Restart the TrueSight Infrastructure Management server processes by using the following command:
pw system start