Troubleshooting


This topic provides solutions to problems that you might encounter when using or administering BMC AMI Ops Insight.

 BMC AMI Ops Insight and BMC AMI Ops User Interface are in the same or different SMP/E environments

BMC AMI Ops Insight and BMC AMI Ops User Interface are installed in the same or in different SMP/E environments, and might or might not be on a shared DASD.

Solution 

Use one of the following to resolve this issue.

Warning: Perform the following steps only when both products are at the same release or maintenance level.

BMC AMI Ops Insight and BMC AMI Ops User Interface are on a shared DASD

Replace the common environment file (AMIHLQ.UBBSAMP(AMICMNEV)) in the BMC AMI Manager JCL of BMC AMI Ops Insight with the common environment file (AMIHLQ.UBBSAMP(AMICMNEV)) of BMC AMI Ops User Interface.

BMC AMI Ops Insight and BMC AMI Ops User Interface are not on a shared DASD

  1. Change the following properties in the BMC AMI Ops Insight common environment file (AMIHLQ.UBBSAMP(AMICMNEV)) so that the BMC AMI Ops UI Discovery server host and port match those of BMC AMI Ops User Interface, or the converse:
    • AMIDSC_PORT=portNumber
    • AMIDSC_HOST=hostName
  2. Make sure that the JWT token in the BMC AMI Ops Insight AMIHLQ.UBBSAMP(AMICMNEV) file matches the JWT token value in the AMIHLQ.UBBSAMP(AMICMNEV) file of BMC AMI Ops User Interface.
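
The comparison of the two common environment files can be scripted. The following is a minimal sketch, assuming that both AMICMNEV members have been copied to ordinary text files and that they use plain KEY=value lines. The function names are hypothetical, and the name of the JWT property is not given in this topic, so pass it as it appears in your file. Only OK or MISMATCH is printed, so that token values do not end up in terminal output.

```shell
#!/bin/sh
# Hypothetical helpers; not part of the product.

# Print the value of KEY from a KEY=value file (last match wins).
# $1 = key, $2 = file
prop_value() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" |
        sed 's/[[:space:]]*$//' |
        tail -n 1
}

# Compare the given keys across two files.
# Returns non-zero if any key differs.
# $1 = first file, $2 = second file, remaining arguments = keys
compare_props() {
    file_a=$1
    file_b=$2
    shift 2
    rc=0
    for key in "$@"; do
        a=$(prop_value "$key" "$file_a")
        b=$(prop_value "$key" "$file_b")
        if [ "$a" = "$b" ]; then
            echo "OK       $key"
        else
            echo "MISMATCH $key"
            rc=1
        fi
    done
    return $rc
}
```

For example, `compare_props insight.env ui.env AMIDSC_HOST AMIDSC_PORT` checks the two Discovery server properties, where insight.env and ui.env are assumed local copies of the two members.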

The following points apply whether or not BMC AMI Ops Insight and BMC AMI Ops User Interface are on a shared DASD:

  • Make sure that the AMI_MGR_HOST property value in amipdt.properties of BMC AMI Ops Insight is a valid host name and not just localhost.
  • If you are running the product in SSL mode (https), import the truststore that is used in BMC AMI Ops Insight into the truststore that is used in BMC AMI Ops User Interface, or the converse.

View Detailed Analysis returns Analysis Failed message

When using the View Detailed Analysis option, you get an Analysis Failed message. The View Detailed Analysis option of the product uses the existing BMC AMI Ops Monitor products (BMC AMI Ops Monitor for Db2, CICS, z/OS, and IMS) for analysis. Because the Data Preparation address space queries the BMC AMI Ops Monitor Products, the CAS to which the Data Preparation address space connects must be part of the MV Plex that includes the BMC AMI Ops Monitor watched by BMC AMI Ops Insight.

Example

image2022-11-1_15-40-18.png

Solution

If a CAS was installed purely for testing the product, but not set up to interact with the rest of the MV Plex, either connect to a CAS that is already part of the MV Plex or update the CASDEF so that it can connect to the other CAS in the MV Plex. 

A training request fails immediately and gives “Process ended with no output” error

When you request training, the request fails immediately with the error message Process ended with no output.

This happens when the owner of the Java 17 home directory (J17.0_64) is not set to BPXROOT, or when the sort and other files at installPath/aoihome/tomcat9/default/lib are not APF-authorized.

Solution

  1. Set the file owner of the BMC AMI Ops Insight Java 17 home directory (J17.0_64) symbolic link to BPXROOT.

    You must run this job with a user ID that has superuser permission (uid=0).

    The generic ingest calls BMCSORT from Java to sort SMF records.

    Because BMCSORT must run in supervisor mode, IBM requires that the file owner of the Java path be BPXROOT. Otherwise, an SEC6 abend occurs.

    The file owner of the $INSTALL_PATH/aoihome/java/J17.0_64 symbolic link must be BPXROOT.

    Symbolic Link Attributes                       
                               
     Pathname : $INSTALL_PATH/aoihome/java/J17.0_64               
     External link  . . : 0                                               
     File size  . . . . : 21                                              
     File owner . . . . : BPXROOT(0)                                      
     Group owner  . . . : CC100538(102)                                   
     Last modified  . . : 2022-12-12 16:09:33                             
     Last changed . . . : 2022-12-12 16:09:33                             
     Last accessed  . . : 2022-12-12 16:09:33                             
     Created  . . . . . : 2022-12-12 16:09:33                             
     Link count . . . . : 1                                               
     Device number  . . : 1BF                                             
     Inode number . . . : 667                                             
     Seclabel . . . . . :                                                 
     Symbolic link contents:                                              
                    
        /usr/lpp/java/J17.0_64       
      
  2. Make sure that all files under the USS directory &AMIINST/aoihome/tomcat9/default/lib are APF-authorized. The AMIHLQ.BMCLINK and AMIHLQ.BMCLIB install data sets must also be APF-authorized.

    Make sure that the APF-authorized and program-controlled extended attributes are ON:

    • For the pdrjvm17 and libsortjni64.so files in $INSTALL_PATH/aoihome/tomcat9/default/lib
    • For the data sets specified in the STEPLIB parameter in the AMITCEN7 member
  3. Make sure that the MODELGEN_JAVA parameter in the AMITCEN7 member points to $INSTALL_PATH/aoihome/java/J17.0_64.
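
The ownership check in step 1 can be sketched in the shell as follows. The function names are hypothetical. The check reads the numeric owner of the link itself (ls -ld does not follow a symbolic link) and compares it with uid 0, which is the uid shown for BPXROOT in the attribute display above. The commented commands at the end are z/OS UNIX commands as we understand them; confirm the options against the IBM documentation for your release before you use them.

```shell
#!/bin/sh
# Hypothetical helpers; not part of the product.

# Print the numeric owner (uid) of a path without following a symbolic link.
link_owner_uid() {
    ls -ldn "$1" | awk '{print $3}'
}

# Succeed only when the path itself (not its target) is owned by uid 0.
owned_by_uid0() {
    [ "$(link_owner_uid "$1")" = "0" ]
}

# z/OS UNIX commands (assumed syntax, require superuser authority),
# shown as comments so that this sketch has no side effects:
#   chown -h BPXROOT "$INSTALL_PATH/aoihome/java/J17.0_64"   # change the owner of the link itself
#   ls -E "$INSTALL_PATH/aoihome/tomcat9/default/lib"        # display extended attributes
#   extattr +ap <file>                                       # set APF and program control
```

For example, `owned_by_uid0 "$INSTALL_PATH/aoihome/java/J17.0_64" || echo "owner is not uid 0"` reports a link that still needs its owner changed.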

     

BMC AMI Ops Insight content is not visible

When you log in to the application, the BMC AMI Ops Insight-related content is not visible.

Solution

When you log in to the application, you can view the administration windows, but you cannot create a new model.

Your training request is stuck in progress

A generic ingest task loops in LGBSORT, and no records are ingested. While the input SMF files are being sorted, a record with a length greater than 32767 causes an infinite loop.

Solution

Apply PTF BQQ6434 to your test environment.

To verify whether the fix is applied, search for PTF BQQ6434 in the LGBSRT64 load module in your BMCLIB.

Tomcat REST interface not available

The following message is displayed in the user interface if Tomcat is not running:

PDTAM0030E: Tomcat Rest interface not available

The following message is issued in the BMC AMI Manager log:

ERROR
com.bmc.ami.controllers.AppController - PDTAM0059E: Error occurred when
connecting to Tomcat Rest Interface: I/O error on GET request for
"http://<server>:<port>": Connection refused: connect; nested exception is
java.net.ConnectException: Connection refused: connect

Solution

Restart Tomcat.

SSL is not configured properly

One of the following messages is also issued in the BMC AMI Manager log. This happens when BMC AMI Manager is trying to connect via SSL, but SSL is not configured properly.

ERROR
com.bmc.ami.controllers.AppController - PDTAM0059E: Error occurred when
connecting to Tomcat Rest Interface: I/O error on GET request for
"https://<server>:<port>": Unrecognized SSL message, plaintext connection?;
nested exception is javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection?
ERROR
com.bmc.ami.controllers.AppController - PDTAM0059E: Error occurred when
connecting to Tomcat Rest Interface: I/O error on GET request for
"https://qac5:15546": sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target;
nested exception is javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

Solution

Make sure that SSL is configured properly. For more information, see Enabling TLS authentication between Tomcat and BMC AMI Manager.

The data preparation address space is incorrect or down

The following message is displayed in the user interface. The data preparation address space entered in the amipdt.properties file is incorrect, or the address space is not available.

PDTAM0042E: Exception occurred : ["Error Occurred
processing train","Request failed: Unknown AMI ID:
<addressSpaceName> Name Token failure: BMC AMI "

The following message is issued in the BMC AMI Manager log:

ERROR
com.bmc.ami.controllers.AppController - PDTAM0059E: Error occurred when
connecting to Tomcat Rest Interface: ["Error Occurred processing  train","Request failed: Unknown AMI
ID: <address_space_name> Name Token failure: BMC AMI "

Solution

Verify that the data preparation address space entered in the amipdt.properties file is correct and that the address space is running.
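
The three PDTAM0059E cases described above (Tomcat not running, SSL not configured properly, and an incorrect or unavailable data preparation address space) differ only in the text that follows the message ID. The following is a minimal sketch of a triage helper, assuming that you pipe a saved copy of the BMC AMI Manager log into it. The function name is hypothetical, and the match strings are taken from the sample messages in this topic, so an actual log might need different patterns.

```shell
#!/bin/sh
# Hypothetical helper; not part of the product.

# Read a BMC AMI Manager log on standard input and print the likely cause
# of a PDTAM0059E message. Returns 1 if no PDTAM0059E message is found.
classify_tomcat_error() {
    log=$(cat)
    case $log in
        *PDTAM0059E*) ;;
        *) echo "no PDTAM0059E message found"; return 1 ;;
    esac
    case $log in
        *"Connection refused"*)
            echo "Tomcat is not running: restart Tomcat" ;;
        *"plaintext connection"*|*"PKIX path building failed"*)
            echo "SSL is not configured properly: check the TLS setup" ;;
        *"Name Token failure"*)
            echo "data preparation address space is incorrect or down" ;;
        *)
            echo "PDTAM0059E with an unrecognized cause" ;;
    esac
}
```

For example, `classify_tomcat_error < manager.log` prints one line with the suggested direction, where manager.log is an assumed file name.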

There is not enough memory in the training engine

The following error occurs when the training engine doesn't have enough memory allocated:

JVMJ9GC017E -Xmx too small, must be at least 1 Mbytes
JVMJ9VM015W Initialization error for library j9gc29(2): Failed to initialize
Error: Could not create the Java Virtual Machine. Error: A fatal exception has
occurred. Program will exit

Solution

Set MODELGEN_MEMORY to a value equal to or greater than 64 MB.

The MODELGEN_MEMORY is very small

The following errors are issued when the MODELGEN_MEMORY is set to a value that is too small:

JVMDUMP039I Processing dump event systhrow, detail java/lang/OutOfMemoryError at
2020/03/24 13:59:07 - please wait. JVMDUMP032I JVM requested System dump using
'RGS.UIM.JVM.TDUMP.MVSXXXTC.D200324.T135907' in response to an event
JVMPORT022I Appending .X&DS to user-specified dump template to enable
multi-part dumps. IEATDUMP in progress with options
SDATA=(LPA,GRSQ,LSQA,NUC,PSA,RGN,SQA,SUM,SWA,TRT) IEATDUMP failure for
DSN='RGS.UIM.JVM.TDUMP.MVSXXXTC.D200324.T135907.X&DS' RC=0x00000008
RSN=0x00000022
JVMPORT023W IEATDUMP failed because user-specified dump template
was too long. Retrying dump with default template. IEATDUMP in progress with
options SDATA=(LPA,GRSQ,LSQA,NUC,PSA,RGN,SQA,SUM,SWA,TRT) IEATDUMP success for
DSN='MVSXXX.JVM.MVSXXXTC.D200324.T135907.X&DS'
JVMDUMP040I System dump written to dataset(s) using name template
MVSXXX.JVM.MVSXXXTC.D200324.T135907.X&DS
JVMDUMP032I JVM requested Heap dump using
'/shrd/mvsxxx/trainingEngine/heapdump.20200324.135907.34079992.0002.phd' in
response to an event
JVMDUMP010I Heap dump written to /shrd/mvsxxx/trainingEngine/heapdump.20200324.135907.34079992.0002.phd
JVMDUMP032I JVM requested Java dump using
'/shrd/mvsxxx/trainingEngine/javacore.20200324.135907.34079992.0003.txt' in
response to an event
JVMDUMP010I Java dump written to /shrd/mvsxxx/trainingEngine/javacore.20200324.135907.34079992.0003.txt
JVMDUMP032I JVM requested Snap dump using
'/shrd/mvsxxx/trainingEngine/Snap.20200324.135907.34079992.0004.trc' in
response to an event
JVMDUMP010I Snap dump written to
/shrd/mvsxxx/trainingEngine/Snap.20200324.135907.34079992.0004.trc
JVMDUMP013I Processed dump event systhrow, detail java/lang/OutOfMemoryError.
Exception in thread main java.lang.OutOfMemoryError: Java heap space
    at com.bmc.zso.ami.trainingEngine.DataSet$Model.<init>(DataSet.java:992)
    at com.bmc.zso.ami.trainingEngine.DataSet.add(DataSet.java:248)
    at com.bmc.zso.ami.trainingEngine.Driver.main(Driver.java:424)
    at com.bmc.zso.ami.trainingEngine.Main.main(Main.java:9

Solution

Set MODELGEN_MEMORY to a larger value. 

  • Default value is 40 GB.
  • Minimum value is 64 MB.

We recommend using the default value if possible. Otherwise, use as large a value as feasible.

The system memory limit is lower than MODELGEN_MEMORY

The following errors are issued if the system memory limit is lower than the MODELGEN_MEMORY setting (default value is 40 GB):

2020/03/12 07:46:29,716 ERROR com.bmc.zso.ami.pipe.AmiPipeManager - error
message is: JVMJ9VM015W Initialization error for library j9gc29(2):
Failed to instantiate heap. 40G requested Error: Could not create the
Java Virtual Machine. Error: A fatal exception has occurred. Program will exit.

Solution

Increase the started task region size and USS MEMLIMIT.
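
The three memory-related symptoms above can be told apart by a single distinguishing string in each. The following is a minimal sketch, assuming that the training engine output has been saved to a file. The function name is hypothetical, and the search strings come from the sample messages in this topic.

```shell
#!/bin/sh
# Hypothetical helper; not part of the product.

# Scan a saved log file and print the action that this topic suggests.
# Returns 1 if none of the known memory messages is found.
# $1 = log file
suggest_memory_fix() {
    logfile=$1
    if grep -q 'JVMJ9GC017E' "$logfile"; then
        echo "Set MODELGEN_MEMORY to at least 64 MB"
    elif grep -q 'Failed to instantiate heap' "$logfile"; then
        echo "Increase the started task region size and USS MEMLIMIT"
    # The unescaped dots match both java.lang and java/lang spellings.
    elif grep -q 'java.lang.OutOfMemoryError' "$logfile"; then
        echo "Set MODELGEN_MEMORY to a larger value"
    else
        echo "No known memory message found"
        return 1
    fi
}
```

The checks run from the most specific message to the most general one, so a log that contains several of the strings reports the first match only.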

The started task region size is too small

The following errors are issued if the started task region size is too small:

10.01.23 S0015688  OSZ0100E ERROR REPORT FOR YCSBWATC (ASID=X'08BF',CSCB=YCSBWATC)  508   
   508              TCB=008C3280 FMID=ZPDR110 RMID=PDR1100 PROD=AMIPRD                    
   508              LMOD=PDRPRDCT MEMBER=PDRWMPCR FUNCTION=PDRWMPCR                       
   508              RC=X'00000008' RSN=X'00000026' R1=X'00000000_00000100' EX_OFF=X'01AC'  

Solution

Increase the started task region size.

The MODELGEN_JAVA value is invalid

An error similar to the following one is issued if MODELGEN_JAVA is set to an invalid value:

TrainingEngine Exception class java.io.IOException
Cannot run program \"\/java\/J17.0_64\/bin\/java\" (in directory
\"\/shrd\/mvsxxx\/trainingEngine\"):
\/java\/J17.0_64\/bin\/java\u0000: not found

Solution

Set MODELGEN_JAVA to the correct JAVA_HOME value.
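
Before you restart the started task, you can confirm that a candidate MODELGEN_JAVA value really is a Java home directory. The following is a minimal sketch with a hypothetical function name. It only checks that the directory exists and contains an executable bin/java, which is the program that the error message above failed to run.

```shell
#!/bin/sh
# Hypothetical helper; not part of the product.

# Succeed only if $1 looks like a usable Java home directory.
check_java_home() {
    java_home=$1
    if [ ! -d "$java_home" ]; then
        echo "not a directory: $java_home"
        return 1
    fi
    if [ ! -x "$java_home/bin/java" ]; then
        echo "no executable bin/java under: $java_home"
        return 1
    fi
    echo "ok: $java_home/bin/java"
}
```

For example, `check_java_home "$INSTALL_PATH/aoihome/java/J17.0_64"` checks the path that the training request section of this topic expects.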

The MODELGEN_PATH value is invalid

An error similar to the following one is issued if MODELGEN_PATH is set to an invalid value:

TrainingEngine Exception
class java.io.IOException Cannot run program /shrd/java/J17.0_64/bin/java (in
directory /mvsxxx/trainingEngine): EDC5129I No such file or directory.

Solution

Set MODELGEN_PATH to the correct path for the DataTraining.jar file.

Where can I find the details for all containers allocated in the Docker instance?

To see the details for resource allocation and use in your Docker containers, use the docker stats command.

The details for all containers allocated in the Docker instance are displayed.

image-2023-10-10_12-37-44.png
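
By default, docker stats keeps refreshing its display. The following is a minimal sketch of a one-time snapshot that is easier to capture in a file or a support ticket. The function name is hypothetical, and the column selection is only one possible choice.

```shell
#!/bin/sh
# Hypothetical helper; not part of the product.

# Print a one-time snapshot of CPU and memory use for the running containers.
# Returns 127 if the docker command is not available.
container_usage() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker command not found" >&2
        return 127
    fi
    # --no-stream prints a single result instead of a live display.
    docker stats --no-stream \
        --format 'table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}'
}
```

For example, `container_usage > usage.txt` saves the snapshot to a file.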

How can I check logs, status, and file system information in a dashboard container?

The following table describes the dashboard container commands:

Action                                                        Command
To list the available images                                  docker images
To list the running containers                                docker ps
To stop a running container                                   docker stop <container-name>
To remove a stopped container                                 docker rm <container-name>
To check the container logs and confirm a successful start    docker logs <container-name>
To check Docker system information                            docker system info

The Tomcat address space abends with an SEC6 error

One of the following conditions is the probable cause:

  • Extended attributes APF and PRGCTL are not specified for libpredictjni.so and libpredictjni64.so.
  • The Java path symlink file owner is not BPXROOT.

Errors similar to the following are issued to the Tomcat standard error file:

BPXP029I OPEN ERROR FOR FILE PATH  451   
/shrd/mvsxxx/tomcat/bin/libpredictjni.so
DEVICE ID 35 INODE 656.                  

IEA995I SYMPTOM DUMP OUTPUT  514                                       
SYSTEM COMPLETION CODE=EC6  REASON CODE=0594E04B                       
 TIME=13.26.46  SEQ=02731  CPU=0000  ASID=004F                         
 PSW AT TIME OF ERROR  070C4401   B9BFB48E  ILC 2  INTC 0D             
   NO ACTIVE MODULE FOUND - PRIMARY NOT EQUAL TO HOME                  
   NAME=UNKNOWN                                                        
   DATA AT PSW  39BFB488 - C07818F2  0A0DEBE9  D3600096                
   AR/GR 0: FFF00001/00000000_00000920   1: 00000000/00000000_04EC6000
         2: 00000000/00000000_0594E04B   3: 00000000/00000000_063FBB10
         4: 01FF000A/00000000_00000004   5: 00000000/00000000_00000004
         6: 00000000/00000050_00000040   7: 0101006D/00000050_4E534B20
         8: 0101006D/00000000_01D94518   9: 00000000/00000050_4E53531F
         A: 00000000/00000050_4E533DA0   B: 00000000/00000000_063FBB10
         C: 00000000/00000000_39BFB6E8   D: 00000000/00000050_4E534320
         E: 00000002/00000000_39BFB473   F: FFFFFFFE/00000000_0594E04B
 END OF SYMPTOM DUMP

Solution

To resolve this issue:

  • Make sure you specify the APF and PRGCTL extended attributes for libpredictjni.so and libpredictjni64.so.
  • Make sure that the Java path is defined with the owner BPXROOT (uid=0).

The SMF data set name doesn't exist

The following errors are issued to the Tomcat standard error file if the SMF data set name doesn't exist:

IKJ56228I DATA SET LPG.SMFTEST.QBGL.DATAX NOT IN CATALOG OR CATALOG CAN NOT BE ACCESSED
OSZ0062E DYNALLOC ERROR RC=04 ERROR=1708 INFO=0002 EERR=0000 EINFO=0000 ERSN=00000FD6
PDTPR0610E  Request ID (123456789012345678901234)  484                          
PDTPR0610E  Allocation error with SMF Input file                                
PDTPR0610E  Dataset: LPG.SMFTEST.QBGL.DATAX                                     
PDTPR0610E   RC 04  ERROR 1708  INFO 0002 ERSN 00000FD6                         
PDTPR0610E           EERR 0000 EINFO 0000                                       

Solution

Enter the correct data set name.

RACF permissions are configured, but you can't log in to the application

All RACF permissions are configured, but you can't log in to the application, and the log displays the following message:

Unable to login Unable to verify password for user <username> Success: false, errno: 157, errno2: 151782063, errnoMsg: EDC5157I An internal error has occurred.

Solution

If you see the following INFO message in the job log, follow the steps that follow it:

BPXP015I HFS PROGRAM <JAVA_HOME>/bin/classic/libjvm.so IS FROM A FILE SYSTEM MOUNTED WITH THE NOSETUID ATTRIBUTE.
  • If the program's file system is mounted with the SETUID attribute, verify that the <JAVA_HOME> value is set to 1.
  • Verify that the APF and Program Control attributes of the <JAVA_HOME>/bin/classic/libjvm.so program are honored. If not, set the <JAVA_HOME> directory to 1.

IBM Java JNI code requires the Program Control bit to invoke the RACF functions, so the Java file system must be mounted with SETUID. For more information, see the SAF/RACF-based security topic in the IBM documentation.

CPU consumption in Tomcat STC is very high

When the BMC AMI Ops Insight Tomcat server is running, the CPU consumption is very high. This happens when the HOME property in the OMVS segment of the Tomcat STC user ID is empty.

Solution

Make sure that the OMVS segment of the user ID, which is assigned to the Tomcat started task, is defined with a home directory. 

Set the HOME property of the OMVS segment as follows:

HOME=/u/<started_task_userid>

An error occurs while processing RCA during real-time scoring with Java 17.0.7

When processing RCA, you get the following errors:

  • In Tomcat:
    • Exception creating pipe:null
    • Failed to delete pipe nullclass java.io.IOException
  • In BMC AMI Manager:
    • Error Occurred processing  rca","class java.io.IOException","Cannot run program \"rm\": error=124, EDC5124I Too many open files.

Solution

Upgrade to Java 17.0.9.

Real-time data streaming stops with message PDRRH0405E in syslog (WTO)

BMC AMI Ops Insight suddenly loses connection, resulting in the stopping of real-time data streaming. This causes a socket timeout exception in the Tomcat servlet log.

This is a connection timeout in the SMF record handler.          

When a socket timeout exception occurs, the product retries the connection three times. If the connection is still not established, the streaming process stops, and the message PDRRH0405E is sent to the operator (WTO) as follows:

PDRRH0405E
Realtime streaming has stopped for AMI-Manager <IP>:<port> due to connection timeout with SMF Record Handler. Restart SMF Record Handler started task to resume realtime streaming.  
                         

Solution                                                                                                                                  

Restart the SMF record handler to resume the real-time streaming process.                         

The USS file system is out of storage space

The USS file system runs out of storage space.

Solution

You can use one of the following options:

  • Increase storage space.
  • Check for redundant DB backups in ../aoidata/aoiinst/backups.
  • Check for redundant logs and review the log management configuration.
  • Customize the log size and the number of log files by using the following parameters:

    • For Tomcat, add and modify the following parameters in <HLQ>.XXSAMP(AMITCEN7):

      IJO="$IJO -DTCAT_SERVLET_LOG_SIZE=30MB"
      IJO="$IJO -DTCAT_SERVLET_ROLL_MAX=30"
    • For SMF Record Handler, modify the following parameters in ../aoihome/smfrh/bin/run_smf_handler.sh:

      export SMFHDL_LOG_SIZE=20MB
      export SMFHDL_ROLL_MAX=10
    • For BMC AMI Manager, modify the following parameters in ../aoihome/oi_home/mgr/bin/run_manager.sh:

      export AMIMGR_LOG_SIZE=20MB
      export AMIMGR_ROLL_MAX=10
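
To decide which of the options above helps most, it is useful to see what actually occupies the space. The following is a minimal sketch with a hypothetical function name. It lists the largest entries directly under a directory, such as the backups directory or a log directory, by using only the standard du, sort, and head utilities.

```shell
#!/bin/sh
# Hypothetical helper; not part of the product.

# Print the N largest entries (size in KB, then path) directly under a directory.
# $1 = directory, $2 = number of entries to show (default 10)
largest_entries() {
    dir=$1
    count=${2:-10}
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}
```

For example, `largest_entries ../aoidata/aoiinst/backups 5` shows the five largest backups, and `df -k ../aoidata` shows how much space remains in the file system that contains the directory.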

Error in job log of BMC AMI Ops UI Discovery server, BMC AMI Ops UI server, or BMC AMI Manager started task

The following error is displayed in the job log of the BMC AMI Ops UI Discovery server, BMC AMI Ops UI server, or BMC AMI Manager started task, even though the server starts:

  • exception=java.net.ConnectException: EDC8128I Connection refused. (Connection refused) stacktrace=com.sun.jersey.api.client.ClientHandlerException: java.net.ConnectException: EDC8128I Connection refused.

For example, the BMC AMI Ops UI Discovery job log also displays the message ‘Started AmiOpsDiscoveryServiceApplication in 18.742 seconds (process running for 22.456)’.

Solution

  1. Make sure that the ports are open and accepting connections. You can use the TSO NETSTAT command.
  2. Make sure that the incoming connections to this host and port are not being blocked, as shown in the TCPIP log. If an error message is displayed in the log for this host and port combination, see the relevant IBM documentation for details about the error code.
  3. If you are running in TLS or AT-TLS mode and using RACF keyrings, make sure that the certificates in the keyring are valid.
    Also make sure that the user IDs listed against the started task in SDSF have access to the keyring.
  4. To validate the settings, use one of the following options:
    1. If you are using AT-TLS, see Configuring the system for AT-TLS connections.

The job log started task shows a valid certificate not found error

The started task job log shows a valid certificate not found error when a new or changed certificate is not present in the truststore.

Solution

  1. To download the client certificate to the server from the browser, click View site information > Connection is secure on the left of the address bar.
  2. FTP this .crt file to a USS location on the mainframe in ASCII format.
  3. Add this certificate to the truststore defined in the AMIHLQ.UBBSAMP(AMICMNEV) file.
    Alternatively, you can add the certificate to the truststore (PKCS12/JKS) by using the following Java keytool command:

    keytool -import -keystore <trustStoreName> -storepass <password> -file <absolutePathToTheCertificate> -alias <certificateAliasName> -trustcacerts -storetype <trustStoreType>

BMC AMI Ops Monitor hyperlinks are not available on the Probable Cause Analysis page

Monitor hyperlinks are either not available or are not working on the Probable Cause Analysis page. When you hover over the links or click them, the following messages are displayed:

  • Unable to determine Target Context
  • The AMI OPS Monitor is not configured to monitor the target, or one of the components is down.

Solution

  1. In the amipdt.properties file, make sure the value of ENABLE_RCA property is set to true.
  2. Make sure the target MV Host Explorer is up and running on any of the PLEX LPARs that BMC AMI Ops Insight is monitoring.

Scoring paused owing to a critical component or resource issue

The following message is displayed. This indicates that a critical resource, such as memory, CPU, or disk space, is low or unavailable, or a critical component (hsqldb, scoring engine, Tomcat, or SMF record handler) is not available.

PDTAM0117E= Scoring paused due to critical component or resource issue

Solution

View the active issues on the Product Health Status page.

To resolve the health issue, restart any missing components, or address the memory, CPU, and storage-related problems.

 

 
