This version of the documentation is no longer supported. However, the documentation is available for your convenience.

This topic contains information about troubleshooting problems with the publishing server and publication failures.

Promotion, reconciliation, and publication are independent processes. The promotion and reconciliation processes can succeed while the subsequent publication fails.

BMC Impact Model Designer notifies you only of the success or failure of a promotion, not whether the publication is successful or has failed. BMC recommends that you monitor the success or failure of publications that are automatically started.

Verifying that the publishing server is running

To ensure that only one publishing server is running, the publishing server maintains the file installationDirectory/pw/server/log/ps_hostName/ps.lock. If the Infrastructure Management JMS is not functioning properly, you can use ps.lock to verify whether the publishing server is running.

Stopping the publishing server when JMS is not running

If the Java Message Service (JMS) is not up and running properly, find and stop the publishing server process.

Ensure that you do not kill the publishing server process when it is processing a request. Note that this procedure is for UNIX platforms only.

  1. At the command prompt, navigate to the installationDirectory/pw/server/log/ps_hostName directory.
  2. Execute the command: fuser ps.lock
    The process ID of the publishing server process is returned.
  3. Kill the publishing server process by executing the command: fuser -k ps.lock

Publishing large service models

When you publish large models, the following parameters might require adjustment:

  • In the pserver.conf file, the configuration parameter ARSXLongTimeOut may not be set high enough. This parameter specifies the timeout value for the communication between the Publishing Server and the BMC Atrium CMDB.

    Reinitialization of a cell (pinit) and a new, successful publication are necessary to avoid subsequent publication job failure with the message unique data identifier not/already in use.

    By default, the Publishing Server estimates the timeout needed. If the timeout is not adequate, set ARSXLongTimeOutEstimate=F and increase ARSXLongTimeOut.

    If publication fails during the database update with the message Failure while applying publish on CMDB - Error - 92 Timeout, the operation has been accepted by the server and usually completes successfully. The value for ARSXLongTimeOut is not set high enough and expires before the BMC Atrium CMDB has finished committing modifications in the impact dataset.

    BMC Atrium CMDB continues to commit modifications in the impact dataset, and after a while the service model becomes available in the impact dataset. Ensure that the parameters are set correctly. The same failure might occur when initializing a CMDB that contains large service models.
  • In the installationDirectory/pw/server/etc/smmgr.conf file, the DestinationBufferKeepSent parameter is the timeout for communication between smmgr and the cell. In the installationDirectory/pw/server/etc/pserver.conf file, the SMMMessageBufferKeepSent parameter is the timeout for communication between the publishing server and smmgr.

    By default, the Publishing Server estimates the timeout needed. If the publication fails with the message publish verification of IMs failed, set SMMMessageBufferKeepSentEstimate=F and increase SMMMessageBufferKeepSent.

    If publication fails with the message publish validation of IMs failed, use the following information to troubleshoot the problem according to the message you receive:
    IM <CellName> failed to upload service model from SMM

    The DestinationBufferKeepSent of smmgr is not high enough and expires before the cell has finished uploading the service model from smmgr.
    IM <CellName> did not answer the request

    The SMMMessageBufferKeepSent of the publishing server is not high enough and expires before the smmgr has applied the verification or upload.
    In both cases, the cell continues to upload and eventually the service model is available in the cell. Nevertheless, reinitialize the cell and publish again to avoid subsequent publish jobs failing with the message Unique data identifier not/already in use.
  • When performing a penv init or a pinit of a large service model, the stack size and the heap size of the publishing server might need to be increased. For service models with approximately 10,000 CIs and 10,000 relationships, you must double both the stack size and the heap size. To double the stack size, in the pserver.bat file (Windows) or the pserver file (UNIX), change the default -Xoss400k and -Xss512k values to -Xoss800k and -Xss1M. To double the heap size, in the pserver.bat file, change the default -Xms256M -Xmx800M values to -Xms512M -Xmx1600M. Additionally, in the pserver_service.conf file, add the following parameters before restarting the publishing server:
    wrapper.java.additional.2=-Xms512M
    wrapper.java.additional.3=-Xoss800k
    wrapper.java.additional.4=-Xss1M
    wrapper.java.additional.5=-Xmx1600M
    Making these changes will allow publishing of 10k models successfully if you run pserver.bat manually.
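The timeout overrides described in the first two items above amount to setting a few Name=Value pairs in pserver.conf. The following is a minimal sketch of doing that programmatically, assuming the file uses one Name=Value pair per line and # for comment lines. The helper name set_conf_params and the value 600 are hypothetical illustrations, not documented defaults or recommended settings.

```python
import re

def set_conf_params(text, params):
    """Return conf text with every Name=Value in params set or appended.

    Assumption: one 'Name=Value' pair per line; lines starting with '#'
    are comments and are left untouched.
    """
    written = set()
    out = []
    for line in text.splitlines():
        m = re.match(r"\s*([A-Za-z_]\w*)\s*=", line)
        name = m.group(1) if m else None
        if name in params:
            if name not in written:
                out.append(f"{name}={params[name]}")
                written.add(name)
            # A repeated assignment of the same name is dropped.
        else:
            out.append(line)
    # Append parameters that were not present in the file.
    for name, value in params.items():
        if name not in written:
            out.append(f"{name}={value}")
    return "\n".join(out) + "\n"

# Example: disable the estimate and set an explicit (hypothetical) timeout.
example = set_conf_params(
    "ARSXLongTimeOutEstimate=T\n",
    {"ARSXLongTimeOutEstimate": "F", "ARSXLongTimeOut": "600"},
)
```

Back up the original file before writing the result, and restart the publishing server afterward so that the new values are read.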

Publishing failures and reattempts

When an automated publish request fails for reasons independent of model consistency (for example, when a cell is not available), the automated publisher retries the publish. The configuration parameter AutomatedPublishRetryPeriod in the pserver.conf file defines the interval between two publish requests. If a request has still not finished when the interval runs out, a new interval is started.

The configuration parameter AutomatedPublishRetryCount gives the maximum number of retries where:

    • 0 means no retry, so only a single publish request is performed.
    • 1 means a publish request and one retry attempt, if necessary.
    • A negative number (for example, -1) means the automated publisher republishes indefinitely, until a publish is successful.
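The AutomatedPublishRetryCount semantics listed above can be expressed as a small decision function. This is an illustrative sketch only. The function names are hypothetical and do not reflect how the publishing server is implemented.

```python
def max_publish_attempts(retry_count):
    """Total publish requests allowed, or None when retries are unbounded."""
    if retry_count < 0:
        return None  # negative value: republish until a publish succeeds
    return 1 + retry_count  # the initial request plus the retries

def should_retry(attempts_made, retry_count):
    """True if another publish request is allowed after a failure."""
    limit = max_publish_attempts(retry_count)
    return limit is None or attempts_made < limit
```

For example, with a retry count of 1, a second request is allowed after the first failure, but a third is not.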

The publishing server service or daemon fails to start

Only one publishing server may be running at any given time. This is controlled in the installationDirectory/pw/server/log/<PSName>/ps.lock file, which is updated with a time stamp every minute by the publishing server as it runs. If the publishing server is stopped gracefully, then ps.lock is removed.

If the publishing server service or daemon fails to start and displays the error message

Unable to launch BMC Impact Publishing Server. Another BMC Impact Publishing Server is already running

remove the ps.lock file in the installationDirectory/pw/server/log/ps_hostName/ directory and restart the publishing server service (or daemon).
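Before removing ps.lock, it can help to check whether the file is still being refreshed. The sketch below assumes that the once-a-minute time stamp update also changes the file's modification time, which this topic does not confirm. The 180-second threshold and the function names are hypothetical.

```python
import os
import time

def lock_age_seconds(lock_path, now=None):
    """Seconds since ps.lock was last modified, or None if it does not exist."""
    try:
        mtime = os.path.getmtime(lock_path)
    except FileNotFoundError:
        return None
    return (time.time() if now is None else now) - mtime

def lock_looks_stale(lock_path, max_age_seconds=180, now=None):
    """True if the lock exists but has not been touched recently.

    Assumption: a running publishing server refreshes the file about once
    a minute, so an age of several minutes suggests a leftover lock.
    """
    age = lock_age_seconds(lock_path, now)
    return age is not None and age > max_age_seconds
```

A stale result is only a hint. Confirm that no publishing server process is running (for example, with fuser ps.lock on UNIX) before deleting the file.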

No publication after successful promotion

Even if promotion is successful, publication might still fail. Promotion and publication are asynchronous processes. If the Promotion dialog box in BMC Impact Model Designer indicates that the promotion was successful but data does not appear in Infrastructure Management, follow the guidelines in this section to troubleshoot the problem.

Unable to start automated publishing

You might receive the following error event after switching to automated mode:

Unable to start automated publishing. ERROR-8755 The specified plug-in does not exist. (BMC.ARDBC.NOTIFY).

This error occurs when the Notify ARDBC plug-in is not loaded when the publishing server starts in automated mode. Verify that the plug-in is properly installed and loaded.

Verify automated publishing mode

Verify that the publishing server is running in automated mode with the CLI command psstat. If the psstat command returns Started - Automated mode, the automated publisher is up and running.

If the psstat command indicates that the publishing server is not running in automated mode, it may be in manual mode. This might have occurred because the configuration parameter AutomatedStartMode in pserver.conf is set to Manual, or because the mode was set with the CLI command pscontrol. If the publishing server is running in manual mode, you can request a publication by using the CLI command publish.

To switch to automated mode, run the CLI command pscontrol automated.
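When psstat output is checked from a monitoring script, the status strings quoted in this topic can be mapped to a coarse state. The following sketch takes the captured output as text. It recognizes only the three strings that appear in this topic (Started - Automated mode, Started - Starting Automated mode, and Request timeout expired) and treats everything else as unknown. The function name and the returned labels are hypothetical.

```python
def interpret_psstat(output):
    """Map captured psstat output to a coarse state label.

    Only strings quoted in this topic are recognized; any other output
    (including manual mode, whose exact wording is not given here)
    is reported as "other".
    """
    text = output.strip()
    if "Request timeout expired" in text:
        return "unreachable"
    # Check the "Starting" variant first; it also contains "Automated mode".
    if "Starting Automated mode" in text:
        return "automated-starting"
    if "Started - Automated mode" in text:
        return "automated"
    return "other"
```

A result of "automated-starting" that persists across several checks matches the dynamic-port symptom described for the ARDBC Notify plug-in later in this topic.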

Reconciliation jobs hang

When reconciliation jobs hang and remain in a started status (causing BMC Impact Model Designer promotions to hang), the NotifyARDBC plug-in is not installed or is not running.

On the system where BMC CMDB is installed, ensure that the NOTIFY plug-in configuration and the BMC Remedy AR System plug-in environment variables are correct so that the NOTIFY plug-in is loaded. To verify that the NotifyARDBC plug-in is running, follow these steps:

  1. Log on to Remedy User.
  2. Open the form NOTIFY:protocols and retrieve entries.
    You must get one entry with version 1.
  3. Open the form NOTIFY:servers and retrieve entries.
    You must get one entry. The port must be accessible so that the publishing server can open a TCP/IP connection to it; if it is not, verify the installation of the Notify ARDBC plug-in.

The publishing server does not reply to requests

The client and server use the JMS service of Infrastructure Management for communication. Normally, the publishing server restores communication after the JMS service drops. If the JMS service remains down, the publishing server stops with a critical error.
However, occasionally the communication is not restored, and the pserver.trace file contains repeated warnings from org.jboss.mq.SpyJMSException.
To resolve this situation, perform the following steps:

  1. Verify that Infrastructure Management is running properly (or restart Infrastructure Management).
  2. Restart the publishing server.
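To check for the repeated warnings mentioned above, the trace file can be scanned for the exception class name. This is a minimal sketch that relies on a plain substring match, because the exact layout of pserver.trace lines is not described here. The threshold of 3 is a hypothetical cutoff for "repeated", and the function names are illustrative.

```python
def count_marker_lines(lines, marker="org.jboss.mq.SpyJMSException"):
    """Count lines that contain the marker text (plain substring match)."""
    return sum(1 for line in lines if marker in line)

def jms_looks_stuck(lines, threshold=3):
    """True if the marker appears on at least `threshold` lines.

    The threshold is a hypothetical cutoff, not a documented value.
    """
    return count_marker_lines(lines) >= threshold

# Usage with a file (the path is site specific):
# with open(trace_path, errors="replace") as f:
#     print(jms_looks_stuck(f))
```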

Publishing duplicate relationships

When a service model includes more than one impact relationship between the same pair of CIs, the following warning message is displayed in the Publication History dialog box of the administrator console:

The udid{0} relationship containing attributes {2} has been dropped because it is the same as the relationship with udid{1}.

The message above is visible only when the value of the DropDuplicateRelation parameter is set to T in the installationDirectory/pw/server/etc/pserver.conf file. If the value is set to F, publication fails, and an error message is displayed in the Publication History dialog box. For information about the other parameters in the pserver.conf file, see pserver.conf file and parameters.

If the value of the DropDuplicateRelation parameter is set to T, the publishing server discards the duplicate relationship and publishes the service model.

If the value of the DropDuplicateRelation parameter is set to F, you must manually delete the duplicate relationship between the CIs in the service model, and publish the service model. To delete the duplicate relationship manually, perform the following steps:

  1. Log on to BMC Remedy AR System.
  2. Open the BMC.Core:BMC_BaseRelationship form.
  3. In the ReconciliationIdentity field, enter the ID of the warning message (displayed in the Publication History window) and click Search.
    The form is populated with the details of the service model.
  4. On the toolbar at the top of the form, click Actions > Delete to delete the duplicate of the service model record.

    Note

    The warning message is not displayed in subsequent publish operations. BMC recommends that you delete the duplicate relationship.

Configuration item (CI) is not published

Use the following guidelines to diagnose why a configuration item (CI) is not published:

  • In the administrator console, open the Publication History dialog box. If the publication request log is not available, determine whether publication happened and whether it succeeded or failed.
  • If the publication history indicates that the publication failed, examine the request log in the Publication Details pane for reasons.
  • If the publication succeeded and the CI does not appear, examine the request log in the Publication Details pane for the reasons. If the CI is not included in the publication, examine the publication filters that should select the CI, and examine the Atrium filter in the PublishToSIM setting of the CI in Atrium.
  • Examine trace files: pserver.trace, smmgr.trace, and mcell.trace.

Publishing Server service is running, but the Publishing Server is not reachable

On Microsoft Windows systems, when using the pscontrol stop command to stop the Publishing Server, there are exceptional instances in which the java_ps.exe process is stopped (does not run), but the Publishing Server process (pserverinstall.exe) is still running.

Symptoms

When you execute the psstat command, the Request timeout expired message is returned.

The running processes display the pserverinstall.exe process but not the java_ps.exe process.

Fix

Kill the pserverinstall.exe process and restart the Publishing Server.

Diagnosing publication failures

When a publication attempt or other request fails, examine the details of the request log in BMC Impact Model Designer by using the menu command Publish History (Tools > Publish History).

The following table describes the request failure messages of the publishing server, what causes the problem, and what to do to correct the problem.

Publishing server request failure messages

Each entry below lists the failure message, followed by its cause and then the corrective action.

Classinfo is not synchronized.

For various reasons, the class definitions in the BMC Atrium CMDB can become out of sync with the class definitions of the published service model of the cells. For example, a class might be modified in the BMC Atrium CMDB after the service model is published to the cell.

Run pclassinfo -x -o mc_sm_object.baroc. Replace the existing mc_sm_object.baroc file of the target cell in the installationDirectory/pw/server/etc/cellName/kb/classes directory. Recompile the cell's Knowledge Base, and restart the cell.

Component alias "{0}" for component "{1}" is already used by component "{2}".

Two CIs have the same alias.

  • Ensure that all CIs have unique aliases.
  • Publish the purge by using the CLI command publish -p "Purge=T".

Connection to IM cellName is not open OR Connection to IM cellName dropped.

The publishing server is not able to connect to the BMC Impact Manager or the connection was dropped.

Verify that the target cell instance is running. Restart it if necessary. Also verify that the cell's location and encryption key are registered with Infrastructure Management.

Consumer/Provider component with mc_udid {0} is not defined.

This message might occur if an impact relationship is pointing to a non-existent CI.

Such problems might occur when two promotions follow each other very quickly, and the first promotion adds a relationship and the second promotion moves a CI out of the model. Using automated publish for the two promotions prevents this failure.

IM {0} failed to launch SMM (Service Model Manager).

In the cell's trace file, you find the message Service Model Manager process ({0}) not active within expected delay. Please verify. The cell forks a Service Model Manager (SMM) process. In the mcell.conf file, the parameter ServiceModelManagerStartTimeOut = 60 defines the timeout.

Increase the value of ServiceModelManagerStartTimeOut.

IM {0} failed to upload service model from SMM

This failure message appears after a failure in the second phase of the two-phase commit.

Reinitialize the cell and publish again (to avoid subsequent publishes failing with the message unique data identifier not/already in use).

IM is not publish enabled.

The ServiceModelPublish parameter in the installationDirectory/pw/server/etc/mcell.conf file or in installationDirectory/pw/server/etc/<CellName>/mcell.conf file is set to No.

Reset the ServiceModelPublish parameter to Yes and restart the cell.

init verify failure

When you have previously published from a Direct Publish environment and now want to publish from BMC Atrium CMDB, the Direct Publish management data conflicts with management data being published from BMC Atrium CMDB.

Delete Direct Publish management data by using the pposter CLI command and the delete action command.

No user group defined with id {0}

In the BMC Atrium CMDB, a CI's securities point to BMC Remedy AR System user group IDs. In the BMC IM, a CI's securities point to BMC Impact Administration User Roles. The publishing server maps the BMC Remedy AR System user group IDs to user role names by using the user group information found in the AR form groups and the AR external authentication group mappings. This failure typically occurs when you remove a group in AR Server to which some components still refer.

Modify the CIs to point only to existing user groups.

Operation on instance of different environment

The data instance is already published to the cell from another publish environment.

Use the instance's publish environment to publish modifications or deletions.

Provider_home_cell ({0}) is remote but component {1} is local

This error can occur as a result of a typo when registering cells. For example, cell X runs on port X, and cell Y runs on port Y. However, port X is mistakenly entered for both cells. While cell X is running, a provider component with cell name Y is sent to the cell on port X; thus the cell X impact relationship is sent to the cell with name Y; thus, the following is observed:

  • The cell on port X is component local (same cell as relationship).
  • Provider_home_cell has value Y, so the provider_home_cell is remote (other cell as relationship).

The issue originates from the fact that although the CI is sent to cell Y, in reality, it is sent to cell X because that cell is listening on the (erroneous) port (X) of cell Y.

Correctly register the ports of the cells.

Publish returns generic failure message, such as Publish validation of Impact Manager failed

Publication failure

Use the -v option (publish -v) to return both generic and detailed (verbose) failure messages.

The AR System plug-in server is not responding (ERROR-8939).

When the load on the BMC Remedy Action Request (AR) System server plug-in is very high, the system sometimes returns a connection error.

Try the following workarounds:

  • Start another publication.
  • Restart the BMC Remedy AR System server.
  • Adjust the BMC Remedy AR System server configuration parameters. (See pserver.conf file and parameters).

The cell alias is not mapped to a cell name in the current environment

The attribute HomeCellAlias has a value that is not defined in the publish environment's CellAliases.

Define the cell aliases correctly.

The minimum supported protocol version is 7.

The version of the target cell instance is earlier than the required version.

Uninstall the earlier version and install the appropriate version.

Unique data identifier already in use.

A service CI with the same mc_udid is already published in the cell.

The service model in the cell is most likely not in sync with the master copy kept in the BMC Atrium CMDB impact dataset. Reinitialize the cell. If reinitializing the cell fails because of invalid data, then the master copy is invalid. Reinitialize the BMC Atrium CMDB.

Unique data identifier not in use

This failure might occur when the deletion or modification of a CI with a udid that does not exist is requested. For Atrium CMDB Publish, this typically happens when the service model in a cell is not in sync with the service model in (the impact dataset of) the BMC Atrium CMDB, for example when a previous publish failed because of a failure while applying the publish on the cell or the BMC Atrium CMDB, or when the cell has been restarted with the -id option.

Reinitialize the Infrastructure Management data from the publish environment by executing the CLI command pinit -n cellName -e EnvId. If this solution fails, the data in the BMC Atrium CMDB may be invalid. Reinitialize the BMC Atrium CMDB.

Unknown home cell "{0}" for shadow component

The mcell.dir file of the consumer's cell does not contain an entry that defines the provider's cell.

Correct mcell.dir.

Verification of publish failed

Publishing was not successful.

Check the publication history to view detailed information about the cause of publishing failure. For information about viewing the publication history, see Viewing publication history.

You may receive detailed failure messages from the BMC Atrium CMDB.

For instance, you receive failure messages when the number of CIs exceeds the limit available with a trial license. These failures may occur in the second phase of the two-phase commit.

To troubleshoot these failure messages, consult the BMC Remedy AR System and BMC Atrium CMDB documentation. If the failure occurred in the second phase of the two-phase commit, to avoid subsequent publish failures with the message Unique data identifier not in use or already in use, reinitialize the cell and publish again.

Another publish request is ongoing

When the publishing server does not accept or begin processing a publish request, the following messages may be displayed:

  • Another publish request is ongoing.
  • The environment is not registered.
  • Error with ids/udids for partial publish, that is publish of selected instances

Message: Another publish request is ongoing

The publishing server executes only one publication at a time, per cell. If you request a new publication (by using the CLI command publish or pposter) while another publication is in progress, the message Another publish request is ongoing is displayed.

If you receive this message unexpectedly, verify whether the previous publication is still running. If a publication hangs (because of an uncaught exception, which can be found in tmp/ps.err), then all following publications result in failure messages, and you must contact BMC Customer Support.

Using dynamic ports with the ARDBC Notify plug-in

The Notify plug-in uses the static port 1840 by default. However, if the Notify plug-in is configured to use a dynamic port, automated publishing might not work. If the Notify plug-in listens on a port that is registered and used by another service, for example, port 1828 used by the cell, automated publication does not function. If this occurs, psstat returns the message:

Started - Starting Automated mode 

To resolve this issue, restart the Remedy AR Server so that the Notify plug-in chooses another port on which to listen.

Using trace files

In a few cases, the failure messages do not provide the information required to find the root cause. To help debug such publishing problems, you can use the pserver.trace file. The file installationDirectory/pw/server/tmp/ps_hostName/pserver.trace contains tracing information. By default, only trace information of level WARN or higher is logged. Enable debug tracing in installationDirectory/pw/server/etc/<PSName>/pserver.trace (or installationDirectory/pw/server/etc/pserver.trace) by commenting out the last two sections, and then restart pserver.

To enable debug tracing for smmgr: 
In etc/smmgr.trace, enable the following:
ALL ALL %T/smmgr.trace

The smmgr.trace file is located in tmp/<cell>.

Publishing Server is not starting in the automated mode

If you have upgraded the BMC CMDB Extensions from 9.0.xx to 9.6, the Publishing Server may not start in the automated mode.

This issue occurs when duplicate BMC-ARDBC-NOTIFY plug-in entries are configured in the Remedy AR Server.

To fix this issue, perform the following steps:

  1. Log on to BMC Remedy AR System.
  2. Open the ar.cfg (Windows) or ar.conf (UNIX) file.
  3. Search for the notifyardbc70.dll (Windows) or notifyardbc70.so (UNIX) entry.
  4. Delete the entry and save the file.
  5. Restart the Remedy AR server.