Testing the BMC Server Automation infrastructure

The purpose of this page is to provide a BMC Server Automation administrator with a series of tests that can be performed in a BMC Server Automation environment to determine the cause of any latency issues that might be occurring.

Overview

This topic covers methods for testing the BMC Server Automation infrastructure to determine the root cause of any latency issues that might arise in a BMC Server Automation environment. These tests cover the three tiers of the BMC Server Automation architecture: the Client Tier, Middle Tier, and Server Tier. There are many nodes of communication in a BMC Server Automation environment, so it is important to understand how a BMC Server Automation infrastructure is typically configured.

Before You Begin

Before you begin, ensure that you have root or Administrator-level access to all of the BMC Server Automation infrastructure servers, including all application servers, the database, the file server, and any repeaters configured in your environment.
In addition, ensure that you have the necessary tools for monitoring server and network performance and utilization.

Introduction

When preparing to test your BladeLogic infrastructure, it helps to know where to start. Your own experience, your knowledge of the BMC Server Automation infrastructure, and any feedback gathered from end users should indicate which areas to test first. Also have ready the BMC Server Automation jobs that have been causing the longest delays, along with a set of target servers against which to test them.

Job Execution Performance

Job execution can suffer for several reasons. You may find that it takes a long time for a job to start. You may also find that it takes a long time for a job to complete. In general you should see a fairly linear increase in the amount of time it takes to execute a job for each target you add. That being the case, there are several ways to determine the cause of issues in job execution performance in your environment.
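One way to check the expectation of a roughly linear increase is to time the same job against several target counts and fit a straight line to the results. The following is a minimal sketch; the measurements in `runs` are hypothetical placeholders, and the function names are illustrative rather than part of any BMC tooling.

```python
def fit_line(points):
    """Least-squares fit of: seconds = slope * targets + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def max_relative_residual(points, slope, intercept):
    """Largest deviation from the fitted line, as a fraction of the measured time."""
    return max(abs(y - (slope * x + intercept)) / y for x, y in points)


# Hypothetical measurements: (number of targets, job duration in seconds).
runs = [(10, 25.0), (20, 45.0), (30, 65.0), (40, 85.0)]

slope, intercept = fit_line(runs)
worst = max_relative_residual(runs, slope, intercept)
print(f"seconds per added target: {slope:.2f}")
print(f"fixed overhead (seconds): {intercept:.2f}")
print(f"worst deviation from the line: {worst:.1%}")
```

A large deviation from the fitted line, or a slope that grows as targets are added, suggests that something other than the per-target work is limiting the job.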

The diagram below shows the servers involved when executing a job, and illustrates the following sequence of events:

  1. A Job is executed from within the Console.
  2. The Application Server adds an entry to the table of jobs to be executed in the database.
  3. An Application Server picks up the job for execution.
  4. If the type of job requires objects from the File Server to be copied to the selected targets, the files are copied from the File Server across the network to those targets.
  5. All necessary job requests are sent from the Application Server to all selected targets.
  6. Job status and results received by the Application Server from each target are sent from the Application Server to the Database.

What to Monitor

During the following tests, you will need to monitor the following:

  • Application Server utilization (memory, CPU)
  • Database utilization (memory, CPU)
  • SQL Execution Times
  • Network throughput between the Application Server and the Database
  • Application Server logs with DB debug enabled

Preparation

Change Application Server Logging to Debug

  1. Locate and open the appserver.cf file in your BladeLogic installation directory.
  2. Change the logging level to DEBUG so that you are able to view the SQL execution times.

    Note

    You may wish to clear out the appserver.log so that the new log entries are written to a clean file.

  3. Restart your application server for the new logging level to take effect.

Tests to Perform

Test One: Time to Execute a Job

  1. Execute a job, such as an Update Server Properties job, against a small number of servers (perhaps 25-50). Repeat this test five times.
  2. For each run, record the time between launching the job in the Console and the moment the job begins to execute.

    Note

    If you have remote application servers in your environment, run each of these tests against your central application server and your remote application servers.

  3. For each run, record the SQL execution time by looking at the application server log file.

If it takes a long time for a job to start once it has been executed from the Console, you may be experiencing latency issues in your network between the Application Server and Database Server.
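The recorded start delays from Test One can be summarized with a short script. This is only a sketch: the timestamps below are hypothetical, and the timestamp format is an assumption that should be adjusted to match however you record the times.

```python
from datetime import datetime
from statistics import mean

# Assumed timestamp format for manually recorded times (not a BMC log format).
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def start_delay_seconds(submitted, started):
    """Seconds between launching a job in the Console and the job starting."""
    t_submitted = datetime.strptime(submitted, TIME_FORMAT)
    t_started = datetime.strptime(started, TIME_FORMAT)
    return (t_started - t_submitted).total_seconds()


# Hypothetical records for five runs: (launched in Console, job began executing).
runs = [
    ("2024-01-01 10:00:00", "2024-01-01 10:00:04"),
    ("2024-01-01 10:05:00", "2024-01-01 10:05:06"),
    ("2024-01-01 10:10:00", "2024-01-01 10:10:05"),
    ("2024-01-01 10:15:00", "2024-01-01 10:15:05"),
    ("2024-01-01 10:20:00", "2024-01-01 10:20:05"),
]

delays = [start_delay_seconds(submitted, started) for submitted, started in runs]
summary = {"runs": len(delays), "mean": mean(delays), "max": max(delays)}
print(summary)
```

Keeping one summary per application server (central and remote) makes it easier to see whether the delay follows a particular server's distance from the database.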

Test Two: Network Throughput and Infrastructure Utilization

To perform this test, find a job that typically generates a large number of writes to the database. Jobs in this category include Patch Analysis Jobs, Audit Jobs, Snapshot Jobs, and Compliance Jobs. More specifically, choose one of the following:

  • An Audit Job that evaluates a large number of server objects
  • A Patch Analysis job that analyzes for all patches on a server
  • A CIS or PCI Compliance Job

Once you have selected one or more jobs, run through the following steps:

  1. Execute the job against a small number of targets (perhaps 5-10).
  2. While the job is executing, capture the network latency and throughput between the Application Server being tested and your BladeLogic Database Server.
  3. While the job is executing, analyze the memory and CPU utilization of your Application Server and your Database Server.

    Note

    If you have remote application servers in your environment, please run each of these tests against your central application server and your remote application servers.

  4. For each run, record the SQL execution time by looking at the application server log file.
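If no dedicated network tool is at hand, latency from the Application Server to the database listener can be approximated by timing TCP connections. The sketch below uses only the Python standard library; the host name and port in the usage comment are hypothetical, and TCP connect time is an approximation of one network round trip rather than an exact equivalent of ICMP ping.

```python
import socket
import time


def tcp_connect_latency_ms(host, port, samples=5, timeout=3.0):
    """Time several TCP connection handshakes to host:port, in milliseconds."""
    results = []
    for _ in range(samples):
        start = time.perf_counter()
        # create_connection returns once the TCP handshake has completed.
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.perf_counter() - start
        results.append(elapsed * 1000.0)
    return results


# Example usage (hypothetical host and port; substitute your database listener):
#   samples = tcp_connect_latency_ms("db.example.internal", 1521)
#   print(f"min {min(samples):.2f} ms, max {max(samples):.2f} ms")
```

Run this from the Application Server being tested while the job is executing, so that the numbers reflect the network under load.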

Use the following guidelines to analyze test results:

  • If memory and CPU utilization are low on both your Application Server and Database Server, chances are the data is taking a long time to get across the network between both servers. Consider adding bandwidth between these two servers.
  • If memory and CPU utilization are high on your Application Server, chances are you could benefit from adding an application server to assist with executing jobs.
  • If memory and CPU utilization are high on your Database Server, chances are you could benefit from moving to a dedicated or higher-performance Database Server.
  • If SQL execution is slow, this may correlate directly with the ping latency from the application server to the database server. Note the following results from an existing BMC Server Automation implementation:

    Data Center    SQL Execution Time    Ping Latency to DB Server
    San Jose       ~150 ms               ~80 ms
    Minneapolis    ~3 ms                 ~0.1 ms
    Andover        ~3 ms                 ~0.1 ms
    Vienna         ~30 ms                ~15 ms
    UK             ~150 ms               ~80 ms
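The utilization guidelines above can be sketched as a small decision helper. The 80 percent threshold for "high" utilization is an assumption chosen for illustration, not a BMC recommendation, and the function name is hypothetical.

```python
def suggest_actions(app_cpu, app_mem, db_cpu, db_mem, high=80.0):
    """Map peak utilization percentages to the guidelines above.

    The `high` threshold is an illustrative assumption; tune it to your
    own environment.
    """
    app_high = max(app_cpu, app_mem) >= high
    db_high = max(db_cpu, db_mem) >= high

    actions = []
    if app_high:
        actions.append("consider adding an application server")
    if db_high:
        actions.append("consider a dedicated or higher-performance database server")
    if not actions:
        # Neither server is busy, so the network between them is the suspect.
        actions.append("check bandwidth between application server and database")
    return actions


# Hypothetical peak values captured while a job was running.
print(suggest_actions(app_cpu=35.0, app_mem=40.0, db_cpu=92.0, db_mem=70.0))
```

Feed it the peak values observed during the test run rather than idle averages, since the guidelines describe behavior while the job is executing.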

Server Automation Console Performance

When using the Console, all actions performed within the GUI are sent to the BMC Server Automation application server. Actions such as browsing the various workspaces, opening objects, or viewing job results all require a query to be sent from the Application Server to the BladeLogic Core Database. Several factors can cause slowness in each leg of a particular request.

Note

Only one Application Server is listed, as the Console will only connect to one Application Server at a time.

Tests to Perform

Launch the Console and run through the following actions. Use the sections below to help determine the root cause of any slowness issues you may experience. Record the amount of time each step takes.

  1. Browse the "All Servers" group.
  2. Browse the "All Components" group.
  3. Open a Component Template containing a large number of rules (such as the BladeLogic CIS or PCI Component Templates).

Client to Application Server

If the network connection between the Client and the Application Server is slow, there will be delays in the Console when performing any of these actions. When performing each of these tests, what is the network throughput between the Console and the Application Server it is connecting to?

Application Server to Database

When the Console sends a request to the Application Server for each of the above steps, the Application Server will send the request to the Database. When performing each of these tests, what is the network throughput between the Application Server and BMC Server Automation database?

Database Performance

If the Database takes a long time to respond to a request from the Application Server, end users will experience slowness when using the Console. When performing each of these tests, what is the response time of the database?
You may also want to try running a series of SQL queries against your database to help determine how quickly your database server can return results.
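A simple way to time queries is to run each one several times and record how long the full result set takes to arrive. The sketch below uses an in-memory SQLite database purely as a stand-in so that it runs anywhere; the table name and data are hypothetical, and against a real BMC Server Automation database you would substitute the Python driver for your database platform and queries appropriate to your schema.

```python
import sqlite3
import time


def time_query(conn, sql, repeats=5):
    """Run a query several times and return each elapsed time in milliseconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        cursor = conn.cursor()
        cursor.execute(sql)
        cursor.fetchall()  # include the time needed to fetch all rows
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


# Stand-in database with a hypothetical table, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_servers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO demo_servers (name) VALUES (?)",
    [(f"host{i}",) for i in range(1000)],
)

timings = time_query(conn, "SELECT COUNT(*) FROM demo_servers")
print(f"min {min(timings):.3f} ms, max {max(timings):.3f} ms")
```

Running the same queries from the Application Server and from the Database Server itself helps separate database response time from network delay.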

Reports Performance

Reports performance can suffer for several reasons. This can be due to slowness in your network, the reports server, or the reports database.

The diagram below shows the servers involved when browsing reports.

  1. The Web Browser makes the request to the Reports Server.
  2. The Reports Server queries the Database Server for the required information.
  3. The Database Server responds with the requested information.
  4. The data is presented back to the Web Browser.

Note

The Application Server is not shown as it is only used for authenticating the initial connection to the Reports Server.

Tests to Perform

Log in to BladeLogic Reports and run through the following actions. First run Populate Reports so that you have the most up-to-date data. Use the sections below to help determine the root cause of any slowness issues you may experience. Record the amount of time each step takes.

  1. Run the "Top Compliance Exposures" report, or any Patch Analysis results report
  2. Run the "Detailed Server Configuration" report

Client to Application Server

If the network connection between the Client and the Application Server is slow, there will be delays in the Console when performing any of these actions. When performing each of these tests,

Conclusion

In general, most clients find that the biggest bottleneck in the BMC Server Automation infrastructure is the Database. Without a powerful dedicated Database Server, you will have slow client response times and slow job execution times. However, bottlenecks may also be found in the application server, especially if you frequently execute jobs against a large number of targets. For the most part, by analyzing hardware utilization on each of your BMC Server Automation infrastructure servers and the throughput between them, you should be able to determine where improvements can be made to ensure a smooth-running environment.
