Identifying a problem before it impacts service is critical to outage avoidance. BMC uses its own tools to proactively monitor the performance of production instances. These tools include:

BMC Transaction Management Application Response Time (TMART) 

Our TMART deployment uses synthetic transactions to manage performance proactively, which frequently lets us resolve an issue before the customer is aware of it. Synthetic transaction monitoring runs approximately 2500 scripts every two minutes, collecting over two million data points per hour. This monitoring includes baseline measurement comparisons for login and logoff activities, a health check of the URL, and basic search and navigation checks.
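As an illustration of the baseline comparison described above, the following minimal sketch times each step of a synthetic script and flags it when it fails or runs slower than its baseline allows. It is not TMART's scripting interface: the names `StepResult`, `run_step`, and `run_script`, and the `tolerance` multiplier, are assumptions made for this example.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StepResult:
    name: str
    elapsed_s: float
    baseline_s: float
    ok: bool        # the step completed without raising
    degraded: bool  # the step failed or exceeded baseline * tolerance


def run_step(name: str, action: Callable[[], None], baseline_s: float,
             tolerance: float = 1.5,
             clock: Callable[[], float] = time.perf_counter) -> StepResult:
    """Time one step (for example login, search, logoff) against its baseline."""
    start = clock()
    try:
        action()
        ok = True
    except Exception:
        ok = False
    elapsed = clock() - start
    degraded = (not ok) or elapsed > baseline_s * tolerance
    return StepResult(name, elapsed, baseline_s, ok, degraded)


def run_script(steps: List[Tuple[str, Callable[[], None], float]],
               tolerance: float = 1.5,
               clock: Callable[[], float] = time.perf_counter) -> List[StepResult]:
    """Run every (name, action, baseline_s) step in order and collect results."""
    return [run_step(name, action, baseline, tolerance, clock)
            for name, action, baseline in steps]
```

In a real script each `action` would drive a login, URL health check, or search; here they are plain callables so the sketch stays self-contained. For scale, assuming each script runs once per cycle, 2500 scripts every two minutes is about 75,000 runs per hour, so two million data points per hour corresponds to roughly 27 data points per run.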

BMC Atrium Orchestrator 

BMC makes extensive use of its Atrium Orchestrator tool to restore service by automating common recovery tasks. This automation allows us to gather critical logging information for root cause analysis quickly while restoring service. Key benefits of this automation include future outage avoidance and a reduction in resolution times. Key features of the tool that assist in performance monitoring include:

  • Problem remediation supporting the application of known fixes to recurring events and incidents in problem management
  • Health checks and remediation used to determine the status of devices, agents, applications, and so on, and to automatically restart agents where and when needed
  • Disk space monitoring: a fully automated workflow that is triggered by a disk space event
  • Automated troubleshooting and service restoration
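The disk space item in the list above follows a detect, remediate, re-check pattern. The sketch below shows that pattern in generic Python. It does not use Atrium Orchestrator's workflow language or API: `check_disk`, `run_workflow`, the 90% threshold, and the `remediate` and `alert` callbacks are all hypothetical names chosen for illustration.

```python
import shutil
from typing import Callable, NamedTuple, Optional


class DiskEvent(NamedTuple):
    path: str
    used_fraction: float


def check_disk(path: str, threshold: float = 0.9,
               usage=shutil.disk_usage) -> Optional[DiskEvent]:
    """Return a DiskEvent when usage is at or above the threshold, else None."""
    stats = usage(path)  # named tuple with total, used, free (bytes)
    if stats.total == 0:
        return None
    used_fraction = stats.used / stats.total
    if used_fraction >= threshold:
        return DiskEvent(path, used_fraction)
    return None


def run_workflow(path: str,
                 remediate: Callable[[DiskEvent], None],
                 alert: Callable[[DiskEvent], None],
                 threshold: float = 0.9,
                 usage=shutil.disk_usage) -> str:
    """Remediate on a disk space event, then escalate if the fix was not enough."""
    event = check_disk(path, threshold, usage)
    if event is None:
        return "ok"
    remediate(event)  # assumed cleanup step, e.g. rotating logs
    if check_disk(path, threshold, usage) is None:
        return "remediated"
    alert(event)      # could not be resolved automatically
    return "escalated"
```

Passing `usage` as a parameter keeps the check testable without touching a real filesystem; by default it reads the live figures from `shutil.disk_usage`.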

BMC End User Experience Monitoring (EUEM) 

BMC uses EUEM in our data centers to trap selected traffic and analyze it for various types of latency and redirects. It reports latency broken down by Host, Network, SSL, e2e (end-to-end), Think, and Idle time, along with Number of Requests, Redirects, Transmission Failures, and many other useful statistics. We use this information to predict performance degradation as early as possible.

This tool is used on a case-by-case basis to trap traffic based on the following parameters:

  • Geographic location
  • Session
  • Page type
  • User ID
  • Object type

EUEM gives BMC the ability to drill down by customer, by user, or even by a particular session ID to detect and resolve issues. These views are also available to our customers through the i.onbmc.com support portal, so they can visualize traffic in real time along with us. We can see where the traffic is coming from and how many requests are streaming, along with an overall indicator of the user experience.
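To make the drill-down idea concrete, here is a small sketch that filters captured request samples by customer, user, or session and averages their latency components. The `RequestSample` record, its field names, and the `drill_down` and `summarize` helpers are assumptions for this example only; they do not reflect EUEM's actual data model or interface.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class RequestSample:
    # Hypothetical record for one captured request.
    customer: str
    user_id: str
    session_id: str
    location: str
    page_type: str
    host_ms: float
    network_ms: float
    e2e_ms: float


def drill_down(samples: Iterable[RequestSample], **filters: str) -> List[RequestSample]:
    """Keep only samples whose fields equal every given filter value."""
    return [s for s in samples
            if all(getattr(s, field) == value for field, value in filters.items())]


def summarize(samples: Iterable[RequestSample]) -> Dict[str, float]:
    """Count requests and average each latency component."""
    rows = list(samples)
    if not rows:
        return {"requests": 0}
    return {
        "requests": len(rows),
        "avg_host_ms": mean(s.host_ms for s in rows),
        "avg_network_ms": mean(s.network_ms for s in rows),
        "avg_e2e_ms": mean(s.e2e_ms for s in rows),
    }
```

For example, `summarize(drill_down(samples, customer="acme", session_id="s2"))` narrows the view from one customer to a single session before averaging.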


BMC TrueSight Operations Management 

BMC’s TrueSight Operations Management suite enables us to intelligently monitor and manage performance across the entire BMC Helix platform. It also enables real-time visibility into the health of underlying systems that provide services.

We currently monitor the following components in the BMC Helix platform:

System-level monitoring for OS:

  • Health-at-a-glance
  • Logical disks
  • Memory
  • Network protocols/TCP
  • Paging files
  • Physical disks
  • Processors
  • Processes
  • Services
  • Systems

System-level monitoring for databases:

  • Metrics for each database instance
  • Categories: Availability, Cache, Capacity, Disks, Locks, Memory, Network, Performance

Custom monitors:

  • Email processing health
  • Automatic remediation of issues via BMC Atrium Orchestrator
  • Alerting when issues cannot be automatically resolved

BMC TrueSight IT Data Analytics 

BMC uses this tool to collect about 100 GB of data per hour, providing access to millions of lines of logs that are searchable in seconds. BMC executes hundreds of automation workflows a day in our operations. Some of them focus on actively managing configuration drift that could otherwise lead to performance problems. We call these “closed-loop-compliance” jobs; they run automatically whenever drift is detected.
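A closed-loop-compliance job, as described above, can be thought of as three steps: detect drift from a desired configuration, correct it, and verify the result. The sketch below shows that loop under simple assumptions. Configurations are modeled as flat dictionaries, and `detect_drift`, `closed_loop`, and the callbacks are hypothetical names rather than part of any BMC tool.

```python
from typing import Any, Callable, Dict, Tuple

Drift = Dict[str, Tuple[Any, Any]]  # setting -> (expected, found)


def detect_drift(desired: Dict[str, Any], actual: Dict[str, Any]) -> Drift:
    """Report every desired setting whose actual value differs.

    A setting absent from `actual` is reported with a found value of None.
    Settings present only in `actual` are ignored in this sketch.
    """
    return {key: (expected, actual.get(key))
            for key, expected in desired.items()
            if actual.get(key) != expected}


def closed_loop(desired: Dict[str, Any],
                read_actual: Callable[[], Dict[str, Any]],
                apply_setting: Callable[[str, Any], None]) -> Tuple[Drift, Drift]:
    """Detect drift, correct each drifted setting, then re-check.

    Returns (drift that was found, drift still remaining after correction).
    """
    drift = detect_drift(desired, read_actual())
    for key, (expected, _found) in drift.items():
        apply_setting(key, expected)
    remaining = detect_drift(desired, read_actual())
    return drift, remaining
```

A non-empty `remaining` result means the automatic correction did not take effect and the job would need to escalate rather than retry silently.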

Additionally, the BMC Helix platform's performance correlates directly with the capacity and resiliency of its underlying databases. BMC uses the latest generation of scalable, high-performance hardware for its database servers. Databases are monitored using the TrueSight suite of tools.

BMC Server Automation 

To deliver the BMC Helix services in a repeatable, accurate, and secure fashion, BMC uses the Server Automation tool, which offers intelligent, policy-based compliance measurement for the BMC Helix platform. This tool is used for:

  • Full-stack service provisioning: deployment of end-to-end applications and services to physical and virtual systems
  • Compliance tracking: automated configuration management to ensure internal or regulatory standards are applied and consistently met
  • Drift management: quickly identifies any unapproved changes to systems and flags them for remediation
  • Security: makes sure the latest OS and software patches are automatically delivered in a timely fashion

In-application workflow

BMC also monitors application performance by executing a low-touch in-application workflow in the production environments at certain intervals. Executing the workflow provides BMC with performance metrics for the prescribed use cases and allows us to validate cross-functional data flow between BMC Helix services, providing insight into the overall health of customer systems. The workflow is designed and optimized so that it will:

  • not impact the performance of the system
  • not be visible to the customers' end users (except administrator users)
  • not have any license usage implication for your subscription services
  • not have any impact on data storage

BMC may add monitoring use cases from time to time as needed.