System monitoring

Identifying a problem before it impacts service is critical to avoiding outages. BMC has extensive monitoring capabilities in place and uses its own tools to proactively monitor the performance of production instances. These tools include:

BMC Helix Operations Management 

BMC Helix Operations Management enables us to intelligently monitor and manage performance across the entire BMC Helix platform. It also provides real-time visibility into the health of the underlying systems that provide services.

We currently monitor the following components in the BMC Helix platform:

System-level monitoring for OS

  • Health-at-a-glance
  • Logical disks
  • Memory
  • Network protocols/TCP
  • Paging files
  • Physical disks
  • Processors
  • Processes
  • Services
  • Systems

System-level monitoring for databases

  • Metrics for each database instance
  • Categories: Availability, Cache, Capacity, Disks, Locks, Memory, Network, Performance

Custom monitors

  • Email processing health
  • Automatic remediation of issues
  • Alerting when issues cannot be automatically resolved

For more information, see BMC Helix Operations Management overview.

BMC Helix Log Analytics 

BMC uses this tool to collect about 100 GB of data per hour, which makes millions of lines of logs searchable in seconds. We also execute hundreds of automation workflows a day in our operations. Some of them actively manage configuration drift that could otherwise lead to performance problems. We call these “closed-loop-compliance” jobs: they run automatically whenever drift is detected.
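
The closed-loop-compliance idea can be illustrated with a minimal sketch. It is not BMC's implementation: the function names (`detect_drift`, `remediate`) and the setting names are hypothetical, and it assumes a configuration can be represented as a flat dictionary in which only the keys listed in the baseline are managed.

```python
from typing import Any, Dict, Tuple


def detect_drift(baseline: Dict[str, Any],
                 actual: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {setting: (expected, found)} for each managed setting that drifted.

    A setting that is missing from `actual` is reported with found=None.
    Settings that are not in the baseline are treated as unmanaged and ignored.
    """
    return {
        key: (expected, actual.get(key))
        for key, expected in baseline.items()
        if key not in actual or actual[key] != expected
    }


def remediate(baseline: Dict[str, Any], actual: Dict[str, Any]) -> Dict[str, Any]:
    """Close the loop: return a copy of `actual` with drifted settings reset."""
    corrected = dict(actual)
    for key, (expected, _found) in detect_drift(baseline, actual).items():
        corrected[key] = expected
    return corrected


# Hypothetical settings, for illustration only.
baseline = {"max_connections": 200, "log_level": "INFO"}
actual = {"max_connections": 500, "log_level": "INFO", "owner": "team-a"}

drift = detect_drift(baseline, actual)
fixed = remediate(baseline, actual)
```

In this sketch the job does nothing when `detect_drift` returns an empty dictionary, so it can run on every detection cycle without side effects.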

Additionally, the BMC Helix platform's performance correlates directly with the capacity and resiliency of its underlying databases. BMC uses latest-generation, scalable, high-performance hardware for its database servers.

For more information, see BMC Helix Log Analytics overview.

BMC Helix Continuous Optimization 

BMC Helix Continuous Optimization collects and analyzes the capacity data and core metrics for CPU, memory, and storage, and provides recommendations for optimizing them. We use BMC Helix Continuous Optimization to achieve the following goals:

  • Understand resource and service capacity, utilization, and utilization trends, and track the usage and performance of infrastructure elements and services.
  • Provide service assurance by identifying and avoiding risks such as immediate or upcoming resource saturation.
  • Identify issues that affect efficiency (for example, idle VMs, overallocated VMs, and Power On/Off VMs) and resolve them.
  • Predict infrastructure needs, estimate whether the existing infrastructure is adequate to meet current and future demands, and plan accordingly for new infrastructure.
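
To make "upcoming resource saturation" concrete, the following sketch fits a straight line to daily utilization samples and estimates when that line crosses a threshold. It is an illustration under simple assumptions (one sample per day, linear growth, a hypothetical function name `days_until_saturation`), not a description of how BMC Helix Continuous Optimization computes its forecasts.

```python
from typing import Optional, Sequence


def days_until_saturation(samples: Sequence[float],
                          threshold: float = 90.0) -> Optional[float]:
    """Estimate days from the latest sample until utilization reaches `threshold`.

    `samples` holds one utilization percentage per day, oldest first.
    Returns None when there is too little data or the trend is flat or falling,
    and 0.0 when the fitted trend is already at or above the threshold.
    """
    n = len(samples)
    if n < 2:
        return None

    # Ordinary least-squares fit of utilization against the day index 0..n-1.
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    if slope <= 0:
        return None

    intercept = mean_y - slope * mean_x
    crossing_day = (threshold - intercept) / slope
    return max(0.0, crossing_day - (n - 1))


# Utilization grows by two points a day, so 90% is reached 16 days after day 4.
forecast = days_until_saturation([50.0, 52.0, 54.0, 56.0, 58.0])
```

A linear fit is the simplest possible trend model; seasonal or bursty workloads would need a more careful method.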

For more information, see BMC Helix Continuous Optimization overview.

BMC Helix Service Monitoring with AIOps

BMC Helix Service Monitoring with AIOps enables us to reduce the mean time to resolve (MTTR) issues and to maximize service performance and availability. We use this solution for monitoring, advanced anomaly detection, event management, and root cause isolation. We use BMC Helix Service Monitoring with AIOps to achieve the following goals:

  • Monitor and observe the key health vitals or parameters for services
  • Observe service performance and availability
  • Look for warning signs or identify causes of performance degradation
  • Understand the context where the problem occurred

For more information, see BMC Helix AIOps overview.

BMC PATROL Agent for BMC Helix Operations Management

With BMC PATROL Agent for BMC Helix Operations Management, we monitor the status of every vital resource in our environments, including operating systems, virtualization infrastructure, business applications, databases, and hardware devices. PATROL Agent enables the real-time streaming of performance data to BMC Helix Operations Management to generate dynamic baselines and support root cause analysis.
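
As a rough illustration of what a dynamic baseline means, the sketch below keeps a sliding window of recent samples and flags a new value that falls outside the mean plus or minus a multiple of the standard deviation. The class name `DynamicBaseline`, the window size, and the tolerance are assumptions chosen for the example; the product's actual baselining algorithm is not described here.

```python
from collections import deque
from statistics import mean, stdev


class DynamicBaseline:
    """Sliding-window baseline: a band of mean +/- tolerance * standard deviation."""

    def __init__(self, window: int = 30, tolerance: float = 3.0) -> None:
        self.history = deque(maxlen=window)  # oldest samples fall off automatically
        self.tolerance = tolerance

    def observe(self, value: float) -> bool:
        """Return True if `value` is outside the current band, then record it."""
        anomalous = False
        if len(self.history) >= 2:  # stdev needs at least two samples
            center = mean(self.history)
            spread = stdev(self.history)
            anomalous = abs(value - center) > self.tolerance * spread
        self.history.append(value)
        return anomalous


baseline = DynamicBaseline(window=30, tolerance=3.0)
normal_flags = [baseline.observe(v) for v in [10, 11, 9, 10, 11, 9, 10, 10]]
spike_flag = baseline.observe(50)
```

Because the band is recomputed from recent history, the threshold adapts as normal behavior shifts, which is the main difference from a fixed static threshold.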

For more information, see BMC PATROL Agent for BMC Helix Operations Management overview.

BMC Helix Discovery

We use BMC Helix Discovery to automate asset discovery: it identifies systems in the network and obtains relevant information from them as quickly as possible and with the lowest impact. BMC Helix Discovery provides the actionable insights that enable us to make informed decisions in IT service management, asset management, and infrastructure management. Configuration Management Database (CMDB) integration ensures that data in BMC Helix CMDB is continuously synchronized with information discovered by BMC Helix Discovery.

For more information, see BMC Helix Discovery overview.

BMC Helix Intelligent Integrations

BMC Helix Intelligent Integrations enables us to configure integrations with BMC and third-party products to get event, metric, and topology data from these products. We can view and derive actionable insights from this data in BMC Helix Service Monitoring with AIOps, BMC Helix Discovery, and BMC Helix Operations Management.

For more information, see BMC Helix Intelligent Integrations overview.

BMC Helix Dashboards

We use BMC Helix Dashboards to get a consolidated view of performance metrics, custom metrics, and events for various applications and infrastructure. This consolidated view helps us troubleshoot, identify anomaly patterns, and remediate issues in a timely manner. As a result, we can analyze and respond to issues quickly so that system downtime is minimized. With BMC Helix Dashboards, we can monitor the health status of all business and technical services in a single dashboard and identify impacted services in real time.

For more information, see BMC Helix Dashboards overview.

BMC Helix Developer Tools

BMC Helix Developer Tools uses a Fluentd-based framework for integration with BMC and third-party solutions to ingest data for monitoring in BMC Helix Operations Management, BMC Helix Discovery, and BMC Helix Service Monitoring with AIOps. We also use this product to build custom integrations for third-party solutions that have no out-of-the-box integration, and to collect logs from various sources such as Amazon Web Services, Kubernetes, and Windows- and Linux-based applications.

For more information, see BMC Helix Developer Tools overview.

BMC Helix Intelligent Automation

BMC Helix Intelligent Automation acts as an automation broker for connecting with BMC and third-party automation tools. It listens to the incoming events from various data sources, such as BMC Helix Operations Management and BMC Helix Service Monitoring with AIOps, and enables automation teams to quickly build automation policies to trigger remediation actions for the incoming events. We use BMC Helix Intelligent Automation with BMC Helix Operations Management, BMC Helix Service Monitoring with AIOps, and BMC Helix Continuous Optimization to support proactive, automated corrective action for issues identified by these solutions.

For more information, see BMC Helix Intelligent Automation overview.

BMC TrueSight Synthetic Monitor 

Synthetic transactions are used throughout our monitoring deployment to manage performance proactively, which frequently lets us resolve an issue before the customer is aware of it. Synthetic transaction monitoring, through use of the TrueSight App Visibility Manager, runs approximately 3500 scripts every two minutes, collecting over two million data points per hour. This monitoring includes baseline measurement comparisons for login and logoff activities, a health check of the URL, and basic search and navigation checks.
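
The general shape of a synthetic transaction check with a baseline comparison can be sketched as follows. This is a generic illustration, not a TrueSight script: `run_synthetic_check`, `CheckResult`, and the `slow_factor` parameter are hypothetical names, and the transaction is passed in as a plain callable so that the example stays self-contained.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class CheckResult:
    name: str
    ok: bool        # the scripted transaction completed and reported success
    seconds: float  # measured duration of the transaction
    slow: bool      # duration exceeded baseline_seconds * slow_factor


def run_synthetic_check(name: str,
                        transaction: Callable[[], bool],
                        baseline_seconds: float,
                        slow_factor: float = 2.0) -> CheckResult:
    """Run one scripted transaction, time it, and compare it with its baseline."""
    start = time.perf_counter()
    try:
        ok = bool(transaction())
    except Exception:
        # A script that raises counts as a failed transaction, not a crashed monitor.
        ok = False
    elapsed = time.perf_counter() - start
    return CheckResult(name, ok, elapsed, elapsed > baseline_seconds * slow_factor)


def failing_login() -> bool:
    """Stand-in for a login script whose target is unreachable."""
    raise ConnectionError("login endpoint unreachable")


healthy = run_synthetic_check("search", lambda: True, baseline_seconds=1.0)
broken = run_synthetic_check("login", failing_login, baseline_seconds=1.0)
```

In a real deployment the callable would drive a login, search, or navigation sequence against the monitored URL, and the results would be forwarded as metrics or events.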

BMC TrueSight Orchestration Platform

BMC makes extensive use of BMC TrueSight Orchestration Platform to restore service by automating common recovery tasks. This automation allows us to gather critical logging information for root cause analysis quickly while restoring service. Key benefits of this automation include future outage avoidance and a reduction in resolution times. Key features of the tool that assist in performance monitoring include:

  • Problem remediation supporting the application of known fixes to recurring events and incidents in problem management
  • Health checks and remediation used to determine the status of devices, agents, applications, and so on, and automatically restart agents where and when needed
  • Disk space monitoring, a fully automated workflow that is triggered by a disk space event
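
An event-triggered disk space workflow of the kind listed above might look like the following sketch. It does not use any TrueSight Orchestration API; `check_disk`, `handle_event`, the event dictionary layout, and the cleanup callback are all hypothetical. Only `shutil.disk_usage` is a real standard-library call.

```python
import shutil
from typing import Callable, Optional


def check_disk(path: str,
               threshold_percent: float = 90.0,
               usage_fn: Callable = shutil.disk_usage) -> Optional[dict]:
    """Return a disk space event if usage on `path` meets the threshold, else None."""
    total, used, _free = usage_fn(path)  # shutil.disk_usage yields (total, used, free)
    used_percent = used / total * 100
    if used_percent >= threshold_percent:
        return {"type": "disk_space", "path": path,
                "used_percent": round(used_percent, 1)}
    return None


def handle_event(event: Optional[dict], cleanup: Callable[[str], None]) -> str:
    """Run the remediation step only when an event was actually raised."""
    if event is None:
        return "no action"
    cleanup(event["path"])
    return "cleanup triggered for " + event["path"]


# Simulated usage figures keep the example independent of the local machine.
cleaned = []
event = check_disk("/var/log", usage_fn=lambda path: (100, 95, 5))
outcome = handle_event(event, cleanup=cleaned.append)
```

Passing the usage function and the cleanup step as parameters keeps the detection and remediation halves separately testable.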

Comments

  1. Pablo Di Genaro

    This system monitoring section is also applicable for Helix IPaaS subscriptions? Thanks!

    Jul 08, 2021 07:43
    1. Betty Xu

      Hi Pablo,

      The in-application workflow section applies to Helix iPaaS to a limited extent.  As a SaaS vendor, BMC has built-in monitoring for our Helix subscriptions, including calls to and receiving from Helix iPaaS.  It also stands true that the cross-functional data flow between our Helix services, including iPaaS, covers the four bullets.

      However, the Helix iPaaS service is ultimately provided by Jitterbit, we do not have as extensive monitoring coverage compared to our natively developed Helix subscriptions like Helix ITSM, Helix AIOps, etc. 

      That said, BMC and Jitterbit are strategic partners and collaborate closely together to ensure the service is optimally delivered to customers and anomalies are acted on, primarily according to Jitterbit's in-house monitoring policies, at a timely manner in unison with BMC. 

      May 10, 2022 08:57
  2. Andreas Petraschke

    Is BMC doing any monitoring for the non-prod environments?

    Oct 24, 2023 08:36
    1. Dhanya Menon

      Hello Andreas,

      Thank you for your query. 

      Yes, we do monitor the non-prod environments as well.

      Regards,

      Dhanya

      Dec 01, 2023 04:11