Database and application hygiene
BMC Helix ITSM and Service Management applications are dynamic and highly customizable services. These applications generate a huge volume of transient data and need periodic cleanup for optimum application performance and database size control. Therefore, BMC performs business-critical and proactive cleanups to support flexibility and function without sacrificing performance.
Database hygiene
A timeout strategy is essential for the stability and uptime of your service management applications. The termination strategy applies only to your primary database and covers:
- Long-running, idle transactions that cause extended lock contention and wait times
- Long-running queries that deplete critical resources and cause systemic problems
The following table shows the categories of long-running transactions and queries:
| Category | Scope of Timeout | Reason for Timeout |
|---|---|---|
| Runaway Queries | | Queries that run for longer than 5 minutes take memory, CPU, and other resources away from business-critical operations. These long-running queries also affect WAL disk space and can block Data Definition Language (DDL) operations. |
| Blocking Queries | | A transaction that blocks other queries or operations for more than 5 minutes typically has a low chance of completing, and it causes a high volume of lock contention for business-critical operations. |
| Long Transactions | | Long transactions are typically caused by the volume of records being modified. More locks accumulate over the course of processing, which often results in blocked end-user operations, high resource consumption, and WAL bloat. If not addressed promptly, this issue can destabilize your BMC Helix platform and might cause outages across your subscription services. A transaction that is idle for more than 10 minutes typically means that the operation is stuck in an indefinite loop, or is blocked, within the application code. If the locks are held, other database operations are blocked, which is likely to result in hanging threads, application unresponsiveness, probe failures, and pod restarts. |
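As a rough illustration of how the thresholds in the table could be applied, the following Python sketch classifies a snapshot of one database session. It is not BMC's implementation. The `SessionSnapshot` type, its field names, and the order of the checks are assumptions made for the example; only the 5-minute and 10-minute limits come from the table.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional

# Limits taken from the table above; everything else here is illustrative.
RUNAWAY_QUERY_LIMIT = timedelta(minutes=5)
BLOCKING_LIMIT = timedelta(minutes=5)
IDLE_TRANSACTION_LIMIT = timedelta(minutes=10)


@dataclass
class SessionSnapshot:
    """Hypothetical point-in-time view of one database session."""
    session_id: int
    state: str                # assumed values: "active" or "idle in transaction"
    query_age: timedelta      # how long the current query has been running
    idle_age: timedelta       # how long the open transaction has been idle
    blocking_age: timedelta   # how long this session has blocked others


def classify(session: SessionSnapshot) -> Optional[str]:
    """Return the timeout category the session falls into, or None."""
    # Blocking is checked first because it affects other sessions directly
    # (an assumption of this sketch, not a documented rule).
    if session.blocking_age > BLOCKING_LIMIT:
        return "Blocking Queries"
    if (session.state == "idle in transaction"
            and session.idle_age > IDLE_TRANSACTION_LIMIT):
        return "Long Transactions"
    if session.state == "active" and session.query_age > RUNAWAY_QUERY_LIMIT:
        return "Runaway Queries"
    return None
```

A session that has been running a query for 6 minutes without blocking anything would be classified as "Runaway Queries", while one that has been idle in a transaction for 9 minutes would not yet match any category.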
Application data hygiene
BMC Helix manages and performs regular cleanups for transient data generated by BMC Helix ITSM and other service management applications to maintain performance and compliance in accordance with best practices.
Transient data refers to short-lived, session-based, and non-persistent data that is temporarily stored for backend processing purposes and does not contain any transactional, sensitive, or critical data. Examples of transient data in your BMC Helix ITSM and service management applications:
Process Engine
- For each ticket that initiates a workflow process, a corresponding process context record is created in the Process Context. Once the process is completed, the record is transferred to the Historic Process Context. Once the ticket is closed, this data becomes redundant and serves no further purpose.
- As part of a ticket’s lifecycle, the process engine captures detailed records of all input and output variable states for each action within a process instance.
- When the data collection setting is disabled, no further data is collected from that point on.
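The process context lifecycle described above can be sketched as a small state model. This is a minimal sketch under stated assumptions: the `Store` enum, the `ProcessContextRecord` type, and both functions are hypothetical names, and the rule that a record is redundant only when it is already in the Historic Process Context and its ticket is closed is one reading of the text, not a documented rule.

```python
from dataclasses import dataclass
from enum import Enum


class Store(Enum):
    PROCESS_CONTEXT = "Process Context"
    HISTORIC_PROCESS_CONTEXT = "Historic Process Context"


@dataclass
class ProcessContextRecord:
    """Hypothetical record created when a ticket starts a workflow process."""
    ticket_id: str
    store: Store = Store.PROCESS_CONTEXT


def complete_process(record: ProcessContextRecord) -> None:
    """Move the record to the historic store once its process completes."""
    record.store = Store.HISTORIC_PROCESS_CONTEXT


def is_redundant(record: ProcessContextRecord, ticket_closed: bool) -> bool:
    """Assumed rule: historic records of closed tickets are redundant."""
    return record.store is Store.HISTORIC_PROCESS_CONTEXT and ticket_closed
```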
Email Engine
- For outgoing messages, emails scheduled for asynchronous delivery are stored temporarily. Once sent, the data becomes redundant and serves no further purpose.
- Error messages from failed email deliveries and messages queued for retry are retained temporarily. Once retried successfully, these records serve no purpose and should be cleaned up.
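To make the email case concrete, the sketch below selects outgoing messages that no longer serve a purpose. The status values, the `OutgoingEmail` type, and the `cleanup_candidates` function are assumptions for illustration only; they do not describe how the Email Engine stores its data.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class DeliveryStatus(Enum):
    QUEUED = "queued"                # scheduled for asynchronous delivery
    SENT = "sent"                    # delivered, possibly after a retry
    RETRY_PENDING = "retry_pending"  # failed and waiting for another attempt


@dataclass
class OutgoingEmail:
    """Hypothetical temporary record for one outgoing message."""
    message_id: str
    status: DeliveryStatus


def cleanup_candidates(emails: List[OutgoingEmail]) -> List[str]:
    """Return IDs of messages that were sent and are therefore redundant."""
    return [e.message_id for e in emails if e.status is DeliveryStatus.SENT]
```

Messages that are still queued or waiting for a retry are kept, because they are still needed for delivery.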
Other Backend Jobs
- CAI Events Table: Logs background asynchronous activities for ITSM and SRM.
- Long-Running API and SQL Execution Logs
- CMDB Job Runs and Status: Captures job execution history and current statuses for CMDB operations.
For the complete list, see "List of out-of-the-box forms and record definitions cleaned up by BMC" in the "Improving system performance by using the Data Cleanup utility" documentation.
Configuration and metadata hygiene checks
BMC Helix performs sanity and integrity checks on customer environments to proactively identify and correct potential risks so that your cloud service continues to operate optimally. These checks consist of application hygiene scripts that periodically detect anomalies in application configurations, metadata, or both.
BMC Helix sanity and integrity checks fall into the following categories, shown with examples:
- Centralized configuration system (CCS). Examples include:
  - Checking and enforcing URL consistency
  - Managing session timeouts across web apps
  - Managing the minimum and maximum number of threads across processes
  - Applying best practices to polling-related email configuration settings
  - Enforcing allow-listed run process commands
- Logging configurations. Examples include:
  - Maintaining log levels at ERROR level
  - Enforcing compliance for log file sizes across applications
  - Managing debug logs within the compliant retention period
- Metadata integrity for some of the out-of-the-box objects. Examples include:
  - Correcting mismatches of archive form fields
  - Enforcing consistency across Innovation Suite bundle deployment
  - Maintaining consistency for out-of-the-box form views
- Important system forms data. Examples include:
  - Maintaining compliance for out-of-the-box archive policies
  - Maintaining service accounts and internal user presence and permissions
  - Sanitizing data for CMDB job runs and history
- Activiti History logging. Examples include:
  - Disabling the Activiti History data collection setting, if it is enabled, during the next application hygiene check (checks run in 4-hour cycles)
  - Cleaning up ACT_HI_XXX data forms weekly on Saturdays (at or after 3:00 AM UTC), unless an exception is granted through a Support Case. Include your desired retention period in the Support Case.
The examples above are a subset of the ongoing checks performed.
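Two of the checks listed above lend themselves to a short illustration: detecting log levels that have drifted from ERROR, and working out the earliest start of the next weekly cleanup window (Saturday, 3:00 AM UTC). The following Python sketch shows one way this could be expressed. The function names and the shape of the `configs` mapping are hypothetical, and the actual hygiene scripts may work differently.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict

EXPECTED_LOG_LEVEL = "ERROR"  # from the logging configuration checks above


def find_log_level_drift(configs: Dict[str, str]) -> Dict[str, str]:
    """Return the components whose log level is not ERROR.

    `configs` maps a component name to its configured log level
    (a hypothetical structure used only for this example).
    """
    return {
        name: level
        for name, level in configs.items()
        if level.upper() != EXPECTED_LOG_LEVEL
    }


def next_weekly_cleanup(now: datetime) -> datetime:
    """Return the earliest Saturday 03:00 UTC at or after `now`.

    `now` must be timezone-aware. The real cleanup may start later
    than this time; this is only the earliest possible start.
    """
    now = now.astimezone(timezone.utc)
    days_ahead = (5 - now.weekday()) % 7  # Monday is 0, Saturday is 5
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=3, minute=0, second=0, microsecond=0
    )
    if candidate < now:
        # It is already Saturday and past 03:00 UTC, so use next week.
        candidate += timedelta(days=7)
    return candidate
```

For example, a mapping that contains one component at DEBUG and another at ERROR yields only the DEBUG entry, which a hygiene script could then reset.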