Data management and recovery strategies in RMGR
Database management systems
A database management system (DBMS) provides structures, processes, services, and tools for handling data safely, quickly, and efficiently.
It provides the underlying support for most data management strategies.
The Information Management System (IMS) product from IBM works in OS/390 and z/OS environments to provide unparalleled reliability and performance for large-scale data processing applications. Organizations all over the world depend on IMS to handle massive amounts of data, process huge volumes of transactions rapidly, and serve thousands of users simultaneously.
IMS provides a solid foundation on which you can build strategies for preserving integrity, improving availability, and conserving resources. Despite the general reliability of IMS, you still need to develop a dependable recovery strategy.
Recovery strategy
Although you need to consider and implement a wide variety of strategies to manage your data, your recovery strategy is probably the most important.
It ensures that problems do not permanently damage or destroy important data, that unplanned outages are short, and that problems (along with preparations for problems) do not consume an unreasonable amount of resources.
Implementing a recovery strategy is like taking out an insurance policy. Both cost resources to obtain, but they reduce the cost and pain of a problem and enable normal activities to resume as quickly as possible after a problem occurs.
Your operating system, IMS, and BMC products work together to support a robust recovery strategy.
Basic processes in a recovery strategy
A general recovery strategy usually involves the following basic processes:
- Making periodic copies (backups) of the data as it exists at successive points in time
- Recording (logging) all changes that have been made to the data over time and safeguarding those log records
- Recovering the data (restoring it to the condition it was in at a specific point in time) after a problem has occurred
- Managing the recovery strategy, including making decisions about when and how to perform the other processes, analyzing recovery situations, and ensuring that recovery is performed correctly
You can refine this overall strategy by employing tools and techniques to improve the reliability, usability, and performance of these processes and their related tasks while reducing resource usage.
Copy databases
You can use any of several techniques in an IMS environment to make database backups.
The most widely used technique is to perform an image copy, which makes a copy by reading the blocks of data in physical sequence and writing those blocks to an image copy data set.
To perform an image copy, you can use the BMC Image Copy utility or the IBM IMS Database Image Copy utility. The BMC Image Copy utility performs a variety of functions to meet various backup and recovery needs and offers outstanding performance, ease of use, and flexibility.
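The block-by-block nature of an image copy can be pictured with a short sketch. This is a toy model only: the function name `image_copy`, the in-memory streams, and the 4-byte block size are assumptions made for illustration, and the sketch does not represent how the BMC or IBM utilities are implemented.

```python
import io

def image_copy(source, target, block_size=4096):
    """Copy source to target one block at a time, in physical order.

    Returns the number of blocks written.
    """
    blocks = 0
    while True:
        block = source.read(block_size)
        if not block:  # end of the data
            break
        target.write(block)
        blocks += 1
    return blocks

# Toy "database" of three 4-byte blocks, held in memory (hypothetical data).
database = io.BytesIO(b"AAAABBBBCCCC")
copy = io.BytesIO()
blocks_copied = image_copy(database, copy, block_size=4)
```

Because the blocks are read in physical sequence, the copy does not need to interpret the logical structure of the data.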
Log database changes
IMS automatically records database update information and IMS message and control information about events that occur in the IMS environment.
If a problem occurs, log data can be used for backout (reversal) of uncommitted changes and for restoration (reapplication) of committed changes.
An active IMS system can generate a huge volume of log data. The change accumulation technique is commonly used to reduce the amount of resources needed to manage and store log data and to reduce the amount of time needed to process log data during recovery. This process accumulates an aggregate of changes that would be required for database recovery.
To perform change accumulation, you can use the BMC Change Accumulation utility or the IBM IMS Database Change Accumulation utility. The BMC Change Accumulation utility provides a wealth of features to perform change accumulation quickly and efficiently.
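The idea behind change accumulation can be sketched as follows. This is a simplified model, not the algorithm that either utility uses: it assumes that each log record carries the complete new contents of one block, so only the most recent record for each block is needed for recovery. The `LogRecord` layout and the database names are hypothetical.

```python
from collections import namedtuple

# Hypothetical log record: which block of which database changed, when,
# and the contents of the block after the change.
LogRecord = namedtuple("LogRecord", "timestamp database block after_image")

def accumulate_changes(log_records):
    """Keep only the most recent after-image for each (database, block)."""
    latest = {}
    for rec in sorted(log_records, key=lambda r: r.timestamp):
        # A later record for the same block replaces the earlier one.
        latest[(rec.database, rec.block)] = rec
    return sorted(latest.values(), key=lambda r: (r.database, r.block))

log = [
    LogRecord(1, "CUSTDB", 7, "v1"),
    LogRecord(2, "CUSTDB", 9, "v1"),
    LogRecord(3, "CUSTDB", 7, "v2"),
    LogRecord(4, "ORDERDB", 2, "v1"),
    LogRecord(5, "CUSTDB", 7, "v3"),
]
accumulated = accumulate_changes(log)
```

In this example, five log records shrink to three accumulated records, because block 7 of CUSTDB was updated three times and only its last update needs to be reapplied during recovery.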
Recover databases
When a problem damages or destroys a database, you must use the previously prepared resources to recover the database.
You can perform two types of recovery:
- You can use the backout recovery method, in which the log records are processed against the current database to reverse the changes. After a failure in an online IMS environment, this process is performed automatically to back out uncommitted changes (changes that have not yet been marked as complete). After a failure in a batch environment, you execute a batch utility to perform this process.
- You can use the roll-forward recovery method, in which the database is restored from a backup copy and the log records are used to reapply the updates. This process is required if the current database is damaged. You must choose the time to use as the recovery point. A recovery point is a point to which the data can be restored without compromising data integrity. This recovery point is often the current time, but it may instead be the time when an image copy was taken, the time when the database was marked as unavailable for updates, or any other time when you know that the data was correct and complete.
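The difference between the two recovery methods can be illustrated with a small sketch. It models a database as a dictionary and each log record as a `Change` that holds before and after values. All of these names are hypothetical, updates that delete data are not modeled, and the sketch is a conceptual illustration rather than a description of how IMS or any recovery utility works.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Change:
    timestamp: int
    key: str
    before: object  # value before the update (None if the key did not exist)
    after: object   # value after the update
    committed: bool

def backout(database, log):
    """Reverse uncommitted changes against the current database, newest first."""
    result = dict(database)
    for change in sorted(log, key=lambda c: c.timestamp, reverse=True):
        if change.committed:
            continue
        if change.before is None:
            result.pop(change.key, None)
        else:
            result[change.key] = change.before
    return result

def roll_forward(backup, backup_time, log, recovery_point):
    """Restore a backup, then reapply committed changes up to the recovery point."""
    result = dict(backup)
    for change in sorted(log, key=lambda c: c.timestamp):
        if change.committed and backup_time < change.timestamp <= recovery_point:
            result[change.key] = change.after
    return result

backup = {"A": 1, "B": 2}  # copy taken at time 0
log = [
    Change(1, "A", 1, 10, True),
    Change(2, "C", None, 30, True),
    Change(3, "B", 2, 20, False),  # still uncommitted when the failure occurred
]
current = {"A": 10, "B": 20, "C": 30}

after_backout = backout(current, log)
after_roll_forward = roll_forward(backup, 0, log, recovery_point=2)
at_time_1 = roll_forward(backup, 0, log, recovery_point=1)
```

Backout starts from the current database and works backward, so it is usable only when the current database is intact. Roll-forward starts from the backup copy and works forward, and choosing an earlier recovery point (as in `at_time_1`) stops the reapplication sooner.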
Because the data is unavailable during the recovery process, the recovery must complete as quickly as possible. To obtain minimum elapsed times, your recovery strategy should focus on reducing the amount of data to be processed, devoting the maximum amount of available resources (such as tape drives, sort space, and CPU cycles) to the recovery process, and using high-performance processing techniques.
Many problems, such as logical errors in application programs and storage device failures, can damage more than one database. Your recovery strategy should take into consideration the complications that are introduced when you must recover multiple databases simultaneously. You must also make plans for recovery after a disaster (such as a fire or storm) wipes out your data center.
To recover an IMS database, you can use the BMC Recovery utility or the IBM IMS Database Recovery utility. The BMC Recovery utility provides fast performance and numerous features, including concurrent recovery of multiple databases to any point in time with efficient use of available resources.
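When one problem damages several databases, they usually need to be restored to the same recovery point so that related data stays consistent. The following sketch shows one way to reason about that choice. It assumes that you already have a list of valid recovery points (as timestamps) for each database. The function name, the database names, and the timestamps are all hypothetical.

```python
def latest_common_recovery_point(points_by_database, not_after=None):
    """Return the latest timestamp that is a valid recovery point for
    every database, or None if no such timestamp exists."""
    if not points_by_database:
        return None
    common = set.intersection(*(set(p) for p in points_by_database.values()))
    if not_after is not None:
        # Ignore points later than the time the problem is known to have started.
        common = {t for t in common if t <= not_after}
    return max(common) if common else None

points = {
    "CUSTDB": [100, 200, 300, 400],
    "ORDERDB": [100, 300, 350],
    "ITEMDB": [200, 300, 400],
}
best = latest_common_recovery_point(points)
before_error = latest_common_recovery_point(points, not_after=150)
no_match = latest_common_recovery_point(points, not_after=50)
```

Here `best` is 300, the only timestamp shared by all three databases. With `not_after=150`, no shared point qualifies because ITEMDB has no recovery point at or before 150, so `before_error` is `None`. A result of `None` corresponds to the harder case in which no shared recovery point is available and another approach is needed.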
Manage the recovery strategy
Recovery strategies can be highly complex.
Preparing for a variety of recovery scenarios ensures that you can always meet service level agreements, regardless of inevitable problems. You are faced with difficult decisions that require carefully balancing the goals of data integrity, data availability, and resource conservation. Some complicated recovery situations are challenging to analyze and tricky to address.
The Database Recovery Control (DBRC) feature of IMS is central to any recovery strategy in an IMS environment. DBRC monitors and records information about the use of system log data sets. DBRC records information about databases that are registered and, in a share-control environment, ensures data integrity of the registered databases. All BMC products that work with IMS databases work with DBRC.
However, IMS, DBRC, and utilities do not provide all the help you need for managing your recovery strategy. For example, they do not address the following concerns:
- How frequently should you copy specific databases? At what times should you copy them? How can you ensure that copies are occurring as frequently as you have planned? How long should you keep the copies? What types of backups should you perform to provide for various recovery scenarios? How can you ensure that all databases are copied and that none have been omitted inadvertently?
- Should you use change accumulation? If so, how frequently should you perform it? How can you ensure that change accumulations are occurring as frequently as you have planned? How long should you keep log and change accumulation data sets?
- What caused the problem that damaged the database? What is the best method for recovering the database? How can you ensure that all elements required for the recovery are in usable condition? If the problem has affected multiple databases, how can you ensure that all of these databases are recovered to a consistent point? What can you do if no IMS-defined recovery points are valid in this situation? How can the recovery execute as quickly and efficiently as possible? How can you prepare for disaster recovery and test your plan?
The Recovery Manager functions and utilities help you answer these questions. They simplify the recovery management process and provide the tools you need to prepare for, analyze, and initiate a recovery in any situation.
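To make one of the questions above concrete, the following sketch checks whether every database has been copied as recently as planned and whether any database has been omitted. It is an illustration only and does not represent how Recovery Manager performs this kind of check. The function name, the database names, the dates, and the one-day interval are assumptions.

```python
from datetime import datetime, timedelta

def find_copy_gaps(registered, last_copy, max_age, now):
    """Return (never_copied, overdue) lists of database names, sorted."""
    never_copied = sorted(db for db in registered if db not in last_copy)
    overdue = sorted(
        db for db in registered
        if db in last_copy and now - last_copy[db] > max_age
    )
    return never_copied, overdue

now = datetime(2024, 1, 10, 12, 0)
registered = ["CUSTDB", "ORDERDB", "ITEMDB"]
last_copy = {
    "CUSTDB": datetime(2024, 1, 10, 2, 0),  # copied 10 hours before "now"
    "ORDERDB": datetime(2024, 1, 8, 2, 0),  # copied more than two days before "now"
}
never_copied, overdue = find_copy_gaps(
    registered, last_copy, timedelta(days=1), now
)
```

In this example, ITEMDB is reported as never copied and ORDERDB as overdue, while CUSTDB is within the planned interval.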