Over time, the database files within the datastore can become fragmented: the data within them is structured inefficiently, so the files take up an unnecessarily large amount of disk space and data access can slow down. The tw_ds_compact utility compacts the datastore by writing new copies of the database files. As it creates the new files, the data is packed more efficiently, reclaiming disk space lost to fragmentation.
In a system that is not multi-generational, tw_ds_compact must be used in offline mode, meaning that the tideway services must be stopped while it runs. In a multi-generational datastore, tw_ds_compact can also run in online mode, where old generations are compacted while the system is running.
Compacting the datastore can take a long time; for a large datastore, it can take many hours. Once a compaction starts, you should not interrupt it. To prevent a lost terminal connection from interrupting the compaction, run tw_ds_compact inside the screen utility, which is installed on the appliance. The user example below shows how to do this.
To use the utility in offline mode on a cluster, ideally you should stop the tideway services on all machines in the cluster, run tw_ds_compact on each machine simultaneously, and then restart the tideway services when all compactions have completed. This method compacts all machines at the same time, so the elapsed time is minimized. If you have a cluster of three or more machines with fault tolerance enabled and want to minimize downtime, you could compact the cluster members one at a time, leaving the other members running. However, it is generally preferable to use a multi-generational datastore to achieve compaction with no downtime.
To use the utility, type the following command:
where options are any of the options described in the following table and the common command line options described in Using command line utilities.
Because the utility touches all of the data, you must create a backup of the datastore before running it.
Command Line Option
Lower the process priority.
Data storage directory.
Database cache size in MB.
Database transaction log directory.
In offline mode, delete the specified generation.
Fix the databases after compaction is interrupted.
Do not check for free space or ask any questions.
Purge history entries older than this.
Purge relationship history older than this.
In offline mode, keep old files.
How many generations to keep uncompacted.
Maximum generation number to handle.
Process the smallest files first, risking running out of space part way through.
Number of threads.
See Using a multi-generational datastore for
The following example shows backing up a datastore and then performing an offline compaction using the
Back up the datastore
Back up the datastore using the appliance backup tool.
Compact the datastore
- Log in to the appliance command line interface as the tideway user.
- Run the screen utility. Enter:
- Stop the tideway services.
- Run the tw_ds_compact utility.
  The utility checks that the tideway services have been stopped and that there is sufficient disk space to continue. It then reports the data directory, the largest database file, and the free space available. Before proceeding, you must confirm that you have made a backup of the datastore.
- Enter yes to confirm that you have made a backup of the datastore.
  You are then prompted to confirm that you want to start the compaction.
- Enter yes to start the compaction.
  The utility starts and reports progress until completion. Do not interrupt the process.
- Start the tideway services. Enter the following command:
Recovering from a lost connection using screen
If you lose the connection to the appliance and you have used screen, you can reconnect to the appliance and recover the virtual terminal running the compaction. To do this:
- Reconnect to the appliance and log in as the tideway user.
- List the current screen sessions. Enter:
- If there is only one screen listed, you can re-attach to it with a simple command:
- If there is more than one, copy the screen identifier:
  The virtual terminal is recovered and you can see how the compaction is progressing until completion.
- Start the tideway services. Enter the following command:
Recovering from an interrupted compaction
If the compaction has been interrupted in some way, then the database files are left in a partial state and the datastore cannot run. You can recover from this situation in the following ways:
- Perform the compaction using the tw_ds_compact utility again and allow it to complete without interruption. See the procedure above.
- Run the tw_ds_compact utility again with the --fix-interrupted option. This fixes the datastore but does not perform any more compaction.
Full file systems – processing smallest files first
Compaction writes a new copy of the database files, meaning that there must be sufficient free disk space to store the new files during the operation.
In online compaction, a new copy of all the files must be created while the running system is still using the old files, meaning that you cannot perform an online compaction if the database disk is more than 50% full.
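The 50% rule above can be checked before attempting an online compaction. The following is a minimal Python sketch, not part of tw_ds_compact: the function names are hypothetical, and the path you pass to check_path is whatever file system holds your database files. It only tests the condition stated in the text and makes no claim about the checks the utility itself performs.

```python
import shutil


def online_compaction_feasible(total_bytes: int, used_bytes: int) -> bool:
    """Return True if the disk is at most 50% full.

    This mirrors the rule in the text: online compaction needs room for a
    second copy of all the files, so a disk more than half full rules it out.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    # Integer comparison avoids floating-point rounding at the boundary.
    return used_bytes * 2 <= total_bytes


def check_path(path: str) -> bool:
    """Apply the 50% rule to the file system containing `path`.

    `path` is an assumption: supply the directory that holds the
    database files on your appliance.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return online_compaction_feasible(usage.total, usage.used)
```

Passing this check is necessary but not sufficient: other data on the same disk also counts towards the used space, so treat a result close to the boundary with caution.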
In offline compaction, each set of database files is processed individually, meaning that compaction only requires enough free space to store the current files in progress. By default, to minimize the risk of running out of space part way through the operation, tw_ds_compact compacts the largest set of files first, because if the largest files fit, the smaller ones are guaranteed to fit as well.
If there is insufficient free space to compact the largest files first, it may work to process the smallest files first, in the hope that compacting them frees enough space for the larger files by the time they are handled. This is a risky option: even after the smaller files have been processed, there may still be insufficient free space for the largest files, in which case the compaction fails.
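The trade-off between the two orderings can be illustrated with a small simulation. This is a simplified model written in Python, not the utility's actual algorithm. It assumes a single thread, that a set of files can only be started when the free space is at least its current size, and that the old files are removed once each set is done. The sizes and the simulate function are hypothetical.

```python
from typing import List, Tuple

# (current_size, compacted_size) for one set of database files, in any
# consistent unit. The compacted sizes are illustrative guesses; in
# practice they are not known before the compaction runs.
FileSet = Tuple[int, int]


def simulate(file_sets: List[FileSet], free: int,
             smallest_first: bool) -> Tuple[bool, int]:
    """Return (succeeded, number_of_sets_completed) for one ordering."""
    order = sorted(file_sets, key=lambda fs: fs[0],
                   reverse=not smallest_first)
    for done, (current, compacted) in enumerate(order):
        # Conservative assumption: the new copy may need as much space
        # as the files it replaces.
        if free < current:
            return False, done
        # The old files are removed, so the net gain is the difference.
        free += current - compacted
    return True, len(order)


sets = [(100, 60), (40, 20), (30, 10)]
print(simulate(sets, free=70, smallest_first=False))  # (False, 0)
print(simulate(sets, free=70, smallest_first=True))   # (True, 3)
print(simulate(sets, free=50, smallest_first=True))   # (False, 2)
```

With 70 units free, largest-first cannot start at all, while smallest-first frees enough space along the way to finish. With only 50 units free, smallest-first completes two sets and then fails on the largest, which is the failure mode described above.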
Another factor to bear in mind if disk space is very limited is that, by default, tw_ds_compact compacts multiple sets of files in parallel using multiple threads. This makes the operation faster, but requires more disk space because multiple sets of files are handled simultaneously. To attempt compaction with a minimum amount of free space, you should therefore process the smallest files first and limit the number of threads, using the corresponding options in the table above.
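To see why the thread count matters, the following Python sketch gives a rough upper bound on the free space needed when several sets of files are in progress at once. It assumes that each new copy needs at most as much space as the files it replaces and that no space is reclaimed until a set completes. The function name and sizes are hypothetical, and the real utility may schedule its work differently.

```python
from typing import Iterable


def peak_space_needed(sizes: Iterable[int], threads: int) -> int:
    """Upper bound on the space needed with `threads` sets in progress.

    The worst case is that the `threads` largest sets are all being
    compacted at the same moment.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    largest = sorted(sizes, reverse=True)[:threads]
    return sum(largest)


sizes = [100, 40, 30]
print(peak_space_needed(sizes, threads=1))  # 100
print(peak_space_needed(sizes, threads=2))  # 140
print(peak_space_needed(sizes, threads=3))  # 170
```

Under these assumptions, a single thread needs only enough room for the largest set, which is why limiting the number of threads helps when space is tight.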