Archiving data
What is archived
All data whose index blocks are configured for archiving is archived once it is scheduled to be purged.
If you have some data collectors that you don't want to be archived, simply set the data retention period override for those data collectors to a number of days that is less than the default data retention period. This will ensure that the data from those data collectors is purged before it is archived.
Prerequisites for archiving data
Before you archive data, ensure the following:
- The path where the data is to be archived exists on the Indexers.
- The user who installed the Indexer has write permission on that path.
Java heap requirements for a Restore node
Restore nodes need enough Java heap to support the amount of data that you are restoring. If your Live nodes have a 14-day retention period, the Restore node should have the same amount of Java heap as a Live node and should never restore more than 14 days of data. To restore 28 days of data, you must double the Java heap on the Restore node compared with the original Indexer node. If the node already has the maximum Java heap of 30 GB, you must add more Indexer nodes to the restore Indexer cluster to support larger restores.
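The sizing rule above can be read as simple proportional arithmetic. The following sketch is an illustration only: it assumes that heap needs scale linearly with the number of days restored (the text gives only the 14-day and 28-day data points) and that 30 GB is the per-node heap ceiling. The function name, its parameters, and the node-count estimate are hypothetical and not part of the product.

```python
import math

MAX_HEAP_GB = 30  # maximum Java heap per node, as stated above


def restore_sizing(live_heap_gb, live_retention_days, restore_days):
    """Estimate restore-side heap and node count, assuming linear scaling."""
    # Restoring N times the live retention period needs roughly N times the heap.
    factor = max(1.0, restore_days / live_retention_days)
    needed_heap_gb = live_heap_gb * factor
    # Assumption: once one node would exceed the ceiling, spread across more nodes.
    nodes = math.ceil(needed_heap_gb / MAX_HEAP_GB)
    return needed_heap_gb, nodes


print(restore_sizing(16, 14, 28))  # (32.0, 2)
```

With a 16 GB Live node and a 14-day retention period, restoring 28 days doubles the estimate to 32 GB, which exceeds the 30 GB ceiling, so the sketch suggests a second node.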
To enable archiving
- From Administration > System Settings, select the Data Archive Settings tab.
- Turn on the Enable Archive switch.
When the Enable Archive switch is turned on, an option to add an archival path appears. Enter the path where you want the data to be archived. You can add any number of archival paths, but only one path can be active at a time.
- Select the option in front of a path to set it as the active path.
- Click Apply.
Clicking Apply restarts all Indexers.
You must also turn on archiving for the index block associated with the data collector from Administration > System Settings > Index Block Settings.
You can turn archiving on or off for an index block from there. However, the settings in Administration > System Settings > Data Archive Settings override the index block settings: if you have not enabled archiving on the Data Archive Settings tab, you cannot enable it for an index block.
For more information, see Modifying index blocks.
The archive process
The following diagram explains the archive process.
Snapshots of data are taken daily and archived before the data is purged. Snapshots are sent to the archive after the retention period (7 days by default). To restore snapshots to the Restore Node, run the restoresnapshots CLI command. For information on where to find snapshots, see Finding snapshots. For information on restoring archived data, see Restoring archived data. Restored data is automatically deleted from the Restore Node after the retention days that you specify or, if you specify none, after the default of 2 days. For information on changing the default retention days, see Changing the default retention days.
Finding snapshots
You can find snapshots in the paths that you have set in the Data Archive Settings. The latest snapshots can be found in the active path. The snapshots are taken daily and arranged in month-wise folders. The format of the name of the folder containing a month's snapshot is as follows:
repo_Year_Month_Epoch_time. For example, repo_2018_Jan_1515052753098. This means that the snapshots are from the month of January 2018 at the epoch time 1515052753098.
The snapshots are located inside these month-wise folders.
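As an illustration of the folder naming format, the following sketch splits a folder name into its parts. It assumes that the 13-digit epoch value is expressed in milliseconds, which the text does not state explicitly; the function name is hypothetical.

```python
from datetime import datetime, timezone


def parse_repo_folder(name):
    """Split a folder name of the form repo_Year_Month_Epoch_time."""
    prefix, year, month, epoch = name.split("_")
    if prefix != "repo":
        raise ValueError(f"unexpected folder name: {name}")
    # Assumption: the epoch value is in milliseconds; sub-second precision is dropped.
    created = datetime.fromtimestamp(int(epoch) // 1000, tz=timezone.utc)
    return int(year), month, created


year, month, created = parse_repo_folder("repo_2018_Jan_1515052753098")
print(year, month, created.isoformat())  # 2018 Jan 2018-01-04T07:59:13+00:00
```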
Restoring archived data
If you want to search or analyze data that has been archived, you must restore it. To restore snapshots back into the system, you must set up a Restore Indexer node. This is an Indexer cluster node that is set aside for housing restored snapshots. This node is a member of the cluster of Indexers, but it does not participate in replicating data to the other nodes in the cluster. It is dedicated to restoring archived data to make it live again.
After you have set up an Indexer as your restore node, follow these steps to restore snapshots:
- Run the showdataavailability CLI command with the archive option to get details of the archived data. This lets you confirm that the snapshots you want to restore are in the archive.
- Run the restoresnapshots CLI command, giving the start date and end date of the period for which you want to restore data.
If you are restoring a sizable amount of data, the restore may take several minutes or longer. You can monitor the progress of the ongoing restore operation by using the status option of this CLI command.
- Run the showdataavailability CLI command with the restore option to see the details of the restore.
- Search the data using the TrueSight IT Data Analytics console to verify that the data has been restored.
The number of days that restored data remains on the restore node before it is deleted is called the retention days, counted from the date of the restore. By default, restored data is retained for 2 days, after which it is automatically deleted. To choose how long the data remains on the restore node, run the restoresnapshots CLI command with the retentionDays option.
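The retention rule can be sketched as date arithmetic. The example below is an illustration only: it assumes whole-day granularity and does not model the exact time of day at which the product deletes data, which the text does not specify. The function name is hypothetical.

```python
from datetime import date, timedelta

DEFAULT_RETENTION_DAYS = 2  # default stated above


def restored_data_expiry(restore_date, retention_days=DEFAULT_RETENTION_DAYS):
    """Return the date after which restored data is eligible for deletion."""
    # Retention days are counted from the date of the restore.
    return restore_date + timedelta(days=retention_days)


print(restored_data_expiry(date(2018, 1, 10)))     # 2018-01-12
print(restored_data_expiry(date(2018, 1, 10), 5))  # 2018-01-15
```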
Deleting restored data
All restored data that is older than the retention days is deleted. If you don't specify retention days, the default of 2 days applies.
To delete data restored from the archive for a selected period, run the deleterestoreddata CLI command. For more information, see deleterestoreddata CLI command.
Changing the default retention days
To change the default retention days, do the following:
- Navigate to the following location to locate the searchserviceCustomConfig.properties file.
- Windows: %BMC_ITDA_HOME%\custom\conf\server
- Linux: $BMC_ITDA_HOME/custom/conf/server
- In the searchserviceCustomConfig.properties file, uncomment and change the following property:
restore.data.retention.in.days
For example, to change the retention period for restored data from 2 days to 3 days, uncomment the property and set it to 3.
Example:
restore.data.retention.in.days=3
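If you want to script this change, the following sketch shows one possible approach. It works on the file contents as a string and assumes that the property is commented out with a leading # character, which is the usual convention for .properties files but is not confirmed here. The function name is hypothetical; reading and writing the file at the location above is left to you.

```python
import re

PROPERTY = "restore.data.retention.in.days"


def set_retention_days(text, days):
    """Uncomment and set the retention property in the file contents."""
    # Match the property line whether or not it is commented out with '#'.
    pattern = re.compile(
        r"^[ \t]*#?[ \t]*" + re.escape(PROPERTY) + r"[ \t]*=.*$", re.MULTILINE
    )
    new_line = f"{PROPERTY}={days}"
    if pattern.search(text):
        return pattern.sub(new_line, text, count=1)
    # Property not present at all: append it on its own line.
    sep = "" if text.endswith("\n") or not text else "\n"
    return text + sep + new_line + "\n"


sample = "# other.setting=1\n#restore.data.retention.in.days=2\n"
print(set_retention_days(sample, 3))
```

Back up the properties file before you overwrite it with the modified contents.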
Self-health monitoring and troubleshooting
TrueSight IT Data Analytics generates events for archiving and displays them in the destination that you set so that you can gain an early insight into likely issues. For information, see Self-health monitoring events generated for archiving.
For information on troubleshooting archiving-related issues, see Troubleshooting archiving-related issues.