Masking sensitive data
The View Object page of a Discovered Process shows the command used to start the process. In some cases, that command contains a user name and password, or other sensitive data, in clear text. You can also view the contents of a discovered file, which can likewise contain passwords or other sensitive data. You can prevent such data from being displayed by using sensitive data filters.
Sensitive data filters mask information only in the discovered process or file, not in other data such as package names.
A sensitive data filter is a regular expression that defines data that you do not want displayed. When the expression matches, the sensitive portion of the data is hashed using MD5. The hashed data can be compared with earlier versions to determine whether it has changed, while the actual data remains hidden from users.
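The change-detection property described above can be illustrated with a minimal Python sketch. It assumes a plain MD5 digest of the UTF-8 text; the function names `fingerprint` and `has_changed` are hypothetical and are not part of the product.

```python
import hashlib


def fingerprint(secret: str) -> str:
    """Return the MD5 hex digest of a value (assumed: plain MD5 of UTF-8 text)."""
    return hashlib.md5(secret.encode("utf-8")).hexdigest()


def has_changed(previous_digest: str, current_secret: str) -> bool:
    """Compare a stored digest with the digest of the current value."""
    return fingerprint(current_secret) != previous_digest


# Only the digest needs to be stored; the clear text is never kept.
stored = fingerprint("old-secret")
```

Two digests differ when the underlying values differ, so a change is visible even though neither value is shown.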
Sensitive data filters use MD5 hash
Common passwords and dictionary words can be recovered from MD5 hashes using commonly available tools. If you rely on sensitive data filters to mask passwords entirely, ensure that any passwords that may appear in discovered data are strong.
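To show why weak passwords are exposed, the following Python sketch hashes each word in a small list and compares it with a target digest. It assumes the digest is a plain, unsalted MD5 of the value; the word list and the function names are hypothetical.

```python
import hashlib
from typing import Iterable, Optional


def md5_hex(text: str) -> str:
    """MD5 hex digest of a string (assumed: unsalted, UTF-8 encoded)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def dictionary_lookup(target_digest: str, candidates: Iterable[str]) -> Optional[str]:
    """Return the candidate whose digest equals the target, or None."""
    for word in candidates:
        if md5_hex(word) == target_digest:
            return word
    return None


# Hypothetical word list standing in for a real dictionary.
wordlist = ["password", "letmein", "admin", "welcome"]

weak_digest = md5_hex("letmein")
strong_digest = md5_hex("T7#qv!9zLp@2xWc$")
```

A password that appears in the list is found immediately, while a long random password is not, which is why strong passwords matter here.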
Managing Sensitive Data Filters
- From the Discovery section of the Administration tab, select Sensitive Data Filters. The Sensitive Data Filters window is displayed with the Processes tab visible.
- To view or edit filters for files, click the Files tab.
- To edit an existing filter, click Edit.
- To delete an existing filter, click Delete.
- To add a new filter, click Add .... A new field is added into which you can enter a regular expression.
- To create the filter, click Apply.
To reorder sensitive data filters, click the up or down arrow in the ordering column for the filter you want to move. You can also move a filter to the top or bottom of the list using the top or bottom arrow buttons.
Creating a Sensitive Data Filter
The regular expression usually matches more than just the sensitive data; for instance, it may include an identifying argument name such as "-password". The portion of data to be hashed must be enclosed in parentheses (round brackets) to form a regular expression group. Portions of the match that are not enclosed in the parentheses are left unmodified.
Consider a command that passes a password in clear text after "--password". The regular expression needs to use "--password" to locate the data, and define how much to mask around it.
- This regular expression adds "\S+" to identify a sequence of one or more non-whitespace characters, making a regular expression of "--password \S+". Parentheses are then added to define the portion that requires masking, making "--password (\S+)".
After rediscovery, the new process node will have the password portion replaced with an MD5 hash.
- For more resilience against extra whitespace, replace the single space in the regular expression with "\s+", which matches one or more whitespace characters. This gives "--password\s+(\S+)", which is the form that most sensitive data filters take.
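The effect of the "--password\s+(\S+)" filter can be tried outside the product with a short Python sketch. This is an approximation under stated assumptions: the command line is hypothetical, and replacing the group with a bare MD5 hex digest is an assumption about the output format, which the product may render differently.

```python
import hashlib
import re

# The filter from the text; group 1 is the portion to hide.
FILTER = re.compile(r"--password\s+(\S+)")


def mask(command: str, pattern: re.Pattern = FILTER) -> str:
    """Replace group 1 of every match with its MD5 hex digest.

    Text matched outside the group (here "--password" and the
    whitespace) is left unmodified.
    """
    def _replace(m: re.Match) -> str:
        if m.group(1) is None:
            return m.group(0)
        whole = m.group(0)
        start = m.start(1) - m.start()
        end = m.end(1) - m.start()
        digest = hashlib.md5(m.group(1).encode("utf-8")).hexdigest()
        return whole[:start] + digest + whole[end:]

    return pattern.sub(_replace, command)


# Hypothetical command line with extra whitespace before the password.
command = "/opt/app/bin/server --user admin --password   s3cret --port 8080"
masked = mask(command)
```

Only the password is replaced; the argument name, the surrounding whitespace, and the rest of the command are preserved.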
Notes on Sensitive Data Filters
- When writing regular expressions for sensitive data filters, ensure that they do not match too much of the command. If a filter masks part of the command that a pattern uses to identify a piece of running software, that pattern will no longer be able to identify the software. See Writing efficient regular expressions for more information.
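The over-matching risk can be checked before a filter is deployed. The Python sketch below compares a narrow filter with an overly broad one against a hypothetical command and a hypothetical identifying expression; the masking helper approximates the behavior described in this section and is not the product's implementation.

```python
import hashlib
import re


def mask(text: str, filter_regex: str) -> str:
    """Replace group 1 of each match with its MD5 hex digest."""
    def _replace(m: re.Match) -> str:
        start = m.start(1) - m.start()
        end = m.end(1) - m.start()
        digest = hashlib.md5(m.group(1).encode("utf-8")).hexdigest()
        return m.group(0)[:start] + digest + m.group(0)[end:]

    return re.sub(filter_regex, _replace, text)


# Hypothetical command and identifying expression.
command = "java -jar /opt/acme/acme-server.jar --password s3cret"
identifying = re.compile(r"acme-server\.jar")

narrow = r"--password\s+(\S+)"            # masks only the password
too_broad = r"(\S+\s+--password\s+\S+)"   # also masks the preceding argument

still_identified_narrow = bool(identifying.search(mask(command, narrow)))
still_identified_broad = bool(identifying.search(mask(command, too_broad)))
```

With the narrow filter the identifying expression still matches; with the broad filter the jar path is hashed along with the password and the match is lost.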
- The filters are not applied to the inferred data model until you perform a discovery run. Sensitive data discovered before applying a filter will remain in the history and DDD until it is aged.
- If filters are applied to files, the files must remain valid after masking. For example, if a filter is applied to an XML file, the XML must remain valid; otherwise XPath processing will not work.
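A filter intended for XML files can be tested by masking a sample and parsing the result. The Python sketch below uses a hypothetical XML fragment and hypothetical filters, and checks well-formedness with the standard library parser as a stand-in for whatever validation the XPath processing requires.

```python
import hashlib
import re
import xml.etree.ElementTree as ET


def mask(text: str, filter_regex: str) -> str:
    """Replace group 1 of each match with its MD5 hex digest."""
    def _replace(m: re.Match) -> str:
        start = m.start(1) - m.start()
        end = m.end(1) - m.start()
        digest = hashlib.md5(m.group(1).encode("utf-8")).hexdigest()
        return m.group(0)[:start] + digest + m.group(0)[end:]

    return re.sub(filter_regex, _replace, text)


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical file content.
xml_text = "<config><user>admin</user><password>s3cret</password></config>"

safe_filter = r"<password>([^<]*)</password>"     # group covers only the value
unsafe_filter = r"(<password>[^<]*)</password>"   # group swallows the opening tag

safe_result = mask(xml_text, safe_filter)
unsafe_result = mask(xml_text, unsafe_filter)
```

Keeping the group inside the element's text content leaves the tags intact, whereas a group that includes markup produces a file that no longer parses.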