Masking sensitive data
The View Object page of a discovered process shows the command used to start the process. In some cases, a user name and password, or some other sensitive data, is shown in clear text. You can also view the contents of a discovered file, and in some cases these, too, can contain passwords or other sensitive data. You can prevent this issue by using sensitive data filters.
Sensitive data filters mask information only in the discovered process or file; they do not mask other discovered data such as package names.
A sensitive data filter is a regular expression that defines data that you do not want displayed. When the expression matches, the sensitive portion of the data is hashed using MD5. The hashed data can be compared with earlier versions to determine whether it has changed, while the actual data remains hidden from users.
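The change-detection property described above can be sketched in a few lines of Python. This is an illustration only: the helper name md5_hex and the sample values are hypothetical, and it assumes the value is hashed as plain UTF-8 text, which the product documentation does not specify.

```python
import hashlib


def md5_hex(value: str) -> str:
    """Return the MD5 digest of a string as 32 hexadecimal characters."""
    # Assumption: the value is hashed as UTF-8 bytes with no salt.
    return hashlib.md5(value.encode("utf-8")).hexdigest()


# Hypothetical values seen on successive discovery runs.
earlier = md5_hex("s3cret-2023")
later_same = md5_hex("s3cret-2023")
later_changed = md5_hex("s3cret-2024")

print(earlier == later_same)     # True: the hidden value has not changed
print(earlier == later_changed)  # False: the hidden value has changed
```

Only the digests are compared, so a change can be detected without the clear-text value ever being displayed.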
Sensitive data filters use MD5 hash
Common passwords and dictionary words can be extracted from MD5 hashes using commonly available tools. If you rely on sensitive data filters to entirely mask passwords, ensure that any passwords that might appear in discovered data are strong passwords.
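To show why weak passwords remain exposed, the following minimal sketch hashes each entry of a word list and compares it with a masked value. The function dictionary_lookup, the three-word list, and the sample passwords are hypothetical, and the sketch assumes an unsalted MD5 of the UTF-8 text.

```python
import hashlib


def md5_hex(value: str) -> str:
    """Return the MD5 digest of a string as hexadecimal text."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


def dictionary_lookup(target_hash, candidates):
    """Return the first candidate whose MD5 equals target_hash, else None."""
    for word in candidates:
        if md5_hex(word) == target_hash:
            return word
    return None


# Hypothetical, deliberately tiny word list; real lists hold millions of entries.
wordlist = ["letmein", "password", "qwerty"]

weak_hash = md5_hex("password")
strong_hash = md5_hex("T7#vq9!Lw2$zRk8@")

print(dictionary_lookup(weak_hash, wordlist))    # password
print(dictionary_lookup(strong_hash, wordlist))  # None
```

A password that appears in a word list is recovered immediately, whereas a long random password is not found by this kind of search.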
Managing Sensitive Data Filters
- From the main menu, click the Administration icon. The Administration page is displayed.
- From the Discovery section, click Sensitive Data Filters.
The Sensitive Data Filters window is displayed with the Processes tab visible.
- To view or edit filters for files, click the Files tab.
- To edit an existing filter, click Edit.
- To delete an existing filter, click Delete.
- To add a new filter, click Add. A new field is added, into which you can enter a regular expression.
- To create the filter, click Apply.
- To reorder sensitive data filters, click the up or down arrow in the ordering column for the filter you want to move.
You can also move a filter to the top or bottom of the list using the top or bottom arrow buttons.
Creating a Sensitive Data Filter
A regular expression usually matches more than just the sensitive data; for instance, it might also match an identifying argument name such as --password. The portion of data to be hashed must be enclosed in round brackets (parentheses) to form a regular expression group. Portions of the regular expression not enclosed in brackets are not modified.
The following command has the --password in clear text. The regular expression needs --password to locate the data and to define how much to mask around it.
- This regular expression adds \S+ to identify a sequence of one or more non-whitespace characters, making a regular expression of --password \S+. Brackets are then added to define the portion that requires masking, making --password (\S+).
After rediscovery, the new process node has the password portion replaced with an MD5 hash.
- For more resilience against extra white space, replace the single space in the regular expression with \s+, which matches one or more whitespace characters, making --password\s+(\S+), which is the form that most sensitive data filters take.
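The effect of the group can be sketched in Python. This is not the product's implementation: the function apply_filter and the sample command lines are hypothetical, and Python's re module is assumed to behave like the product's regular expression engine for these simple patterns.

```python
import hashlib
import re


def apply_filter(pattern: str, text: str) -> str:
    """Replace group 1 of every match with its MD5 digest; keep the rest."""
    def _mask(match):
        digest = hashlib.md5(match.group(1).encode("utf-8")).hexdigest()
        start, end = match.span(1)
        # Rebuild the match: text before the group, the digest, text after.
        return text[match.start():start] + digest + text[end:match.end()]
    return re.sub(pattern, _mask, text)


single_space = r"--password (\S+)"
any_space = r"--password\s+(\S+)"

# Hypothetical command lines, invented for illustration.
command = "/opt/app/bin/server --user admin --password hunter2 --port 8080"
spaced = "/opt/app/bin/server --password   hunter2"

masked = apply_filter(any_space, command)
print("hunter2" in masked)                              # False: value is hashed
print(masked.startswith("/opt/app/bin/server --user"))  # True: rest is intact
print(apply_filter(single_space, spaced) == spaced)     # True: filter missed it
print("hunter2" in apply_filter(any_space, spaced))     # False
```

The argument name --password stays readable because it lies outside the group, and the single-space form fails to mask the value as soon as the command contains extra whitespace.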
Notes on Sensitive Data Filters
- When writing regular expressions for sensitive data filters, ensure that they do not match too much of the command. If the filter masks some of the command that a pattern uses to identify a piece of running software, that pattern will then be unable to identify the software. For more information, see Writing efficient regular expressions.
- The filters are not applied to the inferred data model until you perform a discovery run. Sensitive data discovered before applying a filter remains in the history and DDD until it is aged.
- If sensitive data filters are applied to files, the files must remain valid. For example, if a sensitive data filter is applied to an XML file, the XML must remain valid; otherwise, XPath processing will not work.
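The first and last notes can be checked before a filter is deployed. The sketch below applies a narrow and an over-broad pattern to a hypothetical XML fragment and tests whether the result still parses. The helper names, the patterns, and the XML are invented for illustration, and the check covers well-formedness only, using Python's standard xml.etree.ElementTree parser.

```python
import hashlib
import re
import xml.etree.ElementTree as ET


def mask_group(pattern: str, text: str) -> str:
    """Replace group 1 of every match with its MD5 digest."""
    def _mask(match):
        digest = hashlib.md5(match.group(1).encode("utf-8")).hexdigest()
        start, end = match.span(1)
        return text[match.start():start] + digest + text[end:match.end()]
    return re.sub(pattern, _mask, text)


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical configuration file content.
config = "<db><user>app</user><password>hunter2</password></db>"

# Narrow filter: the group covers only the element text.
narrow = mask_group(r"<password>([^<]*)</password>", config)
# Over-broad filter: the group swallows the tags that follow the value.
broad = mask_group(r"(<password>\S+)", config)

print(is_well_formed(narrow))  # True
print(is_well_formed(broad))   # False
```

The over-broad pattern hashes the closing tags together with the password, so the result no longer parses and any later XPath query against it would fail.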