Create a New Composite Definition
Composite field definitions enable processing of fields that are combinations of other fields. For example, a name field may contain a title (such as Mr. or Mrs.), a first name, a middle name, a last name, and a suffix (such as Jr.). A composite definition made up of composite data items allows each piece of data to be identified and processed individually.
Each composite has a unique definition that specifies which fields make up the composite field and how to break the source field into those fields. The fields that make up a composite field definition are called composite data items, and each composite data item is defined with any characteristics specific to that item. Composite definitions specify how to use data items to split a composite source value into separate fields; each field within the composite definition is one of the defined composite data items. The composite definition also includes any information required to locate each composite data item within the composite. A composite item can be located in one of three ways: at a fixed location, by its order relative to other data items, or by scanning the composite value for a value that matches a certain value or pattern.
With fixed location processing, the same character positions are always identified as the same composite item. Fixed location processing splits the source value by character position, so the characters being processed are always in the same place within the composite field.
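Fixed location processing can be pictured as slicing every source value at the same character positions. The following Python sketch is only an illustration of that idea and is not how Data Privacy is implemented; the layout table, the item names, and the `split_fixed` function are hypothetical.

```python
# Hypothetical fixed-location layout: (item name, start position, length).
# Start positions are 1-based, counted in characters from the left.
FIXED_LAYOUT = [
    ("area_code", 1, 3),
    ("exchange", 4, 3),
    ("line_number", 7, 4),
]


def split_fixed(source, layout):
    """Cut the same character positions out of a source value."""
    parts = {}
    for name, start, length in layout:
        begin = start - 1  # convert the 1-based position to a 0-based index
        parts[name] = source[begin:begin + length]
    return parts


print(split_fixed("8005551234", FIXED_LAYOUT))
```

Because the positions never change, this approach only suits source values whose parts always have the same length and order.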
With the other types of processing, a composite item can be in a different location within each source value. These composite definitions are used to search a source value and locate the composite data item. For structured and scanned composite items, the source value is parsed by logic driven by the composite and composite data item definitions.
File-AID Data Privacy supplies packaged composite definitions that are designed to cover some common needs. You can edit these packaged composite definitions as required or create new composite definitions of your own.
To create a new, or edit an existing, composite definition:
- Select and open the desired repository and project from the Data Privacy Explorer view.
- Click the Composites tab for the open project. The Composites view appears.
Composite definitions are listed in the Composites section on the left, composite data items are listed in the Data Items section on the upper right, and data item categories are listed in the Categories section on the lower right.
- If necessary, create one or more new data item categories. For more information, see Create a New Data item Category.
- If necessary, create one or more new data items. For more information, see Create a New Composite Data Item.
- To define a new composite, click Add in the Composites section. This opens the Composite Field Definition view. Alternatively, select an existing composite and click Edit to open it in the Composite Field Definition view.
The Composite Field Definition view lists each available Data Item in the upper left section, with each available Delimiter - Separator and Delimiter - Keyword listed below. Categories are used to group similar data items and delimiters. You can drag and drop these composite building blocks into the Structure area in the middle of the view to assemble a layout for the composite definition.
The Identification Order on the right shows all the parts of the composite and controls the sequence in which Data Privacy identifies those parts. You can drag items up or down within the Identification Order list to change the order.
Below the Structure area is a Create scan... link that you can click to display a scan area. Scanning allows any data not yet identified in the entire composite field to be inspected for a pattern or a specific string that flags the targeted data. For example, you could create a scan for the pattern of either five digits or five digits, a dash, and four more digits (as in regular and ZIP+4 postal codes). You could also scan for the phrase "ZIP CODE:" if it precedes each instance of a postal code in the composite field data.
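The postal code example can be sketched with a regular expression. This Python fragment only illustrates the kind of pattern and keyword described above; it does not use Data Privacy's own pattern syntax, and the names `ZIP_PATTERN`, `ZIP_KEYWORD`, `scan_for_zip`, and `scan_after_keyword` are hypothetical.

```python
import re

# Five digits, optionally followed by a dash and four more digits.
# The lookarounds keep the pattern from matching inside a longer run of digits.
ZIP_PATTERN = re.compile(r"(?<!\d)\d{5}(?:-\d{4})?(?!\d)")

# Keyword assumed to precede each postal code in the data.
ZIP_KEYWORD = "ZIP CODE:"


def scan_for_zip(source):
    """Return (1-based position, text) for every postal code found."""
    return [(m.start() + 1, m.group()) for m in ZIP_PATTERN.finditer(source)]


def scan_after_keyword(source):
    """Return the first postal code that follows the keyword, or None."""
    index = source.find(ZIP_KEYWORD)
    if index == -1:
        return None
    match = ZIP_PATTERN.search(source, index + len(ZIP_KEYWORD))
    return match.group() if match else None


print(scan_for_zip("12 Main St, Springfield IL 62704-1234"))
print(scan_after_keyword("Attn: Billing ZIP CODE: 30301"))
```

A pattern scan works when the postal code can appear anywhere in the value, whereas the keyword scan is more selective when the data reliably labels the code.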
- Type a name for your composite definition in the Name field.
- To build a composite definition based on data structure, drag a data item from the Data Item list and drop it into the Structure area.
Although the sequence in which items are dropped into the Structure area might not seem important, you should first analyze your data to determine the most reliable method and sequence for parsing it, then assemble the items in the Structure area in that order. You can drop a data item or delimiter at the left edge of the Structure area to make it the first piece of the composite definition, at the right edge to make it the last piece, or in the middle, in which case you are prompted for a starting position. Positioning is either relative to an adjoining data item or delimiter, or fixed at a specific location within the field.
When the desired components have been placed in the Structure area, you can right-click on each to display up to three menu choices: Delete to remove the data item or delimiter, Join Left to connect to the next data item or delimiter on the left, and/or Join Right to connect to the next data item or delimiter on the right.
When the data item or delimiter has been dropped into position, two tabs, General and Advanced, appear below the Structure area. The content of the tabs will vary depending on the location of the item within the Structure area and other factors.
- On the General tab:
  - Check the Prefix box if the data item needs to be included in the data item that follows it.
  - Check the Optional box if the data item might or might not be present in the data.
  - Check the Multiple box if multiple instances of the data item can occur sequentially in that location within the data. If those multiple occurrences are separated in some way, also check the Separated by box and select a delimiter from the drop-down list.
- On the Advanced tab:
  - Select Auto, Use delimiter, or Use pattern from the Locate method drop-down list to control how Data Privacy locates that data item or delimiter. Auto works well in most situations.
  - Select Forward or Backward from the Location direction drop-down list. A component placed at the right edge of the Structure area or attached to the left edge of another data item or delimiter is assigned a direction of Backward by default.
  - In the Additional validations area, check the Match length box, the Match patterns box, or the Match list of values box. Not all boxes are always available. For example, the Match patterns box is only available if a pattern has been defined for the selected data item or delimiter.
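To make the Optional, Multiple, Separated by, and Location direction settings more concrete, here is a minimal Python sketch that parses the name example from the introduction. It is an illustration under stated assumptions, not Data Privacy's parsing logic: the `TITLES` and `SUFFIXES` lists, the space separator, and the `parse_name` function are all hypothetical.

```python
# Hypothetical lists of values used to recognize the optional items.
TITLES = {"Mr.", "Mrs.", "Ms.", "Dr."}
SUFFIXES = {"Jr.", "Sr.", "III"}


def parse_name(source):
    """Split a name into title, first, middle (multiple), last, and suffix."""
    tokens = source.split()  # a space plays the role of the separator
    parts = {"title": None, "first": None, "middle": [], "last": None, "suffix": None}

    # Optional item, located forward from the left edge.
    if tokens and tokens[0] in TITLES:
        parts["title"] = tokens.pop(0)

    # Optional item, located backward from the right edge.
    if tokens and tokens[-1] in SUFFIXES:
        parts["suffix"] = tokens.pop()

    # Required items: first name from the left, last name from the right.
    if tokens:
        parts["first"] = tokens.pop(0)
    if tokens:
        parts["last"] = tokens.pop()

    # Multiple item: whatever remains is zero or more middle names.
    parts["middle"] = tokens
    return parts


print(parse_name("Dr. Mary Ann Lee Smith Jr."))
print(parse_name("John Doe"))
```

Locating the suffix and last name backward from the right edge is what lets a variable number of middle names sit in between without confusing the parse.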
- Drag additional data items and/or delimiters from the Data Item and Delimiter lists and drop them into the Structure area in the desired locations. Continue until all the expected parts of the data have been included, then use the Join Left and Join Right functions for each data item as required to create a fully contiguous composite definition.
- Click the Validate button at the bottom right to check the validity of the current composite definition.
- To test the composite definition, click the Test button at the bottom right. The Test composite definition dialog box appears. Enter some sample data and click OK. If a problem exists with the composite definition, an error message appears. Otherwise, the Parts Identified list appears showing how each part of the sample data was identified. It includes the following columns:
  - The Mode column lists the type of definition, Structure or Scan, used to identify that part of the data.
  - The Part column lists the data item or delimiter matched to that piece of the data.
  - The Instance column shows the number of occurrences of this part in the field.
  - The Position column lists the starting location of each part within the data.
  - The Data column lists the fragments of actual sample data that have been broken down into composite pieces.
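The shape of a Parts Identified list can be approximated with a small Python sketch. It is not the product's output: the separator-based split, the item names, and the `identify_parts` function are hypothetical, positions are assumed to be 1-based, and the Instance value is treated here as a running occurrence number for each part, which is an assumption about that column's meaning.

```python
from collections import Counter


def identify_parts(source, item_names, separator=","):
    """Build (Mode, Part, Instance, Position, Data) rows for a delimited value."""
    rows = []
    seen = Counter()  # running occurrence number per part (assumed meaning)
    position = 1      # 1-based starting position of the next fragment
    pieces = source.split(separator)
    for index, piece in enumerate(pieces):
        name = item_names[index] if index < len(item_names) else "unidentified"
        seen[name] += 1
        rows.append(("Structure", name, seen[name], position, piece))
        position += len(piece)
        if index < len(pieces) - 1:
            # Delimiters are reported as parts too.
            seen["separator"] += 1
            rows.append(("Structure", "separator", seen["separator"], position, separator))
            position += len(separator)
    return rows


for row in identify_parts("Smith,John,Q", ["last_name", "first_name", "middle_initial"]):
    print(row)
```

Reading the rows top to bottom shows whether each fragment of the sample data landed in the part you expected, which is the purpose of the test dialog.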
- To build a composite definition that uses scanning to identify a pattern or string within the data, create a scan. A scan can be used alone or in conjunction with a structure-based composite. Click the Create scan... link below the Structure area. A Scan area appears. Drag and drop data items and delimiters into the Scan area to define the scanning operation. On the General tab, select Forward or Backward from the Scan direction drop-down list, then select First only or All from the Match drop-down list. Use the Advanced tab in the same way as with the Structure area. When the desired scan components have been assembled, test the scan with sample data as described above.
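The Scan direction and Match settings can be thought of as choosing where the search starts and how many hits to keep. The Python sketch below illustrates that idea only; the `scan` function and its parameter names are hypothetical, and the backward scan is approximated by reversing the matches found from left to right, which may differ from how Data Privacy scans.

```python
import re


def scan(source, pattern, direction="Forward", match="First only"):
    """Return (1-based position, text) hits in scan order."""
    hits = [(m.start() + 1, m.group()) for m in re.finditer(pattern, source)]
    if direction == "Backward":
        hits.reverse()  # approximation: report the rightmost match first
    return hits[:1] if match == "First only" else hits


sample = "A1 B22 C333"
print(scan(sample, r"\d+"))                                    # leftmost hit only
print(scan(sample, r"\d+", direction="Backward"))              # rightmost hit only
print(scan(sample, r"\d+", match="All"))                       # every hit, left to right
```

Choosing Backward with First only is useful when the targeted data tends to sit near the end of the value and earlier matches would be false hits.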
When you are satisfied with your new or edited composite definition, click OK to save it to the list of Composites and make it available to your project.