Considerations when loading CSV data
CSV data format
Ensure that your CSV data meets the following requirements:
- Your CSV file must contain delimiters. Enclosure characters (characters that enclose each field) in your CSV file are optional.
- Input strings should not contain the same characters as your field delimiters or enclosure characters. If they do, LOADPLUS might load the record incorrectly or reject it. To prevent this problem, complete one of the following tasks:
- Remove these characters from your input strings.
- Change the delimiter or enclosure character in your input file so that you do not have a conflict with embedded characters. Then, ensure that the TERMINATEDBY, ENCLOSEDBY, and AND parameters on your FORMAT CSV option are compatible with the delimiter and enclosure characters in your input file.
- Your CSV input file should use a variable-length record format (RECFM=VB). If your input file uses a fixed-length record format, and the last field on a record does not extend to the end of that record, the field must end with a delimiter.
- A one-to-one correspondence should exist between the input fields of your CSV data and the field specifications on your LOAD command. Otherwise, the following results occur. (For more information, see CSV field specifications.)
- If you have more input fields than field specifications, LOADPLUS ignores the extra input file data. LOADPLUS maps field specifications to input fields in order, starting with the first specification and the first input field.
- If you have fewer input fields than field specifications, LOADPLUS ignores the additional field specifications.
LOADPLUS supports loading data that was created by using the FORMAT CSV option of UNLOAD PLUS. However, when loading CSV data, LOADPLUS requires that DATE, TIME, and TIMESTAMP data be in Db2 EXTERNAL format. The control cards that UNLOAD PLUS produces do not specify an external format for these data types. Either modify these control cards or load the data by using a LOADPLUS job that provides the correct external format for these data types.
Installation option considerations when loading CSV data
When you specify FORMAT CSV, LOADPLUS changes the RULES installation option to STANDARD, regardless of the value specified at installation.
FORMAT CSV restrictions
LOADPLUS terminates when either of the following conditions exists and you specify FORMAT CSV:
- You are loading LOB or XML data.
- You also specify CONTINUEIF.
CSV field specifications
You must include a field specification on your LOAD command for every column that you are loading.
In addition, each field specification must map to an input field in your CSV file. LOADPLUS maps field specifications to input fields in order, starting with the first specification and the first input field.
Excluding particular input fields
If you do not want to load data from a particular input field, complete one of the following tasks:
- If the input field that you do not want to load is the last field (or fields) in the CSV file, you do not need to include a field specification for that field. LOADPLUS ignores any extra fields at the end of the input file.
- If the input field that you do not want to load is not the last field (or fields) in the CSV file, include a placeholder field specification on your LOAD command. In this field specification, use a field name that is not the name of a column in the Db2 table. You must also specify SKIPFIELDS YES on your LOAD command.
The CSV field specification mapping figure illustrates how LOADPLUS maps your field specifications for CSV files to your Db2 table columns. Example 2a does not supply a field specification for the employee department field. Therefore, LOADPLUS loads employee department data into the employee number column, which is not the desired result. Example 2b corrects this problem by adding a placeholder field specification.
Field specification requirements
Your field specifications must meet the following requirements:
- For fields that you load, field names must match the corresponding column names.
- For DATE, TIME, or TIMESTAMP data, your field specification must be an EXTERNAL data type. For DATE and TIME data, you must specify one of the delimited formats listed in the tables in Supported-external-DATE-TIME-and-TIMESTAMP-formats. For delimited TIMESTAMP data, if you do not specify a delimited format, LOADPLUS defaults to format D2E.
To have LOADPLUS interpret two consecutive delimiter characters as a null field, include the following NULLIF option:
NULLIF fieldName = ''If you do not include the NULLIF option, LOADPLUS treats input fields denoted by two consecutive delimiter characters in the following ways:
- For DATE, TIME, and TIMESTAMP fields where the column is nullable, LOADPLUS loads a null value.
- For DATE, TIME, and TIMESTAMP fields where the column is defined as NOT NULL, LOADPLUS loads the default date, time, or timestamp.
- LOADPLUS treats all other fields as VARCHAR fields with length 0.
Additional field specification considerations
Consider the following additional information about field specification options:
- You can use NULLIF and DEFAULTIF field specifications. However, be aware that LOADPLUS changes the value of the RULES installation option to STANDARD., affecting how LOADPLUS handles NULLIF and DEFAULTIF specifications.
- LOADPLUS ignores the following field specifications:
POSITION (start:end)
LOADPLUS starts from the first position of each field. However, we recommend that you specify POSITION(*).
Data type, except on specifications for DATE, TIME, and TIMESTAMP fields
LOADPLUS performs data type conversion only to DATE, TIME, and TIMESTAMP external types.
- Length
Example field mapping
The following figure illustrates how LOADPLUS maps your field specification to the table that you are loading: