Data translation
DIRECT YES processing
For DIRECT YES processing, UNLOAD PLUS translates data to the specified output encoding scheme using the ASCII, EBCDIC, or UNICODE option and the CCSID option. During translation processing, UNLOAD PLUS searches the SYSIBM.SYSSTRINGS catalog table for a row that identifies how to translate the data.
Translation processing is handled in the following manner:
- Based on the CCSID that you specify or that UNLOAD PLUS retrieves from DSNHDECP, UNLOAD PLUS searches the SYSIBM.SYSSTRINGS catalog table for a row that identifies how to translate the data. (To review the values that are assigned to the CCSID, see the UNLOAD PLUS parameter listing under message number BMC50471I.)
If a row exists, UNLOAD PLUS searches the row for a value, first in the TRANSPROC field and then in the TRANSTAB field. UNLOAD PLUS terminates when either of the following conditions exists:
The TRANSPROC field contains DSNXVJPC or DSNXVTWC.
UNLOAD PLUS does not support translations that require a translation procedure.
- The TRANSTAB field is empty, and the TRANSPROC field contains a value.
In all other cases, UNLOAD PLUS performs the data translation providing that the combination is supported. For a list of supported translations, see Supported character conversions.
- If a row does not exist in SYSIBM.SYSSTRINGS, UNLOAD PLUS uses the z/OS Unicode Conversion Services to translate the data.
DIRECT NO processing
For DIRECT NO processing, Db2 performs the data translation. Using DIRECT NO, you can translate your encoded data with full SELECT capabilities.
Supported character conversions
The following table identifies the character conversions that UNLOAD PLUS supports for translation processing:
Target ----------- Source | ASCII SBCS | ASCII MIXED | ASCII DBCS | EBCDIC SBCS | EBCDIC MIXED | EBCDIC DBCS | UNICODE SBCS | UNICODE MIXED | UNICODE DBCS |
---|---|---|---|---|---|---|---|---|---|
ASCII SBCS | X | X | Y | ||||||
ASCII MIXED | X | ||||||||
ASCII DBCS | X | ||||||||
EBCDIC SBCS | X | X | Y | ||||||
EBCDIC MIXED | X | ||||||||
EBCDIC DBCS | X | ||||||||
UNICODE SBCS | X | X | X | Y | |||||
UNICODE MIXED | X | ||||||||
UNICODE DBCS | X | ||||||||
This table indicates the supported translations with the following identifiers: X: allowable translation for either DIRECT YES or DIRECT NO Y: allowable translation for DIRECT YES only |
Expansion and contraction
The translation process might cause the length of the resultant data to change based on the CCSID type that you specified. To determine the resultant length of the data based on each CCSID conversion, see the IBM Db2 SQL Reference.
Restrictions
UNLOAD PLUS has the following restrictions on data translation:
- Because UNLOAD PLUS does not convert data when you specify FORMAT BMCLOAD, the utility ignores special options for date and time, ASCII, EBCDIC, and UNICODE. The CCSID of the target table must be identical to the CCSID of the source table, however.
- UNLOAD PLUS does not support the UNICODE option with the FORMAT XML option.
- UNLOAD PLUS does not translate the following data:
- Binary data
- Columns that are defined as FOR BIT DATA, including RID values when you specify SELECT CURRENT RID
Mixed data
The translation is not supported from Unicode MBCS (UTF-8) to EBCDIC or ASCII MIXED on a Db2 subsystem that is defined as MIXED=NO, because the target CCSID value is undefined. However, UNLOAD PLUS and other SQL applications such as SPUFI and DSNTIAUL, extract MIXED columns as SBCS data. During the translation process, character substitutions might occur between the source CCSID and target CCSID; a loss of data integrity can result, due to SUBBYTE characters in the translated data. For more details, see SUBBYTE and ERRORBYTE.
Command constants
For comparisons between command constants and row data, UNLOAD PLUS must translate certain command constants from EBCDIC (using the Db2 installation default EBCDIC SBCS CCSID) to the encoding scheme of the table. UNLOAD PLUS translates the following character constants for comparison:
- Predicate block constants
- LIKE constants
- IN constants
For additional information about these constants, see predicate.
UNLOAD PLUS translates output data from EBCDIC (using the Db2 installation default EBCDIC SBCS CCSID) to the output encoding scheme. UNLOAD PLUS translates the following character constants for output:
- AUTOTAG values
- SELECT constants
- FORMAT CSV constants
- FORMAT XML tags
- IF VALUE constants
- NULLCHAR
For additional information about these constants, see the description for the individual option in this chapter.
SUBBYTE and ERRORBYTE
UNLOAD PLUS uses the SUBBYTE and ERRORBYTE fields in a similar way. If you specify NOSUBS to prevent character substitutions when a string is converted from one CCSID to another, and during conversion, a substitution character is placed in the output string because a character in the source CCSID does not exist in the target, the attempted character substitution causes a conversion error.
In UNLOAD PLUS, the DISCARDS option defines the limit on the number of records that UNLOAD PLUS ignores during the character conversion. UNLOAD PLUS ends abnormally if it reaches the discard limit, and does not write the discarded records to any data set.
Order of data type conversion and data translation
UNLOAD PLUS supports data type conversions from one encoding scheme to another. The following tables detail when translation occurs with respect to the data type conversion that UNLOAD PLUS is performing.
The following tables identify conversion and translation processing for numeric output:
Output field | SMALLINT | INTEGER | BIGINT | DECIMAL | DECIMAL ZONED | FLOAT | All numeric EXTERNAL output except DECFLOAT | DECFLOAT | DECFLOAT EXTERNAL |
---|---|---|---|---|---|---|---|---|---|
Input column | |||||||||
SMALLINT | x | x | x | x | x | x | cpt 2 | x | cpt2 |
INTEGER | x | x | x | x | x | x | cpt 2 | x | cpt2 |
BIGINT | x | x | x | x | x | x | cpt 2 | x | cpt2 |
DECIMAL | x | x | x | x | x | x | cpt 2 | x | cpt2 |
FLOAT | x | x | x | x | x | x | cpt 2 | x | cpt2 |
CHAR | tc1 | tc1 | tc1 | tc1 | tc1 | tc1 | |||
VARCHAR | tc1 | tc1 | tc1 | tc1 | tc1 | ||||
GRAPHIC | |||||||||
VARGRAPHIC | |||||||||
BINARY | |||||||||
VARBINARY | |||||||||
DATE | |||||||||
TIME | |||||||||
TIMESTAMP | |||||||||
TIMESTAMP WITH TIME ZONE | |||||||||
BLOB | |||||||||
CLOB | |||||||||
DBCLOB | |||||||||
DECFLOAT | x | cpt2 | |||||||
XML | |||||||||
The character combinations in this table indicate the order in which data type conversions (c), data translations (t), and character padding (p) occur. The numbers (1,2,3) represent the type of data translation that occurs when converting from one data type to another. x = Support conversion, no translation required c = Convert t = Translate p = Pad with blank 1 = Translate input CCSID to EBCDIC SBCS (system default) 2 = Translate EBCDIC SBCS (system default) to output CCSID 3 = Translate input CCSID to output SBCS |
The following tables identify conversion and translation processing for the character, graphic, and binary output:
Output field | CHARACTER | VARCHAR | GRAPHIC, GRAPHIC EXTERNAL | VARGRAPHIC | BINARY | VARBINARY |
---|---|---|---|---|---|---|
Input column | ||||||
SMALLINT | cpt 2 | cpt 2 | ||||
INTEGER | cpt 2 | cpt 2 | ||||
BIGINT | cpt 2 | cpt 2 | ||||
DECIMAL | cpt 2 | cpt 2 | ||||
FLOAT | cpt 2 | cpt 2 | ||||
CHAR | cpt 3 | cpt 3 | ||||
VARCHAR | cpt 3 | cpt 3 | ||||
GRAPHIC | x | x | ||||
VARGRAPHIC | x | x | ||||
BINARY | x | x | ||||
VARBINARY | x | x | ||||
DATE | cpt 2 | cpt 2 | ||||
TIME | cpt 2 | cpt 2 | ||||
TIMESTAMP | cpt 2 | cpt 2 | ||||
TIMESTAMP WITH TIME ZONE | cpt2 | cpt2 | ||||
BLOB | ||||||
CLOB | ||||||
DBCLOB | ||||||
DECFLOAT | ||||||
XML | ||||||
The character combinations in this table indicate the order in which data type conversions (c), data translations (t), and character padding (p) occur. The numbers (1,2,3) represent the type of data translation that occurs when converting from one data type to another. x = Support conversion, no translation required c = Convert t = Translate p = Pad with blank 1 = Translate input CCSID to EBCDIC SBCS (system default) 2 = Translate EBCDIC SBCS (system default) to output CCSID 3 = Translate input CCSID to output SBCS |
The following tables identify conversion and translation processing for the date, time, and timestamp output:
Output field | DATE | DATE EXTERNAL | DATE -format EXTERNAL | TIME | TIME EXTERNAL | TIME -format EXTERNAL | TIMESTAMP, TIMESTAMP WITH TIME ZONE | TIMESTAMP EXTERNAL | TIMESTAMP -format EXTERNAL |
---|---|---|---|---|---|---|---|---|---|
Input column | |||||||||
SMALLINT | |||||||||
INTEGER | x 1 | x 1 | |||||||
BIGINT | |||||||||
DECIMAL | x 1 | x 1 | |||||||
FLOAT | |||||||||
CHAR | tc1 | tctp1,2 1 | tc1 | tctp1,2 1 | tc1 | tctp1,2 1 | |||
VARCHAR | tc1 | tc1 | tc1 | ||||||
GRAPHIC | |||||||||
VARGRAPHIC | |||||||||
BINARY | |||||||||
VARBINARY | |||||||||
DATE | x | ctp 2 | |||||||
TIME | x | ctp2 | |||||||
TIMESTAMP | x | ctp2 | x | ctp2 | x | ctp2 | |||
TIMESTAMP WITH TIME ZONE | x | x | x | ctp | |||||
BLOB | |||||||||
CLOB | |||||||||
DBCLOB | |||||||||
DECFLOAT | |||||||||
XML | |||||||||
The character combinations in this table indicate the order in which data type conversions (c), data translations (t), and character padding (p) occur. The numbers (1,2,3) represent the type of data translation that occurs when converting from one data type to another. x = Support conversion, no translation required c = Convert t = Translate p = Pad with blank 1 = Translate input CCSID to EBCDIC SBCS (system default) 2 = Translate EBCDIC SBCS (system default) to output CCSID 3 = Translate input CCSID to output SBCS |
1 When using DIRECT NO, specifying DATE -format, TIMESTAMP -format, or TIME -format formats might require extra consideration due to the various ways in which the data is returned from Db2. Often, you should not specify the same format when using DIRECT NO versus DIRECT YES.
The following tables identify conversion and translation processing for LOB and XML output:
Output field | BLOB | CLOB | DBCLOB | XML |
---|---|---|---|---|
Input column | ||||
SMALLINT | ||||
INTEGER | ||||
BIGINT | ||||
DECIMAL | ||||
FLOAT | ||||
CHAR | ||||
VARCHAR | ||||
GRAPHIC | ||||
VARGRAPHIC | ||||
BINARY | ||||
VARBINARY | ||||
DATE | ||||
TIME | ||||
TIMESTAMP | ||||
TIMESTAMP WITH TIME ZONE | ||||
BLOB | x | |||
CLOB | x | |||
DBCLOB | x | |||
DECFLOAT | ||||
XML | tc3 | |||
The character combinations in this table indicate the order in which data type conversions (c), data translations (t), and character padding (p) occur. The numbers (1,2,3) represent the type of data translation that occurs when converting from one data type to another. x = Support conversion, no translation required c = Convert t = Translate p = Pad with blank 1 = Translate input CCSID to EBCDIC SBCS (system default) 2 = Translate EBCDIC SBCS (system default) to output CCSID 3 = Translate input CCSID to output SBCS |