Data translation


Before writing the output record to the output data set, UNLOAD PLUS translates the data to the specified output encoding scheme. How UNLOAD PLUS performs this translation depends on the value of your DIRECT option.

DIRECT YES processing

For DIRECT YES processing, UNLOAD PLUS translates data to the specified output encoding scheme using the ASCII, EBCDIC, or UNICODE option and the CCSID option. During translation processing, UNLOAD PLUS searches the SYSIBM.SYSSTRINGS catalog table for a row that identifies how to translate the data.

Translation processing is handled in the following manner:

  1. Based on the CCSID that you specify or that UNLOAD PLUS retrieves from DSNHDECP, UNLOAD PLUS searches the SYSIBM.SYSSTRINGS catalog table for a row that identifies how to translate the data. (To review the values that are assigned to the CCSID, see the UNLOAD PLUS parameter listing under message number BMC50471I.)
  2. If a row exists, UNLOAD PLUS searches the row for a value, first in the TRANSPROC field and then in the TRANSTAB field. UNLOAD PLUS terminates when either of the following conditions exists:

    • The TRANSPROC field contains DSNXVJPC or DSNXVTWC.

      UNLOAD PLUS does not support translations that require a translation procedure.

    • The TRANSTAB field is empty, and the TRANSPROC field contains a value.

    In all other cases, UNLOAD PLUS performs the data translation providing that the combination is supported. For a list of supported translations, see Supported character conversions.

  3. If a row does not exist in SYSIBM.SYSSTRINGS, UNLOAD PLUS uses the z/OS Unicode Conversion Services to translate the data.

DIRECT NO processing

For DIRECT NO processing, Db2 performs the data translation. Using DIRECT NO, you can translate your encoded data with full SELECT capabilities.

Supported character conversions

The following table identifies the character conversions that UNLOAD PLUS supports for translation processing:

Target

-----------

Source

ASCII SBCS

ASCII MIXED

ASCII DBCS

EBCDIC SBCS

EBCDIC MIXED

EBCDIC DBCS

UNICODE SBCS

UNICODE MIXED

UNICODE DBCS

ASCII SBCS

X



X




Y


ASCII MIXED


X








ASCII DBCS



X







EBCDIC SBCS

X



X




Y


EBCDIC MIXED





X





EBCDIC DBCS






X




UNICODE SBCS

X



X



X

Y


UNICODE MIXED








X


UNICODE DBCS









X

This table indicates the supported translations with the following identifiers:

X: allowable translation for either DIRECT YES or DIRECT NO

Y: allowable translation for DIRECT YES only

 Expansion and contraction

The translation process might cause the length of the resultant data to change based on the CCSID type that you specified. To determine the resultant length of the data based on each CCSID conversion, see the IBM Db2 SQL Reference.

Restrictions

UNLOAD PLUS has the following restrictions on data translation:

  • Because UNLOAD PLUS does not convert data when you specify FORMAT BMCLOAD, the utility ignores special options for date and time, ASCII, EBCDIC, and UNICODE. The CCSID of the target table must be identical to the CCSID of the source table, however.
  • UNLOAD PLUS does not support the UNICODE option with the FORMAT XML option.
  • UNLOAD PLUS does not translate the following data:
    • Binary data
    • Columns that are defined as FOR BIT DATA, including RID values when you specify SELECT CURRENT RID

Mixed data

The translation is not supported from Unicode MBCS (UTF-8) to EBCDIC or ASCII MIXED on a Db2 subsystem that is defined as MIXED=NO, because the target CCSID value is undefined. However, UNLOAD PLUS and other SQL applications such as SPUFI and DSNTIAUL, extract MIXED columns as SBCS data. During the translation process, character substitutions might occur between the source CCSID and target CCSID; a loss of data integrity can result, due to SUBBYTE characters in the translated data. For more details, see SUBBYTE and ERRORBYTE.

Warning

If you specify the NOSUBS option with DIRECT NO, UNLOAD PLUS ignores the NOSUBS option for translations that Db2 handles. In cases in which UNLOAD PLUS handles row-level data translation (such as INTO field specifications or reformatting of DATE, TIME, or TIMESTAMP columns), specifying NOSUBS might result in UNLOAD PLUS discarding rows or terminating.

Command constants

For comparisons between command constants and row data, UNLOAD PLUS must translate certain command constants from EBCDIC (using the Db2 installation default EBCDIC SBCS CCSID) to the encoding scheme of the table. UNLOAD PLUS translates the following character constants for comparison:

  • Predicate block constants
  • LIKE constants
  • IN constants

For additional information about these constants, see predicate.

UNLOAD PLUS translates output data from EBCDIC (using the Db2 installation default EBCDIC SBCS CCSID) to the output encoding scheme. UNLOAD PLUS translates the following character constants for output:

  • AUTOTAG values
  • SELECT constants
  • FORMAT CSV constants
  • FORMAT XML tags
  • IF VALUE constants
  • NULLCHAR

For additional information about these constants, see the description for the individual option in this chapter.

SUBBYTE and ERRORBYTE

UNLOAD PLUS uses the SUBBYTE and ERRORBYTE fields in a similar way. If you specify NOSUBS to prevent character substitutions when a string is converted from one CCSID to another, and during conversion, a substitution character is placed in the output string because a character in the source CCSID does not exist in the target, the attempted character substitution causes a conversion error.

In UNLOAD PLUS, the DISCARDS option defines the limit on the number of records that UNLOAD PLUS ignores during the character conversion. UNLOAD PLUS ends abnormally if it reaches the discard limit, and does not write the discarded records to any data set.

Order of data type conversion and data translation

UNLOAD PLUS supports data type conversions from one encoding scheme to another. The following tables detail when translation occurs with respect to the data type conversion that UNLOAD PLUS is performing.

The following tables identify conversion and translation processing for numeric output:

Output field

SMALLINT

INTEGER

BIGINT

DECIMAL

DECIMAL ZONED

FLOAT

All numeric EXTERNAL output except DECFLOAT

DECFLOAT

DECFLOAT EXTERNAL

Input column

SMALLINT

x

x

x

x

x

x

cpt 2

x

cpt2

INTEGER

x

x

x

x

x

x

cpt 2

x

cpt2

BIGINT

x

x

x

x

x

x

cpt 2

x

cpt2

DECIMAL

x

x

x

x

x

x

cpt 2

x

cpt2

FLOAT

x

x

x

x

x

x

cpt 2

x

cpt2

CHAR

tc1

tc1

tc1

tc1


tc1


tc1


VARCHAR

tc1

tc1

tc1

tc1


tc1




GRAPHIC










VARGRAPHIC










BINARY










VARBINARY










DATE










TIME










TIMESTAMP










TIMESTAMP WITH TIME ZONE










BLOB










CLOB










DBCLOB










DECFLOAT








x

cpt2

XML










The character combinations in this table indicate the order in which data type conversions (c), data translations (t), and character padding (p) occur. The numbers (1,2,3) represent the type of data translation that occurs when converting from one data type to another.

x = Support conversion, no translation required

c = Convert

t = Translate

p = Pad with blank

1 = Translate input CCSID to EBCDIC SBCS (system default)

2 = Translate EBCDIC SBCS (system default) to output CCSID

3 = Translate input CCSID to output SBCS

The following tables identify conversion and translation processing for the character, graphic, and binary output:

Output field

CHARACTER

VARCHAR

GRAPHIC, GRAPHIC EXTERNAL

VARGRAPHIC

BINARY

VARBINARY

Input column

SMALLINT

cpt 2

cpt 2





INTEGER

cpt 2

cpt 2





BIGINT

cpt 2

cpt 2





DECIMAL

cpt 2

cpt 2





FLOAT

cpt 2

cpt 2





CHAR

cpt 3

cpt 3





VARCHAR

cpt 3

cpt 3





GRAPHIC



x

x



VARGRAPHIC



x

x



BINARY





x

x

VARBINARY





x

x

DATE

cpt 2

cpt 2





TIME

cpt 2

cpt 2





TIMESTAMP

cpt 2

cpt 2





TIMESTAMP WITH TIME ZONE

cpt2

cpt2





BLOB







CLOB







DBCLOB







DECFLOAT







XML







The character combinations in this table indicate the order in which data type conversions (c), data translations (t), and character padding (p) occur. The numbers (1,2,3) represent the type of data translation that occurs when converting from one data type to another.

x = Support conversion, no translation required

c = Convert

t = Translate

p = Pad with blank

1 = Translate input CCSID to EBCDIC SBCS (system default)

2 = Translate EBCDIC SBCS (system default) to output CCSID

3 = Translate input CCSID to output SBCS

The following tables identify conversion and translation processing for the date, time, and timestamp output:

Output field

DATE

DATE EXTERNAL

DATE -format EXTERNAL

TIME

TIME EXTERNAL

TIME -format EXTERNAL

TIMESTAMP, TIMESTAMP WITH TIME ZONE

TIMESTAMP EXTERNAL

TIMESTAMP -format EXTERNAL

Input column

SMALLINT










INTEGER



x 1



x 1  




BIGINT










DECIMAL



x 1  



x 1  




FLOAT










CHAR

tc1


tctp1,2 1  

tc1


tctp1,2 1    

tc1


tctp1,2 1    

VARCHAR

tc1



tc1



tc1



GRAPHIC










VARGRAPHIC










BINARY










VARBINARY










DATE

x

ctp 2








TIME




x

ctp2





TIMESTAMP

x

ctp2


x

ctp2


x

ctp2


TIMESTAMP WITH TIME ZONE

x



x



x

ctp


BLOB










CLOB










DBCLOB










DECFLOAT










XML










The character combinations in this table indicate the order in which data type conversions (c), data translations (t), and character padding (p) occur. The numbers (1,2,3) represent the type of data translation that occurs when converting from one data type to another.

x = Support conversion, no translation required

c = Convert

t = Translate

p = Pad with blank

1 = Translate input CCSID to EBCDIC SBCS (system default)

2 = Translate EBCDIC SBCS (system default) to output CCSID

3 = Translate input CCSID to output SBCS

1 When using DIRECT NO, specifying DATE -format, TIMESTAMP -format, or TIME -format formats might require extra consideration due to the various ways in which the data is returned from Db2. Often, you should not specify the same format when using DIRECT NO versus DIRECT YES.

The following tables identify conversion and translation processing for LOB and XML output:

Output field

BLOB

CLOB

DBCLOB

XML

Input column

SMALLINT





INTEGER





BIGINT





DECIMAL





FLOAT





CHAR





VARCHAR





GRAPHIC





VARGRAPHIC





BINARY





VARBINARY





DATE





TIME





TIMESTAMP





TIMESTAMP WITH TIME ZONE





BLOB

x




CLOB


x



DBCLOB



x


DECFLOAT





XML




tc3

The character combinations in this table indicate the order in which data type conversions (c), data translations (t), and character padding (p) occur. The numbers (1,2,3) represent the type of data translation that occurs when converting from one data type to another.

x = Support conversion, no translation required

c = Convert

t = Translate

p = Pad with blank

1 = Translate input CCSID to EBCDIC SBCS (system default)

2 = Translate EBCDIC SBCS (system default) to output CCSID

3 = Translate input CCSID to output SBCS



 

Tip: For faster searching, add an asterisk to the end of your partial query. Example: cert*