How field widths are determined

BMC Remedy AR System Unicode servers store characters in databases using UTF-8 (Oracle databases) or UTF-16 (Microsoft SQL Server databases). UTF-8 is a byte-oriented character encoding, so it is similar to the other byte-oriented encodings that BMC Remedy AR System supports, such as Shift-JIS and EUC (for Japanese) or GB2312 (for Simplified Chinese).

However, a character sequence encoded in UTF-8 tends to be longer than the same characters in one of the other encodings. Also, characters from European languages, which occupy 1 byte each in the other encodings, can occupy 1 or 2 bytes in UTF-8, as outlined in the following table.

Note

Because of this expansion on conversion into UTF-8, data converted to UTF-8 might not be imported correctly because it no longer fits into the database columns. To avoid this problem, expand the sizes of the affected form fields.
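The expansion described above can be checked before an import with a minimal sketch like the following. It is an illustration only, not part of BMC Remedy AR System: the helper names `encoded_length` and `fits_after_conversion` are hypothetical, the codec names are those of Python's standard library, and Latin-1 is assumed here as a stand-in for a 1-byte European encoding.

```python
def encoded_length(text: str, encoding: str) -> int:
    """Return the number of bytes that `text` occupies in `encoding`."""
    return len(text.encode(encoding))


def fits_after_conversion(text: str, column_bytes: int) -> bool:
    """Check whether `text`, re-encoded as UTF-8, fits in `column_bytes` bytes."""
    return encoded_length(text, "utf-8") <= column_bytes


# (sample text, legacy encoding). Latin-1 stands in for a 1-byte European
# encoding and is an assumption made for this illustration.
SAMPLES = [
    ("café", "latin-1"),
    ("日本語", "shift_jis"),
    ("日本語", "euc_jp"),
    ("中文", "gb2312"),
]

for text, legacy in SAMPLES:
    legacy_len = encoded_length(text, legacy)
    utf8_len = encoded_length(text, "utf-8")
    print(f"{text!r}: {legacy_len} bytes in {legacy}, {utf8_len} bytes in UTF-8")

# A value that exactly filled a 4-byte column in Latin-1 no longer fits.
print(fits_after_conversion("café", 4))  # False
```

In this sketch the accented European character grows from 1 byte to 2, and each Japanese or Chinese character grows from 2 bytes to 3, which is why a field sized for the legacy encoding might need to be expanded.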

In UTF-16, each Unicode character occupies 1 or 2 code units (each code unit is a 16-bit quantity). Each ASCII and European character occupies 1 code unit; each Chinese, Korean, and Japanese character, which might be 2 bytes in its language-specific encoding, also occupies 1 code unit.
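As a rough illustration of the code-unit counts above, the following sketch counts UTF-16 code units for a few sample characters. The function name `utf16_code_units` is hypothetical and the sample characters are chosen only for this example.

```python
def utf16_code_units(text: str) -> int:
    """Return the number of 16-bit code units that `text` occupies in UTF-16."""
    # "utf-16-le" is used so that no byte order mark is prepended;
    # each code unit is 2 bytes.
    return len(text.encode("utf-16-le")) // 2


# ASCII, European, Japanese, Chinese, and Korean sample characters.
for ch in ["A", "é", "日", "中", "한"]:
    print(f"{ch!r}: {utf16_code_units(ch)} code unit(s)")
```

Each of these characters lies in the Basic Multilingual Plane, so each one occupies a single code unit.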

These expansions are valid for the characters of Unicode's Basic Multilingual Plane (BMP), the original set of 65,536 characters presented in Unicode 1.0 and modified in Unicode 2.0. Since version 3.0, Unicode has provided a mechanism to define up to about 1 million supplemental characters. Supplemental characters are defined for Chinese and also for some specialized usages in mathematics, musical typesetting, and information processing. Each supplemental character occupies 4 bytes in UTF-8 and 2 code units in UTF-16.
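The sizes of supplemental characters can be confirmed with a short sketch such as the one below. The helper names are hypothetical, and the three sample code points (a Chinese ideograph, a musical symbol, and a mathematical letter) were picked for this example only.

```python
def is_supplementary(ch: str) -> bool:
    """Return True if the single character `ch` lies outside the BMP."""
    return ord(ch) > 0xFFFF


def utf8_bytes(text: str) -> int:
    """Return the number of bytes that `text` occupies in UTF-8."""
    return len(text.encode("utf-8"))


def utf16_code_units(text: str) -> int:
    """Return the number of 16-bit code units that `text` occupies in UTF-16."""
    # "utf-16-le" avoids the byte order mark; each code unit is 2 bytes.
    return len(text.encode("utf-16-le")) // 2


SUPPLEMENTARY_SAMPLES = [
    "\U00020000",  # CJK Unified Ideographs Extension B
    "\U0001D11E",  # MUSICAL SYMBOL G CLEF
    "\U0001D400",  # MATHEMATICAL BOLD CAPITAL A
]

for ch in SUPPLEMENTARY_SAMPLES:
    print(
        f"U+{ord(ch):05X}: supplementary={is_supplementary(ch)}, "
        f"UTF-8 bytes={utf8_bytes(ch)}, UTF-16 code units={utf16_code_units(ch)}"
    )
```

Each sample reports 4 bytes in UTF-8 and 2 code units in UTF-16, whereas a BMP character such as "A" reports 1 of each.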


Remedy Action Request System 20.02
