Locale and Codeset

A locale is the information that PATROL uses to read and write language-specific text. A locale includes information such as sort order, date and time formats, special characters, and currency formats. A locale also includes a codeset, which is a set of rules that the PATROL Agent uses to interpret the ones and zeros of a text file as characters. For example, eucJP is the name of a Japanese codeset. A locale name follows the format language_country.codeset, where language consists of two lowercase letters that are defined by ISO 639, country consists of two uppercase letters that are defined by ISO 3166, and codeset is the name of the codeset.
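
The language_country.codeset format can be checked mechanically. The following is a minimal sketch in Python rather than PSL; parse_locale_name is a hypothetical helper written for illustration and is not part of PATROL. It only checks the shape of the name (two lowercase letters, two uppercase letters, a codeset), not whether the codes are actually registered in ISO 639 or ISO 3166.

```python
import re

# Hypothetical helper for illustration; not a PATROL or PSL function.
# Shape: two lowercase letters, "_", two uppercase letters, ".", codeset name.
_LOCALE_NAME = re.compile(r"([a-z]{2})_([A-Z]{2})\.(\S+)")


def parse_locale_name(name):
    """Split a locale name into its language, country, and codeset parts."""
    match = _LOCALE_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not in language_country.codeset form: {name!r}")
    language, country, codeset = match.groups()
    return {"language": language, "country": country, "codeset": codeset}


print(parse_locale_name("ja_JP.eucJP"))
```

A name such as JA_jp.eucJP is rejected because the letter case of the language and country parts is reversed.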

The following table shows the locale names that PATROL supports.

Locale Name    Language
ja_JP.CP932    Japanese
ja_JP.eucJP    Japanese
ja_JP.SJIS     Japanese
ko_KR.eucKR    Korean
zh_CN.gb       Simplified Chinese
zh_TW.big5     Traditional Chinese
zh_TW.eucTW    Traditional Chinese

The locale and codeset registries list additional locale and codeset names for mapping purposes. The names listed in the preceding table are currently the only ones that you can use with PSL functions. You can find the registries at the following directory paths:

  • PATROL_HOME/lib/nls/locale/loc_registry
  • PATROL_HOME/lib/nls/charmaps/cs_registry

Locale names play an important role in writing internationalized PSL scripts. Whenever a script reads, writes, or displays text, you must verify that it uses the correct locale to process that text. For example, if your script needs to read a text file encoded in the eucJP codeset, you must verify that PATROL is configured to read the file with the ja_JP.eucJP locale.
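
To illustrate why the codeset must match, here is a small sketch in Python rather than PSL. It does not show how PATROL itself is configured. The PYTHON_CODEC mapping and the read_text and demo helpers are assumptions made for this example: the mapping pairs some codeset names from the table with the Python codec that appears to correspond, and it omits names for which the correspondence is less clear.

```python
import os
import tempfile

# Assumed mapping from codeset names to Python codec names (not from PATROL).
PYTHON_CODEC = {
    "eucJP": "euc_jp",
    "SJIS": "shift_jis",
    "CP932": "cp932",
    "eucKR": "euc_kr",
    "big5": "big5",
}


def read_text(path, locale_name):
    """Read a file using the codeset part of a language_country.codeset name."""
    codeset = locale_name.split(".", 1)[1]
    # errors="replace" turns bytes that are invalid in the chosen codeset
    # into U+FFFD instead of raising, so the mismatch is visible in the result.
    with open(path, encoding=PYTHON_CODEC[codeset], errors="replace") as handle:
        return handle.read()


def demo():
    """Write a eucJP file, then read it with a matching and a mismatched locale."""
    original = "日本語"
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.txt")
        with open(path, "w", encoding="euc_jp") as handle:
            handle.write(original)
        right = read_text(path, "ja_JP.eucJP")
        wrong = read_text(path, "ja_JP.SJIS")
    return original, right, wrong


original, right, wrong = demo()
print(right == original)  # matching codeset recovers the text
print(wrong == original)  # mismatched codeset does not
```

The same bytes yield the original text only when they are decoded with the codeset that was used to write them.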

Multiple-Byte Characters

Generally, English-language codesets use a single byte to represent a character. Some international codesets need more than one byte to represent a single character because the writing systems they cover have far more characters than a single byte can distinguish. Any character that requires more than one byte is a multiple-byte character. All PSL built-in string functions support multiple-byte characters. You can, for example, use multiple-byte characters in the following types of string operations: string comparisons, sorts, and range expressions in regular expressions.
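
The difference between characters and bytes can be seen in a short sketch, again in Python rather than PSL. The byte_lengths helper is hypothetical, and euc_jp is Python's codec name for EUC-JP, assumed here to correspond to the eucJP codeset.

```python
def byte_lengths(text, codec):
    """Return (character, encoded byte count) pairs for text in the given codec."""
    return [(char, len(char.encode(codec))) for char in text]


sample = "A日本"
pairs = byte_lengths(sample, "euc_jp")
print(pairs)                               # [('A', 1), ('日', 2), ('本', 2)]
print(len(sample))                         # 3 characters
print(sum(count for _, count in pairs))    # 5 bytes
```

The ASCII letter takes one byte while each kanji takes two, so a function that counted bytes instead of characters would report the wrong length for this string.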

Where to go from here

Introduction to Internationalized PSL Scripts
