BGS2T AT Command Set
1.6 Supported character sets
BGS2T_ATC_V01.301
1/31/12
BGS2T supports two character sets: GSM 7 bit, also referred to as GSM alphabet or SMS alphabet (3GPP TS 23.038), and UCS2 16 bit (ISO-10646). See AT+CSCS for information about selecting the character set. Character tables can be found below.
Explanation of terms
• Escape Character
There are two types of escape sequences which lead to an alternative interpretation of subsequent characters by the ME:
- AT command interface
Escape sequences starting with character value 0x5C are used for the ME's non-UCS2 input and output.
- GSM 7 bit default alphabet
The escape sequence used within a text coded in the GSM 7 bit default alphabet starts with character value 0x1B and needs to be correctly interpreted by the TE, both for character input and output. To the BGS2T, an escape sequence appears like any other byte received or sent.
For SMS user data input after the prompt '>' in text mode (AT+CMGF=1) and AT+CSCS="GSM", the character 0x1A is interpreted as 'CTRL-Z', the character 0x1B as 'ESC', and the escape character 0x5C as 'Ö'. Therefore, neither escape mechanism is supported in this case.
• TE Character Set
The character set currently used by the Customer Application is selected with AT+CSCS. Selecting the UCS2 setting is recommended.
• Data Coding Scheme (DCS)
DCS is part of a short message and is saved on the SIM. When writing a short message to the SIM in text mode, the DCS stored with AT+CSMP is used and determines the coded character set.
• International Reference Alphabet (IRA)
The International Reference Alphabet is equivalent to ASCII (American Standard Code for Information Interchange) and ISO 646, i.e. it defines a 7-bit coded character set. The mapping can be obtained from the character set tables below (UCS2 values 0x0000 to 0x007F).
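The relationship between IRA and the UCS2 range 0x0000 to 0x007F can be illustrated with a minimal sketch. It assumes the hexadecimal string convention of 3GPP TS 27.007 for UCS2 (four hex digits per 16-bit character), which this section does not spell out; the function names are hypothetical and not part of any BGS2T interface.

```python
def to_ucs2_hex(text: str) -> str:
    """Encode text as big-endian UCS2 and return it as uppercase hex digits."""
    for ch in text:
        if ord(ch) > 0xFFFF:
            raise ValueError(f"{ch!r} is outside the 16-bit UCS2 range")
    return text.encode("utf-16-be").hex().upper()


def from_ucs2_hex(hex_text: str) -> str:
    """Decode a string of 4-digit hex groups back into text."""
    if len(hex_text) % 4 != 0:
        raise ValueError("expected 4 hex digits per character")
    return bytes.fromhex(hex_text).decode("utf-16-be")


# IRA characters keep their 7-bit value, padded to 16 bits.
print(to_ucs2_hex("Abc"))   # 004100620063
print(to_ucs2_hex("Ö"))     # 00D6
```

Under this assumption, every IRA character appears as "00" followed by its ASCII value, while characters such as "Ö" need no escape mechanism at all.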
When you enter characters that are not valid characters of the supported alphabets, the behavior is undefined.
If GSM alphabet is selected, all characters sent over the serial line (between TE and ME) must be in the range
from 0 to 127 (7 bit range).
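A Customer Application could check the 7 bit range before sending data. The following is only a sketch of such a check; the function name is hypothetical.

```python
def first_non_7bit(data: bytes) -> int:
    """Return the index of the first byte above 127, or -1 if all bytes fit in 7 bits."""
    for index, value in enumerate(data):
        if value > 0x7F:
            return index
    return -1


print(first_non_7bit(b"AT+CMGS"))                  # -1
print(first_non_7bit("Börse".encode("latin-1")))   # 1
```

In the second call, "ö" encoded as Latin-1 has the value 0xF6, which is outside the permitted range when the GSM alphabet is selected.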
Note: If the ME is configured for GSM alphabet, but the Customer Application (TE) uses ASCII, bear in mind that
some characters have different code values, such as the following:
• The "@" character with GSM alphabet value 0 is not displayable by an ASCII terminal program, e.g. Microsoft© Hyperterminal®.
• The "@" character with GSM alphabet value 0 will terminate any C string, because value 0 is defined as the C string terminator. Therefore, the GSM Null character will cause problems at application level when using C functions, e.g. "strlen()". Using an escape sequence as shown in the table below solves the problem. This may also be the reason why even network providers sometimes replace '@' with "@=*" in their SIM application.
• Some other characters of the GSM alphabet may be misinterpreted by an ASCII terminal program. For example, GSM "ö" (as in "Börse") is assumed to be "|" in ASCII, thus resulting in "B|rse". This is because the two alphabets assign different characters to value 7C (hexadecimal).
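The differences listed above can be reproduced with a short sketch. The table is deliberately partial: it lists only some GSM 7 bit default alphabet values (per 3GPP TS 23.038) whose character differs from the ASCII character with the same value. The names GSM_DIFFERS, gsm_to_text and c_strlen are hypothetical, and c_strlen merely mimics the behavior of the C function in Python.

```python
# Partial table of GSM values that do not match ASCII (3GPP TS 23.038).
GSM_DIFFERS = {
    0x00: "@", 0x02: "$", 0x24: "¤", 0x40: "¡", 0x5B: "Ä", 0x5C: "Ö",
    0x5D: "Ñ", 0x5E: "Ü", 0x5F: "§", 0x60: "¿", 0x7B: "ä", 0x7C: "ö",
    0x7D: "ñ", 0x7E: "ü", 0x7F: "à",
}


def gsm_to_text(data: bytes) -> str:
    """Render GSM alphabet bytes as text, using the partial table above."""
    out = []
    for value in data:
        if value in GSM_DIFFERS:
            out.append(GSM_DIFFERS[value])
        elif 0x20 <= value <= 0x7A or value in (0x0A, 0x0D):
            out.append(chr(value))   # same character as in ASCII
        else:
            out.append("?")          # not covered by this partial table
    return "".join(out)


def c_strlen(data: bytes) -> int:
    """Mimic C strlen(): count bytes up to the first 0x00."""
    end = data.find(b"\x00")
    return len(data) if end == -1 else end


boerse = bytes([0x42, 0x7C, 0x72, 0x73, 0x65])
print(gsm_to_text(boerse))       # Börse
print(boerse.decode("ascii"))    # B|rse

mail = bytes([0x61, 0x00, 0x62])            # "a@b" in GSM alphabet
print(gsm_to_text(mail), c_strlen(mail))    # a@b 1
```

The last line shows the truncation problem: the buffer holds three characters, but a C-style length calculation stops at the "@" and reports one.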
If the TE needs to send characters that are coded differently or are undefined in ASCII or GSM (e.g. Ä, Ö, Ü), it can use escape sequences. The ME's input parser translates each escape sequence to the corresponding GSM character value.
Note: The ME also uses escape sequences for its non-UCS2 output: the quotation mark (") and the escape character itself (\, respectively Ö in GSM alphabet) are converted, as well as all characters with a value below 32 (hexadecimal 0x20). Hence, the input parser of the Customer Application needs to be able to translate escape sequences back to the corresponding character of the currently used alphabet.
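A possible shape for such a parser is sketched below. The exact escape format is not given in this section, so the sketch assumes that an escape sequence consists of the escape character 0x5C followed by two hexadecimal digits giving the character value (for example \22 for the quotation mark). This format and the function names are assumptions and should be checked against the escape sequence table.

```python
ESCAPE = 0x5C   # escape character value named in the text above
HEX_DIGITS = b"0123456789ABCDEFabcdef"


def escape_output(data: bytes) -> bytes:
    """Escape values below 0x20, the quotation mark and the escape character."""
    out = bytearray()
    for value in data:
        if value < 0x20 or value in (0x22, ESCAPE):
            out += f"\\{value:02X}".encode("ascii")   # assumed \hh format
        else:
            out.append(value)
    return bytes(out)


def unescape_input(data: bytes) -> bytes:
    """Translate assumed \\hh escape sequences back to single character values."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == ESCAPE:
            digits = data[i + 1:i + 3]
            if len(digits) != 2 or not all(d in HEX_DIGITS for d in digits):
                raise ValueError(f"malformed escape sequence at offset {i}")
            out.append(int(digits.decode("ascii"), 16))
            i += 3
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


print(escape_output(b'say "hi"'))    # b'say \\22hi\\22'
print(unescape_input(b"a\\00b"))     # b'a\x00b'
```

Because the value 0 is carried as the three printable characters \00, the escaped form can be handled with C string functions without being truncated at the GSM "@".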
Confidential / Released