Manual of Board ET-BASE GSM SIM900
17. Principles of decoding Unicode
A Unicode character, as used here, always consists of a 2-byte code.
The first byte selects the code table, that is, the language of the
character. For English, the first byte is 00H and the second byte is
the character code, which matches the ASCII code. For Thai, the first
byte is 0EH and the second byte is the character code. As seen in the
demonstration of receiving SMS, an SMS written only in English is
sent entirely in ASCII, so each character takes 1 byte. If an SMS
contains both Thai and English characters, the English characters
are encoded in Unicode as well.
In summary, an SMS containing Thai characters always uses Unicode,
whereas English text can be either Unicode or ASCII. In Unicode, an
English character uses a 2-byte code just like a Thai character; its
value is in the range 0000H…007FH, with 00H as the first byte. If
the SMS contains only English characters, the character code is
ASCII: it uses a 1-byte code and omits the leading 00H of the Unicode
form. For example, “A” is sent as 41H instead of 0041H. If the SMS
contains both Thai and English characters, all characters are
encoded in Unicode, the same as Thai characters.
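The encoding rule summarized above can be sketched in Python. This
is only an illustration of the rule as this manual states it, not
the actual firmware of the module or the phone; the function name
encode_sms_text is hypothetical, and the sketch assumes every
character fits in a single 2-byte code.

```python
def encode_sms_text(text: str) -> bytes:
    """Encode text following the rule described in this section.

    Hypothetical helper for illustration only.
    - English-only text (all codes 00H..7FH): 1 byte per character (ASCII).
    - Any other text (e.g. Thai mixed with English): 2 bytes per
      character, high byte first (00H for English, 0EH for Thai).
    """
    if all(ord(ch) <= 0x7F for ch in text):
        # ASCII only: the leading 00H of each character is omitted.
        return text.encode("ascii")
    # Mixed or Thai text: every character becomes a 2-byte code.
    # Assumption: all characters fit in 2 bytes (no surrogate pairs).
    return text.encode("utf-16-be")


if __name__ == "__main__":
    print(encode_sms_text("A").hex().upper())
    print(encode_sms_text("\u0e2aA").hex().upper())
```

In this sketch, an English-only message keeps the 1-byte form, while
adding a single Thai character switches the whole message to the
2-byte form.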
The user therefore has to take this into account when decoding each
character. A byte in the range 20H-7FH is an ASCII code and can be
displayed immediately. A byte of 00H indicates English Unicode, and
the character code is in the next byte. A byte of 0EH indicates Thai
Unicode, and the character code is likewise in the next byte.
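The decoding rule above can be sketched as a small byte-by-byte
decoder in Python. The function name decode_sms_bytes is
hypothetical, and the handling of bytes that match none of the three
cases (they are skipped here) is an assumption, since this section
does not say what to do with them.

```python
def decode_sms_bytes(data: bytes) -> str:
    """Decode message bytes using the three cases described above.

    Hypothetical helper for illustration only.
    """
    chars = []
    i = 0
    while i < len(data):
        b = data[i]
        if b in (0x00, 0x0E) and i + 1 < len(data):
            # 00H = English Unicode, 0EH = Thai Unicode:
            # the character code is in the next byte.
            chars.append(chr((b << 8) | data[i + 1]))
            i += 2
        elif 0x20 <= b <= 0x7F:
            # Plain ASCII code: can be displayed immediately.
            chars.append(chr(b))
            i += 1
        else:
            # Assumption: any other byte (e.g. 0DH, 0AH) is skipped.
            i += 1
    return "".join(chars)


if __name__ == "__main__":
    sample = bytes([0x0E, 0x2A, 0x0E, 0x27, 0x00, 0x4A, 0x00, 0x61])
    print(decode_sms_bytes(sample))
    print(decode_sms_bytes(b"Jack"))
```

Because a 00H or 0EH byte always consumes the byte that follows it,
a Thai character code that happens to lie in the range 20H-7FH is not
mistaken for an ASCII character.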
For example, suppose the SMS “สวสด Jack” is sent to the SIM900
module, and the module receives the message successfully and stores
it in the first location. A program such as HyperTerminal, or any
other terminal that displays the result as ASCII, reports the result
as shown in the picture below:
+CMTI: "SM",1
When the received data is displayed as a HEX string, the user finds
that there is more data than what appears on the HyperTerminal
screen. This is because a terminal program only displays printable
character codes (20H…FFH); codes below 20H (00H-1FH) are treated as
control codes. For example, 0DH,0AH is not displayed; it is
interpreted as a command to move the cursor to the beginning of the
line and start a new line. For this reason, the received data is
shown here as a HEX string instead; for example, the ASCII code of
the character “A” is shown as “41”. The HEX string is shown on the
left and the ASCII text on the right so that the two can be compared
and the format is easier to understand. For the message
+CMTI: "SM",1 shown on the HyperTerminal screen, the HEX string form
is:
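As an illustration of the side-by-side layout described above (HEX
string on the left, ASCII on the right), the following Python sketch
prints such a dump. It is not the tool used to produce the pictures
in this manual; the function name hex_ascii_dump, the 8-byte row
width, and the use of "." for non-printable bytes are all
assumptions. The sample data also assumes that the module frames the
message with 0DH,0AH before and after the text.

```python
def hex_ascii_dump(data: bytes, width: int = 8) -> str:
    """Return a dump with HEX on the left and ASCII on the right.

    Hypothetical helper for illustration only.
    """
    pad = width * 3 - 1  # width of a full row of "XX XX ... XX"
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Bytes outside 20H..7EH are shown as "." on the ASCII side.
        ascii_part = "".join(
            chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk
        )
        rows.append(hex_part.ljust(pad) + "  " + ascii_part)
    return "\n".join(rows)


if __name__ == "__main__":
    # Assumption: the notification is framed by CR LF (0DH, 0AH).
    received = b'\r\n+CMTI: "SM",1\r\n'
    print(hex_ascii_dump(received))
```

The control codes 0DH and 0AH, which the terminal screen does not
show, appear in the HEX column of such a dump.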
ETT CO., LTD.
www.etteam.com