Encoding and count of characters
Generally, the SMS text must be transferred to the gateway in UTF-8 encoding. You can use the parameter “is_unicode” to choose whether the message is sent as a normal 7-bit SMS or as a UCS-2 SMS.
GSM 7 bit Standard-Alphabet (GSM 03.38)
The basic character set contains all characters that can be transmitted with a normal 7-bit SMS. Each of these characters counts as 1 character. Characters from the extended character set can also be transferred, but each one counts as 2 characters because it is sent as an escape character followed by the character code. An SMS that contains more than 160 characters is delivered to the handset as multiple SMS, each with up to 153 characters, and the phone concatenates them into one message. If you explicitly send an SMS with the GSM 03.38 charset and your message contains characters that are not part of that charset, our system will try to convert your text using transliteration.
Basic Character Set
| | 0x00 | 0x10 | 0x20 | 0x30 | 0x40 | 0x50 | 0x60 | 0x70 |
|---|---|---|---|---|---|---|---|---|
| 0x00 | @ | Δ | SP | 0 | ¡ | P | ¿ | p |
| 0x01 | £ | _ | ! | 1 | A | Q | a | q |
| 0x02 | $ | Φ | " | 2 | B | R | b | r |
| 0x03 | ¥ | Γ | # | 3 | C | S | c | s |
| 0x04 | è | Λ | ¤ | 4 | D | T | d | t |
| 0x05 | é | Ω | % | 5 | E | U | e | u |
| 0x06 | ù | Π | & | 6 | F | V | f | v |
| 0x07 | ì | Ψ | ' | 7 | G | W | g | w |
| 0x08 | ò | Σ | ( | 8 | H | X | h | x |
| 0x09 | Ç | Θ | ) | 9 | I | Y | i | y |
| 0x0A | LF | Ξ | * | : | J | Z | j | z |
| 0x0B | Ø | ESC | + | ; | K | Ä | k | ä |
| 0x0C | ø | Æ | , | < | L | Ö | l | ö |
| 0x0D | CR | æ | - | = | M | Ñ | m | ñ |
| 0x0E | Å | ß | . | > | N | Ü | n | ü |
| 0x0F | å | É | / | ? | O | § | o | à |
- LF is a Line Feed control.
- CR is a Carriage Return control, or filler.
- ESC is an Escape control.
- SP is a Space character.
Extended Character Set
| | 0x00 | 0x10 | 0x20 | 0x30 | 0x40 | 0x50 | 0x60 | 0x70 |
|---|---|---|---|---|---|---|---|---|
| 0x00 | | | | | \| | | | |
| 0x01 | | | | | | | | |
| 0x02 | | | | | | | | |
| 0x03 | | | | | | | | |
| 0x04 | | ^ | | | | | | |
| 0x05 | | | | | | | € | |
| 0x06 | | | | | | | | |
| 0x07 | | | | | | | | |
| 0x08 | | | { | | | | | |
| 0x09 | | | } | | | | | |
| 0x0A | FF | | | | | | | |
| 0x0B | | SS2 | | | | | | |
| 0x0C | | | | [ | | | | |
| 0x0D | CR2 | | | ~ | | | | |
| 0x0E | | | | ] | | | | |
| 0x0F | | | \\ | | | | | |
- FF is a Page Break control. If not recognized, it shall be treated like LF.
- CR2 is a control character. No language specific character shall be encoded at this position.
- SS2 is a second Single Shift Escape control reserved for future extensions.
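The counting rule for 7-bit SMS (1 character for the basic set, 2 for the extended set) can be sketched as follows. This is only an illustration: the names `GSM7_BASIC`, `GSM7_EXTENDED` and `gsm7_length` are hypothetical, the sets are taken from the GSM 03.38 tables, and the gateway's own counting and transliteration may differ in detail.

```python
# GSM 03.38 basic character set, column by column (0x00 to 0x70).
# ESC is left out on purpose: it only introduces extended characters.
GSM7_BASIC = frozenset(
    "@£$¥èéùìòÇ\nØø\rÅå"
    "Δ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./"
    "0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)

# GSM 03.38 extended character set: each one is sent as ESC + code,
# so it occupies 2 characters. "\f" is the FF (page break) control.
GSM7_EXTENDED = frozenset("^{}\\[~]|€\f")


def gsm7_length(text):
    """Return the 7-bit character count of text, or None if a character
    is in neither GSM set (such text would need UCS-2 or transliteration)."""
    total = 0
    for ch in text:
        if ch in GSM7_EXTENDED:
            total += 2
        elif ch in GSM7_BASIC:
            total += 1
        else:
            return None
    return total
```

For example, `gsm7_length("Price: 5€")` gives 10, because the euro sign counts twice, while a text containing Chinese or Cyrillic characters gives `None`.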
UCS-2 (Unicode SMS)
With UCS-2, each character is encoded in 16 bits, which covers the 65,536 code points of the Unicode Basic Multilingual Plane and therefore the scripts of nearly every written language. Each character counts as 1 character, but a Unicode SMS can contain only 70 characters; longer texts are sent as multiple concatenated SMS, each with up to 67 characters.
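A rough way to check a Unicode text against the 70-character limit is to count its 16-bit units. The sketch below assumes the text is carried as UTF-16 big-endian, so a character outside the Basic Multilingual Plane (for example an emoji) occupies two units; this is an assumption about handset behaviour, not something the gateway documentation confirms. The function names are hypothetical.

```python
def ucs2_units(text):
    """Count the 16-bit units text occupies when encoded as UTF-16 big-endian."""
    return len(text.encode("utf-16-be")) // 2


def fits_single_unicode_sms(text):
    """True if the text fits into one Unicode SMS of 70 characters."""
    return ucs2_units(text) <= 70
```

With this sketch, `ucs2_units("Привет")` is 6, and a text of 71 Cyrillic letters no longer fits into a single Unicode SMS.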
Concatenated SMS
For technical reasons, a single SMS can carry only 160 characters of text (70 characters of Unicode text). If you send an SMS with more characters, it is sent as a so-called concatenated SMS. Overlong text messages (also called multi-SMS, concatenated SMS, or long SMS) are split into parts that are transmitted separately; each part is a separate text message. The receiving handset reassembles the parts into one coherent text. To mark a message as concatenated and to identify which part belongs where, 7 of the available characters (or 3 in Unicode) are needed. This is why one part of an overlong SMS can contain only 153 characters (or 67 for Unicode) of your text. The maximum length of one concatenated message is 1530 characters (or 670 characters for Unicode).
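The arithmetic above (160 - 7 = 153, 70 - 3 = 67, and at most 10 parts) can be sketched as a small helper. The name `count_parts` and its signature are hypothetical; `length` is assumed to be the already counted number of characters, with extended 7-bit characters counted as 2.

```python
# Limits taken from the text above.
SINGLE_LIMIT = {False: 160, True: 70}  # one SMS, no concatenation header
PART_LIMIT = {False: 153, True: 67}    # 160 - 7 and 70 - 3
MAX_PARTS = 10                         # 1530 / 153 and 670 / 67


def count_parts(length, is_unicode=False):
    """Return how many SMS parts a message of `length` counted characters needs."""
    if length < 0:
        raise ValueError("length must not be negative")
    if length <= SINGLE_LIMIT[is_unicode]:
        return 1
    parts = -(-length // PART_LIMIT[is_unicode])  # ceiling division
    if parts > MAX_PARTS:
        raise ValueError("message exceeds the maximum concatenated length")
    return parts
```

For example, `count_parts(161)` returns 2 and `count_parts(670, is_unicode=True)` returns 10, while 1531 characters of 7-bit text raises a `ValueError`.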