Understanding Character Variants
Scripts and Characters
Script:    Latin    Arabic  Han      Greek
Language:  English  Farsi   Chinese  Greek
Script
A script is a collection of symbols used to
represent textual information in one or more languages.
Examples of scripts:
Arabic, Cyrillic, Greek, Han, Hiragana, Latin
Character
A character, in an abstract sense, is an element
of writing that is the smallest quantity having semantic value. A character
is the basic building block of any script, and thus any written language.
It invokes a meaning at a fundamental level; you cannot break a character
down any further and still have meaning.
Examples:
The Latin character "A", as used in English, means something
at a fundamental level. But a smaller portion of the character, such
as the left leg "/" or cross member "-", has no
meaning in English.
In languages written in the Latin script, such as German, French,
and English, several characters are needed to form a word that represents
a complete idea (e.g., "light").
In Chinese, Japanese, and Korean writing, the same definition of character
applies, but a single character can represent an idea. For example, a single
Chinese character expresses the same idea as the word "light" does in English.
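As a rough illustration of the last two examples, the Python sketch below counts the characters in each written form. The choice of 光 as the Chinese character for "light" is an assumption made for this example; the text above does not name a specific character. Python's len counts Unicode code points, which matches the number of characters for these simple strings.

```python
# Count the characters needed to write the same idea in two scripts.
# Assumption: 光 is used here as the Chinese character for "light".
english_word = "light"
chinese_word = "光"

english_count = len(english_word)  # five Latin characters
chinese_count = len(chinese_word)  # one Han character

print(english_word, english_count)
print(chinese_word, chinese_count)
```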
Written Language
A written language is a writing system made
of characters from one or more scripts.
Examples of languages:
English, French, Japanese, Russian, Urdu
See Supported Character Scripts for more information on character scripts.
Encoded Character Set
When grouped together, all the characters of
a particular script form a character repertoire. Assign a unique number
(a code point) to each character in the repertoire, and you have a coded
character set. ASCII, for example, is a coded character set in which uppercase
"A" is assigned the number 65.
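The mapping between characters and numbers can be checked directly. The following is a minimal Python sketch using the built-in ord and chr functions; the three-letter repertoire and the variable names are hypothetical and chosen only for illustration.

```python
# A coded character set maps each character to a number and back.
# For characters in the ASCII range, ord() returns the ASCII number.
code = ord("A")   # number assigned to uppercase "A"
char = chr(65)    # character assigned to the number 65

# Build a tiny coded character set from a hypothetical repertoire.
repertoire = ["A", "B", "C"]
coded_set = {c: ord(c) for c in repertoire}

print(code, char, coded_set)
```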
Unicode
Unicode is another example of a coded character
set. The idea behind Unicode was to create a universal character set
that covers all the major scripts of the world. Because of this, Unicode
is the coded character set of choice for internationalized domain names
(IDNs). As of this writing, Unicode is still being updated with new scripts
and new characters.
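To make the idea of a universal coded character set concrete, here is a minimal Python sketch using the standard unicodedata module. It prints the Unicode number and official name for one character from each of four scripts (Latin, Arabic, Han, Greek). The sample characters and the describe helper are assumptions chosen for illustration.

```python
import unicodedata

# One sample character per script: Latin, Arabic, Han, Greek.
# These particular characters are assumptions picked for the example.
samples = ["A", "ا", "光", "α"]

def describe(ch):
    """Return (code point in U+XXXX form, official Unicode character name)."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch)

descriptions = {ch: describe(ch) for ch in samples}
for ch, (code_point, name) in descriptions.items():
    print(ch, code_point, name)
```

Note that the number for "A" is 65 (U+0041), the same number it has in ASCII.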