ProZ.com website localization

What is Unicode?

To understand what Unicode is, it helps to first understand the concept of a character set.

When text is entered into a computer, it is stored in binary, like all other data. A character set is essentially a table that tells the computer how to turn the stored binary values into actual characters on the screen. Most older character sets use a single byte per character and therefore only have room for a maximum of 256 characters, so originally each language family would have its own character set that mapped the stored data to the characters of that language's alphabet or symbols.
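As a rough illustration of the "table" idea, the short Python sketch below decodes one stored byte through three single-byte character sets. The function name decode_with_charsets and the choice of byte and character sets are only assumptions made for this example; the codec names are ones that ship with Python.

```python
def decode_with_charsets(raw, charsets):
    """Interpret the same stored bytes through several character sets."""
    return {name: raw.decode(name) for name in charsets}


# One stored byte with the value 0xE9 (233 in decimal).
stored = bytes([0xE9])

# Western European, Cyrillic, and Greek single-byte character sets.
result = decode_with_charsets(stored, ["latin-1", "cp1251", "iso8859_7"])

for name, character in result.items():
    print(name, "->", character)
```

The stored data is identical in all three cases; only the table used to look it up changes, which is why the same byte can appear as a Latin, Cyrillic, or Greek letter.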

Because character sets were specific to languages that shared the same symbols, and there was often more than one character set for any given language, exchanging text between computers became increasingly difficult as the internet and international exchanges became more widespread.

To solve this problem, the Unicode character set was created. The intent of Unicode is to offer a single character set that can represent the characters of as many languages as possible. For most people the technical details are not important; what matters is that with Unicode it is possible to display almost any character on the same page or in the same document without having to worry about which character set needs to be used.
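For readers who do want a glimpse of the technical side, here is a minimal Python sketch of text from several scripts living in one string. The helper name code_points and the sample text are assumptions chosen for illustration, and UTF-8 is used as one common way of storing Unicode text.

```python
def code_points(text):
    """Return the Unicode code point of each non-space character."""
    return [f"U+{ord(ch):04X}" for ch in text if not ch.isspace()]


# Latin, Cyrillic, Greek, and Japanese characters in a single string.
mixed = "é й ι 語"

for ch, point in zip(mixed.split(), code_points(mixed)):
    print(ch, point)

# The whole string can be stored with one encoding and read back unchanged.
stored = mixed.encode("utf-8")
assert stored.decode("utf-8") == mixed
```

Each character has its own number in the single Unicode table, so no switch between character sets is needed when the languages are mixed.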