
From LQWiki


ISO 10646 is an international standard that aims to represent every character used in every written language in the world. It also defines characters for historic scripts and for decorative symbols.

The goal of ISO 10646 is very similar to that of Unicode, so the two projects cooperate and keep their character assignments compatible. ISO 10646 is geared toward the character set itself, while Unicode ties up the loose ends by describing how to handle and represent that data on a computer system.

ISO 10646 is completely compatible with ASCII. The first 128 values of ISO 10646 map directly to the 128 ASCII characters, so converting from ASCII to ISO 10646 is trivial: each character keeps the same number.
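The mapping can be illustrated with a minimal Python sketch. The helper name `ascii_to_ucs` is hypothetical and exists only for this example. The sketch relies on Python's built-in `ord()`, which returns Unicode code points, and assumes these match the ISO 10646 values.

```python
def ascii_to_ucs(data: bytes) -> list:
    """Map each 7-bit ASCII byte to its ISO 10646 code point.

    Hypothetical helper: the conversion is the identity, because the
    first 128 code points have the same numbers as ASCII.
    """
    if any(b > 0x7F for b in data):
        raise ValueError("input is not 7-bit ASCII")
    return list(data)  # iterating over bytes yields the integer values


code_points = ascii_to_ucs(b"Linux")

# ord() gives the code point of each character; the values agree with the raw ASCII bytes.
assert code_points == [ord(c) for c in "Linux"]
print(code_points)  # [76, 105, 110, 117, 120]
```

Bytes above 0x7F are rejected because they are outside ASCII, so the identity mapping does not apply to them.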

The major drawback of ISO 10646 is that, stored at its full fixed width, it requires 32 bits for each character. (The code space was originally 31 bits wide and was later restricted to the range 0 to 0x10FFFF, which is what Unicode uses.) In comparison, it takes 7 bits to represent one ASCII character. Since computers typically cannot directly address data smaller than a byte, an ASCII character takes 1 byte of storage in practice, while a fixed-width ISO 10646 character takes 4 bytes. (See Unicode for more details.)
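The difference in storage can be checked with a short Python sketch. It assumes that Python's `utf-32-be` codec is an acceptable stand-in for the fixed 4-byte form described above. The variable names are only illustrative.

```python
text = "Linux"

# ASCII: one byte per character.
ascii_bytes = text.encode("ascii")

# Fixed-width 4-byte form; the "-be" variant writes no byte-order mark,
# so the length is exactly 4 bytes per character.
wide_bytes = text.encode("utf-32-be")

print(len(ascii_bytes), len(wide_bytes))  # 5 20

# A single ASCII character is padded with three zero bytes.
print("A".encode("utf-32-be"))  # b'\x00\x00\x00A'
```

For plain ASCII text, the fixed-width form is four times larger, and three of every four bytes are zero.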

