Utf-16

From LQWiki

UTF-16 is a Unicode encoding. To create an example, use the editor yudit: create a text file u that contains only the letter A and save it as UTF-16. Then analyze it like this:

duffman:~ # ll u
-rw-r--r--  1 root root 4 Nov  4 11:07 u
duffman:~ # hexdump u
0000000 feff 0041
0000004
duffman:~ # cat u
A

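The file starts with a fingerprint, the byte order mark (BOM, U+FEFF), which tells a reader that the file is UTF-16 and in which byte order it was written. Note that hexdump prints 16-bit words in the host's byte order by default, so on a little-endian machine such as x86 the output feff 0041 corresponds to the bytes ff fe 41 00 on disk, which is UTF-16 little-endian. After the BOM, every character takes at least two bytes: an ASCII character such as A (41h) is padded with a zero byte to fill 16 bits, which is why the file is 4 bytes long.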
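The byte layout described above can be reproduced without an editor. The following is a minimal Python sketch, not part of the original session; the helper name utf16_le_with_bom is made up for illustration, and little-endian byte order is assumed, as on an x86 machine.

```python
import codecs


def utf16_le_with_bom(text: str) -> bytes:
    """Encode text as UTF-16 little-endian, prefixed with the byte order mark.

    Hypothetical helper for illustration; the byte order is fixed to
    little-endian here so the result does not depend on the host.
    """
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


data = utf16_le_with_bom("A")

# Bytes as they would sit on disk: BOM (ff fe), then A padded with a zero byte.
print(data.hex(" "))          # ff fe 41 00
print(len(data))              # 4

# The generic "utf-16" decoder reads the BOM to pick the byte order and drops it.
print(data.decode("utf-16"))  # A

# "At least two bytes": a character outside the Basic Multilingual Plane
# is stored as a surrogate pair and takes four bytes.
print(len("\U0001F600".encode("utf-16-le")))  # 4
```

Writing data to a file with open("u", "wb") should give a file that hexdump shows as feff 0041 on a little-endian machine, matching the session above.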