FRUUG: What You Need to Know About Unicode

Truth is Stranger Than Fiction: Why You Need To Know About Unicode

At our September 2003 meeting, Bruce Haddon discussed the ins and outs of Unicode, the mother of all character encodings. We all know about scripts that run left-to-right, right-to-left, or top-to-bottom, but what about scripts such as Arabic, whose letters are cursive and must join to their neighbors as if the pen never left the paper? Or what about runs of left-to-right characters (like numerals) that are embedded in right-to-left text?
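To make the embedded-numerals question concrete, here is a minimal Python sketch (not taken from Bruce's talk) that looks up the bidirectional class Unicode assigns to each character. It uses the standard library's `unicodedata.bidirectional`; the helper name `bidi_classes` and the sample string are our own illustrative choices.

```python
import unicodedata

def bidi_classes(text):
    """Return (character, bidi class) pairs for each character in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

# Hypothetical sample: the Arabic letters seen, lam, alef, meem,
# then a space and three European digits.
sample = "\u0633\u0644\u0627\u0645 123"

# "AL" = Arabic letter (right-to-left), "WS" = whitespace,
# "EN" = European number (kept left-to-right inside right-to-left text).
classes = [cls for _, cls in bidi_classes(sample)]
print(classes)
```

A renderer that follows the Unicode bidirectional algorithm uses these classes to decide display order, so the digits still read left-to-right even though the surrounding letters run right-to-left.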

Unicode covers it all, and it turns out that Unicode sits at a very interesting intersection of computer science, phonetics, sociology, and archaeology. For example, if you were going to store text in ancient Egyptian hieroglyphs on a computer, how would you do it? Unicode, of course. Bruce's talk touched on this and on many of the subtleties of encoding all known languages in a single, common character set description.
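As a rough illustration of the hieroglyph question (again, not from the talk itself), the sketch below assumes a current Python 3, whose Unicode database includes the Egyptian Hieroglyphs block starting at U+13000. That block was added to the standard after version 4.0, so this reflects today's encoding rather than what was available in 2003.

```python
import unicodedata

# First code point of the Egyptian Hieroglyphs block (U+13000).
glyph = chr(0x13000)

# The character's formal Unicode name.
name = unicodedata.name(glyph)

# Because it lies outside the Basic Multilingual Plane, it needs
# four bytes in UTF-8 and a surrogate pair in UTF-16.
utf8 = glyph.encode("utf-8")
utf16 = glyph.encode("utf-16-be")

print(name, utf8.hex(), utf16.hex())
```

The point is only that a hieroglyph is stored like any other character: one code point, serialized by whichever encoding form the application uses.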

Bruce wrote a comprehensive book review (HTML, PDF) of The Unicode Standard, Version 4.0 (Addison-Wesley) that is almost a summary of his talk; his presentation slides are also available (HTML, PDF 10MB).



