ucs-tables.el by Dave Love
--------------------------

Put ucs-tables.el in your load-path and write this in your ~/.emacs or
~/.gnus:

  (require 'ucs-tables)
  (ucs-unify-8859)

Emacs 21.3 will have this file included, so make sure that you don't
load this separate ucs-tables.el if you're using the CVS version of
Emacs 21 or any released Emacs version greater than 21.1.

Raymond Scholz

////////////////

From: d.love@dl.ac.uk (Dave Love)
Subject: iso-8859/mule-unicode unification for Emacs 21
Newsgroups: gnu.emacs.sources
Date: Fri Oct 26 18:41:00 2001 +0200
X-Sent: 2 weeks, 1 day, 22 hours, 13 minutes, 33 seconds ago
Message-ID:
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.0.107

This package provides tables and a little code to `unify' equivalent
characters from Emacs's internal charsets.  For example, ?\xf69 and
?\x8e9 are both 'Latin small letter e with acute', which you might
type with Latin-9 and Latin-1 input methods respectively.  They are
distinct because of the unfortunate 8-bit European character set
standards (ISO 8859) and the use of the appropriate international
standard (ISO 2022) that Emacs follows to `multiplex' them together.
[Mule follows the relevant European-originated standards, and predates
a useful definition of Unicode.]

Normally a buffer containing both of those Emacs characters can only
be encoded (saved) in a general -- more-or-less Emacs-specific --
encoding: iso-2022-{7,8}bit or emacs-mule.  With unification enabled,
and, say, preferred coding system Latin-9, a buffer containing only
those two non-ASCII characters will be saved as Latin-9.  [This sort
of situation is probably most relevant when responding to mail in a
different encoding to what you normally use for input.]  If the buffer
contains characters which aren't common to a single supported 8859
set, it should probably be saved as utf-8 (see below).

This package directly supports unification of ISO 8859 on encoding or
decoding, as ISO 2022 suggests.
It disproves the mythinformation that it can't be done by Mule.
[Emacs 20 had the unification (`translation') hooks.]

Companion changes to utf-8.el mentioned in the commentary enable the
utf-8 coding system to encode ISO-8859-N characters for N>1.  I expect
to post them after checking the released code.  Similarly for
latin1-disp.el.

More unification could be done, e.g. of the European characters in the
Far Eastern character sets Emacs supports, but that's probably of
little interest.

Especially if you edit multilingual code for Emacs, note the warning
in the commentary about munging multilingual files, such as this one!

\\\\\\\\\\\\
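
One possible way to respect the load-path warning at the top of this
file is to add the standalone copy only when Emacs cannot already find
a ucs-tables library on its own.  The following is a minimal sketch
for an Emacs 21-era ~/.emacs, not part of the original package.  The
variable name my-ucs-tables-dir and the directory ~/elisp/ are
assumptions made for illustration; ucs-unify-8859 is the entry point
named in the installation note above.

```elisp
;; Hypothetical directory holding the standalone ucs-tables.el.
;; This name and path are assumptions; adjust them to your setup.
(defvar my-ucs-tables-dir "~/elisp/")

;; Add the standalone copy to load-path only when Emacs cannot
;; already find a ucs-tables library (for example a bundled one).
(unless (locate-library "ucs-tables")
  (add-to-list 'load-path (expand-file-name my-ucs-tables-dir)))

;; Load whichever copy is now visible and enable unification,
;; guarding against the entry point being absent.
(when (locate-library "ucs-tables")
  (require 'ucs-tables)
  (when (fboundp 'ucs-unify-8859)
    (ucs-unify-8859)))
```

This sketch assumes the standalone directory is not already on
load-path when the check runs.  If it is, locate-library will report
the standalone copy and the check cannot tell the two apart.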