--- /dev/null
+.arch-ids
+.arch-ids
+.arch-inventory
+.arch-ids
+.arch-ids
+.arch-inventory
+Makefile
+autom4te.cache
+config.cache
+config.log
+config.status
+rtf-parser.txt
+semantic.cache
+tests
+{arch}
--- /dev/null
+CATDOC CODING STANDARD
+~~~~~~~~~~~~~~~~~~~~~~
+0. CATDOC ISN'T WRITTEN IN C++!!!
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ C and C++ are different languages.
+ No // comments, no references, no declarations in the middle of a block.
+
+1. Catdoc is a portable program.
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Please never make the following assumptions:
+1. That int is more than 16 bits wide
+ (consequently, that signed int can hold a Unicode character)
+2. That sizeof(int)>=sizeof(int *)
+3. That int is always 16-bit (it can be 32-bit as well)
+4. That long is 32-bit
+5. That char (and int and short as well) is either signed or unsigned.
+ Always use an explicit signedness specifier.
+6. That integer arithmetic is 32 bits wide.
+7. That input is always seekable. Catdoc is often used as a filter.
+8. That filenames are either case-sensitive or case-insensitive
+9. That there is no difference between binary and text file opening modes
+10. That opening a file in text mode will do something reasonable.
+ Always open files in binary mode. This is the only way to produce
+ results that are consistent on all platforms.
+11. That you can rely on the compiler's POSIX or C99 compliance. If you need
+ to use some function defined by these standards, write a configure test
+ and provide a fallback.
+12. That you can allocate a chunk of memory larger than 64K.
+13. That filenames can be longer than 8+3.
+
+2. Catdoc is used world-wide
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+1. Never write comments in languages other than English.
+2. Never assume that you can output a character without passing it through
+ the convert_char function.
+
+3. Code formatting
+~~~~~~~~~~~~~~~~~
+1. Use <Tab> for indentation. If your text editor insists on <Tab> being
+ 8 chars wide, consider using some other editor. vim is at least a bit more
+ portable than catdoc.
+2. Put the opening curly bracket on the same line as the statement it belongs to:
+	if (condition) {
+		code
+	}
+ rather than
+	if (condition)
+	{
+		code
+	}
+
+3. The only exception to rule 2 is a block inside a switch statement:
+	switch (var) {
+	case value:
+		{
+			code
+		}
+	}
+ rather than
+	switch (var) {
+	case value: {
+		code
+	}
+	}
+
+4. Write comments at the start of each function describing its purpose
+ and arguments.
+
+5. If you use some potentially dangerous construct, such as sprintf on
+ a static buffer, comment why it is safe in this particular case.
+
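The portability rules in section 1 of the coding standard above can be sketched roughly as follows. This is only an illustration: the names ucs_char, make_ucs2 and open_input are hypothetical and are not taken from the catdoc sources. The sketch keeps to the standard itself (no // comments, declarations only at the start of a block, <Tab> indentation).

```c
#include <stdio.h>

/* Hypothetical type: int may be only 16 bits wide, so a Unicode
 * character is kept in an unsigned long, with explicit signedness. */
typedef unsigned long ucs_char;

/* Combine the two bytes of a little-endian 16-bit value without relying
 * on the signedness of plain char or on the width of int. */
ucs_char make_ucs2(unsigned char lo, unsigned char hi) {
	return ((ucs_char)hi << 8) | (ucs_char)lo;
}

/* Open an input file in binary mode, so that the bytes read are the
 * same on every platform. Returns NULL if the file cannot be opened. */
FILE *open_input(const char *name) {
	return fopen(name, "rb");
}
```

For instance, make_ucs2(0x26, 0x20) yields 0x2026, the symbol mentioned in the NEWS entry for 0.90b4.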
--- /dev/null
+ GNU GENERAL PUBLIC LICENSE
+ Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.
+ 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+ Preamble
+
+ The licenses for most software are designed to take away your
+freedom to share and change it. By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users. This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it. (Some other Free Software Foundation software is covered by
+the GNU Library General Public License instead.) You can apply it to
+your programs, too.
+
+ When we speak of free software, we are referring to freedom, not
+price. Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+ To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+ For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have. You must make sure that they, too, receive or can get the
+source code. And you must show them these terms so they know their
+rights.
+
+ We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+ Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software. If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+ Finally, any free program is threatened constantly by software
+patents. We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary. To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+ The precise terms and conditions for copying, distribution and
+modification follow.
+\f
+ GNU GENERAL PUBLIC LICENSE
+ TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+ 0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License. The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language. (Hereinafter, translation is included without limitation in
+the term "modification".) Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope. The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+ 1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+ 2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+ a) You must cause the modified files to carry prominent notices
+ stating that you changed the files and the date of any change.
+
+ b) You must cause any work that you distribute or publish, that in
+ whole or in part contains or is derived from the Program or any
+ part thereof, to be licensed as a whole at no charge to all third
+ parties under the terms of this License.
+
+ c) If the modified program normally reads commands interactively
+ when run, you must cause it, when started running for such
+ interactive use in the most ordinary way, to print or display an
+ announcement including an appropriate copyright notice and a
+ notice that there is no warranty (or else, saying that you provide
+ a warranty) and that users may redistribute the program under
+ these conditions, and telling the user how to view a copy of this
+ License. (Exception: if the Program itself is interactive but
+ does not normally print such an announcement, your work based on
+ the Program is not required to print an announcement.)
+\f
+These requirements apply to the modified work as a whole. If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works. But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+ 3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+ a) Accompany it with the complete corresponding machine-readable
+ source code, which must be distributed under the terms of Sections
+ 1 and 2 above on a medium customarily used for software interchange; or,
+
+ b) Accompany it with a written offer, valid for at least three
+ years, to give any third party, for a charge no more than your
+ cost of physically performing source distribution, a complete
+ machine-readable copy of the corresponding source code, to be
+ distributed under the terms of Sections 1 and 2 above on a medium
+ customarily used for software interchange; or,
+
+ c) Accompany it with the information you received as to the offer
+ to distribute corresponding source code. (This alternative is
+ allowed only for noncommercial distribution and only if you
+ received the program in object code or executable form with such
+ an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it. For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable. However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+\f
+ 4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License. Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+ 5. You are not required to accept this License, since you have not
+signed it. However, nothing else grants you permission to modify or
+distribute the Program or its derivative works. These actions are
+prohibited by law if you do not accept this License. Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+ 6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions. You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+ 7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License. If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all. For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices. Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+\f
+ 8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded. In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+ 9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time. Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number. If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation. If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+ 10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission. For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this. Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+ NO WARRANTY
+
+ 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+ 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+
+ END OF TERMS AND CONDITIONS
+\f
+ How to Apply These Terms to Your New Programs
+
+ If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+ To do so, attach the following notices to the program. It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+ <one line to give the program's name and a brief idea of what it does.>
+ Copyright (C) <year> <name of author>
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program; if not, write to the Free Software
+ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+ Gnomovision version 69, Copyright (C) year name of author
+ Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+ This is free software, and you are welcome to redistribute it
+ under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License. Of course, the commands you use may
+be called something other than `show w' and `show c'; they could even be
+mouse-clicks or menu items--whatever suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary. Here is a sample; alter the names:
+
+ Yoyodyne, Inc., hereby disclaims all copyright interest in the program
+ `Gnomovision' (which makes passes at compilers) written by James Hacker.
+
+ <signature of Ty Coon>, 1 April 1989
+ Ty Coon, President of Vice
+
+This General Public License does not permit incorporating your program into
+proprietary programs. If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library. If this is what you want to do, use the GNU Library General
+Public License instead of this License.
--- /dev/null
+Note: people listed in this file are listed in arbitrary order.
+Kawai Takanori (Hippo2000) <kwitknr@cpan.org>
+ Author of the Perl module Spreadsheet::ParseExcel, which I use as
+ a reference manual for the Excel format
+Alex Ott <ott@jet.msk.su>
+ Fixed handling of long SST, contributed handling of RK records,
+ wrote RTF and OLE parsers
+Pawel Wiecek <coven@debian.org>
+ Current maintainer of the Debian catdoc package
+Peter Novodvosky <nidd@debian.org>
+ Maintained the Debian package for catdoc.
+Bjorn Brenander <bjorn@debian.org>
+ Maintained the Debian package for catdoc.
+Eugene B. Byrganov <E.B.Byrganov@inp.nsk.su>
+ Suggested the -l switch, found me an example of a partly 8-bit/partly
+ 16-bit file and some typos in the builtin docs. Fixed some long-standing
+ bugs in the config-parsing code.
+Artem Chuprina <ran@ran.pp.ru>
+ Provided a lot of bugfixes and suggestions. Also maintained some
+ unofficial packaged versions of catdoc.
+Stephen Farrell <stephen@farrell.org>
+ Maintains the FreeBSD port, and persuaded me to write the autoconf
+ configuration
+Martin Kraemer <martin.kraemer@mch.sni.de>
+ Contributed some fixes for ascii.rpl and noted a typo in catdoc.h
+Arfst Ludwig <Arfst.Ludwig@LHSystems.COM>
+ Gave me the idea of creating README.charset
+Dmitry Potapov <dpotapov@capitalsoft.com>
+ Contributed the RTF-parsing code
+David Rysdam
+ Wrote the program biffview, which parses XLS files and was used as the base
+ for xls2csv.
+Duncan Simpson <dps@io.stargate.co.uk>
+ Audited the catdoc code for possible buffer overruns (and found many more
+ of them than actually existed)
--- /dev/null
+INSTALLING catdoc 0.91.x
+
+Starting with patchlevel alpha 3, catdoc version 0.90 has an autoconf
+configuration. Thanks to Stephen Farrell for convincing me.
+
+So typically you should run
+./configure
+make
+make install
+
+to compile and install catdoc.
+
+NOTE for HPUX users. If you want to compile catdoc with aCC,
+use CC="aCC -Ae" ./configure
+
+The configure script for catdoc recognizes the following options (apart from
+the standard --prefix, --exec-prefix and so on):
+
+--disable-wordview - disables building of the Tcl/Tk viewer wordview,
+ which requires X11. (Note that it is disabled automatically
+ if you don't have an appropriate version of Tcl/Tk.) You may
+ wish to use this if you don't have X installed.
+
+--with-wish=path - specifies the path to the wish interpreter. This option has
+ two uses:
+ 1. If the executable named wish found in your PATH is old, and
+ you have a newer wish installed as wish4.2 or wish8.0,
+ you should specify it in order to build the wordview viewer.
+ 2. If you are compiling catdoc over a telnet connection or on a
+ text console, you can specify this option to skip the Tcl
+ version check, which would run wish and fail if it couldn't
+ find an X display (which would lead configure to assume that
+ you don't have a suitable wish).
+
+--with-input=charset
+--with-output=charset
+ Allows you to specify the charset names to expect in 8-bit Word
+ files and to produce in the output text file. Run ls ./charsets/*.txt
+ to find out which charsets are provided in the distribution.
+ Additional charsets can be obtained from
+ ftp.unicode.org
+ Note that make will fail if you specify a charset that
+ doesn't exist in the charsets directory.
+
+--disable-charset-check
+ By default, make in the charsets directory fails if it is unable
+ to find the *.txt files corresponding to the default input and output
+ charsets. This option disables that check. Make
+ in the charsets directory will then always succeed, but it is your
+ responsibility to provide the charset files in the catdoc library
+ directory after make install.
+
+--disable-langinfo
+ By default, catdoc tries to use your current locale charset
+ as its output charset. It can, of course, always be overridden
+ by a command line switch. But the charset from the locale takes
+ precedence over the charset in the configuration file, unless
+ you put use_locale=no into this file.
+
+ If your C library is not XPG4-compatible, and configure fails
+ to detect it, you can completely disable langinfo support
+ using this switch.
+
+If you experience strange and unexpected behavior of catdoc, try
+removing the optimization flag (-O2) from FLAGS in src/Makefile.
+If you can write an autoconf test to check for this problem, please send it
+to me.
+
+This was a known problem with version 0.35 on HP/UX 9, and I have scarcely
+changed my style of writing since.
+
+
--- /dev/null
+INSTALLING catdoc 0.90a on MS-DOS system.\r
+\r
+Surprisingly, MS-DOS is the native platform for this version of catdoc.\r
+Unlike the previous version, which was a UNIX program ported to\r
+DOS, this one was developed under DOS on a nine-year-old 286 laptop\r
+with Turbo C 2.0.\r
+\r
+So, catdoc works perfectly well on MS-DOS systems. \r
+\r
+Documentation can be found in files CATDOC.TXT and CATDOC.PS\r
+(both produced by UNIX man command)\r
+\r
+If you've fetched the BINARY DISTRIBUTION, note the following:\r
+\r
+1. catdoc expects to find its system-wide configuration file\r
+ in the same directory as the executable (and therefore requires DOS\r
+ version 3 or above). If you wish to move the charset and special char\r
+ maps to a location other than the default (the charsets subdirectory of\r
+ the directory containing the executable), you must have this configuration\r
+ file.\r
+\r
+2. Any file name in the configuration file can contain a %s escape, which\r
+ is replaced by the directory of the executable.\r
+\r
+3. All configuration files can use either the DOS or the UNIX end-of-line\r
+ convention.\r
+\r
+4. Per-user configuration probably won't work. But try defining the\r
+ environment variable HOME and putting a catdoc.rc file in the directory\r
+ it points to.\r
+\r
+5. Catdoc uses the DOS country information, as specified by the COUNTRY statement\r
+ in your configuration file, to determine the output encoding. These\r
+ settings have priority over settings in the configuration files (either\r
+ per-user or system-wide). If this is not what you want, set \r
+ use_locale = no in the configuration file.\r
+\r
+If you insist on COMPILING catdoc YOURSELF:\r
+Please note that catdoc was compiled under DOS using Turbo C 2.01,\r
+downloaded from http://community.borland.com/museum. You can get the\r
+same one. \r
+\r
+I've made some attempts to compile catdoc with Watcom C (16-bit),\r
+but haven't completely succeeded. If you do, let me know. \r
+\r
+1. With a 16-bit compiler, use the COMPACT memory model.\r
+ If you are using Turbo C, make -fmakefile.tc in the src directory\r
+ should be enough. If you have to change anything in\r
+ makefile.tc, please let me know.\r
+ \r
+2. If you are using a compiler other than Turbo C/Borland C or\r
+ Watcom, you should take a look at the fileutil.c file and possibly\r
+ add a couple of #ifdefs there. If you succeed with it, send me a\r
+ patch (or the entire modified file, if you don't know how to make\r
+ a good unix-like patch). \r
+\r
+\r
+3. With a 32-bit compiler you are on your own. I don't think that\r
+ small utilities like catdoc should require an extender or a DPMI host,\r
+ so I've never tried to build a 32-bit version of catdoc for DOS.\r
+ But if you mix the buffer sizes from the UNIX version with the file-name\r
+ dependent defines from DOS, you should probably achieve good\r
+ results.\r
+\r
+4. With Turbo C you'll need the file getopt.c, which comes with Turbo C,\r
+ and unistd.h, which is provided in the compat directory.\r
+ Compile getopt.c and add it to cc.lib, and put unistd.h in\r
+ your include directory. Later it might help you to port other\r
+ unix software. With another compiler you can also make use\r
+ of getopt.c in the compat directory (which is from GNU), but I was\r
+ unable to make it work with Watcom 10.0.\r
+\r
+5. It is probably a good idea to link wildargs.obj (or wildargv.obj)\r
+ with catdoc. I didn't do it myself because I use the korn shell on\r
+ the machine where I developed catdoc, so I don't need to include\r
+ parameter expansion in the program.\r
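The %s escape described in item 2 of the binary distribution notes above could be expanded along these lines. This is a sketch under assumptions, not catdoc's actual code: the function name expand_exe_dir and its interface are hypothetical, and only the first %s in the name is handled.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper. Returns a malloc'ed copy of tmpl in which the
 * first "%s" is replaced by exe_dir, or NULL if allocation fails.
 * The caller is responsible for freeing the result. */
char *expand_exe_dir(const char *tmpl, const char *exe_dir) {
	const char *mark = strstr(tmpl, "%s");
	size_t prefix_len;
	char *result;

	/* Large enough in both cases: without "%s" only strlen(tmpl) + 1
	 * bytes are used, and with it the two escape characters are dropped. */
	result = (char *)malloc(strlen(tmpl) + strlen(exe_dir) + 1);
	if (result == NULL) {
		return NULL;
	}
	if (mark == NULL) {
		/* Safe: the buffer holds at least strlen(tmpl) + 1 bytes */
		strcpy(result, tmpl);
		return result;
	}
	prefix_len = (size_t)(mark - tmpl);
	memcpy(result, tmpl, prefix_len);
	/* Safe: prefix, exe_dir and the tail fit by the size computed above */
	strcpy(result + prefix_len, exe_dir);
	strcat(result, mark + 2);
	return result;
}
```

With exe_dir set to C:/CATDOC, the name %s/charsets would become C:/CATDOC/charsets; a name without the escape is returned unchanged.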
--- /dev/null
+
+# Your C compiler and flags
+SHELL = /bin/sh
+
+
+all:
+	for i in src doc charsets; do\
+	(cd $$i; $(MAKE) all);\
+	done
+
+install:
+	for i in src doc charsets; do\
+	(cd $$i; $(MAKE) install);\
+	done
+clean:
+	for i in src doc charsets; do\
+	(cd $$i; $(MAKE) clean);\
+	done
+distclean:
+	for i in src doc charsets; do\
+	(cd $$i; $(MAKE) distclean);\
+	done
+	rm Makefile config.*
+dist:
+	$(MAKE) -C doc dosdoc
+	$(MAKE) distclean
--- /dev/null
+ 0.90.1 Nov 26 1998
+ Top-level Makefile now uses $MAKE instead of make
+ fixed missing end-line escaping in wordview.tcl
+ All occurrences of strcpy, strcat and sprintf investigated
+ to avoid buffer overflows.
+ 0.90 Oct 29 1998
+ Fixed bug with charset names redeclared locally in main()
+ Fixed problem in configure with wish 8.0.3
+ Catdoc considered to be stable enough for release
+ 0.90b5 Oct 14 1998
+ Fixed handling of 0x1F char (soft hyphen in Word 6.0),
+ now it is translated to 0x00AD (unicode soft hyphen)
+ Fixed permissions for manual page
+ Added --with-install-root configure arg to simplify
+ building of binary packages.
+ 0.90b4 September 17 1998
+ Added proper configuration of library dir in wordview.
+ Added --disable-charset-check config option
+ Added 0x2026 symbol in ascii.rpl
+ Added more Windows codepages in distribution
+ 0.90b3 September 11 1998
+ Added -x switch to simplify debugging of substitution maps
+ 0.90b2 September 10 1998
+ Added some symbols in the 0x2000-0x20FF range to substitution maps.
+ These symbols occur in cp1251, so they are frequently found
+ in Word files. Fixed some filename-handling problems in
+ wordview.tcl
+
+ 0.90b1 September 8 1998
+ Added us-ascii.charset, fixed small bugs in configure,
+ install is used for all installation files. Code is
+ considered stable enough to be beta.
+
+ 0.90a3 September 7 1998
+ Fixed small bug in table handling, which caused catdoc to
+ output extra column delimiter just before row delimiter. Added
+ autoconf configuration. install is back, although not for
+ charsets
+
+ 0.90a2 August 18 1998
+ version 0.90 was tested on BSDI and Solaris platform. Makefile
+ was rewritten to avoid use of highly incompatible
+ /usr/{ucb,bin}/install
+
+ 0.90a1 August 13 1998
+ Catdoc has undergone a major rewrite. Now it has proper charset
+ handling, including UNICODE and runtime configurability.
+
+ 0.35 - June 5 1998
+ Fixed a bug with the -s switch which prevented catdoc from returning
+ a non-zero code when invoked on a UNIX text file
+
+ 0.34 - Apr 28 1998
+ Files are now opened in binary mode, thus allowing catdoc to work on
+ DOS and similar systems. All specs arrays now have a terminating
+ NUL
+
+ 0.33 - October 1997
+ Fixed missing terminating NUL in specs array, which caused
+ random segfaults on Linux and many other systems, because
+ _specs_ is searched by the _strchr_ function
+
+ 0.32 - August 1997
+ First major public release, uploaded to CTAN. Tk interface
+ appeared, manual page was written. Unfortunately, this release
+ was buggy.
+
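The 0.33 entry above concerns a character array searched with strchr, which only stops at a terminating NUL. A minimal sketch of the safe form follows; the contents of specs and the function is_special are made up for illustration and are not catdoc's real table or code.

```c
#include <string.h>

/* Hypothetical set of special characters. Initializing from a string
 * literal guarantees the terminating NUL that strchr relies on; a
 * brace-initialized array without a trailing 0 would let strchr run
 * past the end of the array. */
static const char specs[] = "\\{}";

/* Returns 1 if c is listed in specs, 0 otherwise. */
int is_special(int c) {
	/* strchr also matches the terminator itself, so NUL is excluded */
	return c != 0 && strchr(specs, c) != NULL;
}
```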
--- /dev/null
+CATDOC version 0.93
+
+CATDOC is a program which reads an MS-Word file and prints readable
+ASCII text to stdout, just like the Unix cat command.
+It is also able to produce correct escape sequences if some UNICODE
+characters have to be represented specially in your typesetting system,
+such as (La)TeX.
+
+This is a completely new version of catdoc, rewritten from scratch.
+It features runtime configuration, proper charset handling,
+user-definable output formats and support
+for Word97 files, which contain UNICODE internally.
+
+Since 0.93.0 catdoc parses the OLE structure and extracts the WordDocument
+stream, but doesn't parse its internal structure.
+
+This rough approach inevitably results in some garbage in the output,
+especially near the end of the file and if the file contains embedded OLE objects,
+such as pictures or equations.
+
+So, if you are looking for a purely automatic way to convert Word to LaTeX,
+you would do better to investigate word2x, wvware or LAOLA.
+
+
+Catdoc is distributed under the GNU General Public License version 2 or later.
+
+
+Your bug reports and suggestions are welcome.
+
+There is also major work to do: defining correct TeX commands
+for accented Latin letters in the tex.specchars file, and commands
+for mathematical symbols (Unicode 20xx-25xx).
+
+
+Contributions are welcome.
+
+See files INSTALL and INSTALL.dos for information about compiling and
+installing catdoc.
+
+Catdoc is documented in its UNIX-style manual page. For those who don't
+have the man command (e.g. MS-DOS users), plain text and PostScript versions
+of the manual are provided in the doc directory.
+ Victor Wagner <vitus@45.free.net>
+
+
--- /dev/null
+* Support dual-byte (CJK) encodings as output\r
+* Find a way to extract rowspan information from XLS.\r
+* Make XLS2CSV output the sheet partially when memory is exhausted\r
+* Plain-text output method for XLS2CSV and its support in wordview\r
+* textmode (ck) wordview\r
+* Improve RTF support\r
+* Extract text from Top Level OLE objects ???\r
+* Write correct TeX commands for most often used mathematical symbols\r
+ (20xx-25xx) into TeX spec chars file\r
+* Add handling of tables & footnotes\r
+* Fastsave support\r
--- /dev/null
+#undef VERSION
+#undef PACKAGE
--- /dev/null
+Makefile
+semantic.cache
--- /dev/null
+#\r
+# Name: ISO 8859-1 (1987) to Unicode\r
+# Unicode version: 1.1\r
+# Table version: 0.1\r
+# Table format: Format A\r
+# Date: 16 January 1995\r
+# Authors: Tim Greenwood <greenwood@r2me2.enet.dec.com>\r
+# John H. Jenkins <John_Jenkins@taligent.com>\r
+#\r
+# Copyright (c) 1991-1995 Unicode, Inc. All Rights reserved.\r
+#\r
+# This file is provided as-is by Unicode, Inc. (The Unicode Consortium).\r
+# No claims are made as to fitness for any particular purpose. No\r
+# warranties of any kind are expressed or implied. The recipient\r
+# agrees to determine applicability of information provided. If this\r
+# file has been provided on magnetic media by Unicode, Inc., the sole\r
+# remedy for any claim will be exchange of defective media within 90\r
+# days of receipt.\r
+#\r
+# Recipient is granted the right to make copies in any form for\r
+# internal distribution and to freely use the information supplied\r
+# in the creation of products supporting Unicode. Unicode, Inc.\r
+# specifically excludes the right to re-distribute this file directly\r
+# to third parties or other organizations whether for profit or not.\r
+#\r
+# General notes:\r
+#\r
+# This table contains the data the Unicode Consortium has on how\r
+# ISO 8859-1 (1987) characters map into Unicode.\r
+#\r
+# Format: Three tab-separated columns\r
+# Column #1 is the ISO 8859-1 code (in hex as 0xXX)\r
+# Column #2 is the Unicode (in hex as 0xXXXX)\r
+# Column #3 the Unicode name (follows a comment sign, '#')\r
+#\r
+# The entries are in ISO 8859-1 order\r
+#\r
+# Any comments or problems, contact <John_Jenkins@taligent.com>\r
+#\r
+0x20 0x0020 # SPACE\r
+0x21 0x0021 # EXCLAMATION MARK\r
+0x22 0x0022 # QUOTATION MARK\r
+0x23 0x0023 # NUMBER SIGN\r
+0x24 0x0024 # DOLLAR SIGN\r
+0x25 0x0025 # PERCENT SIGN\r
+0x26 0x0026 # AMPERSAND\r
+0x27 0x0027 # APOSTROPHE\r
+0x28 0x0028 # LEFT PARENTHESIS\r
+0x29 0x0029 # RIGHT PARENTHESIS\r
+0x2A 0x002A # ASTERISK\r
+0x2B 0x002B # PLUS SIGN\r
+0x2C 0x002C # COMMA\r
+0x2D 0x002D # HYPHEN-MINUS\r
+0x2E 0x002E # FULL STOP\r
+0x2F 0x002F # SOLIDUS\r
+0x30 0x0030 # DIGIT ZERO\r
+0x31 0x0031 # DIGIT ONE\r
+0x32 0x0032 # DIGIT TWO\r
+0x33 0x0033 # DIGIT THREE\r
+0x34 0x0034 # DIGIT FOUR\r
+0x35 0x0035 # DIGIT FIVE\r
+0x36 0x0036 # DIGIT SIX\r
+0x37 0x0037 # DIGIT SEVEN\r
+0x38 0x0038 # DIGIT EIGHT\r
+0x39 0x0039 # DIGIT NINE\r
+0x3A 0x003A # COLON\r
+0x3B 0x003B # SEMICOLON\r
+0x3C 0x003C # LESS-THAN SIGN\r
+0x3D 0x003D # EQUALS SIGN\r
+0x3E 0x003E # GREATER-THAN SIGN\r
+0x3F 0x003F # QUESTION MARK\r
+0x40 0x0040 # COMMERCIAL AT\r
+0x41 0x0041 # LATIN CAPITAL LETTER A\r
+0x42 0x0042 # LATIN CAPITAL LETTER B\r
+0x43 0x0043 # LATIN CAPITAL LETTER C\r
+0x44 0x0044 # LATIN CAPITAL LETTER D\r
+0x45 0x0045 # LATIN CAPITAL LETTER E\r
+0x46 0x0046 # LATIN CAPITAL LETTER F\r
+0x47 0x0047 # LATIN CAPITAL LETTER G\r
+0x48 0x0048 # LATIN CAPITAL LETTER H\r
+0x49 0x0049 # LATIN CAPITAL LETTER I\r
+0x4A 0x004A # LATIN CAPITAL LETTER J\r
+0x4B 0x004B # LATIN CAPITAL LETTER K\r
+0x4C 0x004C # LATIN CAPITAL LETTER L\r
+0x4D 0x004D # LATIN CAPITAL LETTER M\r
+0x4E 0x004E # LATIN CAPITAL LETTER N\r
+0x4F 0x004F # LATIN CAPITAL LETTER O\r
+0x50 0x0050 # LATIN CAPITAL LETTER P\r
+0x51 0x0051 # LATIN CAPITAL LETTER Q\r
+0x52 0x0052 # LATIN CAPITAL LETTER R\r
+0x53 0x0053 # LATIN CAPITAL LETTER S\r
+0x54 0x0054 # LATIN CAPITAL LETTER T\r
+0x55 0x0055 # LATIN CAPITAL LETTER U\r
+0x56 0x0056 # LATIN CAPITAL LETTER V\r
+0x57 0x0057 # LATIN CAPITAL LETTER W\r
+0x58 0x0058 # LATIN CAPITAL LETTER X\r
+0x59 0x0059 # LATIN CAPITAL LETTER Y\r
+0x5A 0x005A # LATIN CAPITAL LETTER Z\r
+0x5B 0x005B # LEFT SQUARE BRACKET\r
+0x5C 0x005C # REVERSE SOLIDUS\r
+0x5D 0x005D # RIGHT SQUARE BRACKET\r
+0x5E 0x005E # CIRCUMFLEX ACCENT\r
+0x5F 0x005F # LOW LINE\r
+0x60 0x0060 # GRAVE ACCENT\r
+0x61 0x0061 # LATIN SMALL LETTER A\r
+0x62 0x0062 # LATIN SMALL LETTER B\r
+0x63 0x0063 # LATIN SMALL LETTER C\r
+0x64 0x0064 # LATIN SMALL LETTER D\r
+0x65 0x0065 # LATIN SMALL LETTER E\r
+0x66 0x0066 # LATIN SMALL LETTER F\r
+0x67 0x0067 # LATIN SMALL LETTER G\r
+0x68 0x0068 # LATIN SMALL LETTER H\r
+0x69 0x0069 # LATIN SMALL LETTER I\r
+0x6A 0x006A # LATIN SMALL LETTER J\r
+0x6B 0x006B # LATIN SMALL LETTER K\r
+0x6C 0x006C # LATIN SMALL LETTER L\r
+0x6D 0x006D # LATIN SMALL LETTER M\r
+0x6E 0x006E # LATIN SMALL LETTER N\r
+0x6F 0x006F # LATIN SMALL LETTER O\r
+0x70 0x0070 # LATIN SMALL LETTER P\r
+0x71 0x0071 # LATIN SMALL LETTER Q\r
+0x72 0x0072 # LATIN SMALL LETTER R\r
+0x73 0x0073 # LATIN SMALL LETTER S\r
+0x74 0x0074 # LATIN SMALL LETTER T\r
+0x75 0x0075 # LATIN SMALL LETTER U\r
+0x76 0x0076 # LATIN SMALL LETTER V\r
+0x77 0x0077 # LATIN SMALL LETTER W\r
+0x78 0x0078 # LATIN SMALL LETTER X\r
+0x79 0x0079 # LATIN SMALL LETTER Y\r
+0x7A 0x007A # LATIN SMALL LETTER Z\r
+0x7B 0x007B # LEFT CURLY BRACKET\r
+0x7C 0x007C # VERTICAL LINE\r
+0x7D 0x007D # RIGHT CURLY BRACKET\r
+0x7E 0x007E # TILDE\r
+0xA0 0x00A0 # NO-BREAK SPACE\r
+0xA1 0x00A1 # INVERTED EXCLAMATION MARK\r
+0xA2 0x00A2 # CENT SIGN\r
+0xA3 0x00A3 # POUND SIGN\r
+0xA4 0x00A4 # CURRENCY SIGN\r
+0xA5 0x00A5 # YEN SIGN\r
+0xA6 0x00A6 # BROKEN BAR\r
+0xA7 0x00A7 # SECTION SIGN\r
+0xA8 0x00A8 # DIAERESIS\r
+0xA9 0x00A9 # COPYRIGHT SIGN\r
+0xAA 0x00AA # FEMININE ORDINAL INDICATOR\r
+0xAB 0x00AB # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xAC 0x00AC # NOT SIGN\r
+0xAD 0x00AD # SOFT HYPHEN\r
+0xAE 0x00AE # REGISTERED SIGN\r
+0xAF 0x00AF # MACRON\r
+0xB0 0x00B0 # DEGREE SIGN\r
+0xB1 0x00B1 # PLUS-MINUS SIGN\r
+0xB2 0x00B2 # SUPERSCRIPT TWO\r
+0xB3 0x00B3 # SUPERSCRIPT THREE\r
+0xB4 0x00B4 # ACUTE ACCENT\r
+0xB5 0x00B5 # MICRO SIGN\r
+0xB6 0x00B6 # PILCROW SIGN\r
+0xB7 0x00B7 # MIDDLE DOT\r
+0xB8 0x00B8 # CEDILLA\r
+0xB9 0x00B9 # SUPERSCRIPT ONE\r
+0xBA 0x00BA # MASCULINE ORDINAL INDICATOR\r
+0xBB 0x00BB # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xBC 0x00BC # VULGAR FRACTION ONE QUARTER\r
+0xBD 0x00BD # VULGAR FRACTION ONE HALF\r
+0xBE 0x00BE # VULGAR FRACTION THREE QUARTERS\r
+0xBF 0x00BF # INVERTED QUESTION MARK\r
+0xC0 0x00C0 # LATIN CAPITAL LETTER A WITH GRAVE\r
+0xC1 0x00C1 # LATIN CAPITAL LETTER A WITH ACUTE\r
+0xC2 0x00C2 # LATIN CAPITAL LETTER A WITH CIRCUMFLEX\r
+0xC3 0x00C3 # LATIN CAPITAL LETTER A WITH TILDE\r
+0xC4 0x00C4 # LATIN CAPITAL LETTER A WITH DIAERESIS\r
+0xC5 0x00C5 # LATIN CAPITAL LETTER A WITH RING ABOVE\r
+0xC6 0x00C6 # LATIN CAPITAL LETTER AE\r
+0xC7 0x00C7 # LATIN CAPITAL LETTER C WITH CEDILLA\r
+0xC8 0x00C8 # LATIN CAPITAL LETTER E WITH GRAVE\r
+0xC9 0x00C9 # LATIN CAPITAL LETTER E WITH ACUTE\r
+0xCA 0x00CA # LATIN CAPITAL LETTER E WITH CIRCUMFLEX\r
+0xCB 0x00CB # LATIN CAPITAL LETTER E WITH DIAERESIS\r
+0xCC 0x00CC # LATIN CAPITAL LETTER I WITH GRAVE\r
+0xCD 0x00CD # LATIN CAPITAL LETTER I WITH ACUTE\r
+0xCE 0x00CE # LATIN CAPITAL LETTER I WITH CIRCUMFLEX\r
+0xCF 0x00CF # LATIN CAPITAL LETTER I WITH DIAERESIS\r
+0xD0 0x00D0 # LATIN CAPITAL LETTER ETH (Icelandic)\r
+0xD1 0x00D1 # LATIN CAPITAL LETTER N WITH TILDE\r
+0xD2 0x00D2 # LATIN CAPITAL LETTER O WITH GRAVE\r
+0xD3 0x00D3 # LATIN CAPITAL LETTER O WITH ACUTE\r
+0xD4 0x00D4 # LATIN CAPITAL LETTER O WITH CIRCUMFLEX\r
+0xD5 0x00D5 # LATIN CAPITAL LETTER O WITH TILDE\r
+0xD6 0x00D6 # LATIN CAPITAL LETTER O WITH DIAERESIS\r
+0xD7 0x00D7 # MULTIPLICATION SIGN\r
+0xD8 0x00D8 # LATIN CAPITAL LETTER O WITH STROKE\r
+0xD9 0x00D9 # LATIN CAPITAL LETTER U WITH GRAVE\r
+0xDA 0x00DA # LATIN CAPITAL LETTER U WITH ACUTE\r
+0xDB 0x00DB # LATIN CAPITAL LETTER U WITH CIRCUMFLEX\r
+0xDC 0x00DC # LATIN CAPITAL LETTER U WITH DIAERESIS\r
+0xDD 0x00DD # LATIN CAPITAL LETTER Y WITH ACUTE\r
+0xDE 0x00DE # LATIN CAPITAL LETTER THORN (Icelandic)\r
+0xDF 0x00DF # LATIN SMALL LETTER SHARP S (German)\r
+0xE0 0x00E0 # LATIN SMALL LETTER A WITH GRAVE\r
+0xE1 0x00E1 # LATIN SMALL LETTER A WITH ACUTE\r
+0xE2 0x00E2 # LATIN SMALL LETTER A WITH CIRCUMFLEX\r
+0xE3 0x00E3 # LATIN SMALL LETTER A WITH TILDE\r
+0xE4 0x00E4 # LATIN SMALL LETTER A WITH DIAERESIS\r
+0xE5 0x00E5 # LATIN SMALL LETTER A WITH RING ABOVE\r
+0xE6 0x00E6 # LATIN SMALL LETTER AE\r
+0xE7 0x00E7 # LATIN SMALL LETTER C WITH CEDILLA\r
+0xE8 0x00E8 # LATIN SMALL LETTER E WITH GRAVE\r
+0xE9 0x00E9 # LATIN SMALL LETTER E WITH ACUTE\r
+0xEA 0x00EA # LATIN SMALL LETTER E WITH CIRCUMFLEX\r
+0xEB 0x00EB # LATIN SMALL LETTER E WITH DIAERESIS\r
+0xEC 0x00EC # LATIN SMALL LETTER I WITH GRAVE\r
+0xED 0x00ED # LATIN SMALL LETTER I WITH ACUTE\r
+0xEE 0x00EE # LATIN SMALL LETTER I WITH CIRCUMFLEX\r
+0xEF 0x00EF # LATIN SMALL LETTER I WITH DIAERESIS\r
+0xF0 0x00F0 # LATIN SMALL LETTER ETH (Icelandic)\r
+0xF1 0x00F1 # LATIN SMALL LETTER N WITH TILDE\r
+0xF2 0x00F2 # LATIN SMALL LETTER O WITH GRAVE\r
+0xF3 0x00F3 # LATIN SMALL LETTER O WITH ACUTE\r
+0xF4 0x00F4 # LATIN SMALL LETTER O WITH CIRCUMFLEX\r
+0xF5 0x00F5 # LATIN SMALL LETTER O WITH TILDE\r
+0xF6 0x00F6 # LATIN SMALL LETTER O WITH DIAERESIS\r
+0xF7 0x00F7 # DIVISION SIGN\r
+0xF8 0x00F8 # LATIN SMALL LETTER O WITH STROKE\r
+0xF9 0x00F9 # LATIN SMALL LETTER U WITH GRAVE\r
+0xFA 0x00FA # LATIN SMALL LETTER U WITH ACUTE\r
+0xFB 0x00FB # LATIN SMALL LETTER U WITH CIRCUMFLEX\r
+0xFC 0x00FC # LATIN SMALL LETTER U WITH DIAERESIS\r
+0xFD 0x00FD # LATIN SMALL LETTER Y WITH ACUTE\r
+0xFE 0x00FE # LATIN SMALL LETTER THORN (Icelandic)\r
+0xFF 0x00FF # LATIN SMALL LETTER Y WITH DIAERESIS\r
--- /dev/null
+#
+# Name: ISO/IEC 8859-10:1998 to Unicode
+# Unicode version: 3.0
+# Table version: 1.1
+# Table format: Format A
+# Date: 1999 October 11
+# Authors: Ken Whistler <kenw@sybase.com>
+#
+# Copyright (c) 1999 Unicode, Inc. All Rights reserved.
+#
+# This file is provided as-is by Unicode, Inc. (The Unicode Consortium).
+# No claims are made as to fitness for any particular purpose. No
+# warranties of any kind are expressed or implied. The recipient
+# agrees to determine applicability of information provided. If this
+# file has been provided on optical media by Unicode, Inc., the sole
+# remedy for any claim will be exchange of defective media within 90
+# days of receipt.
+#
+# Unicode, Inc. hereby grants the right to freely use the information
+# supplied in this file in the creation of products supporting the
+# Unicode Standard, and to make copies of this file in any form for
+# internal or external distribution as long as this notice remains
+# attached.
+#
+# General notes:
+#
+# This table contains the data the Unicode Consortium has on how
+# ISO/IEC 8859-10:1998 characters map into Unicode.
+#
+# Format: Three tab-separated columns
+# Column #1 is the ISO/IEC 8859-10 code (in hex as 0xXX)
+# Column #2 is the Unicode (in hex as 0xXXXX)
+# Column #3 the Unicode name (follows a comment sign, '#')
+#
+# The entries are in ISO/IEC 8859-10 order.
+#
+# Version history
+# 1.0 version new.
+# 1.1 corrected mistake in mapping of 0xA4
+#
+# Updated versions of this file may be found in:
+# <ftp://ftp.unicode.org/Public/MAPPINGS/>
+#
+# Any comments or problems, contact <errata@unicode.org>
+# Please note that <errata@unicode.org> is an archival address;
+# notices will be checked, but do not expect an immediate response.
+#
+0x00 0x0000 # NULL
+0x01 0x0001 # START OF HEADING
+0x02 0x0002 # START OF TEXT
+0x03 0x0003 # END OF TEXT
+0x04 0x0004 # END OF TRANSMISSION
+0x05 0x0005 # ENQUIRY
+0x06 0x0006 # ACKNOWLEDGE
+0x07 0x0007 # BELL
+0x08 0x0008 # BACKSPACE
+0x09 0x0009 # HORIZONTAL TABULATION
+0x0A 0x000A # LINE FEED
+0x0B 0x000B # VERTICAL TABULATION
+0x0C 0x000C # FORM FEED
+0x0D 0x000D # CARRIAGE RETURN
+0x0E 0x000E # SHIFT OUT
+0x0F 0x000F # SHIFT IN
+0x10 0x0010 # DATA LINK ESCAPE
+0x11 0x0011 # DEVICE CONTROL ONE
+0x12 0x0012 # DEVICE CONTROL TWO
+0x13 0x0013 # DEVICE CONTROL THREE
+0x14 0x0014 # DEVICE CONTROL FOUR
+0x15 0x0015 # NEGATIVE ACKNOWLEDGE
+0x16 0x0016 # SYNCHRONOUS IDLE
+0x17 0x0017 # END OF TRANSMISSION BLOCK
+0x18 0x0018 # CANCEL
+0x19 0x0019 # END OF MEDIUM
+0x1A 0x001A # SUBSTITUTE
+0x1B 0x001B # ESCAPE
+0x1C 0x001C # FILE SEPARATOR
+0x1D 0x001D # GROUP SEPARATOR
+0x1E 0x001E # RECORD SEPARATOR
+0x1F 0x001F # UNIT SEPARATOR
+0x20 0x0020 # SPACE
+0x21 0x0021 # EXCLAMATION MARK
+0x22 0x0022 # QUOTATION MARK
+0x23 0x0023 # NUMBER SIGN
+0x24 0x0024 # DOLLAR SIGN
+0x25 0x0025 # PERCENT SIGN
+0x26 0x0026 # AMPERSAND
+0x27 0x0027 # APOSTROPHE
+0x28 0x0028 # LEFT PARENTHESIS
+0x29 0x0029 # RIGHT PARENTHESIS
+0x2A 0x002A # ASTERISK
+0x2B 0x002B # PLUS SIGN
+0x2C 0x002C # COMMA
+0x2D 0x002D # HYPHEN-MINUS
+0x2E 0x002E # FULL STOP
+0x2F 0x002F # SOLIDUS
+0x30 0x0030 # DIGIT ZERO
+0x31 0x0031 # DIGIT ONE
+0x32 0x0032 # DIGIT TWO
+0x33 0x0033 # DIGIT THREE
+0x34 0x0034 # DIGIT FOUR
+0x35 0x0035 # DIGIT FIVE
+0x36 0x0036 # DIGIT SIX
+0x37 0x0037 # DIGIT SEVEN
+0x38 0x0038 # DIGIT EIGHT
+0x39 0x0039 # DIGIT NINE
+0x3A 0x003A # COLON
+0x3B 0x003B # SEMICOLON
+0x3C 0x003C # LESS-THAN SIGN
+0x3D 0x003D # EQUALS SIGN
+0x3E 0x003E # GREATER-THAN SIGN
+0x3F 0x003F # QUESTION MARK
+0x40 0x0040 # COMMERCIAL AT
+0x41 0x0041 # LATIN CAPITAL LETTER A
+0x42 0x0042 # LATIN CAPITAL LETTER B
+0x43 0x0043 # LATIN CAPITAL LETTER C
+0x44 0x0044 # LATIN CAPITAL LETTER D
+0x45 0x0045 # LATIN CAPITAL LETTER E
+0x46 0x0046 # LATIN CAPITAL LETTER F
+0x47 0x0047 # LATIN CAPITAL LETTER G
+0x48 0x0048 # LATIN CAPITAL LETTER H
+0x49 0x0049 # LATIN CAPITAL LETTER I
+0x4A 0x004A # LATIN CAPITAL LETTER J
+0x4B 0x004B # LATIN CAPITAL LETTER K
+0x4C 0x004C # LATIN CAPITAL LETTER L
+0x4D 0x004D # LATIN CAPITAL LETTER M
+0x4E 0x004E # LATIN CAPITAL LETTER N
+0x4F 0x004F # LATIN CAPITAL LETTER O
+0x50 0x0050 # LATIN CAPITAL LETTER P
+0x51 0x0051 # LATIN CAPITAL LETTER Q
+0x52 0x0052 # LATIN CAPITAL LETTER R
+0x53 0x0053 # LATIN CAPITAL LETTER S
+0x54 0x0054 # LATIN CAPITAL LETTER T
+0x55 0x0055 # LATIN CAPITAL LETTER U
+0x56 0x0056 # LATIN CAPITAL LETTER V
+0x57 0x0057 # LATIN CAPITAL LETTER W
+0x58 0x0058 # LATIN CAPITAL LETTER X
+0x59 0x0059 # LATIN CAPITAL LETTER Y
+0x5A 0x005A # LATIN CAPITAL LETTER Z
+0x5B 0x005B # LEFT SQUARE BRACKET
+0x5C 0x005C # REVERSE SOLIDUS
+0x5D 0x005D # RIGHT SQUARE BRACKET
+0x5E 0x005E # CIRCUMFLEX ACCENT
+0x5F 0x005F # LOW LINE
+0x60 0x0060 # GRAVE ACCENT
+0x61 0x0061 # LATIN SMALL LETTER A
+0x62 0x0062 # LATIN SMALL LETTER B
+0x63 0x0063 # LATIN SMALL LETTER C
+0x64 0x0064 # LATIN SMALL LETTER D
+0x65 0x0065 # LATIN SMALL LETTER E
+0x66 0x0066 # LATIN SMALL LETTER F
+0x67 0x0067 # LATIN SMALL LETTER G
+0x68 0x0068 # LATIN SMALL LETTER H
+0x69 0x0069 # LATIN SMALL LETTER I
+0x6A 0x006A # LATIN SMALL LETTER J
+0x6B 0x006B # LATIN SMALL LETTER K
+0x6C 0x006C # LATIN SMALL LETTER L
+0x6D 0x006D # LATIN SMALL LETTER M
+0x6E 0x006E # LATIN SMALL LETTER N
+0x6F 0x006F # LATIN SMALL LETTER O
+0x70 0x0070 # LATIN SMALL LETTER P
+0x71 0x0071 # LATIN SMALL LETTER Q
+0x72 0x0072 # LATIN SMALL LETTER R
+0x73 0x0073 # LATIN SMALL LETTER S
+0x74 0x0074 # LATIN SMALL LETTER T
+0x75 0x0075 # LATIN SMALL LETTER U
+0x76 0x0076 # LATIN SMALL LETTER V
+0x77 0x0077 # LATIN SMALL LETTER W
+0x78 0x0078 # LATIN SMALL LETTER X
+0x79 0x0079 # LATIN SMALL LETTER Y
+0x7A 0x007A # LATIN SMALL LETTER Z
+0x7B 0x007B # LEFT CURLY BRACKET
+0x7C 0x007C # VERTICAL LINE
+0x7D 0x007D # RIGHT CURLY BRACKET
+0x7E 0x007E # TILDE
+0x7F 0x007F # DELETE
+0x80 0x0080 # <control>
+0x81 0x0081 # <control>
+0x82 0x0082 # <control>
+0x83 0x0083 # <control>
+0x84 0x0084 # <control>
+0x85 0x0085 # <control>
+0x86 0x0086 # <control>
+0x87 0x0087 # <control>
+0x88 0x0088 # <control>
+0x89 0x0089 # <control>
+0x8A 0x008A # <control>
+0x8B 0x008B # <control>
+0x8C 0x008C # <control>
+0x8D 0x008D # <control>
+0x8E 0x008E # <control>
+0x8F 0x008F # <control>
+0x90 0x0090 # <control>
+0x91 0x0091 # <control>
+0x92 0x0092 # <control>
+0x93 0x0093 # <control>
+0x94 0x0094 # <control>
+0x95 0x0095 # <control>
+0x96 0x0096 # <control>
+0x97 0x0097 # <control>
+0x98 0x0098 # <control>
+0x99 0x0099 # <control>
+0x9A 0x009A # <control>
+0x9B 0x009B # <control>
+0x9C 0x009C # <control>
+0x9D 0x009D # <control>
+0x9E 0x009E # <control>
+0x9F 0x009F # <control>
+0xA0 0x00A0 # NO-BREAK SPACE
+0xA1 0x0104 # LATIN CAPITAL LETTER A WITH OGONEK
+0xA2 0x0112 # LATIN CAPITAL LETTER E WITH MACRON
+0xA3 0x0122 # LATIN CAPITAL LETTER G WITH CEDILLA
+0xA4 0x012A # LATIN CAPITAL LETTER I WITH MACRON
+0xA5 0x0128 # LATIN CAPITAL LETTER I WITH TILDE
+0xA6 0x0136 # LATIN CAPITAL LETTER K WITH CEDILLA
+0xA7 0x00A7 # SECTION SIGN
+0xA8 0x013B # LATIN CAPITAL LETTER L WITH CEDILLA
+0xA9 0x0110 # LATIN CAPITAL LETTER D WITH STROKE
+0xAA 0x0160 # LATIN CAPITAL LETTER S WITH CARON
+0xAB 0x0166 # LATIN CAPITAL LETTER T WITH STROKE
+0xAC 0x017D # LATIN CAPITAL LETTER Z WITH CARON
+0xAD 0x00AD # SOFT HYPHEN
+0xAE 0x016A # LATIN CAPITAL LETTER U WITH MACRON
+0xAF 0x014A # LATIN CAPITAL LETTER ENG
+0xB0 0x00B0 # DEGREE SIGN
+0xB1 0x0105 # LATIN SMALL LETTER A WITH OGONEK
+0xB2 0x0113 # LATIN SMALL LETTER E WITH MACRON
+0xB3 0x0123 # LATIN SMALL LETTER G WITH CEDILLA
+0xB4 0x012B # LATIN SMALL LETTER I WITH MACRON
+0xB5 0x0129 # LATIN SMALL LETTER I WITH TILDE
+0xB6 0x0137 # LATIN SMALL LETTER K WITH CEDILLA
+0xB7 0x00B7 # MIDDLE DOT
+0xB8 0x013C # LATIN SMALL LETTER L WITH CEDILLA
+0xB9 0x0111 # LATIN SMALL LETTER D WITH STROKE
+0xBA 0x0161 # LATIN SMALL LETTER S WITH CARON
+0xBB 0x0167 # LATIN SMALL LETTER T WITH STROKE
+0xBC 0x017E # LATIN SMALL LETTER Z WITH CARON
+0xBD 0x2015 # HORIZONTAL BAR
+0xBE 0x016B # LATIN SMALL LETTER U WITH MACRON
+0xBF 0x014B # LATIN SMALL LETTER ENG
+0xC0 0x0100 # LATIN CAPITAL LETTER A WITH MACRON
+0xC1 0x00C1 # LATIN CAPITAL LETTER A WITH ACUTE
+0xC2 0x00C2 # LATIN CAPITAL LETTER A WITH CIRCUMFLEX
+0xC3 0x00C3 # LATIN CAPITAL LETTER A WITH TILDE
+0xC4 0x00C4 # LATIN CAPITAL LETTER A WITH DIAERESIS
+0xC5 0x00C5 # LATIN CAPITAL LETTER A WITH RING ABOVE
+0xC6 0x00C6 # LATIN CAPITAL LETTER AE
+0xC7 0x012E # LATIN CAPITAL LETTER I WITH OGONEK
+0xC8 0x010C # LATIN CAPITAL LETTER C WITH CARON
+0xC9 0x00C9 # LATIN CAPITAL LETTER E WITH ACUTE
+0xCA 0x0118 # LATIN CAPITAL LETTER E WITH OGONEK
+0xCB 0x00CB # LATIN CAPITAL LETTER E WITH DIAERESIS
+0xCC 0x0116 # LATIN CAPITAL LETTER E WITH DOT ABOVE
+0xCD 0x00CD # LATIN CAPITAL LETTER I WITH ACUTE
+0xCE 0x00CE # LATIN CAPITAL LETTER I WITH CIRCUMFLEX
+0xCF 0x00CF # LATIN CAPITAL LETTER I WITH DIAERESIS
+0xD0 0x00D0 # LATIN CAPITAL LETTER ETH (Icelandic)
+0xD1 0x0145 # LATIN CAPITAL LETTER N WITH CEDILLA
+0xD2 0x014C # LATIN CAPITAL LETTER O WITH MACRON
+0xD3 0x00D3 # LATIN CAPITAL LETTER O WITH ACUTE
+0xD4 0x00D4 # LATIN CAPITAL LETTER O WITH CIRCUMFLEX
+0xD5 0x00D5 # LATIN CAPITAL LETTER O WITH TILDE
+0xD6 0x00D6 # LATIN CAPITAL LETTER O WITH DIAERESIS
+0xD7 0x0168 # LATIN CAPITAL LETTER U WITH TILDE
+0xD8 0x00D8 # LATIN CAPITAL LETTER O WITH STROKE
+0xD9 0x0172 # LATIN CAPITAL LETTER U WITH OGONEK
+0xDA 0x00DA # LATIN CAPITAL LETTER U WITH ACUTE
+0xDB 0x00DB # LATIN CAPITAL LETTER U WITH CIRCUMFLEX
+0xDC 0x00DC # LATIN CAPITAL LETTER U WITH DIAERESIS
+0xDD 0x00DD # LATIN CAPITAL LETTER Y WITH ACUTE
+0xDE 0x00DE # LATIN CAPITAL LETTER THORN (Icelandic)
+0xDF 0x00DF # LATIN SMALL LETTER SHARP S (German)
+0xE0 0x0101 # LATIN SMALL LETTER A WITH MACRON
+0xE1 0x00E1 # LATIN SMALL LETTER A WITH ACUTE
+0xE2 0x00E2 # LATIN SMALL LETTER A WITH CIRCUMFLEX
+0xE3 0x00E3 # LATIN SMALL LETTER A WITH TILDE
+0xE4 0x00E4 # LATIN SMALL LETTER A WITH DIAERESIS
+0xE5 0x00E5 # LATIN SMALL LETTER A WITH RING ABOVE
+0xE6 0x00E6 # LATIN SMALL LETTER AE
+0xE7 0x012F # LATIN SMALL LETTER I WITH OGONEK
+0xE8 0x010D # LATIN SMALL LETTER C WITH CARON
+0xE9 0x00E9 # LATIN SMALL LETTER E WITH ACUTE
+0xEA 0x0119 # LATIN SMALL LETTER E WITH OGONEK
+0xEB 0x00EB # LATIN SMALL LETTER E WITH DIAERESIS
+0xEC 0x0117 # LATIN SMALL LETTER E WITH DOT ABOVE
+0xED 0x00ED # LATIN SMALL LETTER I WITH ACUTE
+0xEE 0x00EE # LATIN SMALL LETTER I WITH CIRCUMFLEX
+0xEF 0x00EF # LATIN SMALL LETTER I WITH DIAERESIS
+0xF0 0x00F0 # LATIN SMALL LETTER ETH (Icelandic)
+0xF1 0x0146 # LATIN SMALL LETTER N WITH CEDILLA
+0xF2 0x014D # LATIN SMALL LETTER O WITH MACRON
+0xF3 0x00F3 # LATIN SMALL LETTER O WITH ACUTE
+0xF4 0x00F4 # LATIN SMALL LETTER O WITH CIRCUMFLEX
+0xF5 0x00F5 # LATIN SMALL LETTER O WITH TILDE
+0xF6 0x00F6 # LATIN SMALL LETTER O WITH DIAERESIS
+0xF7 0x0169 # LATIN SMALL LETTER U WITH TILDE
+0xF8 0x00F8 # LATIN SMALL LETTER O WITH STROKE
+0xF9 0x0173 # LATIN SMALL LETTER U WITH OGONEK
+0xFA 0x00FA # LATIN SMALL LETTER U WITH ACUTE
+0xFB 0x00FB # LATIN SMALL LETTER U WITH CIRCUMFLEX
+0xFC 0x00FC # LATIN SMALL LETTER U WITH DIAERESIS
+0xFD 0x00FD # LATIN SMALL LETTER Y WITH ACUTE
+0xFE 0x00FE # LATIN SMALL LETTER THORN (Icelandic)
+0xFF 0x0138 # LATIN SMALL LETTER KRA
--- /dev/null
+#
+# Name: ISO/IEC 8859-11:2001 to Unicode
+# Unicode version: 3.2
+# Table version: 1.0
+# Table format: Format A
+# Date: 2002 October 7
+# Authors: Ken Whistler <kenw@sybase.com>
+#
+# Copyright (c) 2002 Unicode, Inc. All Rights reserved.
+#
+# This file is provided as-is by Unicode, Inc. (The Unicode Consortium).
+# No claims are made as to fitness for any particular purpose. No
+# warranties of any kind are expressed or implied. The recipient
+# agrees to determine applicability of information provided. If this
+# file has been provided on optical media by Unicode, Inc., the sole
+# remedy for any claim will be exchange of defective media within 90
+# days of receipt.
+#
+# Unicode, Inc. hereby grants the right to freely use the information
+# supplied in this file in the creation of products supporting the
+# Unicode Standard, and to make copies of this file in any form for
+# internal or external distribution as long as this notice remains
+# attached.
+#
+# General notes:
+#
+# This table contains the data the Unicode Consortium has on how
+# ISO/IEC 8859-11:2001 characters map into Unicode.
+#
+# ISO/IEC 8859-11:2001 is equivalent to TIS 620-2533 (1990) with
+# the addition of 0xA0 NO-BREAK SPACE.
+#
+# Format: Three tab-separated columns
+# Column #1 is the ISO/IEC 8859-11 code (in hex as 0xXX)
+# Column #2 is the Unicode (in hex as 0xXXXX)
+# Column #3 the Unicode name (follows a comment sign, '#')
+#
+# The entries are in ISO/IEC 8859-11 order.
+#
+# Version history:
+# 2002 October 7 Created
+#
+# Updated versions of this file may be found in:
+# <ftp://ftp.unicode.org/Public/MAPPINGS/>
+#
+# For any comments or problems, please use the Unicode
+# web contact form at:
+# http://www.unicode.org/unicode/reporting.html
+#
+0x00 0x0000 # NULL
+0x01 0x0001 # START OF HEADING
+0x02 0x0002 # START OF TEXT
+0x03 0x0003 # END OF TEXT
+0x04 0x0004 # END OF TRANSMISSION
+0x05 0x0005 # ENQUIRY
+0x06 0x0006 # ACKNOWLEDGE
+0x07 0x0007 # BELL
+0x08 0x0008 # BACKSPACE
+0x09 0x0009 # HORIZONTAL TABULATION
+0x0A 0x000A # LINE FEED
+0x0B 0x000B # VERTICAL TABULATION
+0x0C 0x000C # FORM FEED
+0x0D 0x000D # CARRIAGE RETURN
+0x0E 0x000E # SHIFT OUT
+0x0F 0x000F # SHIFT IN
+0x10 0x0010 # DATA LINK ESCAPE
+0x11 0x0011 # DEVICE CONTROL ONE
+0x12 0x0012 # DEVICE CONTROL TWO
+0x13 0x0013 # DEVICE CONTROL THREE
+0x14 0x0014 # DEVICE CONTROL FOUR
+0x15 0x0015 # NEGATIVE ACKNOWLEDGE
+0x16 0x0016 # SYNCHRONOUS IDLE
+0x17 0x0017 # END OF TRANSMISSION BLOCK
+0x18 0x0018 # CANCEL
+0x19 0x0019 # END OF MEDIUM
+0x1A 0x001A # SUBSTITUTE
+0x1B 0x001B # ESCAPE
+0x1C 0x001C # FILE SEPARATOR
+0x1D 0x001D # GROUP SEPARATOR
+0x1E 0x001E # RECORD SEPARATOR
+0x1F 0x001F # UNIT SEPARATOR
+0x20 0x0020 # SPACE
+0x21 0x0021 # EXCLAMATION MARK
+0x22 0x0022 # QUOTATION MARK
+0x23 0x0023 # NUMBER SIGN
+0x24 0x0024 # DOLLAR SIGN
+0x25 0x0025 # PERCENT SIGN
+0x26 0x0026 # AMPERSAND
+0x27 0x0027 # APOSTROPHE
+0x28 0x0028 # LEFT PARENTHESIS
+0x29 0x0029 # RIGHT PARENTHESIS
+0x2A 0x002A # ASTERISK
+0x2B 0x002B # PLUS SIGN
+0x2C 0x002C # COMMA
+0x2D 0x002D # HYPHEN-MINUS
+0x2E 0x002E # FULL STOP
+0x2F 0x002F # SOLIDUS
+0x30 0x0030 # DIGIT ZERO
+0x31 0x0031 # DIGIT ONE
+0x32 0x0032 # DIGIT TWO
+0x33 0x0033 # DIGIT THREE
+0x34 0x0034 # DIGIT FOUR
+0x35 0x0035 # DIGIT FIVE
+0x36 0x0036 # DIGIT SIX
+0x37 0x0037 # DIGIT SEVEN
+0x38 0x0038 # DIGIT EIGHT
+0x39 0x0039 # DIGIT NINE
+0x3A 0x003A # COLON
+0x3B 0x003B # SEMICOLON
+0x3C 0x003C # LESS-THAN SIGN
+0x3D 0x003D # EQUALS SIGN
+0x3E 0x003E # GREATER-THAN SIGN
+0x3F 0x003F # QUESTION MARK
+0x40 0x0040 # COMMERCIAL AT
+0x41 0x0041 # LATIN CAPITAL LETTER A
+0x42 0x0042 # LATIN CAPITAL LETTER B
+0x43 0x0043 # LATIN CAPITAL LETTER C
+0x44 0x0044 # LATIN CAPITAL LETTER D
+0x45 0x0045 # LATIN CAPITAL LETTER E
+0x46 0x0046 # LATIN CAPITAL LETTER F
+0x47 0x0047 # LATIN CAPITAL LETTER G
+0x48 0x0048 # LATIN CAPITAL LETTER H
+0x49 0x0049 # LATIN CAPITAL LETTER I
+0x4A 0x004A # LATIN CAPITAL LETTER J
+0x4B 0x004B # LATIN CAPITAL LETTER K
+0x4C 0x004C # LATIN CAPITAL LETTER L
+0x4D 0x004D # LATIN CAPITAL LETTER M
+0x4E 0x004E # LATIN CAPITAL LETTER N
+0x4F 0x004F # LATIN CAPITAL LETTER O
+0x50 0x0050 # LATIN CAPITAL LETTER P
+0x51 0x0051 # LATIN CAPITAL LETTER Q
+0x52 0x0052 # LATIN CAPITAL LETTER R
+0x53 0x0053 # LATIN CAPITAL LETTER S
+0x54 0x0054 # LATIN CAPITAL LETTER T
+0x55 0x0055 # LATIN CAPITAL LETTER U
+0x56 0x0056 # LATIN CAPITAL LETTER V
+0x57 0x0057 # LATIN CAPITAL LETTER W
+0x58 0x0058 # LATIN CAPITAL LETTER X
+0x59 0x0059 # LATIN CAPITAL LETTER Y
+0x5A 0x005A # LATIN CAPITAL LETTER Z
+0x5B 0x005B # LEFT SQUARE BRACKET
+0x5C 0x005C # REVERSE SOLIDUS
+0x5D 0x005D # RIGHT SQUARE BRACKET
+0x5E 0x005E # CIRCUMFLEX ACCENT
+0x5F 0x005F # LOW LINE
+0x60 0x0060 # GRAVE ACCENT
+0x61 0x0061 # LATIN SMALL LETTER A
+0x62 0x0062 # LATIN SMALL LETTER B
+0x63 0x0063 # LATIN SMALL LETTER C
+0x64 0x0064 # LATIN SMALL LETTER D
+0x65 0x0065 # LATIN SMALL LETTER E
+0x66 0x0066 # LATIN SMALL LETTER F
+0x67 0x0067 # LATIN SMALL LETTER G
+0x68 0x0068 # LATIN SMALL LETTER H
+0x69 0x0069 # LATIN SMALL LETTER I
+0x6A 0x006A # LATIN SMALL LETTER J
+0x6B 0x006B # LATIN SMALL LETTER K
+0x6C 0x006C # LATIN SMALL LETTER L
+0x6D 0x006D # LATIN SMALL LETTER M
+0x6E 0x006E # LATIN SMALL LETTER N
+0x6F 0x006F # LATIN SMALL LETTER O
+0x70 0x0070 # LATIN SMALL LETTER P
+0x71 0x0071 # LATIN SMALL LETTER Q
+0x72 0x0072 # LATIN SMALL LETTER R
+0x73 0x0073 # LATIN SMALL LETTER S
+0x74 0x0074 # LATIN SMALL LETTER T
+0x75 0x0075 # LATIN SMALL LETTER U
+0x76 0x0076 # LATIN SMALL LETTER V
+0x77 0x0077 # LATIN SMALL LETTER W
+0x78 0x0078 # LATIN SMALL LETTER X
+0x79 0x0079 # LATIN SMALL LETTER Y
+0x7A 0x007A # LATIN SMALL LETTER Z
+0x7B 0x007B # LEFT CURLY BRACKET
+0x7C 0x007C # VERTICAL LINE
+0x7D 0x007D # RIGHT CURLY BRACKET
+0x7E 0x007E # TILDE
+0x7F 0x007F # DELETE
+0x80 0x0080 # <control>
+0x81 0x0081 # <control>
+0x82 0x0082 # <control>
+0x83 0x0083 # <control>
+0x84 0x0084 # <control>
+0x85 0x0085 # <control>
+0x86 0x0086 # <control>
+0x87 0x0087 # <control>
+0x88 0x0088 # <control>
+0x89 0x0089 # <control>
+0x8A 0x008A # <control>
+0x8B 0x008B # <control>
+0x8C 0x008C # <control>
+0x8D 0x008D # <control>
+0x8E 0x008E # <control>
+0x8F 0x008F # <control>
+0x90 0x0090 # <control>
+0x91 0x0091 # <control>
+0x92 0x0092 # <control>
+0x93 0x0093 # <control>
+0x94 0x0094 # <control>
+0x95 0x0095 # <control>
+0x96 0x0096 # <control>
+0x97 0x0097 # <control>
+0x98 0x0098 # <control>
+0x99 0x0099 # <control>
+0x9A 0x009A # <control>
+0x9B 0x009B # <control>
+0x9C 0x009C # <control>
+0x9D 0x009D # <control>
+0x9E 0x009E # <control>
+0x9F 0x009F # <control>
+0xA0 0x00A0 # NO-BREAK SPACE
+0xA1 0x0E01 # THAI CHARACTER KO KAI
+0xA2 0x0E02 # THAI CHARACTER KHO KHAI
+0xA3 0x0E03 # THAI CHARACTER KHO KHUAT
+0xA4 0x0E04 # THAI CHARACTER KHO KHWAI
+0xA5 0x0E05 # THAI CHARACTER KHO KHON
+0xA6 0x0E06 # THAI CHARACTER KHO RAKHANG
+0xA7 0x0E07 # THAI CHARACTER NGO NGU
+0xA8 0x0E08 # THAI CHARACTER CHO CHAN
+0xA9 0x0E09 # THAI CHARACTER CHO CHING
+0xAA 0x0E0A # THAI CHARACTER CHO CHANG
+0xAB 0x0E0B # THAI CHARACTER SO SO
+0xAC 0x0E0C # THAI CHARACTER CHO CHOE
+0xAD 0x0E0D # THAI CHARACTER YO YING
+0xAE 0x0E0E # THAI CHARACTER DO CHADA
+0xAF 0x0E0F # THAI CHARACTER TO PATAK
+0xB0 0x0E10 # THAI CHARACTER THO THAN
+0xB1 0x0E11 # THAI CHARACTER THO NANGMONTHO
+0xB2 0x0E12 # THAI CHARACTER THO PHUTHAO
+0xB3 0x0E13 # THAI CHARACTER NO NEN
+0xB4 0x0E14 # THAI CHARACTER DO DEK
+0xB5 0x0E15 # THAI CHARACTER TO TAO
+0xB6 0x0E16 # THAI CHARACTER THO THUNG
+0xB7 0x0E17 # THAI CHARACTER THO THAHAN
+0xB8 0x0E18 # THAI CHARACTER THO THONG
+0xB9 0x0E19 # THAI CHARACTER NO NU
+0xBA 0x0E1A # THAI CHARACTER BO BAIMAI
+0xBB 0x0E1B # THAI CHARACTER PO PLA
+0xBC 0x0E1C # THAI CHARACTER PHO PHUNG
+0xBD 0x0E1D # THAI CHARACTER FO FA
+0xBE 0x0E1E # THAI CHARACTER PHO PHAN
+0xBF 0x0E1F # THAI CHARACTER FO FAN
+0xC0 0x0E20 # THAI CHARACTER PHO SAMPHAO
+0xC1 0x0E21 # THAI CHARACTER MO MA
+0xC2 0x0E22 # THAI CHARACTER YO YAK
+0xC3 0x0E23 # THAI CHARACTER RO RUA
+0xC4 0x0E24 # THAI CHARACTER RU
+0xC5 0x0E25 # THAI CHARACTER LO LING
+0xC6 0x0E26 # THAI CHARACTER LU
+0xC7 0x0E27 # THAI CHARACTER WO WAEN
+0xC8 0x0E28 # THAI CHARACTER SO SALA
+0xC9 0x0E29 # THAI CHARACTER SO RUSI
+0xCA 0x0E2A # THAI CHARACTER SO SUA
+0xCB 0x0E2B # THAI CHARACTER HO HIP
+0xCC 0x0E2C # THAI CHARACTER LO CHULA
+0xCD 0x0E2D # THAI CHARACTER O ANG
+0xCE 0x0E2E # THAI CHARACTER HO NOKHUK
+0xCF 0x0E2F # THAI CHARACTER PAIYANNOI
+0xD0 0x0E30 # THAI CHARACTER SARA A
+0xD1 0x0E31 # THAI CHARACTER MAI HAN-AKAT
+0xD2 0x0E32 # THAI CHARACTER SARA AA
+0xD3 0x0E33 # THAI CHARACTER SARA AM
+0xD4 0x0E34 # THAI CHARACTER SARA I
+0xD5 0x0E35 # THAI CHARACTER SARA II
+0xD6 0x0E36 # THAI CHARACTER SARA UE
+0xD7 0x0E37 # THAI CHARACTER SARA UEE
+0xD8 0x0E38 # THAI CHARACTER SARA U
+0xD9 0x0E39 # THAI CHARACTER SARA UU
+0xDA 0x0E3A # THAI CHARACTER PHINTHU
+0xDF 0x0E3F # THAI CURRENCY SYMBOL BAHT
+0xE0 0x0E40 # THAI CHARACTER SARA E
+0xE1 0x0E41 # THAI CHARACTER SARA AE
+0xE2 0x0E42 # THAI CHARACTER SARA O
+0xE3 0x0E43 # THAI CHARACTER SARA AI MAIMUAN
+0xE4 0x0E44 # THAI CHARACTER SARA AI MAIMALAI
+0xE5 0x0E45 # THAI CHARACTER LAKKHANGYAO
+0xE6 0x0E46 # THAI CHARACTER MAIYAMOK
+0xE7 0x0E47 # THAI CHARACTER MAITAIKHU
+0xE8 0x0E48 # THAI CHARACTER MAI EK
+0xE9 0x0E49 # THAI CHARACTER MAI THO
+0xEA 0x0E4A # THAI CHARACTER MAI TRI
+0xEB 0x0E4B # THAI CHARACTER MAI CHATTAWA
+0xEC 0x0E4C # THAI CHARACTER THANTHAKHAT
+0xED 0x0E4D # THAI CHARACTER NIKHAHIT
+0xEE 0x0E4E # THAI CHARACTER YAMAKKAN
+0xEF 0x0E4F # THAI CHARACTER FONGMAN
+0xF0 0x0E50 # THAI DIGIT ZERO
+0xF1 0x0E51 # THAI DIGIT ONE
+0xF2 0x0E52 # THAI DIGIT TWO
+0xF3 0x0E53 # THAI DIGIT THREE
+0xF4 0x0E54 # THAI DIGIT FOUR
+0xF5 0x0E55 # THAI DIGIT FIVE
+0xF6 0x0E56 # THAI DIGIT SIX
+0xF7 0x0E57 # THAI DIGIT SEVEN
+0xF8 0x0E58 # THAI DIGIT EIGHT
+0xF9 0x0E59 # THAI DIGIT NINE
+0xFA 0x0E5A # THAI CHARACTER ANGKHANKHU
+0xFB 0x0E5B # THAI CHARACTER KHOMUT
--- /dev/null
+#
+# Name: ISO/IEC 8859-13:1998 to Unicode
+# Unicode version: 3.0
+# Table version: 1.0
+# Table format: Format A
+# Date: 1999 July 27
+# Authors: Ken Whistler <kenw@sybase.com>
+#
+# Copyright (c) 1998 - 1999 Unicode, Inc. All Rights reserved.
+#
+# This file is provided as-is by Unicode, Inc. (The Unicode Consortium).
+# No claims are made as to fitness for any particular purpose. No
+# warranties of any kind are expressed or implied. The recipient
+# agrees to determine applicability of information provided. If this
+# file has been provided on optical media by Unicode, Inc., the sole
+# remedy for any claim will be exchange of defective media within 90
+# days of receipt.
+#
+# Unicode, Inc. hereby grants the right to freely use the information
+# supplied in this file in the creation of products supporting the
+# Unicode Standard, and to make copies of this file in any form for
+# internal or external distribution as long as this notice remains
+# attached.
+#
+# General notes:
+#
+# This table contains the data the Unicode Consortium has on how
+# ISO/IEC 8859-13:1998 characters map into Unicode.
+#
+# Format: Three tab-separated columns
+# Column #1 is the ISO/IEC 8859-13 code (in hex as 0xXX)
+# Column #2 is the Unicode (in hex as 0xXXXX)
+# Column #3 the Unicode name (follows a comment sign, '#')
+#
+# The entries are in ISO/IEC 8859-13 order.
+#
+# Updated versions of this file may be found in:
+# <ftp://ftp.unicode.org/Public/MAPPINGS/>
+#
+# Any comments or problems, contact <errata@unicode.org>
+# Please note that <errata@unicode.org> is an archival address;
+# notices will be checked, but do not expect an immediate response.
+#
+0x00 0x0000 # NULL
+0x01 0x0001 # START OF HEADING
+0x02 0x0002 # START OF TEXT
+0x03 0x0003 # END OF TEXT
+0x04 0x0004 # END OF TRANSMISSION
+0x05 0x0005 # ENQUIRY
+0x06 0x0006 # ACKNOWLEDGE
+0x07 0x0007 # BELL
+0x08 0x0008 # BACKSPACE
+0x09 0x0009 # HORIZONTAL TABULATION
+0x0A 0x000A # LINE FEED
+0x0B 0x000B # VERTICAL TABULATION
+0x0C 0x000C # FORM FEED
+0x0D 0x000D # CARRIAGE RETURN
+0x0E 0x000E # SHIFT OUT
+0x0F 0x000F # SHIFT IN
+0x10 0x0010 # DATA LINK ESCAPE
+0x11 0x0011 # DEVICE CONTROL ONE
+0x12 0x0012 # DEVICE CONTROL TWO
+0x13 0x0013 # DEVICE CONTROL THREE
+0x14 0x0014 # DEVICE CONTROL FOUR
+0x15 0x0015 # NEGATIVE ACKNOWLEDGE
+0x16 0x0016 # SYNCHRONOUS IDLE
+0x17 0x0017 # END OF TRANSMISSION BLOCK
+0x18 0x0018 # CANCEL
+0x19 0x0019 # END OF MEDIUM
+0x1A 0x001A # SUBSTITUTE
+0x1B 0x001B # ESCAPE
+0x1C 0x001C # FILE SEPARATOR
+0x1D 0x001D # GROUP SEPARATOR
+0x1E 0x001E # RECORD SEPARATOR
+0x1F 0x001F # UNIT SEPARATOR
+0x20 0x0020 # SPACE
+0x21 0x0021 # EXCLAMATION MARK
+0x22 0x0022 # QUOTATION MARK
+0x23 0x0023 # NUMBER SIGN
+0x24 0x0024 # DOLLAR SIGN
+0x25 0x0025 # PERCENT SIGN
+0x26 0x0026 # AMPERSAND
+0x27 0x0027 # APOSTROPHE
+0x28 0x0028 # LEFT PARENTHESIS
+0x29 0x0029 # RIGHT PARENTHESIS
+0x2A 0x002A # ASTERISK
+0x2B 0x002B # PLUS SIGN
+0x2C 0x002C # COMMA
+0x2D 0x002D # HYPHEN-MINUS
+0x2E 0x002E # FULL STOP
+0x2F 0x002F # SOLIDUS
+0x30 0x0030 # DIGIT ZERO
+0x31 0x0031 # DIGIT ONE
+0x32 0x0032 # DIGIT TWO
+0x33 0x0033 # DIGIT THREE
+0x34 0x0034 # DIGIT FOUR
+0x35 0x0035 # DIGIT FIVE
+0x36 0x0036 # DIGIT SIX
+0x37 0x0037 # DIGIT SEVEN
+0x38 0x0038 # DIGIT EIGHT
+0x39 0x0039 # DIGIT NINE
+0x3A 0x003A # COLON
+0x3B 0x003B # SEMICOLON
+0x3C 0x003C # LESS-THAN SIGN
+0x3D 0x003D # EQUALS SIGN
+0x3E 0x003E # GREATER-THAN SIGN
+0x3F 0x003F # QUESTION MARK
+0x40 0x0040 # COMMERCIAL AT
+0x41 0x0041 # LATIN CAPITAL LETTER A
+0x42 0x0042 # LATIN CAPITAL LETTER B
+0x43 0x0043 # LATIN CAPITAL LETTER C
+0x44 0x0044 # LATIN CAPITAL LETTER D
+0x45 0x0045 # LATIN CAPITAL LETTER E
+0x46 0x0046 # LATIN CAPITAL LETTER F
+0x47 0x0047 # LATIN CAPITAL LETTER G
+0x48 0x0048 # LATIN CAPITAL LETTER H
+0x49 0x0049 # LATIN CAPITAL LETTER I
+0x4A 0x004A # LATIN CAPITAL LETTER J
+0x4B 0x004B # LATIN CAPITAL LETTER K
+0x4C 0x004C # LATIN CAPITAL LETTER L
+0x4D 0x004D # LATIN CAPITAL LETTER M
+0x4E 0x004E # LATIN CAPITAL LETTER N
+0x4F 0x004F # LATIN CAPITAL LETTER O
+0x50 0x0050 # LATIN CAPITAL LETTER P
+0x51 0x0051 # LATIN CAPITAL LETTER Q
+0x52 0x0052 # LATIN CAPITAL LETTER R
+0x53 0x0053 # LATIN CAPITAL LETTER S
+0x54 0x0054 # LATIN CAPITAL LETTER T
+0x55 0x0055 # LATIN CAPITAL LETTER U
+0x56 0x0056 # LATIN CAPITAL LETTER V
+0x57 0x0057 # LATIN CAPITAL LETTER W
+0x58 0x0058 # LATIN CAPITAL LETTER X
+0x59 0x0059 # LATIN CAPITAL LETTER Y
+0x5A 0x005A # LATIN CAPITAL LETTER Z
+0x5B 0x005B # LEFT SQUARE BRACKET
+0x5C 0x005C # REVERSE SOLIDUS
+0x5D 0x005D # RIGHT SQUARE BRACKET
+0x5E 0x005E # CIRCUMFLEX ACCENT
+0x5F 0x005F # LOW LINE
+0x60 0x0060 # GRAVE ACCENT
+0x61 0x0061 # LATIN SMALL LETTER A
+0x62 0x0062 # LATIN SMALL LETTER B
+0x63 0x0063 # LATIN SMALL LETTER C
+0x64 0x0064 # LATIN SMALL LETTER D
+0x65 0x0065 # LATIN SMALL LETTER E
+0x66 0x0066 # LATIN SMALL LETTER F
+0x67 0x0067 # LATIN SMALL LETTER G
+0x68 0x0068 # LATIN SMALL LETTER H
+0x69 0x0069 # LATIN SMALL LETTER I
+0x6A 0x006A # LATIN SMALL LETTER J
+0x6B 0x006B # LATIN SMALL LETTER K
+0x6C 0x006C # LATIN SMALL LETTER L
+0x6D 0x006D # LATIN SMALL LETTER M
+0x6E 0x006E # LATIN SMALL LETTER N
+0x6F 0x006F # LATIN SMALL LETTER O
+0x70 0x0070 # LATIN SMALL LETTER P
+0x71 0x0071 # LATIN SMALL LETTER Q
+0x72 0x0072 # LATIN SMALL LETTER R
+0x73 0x0073 # LATIN SMALL LETTER S
+0x74 0x0074 # LATIN SMALL LETTER T
+0x75 0x0075 # LATIN SMALL LETTER U
+0x76 0x0076 # LATIN SMALL LETTER V
+0x77 0x0077 # LATIN SMALL LETTER W
+0x78 0x0078 # LATIN SMALL LETTER X
+0x79 0x0079 # LATIN SMALL LETTER Y
+0x7A 0x007A # LATIN SMALL LETTER Z
+0x7B 0x007B # LEFT CURLY BRACKET
+0x7C 0x007C # VERTICAL LINE
+0x7D 0x007D # RIGHT CURLY BRACKET
+0x7E 0x007E # TILDE
+0x7F 0x007F # DELETE
+0x80 0x0080 # <control>
+0x81 0x0081 # <control>
+0x82 0x0082 # <control>
+0x83 0x0083 # <control>
+0x84 0x0084 # <control>
+0x85 0x0085 # <control>
+0x86 0x0086 # <control>
+0x87 0x0087 # <control>
+0x88 0x0088 # <control>
+0x89 0x0089 # <control>
+0x8A 0x008A # <control>
+0x8B 0x008B # <control>
+0x8C 0x008C # <control>
+0x8D 0x008D # <control>
+0x8E 0x008E # <control>
+0x8F 0x008F # <control>
+0x90 0x0090 # <control>
+0x91 0x0091 # <control>
+0x92 0x0092 # <control>
+0x93 0x0093 # <control>
+0x94 0x0094 # <control>
+0x95 0x0095 # <control>
+0x96 0x0096 # <control>
+0x97 0x0097 # <control>
+0x98 0x0098 # <control>
+0x99 0x0099 # <control>
+0x9A 0x009A # <control>
+0x9B 0x009B # <control>
+0x9C 0x009C # <control>
+0x9D 0x009D # <control>
+0x9E 0x009E # <control>
+0x9F 0x009F # <control>
+0xA0 0x00A0 # NO-BREAK SPACE
+0xA1 0x201D # RIGHT DOUBLE QUOTATION MARK
+0xA2 0x00A2 # CENT SIGN
+0xA3 0x00A3 # POUND SIGN
+0xA4 0x00A4 # CURRENCY SIGN
+0xA5 0x201E # DOUBLE LOW-9 QUOTATION MARK
+0xA6 0x00A6 # BROKEN BAR
+0xA7 0x00A7 # SECTION SIGN
+0xA8 0x00D8 # LATIN CAPITAL LETTER O WITH STROKE
+0xA9 0x00A9 # COPYRIGHT SIGN
+0xAA 0x0156 # LATIN CAPITAL LETTER R WITH CEDILLA
+0xAB 0x00AB # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xAC 0x00AC # NOT SIGN
+0xAD 0x00AD # SOFT HYPHEN
+0xAE 0x00AE # REGISTERED SIGN
+0xAF 0x00C6 # LATIN CAPITAL LETTER AE
+0xB0 0x00B0 # DEGREE SIGN
+0xB1 0x00B1 # PLUS-MINUS SIGN
+0xB2 0x00B2 # SUPERSCRIPT TWO
+0xB3 0x00B3 # SUPERSCRIPT THREE
+0xB4 0x201C # LEFT DOUBLE QUOTATION MARK
+0xB5 0x00B5 # MICRO SIGN
+0xB6 0x00B6 # PILCROW SIGN
+0xB7 0x00B7 # MIDDLE DOT
+0xB8 0x00F8 # LATIN SMALL LETTER O WITH STROKE
+0xB9 0x00B9 # SUPERSCRIPT ONE
+0xBA 0x0157 # LATIN SMALL LETTER R WITH CEDILLA
+0xBB 0x00BB # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xBC 0x00BC # VULGAR FRACTION ONE QUARTER
+0xBD 0x00BD # VULGAR FRACTION ONE HALF
+0xBE 0x00BE # VULGAR FRACTION THREE QUARTERS
+0xBF 0x00E6 # LATIN SMALL LETTER AE
+0xC0 0x0104 # LATIN CAPITAL LETTER A WITH OGONEK
+0xC1 0x012E # LATIN CAPITAL LETTER I WITH OGONEK
+0xC2 0x0100 # LATIN CAPITAL LETTER A WITH MACRON
+0xC3 0x0106 # LATIN CAPITAL LETTER C WITH ACUTE
+0xC4 0x00C4 # LATIN CAPITAL LETTER A WITH DIAERESIS
+0xC5 0x00C5 # LATIN CAPITAL LETTER A WITH RING ABOVE
+0xC6 0x0118 # LATIN CAPITAL LETTER E WITH OGONEK
+0xC7 0x0112 # LATIN CAPITAL LETTER E WITH MACRON
+0xC8 0x010C # LATIN CAPITAL LETTER C WITH CARON
+0xC9 0x00C9 # LATIN CAPITAL LETTER E WITH ACUTE
+0xCA 0x0179 # LATIN CAPITAL LETTER Z WITH ACUTE
+0xCB 0x0116 # LATIN CAPITAL LETTER E WITH DOT ABOVE
+0xCC 0x0122 # LATIN CAPITAL LETTER G WITH CEDILLA
+0xCD 0x0136 # LATIN CAPITAL LETTER K WITH CEDILLA
+0xCE 0x012A # LATIN CAPITAL LETTER I WITH MACRON
+0xCF 0x013B # LATIN CAPITAL LETTER L WITH CEDILLA
+0xD0 0x0160 # LATIN CAPITAL LETTER S WITH CARON
+0xD1 0x0143 # LATIN CAPITAL LETTER N WITH ACUTE
+0xD2 0x0145 # LATIN CAPITAL LETTER N WITH CEDILLA
+0xD3 0x00D3 # LATIN CAPITAL LETTER O WITH ACUTE
+0xD4 0x014C # LATIN CAPITAL LETTER O WITH MACRON
+0xD5 0x00D5 # LATIN CAPITAL LETTER O WITH TILDE
+0xD6 0x00D6 # LATIN CAPITAL LETTER O WITH DIAERESIS
+0xD7 0x00D7 # MULTIPLICATION SIGN
+0xD8 0x0172 # LATIN CAPITAL LETTER U WITH OGONEK
+0xD9 0x0141 # LATIN CAPITAL LETTER L WITH STROKE
+0xDA 0x015A # LATIN CAPITAL LETTER S WITH ACUTE
+0xDB 0x016A # LATIN CAPITAL LETTER U WITH MACRON
+0xDC 0x00DC # LATIN CAPITAL LETTER U WITH DIAERESIS
+0xDD 0x017B # LATIN CAPITAL LETTER Z WITH DOT ABOVE
+0xDE 0x017D # LATIN CAPITAL LETTER Z WITH CARON
+0xDF 0x00DF # LATIN SMALL LETTER SHARP S (German)
+0xE0 0x0105 # LATIN SMALL LETTER A WITH OGONEK
+0xE1 0x012F # LATIN SMALL LETTER I WITH OGONEK
+0xE2 0x0101 # LATIN SMALL LETTER A WITH MACRON
+0xE3 0x0107 # LATIN SMALL LETTER C WITH ACUTE
+0xE4 0x00E4 # LATIN SMALL LETTER A WITH DIAERESIS
+0xE5 0x00E5 # LATIN SMALL LETTER A WITH RING ABOVE
+0xE6 0x0119 # LATIN SMALL LETTER E WITH OGONEK
+0xE7 0x0113 # LATIN SMALL LETTER E WITH MACRON
+0xE8 0x010D # LATIN SMALL LETTER C WITH CARON
+0xE9 0x00E9 # LATIN SMALL LETTER E WITH ACUTE
+0xEA 0x017A # LATIN SMALL LETTER Z WITH ACUTE
+0xEB 0x0117 # LATIN SMALL LETTER E WITH DOT ABOVE
+0xEC 0x0123 # LATIN SMALL LETTER G WITH CEDILLA
+0xED 0x0137 # LATIN SMALL LETTER K WITH CEDILLA
+0xEE 0x012B # LATIN SMALL LETTER I WITH MACRON
+0xEF 0x013C # LATIN SMALL LETTER L WITH CEDILLA
+0xF0 0x0161 # LATIN SMALL LETTER S WITH CARON
+0xF1 0x0144 # LATIN SMALL LETTER N WITH ACUTE
+0xF2 0x0146 # LATIN SMALL LETTER N WITH CEDILLA
+0xF3 0x00F3 # LATIN SMALL LETTER O WITH ACUTE
+0xF4 0x014D # LATIN SMALL LETTER O WITH MACRON
+0xF5 0x00F5 # LATIN SMALL LETTER O WITH TILDE
+0xF6 0x00F6 # LATIN SMALL LETTER O WITH DIAERESIS
+0xF7 0x00F7 # DIVISION SIGN
+0xF8 0x0173 # LATIN SMALL LETTER U WITH OGONEK
+0xF9 0x0142 # LATIN SMALL LETTER L WITH STROKE
+0xFA 0x015B # LATIN SMALL LETTER S WITH ACUTE
+0xFB 0x016B # LATIN SMALL LETTER U WITH MACRON
+0xFC 0x00FC # LATIN SMALL LETTER U WITH DIAERESIS
+0xFD 0x017C # LATIN SMALL LETTER Z WITH DOT ABOVE
+0xFE 0x017E # LATIN SMALL LETTER Z WITH CARON
+0xFF 0x2019 # RIGHT SINGLE QUOTATION MARK
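The "Format A" layout described in the header comments of these tables (a hex byte code, a hex Unicode value, then a name after '#') is simple enough to read line by line. The following is a minimal sketch of such a reader in C, written to respect the coding standard above (no // comments, declarations at the start of the block, no assumption that int is wider than 16 bits). The function name parse_map_line and its return convention are assumptions made for illustration; this is not catdoc's actual loader.

```c
#include <stdlib.h>

/*
 * Parse one line of a Format A mapping table.
 * Returns 1 and fills *code and *ucs for a data line,
 * 0 for a blank or comment-only line, -1 for a malformed line.
 * parse_map_line is a hypothetical name, not a catdoc function.
 */
int parse_map_line(const char *line, unsigned int *code, unsigned long *ucs)
{
	char *end;
	unsigned long c, u;

	/* Skip leading blanks */
	while (*line == ' ' || *line == '\t')
		line++;
	/* Blank lines and comment lines carry no mapping */
	if (*line == '\0' || *line == '#' || *line == '\n' || *line == '\r')
		return 0;
	/* Column 1: byte code; base 16 accepts the optional 0x prefix */
	c = strtoul(line, &end, 16);
	if (end == line || c > 0xFFUL)
		return -1;
	/* Column 2: Unicode value; strtoul skips the separating whitespace */
	line = end;
	u = strtoul(line, &end, 16);
	if (end == line || u > 0xFFFFUL)
		return -1;
	/* Column 3 (the name after '#') is ignored */
	*code = (unsigned int)c;
	*ucs = u;
	return 1;
}
```

A caller would typically read the file in binary mode, pass each line to this function, and store ucs in a 256-entry array indexed by code whenever the return value is 1.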
--- /dev/null
+#
+# Name: ISO/IEC 8859-14:1998 to Unicode
+# Unicode version: 3.0
+# Table version: 1.0
+# Table format: Format A
+# Date: 1999 July 27
+# Authors: Markus Kuhn <mkuhn@acm.org>
+# Ken Whistler <kenw@sybase.com>
+#
+# Copyright (c) 1998 - 1999 Unicode, Inc. All Rights reserved.
+#
+# This file is provided as-is by Unicode, Inc. (The Unicode Consortium).
+# No claims are made as to fitness for any particular purpose. No
+# warranties of any kind are expressed or implied. The recipient
+# agrees to determine applicability of information provided. If this
+# file has been provided on optical media by Unicode, Inc., the sole
+# remedy for any claim will be exchange of defective media within 90
+# days of receipt.
+#
+# Unicode, Inc. hereby grants the right to freely use the information
+# supplied in this file in the creation of products supporting the
+# Unicode Standard, and to make copies of this file in any form for
+# internal or external distribution as long as this notice remains
+# attached.
+#
+# General notes:
+#
+# This table contains the data the Unicode Consortium has on how
+# ISO/IEC 8859-14:1998 characters map into Unicode.
+#
+# Format: Three tab-separated columns
+# Column #1 is the ISO/IEC 8859-14 code (in hex as 0xXX)
+# Column #2 is the Unicode (in hex as 0xXXXX)
+# Column #3 the Unicode name (follows a comment sign, '#')
+#
+# The entries are in ISO/IEC 8859-14 order.
+#
+# Updated versions of this file may be found in:
+# <ftp://ftp.unicode.org/Public/MAPPINGS/>
+#
+# Any comments or problems, contact <errata@unicode.org>
+# Please note that <errata@unicode.org> is an archival address;
+# notices will be checked, but do not expect an immediate response.
+#
+0x00 0x0000 # NULL
+0x01 0x0001 # START OF HEADING
+0x02 0x0002 # START OF TEXT
+0x03 0x0003 # END OF TEXT
+0x04 0x0004 # END OF TRANSMISSION
+0x05 0x0005 # ENQUIRY
+0x06 0x0006 # ACKNOWLEDGE
+0x07 0x0007 # BELL
+0x08 0x0008 # BACKSPACE
+0x09 0x0009 # HORIZONTAL TABULATION
+0x0A 0x000A # LINE FEED
+0x0B 0x000B # VERTICAL TABULATION
+0x0C 0x000C # FORM FEED
+0x0D 0x000D # CARRIAGE RETURN
+0x0E 0x000E # SHIFT OUT
+0x0F 0x000F # SHIFT IN
+0x10 0x0010 # DATA LINK ESCAPE
+0x11 0x0011 # DEVICE CONTROL ONE
+0x12 0x0012 # DEVICE CONTROL TWO
+0x13 0x0013 # DEVICE CONTROL THREE
+0x14 0x0014 # DEVICE CONTROL FOUR
+0x15 0x0015 # NEGATIVE ACKNOWLEDGE
+0x16 0x0016 # SYNCHRONOUS IDLE
+0x17 0x0017 # END OF TRANSMISSION BLOCK
+0x18 0x0018 # CANCEL
+0x19 0x0019 # END OF MEDIUM
+0x1A 0x001A # SUBSTITUTE
+0x1B 0x001B # ESCAPE
+0x1C 0x001C # FILE SEPARATOR
+0x1D 0x001D # GROUP SEPARATOR
+0x1E 0x001E # RECORD SEPARATOR
+0x1F 0x001F # UNIT SEPARATOR
+0x20 0x0020 # SPACE
+0x21 0x0021 # EXCLAMATION MARK
+0x22 0x0022 # QUOTATION MARK
+0x23 0x0023 # NUMBER SIGN
+0x24 0x0024 # DOLLAR SIGN
+0x25 0x0025 # PERCENT SIGN
+0x26 0x0026 # AMPERSAND
+0x27 0x0027 # APOSTROPHE
+0x28 0x0028 # LEFT PARENTHESIS
+0x29 0x0029 # RIGHT PARENTHESIS
+0x2A 0x002A # ASTERISK
+0x2B 0x002B # PLUS SIGN
+0x2C 0x002C # COMMA
+0x2D 0x002D # HYPHEN-MINUS
+0x2E 0x002E # FULL STOP
+0x2F 0x002F # SOLIDUS
+0x30 0x0030 # DIGIT ZERO
+0x31 0x0031 # DIGIT ONE
+0x32 0x0032 # DIGIT TWO
+0x33 0x0033 # DIGIT THREE
+0x34 0x0034 # DIGIT FOUR
+0x35 0x0035 # DIGIT FIVE
+0x36 0x0036 # DIGIT SIX
+0x37 0x0037 # DIGIT SEVEN
+0x38 0x0038 # DIGIT EIGHT
+0x39 0x0039 # DIGIT NINE
+0x3A 0x003A # COLON
+0x3B 0x003B # SEMICOLON
+0x3C 0x003C # LESS-THAN SIGN
+0x3D 0x003D # EQUALS SIGN
+0x3E 0x003E # GREATER-THAN SIGN
+0x3F 0x003F # QUESTION MARK
+0x40 0x0040 # COMMERCIAL AT
+0x41 0x0041 # LATIN CAPITAL LETTER A
+0x42 0x0042 # LATIN CAPITAL LETTER B
+0x43 0x0043 # LATIN CAPITAL LETTER C
+0x44 0x0044 # LATIN CAPITAL LETTER D
+0x45 0x0045 # LATIN CAPITAL LETTER E
+0x46 0x0046 # LATIN CAPITAL LETTER F
+0x47 0x0047 # LATIN CAPITAL LETTER G
+0x48 0x0048 # LATIN CAPITAL LETTER H
+0x49 0x0049 # LATIN CAPITAL LETTER I
+0x4A 0x004A # LATIN CAPITAL LETTER J
+0x4B 0x004B # LATIN CAPITAL LETTER K
+0x4C 0x004C # LATIN CAPITAL LETTER L
+0x4D 0x004D # LATIN CAPITAL LETTER M
+0x4E 0x004E # LATIN CAPITAL LETTER N
+0x4F 0x004F # LATIN CAPITAL LETTER O
+0x50 0x0050 # LATIN CAPITAL LETTER P
+0x51 0x0051 # LATIN CAPITAL LETTER Q
+0x52 0x0052 # LATIN CAPITAL LETTER R
+0x53 0x0053 # LATIN CAPITAL LETTER S
+0x54 0x0054 # LATIN CAPITAL LETTER T
+0x55 0x0055 # LATIN CAPITAL LETTER U
+0x56 0x0056 # LATIN CAPITAL LETTER V
+0x57 0x0057 # LATIN CAPITAL LETTER W
+0x58 0x0058 # LATIN CAPITAL LETTER X
+0x59 0x0059 # LATIN CAPITAL LETTER Y
+0x5A 0x005A # LATIN CAPITAL LETTER Z
+0x5B 0x005B # LEFT SQUARE BRACKET
+0x5C 0x005C # REVERSE SOLIDUS
+0x5D 0x005D # RIGHT SQUARE BRACKET
+0x5E 0x005E # CIRCUMFLEX ACCENT
+0x5F 0x005F # LOW LINE
+0x60 0x0060 # GRAVE ACCENT
+0x61 0x0061 # LATIN SMALL LETTER A
+0x62 0x0062 # LATIN SMALL LETTER B
+0x63 0x0063 # LATIN SMALL LETTER C
+0x64 0x0064 # LATIN SMALL LETTER D
+0x65 0x0065 # LATIN SMALL LETTER E
+0x66 0x0066 # LATIN SMALL LETTER F
+0x67 0x0067 # LATIN SMALL LETTER G
+0x68 0x0068 # LATIN SMALL LETTER H
+0x69 0x0069 # LATIN SMALL LETTER I
+0x6A 0x006A # LATIN SMALL LETTER J
+0x6B 0x006B # LATIN SMALL LETTER K
+0x6C 0x006C # LATIN SMALL LETTER L
+0x6D 0x006D # LATIN SMALL LETTER M
+0x6E 0x006E # LATIN SMALL LETTER N
+0x6F 0x006F # LATIN SMALL LETTER O
+0x70 0x0070 # LATIN SMALL LETTER P
+0x71 0x0071 # LATIN SMALL LETTER Q
+0x72 0x0072 # LATIN SMALL LETTER R
+0x73 0x0073 # LATIN SMALL LETTER S
+0x74 0x0074 # LATIN SMALL LETTER T
+0x75 0x0075 # LATIN SMALL LETTER U
+0x76 0x0076 # LATIN SMALL LETTER V
+0x77 0x0077 # LATIN SMALL LETTER W
+0x78 0x0078 # LATIN SMALL LETTER X
+0x79 0x0079 # LATIN SMALL LETTER Y
+0x7A 0x007A # LATIN SMALL LETTER Z
+0x7B 0x007B # LEFT CURLY BRACKET
+0x7C 0x007C # VERTICAL LINE
+0x7D 0x007D # RIGHT CURLY BRACKET
+0x7E 0x007E # TILDE
+0x7F 0x007F # DELETE
+0x80 0x0080 # <control>
+0x81 0x0081 # <control>
+0x82 0x0082 # <control>
+0x83 0x0083 # <control>
+0x84 0x0084 # <control>
+0x85 0x0085 # <control>
+0x86 0x0086 # <control>
+0x87 0x0087 # <control>
+0x88 0x0088 # <control>
+0x89 0x0089 # <control>
+0x8A 0x008A # <control>
+0x8B 0x008B # <control>
+0x8C 0x008C # <control>
+0x8D 0x008D # <control>
+0x8E 0x008E # <control>
+0x8F 0x008F # <control>
+0x90 0x0090 # <control>
+0x91 0x0091 # <control>
+0x92 0x0092 # <control>
+0x93 0x0093 # <control>
+0x94 0x0094 # <control>
+0x95 0x0095 # <control>
+0x96 0x0096 # <control>
+0x97 0x0097 # <control>
+0x98 0x0098 # <control>
+0x99 0x0099 # <control>
+0x9A 0x009A # <control>
+0x9B 0x009B # <control>
+0x9C 0x009C # <control>
+0x9D 0x009D # <control>
+0x9E 0x009E # <control>
+0x9F 0x009F # <control>
+0xA0 0x00A0 # NO-BREAK SPACE
+0xA1 0x1E02 # LATIN CAPITAL LETTER B WITH DOT ABOVE
+0xA2 0x1E03 # LATIN SMALL LETTER B WITH DOT ABOVE
+0xA3 0x00A3 # POUND SIGN
+0xA4 0x010A # LATIN CAPITAL LETTER C WITH DOT ABOVE
+0xA5 0x010B # LATIN SMALL LETTER C WITH DOT ABOVE
+0xA6 0x1E0A # LATIN CAPITAL LETTER D WITH DOT ABOVE
+0xA7 0x00A7 # SECTION SIGN
+0xA8 0x1E80 # LATIN CAPITAL LETTER W WITH GRAVE
+0xA9 0x00A9 # COPYRIGHT SIGN
+0xAA 0x1E82 # LATIN CAPITAL LETTER W WITH ACUTE
+0xAB 0x1E0B # LATIN SMALL LETTER D WITH DOT ABOVE
+0xAC 0x1EF2 # LATIN CAPITAL LETTER Y WITH GRAVE
+0xAD 0x00AD # SOFT HYPHEN
+0xAE 0x00AE # REGISTERED SIGN
+0xAF 0x0178 # LATIN CAPITAL LETTER Y WITH DIAERESIS
+0xB0 0x1E1E # LATIN CAPITAL LETTER F WITH DOT ABOVE
+0xB1 0x1E1F # LATIN SMALL LETTER F WITH DOT ABOVE
+0xB2 0x0120 # LATIN CAPITAL LETTER G WITH DOT ABOVE
+0xB3 0x0121 # LATIN SMALL LETTER G WITH DOT ABOVE
+0xB4 0x1E40 # LATIN CAPITAL LETTER M WITH DOT ABOVE
+0xB5 0x1E41 # LATIN SMALL LETTER M WITH DOT ABOVE
+0xB6 0x00B6 # PILCROW SIGN
+0xB7 0x1E56 # LATIN CAPITAL LETTER P WITH DOT ABOVE
+0xB8 0x1E81 # LATIN SMALL LETTER W WITH GRAVE
+0xB9 0x1E57 # LATIN SMALL LETTER P WITH DOT ABOVE
+0xBA 0x1E83 # LATIN SMALL LETTER W WITH ACUTE
+0xBB 0x1E60 # LATIN CAPITAL LETTER S WITH DOT ABOVE
+0xBC 0x1EF3 # LATIN SMALL LETTER Y WITH GRAVE
+0xBD 0x1E84 # LATIN CAPITAL LETTER W WITH DIAERESIS
+0xBE 0x1E85 # LATIN SMALL LETTER W WITH DIAERESIS
+0xBF 0x1E61 # LATIN SMALL LETTER S WITH DOT ABOVE
+0xC0 0x00C0 # LATIN CAPITAL LETTER A WITH GRAVE
+0xC1 0x00C1 # LATIN CAPITAL LETTER A WITH ACUTE
+0xC2 0x00C2 # LATIN CAPITAL LETTER A WITH CIRCUMFLEX
+0xC3 0x00C3 # LATIN CAPITAL LETTER A WITH TILDE
+0xC4 0x00C4 # LATIN CAPITAL LETTER A WITH DIAERESIS
+0xC5 0x00C5 # LATIN CAPITAL LETTER A WITH RING ABOVE
+0xC6 0x00C6 # LATIN CAPITAL LETTER AE
+0xC7 0x00C7 # LATIN CAPITAL LETTER C WITH CEDILLA
+0xC8 0x00C8 # LATIN CAPITAL LETTER E WITH GRAVE
+0xC9 0x00C9 # LATIN CAPITAL LETTER E WITH ACUTE
+0xCA 0x00CA # LATIN CAPITAL LETTER E WITH CIRCUMFLEX
+0xCB 0x00CB # LATIN CAPITAL LETTER E WITH DIAERESIS
+0xCC 0x00CC # LATIN CAPITAL LETTER I WITH GRAVE
+0xCD 0x00CD # LATIN CAPITAL LETTER I WITH ACUTE
+0xCE 0x00CE # LATIN CAPITAL LETTER I WITH CIRCUMFLEX
+0xCF 0x00CF # LATIN CAPITAL LETTER I WITH DIAERESIS
+0xD0 0x0174 # LATIN CAPITAL LETTER W WITH CIRCUMFLEX
+0xD1 0x00D1 # LATIN CAPITAL LETTER N WITH TILDE
+0xD2 0x00D2 # LATIN CAPITAL LETTER O WITH GRAVE
+0xD3 0x00D3 # LATIN CAPITAL LETTER O WITH ACUTE
+0xD4 0x00D4 # LATIN CAPITAL LETTER O WITH CIRCUMFLEX
+0xD5 0x00D5 # LATIN CAPITAL LETTER O WITH TILDE
+0xD6 0x00D6 # LATIN CAPITAL LETTER O WITH DIAERESIS
+0xD7 0x1E6A # LATIN CAPITAL LETTER T WITH DOT ABOVE
+0xD8 0x00D8 # LATIN CAPITAL LETTER O WITH STROKE
+0xD9 0x00D9 # LATIN CAPITAL LETTER U WITH GRAVE
+0xDA 0x00DA # LATIN CAPITAL LETTER U WITH ACUTE
+0xDB 0x00DB # LATIN CAPITAL LETTER U WITH CIRCUMFLEX
+0xDC 0x00DC # LATIN CAPITAL LETTER U WITH DIAERESIS
+0xDD 0x00DD # LATIN CAPITAL LETTER Y WITH ACUTE
+0xDE 0x0176 # LATIN CAPITAL LETTER Y WITH CIRCUMFLEX
+0xDF 0x00DF # LATIN SMALL LETTER SHARP S
+0xE0 0x00E0 # LATIN SMALL LETTER A WITH GRAVE
+0xE1 0x00E1 # LATIN SMALL LETTER A WITH ACUTE
+0xE2 0x00E2 # LATIN SMALL LETTER A WITH CIRCUMFLEX
+0xE3 0x00E3 # LATIN SMALL LETTER A WITH TILDE
+0xE4 0x00E4 # LATIN SMALL LETTER A WITH DIAERESIS
+0xE5 0x00E5 # LATIN SMALL LETTER A WITH RING ABOVE
+0xE6 0x00E6 # LATIN SMALL LETTER AE
+0xE7 0x00E7 # LATIN SMALL LETTER C WITH CEDILLA
+0xE8 0x00E8 # LATIN SMALL LETTER E WITH GRAVE
+0xE9 0x00E9 # LATIN SMALL LETTER E WITH ACUTE
+0xEA 0x00EA # LATIN SMALL LETTER E WITH CIRCUMFLEX
+0xEB 0x00EB # LATIN SMALL LETTER E WITH DIAERESIS
+0xEC 0x00EC # LATIN SMALL LETTER I WITH GRAVE
+0xED 0x00ED # LATIN SMALL LETTER I WITH ACUTE
+0xEE 0x00EE # LATIN SMALL LETTER I WITH CIRCUMFLEX
+0xEF 0x00EF # LATIN SMALL LETTER I WITH DIAERESIS
+0xF0 0x0175 # LATIN SMALL LETTER W WITH CIRCUMFLEX
+0xF1 0x00F1 # LATIN SMALL LETTER N WITH TILDE
+0xF2 0x00F2 # LATIN SMALL LETTER O WITH GRAVE
+0xF3 0x00F3 # LATIN SMALL LETTER O WITH ACUTE
+0xF4 0x00F4 # LATIN SMALL LETTER O WITH CIRCUMFLEX
+0xF5 0x00F5 # LATIN SMALL LETTER O WITH TILDE
+0xF6 0x00F6 # LATIN SMALL LETTER O WITH DIAERESIS
+0xF7 0x1E6B # LATIN SMALL LETTER T WITH DOT ABOVE
+0xF8 0x00F8 # LATIN SMALL LETTER O WITH STROKE
+0xF9 0x00F9 # LATIN SMALL LETTER U WITH GRAVE
+0xFA 0x00FA # LATIN SMALL LETTER U WITH ACUTE
+0xFB 0x00FB # LATIN SMALL LETTER U WITH CIRCUMFLEX
+0xFC 0x00FC # LATIN SMALL LETTER U WITH DIAERESIS
+0xFD 0x00FD # LATIN SMALL LETTER Y WITH ACUTE
+0xFE 0x0177 # LATIN SMALL LETTER Y WITH CIRCUMFLEX
+0xFF 0x00FF # LATIN SMALL LETTER Y WITH DIAERESIS
+
--- /dev/null
+#
+# Name: ISO/IEC 8859-15:1999 to Unicode
+# Unicode version: 3.0
+# Table version: 1.0
+# Table format: Format A
+# Date: 1999 July 27
+# Authors: Markus Kuhn <mkuhn@acm.org>
+# Ken Whistler <kenw@sybase.com>
+#
+# Copyright (c) 1998 - 1999 Unicode, Inc. All Rights reserved.
+#
+# This file is provided as-is by Unicode, Inc. (The Unicode Consortium).
+# No claims are made as to fitness for any particular purpose. No
+# warranties of any kind are expressed or implied. The recipient
+# agrees to determine applicability of information provided. If this
+# file has been provided on optical media by Unicode, Inc., the sole
+# remedy for any claim will be exchange of defective media within 90
+# days of receipt.
+#
+# Unicode, Inc. hereby grants the right to freely use the information
+# supplied in this file in the creation of products supporting the
+# Unicode Standard, and to make copies of this file in any form for
+# internal or external distribution as long as this notice remains
+# attached.
+#
+# General notes:
+#
+# This table contains the data the Unicode Consortium has on how
+# ISO/IEC 8859-15:1999 characters map into Unicode.
+#
+# Format: Three tab-separated columns
+# Column #1 is the ISO/IEC 8859-15 code (in hex as 0xXX)
+# Column #2 is the Unicode (in hex as 0xXXXX)
+# Column #3 the Unicode name (follows a comment sign, '#')
+#
+# The entries are in ISO/IEC 8859-15 order.
+#
+# Version history
+#
+# Updated versions of this file may be found in:
+# <ftp://ftp.unicode.org/Public/MAPPINGS/>
+#
+# Any comments or problems, contact <errata@unicode.org>
+# Please note that <errata@unicode.org> is an archival address;
+# notices will be checked, but do not expect an immediate response.
+#
+0x00 0x0000 # NULL
+0x01 0x0001 # START OF HEADING
+0x02 0x0002 # START OF TEXT
+0x03 0x0003 # END OF TEXT
+0x04 0x0004 # END OF TRANSMISSION
+0x05 0x0005 # ENQUIRY
+0x06 0x0006 # ACKNOWLEDGE
+0x07 0x0007 # BELL
+0x08 0x0008 # BACKSPACE
+0x09 0x0009 # HORIZONTAL TABULATION
+0x0A 0x000A # LINE FEED
+0x0B 0x000B # VERTICAL TABULATION
+0x0C 0x000C # FORM FEED
+0x0D 0x000D # CARRIAGE RETURN
+0x0E 0x000E # SHIFT OUT
+0x0F 0x000F # SHIFT IN
+0x10 0x0010 # DATA LINK ESCAPE
+0x11 0x0011 # DEVICE CONTROL ONE
+0x12 0x0012 # DEVICE CONTROL TWO
+0x13 0x0013 # DEVICE CONTROL THREE
+0x14 0x0014 # DEVICE CONTROL FOUR
+0x15 0x0015 # NEGATIVE ACKNOWLEDGE
+0x16 0x0016 # SYNCHRONOUS IDLE
+0x17 0x0017 # END OF TRANSMISSION BLOCK
+0x18 0x0018 # CANCEL
+0x19 0x0019 # END OF MEDIUM
+0x1A 0x001A # SUBSTITUTE
+0x1B 0x001B # ESCAPE
+0x1C 0x001C # FILE SEPARATOR
+0x1D 0x001D # GROUP SEPARATOR
+0x1E 0x001E # RECORD SEPARATOR
+0x1F 0x001F # UNIT SEPARATOR
+0x20 0x0020 # SPACE
+0x21 0x0021 # EXCLAMATION MARK
+0x22 0x0022 # QUOTATION MARK
+0x23 0x0023 # NUMBER SIGN
+0x24 0x0024 # DOLLAR SIGN
+0x25 0x0025 # PERCENT SIGN
+0x26 0x0026 # AMPERSAND
+0x27 0x0027 # APOSTROPHE
+0x28 0x0028 # LEFT PARENTHESIS
+0x29 0x0029 # RIGHT PARENTHESIS
+0x2A 0x002A # ASTERISK
+0x2B 0x002B # PLUS SIGN
+0x2C 0x002C # COMMA
+0x2D 0x002D # HYPHEN-MINUS
+0x2E 0x002E # FULL STOP
+0x2F 0x002F # SOLIDUS
+0x30 0x0030 # DIGIT ZERO
+0x31 0x0031 # DIGIT ONE
+0x32 0x0032 # DIGIT TWO
+0x33 0x0033 # DIGIT THREE
+0x34 0x0034 # DIGIT FOUR
+0x35 0x0035 # DIGIT FIVE
+0x36 0x0036 # DIGIT SIX
+0x37 0x0037 # DIGIT SEVEN
+0x38 0x0038 # DIGIT EIGHT
+0x39 0x0039 # DIGIT NINE
+0x3A 0x003A # COLON
+0x3B 0x003B # SEMICOLON
+0x3C 0x003C # LESS-THAN SIGN
+0x3D 0x003D # EQUALS SIGN
+0x3E 0x003E # GREATER-THAN SIGN
+0x3F 0x003F # QUESTION MARK
+0x40 0x0040 # COMMERCIAL AT
+0x41 0x0041 # LATIN CAPITAL LETTER A
+0x42 0x0042 # LATIN CAPITAL LETTER B
+0x43 0x0043 # LATIN CAPITAL LETTER C
+0x44 0x0044 # LATIN CAPITAL LETTER D
+0x45 0x0045 # LATIN CAPITAL LETTER E
+0x46 0x0046 # LATIN CAPITAL LETTER F
+0x47 0x0047 # LATIN CAPITAL LETTER G
+0x48 0x0048 # LATIN CAPITAL LETTER H
+0x49 0x0049 # LATIN CAPITAL LETTER I
+0x4A 0x004A # LATIN CAPITAL LETTER J
+0x4B 0x004B # LATIN CAPITAL LETTER K
+0x4C 0x004C # LATIN CAPITAL LETTER L
+0x4D 0x004D # LATIN CAPITAL LETTER M
+0x4E 0x004E # LATIN CAPITAL LETTER N
+0x4F 0x004F # LATIN CAPITAL LETTER O
+0x50 0x0050 # LATIN CAPITAL LETTER P
+0x51 0x0051 # LATIN CAPITAL LETTER Q
+0x52 0x0052 # LATIN CAPITAL LETTER R
+0x53 0x0053 # LATIN CAPITAL LETTER S
+0x54 0x0054 # LATIN CAPITAL LETTER T
+0x55 0x0055 # LATIN CAPITAL LETTER U
+0x56 0x0056 # LATIN CAPITAL LETTER V
+0x57 0x0057 # LATIN CAPITAL LETTER W
+0x58 0x0058 # LATIN CAPITAL LETTER X
+0x59 0x0059 # LATIN CAPITAL LETTER Y
+0x5A 0x005A # LATIN CAPITAL LETTER Z
+0x5B 0x005B # LEFT SQUARE BRACKET
+0x5C 0x005C # REVERSE SOLIDUS
+0x5D 0x005D # RIGHT SQUARE BRACKET
+0x5E 0x005E # CIRCUMFLEX ACCENT
+0x5F 0x005F # LOW LINE
+0x60 0x0060 # GRAVE ACCENT
+0x61 0x0061 # LATIN SMALL LETTER A
+0x62 0x0062 # LATIN SMALL LETTER B
+0x63 0x0063 # LATIN SMALL LETTER C
+0x64 0x0064 # LATIN SMALL LETTER D
+0x65 0x0065 # LATIN SMALL LETTER E
+0x66 0x0066 # LATIN SMALL LETTER F
+0x67 0x0067 # LATIN SMALL LETTER G
+0x68 0x0068 # LATIN SMALL LETTER H
+0x69 0x0069 # LATIN SMALL LETTER I
+0x6A 0x006A # LATIN SMALL LETTER J
+0x6B 0x006B # LATIN SMALL LETTER K
+0x6C 0x006C # LATIN SMALL LETTER L
+0x6D 0x006D # LATIN SMALL LETTER M
+0x6E 0x006E # LATIN SMALL LETTER N
+0x6F 0x006F # LATIN SMALL LETTER O
+0x70 0x0070 # LATIN SMALL LETTER P
+0x71 0x0071 # LATIN SMALL LETTER Q
+0x72 0x0072 # LATIN SMALL LETTER R
+0x73 0x0073 # LATIN SMALL LETTER S
+0x74 0x0074 # LATIN SMALL LETTER T
+0x75 0x0075 # LATIN SMALL LETTER U
+0x76 0x0076 # LATIN SMALL LETTER V
+0x77 0x0077 # LATIN SMALL LETTER W
+0x78 0x0078 # LATIN SMALL LETTER X
+0x79 0x0079 # LATIN SMALL LETTER Y
+0x7A 0x007A # LATIN SMALL LETTER Z
+0x7B 0x007B # LEFT CURLY BRACKET
+0x7C 0x007C # VERTICAL LINE
+0x7D 0x007D # RIGHT CURLY BRACKET
+0x7E 0x007E # TILDE
+0x7F 0x007F # DELETE
+0x80 0x0080 # <control>
+0x81 0x0081 # <control>
+0x82 0x0082 # <control>
+0x83 0x0083 # <control>
+0x84 0x0084 # <control>
+0x85 0x0085 # <control>
+0x86 0x0086 # <control>
+0x87 0x0087 # <control>
+0x88 0x0088 # <control>
+0x89 0x0089 # <control>
+0x8A 0x008A # <control>
+0x8B 0x008B # <control>
+0x8C 0x008C # <control>
+0x8D 0x008D # <control>
+0x8E 0x008E # <control>
+0x8F 0x008F # <control>
+0x90 0x0090 # <control>
+0x91 0x0091 # <control>
+0x92 0x0092 # <control>
+0x93 0x0093 # <control>
+0x94 0x0094 # <control>
+0x95 0x0095 # <control>
+0x96 0x0096 # <control>
+0x97 0x0097 # <control>
+0x98 0x0098 # <control>
+0x99 0x0099 # <control>
+0x9A 0x009A # <control>
+0x9B 0x009B # <control>
+0x9C 0x009C # <control>
+0x9D 0x009D # <control>
+0x9E 0x009E # <control>
+0x9F 0x009F # <control>
+0xA0 0x00A0 # NO-BREAK SPACE
+0xA1 0x00A1 # INVERTED EXCLAMATION MARK
+0xA2 0x00A2 # CENT SIGN
+0xA3 0x00A3 # POUND SIGN
+0xA4 0x20AC # EURO SIGN
+0xA5 0x00A5 # YEN SIGN
+0xA6 0x0160 # LATIN CAPITAL LETTER S WITH CARON
+0xA7 0x00A7 # SECTION SIGN
+0xA8 0x0161 # LATIN SMALL LETTER S WITH CARON
+0xA9 0x00A9 # COPYRIGHT SIGN
+0xAA 0x00AA # FEMININE ORDINAL INDICATOR
+0xAB 0x00AB # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xAC 0x00AC # NOT SIGN
+0xAD 0x00AD # SOFT HYPHEN
+0xAE 0x00AE # REGISTERED SIGN
+0xAF 0x00AF # MACRON
+0xB0 0x00B0 # DEGREE SIGN
+0xB1 0x00B1 # PLUS-MINUS SIGN
+0xB2 0x00B2 # SUPERSCRIPT TWO
+0xB3 0x00B3 # SUPERSCRIPT THREE
+0xB4 0x017D # LATIN CAPITAL LETTER Z WITH CARON
+0xB5 0x00B5 # MICRO SIGN
+0xB6 0x00B6 # PILCROW SIGN
+0xB7 0x00B7 # MIDDLE DOT
+0xB8 0x017E # LATIN SMALL LETTER Z WITH CARON
+0xB9 0x00B9 # SUPERSCRIPT ONE
+0xBA 0x00BA # MASCULINE ORDINAL INDICATOR
+0xBB 0x00BB # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xBC 0x0152 # LATIN CAPITAL LIGATURE OE
+0xBD 0x0153 # LATIN SMALL LIGATURE OE
+0xBE 0x0178 # LATIN CAPITAL LETTER Y WITH DIAERESIS
+0xBF 0x00BF # INVERTED QUESTION MARK
+0xC0 0x00C0 # LATIN CAPITAL LETTER A WITH GRAVE
+0xC1 0x00C1 # LATIN CAPITAL LETTER A WITH ACUTE
+0xC2 0x00C2 # LATIN CAPITAL LETTER A WITH CIRCUMFLEX
+0xC3 0x00C3 # LATIN CAPITAL LETTER A WITH TILDE
+0xC4 0x00C4 # LATIN CAPITAL LETTER A WITH DIAERESIS
+0xC5 0x00C5 # LATIN CAPITAL LETTER A WITH RING ABOVE
+0xC6 0x00C6 # LATIN CAPITAL LETTER AE
+0xC7 0x00C7 # LATIN CAPITAL LETTER C WITH CEDILLA
+0xC8 0x00C8 # LATIN CAPITAL LETTER E WITH GRAVE
+0xC9 0x00C9 # LATIN CAPITAL LETTER E WITH ACUTE
+0xCA 0x00CA # LATIN CAPITAL LETTER E WITH CIRCUMFLEX
+0xCB 0x00CB # LATIN CAPITAL LETTER E WITH DIAERESIS
+0xCC 0x00CC # LATIN CAPITAL LETTER I WITH GRAVE
+0xCD 0x00CD # LATIN CAPITAL LETTER I WITH ACUTE
+0xCE 0x00CE # LATIN CAPITAL LETTER I WITH CIRCUMFLEX
+0xCF 0x00CF # LATIN CAPITAL LETTER I WITH DIAERESIS
+0xD0 0x00D0 # LATIN CAPITAL LETTER ETH
+0xD1 0x00D1 # LATIN CAPITAL LETTER N WITH TILDE
+0xD2 0x00D2 # LATIN CAPITAL LETTER O WITH GRAVE
+0xD3 0x00D3 # LATIN CAPITAL LETTER O WITH ACUTE
+0xD4 0x00D4 # LATIN CAPITAL LETTER O WITH CIRCUMFLEX
+0xD5 0x00D5 # LATIN CAPITAL LETTER O WITH TILDE
+0xD6 0x00D6 # LATIN CAPITAL LETTER O WITH DIAERESIS
+0xD7 0x00D7 # MULTIPLICATION SIGN
+0xD8 0x00D8 # LATIN CAPITAL LETTER O WITH STROKE
+0xD9 0x00D9 # LATIN CAPITAL LETTER U WITH GRAVE
+0xDA 0x00DA # LATIN CAPITAL LETTER U WITH ACUTE
+0xDB 0x00DB # LATIN CAPITAL LETTER U WITH CIRCUMFLEX
+0xDC 0x00DC # LATIN CAPITAL LETTER U WITH DIAERESIS
+0xDD 0x00DD # LATIN CAPITAL LETTER Y WITH ACUTE
+0xDE 0x00DE # LATIN CAPITAL LETTER THORN
+0xDF 0x00DF # LATIN SMALL LETTER SHARP S
+0xE0 0x00E0 # LATIN SMALL LETTER A WITH GRAVE
+0xE1 0x00E1 # LATIN SMALL LETTER A WITH ACUTE
+0xE2 0x00E2 # LATIN SMALL LETTER A WITH CIRCUMFLEX
+0xE3 0x00E3 # LATIN SMALL LETTER A WITH TILDE
+0xE4 0x00E4 # LATIN SMALL LETTER A WITH DIAERESIS
+0xE5 0x00E5 # LATIN SMALL LETTER A WITH RING ABOVE
+0xE6 0x00E6 # LATIN SMALL LETTER AE
+0xE7 0x00E7 # LATIN SMALL LETTER C WITH CEDILLA
+0xE8 0x00E8 # LATIN SMALL LETTER E WITH GRAVE
+0xE9 0x00E9 # LATIN SMALL LETTER E WITH ACUTE
+0xEA 0x00EA # LATIN SMALL LETTER E WITH CIRCUMFLEX
+0xEB 0x00EB # LATIN SMALL LETTER E WITH DIAERESIS
+0xEC 0x00EC # LATIN SMALL LETTER I WITH GRAVE
+0xED 0x00ED # LATIN SMALL LETTER I WITH ACUTE
+0xEE 0x00EE # LATIN SMALL LETTER I WITH CIRCUMFLEX
+0xEF 0x00EF # LATIN SMALL LETTER I WITH DIAERESIS
+0xF0 0x00F0 # LATIN SMALL LETTER ETH
+0xF1 0x00F1 # LATIN SMALL LETTER N WITH TILDE
+0xF2 0x00F2 # LATIN SMALL LETTER O WITH GRAVE
+0xF3 0x00F3 # LATIN SMALL LETTER O WITH ACUTE
+0xF4 0x00F4 # LATIN SMALL LETTER O WITH CIRCUMFLEX
+0xF5 0x00F5 # LATIN SMALL LETTER O WITH TILDE
+0xF6 0x00F6 # LATIN SMALL LETTER O WITH DIAERESIS
+0xF7 0x00F7 # DIVISION SIGN
+0xF8 0x00F8 # LATIN SMALL LETTER O WITH STROKE
+0xF9 0x00F9 # LATIN SMALL LETTER U WITH GRAVE
+0xFA 0x00FA # LATIN SMALL LETTER U WITH ACUTE
+0xFB 0x00FB # LATIN SMALL LETTER U WITH CIRCUMFLEX
+0xFC 0x00FC # LATIN SMALL LETTER U WITH DIAERESIS
+0xFD 0x00FD # LATIN SMALL LETTER Y WITH ACUTE
+0xFE 0x00FE # LATIN SMALL LETTER THORN
+0xFF 0x00FF # LATIN SMALL LETTER Y WITH DIAERESIS
+
--- /dev/null
+#\r
+# Name: ISO 8859-2 (1987) to Unicode\r
+# Unicode version: 1.1\r
+# Table version: 0.1\r
+# Table format: Format A\r
+# Date: 16 January 1995\r
+# Authors: Tim Greenwood <greenwood@r2me2.enet.dec.com>\r
+# John H. Jenkins <John_Jenkins@taligent.com>\r
+#\r
+# Copyright (c) 1991-1995 Unicode, Inc. All Rights reserved.\r
+#\r
+# This file is provided as-is by Unicode, Inc. (The Unicode Consortium).\r
+# No claims are made as to fitness for any particular purpose. No\r
+# warranties of any kind are expressed or implied. The recipient\r
+# agrees to determine applicability of information provided. If this\r
+# file has been provided on magnetic media by Unicode, Inc., the sole\r
+# remedy for any claim will be exchange of defective media within 90\r
+# days of receipt.\r
+#\r
+# Recipient is granted the right to make copies in any form for\r
+# internal distribution and to freely use the information supplied\r
+# in the creation of products supporting Unicode. Unicode, Inc.\r
+# specifically excludes the right to re-distribute this file directly\r
+# to third parties or other organizations whether for profit or not.\r
+#\r
+# General notes:\r
+#\r
+# This table contains the data the Unicode Consortium has on how\r
+# ISO 8859-2 (1987) characters map into Unicode.\r
+#\r
+# Format: Three tab-separated columns\r
+# Column #1 is the ISO 8859-2 code (in hex as 0xXX)\r
+# Column #2 is the Unicode (in hex as 0xXXXX)\r
+# Column #3 the Unicode name (follows a comment sign, '#')\r
+#\r
+# The entries are in ISO 8859-2 order\r
+#\r
+# Any comments or problems, contact <John_Jenkins@taligent.com>\r
+#\r
+0x20 0x0020 # SPACE\r
+0x21 0x0021 # EXCLAMATION MARK\r
+0x22 0x0022 # QUOTATION MARK\r
+0x23 0x0023 # NUMBER SIGN\r
+0x24 0x0024 # DOLLAR SIGN\r
+0x25 0x0025 # PERCENT SIGN\r
+0x26 0x0026 # AMPERSAND\r
+0x27 0x0027 # APOSTROPHE\r
+0x28 0x0028 # LEFT PARENTHESIS\r
+0x29 0x0029 # RIGHT PARENTHESIS\r
+0x2A 0x002A # ASTERISK\r
+0x2B 0x002B # PLUS SIGN\r
+0x2C 0x002C # COMMA\r
+0x2D 0x002D # HYPHEN-MINUS\r
+0x2E 0x002E # FULL STOP\r
+0x2F 0x002F # SOLIDUS\r
+0x30 0x0030 # DIGIT ZERO\r
+0x31 0x0031 # DIGIT ONE\r
+0x32 0x0032 # DIGIT TWO\r
+0x33 0x0033 # DIGIT THREE\r
+0x34 0x0034 # DIGIT FOUR\r
+0x35 0x0035 # DIGIT FIVE\r
+0x36 0x0036 # DIGIT SIX\r
+0x37 0x0037 # DIGIT SEVEN\r
+0x38 0x0038 # DIGIT EIGHT\r
+0x39 0x0039 # DIGIT NINE\r
+0x3A 0x003A # COLON\r
+0x3B 0x003B # SEMICOLON\r
+0x3C 0x003C # LESS-THAN SIGN\r
+0x3D 0x003D # EQUALS SIGN\r
+0x3E 0x003E # GREATER-THAN SIGN\r
+0x3F 0x003F # QUESTION MARK\r
+0x40 0x0040 # COMMERCIAL AT\r
+0x41 0x0041 # LATIN CAPITAL LETTER A\r
+0x42 0x0042 # LATIN CAPITAL LETTER B\r
+0x43 0x0043 # LATIN CAPITAL LETTER C\r
+0x44 0x0044 # LATIN CAPITAL LETTER D\r
+0x45 0x0045 # LATIN CAPITAL LETTER E\r
+0x46 0x0046 # LATIN CAPITAL LETTER F\r
+0x47 0x0047 # LATIN CAPITAL LETTER G\r
+0x48 0x0048 # LATIN CAPITAL LETTER H\r
+0x49 0x0049 # LATIN CAPITAL LETTER I\r
+0x4A 0x004A # LATIN CAPITAL LETTER J\r
+0x4B 0x004B # LATIN CAPITAL LETTER K\r
+0x4C 0x004C # LATIN CAPITAL LETTER L\r
+0x4D 0x004D # LATIN CAPITAL LETTER M\r
+0x4E 0x004E # LATIN CAPITAL LETTER N\r
+0x4F 0x004F # LATIN CAPITAL LETTER O\r
+0x50 0x0050 # LATIN CAPITAL LETTER P\r
+0x51 0x0051 # LATIN CAPITAL LETTER Q\r
+0x52 0x0052 # LATIN CAPITAL LETTER R\r
+0x53 0x0053 # LATIN CAPITAL LETTER S\r
+0x54 0x0054 # LATIN CAPITAL LETTER T\r
+0x55 0x0055 # LATIN CAPITAL LETTER U\r
+0x56 0x0056 # LATIN CAPITAL LETTER V\r
+0x57 0x0057 # LATIN CAPITAL LETTER W\r
+0x58 0x0058 # LATIN CAPITAL LETTER X\r
+0x59 0x0059 # LATIN CAPITAL LETTER Y\r
+0x5A 0x005A # LATIN CAPITAL LETTER Z\r
+0x5B 0x005B # LEFT SQUARE BRACKET\r
+0x5C 0x005C # REVERSE SOLIDUS\r
+0x5D 0x005D # RIGHT SQUARE BRACKET\r
+0x5E 0x005E # CIRCUMFLEX ACCENT\r
+0x5F 0x005F # LOW LINE\r
+0x60 0x0060 # GRAVE ACCENT\r
+0x61 0x0061 # LATIN SMALL LETTER A\r
+0x62 0x0062 # LATIN SMALL LETTER B\r
+0x63 0x0063 # LATIN SMALL LETTER C\r
+0x64 0x0064 # LATIN SMALL LETTER D\r
+0x65 0x0065 # LATIN SMALL LETTER E\r
+0x66 0x0066 # LATIN SMALL LETTER F\r
+0x67 0x0067 # LATIN SMALL LETTER G\r
+0x68 0x0068 # LATIN SMALL LETTER H\r
+0x69 0x0069 # LATIN SMALL LETTER I\r
+0x6A 0x006A # LATIN SMALL LETTER J\r
+0x6B 0x006B # LATIN SMALL LETTER K\r
+0x6C 0x006C # LATIN SMALL LETTER L\r
+0x6D 0x006D # LATIN SMALL LETTER M\r
+0x6E 0x006E # LATIN SMALL LETTER N\r
+0x6F 0x006F # LATIN SMALL LETTER O\r
+0x70 0x0070 # LATIN SMALL LETTER P\r
+0x71 0x0071 # LATIN SMALL LETTER Q\r
+0x72 0x0072 # LATIN SMALL LETTER R\r
+0x73 0x0073 # LATIN SMALL LETTER S\r
+0x74 0x0074 # LATIN SMALL LETTER T\r
+0x75 0x0075 # LATIN SMALL LETTER U\r
+0x76 0x0076 # LATIN SMALL LETTER V\r
+0x77 0x0077 # LATIN SMALL LETTER W\r
+0x78 0x0078 # LATIN SMALL LETTER X\r
+0x79 0x0079 # LATIN SMALL LETTER Y\r
+0x7A 0x007A # LATIN SMALL LETTER Z\r
+0x7B 0x007B # LEFT CURLY BRACKET\r
+0x7C 0x007C # VERTICAL LINE\r
+0x7D 0x007D # RIGHT CURLY BRACKET\r
+0x7E 0x007E # TILDE\r
+0xA0 0x00A0 # NO-BREAK SPACE\r
+0xA1 0x0104 # LATIN CAPITAL LETTER A WITH OGONEK\r
+0xA2 0x02D8 # BREVE\r
+0xA3 0x0141 # LATIN CAPITAL LETTER L WITH STROKE\r
+0xA4 0x00A4 # CURRENCY SIGN\r
+0xA5 0x013D # LATIN CAPITAL LETTER L WITH CARON\r
+0xA6 0x015A # LATIN CAPITAL LETTER S WITH ACUTE\r
+0xA7 0x00A7 # SECTION SIGN\r
+0xA8 0x00A8 # DIAERESIS\r
+0xA9 0x0160 # LATIN CAPITAL LETTER S WITH CARON\r
+0xAA 0x015E # LATIN CAPITAL LETTER S WITH CEDILLA\r
+0xAB 0x0164 # LATIN CAPITAL LETTER T WITH CARON\r
+0xAC 0x0179 # LATIN CAPITAL LETTER Z WITH ACUTE\r
+0xAD 0x00AD # SOFT HYPHEN\r
+0xAE 0x017D # LATIN CAPITAL LETTER Z WITH CARON\r
+0xAF 0x017B # LATIN CAPITAL LETTER Z WITH DOT ABOVE\r
+0xB0 0x00B0 # DEGREE SIGN\r
+0xB1 0x0105 # LATIN SMALL LETTER A WITH OGONEK\r
+0xB2 0x02DB # OGONEK\r
+0xB3 0x0142 # LATIN SMALL LETTER L WITH STROKE\r
+0xB4 0x00B4 # ACUTE ACCENT\r
+0xB5 0x013E # LATIN SMALL LETTER L WITH CARON\r
+0xB6 0x015B # LATIN SMALL LETTER S WITH ACUTE\r
+0xB7 0x02C7 # CARON\r
+0xB8 0x00B8 # CEDILLA\r
+0xB9 0x0161 # LATIN SMALL LETTER S WITH CARON\r
+0xBA 0x015F # LATIN SMALL LETTER S WITH CEDILLA\r
+0xBB 0x0165 # LATIN SMALL LETTER T WITH CARON\r
+0xBC 0x017A # LATIN SMALL LETTER Z WITH ACUTE\r
+0xBD 0x02DD # DOUBLE ACUTE ACCENT\r
+0xBE 0x017E # LATIN SMALL LETTER Z WITH CARON\r
+0xBF 0x017C # LATIN SMALL LETTER Z WITH DOT ABOVE\r
+0xC0 0x0154 # LATIN CAPITAL LETTER R WITH ACUTE\r
+0xC1 0x00C1 # LATIN CAPITAL LETTER A WITH ACUTE\r
+0xC2 0x00C2 # LATIN CAPITAL LETTER A WITH CIRCUMFLEX\r
+0xC3 0x0102 # LATIN CAPITAL LETTER A WITH BREVE\r
+0xC4 0x00C4 # LATIN CAPITAL LETTER A WITH DIAERESIS\r
+0xC5 0x0139 # LATIN CAPITAL LETTER L WITH ACUTE\r
+0xC6 0x0106 # LATIN CAPITAL LETTER C WITH ACUTE\r
+0xC7 0x00C7 # LATIN CAPITAL LETTER C WITH CEDILLA\r
+0xC8 0x010C # LATIN CAPITAL LETTER C WITH CARON\r
+0xC9 0x00C9 # LATIN CAPITAL LETTER E WITH ACUTE\r
+0xCA 0x0118 # LATIN CAPITAL LETTER E WITH OGONEK\r
+0xCB 0x00CB # LATIN CAPITAL LETTER E WITH DIAERESIS\r
+0xCC 0x011A # LATIN CAPITAL LETTER E WITH CARON\r
+0xCD 0x00CD # LATIN CAPITAL LETTER I WITH ACUTE\r
+0xCE 0x00CE # LATIN CAPITAL LETTER I WITH CIRCUMFLEX\r
+0xCF 0x010E # LATIN CAPITAL LETTER D WITH CARON\r
+0xD0 0x0110 # LATIN CAPITAL LETTER D WITH STROKE\r
+0xD1 0x0143 # LATIN CAPITAL LETTER N WITH ACUTE\r
+0xD2 0x0147 # LATIN CAPITAL LETTER N WITH CARON\r
+0xD3 0x00D3 # LATIN CAPITAL LETTER O WITH ACUTE\r
+0xD4 0x00D4 # LATIN CAPITAL LETTER O WITH CIRCUMFLEX\r
+0xD5 0x0150 # LATIN CAPITAL LETTER O WITH DOUBLE ACUTE\r
+0xD6 0x00D6 # LATIN CAPITAL LETTER O WITH DIAERESIS\r
+0xD7 0x00D7 # MULTIPLICATION SIGN\r
+0xD8 0x0158 # LATIN CAPITAL LETTER R WITH CARON\r
+0xD9 0x016E # LATIN CAPITAL LETTER U WITH RING ABOVE\r
+0xDA 0x00DA # LATIN CAPITAL LETTER U WITH ACUTE\r
+0xDB 0x0170 # LATIN CAPITAL LETTER U WITH DOUBLE ACUTE\r
+0xDC 0x00DC # LATIN CAPITAL LETTER U WITH DIAERESIS\r
+0xDD 0x00DD # LATIN CAPITAL LETTER Y WITH ACUTE\r
+0xDE 0x0162 # LATIN CAPITAL LETTER T WITH CEDILLA\r
+0xDF 0x00DF # LATIN SMALL LETTER SHARP S\r
+0xE0 0x0155 # LATIN SMALL LETTER R WITH ACUTE\r
+0xE1 0x00E1 # LATIN SMALL LETTER A WITH ACUTE\r
+0xE2 0x00E2 # LATIN SMALL LETTER A WITH CIRCUMFLEX\r
+0xE3 0x0103 # LATIN SMALL LETTER A WITH BREVE\r
+0xE4 0x00E4 # LATIN SMALL LETTER A WITH DIAERESIS\r
+0xE5 0x013A # LATIN SMALL LETTER L WITH ACUTE\r
+0xE6 0x0107 # LATIN SMALL LETTER C WITH ACUTE\r
+0xE7 0x00E7 # LATIN SMALL LETTER C WITH CEDILLA\r
+0xE8 0x010D # LATIN SMALL LETTER C WITH CARON\r
+0xE9 0x00E9 # LATIN SMALL LETTER E WITH ACUTE\r
+0xEA 0x0119 # LATIN SMALL LETTER E WITH OGONEK\r
+0xEB 0x00EB # LATIN SMALL LETTER E WITH DIAERESIS\r
+0xEC 0x011B # LATIN SMALL LETTER E WITH CARON\r
+0xED 0x00ED # LATIN SMALL LETTER I WITH ACUTE\r
+0xEE 0x00EE # LATIN SMALL LETTER I WITH CIRCUMFLEX\r
+0xEF 0x010F # LATIN SMALL LETTER D WITH CARON\r
+0xF0 0x0111 # LATIN SMALL LETTER D WITH STROKE\r
+0xF1 0x0144 # LATIN SMALL LETTER N WITH ACUTE\r
+0xF2 0x0148 # LATIN SMALL LETTER N WITH CARON\r
+0xF3 0x00F3 # LATIN SMALL LETTER O WITH ACUTE\r
+0xF4 0x00F4 # LATIN SMALL LETTER O WITH CIRCUMFLEX\r
+0xF5 0x0151 # LATIN SMALL LETTER O WITH DOUBLE ACUTE\r
+0xF6 0x00F6 # LATIN SMALL LETTER O WITH DIAERESIS\r
+0xF7 0x00F7 # DIVISION SIGN\r
+0xF8 0x0159 # LATIN SMALL LETTER R WITH CARON\r
+0xF9 0x016F # LATIN SMALL LETTER U WITH RING ABOVE\r
+0xFA 0x00FA # LATIN SMALL LETTER U WITH ACUTE\r
+0xFB 0x0171 # LATIN SMALL LETTER U WITH DOUBLE ACUTE\r
+0xFC 0x00FC # LATIN SMALL LETTER U WITH DIAERESIS\r
+0xFD 0x00FD # LATIN SMALL LETTER Y WITH ACUTE\r
+0xFE 0x0163 # LATIN SMALL LETTER T WITH CEDILLA\r
+0xFF 0x02D9 # DOT ABOVE\r
--- /dev/null
+#\r
+# Name: ISO 8859-3 (1988) to Unicode\r
+# Unicode version: 1.1\r
+# Table version: 0.1\r
+# Table format: Format A\r
+# Date: 16 January 1995\r
+# Authors: Tim Greenwood <greenwood@r2me2.enet.dec.com>\r
+# John H. Jenkins <John_Jenkins@taligent.com>\r
+#\r
+# Copyright (c) 1991-1995 Unicode, Inc. All Rights reserved.\r
+#\r
+# This file is provided as-is by Unicode, Inc. (The Unicode Consortium).\r
+# No claims are made as to fitness for any particular purpose. No\r
+# warranties of any kind are expressed or implied. The recipient\r
+# agrees to determine applicability of information provided. If this\r
+# file has been provided on magnetic media by Unicode, Inc., the sole\r
+# remedy for any claim will be exchange of defective media within 90\r
+# days of receipt.\r
+#\r
+# Recipient is granted the right to make copies in any form for\r
+# internal distribution and to freely use the information supplied\r
+# in the creation of products supporting Unicode. Unicode, Inc.\r
+# specifically excludes the right to re-distribute this file directly\r
+# to third parties or other organizations whether for profit or not.\r
+#\r
+# General notes:\r
+#\r
+# This table contains the data the Unicode Consortium has on how\r
+# ISO 8859-3 (1988) characters map into Unicode.\r
+#\r
+# Format: Three tab-separated columns\r
+# Column #1 is the ISO 8859-3 code (in hex as 0xXX)\r
+# Column #2 is the Unicode (in hex as 0xXXXX)\r
+# Column #3 the Unicode name (follows a comment sign, '#')\r
+#\r
+# The entries are in ISO 8859-3 order\r
+#\r
+# Any comments or problems, contact <John_Jenkins@taligent.com>\r
+#\r
+0x20 0x0020 # SPACE\r
+0x21 0x0021 # EXCLAMATION MARK\r
+0x22 0x0022 # QUOTATION MARK\r
+0x23 0x0023 # NUMBER SIGN\r
+0x24 0x0024 # DOLLAR SIGN\r
+0x25 0x0025 # PERCENT SIGN\r
+0x26 0x0026 # AMPERSAND\r
+0x27 0x0027 # APOSTROPHE\r
+0x28 0x0028 # LEFT PARENTHESIS\r
+0x29 0x0029 # RIGHT PARENTHESIS\r
+0x2A 0x002A # ASTERISK\r
+0x2B 0x002B # PLUS SIGN\r
+0x2C 0x002C # COMMA\r
+0x2D 0x002D # HYPHEN-MINUS\r
+0x2E 0x002E # FULL STOP\r
+0x2F 0x002F # SOLIDUS\r
+0x30 0x0030 # DIGIT ZERO\r
+0x31 0x0031 # DIGIT ONE\r
+0x32 0x0032 # DIGIT TWO\r
+0x33 0x0033 # DIGIT THREE\r
+0x34 0x0034 # DIGIT FOUR\r
+0x35 0x0035 # DIGIT FIVE\r
+0x36 0x0036 # DIGIT SIX\r
+0x37 0x0037 # DIGIT SEVEN\r
+0x38 0x0038 # DIGIT EIGHT\r
+0x39 0x0039 # DIGIT NINE\r
+0x3A 0x003A # COLON\r
+0x3B 0x003B # SEMICOLON\r
+0x3C 0x003C # LESS-THAN SIGN\r
+0x3D 0x003D # EQUALS SIGN\r
+0x3E 0x003E # GREATER-THAN SIGN\r
+0x3F 0x003F # QUESTION MARK\r
+0x40 0x0040 # COMMERCIAL AT\r
+0x41 0x0041 # LATIN CAPITAL LETTER A\r
+0x42 0x0042 # LATIN CAPITAL LETTER B\r
+0x43 0x0043 # LATIN CAPITAL LETTER C\r
+0x44 0x0044 # LATIN CAPITAL LETTER D\r
+0x45 0x0045 # LATIN CAPITAL LETTER E\r
+0x46 0x0046 # LATIN CAPITAL LETTER F\r
+0x47 0x0047 # LATIN CAPITAL LETTER G\r
+0x48 0x0048 # LATIN CAPITAL LETTER H\r
+0x49 0x0049 # LATIN CAPITAL LETTER I\r
+0x4A 0x004A # LATIN CAPITAL LETTER J\r
+0x4B 0x004B # LATIN CAPITAL LETTER K\r
+0x4C 0x004C # LATIN CAPITAL LETTER L\r
+0x4D 0x004D # LATIN CAPITAL LETTER M\r
+0x4E 0x004E # LATIN CAPITAL LETTER N\r
+0x4F 0x004F # LATIN CAPITAL LETTER O\r
+0x50 0x0050 # LATIN CAPITAL LETTER P\r
+0x51 0x0051 # LATIN CAPITAL LETTER Q\r
+0x52 0x0052 # LATIN CAPITAL LETTER R\r
+0x53 0x0053 # LATIN CAPITAL LETTER S\r
+0x54 0x0054 # LATIN CAPITAL LETTER T\r
+0x55 0x0055 # LATIN CAPITAL LETTER U\r
+0x56 0x0056 # LATIN CAPITAL LETTER V\r
+0x57 0x0057 # LATIN CAPITAL LETTER W\r
+0x58 0x0058 # LATIN CAPITAL LETTER X\r
+0x59 0x0059 # LATIN CAPITAL LETTER Y\r
+0x5A 0x005A # LATIN CAPITAL LETTER Z\r
+0x5B 0x005B # LEFT SQUARE BRACKET\r
+0x5C 0x005C # REVERSE SOLIDUS\r
+0x5D 0x005D # RIGHT SQUARE BRACKET\r
+0x5E 0x005E # CIRCUMFLEX ACCENT\r
+0x5F 0x005F # LOW LINE\r
+0x60 0x0060 # GRAVE ACCENT\r
+0x61 0x0061 # LATIN SMALL LETTER A\r
+0x62 0x0062 # LATIN SMALL LETTER B\r
+0x63 0x0063 # LATIN SMALL LETTER C\r
+0x64 0x0064 # LATIN SMALL LETTER D\r
+0x65 0x0065 # LATIN SMALL LETTER E\r
+0x66 0x0066 # LATIN SMALL LETTER F\r
+0x67 0x0067 # LATIN SMALL LETTER G\r
+0x68 0x0068 # LATIN SMALL LETTER H\r
+0x69 0x0069 # LATIN SMALL LETTER I\r
+0x6A 0x006A # LATIN SMALL LETTER J\r
+0x6B 0x006B # LATIN SMALL LETTER K\r
+0x6C 0x006C # LATIN SMALL LETTER L\r
+0x6D 0x006D # LATIN SMALL LETTER M\r
+0x6E 0x006E # LATIN SMALL LETTER N\r
+0x6F 0x006F # LATIN SMALL LETTER O\r
+0x70 0x0070 # LATIN SMALL LETTER P\r
+0x71 0x0071 # LATIN SMALL LETTER Q\r
+0x72 0x0072 # LATIN SMALL LETTER R\r
+0x73 0x0073 # LATIN SMALL LETTER S\r
+0x74 0x0074 # LATIN SMALL LETTER T\r
+0x75 0x0075 # LATIN SMALL LETTER U\r
+0x76 0x0076 # LATIN SMALL LETTER V\r
+0x77 0x0077 # LATIN SMALL LETTER W\r
+0x78 0x0078 # LATIN SMALL LETTER X\r
+0x79 0x0079 # LATIN SMALL LETTER Y\r
+0x7A 0x007A # LATIN SMALL LETTER Z\r
+0x7B 0x007B # LEFT CURLY BRACKET\r
+0x7C 0x007C # VERTICAL LINE\r
+0x7D 0x007D # RIGHT CURLY BRACKET\r
+0x7E 0x007E # TILDE\r
+0xA0 0x00A0 # NO-BREAK SPACE\r
+0xA1 0x0126 # LATIN CAPITAL LETTER H WITH STROKE\r
+0xA2 0x02D8 # BREVE\r
+0xA3 0x00A3 # POUND SIGN\r
+0xA4 0x00A4 # CURRENCY SIGN\r
+0xA6 0x0124 # LATIN CAPITAL LETTER H WITH CIRCUMFLEX\r
+0xA7 0x00A7 # SECTION SIGN\r
+0xA8 0x00A8 # DIAERESIS\r
+0xA9 0x0130 # LATIN CAPITAL LETTER I WITH DOT ABOVE\r
+0xAA 0x015E # LATIN CAPITAL LETTER S WITH CEDILLA\r
+0xAB 0x011E # LATIN CAPITAL LETTER G WITH BREVE\r
+0xAC 0x0134 # LATIN CAPITAL LETTER J WITH CIRCUMFLEX\r
+0xAD 0x00AD # SOFT HYPHEN\r
+0xAF 0x017B # LATIN CAPITAL LETTER Z WITH DOT ABOVE\r
+0xB0 0x00B0 # DEGREE SIGN\r
+0xB1 0x0127 # LATIN SMALL LETTER H WITH STROKE\r
+0xB2 0x00B2 # SUPERSCRIPT TWO\r
+0xB3 0x00B3 # SUPERSCRIPT THREE\r
+0xB4 0x00B4 # ACUTE ACCENT\r
+0xB5 0x00B5 # MICRO SIGN\r
+0xB6 0x0125 # LATIN SMALL LETTER H WITH CIRCUMFLEX\r
+0xB7 0x00B7 # MIDDLE DOT\r
+0xB8 0x00B8 # CEDILLA\r
+0xB9 0x0131 # LATIN SMALL LETTER DOTLESS I\r
+0xBA 0x015F # LATIN SMALL LETTER S WITH CEDILLA\r
+0xBB 0x011F # LATIN SMALL LETTER G WITH BREVE\r
+0xBC 0x0135 # LATIN SMALL LETTER J WITH CIRCUMFLEX\r
+0xBD 0x00BD # VULGAR FRACTION ONE HALF\r
+0xBF 0x017C # LATIN SMALL LETTER Z WITH DOT ABOVE\r
+0xC0 0x00C0 # LATIN CAPITAL LETTER A WITH GRAVE\r
+0xC1 0x00C1 # LATIN CAPITAL LETTER A WITH ACUTE\r
+0xC2 0x00C2 # LATIN CAPITAL LETTER A WITH CIRCUMFLEX\r
+0xC4 0x00C4 # LATIN CAPITAL LETTER A WITH DIAERESIS\r
+0xC5 0x010A # LATIN CAPITAL LETTER C WITH DOT ABOVE\r
+0xC6 0x0108 # LATIN CAPITAL LETTER C WITH CIRCUMFLEX\r
+0xC7 0x00C7 # LATIN CAPITAL LETTER C WITH CEDILLA\r
+0xC8 0x00C8 # LATIN CAPITAL LETTER E WITH GRAVE\r
+0xC9 0x00C9 # LATIN CAPITAL LETTER E WITH ACUTE\r
+0xCA 0x00CA # LATIN CAPITAL LETTER E WITH CIRCUMFLEX\r
+0xCB 0x00CB # LATIN CAPITAL LETTER E WITH DIAERESIS\r
+0xCC 0x00CC # LATIN CAPITAL LETTER I WITH GRAVE\r
+0xCD 0x00CD # LATIN CAPITAL LETTER I WITH ACUTE\r
+0xCE 0x00CE # LATIN CAPITAL LETTER I WITH CIRCUMFLEX\r
+0xCF 0x00CF # LATIN CAPITAL LETTER I WITH DIAERESIS\r
+0xD1 0x00D1 # LATIN CAPITAL LETTER N WITH TILDE\r
+0xD2 0x00D2 # LATIN CAPITAL LETTER O WITH GRAVE\r
+0xD3 0x00D3 # LATIN CAPITAL LETTER O WITH ACUTE\r
+0xD4 0x00D4 # LATIN CAPITAL LETTER O WITH CIRCUMFLEX\r
+0xD5 0x0120 # LATIN CAPITAL LETTER G WITH DOT ABOVE\r
+0xD6 0x00D6 # LATIN CAPITAL LETTER O WITH DIAERESIS\r
+0xD7 0x00D7 # MULTIPLICATION SIGN\r
+0xD8 0x011C # LATIN CAPITAL LETTER G WITH CIRCUMFLEX\r
+0xD9 0x00D9 # LATIN CAPITAL LETTER U WITH GRAVE\r
+0xDA 0x00DA # LATIN CAPITAL LETTER U WITH ACUTE\r
+0xDB 0x00DB # LATIN CAPITAL LETTER U WITH CIRCUMFLEX\r
+0xDC 0x00DC # LATIN CAPITAL LETTER U WITH DIAERESIS\r
+0xDD 0x016C # LATIN CAPITAL LETTER U WITH BREVE\r
+0xDE 0x015C # LATIN CAPITAL LETTER S WITH CIRCUMFLEX\r
+0xDF 0x00DF # LATIN SMALL LETTER SHARP S\r
+0xE0 0x00E0 # LATIN SMALL LETTER A WITH GRAVE\r
+0xE1 0x00E1 # LATIN SMALL LETTER A WITH ACUTE\r
+0xE2 0x00E2 # LATIN SMALL LETTER A WITH CIRCUMFLEX\r
+0xE4 0x00E4 # LATIN SMALL LETTER A WITH DIAERESIS\r
+0xE5 0x010B # LATIN SMALL LETTER C WITH DOT ABOVE\r
+0xE6 0x0109 # LATIN SMALL LETTER C WITH CIRCUMFLEX\r
+0xE7 0x00E7 # LATIN SMALL LETTER C WITH CEDILLA\r
+0xE8 0x00E8 # LATIN SMALL LETTER E WITH GRAVE\r
+0xE9 0x00E9 # LATIN SMALL LETTER E WITH ACUTE\r
+0xEA 0x00EA # LATIN SMALL LETTER E WITH CIRCUMFLEX\r
+0xEB 0x00EB # LATIN SMALL LETTER E WITH DIAERESIS\r
+0xEC 0x00EC # LATIN SMALL LETTER I WITH GRAVE\r
+0xED 0x00ED # LATIN SMALL LETTER I WITH ACUTE\r
+0xEE 0x00EE # LATIN SMALL LETTER I WITH CIRCUMFLEX\r
+0xEF 0x00EF # LATIN SMALL LETTER I WITH DIAERESIS\r
+0xF1 0x00F1 # LATIN SMALL LETTER N WITH TILDE\r
+0xF2 0x00F2 # LATIN SMALL LETTER O WITH GRAVE\r
+0xF3 0x00F3 # LATIN SMALL LETTER O WITH ACUTE\r
+0xF4 0x00F4 # LATIN SMALL LETTER O WITH CIRCUMFLEX\r
+0xF5 0x0121 # LATIN SMALL LETTER G WITH DOT ABOVE\r
+0xF6 0x00F6 # LATIN SMALL LETTER O WITH DIAERESIS\r
+0xF7 0x00F7 # DIVISION SIGN\r
+0xF8 0x011D # LATIN SMALL LETTER G WITH CIRCUMFLEX\r
+0xF9 0x00F9 # LATIN SMALL LETTER U WITH GRAVE\r
+0xFA 0x00FA # LATIN SMALL LETTER U WITH ACUTE\r
+0xFB 0x00FB # LATIN SMALL LETTER U WITH CIRCUMFLEX\r
+0xFC 0x00FC # LATIN SMALL LETTER U WITH DIAERESIS\r
+0xFD 0x016D # LATIN SMALL LETTER U WITH BREVE\r
+0xFE 0x015D # LATIN SMALL LETTER S WITH CIRCUMFLEX\r
+0xFF 0x02D9 # DOT ABOVE\r
--- /dev/null
+#\r
+# Name: ISO 8859-4 (1988) to Unicode\r
+# Unicode version: 1.1\r
+# Table version: 0.1\r
+# Table format: Format A\r
+# Date: 16 January 1995\r
+# Authors: Tim Greenwood <greenwood@r2me2.enet.dec.com>\r
+# John H. Jenkins <John_Jenkins@taligent.com>\r
+#\r
+# Copyright (c) 1991-1995 Unicode, Inc. All Rights reserved.\r
+#\r
+# This file is provided as-is by Unicode, Inc. (The Unicode Consortium).\r
+# No claims are made as to fitness for any particular purpose. No\r
+# warranties of any kind are expressed or implied. The recipient\r
+# agrees to determine applicability of information provided. If this\r
+# file has been provided on magnetic media by Unicode, Inc., the sole\r
+# remedy for any claim will be exchange of defective media within 90\r
+# days of receipt.\r
+#\r
+# Recipient is granted the right to make copies in any form for\r
+# internal distribution and to freely use the information supplied\r
+# in the creation of products supporting Unicode. Unicode, Inc.\r
+# specifically excludes the right to re-distribute this file directly\r
+# to third parties or other organizations whether for profit or not.\r
+#\r
+# General notes:\r
+#\r
+# This table contains the data the Unicode Consortium has on how\r
+# ISO 8859-4 (1988) characters map into Unicode.\r
+#\r
+# Format: Three tab-separated columns\r
+# Column #1 is the ISO 8859-4 code (in hex as 0xXX)\r
+# Column #2 is the Unicode (in hex as 0xXXXX)\r
+# Column #3 the Unicode name (follows a comment sign, '#')\r
+#\r
+# The entries are in ISO 8859-4 order\r
+#\r
+# Any comments or problems, contact <John_Jenkins@taligent.com>\r
+#\r
+0x20 0x0020 # SPACE\r
+0x21 0x0021 # EXCLAMATION MARK\r
+0x22 0x0022 # QUOTATION MARK\r
+0x23 0x0023 # NUMBER SIGN\r
+0x24 0x0024 # DOLLAR SIGN\r
+0x25 0x0025 # PERCENT SIGN\r
+0x26 0x0026 # AMPERSAND\r
+0x27 0x0027 # APOSTROPHE\r
+0x28 0x0028 # LEFT PARENTHESIS\r
+0x29 0x0029 # RIGHT PARENTHESIS\r
+0x2A 0x002A # ASTERISK\r
+0x2B 0x002B # PLUS SIGN\r
+0x2C 0x002C # COMMA\r
+0x2D 0x002D # HYPHEN-MINUS\r
+0x2E 0x002E # FULL STOP\r
+0x2F 0x002F # SOLIDUS\r
+0x30 0x0030 # DIGIT ZERO\r
+0x31 0x0031 # DIGIT ONE\r
+0x32 0x0032 # DIGIT TWO\r
+0x33 0x0033 # DIGIT THREE\r
+0x34 0x0034 # DIGIT FOUR\r
+0x35 0x0035 # DIGIT FIVE\r
+0x36 0x0036 # DIGIT SIX\r
+0x37 0x0037 # DIGIT SEVEN\r
+0x38 0x0038 # DIGIT EIGHT\r
+0x39 0x0039 # DIGIT NINE\r
+0x3A 0x003A # COLON\r
+0x3B 0x003B # SEMICOLON\r
+0x3C 0x003C # LESS-THAN SIGN\r
+0x3D 0x003D # EQUALS SIGN\r
+0x3E 0x003E # GREATER-THAN SIGN\r
+0x3F 0x003F # QUESTION MARK\r
+0x40 0x0040 # COMMERCIAL AT\r
+0x41 0x0041 # LATIN CAPITAL LETTER A\r
+0x42 0x0042 # LATIN CAPITAL LETTER B\r
+0x43 0x0043 # LATIN CAPITAL LETTER C\r
+0x44 0x0044 # LATIN CAPITAL LETTER D\r
+0x45 0x0045 # LATIN CAPITAL LETTER E\r
+0x46 0x0046 # LATIN CAPITAL LETTER F\r
+0x47 0x0047 # LATIN CAPITAL LETTER G\r
+0x48 0x0048 # LATIN CAPITAL LETTER H\r
+0x49 0x0049 # LATIN CAPITAL LETTER I\r
+0x4A 0x004A # LATIN CAPITAL LETTER J\r
+0x4B 0x004B # LATIN CAPITAL LETTER K\r
+0x4C 0x004C # LATIN CAPITAL LETTER L\r
+0x4D 0x004D # LATIN CAPITAL LETTER M\r
+0x4E 0x004E # LATIN CAPITAL LETTER N\r
+0x4F 0x004F # LATIN CAPITAL LETTER O\r
+0x50 0x0050 # LATIN CAPITAL LETTER P\r
+0x51 0x0051 # LATIN CAPITAL LETTER Q\r
+0x52 0x0052 # LATIN CAPITAL LETTER R\r
+0x53 0x0053 # LATIN CAPITAL LETTER S\r
+0x54 0x0054 # LATIN CAPITAL LETTER T\r
+0x55 0x0055 # LATIN CAPITAL LETTER U\r
+0x56 0x0056 # LATIN CAPITAL LETTER V\r
+0x57 0x0057 # LATIN CAPITAL LETTER W\r
+0x58 0x0058 # LATIN CAPITAL LETTER X\r
+0x59 0x0059 # LATIN CAPITAL LETTER Y\r
+0x5A 0x005A # LATIN CAPITAL LETTER Z\r
+0x5B 0x005B # LEFT SQUARE BRACKET\r
+0x5C 0x005C # REVERSE SOLIDUS\r
+0x5D 0x005D # RIGHT SQUARE BRACKET\r
+0x5E 0x005E # CIRCUMFLEX ACCENT\r
+0x5F 0x005F # LOW LINE\r
+0x60 0x0060 # GRAVE ACCENT\r
+0x61 0x0061 # LATIN SMALL LETTER A\r
+0x62 0x0062 # LATIN SMALL LETTER B\r
+0x63 0x0063 # LATIN SMALL LETTER C\r
+0x64 0x0064 # LATIN SMALL LETTER D\r
+0x65 0x0065 # LATIN SMALL LETTER E\r
+0x66 0x0066 # LATIN SMALL LETTER F\r
+0x67 0x0067 # LATIN SMALL LETTER G\r
+0x68 0x0068 # LATIN SMALL LETTER H\r
+0x69 0x0069 # LATIN SMALL LETTER I\r
+0x6A 0x006A # LATIN SMALL LETTER J\r
+0x6B 0x006B # LATIN SMALL LETTER K\r
+0x6C 0x006C # LATIN SMALL LETTER L\r
+0x6D 0x006D # LATIN SMALL LETTER M\r
+0x6E 0x006E # LATIN SMALL LETTER N\r
+0x6F 0x006F # LATIN SMALL LETTER O\r
+0x70 0x0070 # LATIN SMALL LETTER P\r
+0x71 0x0071 # LATIN SMALL LETTER Q\r
+0x72 0x0072 # LATIN SMALL LETTER R\r
+0x73 0x0073 # LATIN SMALL LETTER S\r
+0x74 0x0074 # LATIN SMALL LETTER T\r
+0x75 0x0075 # LATIN SMALL LETTER U\r
+0x76 0x0076 # LATIN SMALL LETTER V\r
+0x77 0x0077 # LATIN SMALL LETTER W\r
+0x78 0x0078 # LATIN SMALL LETTER X\r
+0x79 0x0079 # LATIN SMALL LETTER Y\r
+0x7A 0x007A # LATIN SMALL LETTER Z\r
+0x7B 0x007B # LEFT CURLY BRACKET\r
+0x7C 0x007C # VERTICAL LINE\r
+0x7D 0x007D # RIGHT CURLY BRACKET\r
+0x7E 0x007E # TILDE\r
+0xA0 0x00A0 # NO-BREAK SPACE\r
+0xA1 0x0104 # LATIN CAPITAL LETTER A WITH OGONEK\r
+0xA2 0x0138 # LATIN SMALL LETTER KRA\r
+0xA3 0x0156 # LATIN CAPITAL LETTER R WITH CEDILLA\r
+0xA4 0x00A4 # CURRENCY SIGN\r
+0xA5 0x0128 # LATIN CAPITAL LETTER I WITH TILDE\r
+0xA6 0x013B # LATIN CAPITAL LETTER L WITH CEDILLA\r
+0xA7 0x00A7 # SECTION SIGN\r
+0xA8 0x00A8 # DIAERESIS\r
+0xA9 0x0160 # LATIN CAPITAL LETTER S WITH CARON\r
+0xAA 0x0112 # LATIN CAPITAL LETTER E WITH MACRON\r
+0xAB 0x0122 # LATIN CAPITAL LETTER G WITH CEDILLA\r
+0xAC 0x0166 # LATIN CAPITAL LETTER T WITH STROKE\r
+0xAD 0x00AD # SOFT HYPHEN\r
+0xAE 0x017D # LATIN CAPITAL LETTER Z WITH CARON\r
+0xAF 0x00AF # MACRON\r
+0xB0 0x00B0 # DEGREE SIGN\r
+0xB1 0x0105 # LATIN SMALL LETTER A WITH OGONEK\r
+0xB2 0x02DB # OGONEK\r
+0xB3 0x0157 # LATIN SMALL LETTER R WITH CEDILLA\r
+0xB4 0x00B4 # ACUTE ACCENT\r
+0xB5 0x0129 # LATIN SMALL LETTER I WITH TILDE\r
+0xB6 0x013C # LATIN SMALL LETTER L WITH CEDILLA\r
+0xB7 0x02C7 # CARON\r
+0xB8 0x00B8 # CEDILLA\r
+0xB9 0x0161 # LATIN SMALL LETTER S WITH CARON\r
+0xBA 0x0113 # LATIN SMALL LETTER E WITH MACRON\r
+0xBB 0x0123 # LATIN SMALL LETTER G WITH CEDILLA\r
+0xBC 0x0167 # LATIN SMALL LETTER T WITH STROKE\r
+0xBD 0x014A # LATIN CAPITAL LETTER ENG\r
+0xBE 0x017E # LATIN SMALL LETTER Z WITH CARON\r
+0xBF 0x014B # LATIN SMALL LETTER ENG\r
+0xC0 0x0100 # LATIN CAPITAL LETTER A WITH MACRON\r
+0xC1 0x00C1 # LATIN CAPITAL LETTER A WITH ACUTE\r
+0xC2 0x00C2 # LATIN CAPITAL LETTER A WITH CIRCUMFLEX\r
+0xC3 0x00C3 # LATIN CAPITAL LETTER A WITH TILDE\r
+0xC4 0x00C4 # LATIN CAPITAL LETTER A WITH DIAERESIS\r
+0xC5 0x00C5 # LATIN CAPITAL LETTER A WITH RING ABOVE\r
+0xC6 0x00C6 # LATIN CAPITAL LETTER AE\r
+0xC7 0x012E # LATIN CAPITAL LETTER I WITH OGONEK\r
+0xC8 0x010C # LATIN CAPITAL LETTER C WITH CARON\r
+0xC9 0x00C9 # LATIN CAPITAL LETTER E WITH ACUTE\r
+0xCA 0x0118 # LATIN CAPITAL LETTER E WITH OGONEK\r
+0xCB 0x00CB # LATIN CAPITAL LETTER E WITH DIAERESIS\r
+0xCC 0x0116 # LATIN CAPITAL LETTER E WITH DOT ABOVE\r
+0xCD 0x00CD # LATIN CAPITAL LETTER I WITH ACUTE\r
+0xCE 0x00CE # LATIN CAPITAL LETTER I WITH CIRCUMFLEX\r
+0xCF 0x012A # LATIN CAPITAL LETTER I WITH MACRON\r
+0xD0 0x0110 # LATIN CAPITAL LETTER D WITH STROKE\r
+0xD1 0x0145 # LATIN CAPITAL LETTER N WITH CEDILLA\r
+0xD2 0x014C # LATIN CAPITAL LETTER O WITH MACRON\r
+0xD3 0x0136 # LATIN CAPITAL LETTER K WITH CEDILLA\r
+0xD4 0x00D4 # LATIN CAPITAL LETTER O WITH CIRCUMFLEX\r
+0xD5 0x00D5 # LATIN CAPITAL LETTER O WITH TILDE\r
+0xD6 0x00D6 # LATIN CAPITAL LETTER O WITH DIAERESIS\r
+0xD7 0x00D7 # MULTIPLICATION SIGN\r
+0xD8 0x00D8 # LATIN CAPITAL LETTER O WITH STROKE\r
+0xD9 0x0172 # LATIN CAPITAL LETTER U WITH OGONEK\r
+0xDA 0x00DA # LATIN CAPITAL LETTER U WITH ACUTE\r
+0xDB 0x00DB # LATIN CAPITAL LETTER U WITH CIRCUMFLEX\r
+0xDC 0x00DC # LATIN CAPITAL LETTER U WITH DIAERESIS\r
+0xDD 0x0168 # LATIN CAPITAL LETTER U WITH TILDE\r
+0xDE 0x016A # LATIN CAPITAL LETTER U WITH MACRON\r
+0xDF 0x00DF # LATIN SMALL LETTER SHARP S\r
+0xE0 0x0101 # LATIN SMALL LETTER A WITH MACRON\r
+0xE1 0x00E1 # LATIN SMALL LETTER A WITH ACUTE\r
+0xE2 0x00E2 # LATIN SMALL LETTER A WITH CIRCUMFLEX\r
+0xE3 0x00E3 # LATIN SMALL LETTER A WITH TILDE\r
+0xE4 0x00E4 # LATIN SMALL LETTER A WITH DIAERESIS\r
+0xE5 0x00E5 # LATIN SMALL LETTER A WITH RING ABOVE\r
+0xE6 0x00E6 # LATIN SMALL LETTER AE\r
+0xE7 0x012F # LATIN SMALL LETTER I WITH OGONEK\r
+0xE8 0x010D # LATIN SMALL LETTER C WITH CARON\r
+0xE9 0x00E9 # LATIN SMALL LETTER E WITH ACUTE\r
+0xEA 0x0119 # LATIN SMALL LETTER E WITH OGONEK\r
+0xEB 0x00EB # LATIN SMALL LETTER E WITH DIAERESIS\r
+0xEC 0x0117 # LATIN SMALL LETTER E WITH DOT ABOVE\r
+0xED 0x00ED # LATIN SMALL LETTER I WITH ACUTE\r
+0xEE 0x00EE # LATIN SMALL LETTER I WITH CIRCUMFLEX\r
+0xEF 0x012B # LATIN SMALL LETTER I WITH MACRON\r
+0xF0 0x0111 # LATIN SMALL LETTER D WITH STROKE\r
+0xF1 0x0146 # LATIN SMALL LETTER N WITH CEDILLA\r
+0xF2 0x014D # LATIN SMALL LETTER O WITH MACRON\r
+0xF3 0x0137 # LATIN SMALL LETTER K WITH CEDILLA\r
+0xF4 0x00F4 # LATIN SMALL LETTER O WITH CIRCUMFLEX\r
+0xF5 0x00F5 # LATIN SMALL LETTER O WITH TILDE\r
+0xF6 0x00F6 # LATIN SMALL LETTER O WITH DIAERESIS\r
+0xF7 0x00F7 # DIVISION SIGN\r
+0xF8 0x00F8 # LATIN SMALL LETTER O WITH STROKE\r
+0xF9 0x0173 # LATIN SMALL LETTER U WITH OGONEK\r
+0xFA 0x00FA # LATIN SMALL LETTER U WITH ACUTE\r
+0xFB 0x00FB # LATIN SMALL LETTER U WITH CIRCUMFLEX\r
+0xFC 0x00FC # LATIN SMALL LETTER U WITH DIAERESIS\r
+0xFD 0x0169 # LATIN SMALL LETTER U WITH TILDE\r
+0xFE 0x016B # LATIN SMALL LETTER U WITH MACRON\r
+0xFF 0x02D9 # DOT ABOVE\r
--- /dev/null
+#\r
+# Name: ISO 8859-5 (1988) to Unicode\r
+# Unicode version: 1.1\r
+# Table version: 0.1\r
+# Table format: Format A\r
+# Date: 16 January 1995\r
+# Authors: Tim Greenwood <greenwood@r2me2.enet.dec.com>\r
+# John H. Jenkins <John_Jenkins@taligent.com>\r
+#\r
+# Copyright (c) 1991-1995 Unicode, Inc. All Rights reserved.\r
+#\r
+# This file is provided as-is by Unicode, Inc. (The Unicode Consortium).\r
+# No claims are made as to fitness for any particular purpose. No\r
+# warranties of any kind are expressed or implied. The recipient\r
+# agrees to determine applicability of information provided. If this\r
+# file has been provided on magnetic media by Unicode, Inc., the sole\r
+# remedy for any claim will be exchange of defective media within 90\r
+# days of receipt.\r
+#\r
+# Recipient is granted the right to make copies in any form for\r
+# internal distribution and to freely use the information supplied\r
+# in the creation of products supporting Unicode. Unicode, Inc.\r
+# specifically excludes the right to re-distribute this file directly\r
+# to third parties or other organizations whether for profit or not.\r
+#\r
+# General notes:\r
+#\r
+# This table contains the data the Unicode Consortium has on how\r
+# ISO 8859-5 (1988) characters map into Unicode.\r
+#\r
+# Format: Three tab-separated columns\r
+# Column #1 is the ISO 8859-5 code (in hex as 0xXX)\r
+# Column #2 is the Unicode (in hex as 0xXXXX)\r
+# Column #3 the Unicode name (follows a comment sign, '#')\r
+#\r
+# The entries are in ISO 8859-5 order\r
+#\r
+# Any comments or problems, contact <John_Jenkins@taligent.com>\r
+#\r
+0x20 0x0020 # SPACE\r
+0x21 0x0021 # EXCLAMATION MARK\r
+0x22 0x0022 # QUOTATION MARK\r
+0x23 0x0023 # NUMBER SIGN\r
+0x24 0x0024 # DOLLAR SIGN\r
+0x25 0x0025 # PERCENT SIGN\r
+0x26 0x0026 # AMPERSAND\r
+0x27 0x0027 # APOSTROPHE\r
+0x28 0x0028 # LEFT PARENTHESIS\r
+0x29 0x0029 # RIGHT PARENTHESIS\r
+0x2A 0x002A # ASTERISK\r
+0x2B 0x002B # PLUS SIGN\r
+0x2C 0x002C # COMMA\r
+0x2D 0x002D # HYPHEN-MINUS\r
+0x2E 0x002E # FULL STOP\r
+0x2F 0x002F # SOLIDUS\r
+0x30 0x0030 # DIGIT ZERO\r
+0x31 0x0031 # DIGIT ONE\r
+0x32 0x0032 # DIGIT TWO\r
+0x33 0x0033 # DIGIT THREE\r
+0x34 0x0034 # DIGIT FOUR\r
+0x35 0x0035 # DIGIT FIVE\r
+0x36 0x0036 # DIGIT SIX\r
+0x37 0x0037 # DIGIT SEVEN\r
+0x38 0x0038 # DIGIT EIGHT\r
+0x39 0x0039 # DIGIT NINE\r
+0x3A 0x003A # COLON\r
+0x3B 0x003B # SEMICOLON\r
+0x3C 0x003C # LESS-THAN SIGN\r
+0x3D 0x003D # EQUALS SIGN\r
+0x3E 0x003E # GREATER-THAN SIGN\r
+0x3F 0x003F # QUESTION MARK\r
+0x40 0x0040 # COMMERCIAL AT\r
+0x41 0x0041 # LATIN CAPITAL LETTER A\r
+0x42 0x0042 # LATIN CAPITAL LETTER B\r
+0x43 0x0043 # LATIN CAPITAL LETTER C\r
+0x44 0x0044 # LATIN CAPITAL LETTER D\r
+0x45 0x0045 # LATIN CAPITAL LETTER E\r
+0x46 0x0046 # LATIN CAPITAL LETTER F\r
+0x47 0x0047 # LATIN CAPITAL LETTER G\r
+0x48 0x0048 # LATIN CAPITAL LETTER H\r
+0x49 0x0049 # LATIN CAPITAL LETTER I\r
+0x4A 0x004A # LATIN CAPITAL LETTER J\r
+0x4B 0x004B # LATIN CAPITAL LETTER K\r
+0x4C 0x004C # LATIN CAPITAL LETTER L\r
+0x4D 0x004D # LATIN CAPITAL LETTER M\r
+0x4E 0x004E # LATIN CAPITAL LETTER N\r
+0x4F 0x004F # LATIN CAPITAL LETTER O\r
+0x50 0x0050 # LATIN CAPITAL LETTER P\r
+0x51 0x0051 # LATIN CAPITAL LETTER Q\r
+0x52 0x0052 # LATIN CAPITAL LETTER R\r
+0x53 0x0053 # LATIN CAPITAL LETTER S\r
+0x54 0x0054 # LATIN CAPITAL LETTER T\r
+0x55 0x0055 # LATIN CAPITAL LETTER U\r
+0x56 0x0056 # LATIN CAPITAL LETTER V\r
+0x57 0x0057 # LATIN CAPITAL LETTER W\r
+0x58 0x0058 # LATIN CAPITAL LETTER X\r
+0x59 0x0059 # LATIN CAPITAL LETTER Y\r
+0x5A 0x005A # LATIN CAPITAL LETTER Z\r
+0x5B 0x005B # LEFT SQUARE BRACKET\r
+0x5C 0x005C # REVERSE SOLIDUS\r
+0x5D 0x005D # RIGHT SQUARE BRACKET\r
+0x5E 0x005E # CIRCUMFLEX ACCENT\r
+0x5F 0x005F # LOW LINE\r
+0x60 0x0060 # GRAVE ACCENT\r
+0x61 0x0061 # LATIN SMALL LETTER A\r
+0x62 0x0062 # LATIN SMALL LETTER B\r
+0x63 0x0063 # LATIN SMALL LETTER C\r
+0x64 0x0064 # LATIN SMALL LETTER D\r
+0x65 0x0065 # LATIN SMALL LETTER E\r
+0x66 0x0066 # LATIN SMALL LETTER F\r
+0x67 0x0067 # LATIN SMALL LETTER G\r
+0x68 0x0068 # LATIN SMALL LETTER H\r
+0x69 0x0069 # LATIN SMALL LETTER I\r
+0x6A 0x006A # LATIN SMALL LETTER J\r
+0x6B 0x006B # LATIN SMALL LETTER K\r
+0x6C 0x006C # LATIN SMALL LETTER L\r
+0x6D 0x006D # LATIN SMALL LETTER M\r
+0x6E 0x006E # LATIN SMALL LETTER N\r
+0x6F 0x006F # LATIN SMALL LETTER O\r
+0x70 0x0070 # LATIN SMALL LETTER P\r
+0x71 0x0071 # LATIN SMALL LETTER Q\r
+0x72 0x0072 # LATIN SMALL LETTER R\r
+0x73 0x0073 # LATIN SMALL LETTER S\r
+0x74 0x0074 # LATIN SMALL LETTER T\r
+0x75 0x0075 # LATIN SMALL LETTER U\r
+0x76 0x0076 # LATIN SMALL LETTER V\r
+0x77 0x0077 # LATIN SMALL LETTER W\r
+0x78 0x0078 # LATIN SMALL LETTER X\r
+0x79 0x0079 # LATIN SMALL LETTER Y\r
+0x7A 0x007A # LATIN SMALL LETTER Z\r
+0x7B 0x007B # LEFT CURLY BRACKET\r
+0x7C 0x007C # VERTICAL LINE\r
+0x7D 0x007D # RIGHT CURLY BRACKET\r
+0x7E 0x007E # TILDE\r
+0xA0 0x00A0 # NO-BREAK SPACE\r
+0xA1 0x0401 # CYRILLIC CAPITAL LETTER IO\r
+0xA2 0x0402 # CYRILLIC CAPITAL LETTER DJE\r
+0xA3 0x0403 # CYRILLIC CAPITAL LETTER GJE\r
+0xA4 0x0404 # CYRILLIC CAPITAL LETTER UKRAINIAN IE\r
+0xA5 0x0405 # CYRILLIC CAPITAL LETTER DZE\r
+0xA6 0x0406 # CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I\r
+0xA7 0x0407 # CYRILLIC CAPITAL LETTER YI\r
+0xA8 0x0408 # CYRILLIC CAPITAL LETTER JE\r
+0xA9 0x0409 # CYRILLIC CAPITAL LETTER LJE\r
+0xAA 0x040A # CYRILLIC CAPITAL LETTER NJE\r
+0xAB 0x040B # CYRILLIC CAPITAL LETTER TSHE\r
+0xAC 0x040C # CYRILLIC CAPITAL LETTER KJE\r
+0xAD 0x00AD # SOFT HYPHEN\r
+0xAE 0x040E # CYRILLIC CAPITAL LETTER SHORT U\r
+0xAF 0x040F # CYRILLIC CAPITAL LETTER DZHE\r
+0xB0 0x0410 # CYRILLIC CAPITAL LETTER A\r
+0xB1 0x0411 # CYRILLIC CAPITAL LETTER BE\r
+0xB2 0x0412 # CYRILLIC CAPITAL LETTER VE\r
+0xB3 0x0413 # CYRILLIC CAPITAL LETTER GHE\r
+0xB4 0x0414 # CYRILLIC CAPITAL LETTER DE\r
+0xB5 0x0415 # CYRILLIC CAPITAL LETTER IE\r
+0xB6 0x0416 # CYRILLIC CAPITAL LETTER ZHE\r
+0xB7 0x0417 # CYRILLIC CAPITAL LETTER ZE\r
+0xB8 0x0418 # CYRILLIC CAPITAL LETTER I\r
+0xB9 0x0419 # CYRILLIC CAPITAL LETTER SHORT I\r
+0xBA 0x041A # CYRILLIC CAPITAL LETTER KA\r
+0xBB 0x041B # CYRILLIC CAPITAL LETTER EL\r
+0xBC 0x041C # CYRILLIC CAPITAL LETTER EM\r
+0xBD 0x041D # CYRILLIC CAPITAL LETTER EN\r
+0xBE 0x041E # CYRILLIC CAPITAL LETTER O\r
+0xBF 0x041F # CYRILLIC CAPITAL LETTER PE\r
+0xC0 0x0420 # CYRILLIC CAPITAL LETTER ER\r
+0xC1 0x0421 # CYRILLIC CAPITAL LETTER ES\r
+0xC2 0x0422 # CYRILLIC CAPITAL LETTER TE\r
+0xC3 0x0423 # CYRILLIC CAPITAL LETTER U\r
+0xC4 0x0424 # CYRILLIC CAPITAL LETTER EF\r
+0xC5 0x0425 # CYRILLIC CAPITAL LETTER HA\r
+0xC6 0x0426 # CYRILLIC CAPITAL LETTER TSE\r
+0xC7 0x0427 # CYRILLIC CAPITAL LETTER CHE\r
+0xC8 0x0428 # CYRILLIC CAPITAL LETTER SHA\r
+0xC9 0x0429 # CYRILLIC CAPITAL LETTER SHCHA\r
+0xCA 0x042A # CYRILLIC CAPITAL LETTER HARD SIGN\r
+0xCB 0x042B # CYRILLIC CAPITAL LETTER YERU\r
+0xCC 0x042C # CYRILLIC CAPITAL LETTER SOFT SIGN\r
+0xCD 0x042D # CYRILLIC CAPITAL LETTER E\r
+0xCE 0x042E # CYRILLIC CAPITAL LETTER YU\r
+0xCF 0x042F # CYRILLIC CAPITAL LETTER YA\r
+0xD0 0x0430 # CYRILLIC SMALL LETTER A\r
+0xD1 0x0431 # CYRILLIC SMALL LETTER BE\r
+0xD2 0x0432 # CYRILLIC SMALL LETTER VE\r
+0xD3 0x0433 # CYRILLIC SMALL LETTER GHE\r
+0xD4 0x0434 # CYRILLIC SMALL LETTER DE\r
+0xD5 0x0435 # CYRILLIC SMALL LETTER IE\r
+0xD6 0x0436 # CYRILLIC SMALL LETTER ZHE\r
+0xD7 0x0437 # CYRILLIC SMALL LETTER ZE\r
+0xD8 0x0438 # CYRILLIC SMALL LETTER I\r
+0xD9 0x0439 # CYRILLIC SMALL LETTER SHORT I\r
+0xDA 0x043A # CYRILLIC SMALL LETTER KA\r
+0xDB 0x043B # CYRILLIC SMALL LETTER EL\r
+0xDC 0x043C # CYRILLIC SMALL LETTER EM\r
+0xDD 0x043D # CYRILLIC SMALL LETTER EN\r
+0xDE 0x043E # CYRILLIC SMALL LETTER O\r
+0xDF 0x043F # CYRILLIC SMALL LETTER PE\r
+0xE0 0x0440 # CYRILLIC SMALL LETTER ER\r
+0xE1 0x0441 # CYRILLIC SMALL LETTER ES\r
+0xE2 0x0442 # CYRILLIC SMALL LETTER TE\r
+0xE3 0x0443 # CYRILLIC SMALL LETTER U\r
+0xE4 0x0444 # CYRILLIC SMALL LETTER EF\r
+0xE5 0x0445 # CYRILLIC SMALL LETTER HA\r
+0xE6 0x0446 # CYRILLIC SMALL LETTER TSE\r
+0xE7 0x0447 # CYRILLIC SMALL LETTER CHE\r
+0xE8 0x0448 # CYRILLIC SMALL LETTER SHA\r
+0xE9 0x0449 # CYRILLIC SMALL LETTER SHCHA\r
+0xEA 0x044A # CYRILLIC SMALL LETTER HARD SIGN\r
+0xEB 0x044B # CYRILLIC SMALL LETTER YERU\r
+0xEC 0x044C # CYRILLIC SMALL LETTER SOFT SIGN\r
+0xED 0x044D # CYRILLIC SMALL LETTER E\r
+0xEE 0x044E # CYRILLIC SMALL LETTER YU\r
+0xEF 0x044F # CYRILLIC SMALL LETTER YA\r
+0xF0 0x2116 # NUMERO SIGN\r
+0xF1 0x0451 # CYRILLIC SMALL LETTER IO\r
+0xF2 0x0452 # CYRILLIC SMALL LETTER DJE\r
+0xF3 0x0453 # CYRILLIC SMALL LETTER GJE\r
+0xF4 0x0454 # CYRILLIC SMALL LETTER UKRAINIAN IE\r
+0xF5 0x0455 # CYRILLIC SMALL LETTER DZE\r
+0xF6 0x0456 # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I\r
+0xF7 0x0457 # CYRILLIC SMALL LETTER YI\r
+0xF8 0x0458 # CYRILLIC SMALL LETTER JE\r
+0xF9 0x0459 # CYRILLIC SMALL LETTER LJE\r
+0xFA 0x045A # CYRILLIC SMALL LETTER NJE\r
+0xFB 0x045B # CYRILLIC SMALL LETTER TSHE\r
+0xFC 0x045C # CYRILLIC SMALL LETTER KJE\r
+0xFD 0x00A7 # SECTION SIGN\r
+0xFE 0x045E # CYRILLIC SMALL LETTER SHORT U\r
+0xFF 0x045F # CYRILLIC SMALL LETTER DZHE\r
--- /dev/null
+#\r
+# Name: ISO 8859-6 (1987) to Unicode\r
+# Unicode version: 1.1\r
+# Table version: 0.1\r
+# Table format: Format A\r
+# Date: 16 January 1995\r
+# Authors: Tim Greenwood <greenwood@r2me2.enet.dec.com>\r
+# John H. Jenkins <John_Jenkins@taligent.com>\r
+#\r
+# Copyright (c) 1991-1995 Unicode, Inc. All Rights reserved.\r
+#\r
+# This file is provided as-is by Unicode, Inc. (The Unicode Consortium).\r
+# No claims are made as to fitness for any particular purpose. No\r
+# warranties of any kind are expressed or implied. The recipient\r
+# agrees to determine applicability of information provided. If this\r
+# file has been provided on magnetic media by Unicode, Inc., the sole\r
+# remedy for any claim will be exchange of defective media within 90\r
+# days of receipt.\r
+#\r
+# Recipient is granted the right to make copies in any form for\r
+# internal distribution and to freely use the information supplied\r
+# in the creation of products supporting Unicode. Unicode, Inc.\r
+# specifically excludes the right to re-distribute this file directly\r
+# to third parties or other organizations whether for profit or not.\r
+#\r
+# General notes:\r
+#\r
+# This table contains the data the Unicode Consortium has on how\r
+# ISO 8859-6 (1987) characters map into Unicode.\r
+#\r
+# Format: Three tab-separated columns\r
+# Column #1 is the ISO 8859-6 code (in hex as 0xXX)\r
+# Column #2 is the Unicode code point (in hex as 0xXXXX)\r
+# Column #3 is the Unicode name (follows a comment sign, '#')\r
+#\r
+# The entries are in ISO 8859-6 order\r
+#\r
+# Any comments or problems, contact <John_Jenkins@taligent.com>\r
+#\r
+0x20 0x0020 # SPACE\r
+0x21 0x0021 # EXCLAMATION MARK\r
+0x22 0x0022 # QUOTATION MARK\r
+0x23 0x0023 # NUMBER SIGN\r
+0x24 0x0024 # DOLLAR SIGN\r
+0x25 0x0025 # PERCENT SIGN\r
+0x26 0x0026 # AMPERSAND\r
+0x27 0x0027 # APOSTROPHE\r
+0x28 0x0028 # LEFT PARENTHESIS\r
+0x29 0x0029 # RIGHT PARENTHESIS\r
+0x2A 0x002A # ASTERISK\r
+0x2B 0x002B # PLUS SIGN\r
+0x2C 0x002C # COMMA\r
+0x2D 0x002D # HYPHEN-MINUS\r
+0x2E 0x002E # FULL STOP\r
+0x2F 0x002F # SOLIDUS\r
+0x30 0x0660 # ARABIC-INDIC DIGIT ZERO\r
+0x31 0x0661 # ARABIC-INDIC DIGIT ONE\r
+0x32 0x0662 # ARABIC-INDIC DIGIT TWO\r
+0x33 0x0663 # ARABIC-INDIC DIGIT THREE\r
+0x34 0x0664 # ARABIC-INDIC DIGIT FOUR\r
+0x35 0x0665 # ARABIC-INDIC DIGIT FIVE\r
+0x36 0x0666 # ARABIC-INDIC DIGIT SIX\r
+0x37 0x0667 # ARABIC-INDIC DIGIT SEVEN\r
+0x38 0x0668 # ARABIC-INDIC DIGIT EIGHT\r
+0x39 0x0669 # ARABIC-INDIC DIGIT NINE\r
+0x3A 0x003A # COLON\r
+0x3B 0x003B # SEMICOLON\r
+0x3C 0x003C # LESS-THAN SIGN\r
+0x3D 0x003D # EQUALS SIGN\r
+0x3E 0x003E # GREATER-THAN SIGN\r
+0x3F 0x003F # QUESTION MARK\r
+0x40 0x0040 # COMMERCIAL AT\r
+0x41 0x0041 # LATIN CAPITAL LETTER A\r
+0x42 0x0042 # LATIN CAPITAL LETTER B\r
+0x43 0x0043 # LATIN CAPITAL LETTER C\r
+0x44 0x0044 # LATIN CAPITAL LETTER D\r
+0x45 0x0045 # LATIN CAPITAL LETTER E\r
+0x46 0x0046 # LATIN CAPITAL LETTER F\r
+0x47 0x0047 # LATIN CAPITAL LETTER G\r
+0x48 0x0048 # LATIN CAPITAL LETTER H\r
+0x49 0x0049 # LATIN CAPITAL LETTER I\r
+0x4A 0x004A # LATIN CAPITAL LETTER J\r
+0x4B 0x004B # LATIN CAPITAL LETTER K\r
+0x4C 0x004C # LATIN CAPITAL LETTER L\r
+0x4D 0x004D # LATIN CAPITAL LETTER M\r
+0x4E 0x004E # LATIN CAPITAL LETTER N\r
+0x4F 0x004F # LATIN CAPITAL LETTER O\r
+0x50 0x0050 # LATIN CAPITAL LETTER P\r
+0x51 0x0051 # LATIN CAPITAL LETTER Q\r
+0x52 0x0052 # LATIN CAPITAL LETTER R\r
+0x53 0x0053 # LATIN CAPITAL LETTER S\r
+0x54 0x0054 # LATIN CAPITAL LETTER T\r
+0x55 0x0055 # LATIN CAPITAL LETTER U\r
+0x56 0x0056 # LATIN CAPITAL LETTER V\r
+0x57 0x0057 # LATIN CAPITAL LETTER W\r
+0x58 0x0058 # LATIN CAPITAL LETTER X\r
+0x59 0x0059 # LATIN CAPITAL LETTER Y\r
+0x5A 0x005A # LATIN CAPITAL LETTER Z\r
+0x5B 0x005B # LEFT SQUARE BRACKET\r
+0x5C 0x005C # REVERSE SOLIDUS\r
+0x5D 0x005D # RIGHT SQUARE BRACKET\r
+0x5E 0x005E # CIRCUMFLEX ACCENT\r
+0x5F 0x005F # LOW LINE\r
+0x60 0x0060 # GRAVE ACCENT\r
+0x61 0x0061 # LATIN SMALL LETTER A\r
+0x62 0x0062 # LATIN SMALL LETTER B\r
+0x63 0x0063 # LATIN SMALL LETTER C\r
+0x64 0x0064 # LATIN SMALL LETTER D\r
+0x65 0x0065 # LATIN SMALL LETTER E\r
+0x66 0x0066 # LATIN SMALL LETTER F\r
+0x67 0x0067 # LATIN SMALL LETTER G\r
+0x68 0x0068 # LATIN SMALL LETTER H\r
+0x69 0x0069 # LATIN SMALL LETTER I\r
+0x6A 0x006A # LATIN SMALL LETTER J\r
+0x6B 0x006B # LATIN SMALL LETTER K\r
+0x6C 0x006C # LATIN SMALL LETTER L\r
+0x6D 0x006D # LATIN SMALL LETTER M\r
+0x6E 0x006E # LATIN SMALL LETTER N\r
+0x6F 0x006F # LATIN SMALL LETTER O\r
+0x70 0x0070 # LATIN SMALL LETTER P\r
+0x71 0x0071 # LATIN SMALL LETTER Q\r
+0x72 0x0072 # LATIN SMALL LETTER R\r
+0x73 0x0073 # LATIN SMALL LETTER S\r
+0x74 0x0074 # LATIN SMALL LETTER T\r
+0x75 0x0075 # LATIN SMALL LETTER U\r
+0x76 0x0076 # LATIN SMALL LETTER V\r
+0x77 0x0077 # LATIN SMALL LETTER W\r
+0x78 0x0078 # LATIN SMALL LETTER X\r
+0x79 0x0079 # LATIN SMALL LETTER Y\r
+0x7A 0x007A # LATIN SMALL LETTER Z\r
+0x7B 0x007B # LEFT CURLY BRACKET\r
+0x7C 0x007C # VERTICAL LINE\r
+0x7D 0x007D # RIGHT CURLY BRACKET\r
+0x7E 0x007E # TILDE\r
+0xA0 0x00A0 # NO-BREAK SPACE\r
+0xA4 0x00A4 # CURRENCY SIGN\r
+0xAC 0x060C # ARABIC COMMA\r
+0xAD 0x00AD # SOFT HYPHEN\r
+0xBB 0x061B # ARABIC SEMICOLON\r
+0xBF 0x061F # ARABIC QUESTION MARK\r
+0xC1 0x0621 # ARABIC LETTER HAMZA\r
+0xC2 0x0622 # ARABIC LETTER ALEF WITH MADDA ABOVE\r
+0xC3 0x0623 # ARABIC LETTER ALEF WITH HAMZA ABOVE\r
+0xC4 0x0624 # ARABIC LETTER WAW WITH HAMZA ABOVE\r
+0xC5 0x0625 # ARABIC LETTER ALEF WITH HAMZA BELOW\r
+0xC6 0x0626 # ARABIC LETTER YEH WITH HAMZA ABOVE\r
+0xC7 0x0627 # ARABIC LETTER ALEF\r
+0xC8 0x0628 # ARABIC LETTER BEH\r
+0xC9 0x0629 # ARABIC LETTER TEH MARBUTA\r
+0xCA 0x062A # ARABIC LETTER TEH\r
+0xCB 0x062B # ARABIC LETTER THEH\r
+0xCC 0x062C # ARABIC LETTER JEEM\r
+0xCD 0x062D # ARABIC LETTER HAH\r
+0xCE 0x062E # ARABIC LETTER KHAH\r
+0xCF 0x062F # ARABIC LETTER DAL\r
+0xD0 0x0630 # ARABIC LETTER THAL\r
+0xD1 0x0631 # ARABIC LETTER REH\r
+0xD2 0x0632 # ARABIC LETTER ZAIN\r
+0xD3 0x0633 # ARABIC LETTER SEEN\r
+0xD4 0x0634 # ARABIC LETTER SHEEN\r
+0xD5 0x0635 # ARABIC LETTER SAD\r
+0xD6 0x0636 # ARABIC LETTER DAD\r
+0xD7 0x0637 # ARABIC LETTER TAH\r
+0xD8 0x0638 # ARABIC LETTER ZAH\r
+0xD9 0x0639 # ARABIC LETTER AIN\r
+0xDA 0x063A # ARABIC LETTER GHAIN\r
+0xE0 0x0640 # ARABIC TATWEEL\r
+0xE1 0x0641 # ARABIC LETTER FEH\r
+0xE2 0x0642 # ARABIC LETTER QAF\r
+0xE3 0x0643 # ARABIC LETTER KAF\r
+0xE4 0x0644 # ARABIC LETTER LAM\r
+0xE5 0x0645 # ARABIC LETTER MEEM\r
+0xE6 0x0646 # ARABIC LETTER NOON\r
+0xE7 0x0647 # ARABIC LETTER HEH\r
+0xE8 0x0648 # ARABIC LETTER WAW\r
+0xE9 0x0649 # ARABIC LETTER ALEF MAKSURA\r
+0xEA 0x064A # ARABIC LETTER YEH\r
+0xEB 0x064B # ARABIC FATHATAN\r
+0xEC 0x064C # ARABIC DAMMATAN\r
+0xED 0x064D # ARABIC KASRATAN\r
+0xEE 0x064E # ARABIC FATHA\r
+0xEF 0x064F # ARABIC DAMMA\r
+0xF0 0x0650 # ARABIC KASRA\r
+0xF1 0x0651 # ARABIC SHADDA\r
+0xF2 0x0652 # ARABIC SUKUN\r
--- /dev/null
+#\r
+# Name: ISO 8859-7 (1987) to Unicode\r
+# Unicode version: 1.1\r
+# Table version: 0.1\r
+# Table format: Format A\r
+# Date: 16 January 1995\r
+# Authors: Tim Greenwood <greenwood@r2me2.enet.dec.com>\r
+# John H. Jenkins <John_Jenkins@taligent.com>\r
+#\r
+# Copyright (c) 1991-1995 Unicode, Inc. All Rights reserved.\r
+#\r
+# This file is provided as-is by Unicode, Inc. (The Unicode Consortium).\r
+# No claims are made as to fitness for any particular purpose. No\r
+# warranties of any kind are expressed or implied. The recipient\r
+# agrees to determine applicability of information provided. If this\r
+# file has been provided on magnetic media by Unicode, Inc., the sole\r
+# remedy for any claim will be exchange of defective media within 90\r
+# days of receipt.\r
+#\r
+# Recipient is granted the right to make copies in any form for\r
+# internal distribution and to freely use the information supplied\r
+# in the creation of products supporting Unicode. Unicode, Inc.\r
+# specifically excludes the right to re-distribute this file directly\r
+# to third parties or other organizations whether for profit or not.\r
+#\r
+# General notes:\r
+#\r
+# This table contains the data the Unicode Consortium has on how\r
+# ISO 8859-7 (1987) characters map into Unicode.\r
+#\r
+# Format: Three tab-separated columns\r
+# Column #1 is the ISO 8859-7 code (in hex as 0xXX)\r
+# Column #2 is the Unicode code point (in hex as 0xXXXX)\r
+# Column #3 is the Unicode name (follows a comment sign, '#')\r
+#\r
+# The entries are in ISO 8859-7 order\r
+#\r
+# Any comments or problems, contact <John_Jenkins@taligent.com>\r
+#\r
+0x20 0x0020 # SPACE\r
+0x21 0x0021 # EXCLAMATION MARK\r
+0x22 0x0022 # QUOTATION MARK\r
+0x23 0x0023 # NUMBER SIGN\r
+0x24 0x0024 # DOLLAR SIGN\r
+0x25 0x0025 # PERCENT SIGN\r
+0x26 0x0026 # AMPERSAND\r
+0x27 0x0027 # APOSTROPHE\r
+0x28 0x0028 # LEFT PARENTHESIS\r
+0x29 0x0029 # RIGHT PARENTHESIS\r
+0x2A 0x002A # ASTERISK\r
+0x2B 0x002B # PLUS SIGN\r
+0x2C 0x002C # COMMA\r
+0x2D 0x002D # HYPHEN-MINUS\r
+0x2E 0x002E # FULL STOP\r
+0x2F 0x002F # SOLIDUS\r
+0x30 0x0030 # DIGIT ZERO\r
+0x31 0x0031 # DIGIT ONE\r
+0x32 0x0032 # DIGIT TWO\r
+0x33 0x0033 # DIGIT THREE\r
+0x34 0x0034 # DIGIT FOUR\r
+0x35 0x0035 # DIGIT FIVE\r
+0x36 0x0036 # DIGIT SIX\r
+0x37 0x0037 # DIGIT SEVEN\r
+0x38 0x0038 # DIGIT EIGHT\r
+0x39 0x0039 # DIGIT NINE\r
+0x3A 0x003A # COLON\r
+0x3B 0x003B # SEMICOLON\r
+0x3C 0x003C # LESS-THAN SIGN\r
+0x3D 0x003D # EQUALS SIGN\r
+0x3E 0x003E # GREATER-THAN SIGN\r
+0x3F 0x003F # QUESTION MARK\r
+0x40 0x0040 # COMMERCIAL AT\r
+0x41 0x0041 # LATIN CAPITAL LETTER A\r
+0x42 0x0042 # LATIN CAPITAL LETTER B\r
+0x43 0x0043 # LATIN CAPITAL LETTER C\r
+0x44 0x0044 # LATIN CAPITAL LETTER D\r
+0x45 0x0045 # LATIN CAPITAL LETTER E\r
+0x46 0x0046 # LATIN CAPITAL LETTER F\r
+0x47 0x0047 # LATIN CAPITAL LETTER G\r
+0x48 0x0048 # LATIN CAPITAL LETTER H\r
+0x49 0x0049 # LATIN CAPITAL LETTER I\r
+0x4A 0x004A # LATIN CAPITAL LETTER J\r
+0x4B 0x004B # LATIN CAPITAL LETTER K\r
+0x4C 0x004C # LATIN CAPITAL LETTER L\r
+0x4D 0x004D # LATIN CAPITAL LETTER M\r
+0x4E 0x004E # LATIN CAPITAL LETTER N\r
+0x4F 0x004F # LATIN CAPITAL LETTER O\r
+0x50 0x0050 # LATIN CAPITAL LETTER P\r
+0x51 0x0051 # LATIN CAPITAL LETTER Q\r
+0x52 0x0052 # LATIN CAPITAL LETTER R\r
+0x53 0x0053 # LATIN CAPITAL LETTER S\r
+0x54 0x0054 # LATIN CAPITAL LETTER T\r
+0x55 0x0055 # LATIN CAPITAL LETTER U\r
+0x56 0x0056 # LATIN CAPITAL LETTER V\r
+0x57 0x0057 # LATIN CAPITAL LETTER W\r
+0x58 0x0058 # LATIN CAPITAL LETTER X\r
+0x59 0x0059 # LATIN CAPITAL LETTER Y\r
+0x5A 0x005A # LATIN CAPITAL LETTER Z\r
+0x5B 0x005B # LEFT SQUARE BRACKET\r
+0x5C 0x005C # REVERSE SOLIDUS\r
+0x5D 0x005D # RIGHT SQUARE BRACKET\r
+0x5E 0x005E # CIRCUMFLEX ACCENT\r
+0x5F 0x005F # LOW LINE\r
+0x60 0x0060 # GRAVE ACCENT\r
+0x61 0x0061 # LATIN SMALL LETTER A\r
+0x62 0x0062 # LATIN SMALL LETTER B\r
+0x63 0x0063 # LATIN SMALL LETTER C\r
+0x64 0x0064 # LATIN SMALL LETTER D\r
+0x65 0x0065 # LATIN SMALL LETTER E\r
+0x66 0x0066 # LATIN SMALL LETTER F\r
+0x67 0x0067 # LATIN SMALL LETTER G\r
+0x68 0x0068 # LATIN SMALL LETTER H\r
+0x69 0x0069 # LATIN SMALL LETTER I\r
+0x6A 0x006A # LATIN SMALL LETTER J\r
+0x6B 0x006B # LATIN SMALL LETTER K\r
+0x6C 0x006C # LATIN SMALL LETTER L\r
+0x6D 0x006D # LATIN SMALL LETTER M\r
+0x6E 0x006E # LATIN SMALL LETTER N\r
+0x6F 0x006F # LATIN SMALL LETTER O\r
+0x70 0x0070 # LATIN SMALL LETTER P\r
+0x71 0x0071 # LATIN SMALL LETTER Q\r
+0x72 0x0072 # LATIN SMALL LETTER R\r
+0x73 0x0073 # LATIN SMALL LETTER S\r
+0x74 0x0074 # LATIN SMALL LETTER T\r
+0x75 0x0075 # LATIN SMALL LETTER U\r
+0x76 0x0076 # LATIN SMALL LETTER V\r
+0x77 0x0077 # LATIN SMALL LETTER W\r
+0x78 0x0078 # LATIN SMALL LETTER X\r
+0x79 0x0079 # LATIN SMALL LETTER Y\r
+0x7A 0x007A # LATIN SMALL LETTER Z\r
+0x7B 0x007B # LEFT CURLY BRACKET\r
+0x7C 0x007C # VERTICAL LINE\r
+0x7D 0x007D # RIGHT CURLY BRACKET\r
+0x7E 0x007E # TILDE\r
+0xA0 0x00A0 # NO-BREAK SPACE\r
+0xA1 0x02BD # MODIFIER LETTER REVERSED COMMA\r
+0xA2 0x02BC # MODIFIER LETTER APOSTROPHE\r
+0xA3 0x00A3 # POUND SIGN\r
+0xA6 0x00A6 # BROKEN BAR\r
+0xA7 0x00A7 # SECTION SIGN\r
+0xA8 0x00A8 # DIAERESIS\r
+0xA9 0x00A9 # COPYRIGHT SIGN\r
+0xAB 0x00AB # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xAC 0x00AC # NOT SIGN\r
+0xAD 0x00AD # SOFT HYPHEN\r
+0xAF 0x2015 # HORIZONTAL BAR\r
+0xB0 0x00B0 # DEGREE SIGN\r
+0xB1 0x00B1 # PLUS-MINUS SIGN\r
+0xB2 0x00B2 # SUPERSCRIPT TWO\r
+0xB3 0x00B3 # SUPERSCRIPT THREE\r
+0xB4 0x0384 # GREEK TONOS\r
+0xB5 0x0385 # GREEK DIALYTIKA TONOS\r
+0xB6 0x0386 # GREEK CAPITAL LETTER ALPHA WITH TONOS\r
+0xB7 0x00B7 # MIDDLE DOT\r
+0xB8 0x0388 # GREEK CAPITAL LETTER EPSILON WITH TONOS\r
+0xB9 0x0389 # GREEK CAPITAL LETTER ETA WITH TONOS\r
+0xBA 0x038A # GREEK CAPITAL LETTER IOTA WITH TONOS\r
+0xBB 0x00BB # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xBC 0x038C # GREEK CAPITAL LETTER OMICRON WITH TONOS\r
+0xBD 0x00BD # VULGAR FRACTION ONE HALF\r
+0xBE 0x038E # GREEK CAPITAL LETTER UPSILON WITH TONOS\r
+0xBF 0x038F # GREEK CAPITAL LETTER OMEGA WITH TONOS\r
+0xC0 0x0390 # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS\r
+0xC1 0x0391 # GREEK CAPITAL LETTER ALPHA\r
+0xC2 0x0392 # GREEK CAPITAL LETTER BETA\r
+0xC3 0x0393 # GREEK CAPITAL LETTER GAMMA\r
+0xC4 0x0394 # GREEK CAPITAL LETTER DELTA\r
+0xC5 0x0395 # GREEK CAPITAL LETTER EPSILON\r
+0xC6 0x0396 # GREEK CAPITAL LETTER ZETA\r
+0xC7 0x0397 # GREEK CAPITAL LETTER ETA\r
+0xC8 0x0398 # GREEK CAPITAL LETTER THETA\r
+0xC9 0x0399 # GREEK CAPITAL LETTER IOTA\r
+0xCA 0x039A # GREEK CAPITAL LETTER KAPPA\r
+0xCB 0x039B # GREEK CAPITAL LETTER LAMDA\r
+0xCC 0x039C # GREEK CAPITAL LETTER MU\r
+0xCD 0x039D # GREEK CAPITAL LETTER NU\r
+0xCE 0x039E # GREEK CAPITAL LETTER XI\r
+0xCF 0x039F # GREEK CAPITAL LETTER OMICRON\r
+0xD0 0x03A0 # GREEK CAPITAL LETTER PI\r
+0xD1 0x03A1 # GREEK CAPITAL LETTER RHO\r
+0xD3 0x03A3 # GREEK CAPITAL LETTER SIGMA\r
+0xD4 0x03A4 # GREEK CAPITAL LETTER TAU\r
+0xD5 0x03A5 # GREEK CAPITAL LETTER UPSILON\r
+0xD6 0x03A6 # GREEK CAPITAL LETTER PHI\r
+0xD7 0x03A7 # GREEK CAPITAL LETTER CHI\r
+0xD8 0x03A8 # GREEK CAPITAL LETTER PSI\r
+0xD9 0x03A9 # GREEK CAPITAL LETTER OMEGA\r
+0xDA 0x03AA # GREEK CAPITAL LETTER IOTA WITH DIALYTIKA\r
+0xDB 0x03AB # GREEK CAPITAL LETTER UPSILON WITH DIALYTIKA\r
+0xDC 0x03AC # GREEK SMALL LETTER ALPHA WITH TONOS\r
+0xDD 0x03AD # GREEK SMALL LETTER EPSILON WITH TONOS\r
+0xDE 0x03AE # GREEK SMALL LETTER ETA WITH TONOS\r
+0xDF 0x03AF # GREEK SMALL LETTER IOTA WITH TONOS\r
+0xE0 0x03B0 # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS\r
+0xE1 0x03B1 # GREEK SMALL LETTER ALPHA\r
+0xE2 0x03B2 # GREEK SMALL LETTER BETA\r
+0xE3 0x03B3 # GREEK SMALL LETTER GAMMA\r
+0xE4 0x03B4 # GREEK SMALL LETTER DELTA\r
+0xE5 0x03B5 # GREEK SMALL LETTER EPSILON\r
+0xE6 0x03B6 # GREEK SMALL LETTER ZETA\r
+0xE7 0x03B7 # GREEK SMALL LETTER ETA\r
+0xE8 0x03B8 # GREEK SMALL LETTER THETA\r
+0xE9 0x03B9 # GREEK SMALL LETTER IOTA\r
+0xEA 0x03BA # GREEK SMALL LETTER KAPPA\r
+0xEB 0x03BB # GREEK SMALL LETTER LAMDA\r
+0xEC 0x03BC # GREEK SMALL LETTER MU\r
+0xED 0x03BD # GREEK SMALL LETTER NU\r
+0xEE 0x03BE # GREEK SMALL LETTER XI\r
+0xEF 0x03BF # GREEK SMALL LETTER OMICRON\r
+0xF0 0x03C0 # GREEK SMALL LETTER PI\r
+0xF1 0x03C1 # GREEK SMALL LETTER RHO\r
+0xF2 0x03C2 # GREEK SMALL LETTER FINAL SIGMA\r
+0xF3 0x03C3 # GREEK SMALL LETTER SIGMA\r
+0xF4 0x03C4 # GREEK SMALL LETTER TAU\r
+0xF5 0x03C5 # GREEK SMALL LETTER UPSILON\r
+0xF6 0x03C6 # GREEK SMALL LETTER PHI\r
+0xF7 0x03C7 # GREEK SMALL LETTER CHI\r
+0xF8 0x03C8 # GREEK SMALL LETTER PSI\r
+0xF9 0x03C9 # GREEK SMALL LETTER OMEGA\r
+0xFA 0x03CA # GREEK SMALL LETTER IOTA WITH DIALYTIKA\r
+0xFB 0x03CB # GREEK SMALL LETTER UPSILON WITH DIALYTIKA\r
+0xFC 0x03CC # GREEK SMALL LETTER OMICRON WITH TONOS\r
+0xFD 0x03CD # GREEK SMALL LETTER UPSILON WITH TONOS\r
+0xFE 0x03CE # GREEK SMALL LETTER OMEGA WITH TONOS\r
--- /dev/null
+#
+# Name: ISO 8859-8 (1988) to Unicode
+# Unicode version: 1.1
+# Table version: 0.1
+# Table format: Format A
+# Date: 16 January 1995
+# Authors: Tim Greenwood <greenwood@r2me2.enet.dec.com>
+# John H. Jenkins <John_Jenkins@taligent.com>
+#
+# Copyright (c) 1991-1995 Unicode, Inc. All Rights reserved.
+#
+# This file is provided as-is by Unicode, Inc. (The Unicode Consortium).
+# No claims are made as to fitness for any particular purpose. No
+# warranties of any kind are expressed or implied. The recipient
+# agrees to determine applicability of information provided. If this
+# file has been provided on magnetic media by Unicode, Inc., the sole
+# remedy for any claim will be exchange of defective media within 90
+# days of receipt.
+#
+# Recipient is granted the right to make copies in any form for
+# internal distribution and to freely use the information supplied
+# in the creation of products supporting Unicode. Unicode, Inc.
+# specifically excludes the right to re-distribute this file directly
+# to third parties or other organizations whether for profit or not.
+#
+# General notes:
+#
+# This table contains the data the Unicode Consortium has on how
+# ISO 8859-8 (1988) characters map into Unicode.
+#
+# Format: Three tab-separated columns
+# Column #1 is the ISO 8859-8 code (in hex as 0xXX)
+# Column #2 is the Unicode code point (in hex as 0xXXXX)
+# Column #3 is the Unicode name (follows a comment sign, '#')
+#
+# The entries are in ISO 8859-8 order
+#
+# Any comments or problems, contact <John_Jenkins@taligent.com>
+#
+0x20 0x0020 # SPACE
+0x21 0x0021 # EXCLAMATION MARK
+0x22 0x0022 # QUOTATION MARK
+0x23 0x0023 # NUMBER SIGN
+0x24 0x0024 # DOLLAR SIGN
+0x25 0x0025 # PERCENT SIGN
+0x26 0x0026 # AMPERSAND
+0x27 0x0027 # APOSTROPHE
+0x28 0x0028 # LEFT PARENTHESIS
+0x29 0x0029 # RIGHT PARENTHESIS
+0x2A 0x002A # ASTERISK
+0x2B 0x002B # PLUS SIGN
+0x2C 0x002C # COMMA
+0x2D 0x002D # HYPHEN-MINUS
+0x2E 0x002E # FULL STOP
+0x2F 0x002F # SOLIDUS
+0x30 0x0030 # DIGIT ZERO
+0x31 0x0031 # DIGIT ONE
+0x32 0x0032 # DIGIT TWO
+0x33 0x0033 # DIGIT THREE
+0x34 0x0034 # DIGIT FOUR
+0x35 0x0035 # DIGIT FIVE
+0x36 0x0036 # DIGIT SIX
+0x37 0x0037 # DIGIT SEVEN
+0x38 0x0038 # DIGIT EIGHT
+0x39 0x0039 # DIGIT NINE
+0x3A 0x003A # COLON
+0x3B 0x003B # SEMICOLON
+0x3C 0x003C # LESS-THAN SIGN
+0x3D 0x003D # EQUALS SIGN
+0x3E 0x003E # GREATER-THAN SIGN
+0x3F 0x003F # QUESTION MARK
+0x40 0x0040 # COMMERCIAL AT
+0x41 0x0041 # LATIN CAPITAL LETTER A
+0x42 0x0042 # LATIN CAPITAL LETTER B
+0x43 0x0043 # LATIN CAPITAL LETTER C
+0x44 0x0044 # LATIN CAPITAL LETTER D
+0x45 0x0045 # LATIN CAPITAL LETTER E
+0x46 0x0046 # LATIN CAPITAL LETTER F
+0x47 0x0047 # LATIN CAPITAL LETTER G
+0x48 0x0048 # LATIN CAPITAL LETTER H
+0x49 0x0049 # LATIN CAPITAL LETTER I
+0x4A 0x004A # LATIN CAPITAL LETTER J
+0x4B 0x004B # LATIN CAPITAL LETTER K
+0x4C 0x004C # LATIN CAPITAL LETTER L
+0x4D 0x004D # LATIN CAPITAL LETTER M
+0x4E 0x004E # LATIN CAPITAL LETTER N
+0x4F 0x004F # LATIN CAPITAL LETTER O
+0x50 0x0050 # LATIN CAPITAL LETTER P
+0x51 0x0051 # LATIN CAPITAL LETTER Q
+0x52 0x0052 # LATIN CAPITAL LETTER R
+0x53 0x0053 # LATIN CAPITAL LETTER S
+0x54 0x0054 # LATIN CAPITAL LETTER T
+0x55 0x0055 # LATIN CAPITAL LETTER U
+0x56 0x0056 # LATIN CAPITAL LETTER V
+0x57 0x0057 # LATIN CAPITAL LETTER W
+0x58 0x0058 # LATIN CAPITAL LETTER X
+0x59 0x0059 # LATIN CAPITAL LETTER Y
+0x5A 0x005A # LATIN CAPITAL LETTER Z
+0x5B 0x005B # LEFT SQUARE BRACKET
+0x5C 0x005C # REVERSE SOLIDUS
+0x5D 0x005D # RIGHT SQUARE BRACKET
+0x5E 0x005E # CIRCUMFLEX ACCENT
+0x5F 0x005F # LOW LINE
+0x60 0x0060 # GRAVE ACCENT
+0x61 0x0061 # LATIN SMALL LETTER A
+0x62 0x0062 # LATIN SMALL LETTER B
+0x63 0x0063 # LATIN SMALL LETTER C
+0x64 0x0064 # LATIN SMALL LETTER D
+0x65 0x0065 # LATIN SMALL LETTER E
+0x66 0x0066 # LATIN SMALL LETTER F
+0x67 0x0067 # LATIN SMALL LETTER G
+0x68 0x0068 # LATIN SMALL LETTER H
+0x69 0x0069 # LATIN SMALL LETTER I
+0x6A 0x006A # LATIN SMALL LETTER J
+0x6B 0x006B # LATIN SMALL LETTER K
+0x6C 0x006C # LATIN SMALL LETTER L
+0x6D 0x006D # LATIN SMALL LETTER M
+0x6E 0x006E # LATIN SMALL LETTER N
+0x6F 0x006F # LATIN SMALL LETTER O
+0x70 0x0070 # LATIN SMALL LETTER P
+0x71 0x0071 # LATIN SMALL LETTER Q
+0x72 0x0072 # LATIN SMALL LETTER R
+0x73 0x0073 # LATIN SMALL LETTER S
+0x74 0x0074 # LATIN SMALL LETTER T
+0x75 0x0075 # LATIN SMALL LETTER U
+0x76 0x0076 # LATIN SMALL LETTER V
+0x77 0x0077 # LATIN SMALL LETTER W
+0x78 0x0078 # LATIN SMALL LETTER X
+0x79 0x0079 # LATIN SMALL LETTER Y
+0x7A 0x007A # LATIN SMALL LETTER Z
+0x7B 0x007B # LEFT CURLY BRACKET
+0x7C 0x007C # VERTICAL LINE
+0x7D 0x007D # RIGHT CURLY BRACKET
+0x7E 0x007E # TILDE
+0xA0 0x00A0 # NO-BREAK SPACE
+0xA2 0x00A2 # CENT SIGN
+0xA3 0x00A3 # POUND SIGN
+0xA4 0x00A4 # CURRENCY SIGN
+0xA5 0x00A5 # YEN SIGN
+0xA6 0x00A6 # BROKEN BAR
+0xA7 0x00A7 # SECTION SIGN
+0xA8 0x00A8 # DIAERESIS
+0xA9 0x00A9 # COPYRIGHT SIGN
+0xAA 0x00D7 # MULTIPLICATION SIGN
+0xAB 0x00AB # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xAC 0x00AC # NOT SIGN
+0xAD 0x00AD # SOFT HYPHEN
+0xAE 0x00AE # REGISTERED SIGN
+0xAF 0x203E # OVERLINE
+0xB0 0x00B0 # DEGREE SIGN
+0xB1 0x00B1 # PLUS-MINUS SIGN
+0xB2 0x00B2 # SUPERSCRIPT TWO
+0xB3 0x00B3 # SUPERSCRIPT THREE
+0xB4 0x00B4 # ACUTE ACCENT
+0xB5 0x00B5 # MICRO SIGN
+0xB6 0x00B6 # PILCROW SIGN
+0xB7 0x00B7 # MIDDLE DOT
+0xB8 0x00B8 # CEDILLA
+0xB9 0x00B9 # SUPERSCRIPT ONE
+0xBA 0x00F7 # DIVISION SIGN
+0xBB 0x00BB # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xBC 0x00BC # VULGAR FRACTION ONE QUARTER
+0xBD 0x00BD # VULGAR FRACTION ONE HALF
+0xBE 0x00BE # VULGAR FRACTION THREE QUARTERS
+0xDF 0x2017 # DOUBLE LOW LINE
+0xE0 0x05D0 # HEBREW LETTER ALEF
+0xE1 0x05D1 # HEBREW LETTER BET
+0xE2 0x05D2 # HEBREW LETTER GIMEL
+0xE3 0x05D3 # HEBREW LETTER DALET
+0xE4 0x05D4 # HEBREW LETTER HE
+0xE5 0x05D5 # HEBREW LETTER VAV
+0xE6 0x05D6 # HEBREW LETTER ZAYIN
+0xE7 0x05D7 # HEBREW LETTER HET
+0xE8 0x05D8 # HEBREW LETTER TET
+0xE9 0x05D9 # HEBREW LETTER YOD
+0xEA 0x05DA # HEBREW LETTER FINAL KAF
+0xEB 0x05DB # HEBREW LETTER KAF
+0xEC 0x05DC # HEBREW LETTER LAMED
+0xED 0x05DD # HEBREW LETTER FINAL MEM
+0xEE 0x05DE # HEBREW LETTER MEM
+0xEF 0x05DF # HEBREW LETTER FINAL NUN
+0xF0 0x05E0 # HEBREW LETTER NUN
+0xF1 0x05E1 # HEBREW LETTER SAMEKH
+0xF2 0x05E2 # HEBREW LETTER AYIN
+0xF3 0x05E3 # HEBREW LETTER FINAL PE
+0xF4 0x05E4 # HEBREW LETTER PE
+0xF5 0x05E5 # HEBREW LETTER FINAL TSADI
+0xF6 0x05E6 # HEBREW LETTER TSADI
+0xF7 0x05E7 # HEBREW LETTER QOF
+0xF8 0x05E8 # HEBREW LETTER RESH
+0xF9 0x05E9 # HEBREW LETTER SHIN
+0xFA 0x05EA # HEBREW LETTER TAV
--- /dev/null
+#\r
+# Name: ISO 8859-9 (1989) to Unicode\r
+# Unicode version: 1.1\r
+# Table version: 0.1\r
+# Table format: Format A\r
+# Date: 16 January 1995\r
+# Authors: Tim Greenwood <greenwood@r2me2.enet.dec.com>\r
+# John H. Jenkins <John_Jenkins@taligent.com>\r
+#\r
+# Copyright (c) 1991-1995 Unicode, Inc. All Rights reserved.\r
+#\r
+# This file is provided as-is by Unicode, Inc. (The Unicode Consortium).\r
+# No claims are made as to fitness for any particular purpose. No\r
+# warranties of any kind are expressed or implied. The recipient\r
+# agrees to determine applicability of information provided. If this\r
+# file has been provided on magnetic media by Unicode, Inc., the sole\r
+# remedy for any claim will be exchange of defective media within 90\r
+# days of receipt.\r
+#\r
+# Recipient is granted the right to make copies in any form for\r
+# internal distribution and to freely use the information supplied\r
+# in the creation of products supporting Unicode. Unicode, Inc.\r
+# specifically excludes the right to re-distribute this file directly\r
+# to third parties or other organizations whether for profit or not.\r
+#\r
+# General notes:\r
+#\r
+# This table contains the data the Unicode Consortium has on how\r
+# ISO 8859-9 (1989) characters map into Unicode.\r
+#\r
+# Format: Three tab-separated columns\r
+# Column #1 is the ISO 8859-9 code (in hex as 0xXX)\r
+# Column #2 is the Unicode code point (in hex as 0xXXXX)\r
+# Column #3 is the Unicode name (follows a comment sign, '#')\r
+#\r
+# The entries are in ISO 8859-9 order\r
+#\r
+# Any comments or problems, contact <John_Jenkins@taligent.com>\r
+#\r
+0x20 0x0020 # SPACE\r
+0x21 0x0021 # EXCLAMATION MARK\r
+0x22 0x0022 # QUOTATION MARK\r
+0x23 0x0023 # NUMBER SIGN\r
+0x24 0x0024 # DOLLAR SIGN\r
+0x25 0x0025 # PERCENT SIGN\r
+0x26 0x0026 # AMPERSAND\r
+0x27 0x0027 # APOSTROPHE\r
+0x28 0x0028 # LEFT PARENTHESIS\r
+0x29 0x0029 # RIGHT PARENTHESIS\r
+0x2A 0x002A # ASTERISK\r
+0x2B 0x002B # PLUS SIGN\r
+0x2C 0x002C # COMMA\r
+0x2D 0x002D # HYPHEN-MINUS\r
+0x2E 0x002E # FULL STOP\r
+0x2F 0x002F # SOLIDUS\r
+0x30 0x0030 # DIGIT ZERO\r
+0x31 0x0031 # DIGIT ONE\r
+0x32 0x0032 # DIGIT TWO\r
+0x33 0x0033 # DIGIT THREE\r
+0x34 0x0034 # DIGIT FOUR\r
+0x35 0x0035 # DIGIT FIVE\r
+0x36 0x0036 # DIGIT SIX\r
+0x37 0x0037 # DIGIT SEVEN\r
+0x38 0x0038 # DIGIT EIGHT\r
+0x39 0x0039 # DIGIT NINE\r
+0x3A 0x003A # COLON\r
+0x3B 0x003B # SEMICOLON\r
+0x3C 0x003C # LESS-THAN SIGN\r
+0x3D 0x003D # EQUALS SIGN\r
+0x3E 0x003E # GREATER-THAN SIGN\r
+0x3F 0x003F # QUESTION MARK\r
+0x40 0x0040 # COMMERCIAL AT\r
+0x41 0x0041 # LATIN CAPITAL LETTER A\r
+0x42 0x0042 # LATIN CAPITAL LETTER B\r
+0x43 0x0043 # LATIN CAPITAL LETTER C\r
+0x44 0x0044 # LATIN CAPITAL LETTER D\r
+0x45 0x0045 # LATIN CAPITAL LETTER E\r
+0x46 0x0046 # LATIN CAPITAL LETTER F\r
+0x47 0x0047 # LATIN CAPITAL LETTER G\r
+0x48 0x0048 # LATIN CAPITAL LETTER H\r
+0x49 0x0049 # LATIN CAPITAL LETTER I\r
+0x4A 0x004A # LATIN CAPITAL LETTER J\r
+0x4B 0x004B # LATIN CAPITAL LETTER K\r
+0x4C 0x004C # LATIN CAPITAL LETTER L\r
+0x4D 0x004D # LATIN CAPITAL LETTER M\r
+0x4E 0x004E # LATIN CAPITAL LETTER N\r
+0x4F 0x004F # LATIN CAPITAL LETTER O\r
+0x50 0x0050 # LATIN CAPITAL LETTER P\r
+0x51 0x0051 # LATIN CAPITAL LETTER Q\r
+0x52 0x0052 # LATIN CAPITAL LETTER R\r
+0x53 0x0053 # LATIN CAPITAL LETTER S\r
+0x54 0x0054 # LATIN CAPITAL LETTER T\r
+0x55 0x0055 # LATIN CAPITAL LETTER U\r
+0x56 0x0056 # LATIN CAPITAL LETTER V\r
+0x57 0x0057 # LATIN CAPITAL LETTER W\r
+0x58 0x0058 # LATIN CAPITAL LETTER X\r
+0x59 0x0059 # LATIN CAPITAL LETTER Y\r
+0x5A 0x005A # LATIN CAPITAL LETTER Z\r
+0x5B 0x005B # LEFT SQUARE BRACKET\r
+0x5C 0x005C # REVERSE SOLIDUS\r
+0x5D 0x005D # RIGHT SQUARE BRACKET\r
+0x5E 0x005E # CIRCUMFLEX ACCENT\r
+0x5F 0x005F # LOW LINE\r
+0x60 0x0060 # GRAVE ACCENT\r
+0x61 0x0061 # LATIN SMALL LETTER A\r
+0x62 0x0062 # LATIN SMALL LETTER B\r
+0x63 0x0063 # LATIN SMALL LETTER C\r
+0x64 0x0064 # LATIN SMALL LETTER D\r
+0x65 0x0065 # LATIN SMALL LETTER E\r
+0x66 0x0066 # LATIN SMALL LETTER F\r
+0x67 0x0067 # LATIN SMALL LETTER G\r
+0x68 0x0068 # LATIN SMALL LETTER H\r
+0x69 0x0069 # LATIN SMALL LETTER I\r
+0x6A 0x006A # LATIN SMALL LETTER J\r
+0x6B 0x006B # LATIN SMALL LETTER K\r
+0x6C 0x006C # LATIN SMALL LETTER L\r
+0x6D 0x006D # LATIN SMALL LETTER M\r
+0x6E 0x006E # LATIN SMALL LETTER N\r
+0x6F 0x006F # LATIN SMALL LETTER O\r
+0x70 0x0070 # LATIN SMALL LETTER P\r
+0x71 0x0071 # LATIN SMALL LETTER Q\r
+0x72 0x0072 # LATIN SMALL LETTER R\r
+0x73 0x0073 # LATIN SMALL LETTER S\r
+0x74 0x0074 # LATIN SMALL LETTER T\r
+0x75 0x0075 # LATIN SMALL LETTER U\r
+0x76 0x0076 # LATIN SMALL LETTER V\r
+0x77 0x0077 # LATIN SMALL LETTER W\r
+0x78 0x0078 # LATIN SMALL LETTER X\r
+0x79 0x0079 # LATIN SMALL LETTER Y\r
+0x7A 0x007A # LATIN SMALL LETTER Z\r
+0x7B 0x007B # LEFT CURLY BRACKET\r
+0x7C 0x007C # VERTICAL LINE\r
+0x7D 0x007D # RIGHT CURLY BRACKET\r
+0x7E 0x007E # TILDE\r
+0xA0 0x00A0 # NO-BREAK SPACE\r
+0xA1 0x00A1 # INVERTED EXCLAMATION MARK\r
+0xA2 0x00A2 # CENT SIGN\r
+0xA3 0x00A3 # POUND SIGN\r
+0xA4 0x00A4 # CURRENCY SIGN\r
+0xA5 0x00A5 # YEN SIGN\r
+0xA6 0x00A6 # BROKEN BAR\r
+0xA7 0x00A7 # SECTION SIGN\r
+0xA8 0x00A8 # DIAERESIS\r
+0xA9 0x00A9 # COPYRIGHT SIGN\r
+0xAA 0x00AA # FEMININE ORDINAL INDICATOR\r
+0xAB 0x00AB # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xAC 0x00AC # NOT SIGN\r
+0xAD 0x00AD # SOFT HYPHEN\r
+0xAE 0x00AE # REGISTERED SIGN\r
+0xAF 0x00AF # MACRON\r
+0xB0 0x00B0 # DEGREE SIGN\r
+0xB1 0x00B1 # PLUS-MINUS SIGN\r
+0xB2 0x00B2 # SUPERSCRIPT TWO\r
+0xB3 0x00B3 # SUPERSCRIPT THREE\r
+0xB4 0x00B4 # ACUTE ACCENT\r
+0xB5 0x00B5 # MICRO SIGN\r
+0xB6 0x00B6 # PILCROW SIGN\r
+0xB7 0x00B7 # MIDDLE DOT\r
+0xB8 0x00B8 # CEDILLA\r
+0xB9 0x00B9 # SUPERSCRIPT ONE\r
+0xBA 0x00BA # MASCULINE ORDINAL INDICATOR\r
+0xBB 0x00BB # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xBC 0x00BC # VULGAR FRACTION ONE QUARTER\r
+0xBD 0x00BD # VULGAR FRACTION ONE HALF\r
+0xBE 0x00BE # VULGAR FRACTION THREE QUARTERS\r
+0xBF 0x00BF # INVERTED QUESTION MARK\r
+0xC0 0x00C0 # LATIN CAPITAL LETTER A WITH GRAVE\r
+0xC1 0x00C1 # LATIN CAPITAL LETTER A WITH ACUTE\r
+0xC2 0x00C2 # LATIN CAPITAL LETTER A WITH CIRCUMFLEX\r
+0xC3 0x00C3 # LATIN CAPITAL LETTER A WITH TILDE\r
+0xC4 0x00C4 # LATIN CAPITAL LETTER A WITH DIAERESIS\r
+0xC5 0x00C5 # LATIN CAPITAL LETTER A WITH RING ABOVE\r
+0xC6 0x00C6 # LATIN CAPITAL LETTER AE\r
+0xC7 0x00C7 # LATIN CAPITAL LETTER C WITH CEDILLA\r
+0xC8 0x00C8 # LATIN CAPITAL LETTER E WITH GRAVE\r
+0xC9 0x00C9 # LATIN CAPITAL LETTER E WITH ACUTE\r
+0xCA 0x00CA # LATIN CAPITAL LETTER E WITH CIRCUMFLEX\r
+0xCB 0x00CB # LATIN CAPITAL LETTER E WITH DIAERESIS\r
+0xCC 0x00CC # LATIN CAPITAL LETTER I WITH GRAVE\r
+0xCD 0x00CD # LATIN CAPITAL LETTER I WITH ACUTE\r
+0xCE 0x00CE # LATIN CAPITAL LETTER I WITH CIRCUMFLEX\r
+0xCF 0x00CF # LATIN CAPITAL LETTER I WITH DIAERESIS\r
+0xD0 0x011E # LATIN CAPITAL LETTER G WITH BREVE\r
+0xD1 0x00D1 # LATIN CAPITAL LETTER N WITH TILDE\r
+0xD2 0x00D2 # LATIN CAPITAL LETTER O WITH GRAVE\r
+0xD3 0x00D3 # LATIN CAPITAL LETTER O WITH ACUTE\r
+0xD4 0x00D4 # LATIN CAPITAL LETTER O WITH CIRCUMFLEX\r
+0xD5 0x00D5 # LATIN CAPITAL LETTER O WITH TILDE\r
+0xD6 0x00D6 # LATIN CAPITAL LETTER O WITH DIAERESIS\r
+0xD7 0x00D7 # MULTIPLICATION SIGN\r
+0xD8 0x00D8 # LATIN CAPITAL LETTER O WITH STROKE\r
+0xD9 0x00D9 # LATIN CAPITAL LETTER U WITH GRAVE\r
+0xDA 0x00DA # LATIN CAPITAL LETTER U WITH ACUTE\r
+0xDB 0x00DB # LATIN CAPITAL LETTER U WITH CIRCUMFLEX\r
+0xDC 0x00DC # LATIN CAPITAL LETTER U WITH DIAERESIS\r
+0xDD 0x0130 # LATIN CAPITAL LETTER I WITH DOT ABOVE\r
+0xDE 0x015E # LATIN CAPITAL LETTER S WITH CEDILLA\r
+0xDF 0x00DF # LATIN SMALL LETTER SHARP S\r
+0xE0 0x00E0 # LATIN SMALL LETTER A WITH GRAVE\r
+0xE1 0x00E1 # LATIN SMALL LETTER A WITH ACUTE\r
+0xE2 0x00E2 # LATIN SMALL LETTER A WITH CIRCUMFLEX\r
+0xE3 0x00E3 # LATIN SMALL LETTER A WITH TILDE\r
+0xE4 0x00E4 # LATIN SMALL LETTER A WITH DIAERESIS\r
+0xE5 0x00E5 # LATIN SMALL LETTER A WITH RING ABOVE\r
+0xE6 0x00E6 # LATIN SMALL LETTER AE\r
+0xE7 0x00E7 # LATIN SMALL LETTER C WITH CEDILLA\r
+0xE8 0x00E8 # LATIN SMALL LETTER E WITH GRAVE\r
+0xE9 0x00E9 # LATIN SMALL LETTER E WITH ACUTE\r
+0xEA 0x00EA # LATIN SMALL LETTER E WITH CIRCUMFLEX\r
+0xEB 0x00EB # LATIN SMALL LETTER E WITH DIAERESIS\r
+0xEC 0x00EC # LATIN SMALL LETTER I WITH GRAVE\r
+0xED 0x00ED # LATIN SMALL LETTER I WITH ACUTE\r
+0xEE 0x00EE # LATIN SMALL LETTER I WITH CIRCUMFLEX\r
+0xEF 0x00EF # LATIN SMALL LETTER I WITH DIAERESIS\r
+0xF0 0x011F # LATIN SMALL LETTER G WITH BREVE\r
+0xF1 0x00F1 # LATIN SMALL LETTER N WITH TILDE\r
+0xF2 0x00F2 # LATIN SMALL LETTER O WITH GRAVE\r
+0xF3 0x00F3 # LATIN SMALL LETTER O WITH ACUTE\r
+0xF4 0x00F4 # LATIN SMALL LETTER O WITH CIRCUMFLEX\r
+0xF5 0x00F5 # LATIN SMALL LETTER O WITH TILDE\r
+0xF6 0x00F6 # LATIN SMALL LETTER O WITH DIAERESIS\r
+0xF7 0x00F7 # DIVISION SIGN\r
+0xF8 0x00F8 # LATIN SMALL LETTER O WITH STROKE\r
+0xF9 0x00F9 # LATIN SMALL LETTER U WITH GRAVE\r
+0xFA 0x00FA # LATIN SMALL LETTER U WITH ACUTE\r
+0xFB 0x00FB # LATIN SMALL LETTER U WITH CIRCUMFLEX\r
+0xFC 0x00FC # LATIN SMALL LETTER U WITH DIAERESIS\r
+0xFD 0x0131 # LATIN SMALL LETTER DOTLESS I\r
+0xFE 0x015F # LATIN SMALL LETTER S WITH CEDILLA\r
+0xFF 0x00FF # LATIN SMALL LETTER Y WITH DIAERESIS\r
+\r
+\r
--- /dev/null
+SHELL = /bin/sh
+INSTALL = @INSTALL@
+
+srcdir = @srcdir@
+top_srcdir = @top_srcdir@
+VPATH = @srcdir@
+prefix = @prefix@
+installroot = @installroot@
+exec_prefix = @exec_prefix@
+
+bindir = @bindir@
+sbindir = @sbindir@
+libexecdir = @libexecdir@
+datadir = @datadir@
+sysconfdir = @sysconfdir@
+sharedstatedir = @sharedstatedir@
+localstatedir = @localstatedir@
+libdir = @libdir@
+infodir = @infodir@
+mandir = @mandir@
+includedir = @includedir@
+oldincludedir = /usr/include
+
+REPL_EXT=@replsuffix@
+SPEC_EXT=@specsuffix@
+LIB_DIR=@datadir@/catdoc
+
+all: @charsetcheck@
+clean:
+install: install-dirs install-spc install-charsets
+install-spc: tex.spc ascii.spc tex.rpl ascii.rpl
+ $(INSTALL) -m 644 ascii.spc $(installroot)$(LIB_DIR)/ascii$(SPEC_EXT)
+ $(INSTALL) -m 644 tex.spc $(installroot)$(LIB_DIR)/tex$(SPEC_EXT)
+ $(INSTALL) -m 644 ascii.rpl $(installroot)$(LIB_DIR)/ascii$(REPL_EXT)
+ $(INSTALL) -m 644 tex.rpl $(installroot)$(LIB_DIR)/tex$(REPL_EXT)
+install-charsets:
+ for i in *.txt; do\
+ $(INSTALL) -m 0644 $$i $(installroot)$(LIB_DIR);\
+ done
+install-dirs:
+ ../mkinstalldirs $(installroot)$(LIB_DIR)
+distclean:
+ rm Makefile
--- /dev/null
+00A0 " "\r
+00A1 "!"\r
+00A2 "c"\r
+00A3 "F"\r
+00A4 "r"\r
+00A5 "Y"\r
+00A6 "|"\r
+00A7 " "\r
+00A9 "(c)"\r
+00AB '\"'\r
+00AC '~'\r
+00AD ""\r
+00AE "(R)"\r
+00B1 "+-"\r
+00BB '\"'\r
+00BC 1/4\r
+00BD 1/2\r
+00BE 3/4\r
+00BF ?\r
+00C0 A\r
+00C1 A\r
+00C2 A\r
+00C3 A\r
+00C4 Ae\r
+00C5 A\r
+00C6 AE\r
+00C7 C\r
+00C8 E\r
+00C9 E\r
+00CA E\r
+00CB E\r
+00CC I\r
+00CD I\r
+00CE I\r
+00CF I\r
+00D1 N \r
+00D2 O\r
+00D3 O\r
+00D4 O\r
+00D5 O\r
+00D6 Oe\r
+00D7 *\r
+00D8 O\r
+00D9 U\r
+00DA U\r
+00DB U\r
+00DC Ue\r
+00DD Y\r
+00DE TH\r
+00DF ss\r
+00E0 a\r
+00E1 a\r
+00E2 a\r
+00E3 a\r
+00E4 ae\r
+00E5 a\r
+00E6 ae\r
+00E7 c\r
+00E8 e\r
+00E9 e\r
+00EA e\r
+00EB e\r
+00EC i\r
+00ED i\r
+00EE i\r
+00EF i\r
+00F1 n\r
+00F2 o\r
+00F3 o \r
+00F4 o\r
+00F5 o\r
+00F6 oe\r
+00F7 /\r
+00F8 o\r
+00F9 u\r
+00FA u\r
+00FB u\r
+00FC ue\r
+00FD y\r
+00FE th\r
+00FF "y"\r
+0410 A\r
+0411 B\r
+0412 W\r
+0413 G\r
+0414 D\r
+0415 E\r
+0416 ZH\r
+0417 Z\r
+0418 I\r
+0419 Y\r
+041a K\r
+041b L\r
+041c M\r
+041d N\r
+041e O\r
+041f P\r
+0420 R\r
+0421 S\r
+0422 T\r
+0423 U\r
+0424 F\r
+0425 KH\r
+0426 TZ\r
+0427 CH\r
+0428 SH\r
+0429 SCH\r
+042a "'"\r
+042b Y\r
+042c "'"\r
+042d E\r
+042e YU\r
+042f YA\r
+0430 a\r
+0431 b\r
+0432 w\r
+0433 g\r
+0434 d\r
+0435 e\r
+0436 zh\r
+0437 z\r
+0438 i\r
+0439 y\r
+043a k\r
+043b l\r
+043c m\r
+043d n\r
+043e o\r
+043f p\r
+0440 r\r
+0441 s\r
+0442 t\r
+0443 u\r
+0444 f\r
+0445 kh\r
+0446 tz\r
+0447 ch\r
+0448 sh\r
+0449 sch\r
+044a "'"\r
+044b y\r
+044c "'"\r
+044d e\r
+044e yu\r
+044f ya\r
+0401 YO\r
+0451 yo\r
+201A "'"\r
+201E "''"\r
+2030 "o/oo"\r
+2039 "\""\r
+2018 "`"\r
+2019 "'"\r
+201C "``"\r
+201D "''"\r
+2022 "*"\r
+2013 "-"\r
+2014 " - "\r
+2122 "tm"\r
+203A "\""\r
+2116 "No"\r
+2026 "..."\r
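The replacement table above pairs a Unicode code point (hex, first column) with an ASCII fallback string, which may be bare, double-quoted, or single-quoted. As a rough illustration only, the Python sketch below loads a few of those entries and applies them to a string. The names `unquote`, `parse_replacement_table`, and `transliterate` are hypothetical, and the quoting and backslash-escape rules are assumptions read off the entries themselves, not catdoc's actual C parser.

```python
# Assumption: only \n and \t are special; any other escaped character
# (for example \") stands for itself.
_ESCAPES = {"n": "\n", "t": "\t"}


def unquote(token):
    """Strip matching quotes and resolve backslash escapes (assumed rules)."""
    if not (len(token) >= 2 and token[0] in "\"'" and token[-1] == token[0]):
        return token  # bare token such as: ae
    out = []
    chars = iter(token[1:-1])
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")
            out.append(_ESCAPES.get(nxt, nxt))
        else:
            out.append(ch)
    return "".join(out)


def parse_replacement_table(text):
    """Return {code_point: replacement_string} for each non-empty line."""
    table = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()  # also removes a CR left over from CRLF
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) < 2:
            continue  # code point without a replacement: ignored here
        table[int(parts[0], 16)] = unquote(parts[1])
    return table


def transliterate(text, table):
    """Replace every character found in the table; keep the rest as-is."""
    return "".join(table.get(ord(ch), ch) for ch in text)


# A handful of entries copied from the table above.
RPL_SAMPLE = r'''
00A9 "(c)"
00AB '\"'
00E4 ae
00FC ue
0416 ZH
0436 zh
2014 " - "
'''

rpl_sample = parse_replacement_table(RPL_SAMPLE)
print(transliterate("\u00a9 f\u00fcr \u0416\u0436", rpl_sample))
```

Characters without an entry pass through unchanged in this sketch; a real converter would still have to decide what to do when the target charset cannot represent them.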
--- /dev/null
+001C "\t"\r
+001E "\n"\r
+00AD ""\r
--- /dev/null
+#\r
+# Name: cp1250 to Unicode table\r
+# Unicode version: 2.0\r
+# Table version: 2.00\r
+# Table format: Format A\r
+# Date: 04/15/98\r
+#\r
+# Contact: cpxlate@microsoft.com\r
+#\r
+# General notes: none\r
+#\r
+# Format: Three tab-separated columns\r
+# Column #1 is the cp1250 code (in hex)\r
+# Column #2 is the Unicode (in hex as 0xXXXX)\r
+# Column #3 is the Unicode name (follows a comment sign, '#')\r
+#\r
+# The entries are in cp1250 order\r
+#\r
+0x00 0x0000 #NULL\r
+0x01 0x0001 #START OF HEADING\r
+0x02 0x0002 #START OF TEXT\r
+0x03 0x0003 #END OF TEXT\r
+0x04 0x0004 #END OF TRANSMISSION\r
+0x05 0x0005 #ENQUIRY\r
+0x06 0x0006 #ACKNOWLEDGE\r
+0x07 0x0007 #BELL\r
+0x08 0x0008 #BACKSPACE\r
+0x09 0x0009 #HORIZONTAL TABULATION\r
+0x0A 0x000A #LINE FEED\r
+0x0B 0x000B #VERTICAL TABULATION\r
+0x0C 0x000C #FORM FEED\r
+0x0D 0x000D #CARRIAGE RETURN\r
+0x0E 0x000E #SHIFT OUT\r
+0x0F 0x000F #SHIFT IN\r
+0x10 0x0010 #DATA LINK ESCAPE\r
+0x11 0x0011 #DEVICE CONTROL ONE\r
+0x12 0x0012 #DEVICE CONTROL TWO\r
+0x13 0x0013 #DEVICE CONTROL THREE\r
+0x14 0x0014 #DEVICE CONTROL FOUR\r
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE\r
+0x16 0x0016 #SYNCHRONOUS IDLE\r
+0x17 0x0017 #END OF TRANSMISSION BLOCK\r
+0x18 0x0018 #CANCEL\r
+0x19 0x0019 #END OF MEDIUM\r
+0x1A 0x001A #SUBSTITUTE\r
+0x1B 0x001B #ESCAPE\r
+0x1C 0x001C #FILE SEPARATOR\r
+0x1D 0x001D #GROUP SEPARATOR\r
+0x1E 0x001E #RECORD SEPARATOR\r
+0x1F 0x001F #UNIT SEPARATOR\r
+0x20 0x0020 #SPACE\r
+0x21 0x0021 #EXCLAMATION MARK\r
+0x22 0x0022 #QUOTATION MARK\r
+0x23 0x0023 #NUMBER SIGN\r
+0x24 0x0024 #DOLLAR SIGN\r
+0x25 0x0025 #PERCENT SIGN\r
+0x26 0x0026 #AMPERSAND\r
+0x27 0x0027 #APOSTROPHE\r
+0x28 0x0028 #LEFT PARENTHESIS\r
+0x29 0x0029 #RIGHT PARENTHESIS\r
+0x2A 0x002A #ASTERISK\r
+0x2B 0x002B #PLUS SIGN\r
+0x2C 0x002C #COMMA\r
+0x2D 0x002D #HYPHEN-MINUS\r
+0x2E 0x002E #FULL STOP\r
+0x2F 0x002F #SOLIDUS\r
+0x30 0x0030 #DIGIT ZERO\r
+0x31 0x0031 #DIGIT ONE\r
+0x32 0x0032 #DIGIT TWO\r
+0x33 0x0033 #DIGIT THREE\r
+0x34 0x0034 #DIGIT FOUR\r
+0x35 0x0035 #DIGIT FIVE\r
+0x36 0x0036 #DIGIT SIX\r
+0x37 0x0037 #DIGIT SEVEN\r
+0x38 0x0038 #DIGIT EIGHT\r
+0x39 0x0039 #DIGIT NINE\r
+0x3A 0x003A #COLON\r
+0x3B 0x003B #SEMICOLON\r
+0x3C 0x003C #LESS-THAN SIGN\r
+0x3D 0x003D #EQUALS SIGN\r
+0x3E 0x003E #GREATER-THAN SIGN\r
+0x3F 0x003F #QUESTION MARK\r
+0x40 0x0040 #COMMERCIAL AT\r
+0x41 0x0041 #LATIN CAPITAL LETTER A\r
+0x42 0x0042 #LATIN CAPITAL LETTER B\r
+0x43 0x0043 #LATIN CAPITAL LETTER C\r
+0x44 0x0044 #LATIN CAPITAL LETTER D\r
+0x45 0x0045 #LATIN CAPITAL LETTER E\r
+0x46 0x0046 #LATIN CAPITAL LETTER F\r
+0x47 0x0047 #LATIN CAPITAL LETTER G\r
+0x48 0x0048 #LATIN CAPITAL LETTER H\r
+0x49 0x0049 #LATIN CAPITAL LETTER I\r
+0x4A 0x004A #LATIN CAPITAL LETTER J\r
+0x4B 0x004B #LATIN CAPITAL LETTER K\r
+0x4C 0x004C #LATIN CAPITAL LETTER L\r
+0x4D 0x004D #LATIN CAPITAL LETTER M\r
+0x4E 0x004E #LATIN CAPITAL LETTER N\r
+0x4F 0x004F #LATIN CAPITAL LETTER O\r
+0x50 0x0050 #LATIN CAPITAL LETTER P\r
+0x51 0x0051 #LATIN CAPITAL LETTER Q\r
+0x52 0x0052 #LATIN CAPITAL LETTER R\r
+0x53 0x0053 #LATIN CAPITAL LETTER S\r
+0x54 0x0054 #LATIN CAPITAL LETTER T\r
+0x55 0x0055 #LATIN CAPITAL LETTER U\r
+0x56 0x0056 #LATIN CAPITAL LETTER V\r
+0x57 0x0057 #LATIN CAPITAL LETTER W\r
+0x58 0x0058 #LATIN CAPITAL LETTER X\r
+0x59 0x0059 #LATIN CAPITAL LETTER Y\r
+0x5A 0x005A #LATIN CAPITAL LETTER Z\r
+0x5B 0x005B #LEFT SQUARE BRACKET\r
+0x5C 0x005C #REVERSE SOLIDUS\r
+0x5D 0x005D #RIGHT SQUARE BRACKET\r
+0x5E 0x005E #CIRCUMFLEX ACCENT\r
+0x5F 0x005F #LOW LINE\r
+0x60 0x0060 #GRAVE ACCENT\r
+0x61 0x0061 #LATIN SMALL LETTER A\r
+0x62 0x0062 #LATIN SMALL LETTER B\r
+0x63 0x0063 #LATIN SMALL LETTER C\r
+0x64 0x0064 #LATIN SMALL LETTER D\r
+0x65 0x0065 #LATIN SMALL LETTER E\r
+0x66 0x0066 #LATIN SMALL LETTER F\r
+0x67 0x0067 #LATIN SMALL LETTER G\r
+0x68 0x0068 #LATIN SMALL LETTER H\r
+0x69 0x0069 #LATIN SMALL LETTER I\r
+0x6A 0x006A #LATIN SMALL LETTER J\r
+0x6B 0x006B #LATIN SMALL LETTER K\r
+0x6C 0x006C #LATIN SMALL LETTER L\r
+0x6D 0x006D #LATIN SMALL LETTER M\r
+0x6E 0x006E #LATIN SMALL LETTER N\r
+0x6F 0x006F #LATIN SMALL LETTER O\r
+0x70 0x0070 #LATIN SMALL LETTER P\r
+0x71 0x0071 #LATIN SMALL LETTER Q\r
+0x72 0x0072 #LATIN SMALL LETTER R\r
+0x73 0x0073 #LATIN SMALL LETTER S\r
+0x74 0x0074 #LATIN SMALL LETTER T\r
+0x75 0x0075 #LATIN SMALL LETTER U\r
+0x76 0x0076 #LATIN SMALL LETTER V\r
+0x77 0x0077 #LATIN SMALL LETTER W\r
+0x78 0x0078 #LATIN SMALL LETTER X\r
+0x79 0x0079 #LATIN SMALL LETTER Y\r
+0x7A 0x007A #LATIN SMALL LETTER Z\r
+0x7B 0x007B #LEFT CURLY BRACKET\r
+0x7C 0x007C #VERTICAL LINE\r
+0x7D 0x007D #RIGHT CURLY BRACKET\r
+0x7E 0x007E #TILDE\r
+0x7F 0x007F #DELETE\r
+0x80 0x20AC #EURO SIGN\r
+0x81 #UNDEFINED\r
+0x82 0x201A #SINGLE LOW-9 QUOTATION MARK\r
+0x83 #UNDEFINED\r
+0x84 0x201E #DOUBLE LOW-9 QUOTATION MARK\r
+0x85 0x2026 #HORIZONTAL ELLIPSIS\r
+0x86 0x2020 #DAGGER\r
+0x87 0x2021 #DOUBLE DAGGER\r
+0x88 #UNDEFINED\r
+0x89 0x2030 #PER MILLE SIGN\r
+0x8A 0x0160 #LATIN CAPITAL LETTER S WITH CARON\r
+0x8B 0x2039 #SINGLE LEFT-POINTING ANGLE QUOTATION MARK\r
+0x8C 0x015A #LATIN CAPITAL LETTER S WITH ACUTE\r
+0x8D 0x0164 #LATIN CAPITAL LETTER T WITH CARON\r
+0x8E 0x017D #LATIN CAPITAL LETTER Z WITH CARON\r
+0x8F 0x0179 #LATIN CAPITAL LETTER Z WITH ACUTE\r
+0x90 #UNDEFINED\r
+0x91 0x2018 #LEFT SINGLE QUOTATION MARK\r
+0x92 0x2019 #RIGHT SINGLE QUOTATION MARK\r
+0x93 0x201C #LEFT DOUBLE QUOTATION MARK\r
+0x94 0x201D #RIGHT DOUBLE QUOTATION MARK\r
+0x95 0x2022 #BULLET\r
+0x96 0x2013 #EN DASH\r
+0x97 0x2014 #EM DASH\r
+0x98 #UNDEFINED\r
+0x99 0x2122 #TRADE MARK SIGN\r
+0x9A 0x0161 #LATIN SMALL LETTER S WITH CARON\r
+0x9B 0x203A #SINGLE RIGHT-POINTING ANGLE QUOTATION MARK\r
+0x9C 0x015B #LATIN SMALL LETTER S WITH ACUTE\r
+0x9D 0x0165 #LATIN SMALL LETTER T WITH CARON\r
+0x9E 0x017E #LATIN SMALL LETTER Z WITH CARON\r
+0x9F 0x017A #LATIN SMALL LETTER Z WITH ACUTE\r
+0xA0 0x00A0 #NO-BREAK SPACE\r
+0xA1 0x02C7 #CARON\r
+0xA2 0x02D8 #BREVE\r
+0xA3 0x0141 #LATIN CAPITAL LETTER L WITH STROKE\r
+0xA4 0x00A4 #CURRENCY SIGN\r
+0xA5 0x0104 #LATIN CAPITAL LETTER A WITH OGONEK\r
+0xA6 0x00A6 #BROKEN BAR\r
+0xA7 0x00A7 #SECTION SIGN\r
+0xA8 0x00A8 #DIAERESIS\r
+0xA9 0x00A9 #COPYRIGHT SIGN\r
+0xAA 0x015E #LATIN CAPITAL LETTER S WITH CEDILLA\r
+0xAB 0x00AB #LEFT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xAC 0x00AC #NOT SIGN\r
+0xAD 0x00AD #SOFT HYPHEN\r
+0xAE 0x00AE #REGISTERED SIGN\r
+0xAF 0x017B #LATIN CAPITAL LETTER Z WITH DOT ABOVE\r
+0xB0 0x00B0 #DEGREE SIGN\r
+0xB1 0x00B1 #PLUS-MINUS SIGN\r
+0xB2 0x02DB #OGONEK\r
+0xB3 0x0142 #LATIN SMALL LETTER L WITH STROKE\r
+0xB4 0x00B4 #ACUTE ACCENT\r
+0xB5 0x00B5 #MICRO SIGN\r
+0xB6 0x00B6 #PILCROW SIGN\r
+0xB7 0x00B7 #MIDDLE DOT\r
+0xB8 0x00B8 #CEDILLA\r
+0xB9 0x0105 #LATIN SMALL LETTER A WITH OGONEK\r
+0xBA 0x015F #LATIN SMALL LETTER S WITH CEDILLA\r
+0xBB 0x00BB #RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xBC 0x013D #LATIN CAPITAL LETTER L WITH CARON\r
+0xBD 0x02DD #DOUBLE ACUTE ACCENT\r
+0xBE 0x013E #LATIN SMALL LETTER L WITH CARON\r
+0xBF 0x017C #LATIN SMALL LETTER Z WITH DOT ABOVE\r
+0xC0 0x0154 #LATIN CAPITAL LETTER R WITH ACUTE\r
+0xC1 0x00C1 #LATIN CAPITAL LETTER A WITH ACUTE\r
+0xC2 0x00C2 #LATIN CAPITAL LETTER A WITH CIRCUMFLEX\r
+0xC3 0x0102 #LATIN CAPITAL LETTER A WITH BREVE\r
+0xC4 0x00C4 #LATIN CAPITAL LETTER A WITH DIAERESIS\r
+0xC5 0x0139 #LATIN CAPITAL LETTER L WITH ACUTE\r
+0xC6 0x0106 #LATIN CAPITAL LETTER C WITH ACUTE\r
+0xC7 0x00C7 #LATIN CAPITAL LETTER C WITH CEDILLA\r
+0xC8 0x010C #LATIN CAPITAL LETTER C WITH CARON\r
+0xC9 0x00C9 #LATIN CAPITAL LETTER E WITH ACUTE\r
+0xCA 0x0118 #LATIN CAPITAL LETTER E WITH OGONEK\r
+0xCB 0x00CB #LATIN CAPITAL LETTER E WITH DIAERESIS\r
+0xCC 0x011A #LATIN CAPITAL LETTER E WITH CARON\r
+0xCD 0x00CD #LATIN CAPITAL LETTER I WITH ACUTE\r
+0xCE 0x00CE #LATIN CAPITAL LETTER I WITH CIRCUMFLEX\r
+0xCF 0x010E #LATIN CAPITAL LETTER D WITH CARON\r
+0xD0 0x0110 #LATIN CAPITAL LETTER D WITH STROKE\r
+0xD1 0x0143 #LATIN CAPITAL LETTER N WITH ACUTE\r
+0xD2 0x0147 #LATIN CAPITAL LETTER N WITH CARON\r
+0xD3 0x00D3 #LATIN CAPITAL LETTER O WITH ACUTE\r
+0xD4 0x00D4 #LATIN CAPITAL LETTER O WITH CIRCUMFLEX\r
+0xD5 0x0150 #LATIN CAPITAL LETTER O WITH DOUBLE ACUTE\r
+0xD6 0x00D6 #LATIN CAPITAL LETTER O WITH DIAERESIS\r
+0xD7 0x00D7 #MULTIPLICATION SIGN\r
+0xD8 0x0158 #LATIN CAPITAL LETTER R WITH CARON\r
+0xD9 0x016E #LATIN CAPITAL LETTER U WITH RING ABOVE\r
+0xDA 0x00DA #LATIN CAPITAL LETTER U WITH ACUTE\r
+0xDB 0x0170 #LATIN CAPITAL LETTER U WITH DOUBLE ACUTE\r
+0xDC 0x00DC #LATIN CAPITAL LETTER U WITH DIAERESIS\r
+0xDD 0x00DD #LATIN CAPITAL LETTER Y WITH ACUTE\r
+0xDE 0x0162 #LATIN CAPITAL LETTER T WITH CEDILLA\r
+0xDF 0x00DF #LATIN SMALL LETTER SHARP S\r
+0xE0 0x0155 #LATIN SMALL LETTER R WITH ACUTE\r
+0xE1 0x00E1 #LATIN SMALL LETTER A WITH ACUTE\r
+0xE2 0x00E2 #LATIN SMALL LETTER A WITH CIRCUMFLEX\r
+0xE3 0x0103 #LATIN SMALL LETTER A WITH BREVE\r
+0xE4 0x00E4 #LATIN SMALL LETTER A WITH DIAERESIS\r
+0xE5 0x013A #LATIN SMALL LETTER L WITH ACUTE\r
+0xE6 0x0107 #LATIN SMALL LETTER C WITH ACUTE\r
+0xE7 0x00E7 #LATIN SMALL LETTER C WITH CEDILLA\r
+0xE8 0x010D #LATIN SMALL LETTER C WITH CARON\r
+0xE9 0x00E9 #LATIN SMALL LETTER E WITH ACUTE\r
+0xEA 0x0119 #LATIN SMALL LETTER E WITH OGONEK\r
+0xEB 0x00EB #LATIN SMALL LETTER E WITH DIAERESIS\r
+0xEC 0x011B #LATIN SMALL LETTER E WITH CARON\r
+0xED 0x00ED #LATIN SMALL LETTER I WITH ACUTE\r
+0xEE 0x00EE #LATIN SMALL LETTER I WITH CIRCUMFLEX\r
+0xEF 0x010F #LATIN SMALL LETTER D WITH CARON\r
+0xF0 0x0111 #LATIN SMALL LETTER D WITH STROKE\r
+0xF1 0x0144 #LATIN SMALL LETTER N WITH ACUTE\r
+0xF2 0x0148 #LATIN SMALL LETTER N WITH CARON\r
+0xF3 0x00F3 #LATIN SMALL LETTER O WITH ACUTE\r
+0xF4 0x00F4 #LATIN SMALL LETTER O WITH CIRCUMFLEX\r
+0xF5 0x0151 #LATIN SMALL LETTER O WITH DOUBLE ACUTE\r
+0xF6 0x00F6 #LATIN SMALL LETTER O WITH DIAERESIS\r
+0xF7 0x00F7 #DIVISION SIGN\r
+0xF8 0x0159 #LATIN SMALL LETTER R WITH CARON\r
+0xF9 0x016F #LATIN SMALL LETTER U WITH RING ABOVE\r
+0xFA 0x00FA #LATIN SMALL LETTER U WITH ACUTE\r
+0xFB 0x0171 #LATIN SMALL LETTER U WITH DOUBLE ACUTE\r
+0xFC 0x00FC #LATIN SMALL LETTER U WITH DIAERESIS\r
+0xFD 0x00FD #LATIN SMALL LETTER Y WITH ACUTE\r
+0xFE 0x0163 #LATIN SMALL LETTER T WITH CEDILLA\r
+0xFF 0x02D9 #DOT ABOVE\r
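The header of the table above describes its layout: three tab-separated columns holding the charset code in hex, the Unicode value in hex, and the Unicode name after a '#' sign. Bytes marked #UNDEFINED have no Unicode column at all. The Python sketch below shows one way such a file could be read into a lookup table. The function names `parse_charset_table` and `decode_bytes` are hypothetical, the choice of U+FFFD as fallback is an assumption, and this is an illustration rather than the reader catdoc itself uses.

```python
def parse_charset_table(text):
    """Parse mapping-table text into {byte_code: unicode_code_point}.

    Comment-only lines and undefined bytes (no Unicode column) are skipped.
    """
    mapping = {}
    for raw_line in text.splitlines():
        # Everything from the first '#' on is the comment (the Unicode name).
        data = raw_line.split("#", 1)[0].strip()
        if not data:
            continue  # blank line or pure comment line
        fields = data.split()
        if len(fields) < 2:
            continue  # e.g. "0x81 <tab><tab> #UNDEFINED": nothing to map
        # int(..., 16) accepts the leading "0x" prefix.
        mapping[int(fields[0], 16)] = int(fields[1], 16)
    return mapping


def decode_bytes(data, mapping, fallback="\ufffd"):
    """Decode a bytes object; unmapped bytes become the fallback (assumed)."""
    return "".join(chr(mapping[b]) if b in mapping else fallback for b in data)


# A few rows in the same layout as the cp1250 table above.
CP1250_SAMPLE = """\
# Column #1 is the cp1250 code (in hex)
0x41\t0x0041\t#LATIN CAPITAL LETTER A
0x80\t0x20AC\t#EURO SIGN
0x81\t\t#UNDEFINED
0x8A\t0x0160\t#LATIN CAPITAL LETTER S WITH CARON
"""

cp1250_sample = parse_charset_table(CP1250_SAMPLE)
print(sorted(hex(code) for code in cp1250_sample))
```

Splitting on arbitrary whitespace instead of strictly on tabs keeps the sketch tolerant of files whose tabs were converted to spaces, and stripping each line also discards the carriage return of CRLF-terminated files.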
--- /dev/null
+#
+# Name: cp1251_WinCyrillic to Unicode table
+# Unicode version: 2.0
+# Table version: 2.00
+# Table format: Format A
+# Date: 04/24/96
+# Authors: Lori Brownell <loribr@microsoft.com>
+# K.D. Chang <a-kchang@microsoft.com>
+# General notes: none
+#
+# Format: Three tab-separated columns
+# Column #1 is the cp1251_WinCyrillic code (in hex)
+# Column #2 is the Unicode (in hex as 0xXXXX)
+# Column #3 is the Unicode name (follows a comment sign, '#')
+#
+# The entries are in cp1251_WinCyrillic order
+#
+0x00 0x0000 #NULL
+0x01 0x0001 #START OF HEADING
+0x02 0x0002 #START OF TEXT
+0x03 0x0003 #END OF TEXT
+0x04 0x0004 #END OF TRANSMISSION
+0x05 0x0005 #ENQUIRY
+0x06 0x0006 #ACKNOWLEDGE
+0x07 0x0007 #BELL
+0x08 0x0008 #BACKSPACE
+0x09 0x0009 #HORIZONTAL TABULATION
+0x0A 0x000A #LINE FEED
+0x0B 0x000B #VERTICAL TABULATION
+0x0C 0x000C #FORM FEED
+0x0D 0x000D #CARRIAGE RETURN
+0x0E 0x000E #SHIFT OUT
+0x0F 0x000F #SHIFT IN
+0x10 0x0010 #DATA LINK ESCAPE
+0x11 0x0011 #DEVICE CONTROL ONE
+0x12 0x0012 #DEVICE CONTROL TWO
+0x13 0x0013 #DEVICE CONTROL THREE
+0x14 0x0014 #DEVICE CONTROL FOUR
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE
+0x16 0x0016 #SYNCHRONOUS IDLE
+0x17 0x0017 #END OF TRANSMISSION BLOCK
+0x18 0x0018 #CANCEL
+0x19 0x0019 #END OF MEDIUM
+0x1A 0x001A #SUBSTITUTE
+0x1B 0x001B #ESCAPE
+0x1C 0x001C #FILE SEPARATOR
+0x1D 0x001D #GROUP SEPARATOR
+0x1E 0x001E #RECORD SEPARATOR
+0x1F 0x001F #UNIT SEPARATOR
+0x20 0x0020 #SPACE
+0x21 0x0021 #EXCLAMATION MARK
+0x22 0x0022 #QUOTATION MARK
+0x23 0x0023 #NUMBER SIGN
+0x24 0x0024 #DOLLAR SIGN
+0x25 0x0025 #PERCENT SIGN
+0x26 0x0026 #AMPERSAND
+0x27 0x0027 #APOSTROPHE
+0x28 0x0028 #LEFT PARENTHESIS
+0x29 0x0029 #RIGHT PARENTHESIS
+0x2A 0x002A #ASTERISK
+0x2B 0x002B #PLUS SIGN
+0x2C 0x002C #COMMA
+0x2D 0x002D #HYPHEN-MINUS
+0x2E 0x002E #FULL STOP
+0x2F 0x002F #SOLIDUS
+0x30 0x0030 #DIGIT ZERO
+0x31 0x0031 #DIGIT ONE
+0x32 0x0032 #DIGIT TWO
+0x33 0x0033 #DIGIT THREE
+0x34 0x0034 #DIGIT FOUR
+0x35 0x0035 #DIGIT FIVE
+0x36 0x0036 #DIGIT SIX
+0x37 0x0037 #DIGIT SEVEN
+0x38 0x0038 #DIGIT EIGHT
+0x39 0x0039 #DIGIT NINE
+0x3A 0x003A #COLON
+0x3B 0x003B #SEMICOLON
+0x3C 0x003C #LESS-THAN SIGN
+0x3D 0x003D #EQUALS SIGN
+0x3E 0x003E #GREATER-THAN SIGN
+0x3F 0x003F #QUESTION MARK
+0x40 0x0040 #COMMERCIAL AT
+0x41 0x0041 #LATIN CAPITAL LETTER A
+0x42 0x0042 #LATIN CAPITAL LETTER B
+0x43 0x0043 #LATIN CAPITAL LETTER C
+0x44 0x0044 #LATIN CAPITAL LETTER D
+0x45 0x0045 #LATIN CAPITAL LETTER E
+0x46 0x0046 #LATIN CAPITAL LETTER F
+0x47 0x0047 #LATIN CAPITAL LETTER G
+0x48 0x0048 #LATIN CAPITAL LETTER H
+0x49 0x0049 #LATIN CAPITAL LETTER I
+0x4A 0x004A #LATIN CAPITAL LETTER J
+0x4B 0x004B #LATIN CAPITAL LETTER K
+0x4C 0x004C #LATIN CAPITAL LETTER L
+0x4D 0x004D #LATIN CAPITAL LETTER M
+0x4E 0x004E #LATIN CAPITAL LETTER N
+0x4F 0x004F #LATIN CAPITAL LETTER O
+0x50 0x0050 #LATIN CAPITAL LETTER P
+0x51 0x0051 #LATIN CAPITAL LETTER Q
+0x52 0x0052 #LATIN CAPITAL LETTER R
+0x53 0x0053 #LATIN CAPITAL LETTER S
+0x54 0x0054 #LATIN CAPITAL LETTER T
+0x55 0x0055 #LATIN CAPITAL LETTER U
+0x56 0x0056 #LATIN CAPITAL LETTER V
+0x57 0x0057 #LATIN CAPITAL LETTER W
+0x58 0x0058 #LATIN CAPITAL LETTER X
+0x59 0x0059 #LATIN CAPITAL LETTER Y
+0x5A 0x005A #LATIN CAPITAL LETTER Z
+0x5B 0x005B #LEFT SQUARE BRACKET
+0x5C 0x005C #REVERSE SOLIDUS
+0x5D 0x005D #RIGHT SQUARE BRACKET
+0x5E 0x005E #CIRCUMFLEX ACCENT
+0x5F 0x005F #LOW LINE
+0x60 0x0060 #GRAVE ACCENT
+0x61 0x0061 #LATIN SMALL LETTER A
+0x62 0x0062 #LATIN SMALL LETTER B
+0x63 0x0063 #LATIN SMALL LETTER C
+0x64 0x0064 #LATIN SMALL LETTER D
+0x65 0x0065 #LATIN SMALL LETTER E
+0x66 0x0066 #LATIN SMALL LETTER F
+0x67 0x0067 #LATIN SMALL LETTER G
+0x68 0x0068 #LATIN SMALL LETTER H
+0x69 0x0069 #LATIN SMALL LETTER I
+0x6A 0x006A #LATIN SMALL LETTER J
+0x6B 0x006B #LATIN SMALL LETTER K
+0x6C 0x006C #LATIN SMALL LETTER L
+0x6D 0x006D #LATIN SMALL LETTER M
+0x6E 0x006E #LATIN SMALL LETTER N
+0x6F 0x006F #LATIN SMALL LETTER O
+0x70 0x0070 #LATIN SMALL LETTER P
+0x71 0x0071 #LATIN SMALL LETTER Q
+0x72 0x0072 #LATIN SMALL LETTER R
+0x73 0x0073 #LATIN SMALL LETTER S
+0x74 0x0074 #LATIN SMALL LETTER T
+0x75 0x0075 #LATIN SMALL LETTER U
+0x76 0x0076 #LATIN SMALL LETTER V
+0x77 0x0077 #LATIN SMALL LETTER W
+0x78 0x0078 #LATIN SMALL LETTER X
+0x79 0x0079 #LATIN SMALL LETTER Y
+0x7A 0x007A #LATIN SMALL LETTER Z
+0x7B 0x007B #LEFT CURLY BRACKET
+0x7C 0x007C #VERTICAL LINE
+0x7D 0x007D #RIGHT CURLY BRACKET
+0x7E 0x007E #TILDE
+0x7F 0x007F #DELETE
+0x80 0x0402 #CYRILLIC CAPITAL LETTER DJE
+0x81 0x0403 #CYRILLIC CAPITAL LETTER GJE
+0x82 0x201A #SINGLE LOW-9 QUOTATION MARK
+0x83 0x0453 #CYRILLIC SMALL LETTER GJE
+0x84 0x201E #DOUBLE LOW-9 QUOTATION MARK
+0x85 0x2026 #HORIZONTAL ELLIPSIS
+0x86 0x2020 #DAGGER
+0x87 0x2021 #DOUBLE DAGGER
+0x88 #UNDEFINED
+0x89 0x2030 #PER MILLE SIGN
+0x8A 0x0409 #CYRILLIC CAPITAL LETTER LJE
+0x8B 0x2039 #SINGLE LEFT-POINTING ANGLE QUOTATION MARK
+0x8C 0x040A #CYRILLIC CAPITAL LETTER NJE
+0x8D 0x040C #CYRILLIC CAPITAL LETTER KJE
+0x8E 0x040B #CYRILLIC CAPITAL LETTER TSHE
+0x8F 0x040F #CYRILLIC CAPITAL LETTER DZHE
+0x90 0x0452 #CYRILLIC SMALL LETTER DJE
+0x91 0x2018 #LEFT SINGLE QUOTATION MARK
+0x92 0x2019 #RIGHT SINGLE QUOTATION MARK
+0x93 0x201C #LEFT DOUBLE QUOTATION MARK
+0x94 0x201D #RIGHT DOUBLE QUOTATION MARK
+0x95 0x2022 #BULLET
+0x96 0x2013 #EN DASH
+0x97 0x2014 #EM DASH
+0x98 #UNDEFINED
+0x99 0x2122 #TRADE MARK SIGN
+0x9A 0x0459 #CYRILLIC SMALL LETTER LJE
+0x9B 0x203A #SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
+0x9C 0x045A #CYRILLIC SMALL LETTER NJE
+0x9D 0x045C #CYRILLIC SMALL LETTER KJE
+0x9E 0x045B #CYRILLIC SMALL LETTER TSHE
+0x9F 0x045F #CYRILLIC SMALL LETTER DZHE
+0xA0 0x00A0 #NO-BREAK SPACE
+0xA1 0x040E #CYRILLIC CAPITAL LETTER SHORT U
+0xA2 0x045E #CYRILLIC SMALL LETTER SHORT U
+0xA3 0x0408 #CYRILLIC CAPITAL LETTER JE
+0xA4 0x00A4 #CURRENCY SIGN
+0xA5 0x0490 #CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+0xA6 0x00A6 #BROKEN BAR
+0xA7 0x00A7 #SECTION SIGN
+0xA8 0x0401 #CYRILLIC CAPITAL LETTER IO
+0xA9 0x00A9 #COPYRIGHT SIGN
+0xAA 0x0404 #CYRILLIC CAPITAL LETTER UKRAINIAN IE
+0xAB 0x00AB #LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xAC 0x00AC #NOT SIGN
+0xAD 0x00AD #SOFT HYPHEN
+0xAE 0x00AE #REGISTERED SIGN
+0xAF 0x0407 #CYRILLIC CAPITAL LETTER YI
+0xB0 0x00B0 #DEGREE SIGN
+0xB1 0x00B1 #PLUS-MINUS SIGN
+0xB2 0x0406 #CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+0xB3 0x0456 #CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+0xB4 0x0491 #CYRILLIC SMALL LETTER GHE WITH UPTURN
+0xB5 0x00B5 #MICRO SIGN
+0xB6 0x00B6 #PILCROW SIGN
+0xB7 0x00B7 #MIDDLE DOT
+0xB8 0x0451 #CYRILLIC SMALL LETTER IO
+0xB9 0x2116 #NUMERO SIGN
+0xBA 0x0454 #CYRILLIC SMALL LETTER UKRAINIAN IE
+0xBB 0x00BB #RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xBC 0x0458 #CYRILLIC SMALL LETTER JE
+0xBD 0x0405 #CYRILLIC CAPITAL LETTER DZE
+0xBE 0x0455 #CYRILLIC SMALL LETTER DZE
+0xBF 0x0457 #CYRILLIC SMALL LETTER YI
+0xC0 0x0410 #CYRILLIC CAPITAL LETTER A
+0xC1 0x0411 #CYRILLIC CAPITAL LETTER BE
+0xC2 0x0412 #CYRILLIC CAPITAL LETTER VE
+0xC3 0x0413 #CYRILLIC CAPITAL LETTER GHE
+0xC4 0x0414 #CYRILLIC CAPITAL LETTER DE
+0xC5 0x0415 #CYRILLIC CAPITAL LETTER IE
+0xC6 0x0416 #CYRILLIC CAPITAL LETTER ZHE
+0xC7 0x0417 #CYRILLIC CAPITAL LETTER ZE
+0xC8 0x0418 #CYRILLIC CAPITAL LETTER I
+0xC9 0x0419 #CYRILLIC CAPITAL LETTER SHORT I
+0xCA 0x041A #CYRILLIC CAPITAL LETTER KA
+0xCB 0x041B #CYRILLIC CAPITAL LETTER EL
+0xCC 0x041C #CYRILLIC CAPITAL LETTER EM
+0xCD 0x041D #CYRILLIC CAPITAL LETTER EN
+0xCE 0x041E #CYRILLIC CAPITAL LETTER O
+0xCF 0x041F #CYRILLIC CAPITAL LETTER PE
+0xD0 0x0420 #CYRILLIC CAPITAL LETTER ER
+0xD1 0x0421 #CYRILLIC CAPITAL LETTER ES
+0xD2 0x0422 #CYRILLIC CAPITAL LETTER TE
+0xD3 0x0423 #CYRILLIC CAPITAL LETTER U
+0xD4 0x0424 #CYRILLIC CAPITAL LETTER EF
+0xD5 0x0425 #CYRILLIC CAPITAL LETTER HA
+0xD6 0x0426 #CYRILLIC CAPITAL LETTER TSE
+0xD7 0x0427 #CYRILLIC CAPITAL LETTER CHE
+0xD8 0x0428 #CYRILLIC CAPITAL LETTER SHA
+0xD9 0x0429 #CYRILLIC CAPITAL LETTER SHCHA
+0xDA 0x042A #CYRILLIC CAPITAL LETTER HARD SIGN
+0xDB 0x042B #CYRILLIC CAPITAL LETTER YERU
+0xDC 0x042C #CYRILLIC CAPITAL LETTER SOFT SIGN
+0xDD 0x042D #CYRILLIC CAPITAL LETTER E
+0xDE 0x042E #CYRILLIC CAPITAL LETTER YU
+0xDF 0x042F #CYRILLIC CAPITAL LETTER YA
+0xE0 0x0430 #CYRILLIC SMALL LETTER A
+0xE1 0x0431 #CYRILLIC SMALL LETTER BE
+0xE2 0x0432 #CYRILLIC SMALL LETTER VE
+0xE3 0x0433 #CYRILLIC SMALL LETTER GHE
+0xE4 0x0434 #CYRILLIC SMALL LETTER DE
+0xE5 0x0435 #CYRILLIC SMALL LETTER IE
+0xE6 0x0436 #CYRILLIC SMALL LETTER ZHE
+0xE7 0x0437 #CYRILLIC SMALL LETTER ZE
+0xE8 0x0438 #CYRILLIC SMALL LETTER I
+0xE9 0x0439 #CYRILLIC SMALL LETTER SHORT I
+0xEA 0x043A #CYRILLIC SMALL LETTER KA
+0xEB 0x043B #CYRILLIC SMALL LETTER EL
+0xEC 0x043C #CYRILLIC SMALL LETTER EM
+0xED 0x043D #CYRILLIC SMALL LETTER EN
+0xEE 0x043E #CYRILLIC SMALL LETTER O
+0xEF 0x043F #CYRILLIC SMALL LETTER PE
+0xF0 0x0440 #CYRILLIC SMALL LETTER ER
+0xF1 0x0441 #CYRILLIC SMALL LETTER ES
+0xF2 0x0442 #CYRILLIC SMALL LETTER TE
+0xF3 0x0443 #CYRILLIC SMALL LETTER U
+0xF4 0x0444 #CYRILLIC SMALL LETTER EF
+0xF5 0x0445 #CYRILLIC SMALL LETTER HA
+0xF6 0x0446 #CYRILLIC SMALL LETTER TSE
+0xF7 0x0447 #CYRILLIC SMALL LETTER CHE
+0xF8 0x0448 #CYRILLIC SMALL LETTER SHA
+0xF9 0x0449 #CYRILLIC SMALL LETTER SHCHA
+0xFA 0x044A #CYRILLIC SMALL LETTER HARD SIGN
+0xFB 0x044B #CYRILLIC SMALL LETTER YERU
+0xFC 0x044C #CYRILLIC SMALL LETTER SOFT SIGN
+0xFD 0x044D #CYRILLIC SMALL LETTER E
+0xFE 0x044E #CYRILLIC SMALL LETTER YU
+0xFF 0x044F #CYRILLIC SMALL LETTER YA
+
--- /dev/null
+#\r
+# Name: cp1252 to Unicode table\r
+# Unicode version: 2.0\r
+# Table version: 2.00\r
+# Table format: Format A\r
+# Date: 04/15/98\r
+#\r
+# Contact: cpxlate@microsoft.com\r
+#\r
+# General notes: none\r
+#\r
+# Format: Three tab-separated columns\r
+# Column #1 is the cp1252 code (in hex)\r
+# Column #2 is the Unicode (in hex as 0xXXXX)\r
+# Column #3 is the Unicode name (follows a comment sign, '#')\r
+#\r
+# The entries are in cp1252 order\r
+#\r
+0x00 0x0000 #NULL\r
+0x01 0x0001 #START OF HEADING\r
+0x02 0x0002 #START OF TEXT\r
+0x03 0x0003 #END OF TEXT\r
+0x04 0x0004 #END OF TRANSMISSION\r
+0x05 0x0005 #ENQUIRY\r
+0x06 0x0006 #ACKNOWLEDGE\r
+0x07 0x0007 #BELL\r
+0x08 0x0008 #BACKSPACE\r
+0x09 0x0009 #HORIZONTAL TABULATION\r
+0x0A 0x000A #LINE FEED\r
+0x0B 0x000B #VERTICAL TABULATION\r
+0x0C 0x000C #FORM FEED\r
+0x0D 0x000D #CARRIAGE RETURN\r
+0x0E 0x000E #SHIFT OUT\r
+0x0F 0x000F #SHIFT IN\r
+0x10 0x0010 #DATA LINK ESCAPE\r
+0x11 0x0011 #DEVICE CONTROL ONE\r
+0x12 0x0012 #DEVICE CONTROL TWO\r
+0x13 0x0013 #DEVICE CONTROL THREE\r
+0x14 0x0014 #DEVICE CONTROL FOUR\r
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE\r
+0x16 0x0016 #SYNCHRONOUS IDLE\r
+0x17 0x0017 #END OF TRANSMISSION BLOCK\r
+0x18 0x0018 #CANCEL\r
+0x19 0x0019 #END OF MEDIUM\r
+0x1A 0x001A #SUBSTITUTE\r
+0x1B 0x001B #ESCAPE\r
+0x1C 0x001C #FILE SEPARATOR\r
+0x1D 0x001D #GROUP SEPARATOR\r
+0x1E 0x001E #RECORD SEPARATOR\r
+0x1F 0x001F #UNIT SEPARATOR\r
+0x20 0x0020 #SPACE\r
+0x21 0x0021 #EXCLAMATION MARK\r
+0x22 0x0022 #QUOTATION MARK\r
+0x23 0x0023 #NUMBER SIGN\r
+0x24 0x0024 #DOLLAR SIGN\r
+0x25 0x0025 #PERCENT SIGN\r
+0x26 0x0026 #AMPERSAND\r
+0x27 0x0027 #APOSTROPHE\r
+0x28 0x0028 #LEFT PARENTHESIS\r
+0x29 0x0029 #RIGHT PARENTHESIS\r
+0x2A 0x002A #ASTERISK\r
+0x2B 0x002B #PLUS SIGN\r
+0x2C 0x002C #COMMA\r
+0x2D 0x002D #HYPHEN-MINUS\r
+0x2E 0x002E #FULL STOP\r
+0x2F 0x002F #SOLIDUS\r
+0x30 0x0030 #DIGIT ZERO\r
+0x31 0x0031 #DIGIT ONE\r
+0x32 0x0032 #DIGIT TWO\r
+0x33 0x0033 #DIGIT THREE\r
+0x34 0x0034 #DIGIT FOUR\r
+0x35 0x0035 #DIGIT FIVE\r
+0x36 0x0036 #DIGIT SIX\r
+0x37 0x0037 #DIGIT SEVEN\r
+0x38 0x0038 #DIGIT EIGHT\r
+0x39 0x0039 #DIGIT NINE\r
+0x3A 0x003A #COLON\r
+0x3B 0x003B #SEMICOLON\r
+0x3C 0x003C #LESS-THAN SIGN\r
+0x3D 0x003D #EQUALS SIGN\r
+0x3E 0x003E #GREATER-THAN SIGN\r
+0x3F 0x003F #QUESTION MARK\r
+0x40 0x0040 #COMMERCIAL AT\r
+0x41 0x0041 #LATIN CAPITAL LETTER A\r
+0x42 0x0042 #LATIN CAPITAL LETTER B\r
+0x43 0x0043 #LATIN CAPITAL LETTER C\r
+0x44 0x0044 #LATIN CAPITAL LETTER D\r
+0x45 0x0045 #LATIN CAPITAL LETTER E\r
+0x46 0x0046 #LATIN CAPITAL LETTER F\r
+0x47 0x0047 #LATIN CAPITAL LETTER G\r
+0x48 0x0048 #LATIN CAPITAL LETTER H\r
+0x49 0x0049 #LATIN CAPITAL LETTER I\r
+0x4A 0x004A #LATIN CAPITAL LETTER J\r
+0x4B 0x004B #LATIN CAPITAL LETTER K\r
+0x4C 0x004C #LATIN CAPITAL LETTER L\r
+0x4D 0x004D #LATIN CAPITAL LETTER M\r
+0x4E 0x004E #LATIN CAPITAL LETTER N\r
+0x4F 0x004F #LATIN CAPITAL LETTER O\r
+0x50 0x0050 #LATIN CAPITAL LETTER P\r
+0x51 0x0051 #LATIN CAPITAL LETTER Q\r
+0x52 0x0052 #LATIN CAPITAL LETTER R\r
+0x53 0x0053 #LATIN CAPITAL LETTER S\r
+0x54 0x0054 #LATIN CAPITAL LETTER T\r
+0x55 0x0055 #LATIN CAPITAL LETTER U\r
+0x56 0x0056 #LATIN CAPITAL LETTER V\r
+0x57 0x0057 #LATIN CAPITAL LETTER W\r
+0x58 0x0058 #LATIN CAPITAL LETTER X\r
+0x59 0x0059 #LATIN CAPITAL LETTER Y\r
+0x5A 0x005A #LATIN CAPITAL LETTER Z\r
+0x5B 0x005B #LEFT SQUARE BRACKET\r
+0x5C 0x005C #REVERSE SOLIDUS\r
+0x5D 0x005D #RIGHT SQUARE BRACKET\r
+0x5E 0x005E #CIRCUMFLEX ACCENT\r
+0x5F 0x005F #LOW LINE\r
+0x60 0x0060 #GRAVE ACCENT\r
+0x61 0x0061 #LATIN SMALL LETTER A\r
+0x62 0x0062 #LATIN SMALL LETTER B\r
+0x63 0x0063 #LATIN SMALL LETTER C\r
+0x64 0x0064 #LATIN SMALL LETTER D\r
+0x65 0x0065 #LATIN SMALL LETTER E\r
+0x66 0x0066 #LATIN SMALL LETTER F\r
+0x67 0x0067 #LATIN SMALL LETTER G\r
+0x68 0x0068 #LATIN SMALL LETTER H\r
+0x69 0x0069 #LATIN SMALL LETTER I\r
+0x6A 0x006A #LATIN SMALL LETTER J\r
+0x6B 0x006B #LATIN SMALL LETTER K\r
+0x6C 0x006C #LATIN SMALL LETTER L\r
+0x6D 0x006D #LATIN SMALL LETTER M\r
+0x6E 0x006E #LATIN SMALL LETTER N\r
+0x6F 0x006F #LATIN SMALL LETTER O\r
+0x70 0x0070 #LATIN SMALL LETTER P\r
+0x71 0x0071 #LATIN SMALL LETTER Q\r
+0x72 0x0072 #LATIN SMALL LETTER R\r
+0x73 0x0073 #LATIN SMALL LETTER S\r
+0x74 0x0074 #LATIN SMALL LETTER T\r
+0x75 0x0075 #LATIN SMALL LETTER U\r
+0x76 0x0076 #LATIN SMALL LETTER V\r
+0x77 0x0077 #LATIN SMALL LETTER W\r
+0x78 0x0078 #LATIN SMALL LETTER X\r
+0x79 0x0079 #LATIN SMALL LETTER Y\r
+0x7A 0x007A #LATIN SMALL LETTER Z\r
+0x7B 0x007B #LEFT CURLY BRACKET\r
+0x7C 0x007C #VERTICAL LINE\r
+0x7D 0x007D #RIGHT CURLY BRACKET\r
+0x7E 0x007E #TILDE\r
+0x7F 0x007F #DELETE\r
+0x80 0x20AC #EURO SIGN\r
+0x81 #UNDEFINED\r
+0x82 0x201A #SINGLE LOW-9 QUOTATION MARK\r
+0x83 0x0192 #LATIN SMALL LETTER F WITH HOOK\r
+0x84 0x201E #DOUBLE LOW-9 QUOTATION MARK\r
+0x85 0x2026 #HORIZONTAL ELLIPSIS\r
+0x86 0x2020 #DAGGER\r
+0x87 0x2021 #DOUBLE DAGGER\r
+0x88 0x02C6 #MODIFIER LETTER CIRCUMFLEX ACCENT\r
+0x89 0x2030 #PER MILLE SIGN\r
+0x8A 0x0160 #LATIN CAPITAL LETTER S WITH CARON\r
+0x8B 0x2039 #SINGLE LEFT-POINTING ANGLE QUOTATION MARK\r
+0x8C 0x0152 #LATIN CAPITAL LIGATURE OE\r
+0x8D #UNDEFINED\r
+0x8E 0x017D #LATIN CAPITAL LETTER Z WITH CARON\r
+0x8F #UNDEFINED\r
+0x90 #UNDEFINED\r
+0x91 0x2018 #LEFT SINGLE QUOTATION MARK\r
+0x92 0x2019 #RIGHT SINGLE QUOTATION MARK\r
+0x93 0x201C #LEFT DOUBLE QUOTATION MARK\r
+0x94 0x201D #RIGHT DOUBLE QUOTATION MARK\r
+0x95 0x2022 #BULLET\r
+0x96 0x2013 #EN DASH\r
+0x97 0x2014 #EM DASH\r
+0x98 0x02DC #SMALL TILDE\r
+0x99 0x2122 #TRADE MARK SIGN\r
+0x9A 0x0161 #LATIN SMALL LETTER S WITH CARON\r
+0x9B 0x203A #SINGLE RIGHT-POINTING ANGLE QUOTATION MARK\r
+0x9C 0x0153 #LATIN SMALL LIGATURE OE\r
+0x9D #UNDEFINED\r
+0x9E 0x017E #LATIN SMALL LETTER Z WITH CARON\r
+0x9F 0x0178 #LATIN CAPITAL LETTER Y WITH DIAERESIS\r
+0xA0 0x00A0 #NO-BREAK SPACE\r
+0xA1 0x00A1 #INVERTED EXCLAMATION MARK\r
+0xA2 0x00A2 #CENT SIGN\r
+0xA3 0x00A3 #POUND SIGN\r
+0xA4 0x00A4 #CURRENCY SIGN\r
+0xA5 0x00A5 #YEN SIGN\r
+0xA6 0x00A6 #BROKEN BAR\r
+0xA7 0x00A7 #SECTION SIGN\r
+0xA8 0x00A8 #DIAERESIS\r
+0xA9 0x00A9 #COPYRIGHT SIGN\r
+0xAA 0x00AA #FEMININE ORDINAL INDICATOR\r
+0xAB 0x00AB #LEFT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xAC 0x00AC #NOT SIGN\r
+0xAD 0x00AD #SOFT HYPHEN\r
+0xAE 0x00AE #REGISTERED SIGN\r
+0xAF 0x00AF #MACRON\r
+0xB0 0x00B0 #DEGREE SIGN\r
+0xB1 0x00B1 #PLUS-MINUS SIGN\r
+0xB2 0x00B2 #SUPERSCRIPT TWO\r
+0xB3 0x00B3 #SUPERSCRIPT THREE\r
+0xB4 0x00B4 #ACUTE ACCENT\r
+0xB5 0x00B5 #MICRO SIGN\r
+0xB6 0x00B6 #PILCROW SIGN\r
+0xB7 0x00B7 #MIDDLE DOT\r
+0xB8 0x00B8 #CEDILLA\r
+0xB9 0x00B9 #SUPERSCRIPT ONE\r
+0xBA 0x00BA #MASCULINE ORDINAL INDICATOR\r
+0xBB 0x00BB #RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xBC 0x00BC #VULGAR FRACTION ONE QUARTER\r
+0xBD 0x00BD #VULGAR FRACTION ONE HALF\r
+0xBE 0x00BE #VULGAR FRACTION THREE QUARTERS\r
+0xBF 0x00BF #INVERTED QUESTION MARK\r
+0xC0 0x00C0 #LATIN CAPITAL LETTER A WITH GRAVE\r
+0xC1 0x00C1 #LATIN CAPITAL LETTER A WITH ACUTE\r
+0xC2 0x00C2 #LATIN CAPITAL LETTER A WITH CIRCUMFLEX\r
+0xC3 0x00C3 #LATIN CAPITAL LETTER A WITH TILDE\r
+0xC4 0x00C4 #LATIN CAPITAL LETTER A WITH DIAERESIS\r
+0xC5 0x00C5 #LATIN CAPITAL LETTER A WITH RING ABOVE\r
+0xC6 0x00C6 #LATIN CAPITAL LETTER AE\r
+0xC7 0x00C7 #LATIN CAPITAL LETTER C WITH CEDILLA\r
+0xC8 0x00C8 #LATIN CAPITAL LETTER E WITH GRAVE\r
+0xC9 0x00C9 #LATIN CAPITAL LETTER E WITH ACUTE\r
+0xCA 0x00CA #LATIN CAPITAL LETTER E WITH CIRCUMFLEX\r
+0xCB 0x00CB #LATIN CAPITAL LETTER E WITH DIAERESIS\r
+0xCC 0x00CC #LATIN CAPITAL LETTER I WITH GRAVE\r
+0xCD 0x00CD #LATIN CAPITAL LETTER I WITH ACUTE\r
+0xCE 0x00CE #LATIN CAPITAL LETTER I WITH CIRCUMFLEX\r
+0xCF 0x00CF #LATIN CAPITAL LETTER I WITH DIAERESIS\r
+0xD0 0x00D0 #LATIN CAPITAL LETTER ETH\r
+0xD1 0x00D1 #LATIN CAPITAL LETTER N WITH TILDE\r
+0xD2 0x00D2 #LATIN CAPITAL LETTER O WITH GRAVE\r
+0xD3 0x00D3 #LATIN CAPITAL LETTER O WITH ACUTE\r
+0xD4 0x00D4 #LATIN CAPITAL LETTER O WITH CIRCUMFLEX\r
+0xD5 0x00D5 #LATIN CAPITAL LETTER O WITH TILDE\r
+0xD6 0x00D6 #LATIN CAPITAL LETTER O WITH DIAERESIS\r
+0xD7 0x00D7 #MULTIPLICATION SIGN\r
+0xD8 0x00D8 #LATIN CAPITAL LETTER O WITH STROKE\r
+0xD9 0x00D9 #LATIN CAPITAL LETTER U WITH GRAVE\r
+0xDA 0x00DA #LATIN CAPITAL LETTER U WITH ACUTE\r
+0xDB 0x00DB #LATIN CAPITAL LETTER U WITH CIRCUMFLEX\r
+0xDC 0x00DC #LATIN CAPITAL LETTER U WITH DIAERESIS\r
+0xDD 0x00DD #LATIN CAPITAL LETTER Y WITH ACUTE\r
+0xDE 0x00DE #LATIN CAPITAL LETTER THORN\r
+0xDF 0x00DF #LATIN SMALL LETTER SHARP S\r
+0xE0 0x00E0 #LATIN SMALL LETTER A WITH GRAVE\r
+0xE1 0x00E1 #LATIN SMALL LETTER A WITH ACUTE\r
+0xE2 0x00E2 #LATIN SMALL LETTER A WITH CIRCUMFLEX\r
+0xE3 0x00E3 #LATIN SMALL LETTER A WITH TILDE\r
+0xE4 0x00E4 #LATIN SMALL LETTER A WITH DIAERESIS\r
+0xE5 0x00E5 #LATIN SMALL LETTER A WITH RING ABOVE\r
+0xE6 0x00E6 #LATIN SMALL LETTER AE\r
+0xE7 0x00E7 #LATIN SMALL LETTER C WITH CEDILLA\r
+0xE8 0x00E8 #LATIN SMALL LETTER E WITH GRAVE\r
+0xE9 0x00E9 #LATIN SMALL LETTER E WITH ACUTE\r
+0xEA 0x00EA #LATIN SMALL LETTER E WITH CIRCUMFLEX\r
+0xEB 0x00EB #LATIN SMALL LETTER E WITH DIAERESIS\r
+0xEC 0x00EC #LATIN SMALL LETTER I WITH GRAVE\r
+0xED 0x00ED #LATIN SMALL LETTER I WITH ACUTE\r
+0xEE 0x00EE #LATIN SMALL LETTER I WITH CIRCUMFLEX\r
+0xEF 0x00EF #LATIN SMALL LETTER I WITH DIAERESIS\r
+0xF0 0x00F0 #LATIN SMALL LETTER ETH\r
+0xF1 0x00F1 #LATIN SMALL LETTER N WITH TILDE\r
+0xF2 0x00F2 #LATIN SMALL LETTER O WITH GRAVE\r
+0xF3 0x00F3 #LATIN SMALL LETTER O WITH ACUTE\r
+0xF4 0x00F4 #LATIN SMALL LETTER O WITH CIRCUMFLEX\r
+0xF5 0x00F5 #LATIN SMALL LETTER O WITH TILDE\r
+0xF6 0x00F6 #LATIN SMALL LETTER O WITH DIAERESIS\r
+0xF7 0x00F7 #DIVISION SIGN\r
+0xF8 0x00F8 #LATIN SMALL LETTER O WITH STROKE\r
+0xF9 0x00F9 #LATIN SMALL LETTER U WITH GRAVE\r
+0xFA 0x00FA #LATIN SMALL LETTER U WITH ACUTE\r
+0xFB 0x00FB #LATIN SMALL LETTER U WITH CIRCUMFLEX\r
+0xFC 0x00FC #LATIN SMALL LETTER U WITH DIAERESIS\r
+0xFD 0x00FD #LATIN SMALL LETTER Y WITH ACUTE\r
+0xFE 0x00FE #LATIN SMALL LETTER THORN\r
+0xFF 0x00FF #LATIN SMALL LETTER Y WITH DIAERESIS\r
--- /dev/null
+#\r
+# Name: cp1253 to Unicode table\r
+# Unicode version: 2.0\r
+# Table version: 2.00\r
+# Table format: Format A\r
+# Date: 04/15/98\r
+#\r
+# Contact: cpxlate@microsoft.com\r
+#\r
+# General notes: none\r
+#\r
+# Format: Three tab-separated columns\r
+# Column #1 is the cp1253 code (in hex)\r
+# Column #2 is the Unicode (in hex as 0xXXXX)\r
+# Column #3 is the Unicode name (follows a comment sign, '#')\r
+#\r
+# The entries are in cp1253 order\r
+#\r
+0x00 0x0000 #NULL\r
+0x01 0x0001 #START OF HEADING\r
+0x02 0x0002 #START OF TEXT\r
+0x03 0x0003 #END OF TEXT\r
+0x04 0x0004 #END OF TRANSMISSION\r
+0x05 0x0005 #ENQUIRY\r
+0x06 0x0006 #ACKNOWLEDGE\r
+0x07 0x0007 #BELL\r
+0x08 0x0008 #BACKSPACE\r
+0x09 0x0009 #HORIZONTAL TABULATION\r
+0x0A 0x000A #LINE FEED\r
+0x0B 0x000B #VERTICAL TABULATION\r
+0x0C 0x000C #FORM FEED\r
+0x0D 0x000D #CARRIAGE RETURN\r
+0x0E 0x000E #SHIFT OUT\r
+0x0F 0x000F #SHIFT IN\r
+0x10 0x0010 #DATA LINK ESCAPE\r
+0x11 0x0011 #DEVICE CONTROL ONE\r
+0x12 0x0012 #DEVICE CONTROL TWO\r
+0x13 0x0013 #DEVICE CONTROL THREE\r
+0x14 0x0014 #DEVICE CONTROL FOUR\r
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE\r
+0x16 0x0016 #SYNCHRONOUS IDLE\r
+0x17 0x0017 #END OF TRANSMISSION BLOCK\r
+0x18 0x0018 #CANCEL\r
+0x19 0x0019 #END OF MEDIUM\r
+0x1A 0x001A #SUBSTITUTE\r
+0x1B 0x001B #ESCAPE\r
+0x1C 0x001C #FILE SEPARATOR\r
+0x1D 0x001D #GROUP SEPARATOR\r
+0x1E 0x001E #RECORD SEPARATOR\r
+0x1F 0x001F #UNIT SEPARATOR\r
+0x20 0x0020 #SPACE\r
+0x21 0x0021 #EXCLAMATION MARK\r
+0x22 0x0022 #QUOTATION MARK\r
+0x23 0x0023 #NUMBER SIGN\r
+0x24 0x0024 #DOLLAR SIGN\r
+0x25 0x0025 #PERCENT SIGN\r
+0x26 0x0026 #AMPERSAND\r
+0x27 0x0027 #APOSTROPHE\r
+0x28 0x0028 #LEFT PARENTHESIS\r
+0x29 0x0029 #RIGHT PARENTHESIS\r
+0x2A 0x002A #ASTERISK\r
+0x2B 0x002B #PLUS SIGN\r
+0x2C 0x002C #COMMA\r
+0x2D 0x002D #HYPHEN-MINUS\r
+0x2E 0x002E #FULL STOP\r
+0x2F 0x002F #SOLIDUS\r
+0x30 0x0030 #DIGIT ZERO\r
+0x31 0x0031 #DIGIT ONE\r
+0x32 0x0032 #DIGIT TWO\r
+0x33 0x0033 #DIGIT THREE\r
+0x34 0x0034 #DIGIT FOUR\r
+0x35 0x0035 #DIGIT FIVE\r
+0x36 0x0036 #DIGIT SIX\r
+0x37 0x0037 #DIGIT SEVEN\r
+0x38 0x0038 #DIGIT EIGHT\r
+0x39 0x0039 #DIGIT NINE\r
+0x3A 0x003A #COLON\r
+0x3B 0x003B #SEMICOLON\r
+0x3C 0x003C #LESS-THAN SIGN\r
+0x3D 0x003D #EQUALS SIGN\r
+0x3E 0x003E #GREATER-THAN SIGN\r
+0x3F 0x003F #QUESTION MARK\r
+0x40 0x0040 #COMMERCIAL AT\r
+0x41 0x0041 #LATIN CAPITAL LETTER A\r
+0x42 0x0042 #LATIN CAPITAL LETTER B\r
+0x43 0x0043 #LATIN CAPITAL LETTER C\r
+0x44 0x0044 #LATIN CAPITAL LETTER D\r
+0x45 0x0045 #LATIN CAPITAL LETTER E\r
+0x46 0x0046 #LATIN CAPITAL LETTER F\r
+0x47 0x0047 #LATIN CAPITAL LETTER G\r
+0x48 0x0048 #LATIN CAPITAL LETTER H\r
+0x49 0x0049 #LATIN CAPITAL LETTER I\r
+0x4A 0x004A #LATIN CAPITAL LETTER J\r
+0x4B 0x004B #LATIN CAPITAL LETTER K\r
+0x4C 0x004C #LATIN CAPITAL LETTER L\r
+0x4D 0x004D #LATIN CAPITAL LETTER M\r
+0x4E 0x004E #LATIN CAPITAL LETTER N\r
+0x4F 0x004F #LATIN CAPITAL LETTER O\r
+0x50 0x0050 #LATIN CAPITAL LETTER P\r
+0x51 0x0051 #LATIN CAPITAL LETTER Q\r
+0x52 0x0052 #LATIN CAPITAL LETTER R\r
+0x53 0x0053 #LATIN CAPITAL LETTER S\r
+0x54 0x0054 #LATIN CAPITAL LETTER T\r
+0x55 0x0055 #LATIN CAPITAL LETTER U\r
+0x56 0x0056 #LATIN CAPITAL LETTER V\r
+0x57 0x0057 #LATIN CAPITAL LETTER W\r
+0x58 0x0058 #LATIN CAPITAL LETTER X\r
+0x59 0x0059 #LATIN CAPITAL LETTER Y\r
+0x5A 0x005A #LATIN CAPITAL LETTER Z\r
+0x5B 0x005B #LEFT SQUARE BRACKET\r
+0x5C 0x005C #REVERSE SOLIDUS\r
+0x5D 0x005D #RIGHT SQUARE BRACKET\r
+0x5E 0x005E #CIRCUMFLEX ACCENT\r
+0x5F 0x005F #LOW LINE\r
+0x60 0x0060 #GRAVE ACCENT\r
+0x61 0x0061 #LATIN SMALL LETTER A\r
+0x62 0x0062 #LATIN SMALL LETTER B\r
+0x63 0x0063 #LATIN SMALL LETTER C\r
+0x64 0x0064 #LATIN SMALL LETTER D\r
+0x65 0x0065 #LATIN SMALL LETTER E\r
+0x66 0x0066 #LATIN SMALL LETTER F\r
+0x67 0x0067 #LATIN SMALL LETTER G\r
+0x68 0x0068 #LATIN SMALL LETTER H\r
+0x69 0x0069 #LATIN SMALL LETTER I\r
+0x6A 0x006A #LATIN SMALL LETTER J\r
+0x6B 0x006B #LATIN SMALL LETTER K\r
+0x6C 0x006C #LATIN SMALL LETTER L\r
+0x6D 0x006D #LATIN SMALL LETTER M\r
+0x6E 0x006E #LATIN SMALL LETTER N\r
+0x6F 0x006F #LATIN SMALL LETTER O\r
+0x70 0x0070 #LATIN SMALL LETTER P\r
+0x71 0x0071 #LATIN SMALL LETTER Q\r
+0x72 0x0072 #LATIN SMALL LETTER R\r
+0x73 0x0073 #LATIN SMALL LETTER S\r
+0x74 0x0074 #LATIN SMALL LETTER T\r
+0x75 0x0075 #LATIN SMALL LETTER U\r
+0x76 0x0076 #LATIN SMALL LETTER V\r
+0x77 0x0077 #LATIN SMALL LETTER W\r
+0x78 0x0078 #LATIN SMALL LETTER X\r
+0x79 0x0079 #LATIN SMALL LETTER Y\r
+0x7A 0x007A #LATIN SMALL LETTER Z\r
+0x7B 0x007B #LEFT CURLY BRACKET\r
+0x7C 0x007C #VERTICAL LINE\r
+0x7D 0x007D #RIGHT CURLY BRACKET\r
+0x7E 0x007E #TILDE\r
+0x7F 0x007F #DELETE\r
+0x80 0x20AC #EURO SIGN\r
+0x81 #UNDEFINED\r
+0x82 0x201A #SINGLE LOW-9 QUOTATION MARK\r
+0x83 0x0192 #LATIN SMALL LETTER F WITH HOOK\r
+0x84 0x201E #DOUBLE LOW-9 QUOTATION MARK\r
+0x85 0x2026 #HORIZONTAL ELLIPSIS\r
+0x86 0x2020 #DAGGER\r
+0x87 0x2021 #DOUBLE DAGGER\r
+0x88 #UNDEFINED\r
+0x89 0x2030 #PER MILLE SIGN\r
+0x8A #UNDEFINED\r
+0x8B 0x2039 #SINGLE LEFT-POINTING ANGLE QUOTATION MARK\r
+0x8C #UNDEFINED\r
+0x8D #UNDEFINED\r
+0x8E #UNDEFINED\r
+0x8F #UNDEFINED\r
+0x90 #UNDEFINED\r
+0x91 0x2018 #LEFT SINGLE QUOTATION MARK\r
+0x92 0x2019 #RIGHT SINGLE QUOTATION MARK\r
+0x93 0x201C #LEFT DOUBLE QUOTATION MARK\r
+0x94 0x201D #RIGHT DOUBLE QUOTATION MARK\r
+0x95 0x2022 #BULLET\r
+0x96 0x2013 #EN DASH\r
+0x97 0x2014 #EM DASH\r
+0x98 #UNDEFINED\r
+0x99 0x2122 #TRADE MARK SIGN\r
+0x9A #UNDEFINED\r
+0x9B 0x203A #SINGLE RIGHT-POINTING ANGLE QUOTATION MARK\r
+0x9C #UNDEFINED\r
+0x9D #UNDEFINED\r
+0x9E #UNDEFINED\r
+0x9F #UNDEFINED\r
+0xA0 0x00A0 #NO-BREAK SPACE\r
+0xA1 0x0385 #GREEK DIALYTIKA TONOS\r
+0xA2 0x0386 #GREEK CAPITAL LETTER ALPHA WITH TONOS\r
+0xA3 0x00A3 #POUND SIGN\r
+0xA4 0x00A4 #CURRENCY SIGN\r
+0xA5 0x00A5 #YEN SIGN\r
+0xA6 0x00A6 #BROKEN BAR\r
+0xA7 0x00A7 #SECTION SIGN\r
+0xA8 0x00A8 #DIAERESIS\r
+0xA9 0x00A9 #COPYRIGHT SIGN\r
+0xAA #UNDEFINED\r
+0xAB 0x00AB #LEFT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xAC 0x00AC #NOT SIGN\r
+0xAD 0x00AD #SOFT HYPHEN\r
+0xAE 0x00AE #REGISTERED SIGN\r
+0xAF 0x2015 #HORIZONTAL BAR\r
+0xB0 0x00B0 #DEGREE SIGN\r
+0xB1 0x00B1 #PLUS-MINUS SIGN\r
+0xB2 0x00B2 #SUPERSCRIPT TWO\r
+0xB3 0x00B3 #SUPERSCRIPT THREE\r
+0xB4 0x0384 #GREEK TONOS\r
+0xB5 0x00B5 #MICRO SIGN\r
+0xB6 0x00B6 #PILCROW SIGN\r
+0xB7 0x00B7 #MIDDLE DOT\r
+0xB8 0x0388 #GREEK CAPITAL LETTER EPSILON WITH TONOS\r
+0xB9 0x0389 #GREEK CAPITAL LETTER ETA WITH TONOS\r
+0xBA 0x038A #GREEK CAPITAL LETTER IOTA WITH TONOS\r
+0xBB 0x00BB #RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xBC 0x038C #GREEK CAPITAL LETTER OMICRON WITH TONOS\r
+0xBD 0x00BD #VULGAR FRACTION ONE HALF\r
+0xBE 0x038E #GREEK CAPITAL LETTER UPSILON WITH TONOS\r
+0xBF 0x038F #GREEK CAPITAL LETTER OMEGA WITH TONOS\r
+0xC0 0x0390 #GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS\r
+0xC1 0x0391 #GREEK CAPITAL LETTER ALPHA\r
+0xC2 0x0392 #GREEK CAPITAL LETTER BETA\r
+0xC3 0x0393 #GREEK CAPITAL LETTER GAMMA\r
+0xC4 0x0394 #GREEK CAPITAL LETTER DELTA\r
+0xC5 0x0395 #GREEK CAPITAL LETTER EPSILON\r
+0xC6 0x0396 #GREEK CAPITAL LETTER ZETA\r
+0xC7 0x0397 #GREEK CAPITAL LETTER ETA\r
+0xC8 0x0398 #GREEK CAPITAL LETTER THETA\r
+0xC9 0x0399 #GREEK CAPITAL LETTER IOTA\r
+0xCA 0x039A #GREEK CAPITAL LETTER KAPPA\r
+0xCB 0x039B #GREEK CAPITAL LETTER LAMDA\r
+0xCC 0x039C #GREEK CAPITAL LETTER MU\r
+0xCD 0x039D #GREEK CAPITAL LETTER NU\r
+0xCE 0x039E #GREEK CAPITAL LETTER XI\r
+0xCF 0x039F #GREEK CAPITAL LETTER OMICRON\r
+0xD0 0x03A0 #GREEK CAPITAL LETTER PI\r
+0xD1 0x03A1 #GREEK CAPITAL LETTER RHO\r
+0xD2 #UNDEFINED\r
+0xD3 0x03A3 #GREEK CAPITAL LETTER SIGMA\r
+0xD4 0x03A4 #GREEK CAPITAL LETTER TAU\r
+0xD5 0x03A5 #GREEK CAPITAL LETTER UPSILON\r
+0xD6 0x03A6 #GREEK CAPITAL LETTER PHI\r
+0xD7 0x03A7 #GREEK CAPITAL LETTER CHI\r
+0xD8 0x03A8 #GREEK CAPITAL LETTER PSI\r
+0xD9 0x03A9 #GREEK CAPITAL LETTER OMEGA\r
+0xDA 0x03AA #GREEK CAPITAL LETTER IOTA WITH DIALYTIKA\r
+0xDB 0x03AB #GREEK CAPITAL LETTER UPSILON WITH DIALYTIKA\r
+0xDC 0x03AC #GREEK SMALL LETTER ALPHA WITH TONOS\r
+0xDD 0x03AD #GREEK SMALL LETTER EPSILON WITH TONOS\r
+0xDE 0x03AE #GREEK SMALL LETTER ETA WITH TONOS\r
+0xDF 0x03AF #GREEK SMALL LETTER IOTA WITH TONOS\r
+0xE0 0x03B0 #GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS\r
+0xE1 0x03B1 #GREEK SMALL LETTER ALPHA\r
+0xE2 0x03B2 #GREEK SMALL LETTER BETA\r
+0xE3 0x03B3 #GREEK SMALL LETTER GAMMA\r
+0xE4 0x03B4 #GREEK SMALL LETTER DELTA\r
+0xE5 0x03B5 #GREEK SMALL LETTER EPSILON\r
+0xE6 0x03B6 #GREEK SMALL LETTER ZETA\r
+0xE7 0x03B7 #GREEK SMALL LETTER ETA\r
+0xE8 0x03B8 #GREEK SMALL LETTER THETA\r
+0xE9 0x03B9 #GREEK SMALL LETTER IOTA\r
+0xEA 0x03BA #GREEK SMALL LETTER KAPPA\r
+0xEB 0x03BB #GREEK SMALL LETTER LAMDA\r
+0xEC 0x03BC #GREEK SMALL LETTER MU\r
+0xED 0x03BD #GREEK SMALL LETTER NU\r
+0xEE 0x03BE #GREEK SMALL LETTER XI\r
+0xEF 0x03BF #GREEK SMALL LETTER OMICRON\r
+0xF0 0x03C0 #GREEK SMALL LETTER PI\r
+0xF1 0x03C1 #GREEK SMALL LETTER RHO\r
+0xF2 0x03C2 #GREEK SMALL LETTER FINAL SIGMA\r
+0xF3 0x03C3 #GREEK SMALL LETTER SIGMA\r
+0xF4 0x03C4 #GREEK SMALL LETTER TAU\r
+0xF5 0x03C5 #GREEK SMALL LETTER UPSILON\r
+0xF6 0x03C6 #GREEK SMALL LETTER PHI\r
+0xF7 0x03C7 #GREEK SMALL LETTER CHI\r
+0xF8 0x03C8 #GREEK SMALL LETTER PSI\r
+0xF9 0x03C9 #GREEK SMALL LETTER OMEGA\r
+0xFA 0x03CA #GREEK SMALL LETTER IOTA WITH DIALYTIKA\r
+0xFB 0x03CB #GREEK SMALL LETTER UPSILON WITH DIALYTIKA\r
+0xFC 0x03CC #GREEK SMALL LETTER OMICRON WITH TONOS\r
+0xFD 0x03CD #GREEK SMALL LETTER UPSILON WITH TONOS\r
+0xFE 0x03CE #GREEK SMALL LETTER OMEGA WITH TONOS\r
+0xFF #UNDEFINED\r
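Note: the header of the table above describes "Format A" as three tab-separated columns (code-page byte, Unicode value, name after '#'), with undefined bytes carrying no Unicode value. A minimal sketch of reading one such line in C follows. The function name `parse_charset_line` and its return convention are illustrative assumptions, not catdoc's actual loader, and the sketch expects raw file lines without the leading `+` of this diff.

```c
#include <stdlib.h>

/*
 * Hypothetical helper (not part of catdoc): parse one line of a
 * "Format A" table, e.g. "0xA1\t0x0385\t#GREEK DIALYTIKA TONOS\r\n".
 *
 * Returns  1 and fills *code and *unicode for a mapped entry,
 *          0 for comment lines, blank lines and #UNDEFINED entries,
 *         -1 for malformed input.
 *
 * unsigned long holds the Unicode value because int may be only
 * 16 bits wide on some platforms.
 */
int parse_charset_line(const char *line, unsigned int *code,
		unsigned long *unicode)
{
	char *end;
	unsigned long c, u;

	/* Skip leading blanks */
	while (*line == ' ' || *line == '\t')
		line++;
	/* Comment or empty line: nothing to map */
	if (*line == '#' || *line == '\r' || *line == '\n' || *line == '\0')
		return 0;

	/* Column 1: code-page byte; base 16 accepts the "0x" prefix */
	c = strtoul(line, &end, 16);
	if (end == line || c > 0xFFUL)
		return -1;

	line = end;
	while (*line == ' ' || *line == '\t')
		line++;
	/* Column 2 is absent: the byte is listed but undefined */
	if (*line == '#' || *line == '\r' || *line == '\n' || *line == '\0')
		return 0;

	/* Column 2: Unicode value, written as 0xXXXX in these tables */
	u = strtoul(line, &end, 16);
	if (end == line || u > 0xFFFFUL)
		return -1;

	*code = (unsigned int)c;
	*unicode = u;
	return 1;
}
```

A caller would typically read the file in binary mode, pass each line to this function, and store `unicode` at index `code` of a 256-entry table, leaving undefined bytes at some sentinel value of its own choosing.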
--- /dev/null
+#\r
+# Name: cp1254 to Unicode table\r
+# Unicode version: 2.0\r
+# Table version: 2.00\r
+# Table format: Format A\r
+# Date: 04/15/98\r
+#\r
+# Contact: cpxlate@microsoft.com\r
+#\r
+# General notes: none\r
+#\r
+# Format: Three tab-separated columns\r
+# Column #1 is the cp1254 code (in hex)\r
+# Column #2 is the Unicode (in hex as 0xXXXX)\r
+# Column #3 is the Unicode name (follows a comment sign, '#')\r
+#\r
+# The entries are in cp1254 order\r
+#\r
+0x00 0x0000 #NULL\r
+0x01 0x0001 #START OF HEADING\r
+0x02 0x0002 #START OF TEXT\r
+0x03 0x0003 #END OF TEXT\r
+0x04 0x0004 #END OF TRANSMISSION\r
+0x05 0x0005 #ENQUIRY\r
+0x06 0x0006 #ACKNOWLEDGE\r
+0x07 0x0007 #BELL\r
+0x08 0x0008 #BACKSPACE\r
+0x09 0x0009 #HORIZONTAL TABULATION\r
+0x0A 0x000A #LINE FEED\r
+0x0B 0x000B #VERTICAL TABULATION\r
+0x0C 0x000C #FORM FEED\r
+0x0D 0x000D #CARRIAGE RETURN\r
+0x0E 0x000E #SHIFT OUT\r
+0x0F 0x000F #SHIFT IN\r
+0x10 0x0010 #DATA LINK ESCAPE\r
+0x11 0x0011 #DEVICE CONTROL ONE\r
+0x12 0x0012 #DEVICE CONTROL TWO\r
+0x13 0x0013 #DEVICE CONTROL THREE\r
+0x14 0x0014 #DEVICE CONTROL FOUR\r
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE\r
+0x16 0x0016 #SYNCHRONOUS IDLE\r
+0x17 0x0017 #END OF TRANSMISSION BLOCK\r
+0x18 0x0018 #CANCEL\r
+0x19 0x0019 #END OF MEDIUM\r
+0x1A 0x001A #SUBSTITUTE\r
+0x1B 0x001B #ESCAPE\r
+0x1C 0x001C #FILE SEPARATOR\r
+0x1D 0x001D #GROUP SEPARATOR\r
+0x1E 0x001E #RECORD SEPARATOR\r
+0x1F 0x001F #UNIT SEPARATOR\r
+0x20 0x0020 #SPACE\r
+0x21 0x0021 #EXCLAMATION MARK\r
+0x22 0x0022 #QUOTATION MARK\r
+0x23 0x0023 #NUMBER SIGN\r
+0x24 0x0024 #DOLLAR SIGN\r
+0x25 0x0025 #PERCENT SIGN\r
+0x26 0x0026 #AMPERSAND\r
+0x27 0x0027 #APOSTROPHE\r
+0x28 0x0028 #LEFT PARENTHESIS\r
+0x29 0x0029 #RIGHT PARENTHESIS\r
+0x2A 0x002A #ASTERISK\r
+0x2B 0x002B #PLUS SIGN\r
+0x2C 0x002C #COMMA\r
+0x2D 0x002D #HYPHEN-MINUS\r
+0x2E 0x002E #FULL STOP\r
+0x2F 0x002F #SOLIDUS\r
+0x30 0x0030 #DIGIT ZERO\r
+0x31 0x0031 #DIGIT ONE\r
+0x32 0x0032 #DIGIT TWO\r
+0x33 0x0033 #DIGIT THREE\r
+0x34 0x0034 #DIGIT FOUR\r
+0x35 0x0035 #DIGIT FIVE\r
+0x36 0x0036 #DIGIT SIX\r
+0x37 0x0037 #DIGIT SEVEN\r
+0x38 0x0038 #DIGIT EIGHT\r
+0x39 0x0039 #DIGIT NINE\r
+0x3A 0x003A #COLON\r
+0x3B 0x003B #SEMICOLON\r
+0x3C 0x003C #LESS-THAN SIGN\r
+0x3D 0x003D #EQUALS SIGN\r
+0x3E 0x003E #GREATER-THAN SIGN\r
+0x3F 0x003F #QUESTION MARK\r
+0x40 0x0040 #COMMERCIAL AT\r
+0x41 0x0041 #LATIN CAPITAL LETTER A\r
+0x42 0x0042 #LATIN CAPITAL LETTER B\r
+0x43 0x0043 #LATIN CAPITAL LETTER C\r
+0x44 0x0044 #LATIN CAPITAL LETTER D\r
+0x45 0x0045 #LATIN CAPITAL LETTER E\r
+0x46 0x0046 #LATIN CAPITAL LETTER F\r
+0x47 0x0047 #LATIN CAPITAL LETTER G\r
+0x48 0x0048 #LATIN CAPITAL LETTER H\r
+0x49 0x0049 #LATIN CAPITAL LETTER I\r
+0x4A 0x004A #LATIN CAPITAL LETTER J\r
+0x4B 0x004B #LATIN CAPITAL LETTER K\r
+0x4C 0x004C #LATIN CAPITAL LETTER L\r
+0x4D 0x004D #LATIN CAPITAL LETTER M\r
+0x4E 0x004E #LATIN CAPITAL LETTER N\r
+0x4F 0x004F #LATIN CAPITAL LETTER O\r
+0x50 0x0050 #LATIN CAPITAL LETTER P\r
+0x51 0x0051 #LATIN CAPITAL LETTER Q\r
+0x52 0x0052 #LATIN CAPITAL LETTER R\r
+0x53 0x0053 #LATIN CAPITAL LETTER S\r
+0x54 0x0054 #LATIN CAPITAL LETTER T\r
+0x55 0x0055 #LATIN CAPITAL LETTER U\r
+0x56 0x0056 #LATIN CAPITAL LETTER V\r
+0x57 0x0057 #LATIN CAPITAL LETTER W\r
+0x58 0x0058 #LATIN CAPITAL LETTER X\r
+0x59 0x0059 #LATIN CAPITAL LETTER Y\r
+0x5A 0x005A #LATIN CAPITAL LETTER Z\r
+0x5B 0x005B #LEFT SQUARE BRACKET\r
+0x5C 0x005C #REVERSE SOLIDUS\r
+0x5D 0x005D #RIGHT SQUARE BRACKET\r
+0x5E 0x005E #CIRCUMFLEX ACCENT\r
+0x5F 0x005F #LOW LINE\r
+0x60 0x0060 #GRAVE ACCENT\r
+0x61 0x0061 #LATIN SMALL LETTER A\r
+0x62 0x0062 #LATIN SMALL LETTER B\r
+0x63 0x0063 #LATIN SMALL LETTER C\r
+0x64 0x0064 #LATIN SMALL LETTER D\r
+0x65 0x0065 #LATIN SMALL LETTER E\r
+0x66 0x0066 #LATIN SMALL LETTER F\r
+0x67 0x0067 #LATIN SMALL LETTER G\r
+0x68 0x0068 #LATIN SMALL LETTER H\r
+0x69 0x0069 #LATIN SMALL LETTER I\r
+0x6A 0x006A #LATIN SMALL LETTER J\r
+0x6B 0x006B #LATIN SMALL LETTER K\r
+0x6C 0x006C #LATIN SMALL LETTER L\r
+0x6D 0x006D #LATIN SMALL LETTER M\r
+0x6E 0x006E #LATIN SMALL LETTER N\r
+0x6F 0x006F #LATIN SMALL LETTER O\r
+0x70 0x0070 #LATIN SMALL LETTER P\r
+0x71 0x0071 #LATIN SMALL LETTER Q\r
+0x72 0x0072 #LATIN SMALL LETTER R\r
+0x73 0x0073 #LATIN SMALL LETTER S\r
+0x74 0x0074 #LATIN SMALL LETTER T\r
+0x75 0x0075 #LATIN SMALL LETTER U\r
+0x76 0x0076 #LATIN SMALL LETTER V\r
+0x77 0x0077 #LATIN SMALL LETTER W\r
+0x78 0x0078 #LATIN SMALL LETTER X\r
+0x79 0x0079 #LATIN SMALL LETTER Y\r
+0x7A 0x007A #LATIN SMALL LETTER Z\r
+0x7B 0x007B #LEFT CURLY BRACKET\r
+0x7C 0x007C #VERTICAL LINE\r
+0x7D 0x007D #RIGHT CURLY BRACKET\r
+0x7E 0x007E #TILDE\r
+0x7F 0x007F #DELETE\r
+0x80 0x20AC #EURO SIGN\r
+0x81 #UNDEFINED\r
+0x82 0x201A #SINGLE LOW-9 QUOTATION MARK\r
+0x83 0x0192 #LATIN SMALL LETTER F WITH HOOK\r
+0x84 0x201E #DOUBLE LOW-9 QUOTATION MARK\r
+0x85 0x2026 #HORIZONTAL ELLIPSIS\r
+0x86 0x2020 #DAGGER\r
+0x87 0x2021 #DOUBLE DAGGER\r
+0x88 0x02C6 #MODIFIER LETTER CIRCUMFLEX ACCENT\r
+0x89 0x2030 #PER MILLE SIGN\r
+0x8A 0x0160 #LATIN CAPITAL LETTER S WITH CARON\r
+0x8B 0x2039 #SINGLE LEFT-POINTING ANGLE QUOTATION MARK\r
+0x8C 0x0152 #LATIN CAPITAL LIGATURE OE\r
+0x8D #UNDEFINED\r
+0x8E #UNDEFINED\r
+0x8F #UNDEFINED\r
+0x90 #UNDEFINED\r
+0x91 0x2018 #LEFT SINGLE QUOTATION MARK\r
+0x92 0x2019 #RIGHT SINGLE QUOTATION MARK\r
+0x93 0x201C #LEFT DOUBLE QUOTATION MARK\r
+0x94 0x201D #RIGHT DOUBLE QUOTATION MARK\r
+0x95 0x2022 #BULLET\r
+0x96 0x2013 #EN DASH\r
+0x97 0x2014 #EM DASH\r
+0x98 0x02DC #SMALL TILDE\r
+0x99 0x2122 #TRADE MARK SIGN\r
+0x9A 0x0161 #LATIN SMALL LETTER S WITH CARON\r
+0x9B 0x203A #SINGLE RIGHT-POINTING ANGLE QUOTATION MARK\r
+0x9C 0x0153 #LATIN SMALL LIGATURE OE\r
+0x9D #UNDEFINED\r
+0x9E #UNDEFINED\r
+0x9F 0x0178 #LATIN CAPITAL LETTER Y WITH DIAERESIS\r
+0xA0 0x00A0 #NO-BREAK SPACE\r
+0xA1 0x00A1 #INVERTED EXCLAMATION MARK\r
+0xA2 0x00A2 #CENT SIGN\r
+0xA3 0x00A3 #POUND SIGN\r
+0xA4 0x00A4 #CURRENCY SIGN\r
+0xA5 0x00A5 #YEN SIGN\r
+0xA6 0x00A6 #BROKEN BAR\r
+0xA7 0x00A7 #SECTION SIGN\r
+0xA8 0x00A8 #DIAERESIS\r
+0xA9 0x00A9 #COPYRIGHT SIGN\r
+0xAA 0x00AA #FEMININE ORDINAL INDICATOR\r
+0xAB 0x00AB #LEFT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xAC 0x00AC #NOT SIGN\r
+0xAD 0x00AD #SOFT HYPHEN\r
+0xAE 0x00AE #REGISTERED SIGN\r
+0xAF 0x00AF #MACRON\r
+0xB0 0x00B0 #DEGREE SIGN\r
+0xB1 0x00B1 #PLUS-MINUS SIGN\r
+0xB2 0x00B2 #SUPERSCRIPT TWO\r
+0xB3 0x00B3 #SUPERSCRIPT THREE\r
+0xB4 0x00B4 #ACUTE ACCENT\r
+0xB5 0x00B5 #MICRO SIGN\r
+0xB6 0x00B6 #PILCROW SIGN\r
+0xB7 0x00B7 #MIDDLE DOT\r
+0xB8 0x00B8 #CEDILLA\r
+0xB9 0x00B9 #SUPERSCRIPT ONE\r
+0xBA 0x00BA #MASCULINE ORDINAL INDICATOR\r
+0xBB 0x00BB #RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xBC 0x00BC #VULGAR FRACTION ONE QUARTER\r
+0xBD 0x00BD #VULGAR FRACTION ONE HALF\r
+0xBE 0x00BE #VULGAR FRACTION THREE QUARTERS\r
+0xBF 0x00BF #INVERTED QUESTION MARK\r
+0xC0 0x00C0 #LATIN CAPITAL LETTER A WITH GRAVE\r
+0xC1 0x00C1 #LATIN CAPITAL LETTER A WITH ACUTE\r
+0xC2 0x00C2 #LATIN CAPITAL LETTER A WITH CIRCUMFLEX\r
+0xC3 0x00C3 #LATIN CAPITAL LETTER A WITH TILDE\r
+0xC4 0x00C4 #LATIN CAPITAL LETTER A WITH DIAERESIS\r
+0xC5 0x00C5 #LATIN CAPITAL LETTER A WITH RING ABOVE\r
+0xC6 0x00C6 #LATIN CAPITAL LETTER AE\r
+0xC7 0x00C7 #LATIN CAPITAL LETTER C WITH CEDILLA\r
+0xC8 0x00C8 #LATIN CAPITAL LETTER E WITH GRAVE\r
+0xC9 0x00C9 #LATIN CAPITAL LETTER E WITH ACUTE\r
+0xCA 0x00CA #LATIN CAPITAL LETTER E WITH CIRCUMFLEX\r
+0xCB 0x00CB #LATIN CAPITAL LETTER E WITH DIAERESIS\r
+0xCC 0x00CC #LATIN CAPITAL LETTER I WITH GRAVE\r
+0xCD 0x00CD #LATIN CAPITAL LETTER I WITH ACUTE\r
+0xCE 0x00CE #LATIN CAPITAL LETTER I WITH CIRCUMFLEX\r
+0xCF 0x00CF #LATIN CAPITAL LETTER I WITH DIAERESIS\r
+0xD0 0x011E #LATIN CAPITAL LETTER G WITH BREVE\r
+0xD1 0x00D1 #LATIN CAPITAL LETTER N WITH TILDE\r
+0xD2 0x00D2 #LATIN CAPITAL LETTER O WITH GRAVE\r
+0xD3 0x00D3 #LATIN CAPITAL LETTER O WITH ACUTE\r
+0xD4 0x00D4 #LATIN CAPITAL LETTER O WITH CIRCUMFLEX\r
+0xD5 0x00D5 #LATIN CAPITAL LETTER O WITH TILDE\r
+0xD6 0x00D6 #LATIN CAPITAL LETTER O WITH DIAERESIS\r
+0xD7 0x00D7 #MULTIPLICATION SIGN\r
+0xD8 0x00D8 #LATIN CAPITAL LETTER O WITH STROKE\r
+0xD9 0x00D9 #LATIN CAPITAL LETTER U WITH GRAVE\r
+0xDA 0x00DA #LATIN CAPITAL LETTER U WITH ACUTE\r
+0xDB 0x00DB #LATIN CAPITAL LETTER U WITH CIRCUMFLEX\r
+0xDC 0x00DC #LATIN CAPITAL LETTER U WITH DIAERESIS\r
+0xDD 0x0130 #LATIN CAPITAL LETTER I WITH DOT ABOVE\r
+0xDE 0x015E #LATIN CAPITAL LETTER S WITH CEDILLA\r
+0xDF 0x00DF #LATIN SMALL LETTER SHARP S\r
+0xE0 0x00E0 #LATIN SMALL LETTER A WITH GRAVE\r
+0xE1 0x00E1 #LATIN SMALL LETTER A WITH ACUTE\r
+0xE2 0x00E2 #LATIN SMALL LETTER A WITH CIRCUMFLEX\r
+0xE3 0x00E3 #LATIN SMALL LETTER A WITH TILDE\r
+0xE4 0x00E4 #LATIN SMALL LETTER A WITH DIAERESIS\r
+0xE5 0x00E5 #LATIN SMALL LETTER A WITH RING ABOVE\r
+0xE6 0x00E6 #LATIN SMALL LETTER AE\r
+0xE7 0x00E7 #LATIN SMALL LETTER C WITH CEDILLA\r
+0xE8 0x00E8 #LATIN SMALL LETTER E WITH GRAVE\r
+0xE9 0x00E9 #LATIN SMALL LETTER E WITH ACUTE\r
+0xEA 0x00EA #LATIN SMALL LETTER E WITH CIRCUMFLEX\r
+0xEB 0x00EB #LATIN SMALL LETTER E WITH DIAERESIS\r
+0xEC 0x00EC #LATIN SMALL LETTER I WITH GRAVE\r
+0xED 0x00ED #LATIN SMALL LETTER I WITH ACUTE\r
+0xEE 0x00EE #LATIN SMALL LETTER I WITH CIRCUMFLEX\r
+0xEF 0x00EF #LATIN SMALL LETTER I WITH DIAERESIS\r
+0xF0 0x011F #LATIN SMALL LETTER G WITH BREVE\r
+0xF1 0x00F1 #LATIN SMALL LETTER N WITH TILDE\r
+0xF2 0x00F2 #LATIN SMALL LETTER O WITH GRAVE\r
+0xF3 0x00F3 #LATIN SMALL LETTER O WITH ACUTE\r
+0xF4 0x00F4 #LATIN SMALL LETTER O WITH CIRCUMFLEX\r
+0xF5 0x00F5 #LATIN SMALL LETTER O WITH TILDE\r
+0xF6 0x00F6 #LATIN SMALL LETTER O WITH DIAERESIS\r
+0xF7 0x00F7 #DIVISION SIGN\r
+0xF8 0x00F8 #LATIN SMALL LETTER O WITH STROKE\r
+0xF9 0x00F9 #LATIN SMALL LETTER U WITH GRAVE\r
+0xFA 0x00FA #LATIN SMALL LETTER U WITH ACUTE\r
+0xFB 0x00FB #LATIN SMALL LETTER U WITH CIRCUMFLEX\r
+0xFC 0x00FC #LATIN SMALL LETTER U WITH DIAERESIS\r
+0xFD 0x0131 #LATIN SMALL LETTER DOTLESS I\r
+0xFE 0x015F #LATIN SMALL LETTER S WITH CEDILLA\r
+0xFF 0x00FF #LATIN SMALL LETTER Y WITH DIAERESIS\r
--- /dev/null
+#\r
+# Name: cp1255 to Unicode table\r
+# Unicode version: 2.0\r
+# Table version: 2.00\r
+# Table format: Format A\r
+# Date: 04/15/98\r
+#\r
+# Contact: cpxlate@microsoft.com\r
+#\r
+# General notes: none\r
+#\r
+# Format: Three tab-separated columns\r
+# Column #1 is the cp1255 code (in hex)\r
+# Column #2 is the Unicode (in hex as 0xXXXX)\r
+# Column #3 is the Unicode name (follows a comment sign, '#')\r
+#\r
+# The entries are in cp1255 order\r
+#\r
+0x00 0x0000 #NULL\r
+0x01 0x0001 #START OF HEADING\r
+0x02 0x0002 #START OF TEXT\r
+0x03 0x0003 #END OF TEXT\r
+0x04 0x0004 #END OF TRANSMISSION\r
+0x05 0x0005 #ENQUIRY\r
+0x06 0x0006 #ACKNOWLEDGE\r
+0x07 0x0007 #BELL\r
+0x08 0x0008 #BACKSPACE\r
+0x09 0x0009 #HORIZONTAL TABULATION\r
+0x0A 0x000A #LINE FEED\r
+0x0B 0x000B #VERTICAL TABULATION\r
+0x0C 0x000C #FORM FEED\r
+0x0D 0x000D #CARRIAGE RETURN\r
+0x0E 0x000E #SHIFT OUT\r
+0x0F 0x000F #SHIFT IN\r
+0x10 0x0010 #DATA LINK ESCAPE\r
+0x11 0x0011 #DEVICE CONTROL ONE\r
+0x12 0x0012 #DEVICE CONTROL TWO\r
+0x13 0x0013 #DEVICE CONTROL THREE\r
+0x14 0x0014 #DEVICE CONTROL FOUR\r
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE\r
+0x16 0x0016 #SYNCHRONOUS IDLE\r
+0x17 0x0017 #END OF TRANSMISSION BLOCK\r
+0x18 0x0018 #CANCEL\r
+0x19 0x0019 #END OF MEDIUM\r
+0x1A 0x001A #SUBSTITUTE\r
+0x1B 0x001B #ESCAPE\r
+0x1C 0x001C #FILE SEPARATOR\r
+0x1D 0x001D #GROUP SEPARATOR\r
+0x1E 0x001E #RECORD SEPARATOR\r
+0x1F 0x001F #UNIT SEPARATOR\r
+0x20 0x0020 #SPACE\r
+0x21 0x0021 #EXCLAMATION MARK\r
+0x22 0x0022 #QUOTATION MARK\r
+0x23 0x0023 #NUMBER SIGN\r
+0x24 0x0024 #DOLLAR SIGN\r
+0x25 0x0025 #PERCENT SIGN\r
+0x26 0x0026 #AMPERSAND\r
+0x27 0x0027 #APOSTROPHE\r
+0x28 0x0028 #LEFT PARENTHESIS\r
+0x29 0x0029 #RIGHT PARENTHESIS\r
+0x2A 0x002A #ASTERISK\r
+0x2B 0x002B #PLUS SIGN\r
+0x2C 0x002C #COMMA\r
+0x2D 0x002D #HYPHEN-MINUS\r
+0x2E 0x002E #FULL STOP\r
+0x2F 0x002F #SOLIDUS\r
+0x30 0x0030 #DIGIT ZERO\r
+0x31 0x0031 #DIGIT ONE\r
+0x32 0x0032 #DIGIT TWO\r
+0x33 0x0033 #DIGIT THREE\r
+0x34 0x0034 #DIGIT FOUR\r
+0x35 0x0035 #DIGIT FIVE\r
+0x36 0x0036 #DIGIT SIX\r
+0x37 0x0037 #DIGIT SEVEN\r
+0x38 0x0038 #DIGIT EIGHT\r
+0x39 0x0039 #DIGIT NINE\r
+0x3A 0x003A #COLON\r
+0x3B 0x003B #SEMICOLON\r
+0x3C 0x003C #LESS-THAN SIGN\r
+0x3D 0x003D #EQUALS SIGN\r
+0x3E 0x003E #GREATER-THAN SIGN\r
+0x3F 0x003F #QUESTION MARK\r
+0x40 0x0040 #COMMERCIAL AT\r
+0x41 0x0041 #LATIN CAPITAL LETTER A\r
+0x42 0x0042 #LATIN CAPITAL LETTER B\r
+0x43 0x0043 #LATIN CAPITAL LETTER C\r
+0x44 0x0044 #LATIN CAPITAL LETTER D\r
+0x45 0x0045 #LATIN CAPITAL LETTER E\r
+0x46 0x0046 #LATIN CAPITAL LETTER F\r
+0x47 0x0047 #LATIN CAPITAL LETTER G\r
+0x48 0x0048 #LATIN CAPITAL LETTER H\r
+0x49 0x0049 #LATIN CAPITAL LETTER I\r
+0x4A 0x004A #LATIN CAPITAL LETTER J\r
+0x4B 0x004B #LATIN CAPITAL LETTER K\r
+0x4C 0x004C #LATIN CAPITAL LETTER L\r
+0x4D 0x004D #LATIN CAPITAL LETTER M\r
+0x4E 0x004E #LATIN CAPITAL LETTER N\r
+0x4F 0x004F #LATIN CAPITAL LETTER O\r
+0x50 0x0050 #LATIN CAPITAL LETTER P\r
+0x51 0x0051 #LATIN CAPITAL LETTER Q\r
+0x52 0x0052 #LATIN CAPITAL LETTER R\r
+0x53 0x0053 #LATIN CAPITAL LETTER S\r
+0x54 0x0054 #LATIN CAPITAL LETTER T\r
+0x55 0x0055 #LATIN CAPITAL LETTER U\r
+0x56 0x0056 #LATIN CAPITAL LETTER V\r
+0x57 0x0057 #LATIN CAPITAL LETTER W\r
+0x58 0x0058 #LATIN CAPITAL LETTER X\r
+0x59 0x0059 #LATIN CAPITAL LETTER Y\r
+0x5A 0x005A #LATIN CAPITAL LETTER Z\r
+0x5B 0x005B #LEFT SQUARE BRACKET\r
+0x5C 0x005C #REVERSE SOLIDUS\r
+0x5D 0x005D #RIGHT SQUARE BRACKET\r
+0x5E 0x005E #CIRCUMFLEX ACCENT\r
+0x5F 0x005F #LOW LINE\r
+0x60 0x0060 #GRAVE ACCENT\r
+0x61 0x0061 #LATIN SMALL LETTER A\r
+0x62 0x0062 #LATIN SMALL LETTER B\r
+0x63 0x0063 #LATIN SMALL LETTER C\r
+0x64 0x0064 #LATIN SMALL LETTER D\r
+0x65 0x0065 #LATIN SMALL LETTER E\r
+0x66 0x0066 #LATIN SMALL LETTER F\r
+0x67 0x0067 #LATIN SMALL LETTER G\r
+0x68 0x0068 #LATIN SMALL LETTER H\r
+0x69 0x0069 #LATIN SMALL LETTER I\r
+0x6A 0x006A #LATIN SMALL LETTER J\r
+0x6B 0x006B #LATIN SMALL LETTER K\r
+0x6C 0x006C #LATIN SMALL LETTER L\r
+0x6D 0x006D #LATIN SMALL LETTER M\r
+0x6E 0x006E #LATIN SMALL LETTER N\r
+0x6F 0x006F #LATIN SMALL LETTER O\r
+0x70 0x0070 #LATIN SMALL LETTER P\r
+0x71 0x0071 #LATIN SMALL LETTER Q\r
+0x72 0x0072 #LATIN SMALL LETTER R\r
+0x73 0x0073 #LATIN SMALL LETTER S\r
+0x74 0x0074 #LATIN SMALL LETTER T\r
+0x75 0x0075 #LATIN SMALL LETTER U\r
+0x76 0x0076 #LATIN SMALL LETTER V\r
+0x77 0x0077 #LATIN SMALL LETTER W\r
+0x78 0x0078 #LATIN SMALL LETTER X\r
+0x79 0x0079 #LATIN SMALL LETTER Y\r
+0x7A 0x007A #LATIN SMALL LETTER Z\r
+0x7B 0x007B #LEFT CURLY BRACKET\r
+0x7C 0x007C #VERTICAL LINE\r
+0x7D 0x007D #RIGHT CURLY BRACKET\r
+0x7E 0x007E #TILDE\r
+0x7F 0x007F #DELETE\r
+0x80 0x20AC #EURO SIGN\r
+0x81 #UNDEFINED\r
+0x82 0x201A #SINGLE LOW-9 QUOTATION MARK\r
+0x83 0x0192 #LATIN SMALL LETTER F WITH HOOK\r
+0x84 0x201E #DOUBLE LOW-9 QUOTATION MARK\r
+0x85 0x2026 #HORIZONTAL ELLIPSIS\r
+0x86 0x2020 #DAGGER\r
+0x87 0x2021 #DOUBLE DAGGER\r
+0x88 0x02C6 #MODIFIER LETTER CIRCUMFLEX ACCENT\r
+0x89 0x2030 #PER MILLE SIGN\r
+0x8A #UNDEFINED\r
+0x8B 0x2039 #SINGLE LEFT-POINTING ANGLE QUOTATION MARK\r
+0x8C #UNDEFINED\r
+0x8D #UNDEFINED\r
+0x8E #UNDEFINED\r
+0x8F #UNDEFINED\r
+0x90 #UNDEFINED\r
+0x91 0x2018 #LEFT SINGLE QUOTATION MARK\r
+0x92 0x2019 #RIGHT SINGLE QUOTATION MARK\r
+0x93 0x201C #LEFT DOUBLE QUOTATION MARK\r
+0x94 0x201D #RIGHT DOUBLE QUOTATION MARK\r
+0x95 0x2022 #BULLET\r
+0x96 0x2013 #EN DASH\r
+0x97 0x2014 #EM DASH\r
+0x98 0x02DC #SMALL TILDE\r
+0x99 0x2122 #TRADE MARK SIGN\r
+0x9A #UNDEFINED\r
+0x9B 0x203A #SINGLE RIGHT-POINTING ANGLE QUOTATION MARK\r
+0x9C #UNDEFINED\r
+0x9D #UNDEFINED\r
+0x9E #UNDEFINED\r
+0x9F #UNDEFINED\r
+0xA0 0x00A0 #NO-BREAK SPACE\r
+0xA1 0x00A1 #INVERTED EXCLAMATION MARK\r
+0xA2 0x00A2 #CENT SIGN\r
+0xA3 0x00A3 #POUND SIGN\r
+0xA4 0x20AA #NEW SHEQEL SIGN\r
+0xA5 0x00A5 #YEN SIGN\r
+0xA6 0x00A6 #BROKEN BAR\r
+0xA7 0x00A7 #SECTION SIGN\r
+0xA8 0x00A8 #DIAERESIS\r
+0xA9 0x00A9 #COPYRIGHT SIGN\r
+0xAA 0x00D7 #MULTIPLICATION SIGN\r
+0xAB 0x00AB #LEFT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xAC 0x00AC #NOT SIGN\r
+0xAD 0x00AD #SOFT HYPHEN\r
+0xAE 0x00AE #REGISTERED SIGN\r
+0xAF 0x00AF #MACRON\r
+0xB0 0x00B0 #DEGREE SIGN\r
+0xB1 0x00B1 #PLUS-MINUS SIGN\r
+0xB2 0x00B2 #SUPERSCRIPT TWO\r
+0xB3 0x00B3 #SUPERSCRIPT THREE\r
+0xB4 0x00B4 #ACUTE ACCENT\r
+0xB5 0x00B5 #MICRO SIGN\r
+0xB6 0x00B6 #PILCROW SIGN\r
+0xB7 0x00B7 #MIDDLE DOT\r
+0xB8 0x00B8 #CEDILLA\r
+0xB9 0x00B9 #SUPERSCRIPT ONE\r
+0xBA 0x00F7 #DIVISION SIGN\r
+0xBB 0x00BB #RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xBC 0x00BC #VULGAR FRACTION ONE QUARTER\r
+0xBD 0x00BD #VULGAR FRACTION ONE HALF\r
+0xBE 0x00BE #VULGAR FRACTION THREE QUARTERS\r
+0xBF 0x00BF #INVERTED QUESTION MARK\r
+0xC0 0x05B0 #HEBREW POINT SHEVA\r
+0xC1 0x05B1 #HEBREW POINT HATAF SEGOL\r
+0xC2 0x05B2 #HEBREW POINT HATAF PATAH\r
+0xC3 0x05B3 #HEBREW POINT HATAF QAMATS\r
+0xC4 0x05B4 #HEBREW POINT HIRIQ\r
+0xC5 0x05B5 #HEBREW POINT TSERE\r
+0xC6 0x05B6 #HEBREW POINT SEGOL\r
+0xC7 0x05B7 #HEBREW POINT PATAH\r
+0xC8 0x05B8 #HEBREW POINT QAMATS\r
+0xC9 0x05B9 #HEBREW POINT HOLAM\r
+0xCA #UNDEFINED\r
+0xCB 0x05BB #HEBREW POINT QUBUTS\r
+0xCC 0x05BC #HEBREW POINT DAGESH OR MAPIQ\r
+0xCD 0x05BD #HEBREW POINT METEG\r
+0xCE 0x05BE #HEBREW PUNCTUATION MAQAF\r
+0xCF 0x05BF #HEBREW POINT RAFE\r
+0xD0 0x05C0 #HEBREW PUNCTUATION PASEQ\r
+0xD1 0x05C1 #HEBREW POINT SHIN DOT\r
+0xD2 0x05C2 #HEBREW POINT SIN DOT\r
+0xD3 0x05C3 #HEBREW PUNCTUATION SOF PASUQ\r
+0xD4 0x05F0 #HEBREW LIGATURE YIDDISH DOUBLE VAV\r
+0xD5 0x05F1 #HEBREW LIGATURE YIDDISH VAV YOD\r
+0xD6 0x05F2 #HEBREW LIGATURE YIDDISH DOUBLE YOD\r
+0xD7 0x05F3 #HEBREW PUNCTUATION GERESH\r
+0xD8 0x05F4 #HEBREW PUNCTUATION GERSHAYIM\r
+0xD9 #UNDEFINED\r
+0xDA #UNDEFINED\r
+0xDB #UNDEFINED\r
+0xDC #UNDEFINED\r
+0xDD #UNDEFINED\r
+0xDE #UNDEFINED\r
+0xDF #UNDEFINED\r
+0xE0 0x05D0 #HEBREW LETTER ALEF\r
+0xE1 0x05D1 #HEBREW LETTER BET\r
+0xE2 0x05D2 #HEBREW LETTER GIMEL\r
+0xE3 0x05D3 #HEBREW LETTER DALET\r
+0xE4 0x05D4 #HEBREW LETTER HE\r
+0xE5 0x05D5 #HEBREW LETTER VAV\r
+0xE6 0x05D6 #HEBREW LETTER ZAYIN\r
+0xE7 0x05D7 #HEBREW LETTER HET\r
+0xE8 0x05D8 #HEBREW LETTER TET\r
+0xE9 0x05D9 #HEBREW LETTER YOD\r
+0xEA 0x05DA #HEBREW LETTER FINAL KAF\r
+0xEB 0x05DB #HEBREW LETTER KAF\r
+0xEC 0x05DC #HEBREW LETTER LAMED\r
+0xED 0x05DD #HEBREW LETTER FINAL MEM\r
+0xEE 0x05DE #HEBREW LETTER MEM\r
+0xEF 0x05DF #HEBREW LETTER FINAL NUN\r
+0xF0 0x05E0 #HEBREW LETTER NUN\r
+0xF1 0x05E1 #HEBREW LETTER SAMEKH\r
+0xF2 0x05E2 #HEBREW LETTER AYIN\r
+0xF3 0x05E3 #HEBREW LETTER FINAL PE\r
+0xF4 0x05E4 #HEBREW LETTER PE\r
+0xF5 0x05E5 #HEBREW LETTER FINAL TSADI\r
+0xF6 0x05E6 #HEBREW LETTER TSADI\r
+0xF7 0x05E7 #HEBREW LETTER QOF\r
+0xF8 0x05E8 #HEBREW LETTER RESH\r
+0xF9 0x05E9 #HEBREW LETTER SHIN\r
+0xFA 0x05EA #HEBREW LETTER TAV\r
+0xFB #UNDEFINED\r
+0xFC #UNDEFINED\r
+0xFD 0x200E #LEFT-TO-RIGHT MARK\r
+0xFE 0x200F #RIGHT-TO-LEFT MARK\r
+0xFF #UNDEFINED\r
--- /dev/null
+#\r
+# Name: cp1256 to Unicode table\r
+# Unicode version: 2.0\r
+# Table version: 2.00\r
+# Table format: Format A\r
+# Date: 04/15/98\r
+#\r
+# Contact: cpxlate@microsoft.com\r
+#\r
+# General notes: none\r
+#\r
+# Format: Three tab-separated columns\r
+# Column #1 is the cp1256 code (in hex)\r
+# Column #2 is the Unicode (in hex as 0xXXXX)\r
+# Column #3 is the Unicode name (follows a comment sign, '#')\r
+#\r
+# The entries are in cp1256 order\r
+#\r
+0x00 0x0000 #NULL\r
+0x01 0x0001 #START OF HEADING\r
+0x02 0x0002 #START OF TEXT\r
+0x03 0x0003 #END OF TEXT\r
+0x04 0x0004 #END OF TRANSMISSION\r
+0x05 0x0005 #ENQUIRY\r
+0x06 0x0006 #ACKNOWLEDGE\r
+0x07 0x0007 #BELL\r
+0x08 0x0008 #BACKSPACE\r
+0x09 0x0009 #HORIZONTAL TABULATION\r
+0x0A 0x000A #LINE FEED\r
+0x0B 0x000B #VERTICAL TABULATION\r
+0x0C 0x000C #FORM FEED\r
+0x0D 0x000D #CARRIAGE RETURN\r
+0x0E 0x000E #SHIFT OUT\r
+0x0F 0x000F #SHIFT IN\r
+0x10 0x0010 #DATA LINK ESCAPE\r
+0x11 0x0011 #DEVICE CONTROL ONE\r
+0x12 0x0012 #DEVICE CONTROL TWO\r
+0x13 0x0013 #DEVICE CONTROL THREE\r
+0x14 0x0014 #DEVICE CONTROL FOUR\r
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE\r
+0x16 0x0016 #SYNCHRONOUS IDLE\r
+0x17 0x0017 #END OF TRANSMISSION BLOCK\r
+0x18 0x0018 #CANCEL\r
+0x19 0x0019 #END OF MEDIUM\r
+0x1A 0x001A #SUBSTITUTE\r
+0x1B 0x001B #ESCAPE\r
+0x1C 0x001C #FILE SEPARATOR\r
+0x1D 0x001D #GROUP SEPARATOR\r
+0x1E 0x001E #RECORD SEPARATOR\r
+0x1F 0x001F #UNIT SEPARATOR\r
+0x20 0x0020 #SPACE\r
+0x21 0x0021 #EXCLAMATION MARK\r
+0x22 0x0022 #QUOTATION MARK\r
+0x23 0x0023 #NUMBER SIGN\r
+0x24 0x0024 #DOLLAR SIGN\r
+0x25 0x0025 #PERCENT SIGN\r
+0x26 0x0026 #AMPERSAND\r
+0x27 0x0027 #APOSTROPHE\r
+0x28 0x0028 #LEFT PARENTHESIS\r
+0x29 0x0029 #RIGHT PARENTHESIS\r
+0x2A 0x002A #ASTERISK\r
+0x2B 0x002B #PLUS SIGN\r
+0x2C 0x002C #COMMA\r
+0x2D 0x002D #HYPHEN-MINUS\r
+0x2E 0x002E #FULL STOP\r
+0x2F 0x002F #SOLIDUS\r
+0x30 0x0030 #DIGIT ZERO\r
+0x31 0x0031 #DIGIT ONE\r
+0x32 0x0032 #DIGIT TWO\r
+0x33 0x0033 #DIGIT THREE\r
+0x34 0x0034 #DIGIT FOUR\r
+0x35 0x0035 #DIGIT FIVE\r
+0x36 0x0036 #DIGIT SIX\r
+0x37 0x0037 #DIGIT SEVEN\r
+0x38 0x0038 #DIGIT EIGHT\r
+0x39 0x0039 #DIGIT NINE\r
+0x3A 0x003A #COLON\r
+0x3B 0x003B #SEMICOLON\r
+0x3C 0x003C #LESS-THAN SIGN\r
+0x3D 0x003D #EQUALS SIGN\r
+0x3E 0x003E #GREATER-THAN SIGN\r
+0x3F 0x003F #QUESTION MARK\r
+0x40 0x0040 #COMMERCIAL AT\r
+0x41 0x0041 #LATIN CAPITAL LETTER A\r
+0x42 0x0042 #LATIN CAPITAL LETTER B\r
+0x43 0x0043 #LATIN CAPITAL LETTER C\r
+0x44 0x0044 #LATIN CAPITAL LETTER D\r
+0x45 0x0045 #LATIN CAPITAL LETTER E\r
+0x46 0x0046 #LATIN CAPITAL LETTER F\r
+0x47 0x0047 #LATIN CAPITAL LETTER G\r
+0x48 0x0048 #LATIN CAPITAL LETTER H\r
+0x49 0x0049 #LATIN CAPITAL LETTER I\r
+0x4A 0x004A #LATIN CAPITAL LETTER J\r
+0x4B 0x004B #LATIN CAPITAL LETTER K\r
+0x4C 0x004C #LATIN CAPITAL LETTER L\r
+0x4D 0x004D #LATIN CAPITAL LETTER M\r
+0x4E 0x004E #LATIN CAPITAL LETTER N\r
+0x4F 0x004F #LATIN CAPITAL LETTER O\r
+0x50 0x0050 #LATIN CAPITAL LETTER P\r
+0x51 0x0051 #LATIN CAPITAL LETTER Q\r
+0x52 0x0052 #LATIN CAPITAL LETTER R\r
+0x53 0x0053 #LATIN CAPITAL LETTER S\r
+0x54 0x0054 #LATIN CAPITAL LETTER T\r
+0x55 0x0055 #LATIN CAPITAL LETTER U\r
+0x56 0x0056 #LATIN CAPITAL LETTER V\r
+0x57 0x0057 #LATIN CAPITAL LETTER W\r
+0x58 0x0058 #LATIN CAPITAL LETTER X\r
+0x59 0x0059 #LATIN CAPITAL LETTER Y\r
+0x5A 0x005A #LATIN CAPITAL LETTER Z\r
+0x5B 0x005B #LEFT SQUARE BRACKET\r
+0x5C 0x005C #REVERSE SOLIDUS\r
+0x5D 0x005D #RIGHT SQUARE BRACKET\r
+0x5E 0x005E #CIRCUMFLEX ACCENT\r
+0x5F 0x005F #LOW LINE\r
+0x60 0x0060 #GRAVE ACCENT\r
+0x61 0x0061 #LATIN SMALL LETTER A\r
+0x62 0x0062 #LATIN SMALL LETTER B\r
+0x63 0x0063 #LATIN SMALL LETTER C\r
+0x64 0x0064 #LATIN SMALL LETTER D\r
+0x65 0x0065 #LATIN SMALL LETTER E\r
+0x66 0x0066 #LATIN SMALL LETTER F\r
+0x67 0x0067 #LATIN SMALL LETTER G\r
+0x68 0x0068 #LATIN SMALL LETTER H\r
+0x69 0x0069 #LATIN SMALL LETTER I\r
+0x6A 0x006A #LATIN SMALL LETTER J\r
+0x6B 0x006B #LATIN SMALL LETTER K\r
+0x6C 0x006C #LATIN SMALL LETTER L\r
+0x6D 0x006D #LATIN SMALL LETTER M\r
+0x6E 0x006E #LATIN SMALL LETTER N\r
+0x6F 0x006F #LATIN SMALL LETTER O\r
+0x70 0x0070 #LATIN SMALL LETTER P\r
+0x71 0x0071 #LATIN SMALL LETTER Q\r
+0x72 0x0072 #LATIN SMALL LETTER R\r
+0x73 0x0073 #LATIN SMALL LETTER S\r
+0x74 0x0074 #LATIN SMALL LETTER T\r
+0x75 0x0075 #LATIN SMALL LETTER U\r
+0x76 0x0076 #LATIN SMALL LETTER V\r
+0x77 0x0077 #LATIN SMALL LETTER W\r
+0x78 0x0078 #LATIN SMALL LETTER X\r
+0x79 0x0079 #LATIN SMALL LETTER Y\r
+0x7A 0x007A #LATIN SMALL LETTER Z\r
+0x7B 0x007B #LEFT CURLY BRACKET\r
+0x7C 0x007C #VERTICAL LINE\r
+0x7D 0x007D #RIGHT CURLY BRACKET\r
+0x7E 0x007E #TILDE\r
+0x7F 0x007F #DELETE\r
+0x80 0x20AC #EURO SIGN\r
+0x81 0x067E #ARABIC LETTER PEH\r
+0x82 0x201A #SINGLE LOW-9 QUOTATION MARK\r
+0x83 0x0192 #LATIN SMALL LETTER F WITH HOOK\r
+0x84 0x201E #DOUBLE LOW-9 QUOTATION MARK\r
+0x85 0x2026 #HORIZONTAL ELLIPSIS\r
+0x86 0x2020 #DAGGER\r
+0x87 0x2021 #DOUBLE DAGGER\r
+0x88 0x02C6 #MODIFIER LETTER CIRCUMFLEX ACCENT\r
+0x89 0x2030 #PER MILLE SIGN\r
+0x8A #UNDEFINED\r
+0x8B 0x2039 #SINGLE LEFT-POINTING ANGLE QUOTATION MARK\r
+0x8C 0x0152 #LATIN CAPITAL LIGATURE OE\r
+0x8D 0x0686 #ARABIC LETTER TCHEH\r
+0x8E 0x0698 #ARABIC LETTER JEH\r
+0x8F #UNDEFINED\r
+0x90 0x06AF #ARABIC LETTER GAF\r
+0x91 0x2018 #LEFT SINGLE QUOTATION MARK\r
+0x92 0x2019 #RIGHT SINGLE QUOTATION MARK\r
+0x93 0x201C #LEFT DOUBLE QUOTATION MARK\r
+0x94 0x201D #RIGHT DOUBLE QUOTATION MARK\r
+0x95 0x2022 #BULLET\r
+0x96 0x2013 #EN DASH\r
+0x97 0x2014 #EM DASH\r
+0x98 #UNDEFINED\r
+0x99 0x2122 #TRADE MARK SIGN\r
+0x9A #UNDEFINED\r
+0x9B 0x203A #SINGLE RIGHT-POINTING ANGLE QUOTATION MARK\r
+0x9C 0x0153 #LATIN SMALL LIGATURE OE\r
+0x9D 0x200C #ZERO WIDTH NON-JOINER\r
+0x9E 0x200D #ZERO WIDTH JOINER\r
+0x9F #UNDEFINED\r
+0xA0 0x00A0 #NO-BREAK SPACE\r
+0xA1 0x060C #ARABIC COMMA\r
+0xA2 0x00A2 #CENT SIGN\r
+0xA3 0x00A3 #POUND SIGN\r
+0xA4 0x00A4 #CURRENCY SIGN\r
+0xA5 0x00A5 #YEN SIGN\r
+0xA6 0x00A6 #BROKEN BAR\r
+0xA7 0x00A7 #SECTION SIGN\r
+0xA8 0x00A8 #DIAERESIS\r
+0xA9 0x00A9 #COPYRIGHT SIGN\r
+0xAA #UNDEFINED\r
+0xAB 0x00AB #LEFT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xAC 0x00AC #NOT SIGN\r
+0xAD 0x00AD #SOFT HYPHEN\r
+0xAE 0x00AE #REGISTERED SIGN\r
+0xAF 0x00AF #MACRON\r
+0xB0 0x00B0 #DEGREE SIGN\r
+0xB1 0x00B1 #PLUS-MINUS SIGN\r
+0xB2 0x00B2 #SUPERSCRIPT TWO\r
+0xB3 0x00B3 #SUPERSCRIPT THREE\r
+0xB4 0x00B4 #ACUTE ACCENT\r
+0xB5 0x00B5 #MICRO SIGN\r
+0xB6 0x00B6 #PILCROW SIGN\r
+0xB7 0x00B7 #MIDDLE DOT\r
+0xB8 0x00B8 #CEDILLA\r
+0xB9 0x00B9 #SUPERSCRIPT ONE\r
+0xBA 0x061B #ARABIC SEMICOLON\r
+0xBB 0x00BB #RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xBC 0x00BC #VULGAR FRACTION ONE QUARTER\r
+0xBD 0x00BD #VULGAR FRACTION ONE HALF\r
+0xBE 0x00BE #VULGAR FRACTION THREE QUARTERS\r
+0xBF 0x061F #ARABIC QUESTION MARK\r
+0xC0 #UNDEFINED\r
+0xC1 0x0621 #ARABIC LETTER HAMZA\r
+0xC2 0x0622 #ARABIC LETTER ALEF WITH MADDA ABOVE\r
+0xC3 0x0623 #ARABIC LETTER ALEF WITH HAMZA ABOVE\r
+0xC4 0x0624 #ARABIC LETTER WAW WITH HAMZA ABOVE\r
+0xC5 0x0625 #ARABIC LETTER ALEF WITH HAMZA BELOW\r
+0xC6 0x0626 #ARABIC LETTER YEH WITH HAMZA ABOVE\r
+0xC7 0x0627 #ARABIC LETTER ALEF\r
+0xC8 0x0628 #ARABIC LETTER BEH\r
+0xC9 0x0629 #ARABIC LETTER TEH MARBUTA\r
+0xCA 0x062A #ARABIC LETTER TEH\r
+0xCB 0x062B #ARABIC LETTER THEH\r
+0xCC 0x062C #ARABIC LETTER JEEM\r
+0xCD 0x062D #ARABIC LETTER HAH\r
+0xCE 0x062E #ARABIC LETTER KHAH\r
+0xCF 0x062F #ARABIC LETTER DAL\r
+0xD0 0x0630 #ARABIC LETTER THAL\r
+0xD1 0x0631 #ARABIC LETTER REH\r
+0xD2 0x0632 #ARABIC LETTER ZAIN\r
+0xD3 0x0633 #ARABIC LETTER SEEN\r
+0xD4 0x0634 #ARABIC LETTER SHEEN\r
+0xD5 0x0635 #ARABIC LETTER SAD\r
+0xD6 0x0636 #ARABIC LETTER DAD\r
+0xD7 0x00D7 #MULTIPLICATION SIGN\r
+0xD8 0x0637 #ARABIC LETTER TAH\r
+0xD9 0x0638 #ARABIC LETTER ZAH\r
+0xDA 0x0639 #ARABIC LETTER AIN\r
+0xDB 0x063A #ARABIC LETTER GHAIN\r
+0xDC 0x0640 #ARABIC TATWEEL\r
+0xDD 0x0641 #ARABIC LETTER FEH\r
+0xDE 0x0642 #ARABIC LETTER QAF\r
+0xDF 0x0643 #ARABIC LETTER KAF\r
+0xE0 0x00E0 #LATIN SMALL LETTER A WITH GRAVE\r
+0xE1 0x0644 #ARABIC LETTER LAM\r
+0xE2 0x00E2 #LATIN SMALL LETTER A WITH CIRCUMFLEX\r
+0xE3 0x0645 #ARABIC LETTER MEEM\r
+0xE4 0x0646 #ARABIC LETTER NOON\r
+0xE5 0x0647 #ARABIC LETTER HEH\r
+0xE6 0x0648 #ARABIC LETTER WAW\r
+0xE7 0x00E7 #LATIN SMALL LETTER C WITH CEDILLA\r
+0xE8 0x00E8 #LATIN SMALL LETTER E WITH GRAVE\r
+0xE9 0x00E9 #LATIN SMALL LETTER E WITH ACUTE\r
+0xEA 0x00EA #LATIN SMALL LETTER E WITH CIRCUMFLEX\r
+0xEB 0x00EB #LATIN SMALL LETTER E WITH DIAERESIS\r
+0xEC 0x0649 #ARABIC LETTER ALEF MAKSURA\r
+0xED 0x064A #ARABIC LETTER YEH\r
+0xEE 0x00EE #LATIN SMALL LETTER I WITH CIRCUMFLEX\r
+0xEF 0x00EF #LATIN SMALL LETTER I WITH DIAERESIS\r
+0xF0 0x064B #ARABIC FATHATAN\r
+0xF1 0x064C #ARABIC DAMMATAN\r
+0xF2 0x064D #ARABIC KASRATAN\r
+0xF3 0x064E #ARABIC FATHA\r
+0xF4 0x00F4 #LATIN SMALL LETTER O WITH CIRCUMFLEX\r
+0xF5 0x064F #ARABIC DAMMA\r
+0xF6 0x0650 #ARABIC KASRA\r
+0xF7 0x00F7 #DIVISION SIGN\r
+0xF8 0x0651 #ARABIC SHADDA\r
+0xF9 0x00F9 #LATIN SMALL LETTER U WITH GRAVE\r
+0xFA 0x0652 #ARABIC SUKUN\r
+0xFB 0x00FB #LATIN SMALL LETTER U WITH CIRCUMFLEX\r
+0xFC 0x00FC #LATIN SMALL LETTER U WITH DIAERESIS\r
+0xFD 0x200E #LEFT-TO-RIGHT MARK\r
+0xFE 0x200F #RIGHT-TO-LEFT MARK\r
+0xFF #UNDEFINED\r
--- /dev/null
+#\r
+# Name: cp1257 to Unicode table\r
+# Unicode version: 2.0\r
+# Table version: 2.00\r
+# Table format: Format A\r
+# Date: 04/15/98\r
+#\r
+# Contact: cpxlate@microsoft.com\r
+#\r
+# General notes: none\r
+#\r
+# Format: Three tab-separated columns\r
+# Column #1 is the cp1257 code (in hex)\r
+# Column #2 is the Unicode (in hex as 0xXXXX)\r
+# Column #3 is the Unicode name (follows a comment sign, '#')\r
+#\r
+# The entries are in cp1257 order\r
+#\r
+0x00 0x0000 #NULL\r
+0x01 0x0001 #START OF HEADING\r
+0x02 0x0002 #START OF TEXT\r
+0x03 0x0003 #END OF TEXT\r
+0x04 0x0004 #END OF TRANSMISSION\r
+0x05 0x0005 #ENQUIRY\r
+0x06 0x0006 #ACKNOWLEDGE\r
+0x07 0x0007 #BELL\r
+0x08 0x0008 #BACKSPACE\r
+0x09 0x0009 #HORIZONTAL TABULATION\r
+0x0A 0x000A #LINE FEED\r
+0x0B 0x000B #VERTICAL TABULATION\r
+0x0C 0x000C #FORM FEED\r
+0x0D 0x000D #CARRIAGE RETURN\r
+0x0E 0x000E #SHIFT OUT\r
+0x0F 0x000F #SHIFT IN\r
+0x10 0x0010 #DATA LINK ESCAPE\r
+0x11 0x0011 #DEVICE CONTROL ONE\r
+0x12 0x0012 #DEVICE CONTROL TWO\r
+0x13 0x0013 #DEVICE CONTROL THREE\r
+0x14 0x0014 #DEVICE CONTROL FOUR\r
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE\r
+0x16 0x0016 #SYNCHRONOUS IDLE\r
+0x17 0x0017 #END OF TRANSMISSION BLOCK\r
+0x18 0x0018 #CANCEL\r
+0x19 0x0019 #END OF MEDIUM\r
+0x1A 0x001A #SUBSTITUTE\r
+0x1B 0x001B #ESCAPE\r
+0x1C 0x001C #FILE SEPARATOR\r
+0x1D 0x001D #GROUP SEPARATOR\r
+0x1E 0x001E #RECORD SEPARATOR\r
+0x1F 0x001F #UNIT SEPARATOR\r
+0x20 0x0020 #SPACE\r
+0x21 0x0021 #EXCLAMATION MARK\r
+0x22 0x0022 #QUOTATION MARK\r
+0x23 0x0023 #NUMBER SIGN\r
+0x24 0x0024 #DOLLAR SIGN\r
+0x25 0x0025 #PERCENT SIGN\r
+0x26 0x0026 #AMPERSAND\r
+0x27 0x0027 #APOSTROPHE\r
+0x28 0x0028 #LEFT PARENTHESIS\r
+0x29 0x0029 #RIGHT PARENTHESIS\r
+0x2A 0x002A #ASTERISK\r
+0x2B 0x002B #PLUS SIGN\r
+0x2C 0x002C #COMMA\r
+0x2D 0x002D #HYPHEN-MINUS\r
+0x2E 0x002E #FULL STOP\r
+0x2F 0x002F #SOLIDUS\r
+0x30 0x0030 #DIGIT ZERO\r
+0x31 0x0031 #DIGIT ONE\r
+0x32 0x0032 #DIGIT TWO\r
+0x33 0x0033 #DIGIT THREE\r
+0x34 0x0034 #DIGIT FOUR\r
+0x35 0x0035 #DIGIT FIVE\r
+0x36 0x0036 #DIGIT SIX\r
+0x37 0x0037 #DIGIT SEVEN\r
+0x38 0x0038 #DIGIT EIGHT\r
+0x39 0x0039 #DIGIT NINE\r
+0x3A 0x003A #COLON\r
+0x3B 0x003B #SEMICOLON\r
+0x3C 0x003C #LESS-THAN SIGN\r
+0x3D 0x003D #EQUALS SIGN\r
+0x3E 0x003E #GREATER-THAN SIGN\r
+0x3F 0x003F #QUESTION MARK\r
+0x40 0x0040 #COMMERCIAL AT\r
+0x41 0x0041 #LATIN CAPITAL LETTER A\r
+0x42 0x0042 #LATIN CAPITAL LETTER B\r
+0x43 0x0043 #LATIN CAPITAL LETTER C\r
+0x44 0x0044 #LATIN CAPITAL LETTER D\r
+0x45 0x0045 #LATIN CAPITAL LETTER E\r
+0x46 0x0046 #LATIN CAPITAL LETTER F\r
+0x47 0x0047 #LATIN CAPITAL LETTER G\r
+0x48 0x0048 #LATIN CAPITAL LETTER H\r
+0x49 0x0049 #LATIN CAPITAL LETTER I\r
+0x4A 0x004A #LATIN CAPITAL LETTER J\r
+0x4B 0x004B #LATIN CAPITAL LETTER K\r
+0x4C 0x004C #LATIN CAPITAL LETTER L\r
+0x4D 0x004D #LATIN CAPITAL LETTER M\r
+0x4E 0x004E #LATIN CAPITAL LETTER N\r
+0x4F 0x004F #LATIN CAPITAL LETTER O\r
+0x50 0x0050 #LATIN CAPITAL LETTER P\r
+0x51 0x0051 #LATIN CAPITAL LETTER Q\r
+0x52 0x0052 #LATIN CAPITAL LETTER R\r
+0x53 0x0053 #LATIN CAPITAL LETTER S\r
+0x54 0x0054 #LATIN CAPITAL LETTER T\r
+0x55 0x0055 #LATIN CAPITAL LETTER U\r
+0x56 0x0056 #LATIN CAPITAL LETTER V\r
+0x57 0x0057 #LATIN CAPITAL LETTER W\r
+0x58 0x0058 #LATIN CAPITAL LETTER X\r
+0x59 0x0059 #LATIN CAPITAL LETTER Y\r
+0x5A 0x005A #LATIN CAPITAL LETTER Z\r
+0x5B 0x005B #LEFT SQUARE BRACKET\r
+0x5C 0x005C #REVERSE SOLIDUS\r
+0x5D 0x005D #RIGHT SQUARE BRACKET\r
+0x5E 0x005E #CIRCUMFLEX ACCENT\r
+0x5F 0x005F #LOW LINE\r
+0x60 0x0060 #GRAVE ACCENT\r
+0x61 0x0061 #LATIN SMALL LETTER A\r
+0x62 0x0062 #LATIN SMALL LETTER B\r
+0x63 0x0063 #LATIN SMALL LETTER C\r
+0x64 0x0064 #LATIN SMALL LETTER D\r
+0x65 0x0065 #LATIN SMALL LETTER E\r
+0x66 0x0066 #LATIN SMALL LETTER F\r
+0x67 0x0067 #LATIN SMALL LETTER G\r
+0x68 0x0068 #LATIN SMALL LETTER H\r
+0x69 0x0069 #LATIN SMALL LETTER I\r
+0x6A 0x006A #LATIN SMALL LETTER J\r
+0x6B 0x006B #LATIN SMALL LETTER K\r
+0x6C 0x006C #LATIN SMALL LETTER L\r
+0x6D 0x006D #LATIN SMALL LETTER M\r
+0x6E 0x006E #LATIN SMALL LETTER N\r
+0x6F 0x006F #LATIN SMALL LETTER O\r
+0x70 0x0070 #LATIN SMALL LETTER P\r
+0x71 0x0071 #LATIN SMALL LETTER Q\r
+0x72 0x0072 #LATIN SMALL LETTER R\r
+0x73 0x0073 #LATIN SMALL LETTER S\r
+0x74 0x0074 #LATIN SMALL LETTER T\r
+0x75 0x0075 #LATIN SMALL LETTER U\r
+0x76 0x0076 #LATIN SMALL LETTER V\r
+0x77 0x0077 #LATIN SMALL LETTER W\r
+0x78 0x0078 #LATIN SMALL LETTER X\r
+0x79 0x0079 #LATIN SMALL LETTER Y\r
+0x7A 0x007A #LATIN SMALL LETTER Z\r
+0x7B 0x007B #LEFT CURLY BRACKET\r
+0x7C 0x007C #VERTICAL LINE\r
+0x7D 0x007D #RIGHT CURLY BRACKET\r
+0x7E 0x007E #TILDE\r
+0x7F 0x007F #DELETE\r
+0x80 0x20AC #EURO SIGN\r
+0x81 #UNDEFINED\r
+0x82 0x201A #SINGLE LOW-9 QUOTATION MARK\r
+0x83 #UNDEFINED\r
+0x84 0x201E #DOUBLE LOW-9 QUOTATION MARK\r
+0x85 0x2026 #HORIZONTAL ELLIPSIS\r
+0x86 0x2020 #DAGGER\r
+0x87 0x2021 #DOUBLE DAGGER\r
+0x88 #UNDEFINED\r
+0x89 0x2030 #PER MILLE SIGN\r
+0x8A #UNDEFINED\r
+0x8B 0x2039 #SINGLE LEFT-POINTING ANGLE QUOTATION MARK\r
+0x8C #UNDEFINED\r
+0x8D 0x00A8 #DIAERESIS\r
+0x8E 0x02C7 #CARON\r
+0x8F 0x00B8 #CEDILLA\r
+0x90 #UNDEFINED\r
+0x91 0x2018 #LEFT SINGLE QUOTATION MARK\r
+0x92 0x2019 #RIGHT SINGLE QUOTATION MARK\r
+0x93 0x201C #LEFT DOUBLE QUOTATION MARK\r
+0x94 0x201D #RIGHT DOUBLE QUOTATION MARK\r
+0x95 0x2022 #BULLET\r
+0x96 0x2013 #EN DASH\r
+0x97 0x2014 #EM DASH\r
+0x98 #UNDEFINED\r
+0x99 0x2122 #TRADE MARK SIGN\r
+0x9A #UNDEFINED\r
+0x9B 0x203A #SINGLE RIGHT-POINTING ANGLE QUOTATION MARK\r
+0x9C #UNDEFINED\r
+0x9D 0x00AF #MACRON\r
+0x9E 0x02DB #OGONEK\r
+0x9F #UNDEFINED\r
+0xA0 0x00A0 #NO-BREAK SPACE\r
+0xA1 #UNDEFINED\r
+0xA2 0x00A2 #CENT SIGN\r
+0xA3 0x00A3 #POUND SIGN\r
+0xA4 0x00A4 #CURRENCY SIGN\r
+0xA5 #UNDEFINED\r
+0xA6 0x00A6 #BROKEN BAR\r
+0xA7 0x00A7 #SECTION SIGN\r
+0xA8 0x00D8 #LATIN CAPITAL LETTER O WITH STROKE\r
+0xA9 0x00A9 #COPYRIGHT SIGN\r
+0xAA 0x0156 #LATIN CAPITAL LETTER R WITH CEDILLA\r
+0xAB 0x00AB #LEFT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xAC 0x00AC #NOT SIGN\r
+0xAD 0x00AD #SOFT HYPHEN\r
+0xAE 0x00AE #REGISTERED SIGN\r
+0xAF 0x00C6 #LATIN CAPITAL LETTER AE\r
+0xB0 0x00B0 #DEGREE SIGN\r
+0xB1 0x00B1 #PLUS-MINUS SIGN\r
+0xB2 0x00B2 #SUPERSCRIPT TWO\r
+0xB3 0x00B3 #SUPERSCRIPT THREE\r
+0xB4 0x00B4 #ACUTE ACCENT\r
+0xB5 0x00B5 #MICRO SIGN\r
+0xB6 0x00B6 #PILCROW SIGN\r
+0xB7 0x00B7 #MIDDLE DOT\r
+0xB8 0x00F8 #LATIN SMALL LETTER O WITH STROKE\r
+0xB9 0x00B9 #SUPERSCRIPT ONE\r
+0xBA 0x0157 #LATIN SMALL LETTER R WITH CEDILLA\r
+0xBB 0x00BB #RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xBC 0x00BC #VULGAR FRACTION ONE QUARTER\r
+0xBD 0x00BD #VULGAR FRACTION ONE HALF\r
+0xBE 0x00BE #VULGAR FRACTION THREE QUARTERS\r
+0xBF 0x00E6 #LATIN SMALL LETTER AE\r
+0xC0 0x0104 #LATIN CAPITAL LETTER A WITH OGONEK\r
+0xC1 0x012E #LATIN CAPITAL LETTER I WITH OGONEK\r
+0xC2 0x0100 #LATIN CAPITAL LETTER A WITH MACRON\r
+0xC3 0x0106 #LATIN CAPITAL LETTER C WITH ACUTE\r
+0xC4 0x00C4 #LATIN CAPITAL LETTER A WITH DIAERESIS\r
+0xC5 0x00C5 #LATIN CAPITAL LETTER A WITH RING ABOVE\r
+0xC6 0x0118 #LATIN CAPITAL LETTER E WITH OGONEK\r
+0xC7 0x0112 #LATIN CAPITAL LETTER E WITH MACRON\r
+0xC8 0x010C #LATIN CAPITAL LETTER C WITH CARON\r
+0xC9 0x00C9 #LATIN CAPITAL LETTER E WITH ACUTE\r
+0xCA 0x0179 #LATIN CAPITAL LETTER Z WITH ACUTE\r
+0xCB 0x0116 #LATIN CAPITAL LETTER E WITH DOT ABOVE\r
+0xCC 0x0122 #LATIN CAPITAL LETTER G WITH CEDILLA\r
+0xCD 0x0136 #LATIN CAPITAL LETTER K WITH CEDILLA\r
+0xCE 0x012A #LATIN CAPITAL LETTER I WITH MACRON\r
+0xCF 0x013B #LATIN CAPITAL LETTER L WITH CEDILLA\r
+0xD0 0x0160 #LATIN CAPITAL LETTER S WITH CARON\r
+0xD1 0x0143 #LATIN CAPITAL LETTER N WITH ACUTE\r
+0xD2 0x0145 #LATIN CAPITAL LETTER N WITH CEDILLA\r
+0xD3 0x00D3 #LATIN CAPITAL LETTER O WITH ACUTE\r
+0xD4 0x014C #LATIN CAPITAL LETTER O WITH MACRON\r
+0xD5 0x00D5 #LATIN CAPITAL LETTER O WITH TILDE\r
+0xD6 0x00D6 #LATIN CAPITAL LETTER O WITH DIAERESIS\r
+0xD7 0x00D7 #MULTIPLICATION SIGN\r
+0xD8 0x0172 #LATIN CAPITAL LETTER U WITH OGONEK\r
+0xD9 0x0141 #LATIN CAPITAL LETTER L WITH STROKE\r
+0xDA 0x015A #LATIN CAPITAL LETTER S WITH ACUTE\r
+0xDB 0x016A #LATIN CAPITAL LETTER U WITH MACRON\r
+0xDC 0x00DC #LATIN CAPITAL LETTER U WITH DIAERESIS\r
+0xDD 0x017B #LATIN CAPITAL LETTER Z WITH DOT ABOVE\r
+0xDE 0x017D #LATIN CAPITAL LETTER Z WITH CARON\r
+0xDF 0x00DF #LATIN SMALL LETTER SHARP S\r
+0xE0 0x0105 #LATIN SMALL LETTER A WITH OGONEK\r
+0xE1 0x012F #LATIN SMALL LETTER I WITH OGONEK\r
+0xE2 0x0101 #LATIN SMALL LETTER A WITH MACRON\r
+0xE3 0x0107 #LATIN SMALL LETTER C WITH ACUTE\r
+0xE4 0x00E4 #LATIN SMALL LETTER A WITH DIAERESIS\r
+0xE5 0x00E5 #LATIN SMALL LETTER A WITH RING ABOVE\r
+0xE6 0x0119 #LATIN SMALL LETTER E WITH OGONEK\r
+0xE7 0x0113 #LATIN SMALL LETTER E WITH MACRON\r
+0xE8 0x010D #LATIN SMALL LETTER C WITH CARON\r
+0xE9 0x00E9 #LATIN SMALL LETTER E WITH ACUTE\r
+0xEA 0x017A #LATIN SMALL LETTER Z WITH ACUTE\r
+0xEB 0x0117 #LATIN SMALL LETTER E WITH DOT ABOVE\r
+0xEC 0x0123 #LATIN SMALL LETTER G WITH CEDILLA\r
+0xED 0x0137 #LATIN SMALL LETTER K WITH CEDILLA\r
+0xEE 0x012B #LATIN SMALL LETTER I WITH MACRON\r
+0xEF 0x013C #LATIN SMALL LETTER L WITH CEDILLA\r
+0xF0 0x0161 #LATIN SMALL LETTER S WITH CARON\r
+0xF1 0x0144 #LATIN SMALL LETTER N WITH ACUTE\r
+0xF2 0x0146 #LATIN SMALL LETTER N WITH CEDILLA\r
+0xF3 0x00F3 #LATIN SMALL LETTER O WITH ACUTE\r
+0xF4 0x014D #LATIN SMALL LETTER O WITH MACRON\r
+0xF5 0x00F5 #LATIN SMALL LETTER O WITH TILDE\r
+0xF6 0x00F6 #LATIN SMALL LETTER O WITH DIAERESIS\r
+0xF7 0x00F7 #DIVISION SIGN\r
+0xF8 0x0173 #LATIN SMALL LETTER U WITH OGONEK\r
+0xF9 0x0142 #LATIN SMALL LETTER L WITH STROKE\r
+0xFA 0x015B #LATIN SMALL LETTER S WITH ACUTE\r
+0xFB 0x016B #LATIN SMALL LETTER U WITH MACRON\r
+0xFC 0x00FC #LATIN SMALL LETTER U WITH DIAERESIS\r
+0xFD 0x017C #LATIN SMALL LETTER Z WITH DOT ABOVE\r
+0xFE 0x017E #LATIN SMALL LETTER Z WITH CARON\r
+0xFF 0x02D9 #DOT ABOVE\r
--- /dev/null
+#\r
+# Name: cp1258 to Unicode table\r
+# Unicode version: 2.0\r
+# Table version: 2.00\r
+# Table format: Format A\r
+# Date: 04/15/98\r
+#\r
+# Contact: cpxlate@microsoft.com\r
+#\r
+# General notes: none\r
+#\r
+# Format: Three tab-separated columns\r
+# Column #1 is the cp1258 code (in hex)\r
+# Column #2 is the Unicode (in hex as 0xXXXX)\r
+# Column #3 is the Unicode name (follows a comment sign, '#')\r
+#\r
+# The entries are in cp1258 order\r
+#\r
+0x00 0x0000 #NULL\r
+0x01 0x0001 #START OF HEADING\r
+0x02 0x0002 #START OF TEXT\r
+0x03 0x0003 #END OF TEXT\r
+0x04 0x0004 #END OF TRANSMISSION\r
+0x05 0x0005 #ENQUIRY\r
+0x06 0x0006 #ACKNOWLEDGE\r
+0x07 0x0007 #BELL\r
+0x08 0x0008 #BACKSPACE\r
+0x09 0x0009 #HORIZONTAL TABULATION\r
+0x0A 0x000A #LINE FEED\r
+0x0B 0x000B #VERTICAL TABULATION\r
+0x0C 0x000C #FORM FEED\r
+0x0D 0x000D #CARRIAGE RETURN\r
+0x0E 0x000E #SHIFT OUT\r
+0x0F 0x000F #SHIFT IN\r
+0x10 0x0010 #DATA LINK ESCAPE\r
+0x11 0x0011 #DEVICE CONTROL ONE\r
+0x12 0x0012 #DEVICE CONTROL TWO\r
+0x13 0x0013 #DEVICE CONTROL THREE\r
+0x14 0x0014 #DEVICE CONTROL FOUR\r
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE\r
+0x16 0x0016 #SYNCHRONOUS IDLE\r
+0x17 0x0017 #END OF TRANSMISSION BLOCK\r
+0x18 0x0018 #CANCEL\r
+0x19 0x0019 #END OF MEDIUM\r
+0x1A 0x001A #SUBSTITUTE\r
+0x1B 0x001B #ESCAPE\r
+0x1C 0x001C #FILE SEPARATOR\r
+0x1D 0x001D #GROUP SEPARATOR\r
+0x1E 0x001E #RECORD SEPARATOR\r
+0x1F 0x001F #UNIT SEPARATOR\r
+0x20 0x0020 #SPACE\r
+0x21 0x0021 #EXCLAMATION MARK\r
+0x22 0x0022 #QUOTATION MARK\r
+0x23 0x0023 #NUMBER SIGN\r
+0x24 0x0024 #DOLLAR SIGN\r
+0x25 0x0025 #PERCENT SIGN\r
+0x26 0x0026 #AMPERSAND\r
+0x27 0x0027 #APOSTROPHE\r
+0x28 0x0028 #LEFT PARENTHESIS\r
+0x29 0x0029 #RIGHT PARENTHESIS\r
+0x2A 0x002A #ASTERISK\r
+0x2B 0x002B #PLUS SIGN\r
+0x2C 0x002C #COMMA\r
+0x2D 0x002D #HYPHEN-MINUS\r
+0x2E 0x002E #FULL STOP\r
+0x2F 0x002F #SOLIDUS\r
+0x30 0x0030 #DIGIT ZERO\r
+0x31 0x0031 #DIGIT ONE\r
+0x32 0x0032 #DIGIT TWO\r
+0x33 0x0033 #DIGIT THREE\r
+0x34 0x0034 #DIGIT FOUR\r
+0x35 0x0035 #DIGIT FIVE\r
+0x36 0x0036 #DIGIT SIX\r
+0x37 0x0037 #DIGIT SEVEN\r
+0x38 0x0038 #DIGIT EIGHT\r
+0x39 0x0039 #DIGIT NINE\r
+0x3A 0x003A #COLON\r
+0x3B 0x003B #SEMICOLON\r
+0x3C 0x003C #LESS-THAN SIGN\r
+0x3D 0x003D #EQUALS SIGN\r
+0x3E 0x003E #GREATER-THAN SIGN\r
+0x3F 0x003F #QUESTION MARK\r
+0x40 0x0040 #COMMERCIAL AT\r
+0x41 0x0041 #LATIN CAPITAL LETTER A\r
+0x42 0x0042 #LATIN CAPITAL LETTER B\r
+0x43 0x0043 #LATIN CAPITAL LETTER C\r
+0x44 0x0044 #LATIN CAPITAL LETTER D\r
+0x45 0x0045 #LATIN CAPITAL LETTER E\r
+0x46 0x0046 #LATIN CAPITAL LETTER F\r
+0x47 0x0047 #LATIN CAPITAL LETTER G\r
+0x48 0x0048 #LATIN CAPITAL LETTER H\r
+0x49 0x0049 #LATIN CAPITAL LETTER I\r
+0x4A 0x004A #LATIN CAPITAL LETTER J\r
+0x4B 0x004B #LATIN CAPITAL LETTER K\r
+0x4C 0x004C #LATIN CAPITAL LETTER L\r
+0x4D 0x004D #LATIN CAPITAL LETTER M\r
+0x4E 0x004E #LATIN CAPITAL LETTER N\r
+0x4F 0x004F #LATIN CAPITAL LETTER O\r
+0x50 0x0050 #LATIN CAPITAL LETTER P\r
+0x51 0x0051 #LATIN CAPITAL LETTER Q\r
+0x52 0x0052 #LATIN CAPITAL LETTER R\r
+0x53 0x0053 #LATIN CAPITAL LETTER S\r
+0x54 0x0054 #LATIN CAPITAL LETTER T\r
+0x55 0x0055 #LATIN CAPITAL LETTER U\r
+0x56 0x0056 #LATIN CAPITAL LETTER V\r
+0x57 0x0057 #LATIN CAPITAL LETTER W\r
+0x58 0x0058 #LATIN CAPITAL LETTER X\r
+0x59 0x0059 #LATIN CAPITAL LETTER Y\r
+0x5A 0x005A #LATIN CAPITAL LETTER Z\r
+0x5B 0x005B #LEFT SQUARE BRACKET\r
+0x5C 0x005C #REVERSE SOLIDUS\r
+0x5D 0x005D #RIGHT SQUARE BRACKET\r
+0x5E 0x005E #CIRCUMFLEX ACCENT\r
+0x5F 0x005F #LOW LINE\r
+0x60 0x0060 #GRAVE ACCENT\r
+0x61 0x0061 #LATIN SMALL LETTER A\r
+0x62 0x0062 #LATIN SMALL LETTER B\r
+0x63 0x0063 #LATIN SMALL LETTER C\r
+0x64 0x0064 #LATIN SMALL LETTER D\r
+0x65 0x0065 #LATIN SMALL LETTER E\r
+0x66 0x0066 #LATIN SMALL LETTER F\r
+0x67 0x0067 #LATIN SMALL LETTER G\r
+0x68 0x0068 #LATIN SMALL LETTER H\r
+0x69 0x0069 #LATIN SMALL LETTER I\r
+0x6A 0x006A #LATIN SMALL LETTER J\r
+0x6B 0x006B #LATIN SMALL LETTER K\r
+0x6C 0x006C #LATIN SMALL LETTER L\r
+0x6D 0x006D #LATIN SMALL LETTER M\r
+0x6E 0x006E #LATIN SMALL LETTER N\r
+0x6F 0x006F #LATIN SMALL LETTER O\r
+0x70 0x0070 #LATIN SMALL LETTER P\r
+0x71 0x0071 #LATIN SMALL LETTER Q\r
+0x72 0x0072 #LATIN SMALL LETTER R\r
+0x73 0x0073 #LATIN SMALL LETTER S\r
+0x74 0x0074 #LATIN SMALL LETTER T\r
+0x75 0x0075 #LATIN SMALL LETTER U\r
+0x76 0x0076 #LATIN SMALL LETTER V\r
+0x77 0x0077 #LATIN SMALL LETTER W\r
+0x78 0x0078 #LATIN SMALL LETTER X\r
+0x79 0x0079 #LATIN SMALL LETTER Y\r
+0x7A 0x007A #LATIN SMALL LETTER Z\r
+0x7B 0x007B #LEFT CURLY BRACKET\r
+0x7C 0x007C #VERTICAL LINE\r
+0x7D 0x007D #RIGHT CURLY BRACKET\r
+0x7E 0x007E #TILDE\r
+0x7F 0x007F #DELETE\r
+0x80 0x20AC #EURO SIGN\r
+0x81 #UNDEFINED\r
+0x82 0x201A #SINGLE LOW-9 QUOTATION MARK\r
+0x83 0x0192 #LATIN SMALL LETTER F WITH HOOK\r
+0x84 0x201E #DOUBLE LOW-9 QUOTATION MARK\r
+0x85 0x2026 #HORIZONTAL ELLIPSIS\r
+0x86 0x2020 #DAGGER\r
+0x87 0x2021 #DOUBLE DAGGER\r
+0x88 0x02C6 #MODIFIER LETTER CIRCUMFLEX ACCENT\r
+0x89 0x2030 #PER MILLE SIGN\r
+0x8A #UNDEFINED\r
+0x8B 0x2039 #SINGLE LEFT-POINTING ANGLE QUOTATION MARK\r
+0x8C 0x0152 #LATIN CAPITAL LIGATURE OE\r
+0x8D #UNDEFINED\r
+0x8E #UNDEFINED\r
+0x8F #UNDEFINED\r
+0x90 #UNDEFINED\r
+0x91 0x2018 #LEFT SINGLE QUOTATION MARK\r
+0x92 0x2019 #RIGHT SINGLE QUOTATION MARK\r
+0x93 0x201C #LEFT DOUBLE QUOTATION MARK\r
+0x94 0x201D #RIGHT DOUBLE QUOTATION MARK\r
+0x95 0x2022 #BULLET\r
+0x96 0x2013 #EN DASH\r
+0x97 0x2014 #EM DASH\r
+0x98 0x02DC #SMALL TILDE\r
+0x99 0x2122 #TRADE MARK SIGN\r
+0x9A #UNDEFINED\r
+0x9B 0x203A #SINGLE RIGHT-POINTING ANGLE QUOTATION MARK\r
+0x9C 0x0153 #LATIN SMALL LIGATURE OE\r
+0x9D #UNDEFINED\r
+0x9E #UNDEFINED\r
+0x9F 0x0178 #LATIN CAPITAL LETTER Y WITH DIAERESIS\r
+0xA0 0x00A0 #NO-BREAK SPACE\r
+0xA1 0x00A1 #INVERTED EXCLAMATION MARK\r
+0xA2 0x00A2 #CENT SIGN\r
+0xA3 0x00A3 #POUND SIGN\r
+0xA4 0x00A4 #CURRENCY SIGN\r
+0xA5 0x00A5 #YEN SIGN\r
+0xA6 0x00A6 #BROKEN BAR\r
+0xA7 0x00A7 #SECTION SIGN\r
+0xA8 0x00A8 #DIAERESIS\r
+0xA9 0x00A9 #COPYRIGHT SIGN\r
+0xAA 0x00AA #FEMININE ORDINAL INDICATOR\r
+0xAB 0x00AB #LEFT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xAC 0x00AC #NOT SIGN\r
+0xAD 0x00AD #SOFT HYPHEN\r
+0xAE 0x00AE #REGISTERED SIGN\r
+0xAF 0x00AF #MACRON\r
+0xB0 0x00B0 #DEGREE SIGN\r
+0xB1 0x00B1 #PLUS-MINUS SIGN\r
+0xB2 0x00B2 #SUPERSCRIPT TWO\r
+0xB3 0x00B3 #SUPERSCRIPT THREE\r
+0xB4 0x00B4 #ACUTE ACCENT\r
+0xB5 0x00B5 #MICRO SIGN\r
+0xB6 0x00B6 #PILCROW SIGN\r
+0xB7 0x00B7 #MIDDLE DOT\r
+0xB8 0x00B8 #CEDILLA\r
+0xB9 0x00B9 #SUPERSCRIPT ONE\r
+0xBA 0x00BA #MASCULINE ORDINAL INDICATOR\r
+0xBB 0x00BB #RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK\r
+0xBC 0x00BC #VULGAR FRACTION ONE QUARTER\r
+0xBD 0x00BD #VULGAR FRACTION ONE HALF\r
+0xBE 0x00BE #VULGAR FRACTION THREE QUARTERS\r
+0xBF 0x00BF #INVERTED QUESTION MARK\r
+0xC0 0x00C0 #LATIN CAPITAL LETTER A WITH GRAVE\r
+0xC1 0x00C1 #LATIN CAPITAL LETTER A WITH ACUTE\r
+0xC2 0x00C2 #LATIN CAPITAL LETTER A WITH CIRCUMFLEX\r
+0xC3 0x0102 #LATIN CAPITAL LETTER A WITH BREVE\r
+0xC4 0x00C4 #LATIN CAPITAL LETTER A WITH DIAERESIS\r
+0xC5 0x00C5 #LATIN CAPITAL LETTER A WITH RING ABOVE\r
+0xC6 0x00C6 #LATIN CAPITAL LETTER AE\r
+0xC7 0x00C7 #LATIN CAPITAL LETTER C WITH CEDILLA\r
+0xC8 0x00C8 #LATIN CAPITAL LETTER E WITH GRAVE\r
+0xC9 0x00C9 #LATIN CAPITAL LETTER E WITH ACUTE\r
+0xCA 0x00CA #LATIN CAPITAL LETTER E WITH CIRCUMFLEX\r
+0xCB 0x00CB #LATIN CAPITAL LETTER E WITH DIAERESIS\r
+0xCC 0x0300 #COMBINING GRAVE ACCENT\r
+0xCD 0x00CD #LATIN CAPITAL LETTER I WITH ACUTE\r
+0xCE 0x00CE #LATIN CAPITAL LETTER I WITH CIRCUMFLEX\r
+0xCF 0x00CF #LATIN CAPITAL LETTER I WITH DIAERESIS\r
+0xD0 0x0110 #LATIN CAPITAL LETTER D WITH STROKE\r
+0xD1 0x00D1 #LATIN CAPITAL LETTER N WITH TILDE\r
+0xD2 0x0309 #COMBINING HOOK ABOVE\r
+0xD3 0x00D3 #LATIN CAPITAL LETTER O WITH ACUTE\r
+0xD4 0x00D4 #LATIN CAPITAL LETTER O WITH CIRCUMFLEX\r
+0xD5 0x01A0 #LATIN CAPITAL LETTER O WITH HORN\r
+0xD6 0x00D6 #LATIN CAPITAL LETTER O WITH DIAERESIS\r
+0xD7 0x00D7 #MULTIPLICATION SIGN\r
+0xD8 0x00D8 #LATIN CAPITAL LETTER O WITH STROKE\r
+0xD9 0x00D9 #LATIN CAPITAL LETTER U WITH GRAVE\r
+0xDA 0x00DA #LATIN CAPITAL LETTER U WITH ACUTE\r
+0xDB 0x00DB #LATIN CAPITAL LETTER U WITH CIRCUMFLEX\r
+0xDC 0x00DC #LATIN CAPITAL LETTER U WITH DIAERESIS\r
+0xDD 0x01AF #LATIN CAPITAL LETTER U WITH HORN\r
+0xDE 0x0303 #COMBINING TILDE\r
+0xDF 0x00DF #LATIN SMALL LETTER SHARP S\r
+0xE0 0x00E0 #LATIN SMALL LETTER A WITH GRAVE\r
+0xE1 0x00E1 #LATIN SMALL LETTER A WITH ACUTE\r
+0xE2 0x00E2 #LATIN SMALL LETTER A WITH CIRCUMFLEX\r
+0xE3 0x0103 #LATIN SMALL LETTER A WITH BREVE\r
+0xE4 0x00E4 #LATIN SMALL LETTER A WITH DIAERESIS\r
+0xE5 0x00E5 #LATIN SMALL LETTER A WITH RING ABOVE\r
+0xE6 0x00E6 #LATIN SMALL LETTER AE\r
+0xE7 0x00E7 #LATIN SMALL LETTER C WITH CEDILLA\r
+0xE8 0x00E8 #LATIN SMALL LETTER E WITH GRAVE\r
+0xE9 0x00E9 #LATIN SMALL LETTER E WITH ACUTE\r
+0xEA 0x00EA #LATIN SMALL LETTER E WITH CIRCUMFLEX\r
+0xEB 0x00EB #LATIN SMALL LETTER E WITH DIAERESIS\r
+0xEC 0x0301 #COMBINING ACUTE ACCENT\r
+0xED 0x00ED #LATIN SMALL LETTER I WITH ACUTE\r
+0xEE 0x00EE #LATIN SMALL LETTER I WITH CIRCUMFLEX\r
+0xEF 0x00EF #LATIN SMALL LETTER I WITH DIAERESIS\r
+0xF0 0x0111 #LATIN SMALL LETTER D WITH STROKE\r
+0xF1 0x00F1 #LATIN SMALL LETTER N WITH TILDE\r
+0xF2 0x0323 #COMBINING DOT BELOW\r
+0xF3 0x00F3 #LATIN SMALL LETTER O WITH ACUTE\r
+0xF4 0x00F4 #LATIN SMALL LETTER O WITH CIRCUMFLEX\r
+0xF5 0x01A1 #LATIN SMALL LETTER O WITH HORN\r
+0xF6 0x00F6 #LATIN SMALL LETTER O WITH DIAERESIS\r
+0xF7 0x00F7 #DIVISION SIGN\r
+0xF8 0x00F8 #LATIN SMALL LETTER O WITH STROKE\r
+0xF9 0x00F9 #LATIN SMALL LETTER U WITH GRAVE\r
+0xFA 0x00FA #LATIN SMALL LETTER U WITH ACUTE\r
+0xFB 0x00FB #LATIN SMALL LETTER U WITH CIRCUMFLEX\r
+0xFC 0x00FC #LATIN SMALL LETTER U WITH DIAERESIS\r
+0xFD 0x01B0 #LATIN SMALL LETTER U WITH HORN\r
+0xFE 0x20AB #DONG SIGN\r
+0xFF 0x00FF #LATIN SMALL LETTER Y WITH DIAERESIS\r
--- /dev/null
+#
+# Name: cp437_DOSLatinUS to Unicode table
+# Unicode version: 2.0
+# Table version: 2.00
+# Table format: Format A
+# Date: 04/24/96
+# Authors: Lori Brownell <loribr@microsoft.com>
+# K.D. Chang <a-kchang@microsoft.com>
+# General notes: none
+#
+# Format: Three tab-separated columns
+# Column #1 is the cp437_DOSLatinUS code (in hex)
+# Column #2 is the Unicode (in hex as 0xXXXX)
+# Column #3 is the Unicode name (follows a comment sign, '#')
+#
+# The entries are in cp437_DOSLatinUS order
+#
+0x00 0x0000 #NULL
+0x01 0x0001 #START OF HEADING
+0x02 0x0002 #START OF TEXT
+0x03 0x0003 #END OF TEXT
+0x04 0x0004 #END OF TRANSMISSION
+0x05 0x0005 #ENQUIRY
+0x06 0x0006 #ACKNOWLEDGE
+0x07 0x0007 #BELL
+0x08 0x0008 #BACKSPACE
+0x09 0x0009 #HORIZONTAL TABULATION
+0x0a 0x000a #LINE FEED
+0x0b 0x000b #VERTICAL TABULATION
+0x0c 0x000c #FORM FEED
+0x0d 0x000d #CARRIAGE RETURN
+0x0e 0x000e #SHIFT OUT
+0x0f 0x000f #SHIFT IN
+0x10 0x0010 #DATA LINK ESCAPE
+0x11 0x0011 #DEVICE CONTROL ONE
+0x12 0x0012 #DEVICE CONTROL TWO
+0x13 0x0013 #DEVICE CONTROL THREE
+0x14 0x0014 #DEVICE CONTROL FOUR
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE
+0x16 0x0016 #SYNCHRONOUS IDLE
+0x17 0x0017 #END OF TRANSMISSION BLOCK
+0x18 0x0018 #CANCEL
+0x19 0x0019 #END OF MEDIUM
+0x1a 0x001a #SUBSTITUTE
+0x1b 0x001b #ESCAPE
+0x1c 0x001c #FILE SEPARATOR
+0x1d 0x001d #GROUP SEPARATOR
+0x1e 0x001e #RECORD SEPARATOR
+0x1f 0x001f #UNIT SEPARATOR
+0x20 0x0020 #SPACE
+0x21 0x0021 #EXCLAMATION MARK
+0x22 0x0022 #QUOTATION MARK
+0x23 0x0023 #NUMBER SIGN
+0x24 0x0024 #DOLLAR SIGN
+0x25 0x0025 #PERCENT SIGN
+0x26 0x0026 #AMPERSAND
+0x27 0x0027 #APOSTROPHE
+0x28 0x0028 #LEFT PARENTHESIS
+0x29 0x0029 #RIGHT PARENTHESIS
+0x2a 0x002a #ASTERISK
+0x2b 0x002b #PLUS SIGN
+0x2c 0x002c #COMMA
+0x2d 0x002d #HYPHEN-MINUS
+0x2e 0x002e #FULL STOP
+0x2f 0x002f #SOLIDUS
+0x30 0x0030 #DIGIT ZERO
+0x31 0x0031 #DIGIT ONE
+0x32 0x0032 #DIGIT TWO
+0x33 0x0033 #DIGIT THREE
+0x34 0x0034 #DIGIT FOUR
+0x35 0x0035 #DIGIT FIVE
+0x36 0x0036 #DIGIT SIX
+0x37 0x0037 #DIGIT SEVEN
+0x38 0x0038 #DIGIT EIGHT
+0x39 0x0039 #DIGIT NINE
+0x3a 0x003a #COLON
+0x3b 0x003b #SEMICOLON
+0x3c 0x003c #LESS-THAN SIGN
+0x3d 0x003d #EQUALS SIGN
+0x3e 0x003e #GREATER-THAN SIGN
+0x3f 0x003f #QUESTION MARK
+0x40 0x0040 #COMMERCIAL AT
+0x41 0x0041 #LATIN CAPITAL LETTER A
+0x42 0x0042 #LATIN CAPITAL LETTER B
+0x43 0x0043 #LATIN CAPITAL LETTER C
+0x44 0x0044 #LATIN CAPITAL LETTER D
+0x45 0x0045 #LATIN CAPITAL LETTER E
+0x46 0x0046 #LATIN CAPITAL LETTER F
+0x47 0x0047 #LATIN CAPITAL LETTER G
+0x48 0x0048 #LATIN CAPITAL LETTER H
+0x49 0x0049 #LATIN CAPITAL LETTER I
+0x4a 0x004a #LATIN CAPITAL LETTER J
+0x4b 0x004b #LATIN CAPITAL LETTER K
+0x4c 0x004c #LATIN CAPITAL LETTER L
+0x4d 0x004d #LATIN CAPITAL LETTER M
+0x4e 0x004e #LATIN CAPITAL LETTER N
+0x4f 0x004f #LATIN CAPITAL LETTER O
+0x50 0x0050 #LATIN CAPITAL LETTER P
+0x51 0x0051 #LATIN CAPITAL LETTER Q
+0x52 0x0052 #LATIN CAPITAL LETTER R
+0x53 0x0053 #LATIN CAPITAL LETTER S
+0x54 0x0054 #LATIN CAPITAL LETTER T
+0x55 0x0055 #LATIN CAPITAL LETTER U
+0x56 0x0056 #LATIN CAPITAL LETTER V
+0x57 0x0057 #LATIN CAPITAL LETTER W
+0x58 0x0058 #LATIN CAPITAL LETTER X
+0x59 0x0059 #LATIN CAPITAL LETTER Y
+0x5a 0x005a #LATIN CAPITAL LETTER Z
+0x5b 0x005b #LEFT SQUARE BRACKET
+0x5c 0x005c #REVERSE SOLIDUS
+0x5d 0x005d #RIGHT SQUARE BRACKET
+0x5e 0x005e #CIRCUMFLEX ACCENT
+0x5f 0x005f #LOW LINE
+0x60 0x0060 #GRAVE ACCENT
+0x61 0x0061 #LATIN SMALL LETTER A
+0x62 0x0062 #LATIN SMALL LETTER B
+0x63 0x0063 #LATIN SMALL LETTER C
+0x64 0x0064 #LATIN SMALL LETTER D
+0x65 0x0065 #LATIN SMALL LETTER E
+0x66 0x0066 #LATIN SMALL LETTER F
+0x67 0x0067 #LATIN SMALL LETTER G
+0x68 0x0068 #LATIN SMALL LETTER H
+0x69 0x0069 #LATIN SMALL LETTER I
+0x6a 0x006a #LATIN SMALL LETTER J
+0x6b 0x006b #LATIN SMALL LETTER K
+0x6c 0x006c #LATIN SMALL LETTER L
+0x6d 0x006d #LATIN SMALL LETTER M
+0x6e 0x006e #LATIN SMALL LETTER N
+0x6f 0x006f #LATIN SMALL LETTER O
+0x70 0x0070 #LATIN SMALL LETTER P
+0x71 0x0071 #LATIN SMALL LETTER Q
+0x72 0x0072 #LATIN SMALL LETTER R
+0x73 0x0073 #LATIN SMALL LETTER S
+0x74 0x0074 #LATIN SMALL LETTER T
+0x75 0x0075 #LATIN SMALL LETTER U
+0x76 0x0076 #LATIN SMALL LETTER V
+0x77 0x0077 #LATIN SMALL LETTER W
+0x78 0x0078 #LATIN SMALL LETTER X
+0x79 0x0079 #LATIN SMALL LETTER Y
+0x7a 0x007a #LATIN SMALL LETTER Z
+0x7b 0x007b #LEFT CURLY BRACKET
+0x7c 0x007c #VERTICAL LINE
+0x7d 0x007d #RIGHT CURLY BRACKET
+0x7e 0x007e #TILDE
+0x7f 0x007f #DELETE
+0x80 0x00c7 #LATIN CAPITAL LETTER C WITH CEDILLA
+0x81 0x00fc #LATIN SMALL LETTER U WITH DIAERESIS
+0x82 0x00e9 #LATIN SMALL LETTER E WITH ACUTE
+0x83 0x00e2 #LATIN SMALL LETTER A WITH CIRCUMFLEX
+0x84 0x00e4 #LATIN SMALL LETTER A WITH DIAERESIS
+0x85 0x00e0 #LATIN SMALL LETTER A WITH GRAVE
+0x86 0x00e5 #LATIN SMALL LETTER A WITH RING ABOVE
+0x87 0x00e7 #LATIN SMALL LETTER C WITH CEDILLA
+0x88 0x00ea #LATIN SMALL LETTER E WITH CIRCUMFLEX
+0x89 0x00eb #LATIN SMALL LETTER E WITH DIAERESIS
+0x8a 0x00e8 #LATIN SMALL LETTER E WITH GRAVE
+0x8b 0x00ef #LATIN SMALL LETTER I WITH DIAERESIS
+0x8c 0x00ee #LATIN SMALL LETTER I WITH CIRCUMFLEX
+0x8d 0x00ec #LATIN SMALL LETTER I WITH GRAVE
+0x8e 0x00c4 #LATIN CAPITAL LETTER A WITH DIAERESIS
+0x8f 0x00c5 #LATIN CAPITAL LETTER A WITH RING ABOVE
+0x90 0x00c9 #LATIN CAPITAL LETTER E WITH ACUTE
+0x91 0x00e6 #LATIN SMALL LIGATURE AE
+0x92 0x00c6 #LATIN CAPITAL LIGATURE AE
+0x93 0x00f4 #LATIN SMALL LETTER O WITH CIRCUMFLEX
+0x94 0x00f6 #LATIN SMALL LETTER O WITH DIAERESIS
+0x95 0x00f2 #LATIN SMALL LETTER O WITH GRAVE
+0x96 0x00fb #LATIN SMALL LETTER U WITH CIRCUMFLEX
+0x97 0x00f9 #LATIN SMALL LETTER U WITH GRAVE
+0x98 0x00ff #LATIN SMALL LETTER Y WITH DIAERESIS
+0x99 0x00d6 #LATIN CAPITAL LETTER O WITH DIAERESIS
+0x9a 0x00dc #LATIN CAPITAL LETTER U WITH DIAERESIS
+0x9b 0x00a2 #CENT SIGN
+0x9c 0x00a3 #POUND SIGN
+0x9d 0x00a5 #YEN SIGN
+0x9e 0x20a7 #PESETA SIGN
+0x9f 0x0192 #LATIN SMALL LETTER F WITH HOOK
+0xa0 0x00e1 #LATIN SMALL LETTER A WITH ACUTE
+0xa1 0x00ed #LATIN SMALL LETTER I WITH ACUTE
+0xa2 0x00f3 #LATIN SMALL LETTER O WITH ACUTE
+0xa3 0x00fa #LATIN SMALL LETTER U WITH ACUTE
+0xa4 0x00f1 #LATIN SMALL LETTER N WITH TILDE
+0xa5 0x00d1 #LATIN CAPITAL LETTER N WITH TILDE
+0xa6 0x00aa #FEMININE ORDINAL INDICATOR
+0xa7 0x00ba #MASCULINE ORDINAL INDICATOR
+0xa8 0x00bf #INVERTED QUESTION MARK
+0xa9 0x2310 #REVERSED NOT SIGN
+0xaa 0x00ac #NOT SIGN
+0xab 0x00bd #VULGAR FRACTION ONE HALF
+0xac 0x00bc #VULGAR FRACTION ONE QUARTER
+0xad 0x00a1 #INVERTED EXCLAMATION MARK
+0xae 0x00ab #LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xaf 0x00bb #RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xb0 0x2591 #LIGHT SHADE
+0xb1 0x2592 #MEDIUM SHADE
+0xb2 0x2593 #DARK SHADE
+0xb3 0x2502 #BOX DRAWINGS LIGHT VERTICAL
+0xb4 0x2524 #BOX DRAWINGS LIGHT VERTICAL AND LEFT
+0xb5 0x2561 #BOX DRAWINGS VERTICAL SINGLE AND LEFT DOUBLE
+0xb6 0x2562 #BOX DRAWINGS VERTICAL DOUBLE AND LEFT SINGLE
+0xb7 0x2556 #BOX DRAWINGS DOWN DOUBLE AND LEFT SINGLE
+0xb8 0x2555 #BOX DRAWINGS DOWN SINGLE AND LEFT DOUBLE
+0xb9 0x2563 #BOX DRAWINGS DOUBLE VERTICAL AND LEFT
+0xba 0x2551 #BOX DRAWINGS DOUBLE VERTICAL
+0xbb 0x2557 #BOX DRAWINGS DOUBLE DOWN AND LEFT
+0xbc 0x255d #BOX DRAWINGS DOUBLE UP AND LEFT
+0xbd 0x255c #BOX DRAWINGS UP DOUBLE AND LEFT SINGLE
+0xbe 0x255b #BOX DRAWINGS UP SINGLE AND LEFT DOUBLE
+0xbf 0x2510 #BOX DRAWINGS LIGHT DOWN AND LEFT
+0xc0 0x2514 #BOX DRAWINGS LIGHT UP AND RIGHT
+0xc1 0x2534 #BOX DRAWINGS LIGHT UP AND HORIZONTAL
+0xc2 0x252c #BOX DRAWINGS LIGHT DOWN AND HORIZONTAL
+0xc3 0x251c #BOX DRAWINGS LIGHT VERTICAL AND RIGHT
+0xc4 0x2500 #BOX DRAWINGS LIGHT HORIZONTAL
+0xc5 0x253c #BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL
+0xc6 0x255e #BOX DRAWINGS VERTICAL SINGLE AND RIGHT DOUBLE
+0xc7 0x255f #BOX DRAWINGS VERTICAL DOUBLE AND RIGHT SINGLE
+0xc8 0x255a #BOX DRAWINGS DOUBLE UP AND RIGHT
+0xc9 0x2554 #BOX DRAWINGS DOUBLE DOWN AND RIGHT
+0xca 0x2569 #BOX DRAWINGS DOUBLE UP AND HORIZONTAL
+0xcb 0x2566 #BOX DRAWINGS DOUBLE DOWN AND HORIZONTAL
+0xcc 0x2560 #BOX DRAWINGS DOUBLE VERTICAL AND RIGHT
+0xcd 0x2550 #BOX DRAWINGS DOUBLE HORIZONTAL
+0xce 0x256c #BOX DRAWINGS DOUBLE VERTICAL AND HORIZONTAL
+0xcf 0x2567 #BOX DRAWINGS UP SINGLE AND HORIZONTAL DOUBLE
+0xd0 0x2568 #BOX DRAWINGS UP DOUBLE AND HORIZONTAL SINGLE
+0xd1 0x2564 #BOX DRAWINGS DOWN SINGLE AND HORIZONTAL DOUBLE
+0xd2 0x2565 #BOX DRAWINGS DOWN DOUBLE AND HORIZONTAL SINGLE
+0xd3 0x2559 #BOX DRAWINGS UP DOUBLE AND RIGHT SINGLE
+0xd4 0x2558 #BOX DRAWINGS UP SINGLE AND RIGHT DOUBLE
+0xd5 0x2552 #BOX DRAWINGS DOWN SINGLE AND RIGHT DOUBLE
+0xd6 0x2553 #BOX DRAWINGS DOWN DOUBLE AND RIGHT SINGLE
+0xd7 0x256b #BOX DRAWINGS VERTICAL DOUBLE AND HORIZONTAL SINGLE
+0xd8 0x256a #BOX DRAWINGS VERTICAL SINGLE AND HORIZONTAL DOUBLE
+0xd9 0x2518 #BOX DRAWINGS LIGHT UP AND LEFT
+0xda 0x250c #BOX DRAWINGS LIGHT DOWN AND RIGHT
+0xdb 0x2588 #FULL BLOCK
+0xdc 0x2584 #LOWER HALF BLOCK
+0xdd 0x258c #LEFT HALF BLOCK
+0xde 0x2590 #RIGHT HALF BLOCK
+0xdf 0x2580 #UPPER HALF BLOCK
+0xe0 0x03b1 #GREEK SMALL LETTER ALPHA
+0xe1 0x00df #LATIN SMALL LETTER SHARP S
+0xe2 0x0393 #GREEK CAPITAL LETTER GAMMA
+0xe3 0x03c0 #GREEK SMALL LETTER PI
+0xe4 0x03a3 #GREEK CAPITAL LETTER SIGMA
+0xe5 0x03c3 #GREEK SMALL LETTER SIGMA
+0xe6 0x00b5 #MICRO SIGN
+0xe7 0x03c4 #GREEK SMALL LETTER TAU
+0xe8 0x03a6 #GREEK CAPITAL LETTER PHI
+0xe9 0x0398 #GREEK CAPITAL LETTER THETA
+0xea 0x03a9 #GREEK CAPITAL LETTER OMEGA
+0xeb 0x03b4 #GREEK SMALL LETTER DELTA
+0xec 0x221e #INFINITY
+0xed 0x03c6 #GREEK SMALL LETTER PHI
+0xee 0x03b5 #GREEK SMALL LETTER EPSILON
+0xef 0x2229 #INTERSECTION
+0xf0 0x2261 #IDENTICAL TO
+0xf1 0x00b1 #PLUS-MINUS SIGN
+0xf2 0x2265 #GREATER-THAN OR EQUAL TO
+0xf3 0x2264 #LESS-THAN OR EQUAL TO
+0xf4 0x2320 #TOP HALF INTEGRAL
+0xf5 0x2321 #BOTTOM HALF INTEGRAL
+0xf6 0x00f7 #DIVISION SIGN
+0xf7 0x2248 #ALMOST EQUAL TO
+0xf8 0x00b0 #DEGREE SIGN
+0xf9 0x2219 #BULLET OPERATOR
+0xfa 0x00b7 #MIDDLE DOT
+0xfb 0x221a #SQUARE ROOT
+0xfc 0x207f #SUPERSCRIPT LATIN SMALL LETTER N
+0xfd 0x00b2 #SUPERSCRIPT TWO
+0xfe 0x25a0 #BLACK SQUARE
+0xff 0x00a0 #NO-BREAK SPACE
+\1a
\ No newline at end of file
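The header of each table above describes "Format A": a code-page byte in hex, the Unicode value in hex, and the Unicode name after a '#'. A minimal sketch of reading one such line follows. It assumes C89 and tolerates either tabs or spaces between columns. The function name parse_table_line is hypothetical and is not taken from the catdoc sources; how catdoc itself loads these files is not shown here.

```c
#include <stdio.h>

/*
 * Parse one line of a "Format A" table: 0xNN <ws> 0xNNNN <ws> #NAME.
 * Returns 1 and fills *code and *ucs on success. Returns 0 for comment
 * lines, empty lines, the trailing Ctrl-Z (0x1a) marker and malformed
 * input. parse_table_line is a hypothetical name (an assumption).
 */
int parse_table_line(const char *line,
		unsigned int *code, unsigned long *ucs)
{
	unsigned int c;
	unsigned long u;

	/* Skip header comments, empty lines and the DOS end-of-file byte */
	if (line[0] == '#' || line[0] == '\0' || line[0] == '\032')
		return 0;
	/* %x and %lx accept an optional 0x prefix; the name is ignored */
	if (sscanf(line, "%x %lx", &c, &u) != 2)
		return 0;
	/* These are single-byte tables mapping into 16-bit Unicode values */
	if (c > 0xFFu || u > 0xFFFFul)
		return 0;
	*code = c;
	*ucs = u;
	return 1;
}
```

The Unicode value is kept in an unsigned long so the sketch does not depend on int being wider than 16 bits. A trailing carriage return, as in tables stored with CRLF line ends, does not affect the two numeric fields.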
--- /dev/null
+#
+# Name: cp850_DOSLatin1 to Unicode table
+# Unicode version: 2.0
+# Table version: 2.00
+# Table format: Format A
+# Date: 04/24/96
+# Authors: Lori Brownell <loribr@microsoft.com>
+# K.D. Chang <a-kchang@microsoft.com>
+# General notes: none
+#
+# Format: Three tab-separated columns
+# Column #1 is the cp850_DOSLatin1 code (in hex)
+# Column #2 is the Unicode (in hex as 0xXXXX)
+# Column #3 is the Unicode name (follows a comment sign, '#')
+#
+# The entries are in cp850_DOSLatin1 order
+#
+0x00 0x0000 #NULL
+0x01 0x0001 #START OF HEADING
+0x02 0x0002 #START OF TEXT
+0x03 0x0003 #END OF TEXT
+0x04 0x0004 #END OF TRANSMISSION
+0x05 0x0005 #ENQUIRY
+0x06 0x0006 #ACKNOWLEDGE
+0x07 0x0007 #BELL
+0x08 0x0008 #BACKSPACE
+0x09 0x0009 #HORIZONTAL TABULATION
+0x0a 0x000a #LINE FEED
+0x0b 0x000b #VERTICAL TABULATION
+0x0c 0x000c #FORM FEED
+0x0d 0x000d #CARRIAGE RETURN
+0x0e 0x000e #SHIFT OUT
+0x0f 0x000f #SHIFT IN
+0x10 0x0010 #DATA LINK ESCAPE
+0x11 0x0011 #DEVICE CONTROL ONE
+0x12 0x0012 #DEVICE CONTROL TWO
+0x13 0x0013 #DEVICE CONTROL THREE
+0x14 0x0014 #DEVICE CONTROL FOUR
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE
+0x16 0x0016 #SYNCHRONOUS IDLE
+0x17 0x0017 #END OF TRANSMISSION BLOCK
+0x18 0x0018 #CANCEL
+0x19 0x0019 #END OF MEDIUM
+0x1a 0x001a #SUBSTITUTE
+0x1b 0x001b #ESCAPE
+0x1c 0x001c #FILE SEPARATOR
+0x1d 0x001d #GROUP SEPARATOR
+0x1e 0x001e #RECORD SEPARATOR
+0x1f 0x001f #UNIT SEPARATOR
+0x20 0x0020 #SPACE
+0x21 0x0021 #EXCLAMATION MARK
+0x22 0x0022 #QUOTATION MARK
+0x23 0x0023 #NUMBER SIGN
+0x24 0x0024 #DOLLAR SIGN
+0x25 0x0025 #PERCENT SIGN
+0x26 0x0026 #AMPERSAND
+0x27 0x0027 #APOSTROPHE
+0x28 0x0028 #LEFT PARENTHESIS
+0x29 0x0029 #RIGHT PARENTHESIS
+0x2a 0x002a #ASTERISK
+0x2b 0x002b #PLUS SIGN
+0x2c 0x002c #COMMA
+0x2d 0x002d #HYPHEN-MINUS
+0x2e 0x002e #FULL STOP
+0x2f 0x002f #SOLIDUS
+0x30 0x0030 #DIGIT ZERO
+0x31 0x0031 #DIGIT ONE
+0x32 0x0032 #DIGIT TWO
+0x33 0x0033 #DIGIT THREE
+0x34 0x0034 #DIGIT FOUR
+0x35 0x0035 #DIGIT FIVE
+0x36 0x0036 #DIGIT SIX
+0x37 0x0037 #DIGIT SEVEN
+0x38 0x0038 #DIGIT EIGHT
+0x39 0x0039 #DIGIT NINE
+0x3a 0x003a #COLON
+0x3b 0x003b #SEMICOLON
+0x3c 0x003c #LESS-THAN SIGN
+0x3d 0x003d #EQUALS SIGN
+0x3e 0x003e #GREATER-THAN SIGN
+0x3f 0x003f #QUESTION MARK
+0x40 0x0040 #COMMERCIAL AT
+0x41 0x0041 #LATIN CAPITAL LETTER A
+0x42 0x0042 #LATIN CAPITAL LETTER B
+0x43 0x0043 #LATIN CAPITAL LETTER C
+0x44 0x0044 #LATIN CAPITAL LETTER D
+0x45 0x0045 #LATIN CAPITAL LETTER E
+0x46 0x0046 #LATIN CAPITAL LETTER F
+0x47 0x0047 #LATIN CAPITAL LETTER G
+0x48 0x0048 #LATIN CAPITAL LETTER H
+0x49 0x0049 #LATIN CAPITAL LETTER I
+0x4a 0x004a #LATIN CAPITAL LETTER J
+0x4b 0x004b #LATIN CAPITAL LETTER K
+0x4c 0x004c #LATIN CAPITAL LETTER L
+0x4d 0x004d #LATIN CAPITAL LETTER M
+0x4e 0x004e #LATIN CAPITAL LETTER N
+0x4f 0x004f #LATIN CAPITAL LETTER O
+0x50 0x0050 #LATIN CAPITAL LETTER P
+0x51 0x0051 #LATIN CAPITAL LETTER Q
+0x52 0x0052 #LATIN CAPITAL LETTER R
+0x53 0x0053 #LATIN CAPITAL LETTER S
+0x54 0x0054 #LATIN CAPITAL LETTER T
+0x55 0x0055 #LATIN CAPITAL LETTER U
+0x56 0x0056 #LATIN CAPITAL LETTER V
+0x57 0x0057 #LATIN CAPITAL LETTER W
+0x58 0x0058 #LATIN CAPITAL LETTER X
+0x59 0x0059 #LATIN CAPITAL LETTER Y
+0x5a 0x005a #LATIN CAPITAL LETTER Z
+0x5b 0x005b #LEFT SQUARE BRACKET
+0x5c 0x005c #REVERSE SOLIDUS
+0x5d 0x005d #RIGHT SQUARE BRACKET
+0x5e 0x005e #CIRCUMFLEX ACCENT
+0x5f 0x005f #LOW LINE
+0x60 0x0060 #GRAVE ACCENT
+0x61 0x0061 #LATIN SMALL LETTER A
+0x62 0x0062 #LATIN SMALL LETTER B
+0x63 0x0063 #LATIN SMALL LETTER C
+0x64 0x0064 #LATIN SMALL LETTER D
+0x65 0x0065 #LATIN SMALL LETTER E
+0x66 0x0066 #LATIN SMALL LETTER F
+0x67 0x0067 #LATIN SMALL LETTER G
+0x68 0x0068 #LATIN SMALL LETTER H
+0x69 0x0069 #LATIN SMALL LETTER I
+0x6a 0x006a #LATIN SMALL LETTER J
+0x6b 0x006b #LATIN SMALL LETTER K
+0x6c 0x006c #LATIN SMALL LETTER L
+0x6d 0x006d #LATIN SMALL LETTER M
+0x6e 0x006e #LATIN SMALL LETTER N
+0x6f 0x006f #LATIN SMALL LETTER O
+0x70 0x0070 #LATIN SMALL LETTER P
+0x71 0x0071 #LATIN SMALL LETTER Q
+0x72 0x0072 #LATIN SMALL LETTER R
+0x73 0x0073 #LATIN SMALL LETTER S
+0x74 0x0074 #LATIN SMALL LETTER T
+0x75 0x0075 #LATIN SMALL LETTER U
+0x76 0x0076 #LATIN SMALL LETTER V
+0x77 0x0077 #LATIN SMALL LETTER W
+0x78 0x0078 #LATIN SMALL LETTER X
+0x79 0x0079 #LATIN SMALL LETTER Y
+0x7a 0x007a #LATIN SMALL LETTER Z
+0x7b 0x007b #LEFT CURLY BRACKET
+0x7c 0x007c #VERTICAL LINE
+0x7d 0x007d #RIGHT CURLY BRACKET
+0x7e 0x007e #TILDE
+0x7f 0x007f #DELETE
+0x80 0x00c7 #LATIN CAPITAL LETTER C WITH CEDILLA
+0x81 0x00fc #LATIN SMALL LETTER U WITH DIAERESIS
+0x82 0x00e9 #LATIN SMALL LETTER E WITH ACUTE
+0x83 0x00e2 #LATIN SMALL LETTER A WITH CIRCUMFLEX
+0x84 0x00e4 #LATIN SMALL LETTER A WITH DIAERESIS
+0x85 0x00e0 #LATIN SMALL LETTER A WITH GRAVE
+0x86 0x00e5 #LATIN SMALL LETTER A WITH RING ABOVE
+0x87 0x00e7 #LATIN SMALL LETTER C WITH CEDILLA
+0x88 0x00ea #LATIN SMALL LETTER E WITH CIRCUMFLEX
+0x89 0x00eb #LATIN SMALL LETTER E WITH DIAERESIS
+0x8a 0x00e8 #LATIN SMALL LETTER E WITH GRAVE
+0x8b 0x00ef #LATIN SMALL LETTER I WITH DIAERESIS
+0x8c 0x00ee #LATIN SMALL LETTER I WITH CIRCUMFLEX
+0x8d 0x00ec #LATIN SMALL LETTER I WITH GRAVE
+0x8e 0x00c4 #LATIN CAPITAL LETTER A WITH DIAERESIS
+0x8f 0x00c5 #LATIN CAPITAL LETTER A WITH RING ABOVE
+0x90 0x00c9 #LATIN CAPITAL LETTER E WITH ACUTE
+0x91 0x00e6 #LATIN SMALL LIGATURE AE
+0x92 0x00c6 #LATIN CAPITAL LIGATURE AE
+0x93 0x00f4 #LATIN SMALL LETTER O WITH CIRCUMFLEX
+0x94 0x00f6 #LATIN SMALL LETTER O WITH DIAERESIS
+0x95 0x00f2 #LATIN SMALL LETTER O WITH GRAVE
+0x96 0x00fb #LATIN SMALL LETTER U WITH CIRCUMFLEX
+0x97 0x00f9 #LATIN SMALL LETTER U WITH GRAVE
+0x98 0x00ff #LATIN SMALL LETTER Y WITH DIAERESIS
+0x99 0x00d6 #LATIN CAPITAL LETTER O WITH DIAERESIS
+0x9a 0x00dc #LATIN CAPITAL LETTER U WITH DIAERESIS
+0x9b 0x00f8 #LATIN SMALL LETTER O WITH STROKE
+0x9c 0x00a3 #POUND SIGN
+0x9d 0x00d8 #LATIN CAPITAL LETTER O WITH STROKE
+0x9e 0x00d7 #MULTIPLICATION SIGN
+0x9f 0x0192 #LATIN SMALL LETTER F WITH HOOK
+0xa0 0x00e1 #LATIN SMALL LETTER A WITH ACUTE
+0xa1 0x00ed #LATIN SMALL LETTER I WITH ACUTE
+0xa2 0x00f3 #LATIN SMALL LETTER O WITH ACUTE
+0xa3 0x00fa #LATIN SMALL LETTER U WITH ACUTE
+0xa4 0x00f1 #LATIN SMALL LETTER N WITH TILDE
+0xa5 0x00d1 #LATIN CAPITAL LETTER N WITH TILDE
+0xa6 0x00aa #FEMININE ORDINAL INDICATOR
+0xa7 0x00ba #MASCULINE ORDINAL INDICATOR
+0xa8 0x00bf #INVERTED QUESTION MARK
+0xa9 0x00ae #REGISTERED SIGN
+0xaa 0x00ac #NOT SIGN
+0xab 0x00bd #VULGAR FRACTION ONE HALF
+0xac 0x00bc #VULGAR FRACTION ONE QUARTER
+0xad 0x00a1 #INVERTED EXCLAMATION MARK
+0xae 0x00ab #LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xaf 0x00bb #RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xb0 0x2591 #LIGHT SHADE
+0xb1 0x2592 #MEDIUM SHADE
+0xb2 0x2593 #DARK SHADE
+0xb3 0x2502 #BOX DRAWINGS LIGHT VERTICAL
+0xb4 0x2524 #BOX DRAWINGS LIGHT VERTICAL AND LEFT
+0xb5 0x00c1 #LATIN CAPITAL LETTER A WITH ACUTE
+0xb6 0x00c2 #LATIN CAPITAL LETTER A WITH CIRCUMFLEX
+0xb7 0x00c0 #LATIN CAPITAL LETTER A WITH GRAVE
+0xb8 0x00a9 #COPYRIGHT SIGN
+0xb9 0x2563 #BOX DRAWINGS DOUBLE VERTICAL AND LEFT
+0xba 0x2551 #BOX DRAWINGS DOUBLE VERTICAL
+0xbb 0x2557 #BOX DRAWINGS DOUBLE DOWN AND LEFT
+0xbc 0x255d #BOX DRAWINGS DOUBLE UP AND LEFT
+0xbd 0x00a2 #CENT SIGN
+0xbe 0x00a5 #YEN SIGN
+0xbf 0x2510 #BOX DRAWINGS LIGHT DOWN AND LEFT
+0xc0 0x2514 #BOX DRAWINGS LIGHT UP AND RIGHT
+0xc1 0x2534 #BOX DRAWINGS LIGHT UP AND HORIZONTAL
+0xc2 0x252c #BOX DRAWINGS LIGHT DOWN AND HORIZONTAL
+0xc3 0x251c #BOX DRAWINGS LIGHT VERTICAL AND RIGHT
+0xc4 0x2500 #BOX DRAWINGS LIGHT HORIZONTAL
+0xc5 0x253c #BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL
+0xc6 0x00e3 #LATIN SMALL LETTER A WITH TILDE
+0xc7 0x00c3 #LATIN CAPITAL LETTER A WITH TILDE
+0xc8 0x255a #BOX DRAWINGS DOUBLE UP AND RIGHT
+0xc9 0x2554 #BOX DRAWINGS DOUBLE DOWN AND RIGHT
+0xca 0x2569 #BOX DRAWINGS DOUBLE UP AND HORIZONTAL
+0xcb 0x2566 #BOX DRAWINGS DOUBLE DOWN AND HORIZONTAL
+0xcc 0x2560 #BOX DRAWINGS DOUBLE VERTICAL AND RIGHT
+0xcd 0x2550 #BOX DRAWINGS DOUBLE HORIZONTAL
+0xce 0x256c #BOX DRAWINGS DOUBLE VERTICAL AND HORIZONTAL
+0xcf 0x00a4 #CURRENCY SIGN
+0xd0 0x00f0 #LATIN SMALL LETTER ETH
+0xd1 0x00d0 #LATIN CAPITAL LETTER ETH
+0xd2 0x00ca #LATIN CAPITAL LETTER E WITH CIRCUMFLEX
+0xd3 0x00cb #LATIN CAPITAL LETTER E WITH DIAERESIS
+0xd4 0x00c8 #LATIN CAPITAL LETTER E WITH GRAVE
+0xd5 0x0131 #LATIN SMALL LETTER DOTLESS I
+0xd6 0x00cd #LATIN CAPITAL LETTER I WITH ACUTE
+0xd7 0x00ce #LATIN CAPITAL LETTER I WITH CIRCUMFLEX
+0xd8 0x00cf #LATIN CAPITAL LETTER I WITH DIAERESIS
+0xd9 0x2518 #BOX DRAWINGS LIGHT UP AND LEFT
+0xda 0x250c #BOX DRAWINGS LIGHT DOWN AND RIGHT
+0xdb 0x2588 #FULL BLOCK
+0xdc 0x2584 #LOWER HALF BLOCK
+0xdd 0x00a6 #BROKEN BAR
+0xde 0x00cc #LATIN CAPITAL LETTER I WITH GRAVE
+0xdf 0x2580 #UPPER HALF BLOCK
+0xe0 0x00d3 #LATIN CAPITAL LETTER O WITH ACUTE
+0xe1 0x00df #LATIN SMALL LETTER SHARP S
+0xe2 0x00d4 #LATIN CAPITAL LETTER O WITH CIRCUMFLEX
+0xe3 0x00d2 #LATIN CAPITAL LETTER O WITH GRAVE
+0xe4 0x00f5 #LATIN SMALL LETTER O WITH TILDE
+0xe5 0x00d5 #LATIN CAPITAL LETTER O WITH TILDE
+0xe6 0x00b5 #MICRO SIGN
+0xe7 0x00fe #LATIN SMALL LETTER THORN
+0xe8 0x00de #LATIN CAPITAL LETTER THORN
+0xe9 0x00da #LATIN CAPITAL LETTER U WITH ACUTE
+0xea 0x00db #LATIN CAPITAL LETTER U WITH CIRCUMFLEX
+0xeb 0x00d9 #LATIN CAPITAL LETTER U WITH GRAVE
+0xec 0x00fd #LATIN SMALL LETTER Y WITH ACUTE
+0xed 0x00dd #LATIN CAPITAL LETTER Y WITH ACUTE
+0xee 0x00af #MACRON
+0xef 0x00b4 #ACUTE ACCENT
+0xf0 0x00ad #SOFT HYPHEN
+0xf1 0x00b1 #PLUS-MINUS SIGN
+0xf2 0x2017 #DOUBLE LOW LINE
+0xf3 0x00be #VULGAR FRACTION THREE QUARTERS
+0xf4 0x00b6 #PILCROW SIGN
+0xf5 0x00a7 #SECTION SIGN
+0xf6 0x00f7 #DIVISION SIGN
+0xf7 0x00b8 #CEDILLA
+0xf8 0x00b0 #DEGREE SIGN
+0xf9 0x00a8 #DIAERESIS
+0xfa 0x00b7 #MIDDLE DOT
+0xfb 0x00b9 #SUPERSCRIPT ONE
+0xfc 0x00b3 #SUPERSCRIPT THREE
+0xfd 0x00b2 #SUPERSCRIPT TWO
+0xfe 0x25a0 #BLACK SQUARE
+0xff 0x00a0 #NO-BREAK SPACE
+\1a
\ No newline at end of file
--- /dev/null
+#
+# Name: cp852_DOSLatin2 to Unicode table
+# Unicode version: 2.0
+# Table version: 2.00
+# Table format: Format A
+# Date: 04/24/96
+# Authors: Lori Brownell <loribr@microsoft.com>
+# K.D. Chang <a-kchang@microsoft.com>
+# General notes: none
+#
+# Format: Three tab-separated columns
+# Column #1 is the cp852_DOSLatin2 code (in hex)
+# Column #2 is the Unicode (in hex as 0xXXXX)
+# Column #3 is the Unicode name (follows a comment sign, '#')
+#
+# The entries are in cp852_DOSLatin2 order
+#
+0x00 0x0000 #NULL
+0x01 0x0001 #START OF HEADING
+0x02 0x0002 #START OF TEXT
+0x03 0x0003 #END OF TEXT
+0x04 0x0004 #END OF TRANSMISSION
+0x05 0x0005 #ENQUIRY
+0x06 0x0006 #ACKNOWLEDGE
+0x07 0x0007 #BELL
+0x08 0x0008 #BACKSPACE
+0x09 0x0009 #HORIZONTAL TABULATION
+0x0a 0x000a #LINE FEED
+0x0b 0x000b #VERTICAL TABULATION
+0x0c 0x000c #FORM FEED
+0x0d 0x000d #CARRIAGE RETURN
+0x0e 0x000e #SHIFT OUT
+0x0f 0x000f #SHIFT IN
+0x10 0x0010 #DATA LINK ESCAPE
+0x11 0x0011 #DEVICE CONTROL ONE
+0x12 0x0012 #DEVICE CONTROL TWO
+0x13 0x0013 #DEVICE CONTROL THREE
+0x14 0x0014 #DEVICE CONTROL FOUR
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE
+0x16 0x0016 #SYNCHRONOUS IDLE
+0x17 0x0017 #END OF TRANSMISSION BLOCK
+0x18 0x0018 #CANCEL
+0x19 0x0019 #END OF MEDIUM
+0x1a 0x001a #SUBSTITUTE
+0x1b 0x001b #ESCAPE
+0x1c 0x001c #FILE SEPARATOR
+0x1d 0x001d #GROUP SEPARATOR
+0x1e 0x001e #RECORD SEPARATOR
+0x1f 0x001f #UNIT SEPARATOR
+0x20 0x0020 #SPACE
+0x21 0x0021 #EXCLAMATION MARK
+0x22 0x0022 #QUOTATION MARK
+0x23 0x0023 #NUMBER SIGN
+0x24 0x0024 #DOLLAR SIGN
+0x25 0x0025 #PERCENT SIGN
+0x26 0x0026 #AMPERSAND
+0x27 0x0027 #APOSTROPHE
+0x28 0x0028 #LEFT PARENTHESIS
+0x29 0x0029 #RIGHT PARENTHESIS
+0x2a 0x002a #ASTERISK
+0x2b 0x002b #PLUS SIGN
+0x2c 0x002c #COMMA
+0x2d 0x002d #HYPHEN-MINUS
+0x2e 0x002e #FULL STOP
+0x2f 0x002f #SOLIDUS
+0x30 0x0030 #DIGIT ZERO
+0x31 0x0031 #DIGIT ONE
+0x32 0x0032 #DIGIT TWO
+0x33 0x0033 #DIGIT THREE
+0x34 0x0034 #DIGIT FOUR
+0x35 0x0035 #DIGIT FIVE
+0x36 0x0036 #DIGIT SIX
+0x37 0x0037 #DIGIT SEVEN
+0x38 0x0038 #DIGIT EIGHT
+0x39 0x0039 #DIGIT NINE
+0x3a 0x003a #COLON
+0x3b 0x003b #SEMICOLON
+0x3c 0x003c #LESS-THAN SIGN
+0x3d 0x003d #EQUALS SIGN
+0x3e 0x003e #GREATER-THAN SIGN
+0x3f 0x003f #QUESTION MARK
+0x40 0x0040 #COMMERCIAL AT
+0x41 0x0041 #LATIN CAPITAL LETTER A
+0x42 0x0042 #LATIN CAPITAL LETTER B
+0x43 0x0043 #LATIN CAPITAL LETTER C
+0x44 0x0044 #LATIN CAPITAL LETTER D
+0x45 0x0045 #LATIN CAPITAL LETTER E
+0x46 0x0046 #LATIN CAPITAL LETTER F
+0x47 0x0047 #LATIN CAPITAL LETTER G
+0x48 0x0048 #LATIN CAPITAL LETTER H
+0x49 0x0049 #LATIN CAPITAL LETTER I
+0x4a 0x004a #LATIN CAPITAL LETTER J
+0x4b 0x004b #LATIN CAPITAL LETTER K
+0x4c 0x004c #LATIN CAPITAL LETTER L
+0x4d 0x004d #LATIN CAPITAL LETTER M
+0x4e 0x004e #LATIN CAPITAL LETTER N
+0x4f 0x004f #LATIN CAPITAL LETTER O
+0x50 0x0050 #LATIN CAPITAL LETTER P
+0x51 0x0051 #LATIN CAPITAL LETTER Q
+0x52 0x0052 #LATIN CAPITAL LETTER R
+0x53 0x0053 #LATIN CAPITAL LETTER S
+0x54 0x0054 #LATIN CAPITAL LETTER T
+0x55 0x0055 #LATIN CAPITAL LETTER U
+0x56 0x0056 #LATIN CAPITAL LETTER V
+0x57 0x0057 #LATIN CAPITAL LETTER W
+0x58 0x0058 #LATIN CAPITAL LETTER X
+0x59 0x0059 #LATIN CAPITAL LETTER Y
+0x5a 0x005a #LATIN CAPITAL LETTER Z
+0x5b 0x005b #LEFT SQUARE BRACKET
+0x5c 0x005c #REVERSE SOLIDUS
+0x5d 0x005d #RIGHT SQUARE BRACKET
+0x5e 0x005e #CIRCUMFLEX ACCENT
+0x5f 0x005f #LOW LINE
+0x60 0x0060 #GRAVE ACCENT
+0x61 0x0061 #LATIN SMALL LETTER A
+0x62 0x0062 #LATIN SMALL LETTER B
+0x63 0x0063 #LATIN SMALL LETTER C
+0x64 0x0064 #LATIN SMALL LETTER D
+0x65 0x0065 #LATIN SMALL LETTER E
+0x66 0x0066 #LATIN SMALL LETTER F
+0x67 0x0067 #LATIN SMALL LETTER G
+0x68 0x0068 #LATIN SMALL LETTER H
+0x69 0x0069 #LATIN SMALL LETTER I
+0x6a 0x006a #LATIN SMALL LETTER J
+0x6b 0x006b #LATIN SMALL LETTER K
+0x6c 0x006c #LATIN SMALL LETTER L
+0x6d 0x006d #LATIN SMALL LETTER M
+0x6e 0x006e #LATIN SMALL LETTER N
+0x6f 0x006f #LATIN SMALL LETTER O
+0x70 0x0070 #LATIN SMALL LETTER P
+0x71 0x0071 #LATIN SMALL LETTER Q
+0x72 0x0072 #LATIN SMALL LETTER R
+0x73 0x0073 #LATIN SMALL LETTER S
+0x74 0x0074 #LATIN SMALL LETTER T
+0x75 0x0075 #LATIN SMALL LETTER U
+0x76 0x0076 #LATIN SMALL LETTER V
+0x77 0x0077 #LATIN SMALL LETTER W
+0x78 0x0078 #LATIN SMALL LETTER X
+0x79 0x0079 #LATIN SMALL LETTER Y
+0x7a 0x007a #LATIN SMALL LETTER Z
+0x7b 0x007b #LEFT CURLY BRACKET
+0x7c 0x007c #VERTICAL LINE
+0x7d 0x007d #RIGHT CURLY BRACKET
+0x7e 0x007e #TILDE
+0x7f 0x007f #DELETE
+0x80 0x00c7 #LATIN CAPITAL LETTER C WITH CEDILLA
+0x81 0x00fc #LATIN SMALL LETTER U WITH DIAERESIS
+0x82 0x00e9 #LATIN SMALL LETTER E WITH ACUTE
+0x83 0x00e2 #LATIN SMALL LETTER A WITH CIRCUMFLEX
+0x84 0x00e4 #LATIN SMALL LETTER A WITH DIAERESIS
+0x85 0x016f #LATIN SMALL LETTER U WITH RING ABOVE
+0x86 0x0107 #LATIN SMALL LETTER C WITH ACUTE
+0x87 0x00e7 #LATIN SMALL LETTER C WITH CEDILLA
+0x88 0x0142 #LATIN SMALL LETTER L WITH STROKE
+0x89 0x00eb #LATIN SMALL LETTER E WITH DIAERESIS
+0x8a 0x0150 #LATIN CAPITAL LETTER O WITH DOUBLE ACUTE
+0x8b 0x0151 #LATIN SMALL LETTER O WITH DOUBLE ACUTE
+0x8c 0x00ee #LATIN SMALL LETTER I WITH CIRCUMFLEX
+0x8d 0x0179 #LATIN CAPITAL LETTER Z WITH ACUTE
+0x8e 0x00c4 #LATIN CAPITAL LETTER A WITH DIAERESIS
+0x8f 0x0106 #LATIN CAPITAL LETTER C WITH ACUTE
+0x90 0x00c9 #LATIN CAPITAL LETTER E WITH ACUTE
+0x91 0x0139 #LATIN CAPITAL LETTER L WITH ACUTE
+0x92 0x013a #LATIN SMALL LETTER L WITH ACUTE
+0x93 0x00f4 #LATIN SMALL LETTER O WITH CIRCUMFLEX
+0x94 0x00f6 #LATIN SMALL LETTER O WITH DIAERESIS
+0x95 0x013d #LATIN CAPITAL LETTER L WITH CARON
+0x96 0x013e #LATIN SMALL LETTER L WITH CARON
+0x97 0x015a #LATIN CAPITAL LETTER S WITH ACUTE
+0x98 0x015b #LATIN SMALL LETTER S WITH ACUTE
+0x99 0x00d6 #LATIN CAPITAL LETTER O WITH DIAERESIS
+0x9a 0x00dc #LATIN CAPITAL LETTER U WITH DIAERESIS
+0x9b 0x0164 #LATIN CAPITAL LETTER T WITH CARON
+0x9c 0x0165 #LATIN SMALL LETTER T WITH CARON
+0x9d 0x0141 #LATIN CAPITAL LETTER L WITH STROKE
+0x9e 0x00d7 #MULTIPLICATION SIGN
+0x9f 0x010d #LATIN SMALL LETTER C WITH CARON
+0xa0 0x00e1 #LATIN SMALL LETTER A WITH ACUTE
+0xa1 0x00ed #LATIN SMALL LETTER I WITH ACUTE
+0xa2 0x00f3 #LATIN SMALL LETTER O WITH ACUTE
+0xa3 0x00fa #LATIN SMALL LETTER U WITH ACUTE
+0xa4 0x0104 #LATIN CAPITAL LETTER A WITH OGONEK
+0xa5 0x0105 #LATIN SMALL LETTER A WITH OGONEK
+0xa6 0x017d #LATIN CAPITAL LETTER Z WITH CARON
+0xa7 0x017e #LATIN SMALL LETTER Z WITH CARON
+0xa8 0x0118 #LATIN CAPITAL LETTER E WITH OGONEK
+0xa9 0x0119 #LATIN SMALL LETTER E WITH OGONEK
+0xaa 0x00ac #NOT SIGN
+0xab 0x017a #LATIN SMALL LETTER Z WITH ACUTE
+0xac 0x010c #LATIN CAPITAL LETTER C WITH CARON
+0xad 0x015f #LATIN SMALL LETTER S WITH CEDILLA
+0xae 0x00ab #LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xaf 0x00bb #RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xb0 0x2591 #LIGHT SHADE
+0xb1 0x2592 #MEDIUM SHADE
+0xb2 0x2593 #DARK SHADE
+0xb3 0x2502 #BOX DRAWINGS LIGHT VERTICAL
+0xb4 0x2524 #BOX DRAWINGS LIGHT VERTICAL AND LEFT
+0xb5 0x00c1 #LATIN CAPITAL LETTER A WITH ACUTE
+0xb6 0x00c2 #LATIN CAPITAL LETTER A WITH CIRCUMFLEX
+0xb7 0x011a #LATIN CAPITAL LETTER E WITH CARON
+0xb8 0x015e #LATIN CAPITAL LETTER S WITH CEDILLA
+0xb9 0x2563 #BOX DRAWINGS DOUBLE VERTICAL AND LEFT
+0xba 0x2551 #BOX DRAWINGS DOUBLE VERTICAL
+0xbb 0x2557 #BOX DRAWINGS DOUBLE DOWN AND LEFT
+0xbc 0x255d #BOX DRAWINGS DOUBLE UP AND LEFT
+0xbd 0x017b #LATIN CAPITAL LETTER Z WITH DOT ABOVE
+0xbe 0x017c #LATIN SMALL LETTER Z WITH DOT ABOVE
+0xbf 0x2510 #BOX DRAWINGS LIGHT DOWN AND LEFT
+0xc0 0x2514 #BOX DRAWINGS LIGHT UP AND RIGHT
+0xc1 0x2534 #BOX DRAWINGS LIGHT UP AND HORIZONTAL
+0xc2 0x252c #BOX DRAWINGS LIGHT DOWN AND HORIZONTAL
+0xc3 0x251c #BOX DRAWINGS LIGHT VERTICAL AND RIGHT
+0xc4 0x2500 #BOX DRAWINGS LIGHT HORIZONTAL
+0xc5 0x253c #BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL
+0xc6 0x0102 #LATIN CAPITAL LETTER A WITH BREVE
+0xc7 0x0103 #LATIN SMALL LETTER A WITH BREVE
+0xc8 0x255a #BOX DRAWINGS DOUBLE UP AND RIGHT
+0xc9 0x2554 #BOX DRAWINGS DOUBLE DOWN AND RIGHT
+0xca 0x2569 #BOX DRAWINGS DOUBLE UP AND HORIZONTAL
+0xcb 0x2566 #BOX DRAWINGS DOUBLE DOWN AND HORIZONTAL
+0xcc 0x2560 #BOX DRAWINGS DOUBLE VERTICAL AND RIGHT
+0xcd 0x2550 #BOX DRAWINGS DOUBLE HORIZONTAL
+0xce 0x256c #BOX DRAWINGS DOUBLE VERTICAL AND HORIZONTAL
+0xcf 0x00a4 #CURRENCY SIGN
+0xd0 0x0111 #LATIN SMALL LETTER D WITH STROKE
+0xd1 0x0110 #LATIN CAPITAL LETTER D WITH STROKE
+0xd2 0x010e #LATIN CAPITAL LETTER D WITH CARON
+0xd3 0x00cb #LATIN CAPITAL LETTER E WITH DIAERESIS
+0xd4 0x010f #LATIN SMALL LETTER D WITH CARON
+0xd5 0x0147 #LATIN CAPITAL LETTER N WITH CARON
+0xd6 0x00cd #LATIN CAPITAL LETTER I WITH ACUTE
+0xd7 0x00ce #LATIN CAPITAL LETTER I WITH CIRCUMFLEX
+0xd8 0x011b #LATIN SMALL LETTER E WITH CARON
+0xd9 0x2518 #BOX DRAWINGS LIGHT UP AND LEFT
+0xda 0x250c #BOX DRAWINGS LIGHT DOWN AND RIGHT
+0xdb 0x2588 #FULL BLOCK
+0xdc 0x2584 #LOWER HALF BLOCK
+0xdd 0x0162 #LATIN CAPITAL LETTER T WITH CEDILLA
+0xde 0x016e #LATIN CAPITAL LETTER U WITH RING ABOVE
+0xdf 0x2580 #UPPER HALF BLOCK
+0xe0 0x00d3 #LATIN CAPITAL LETTER O WITH ACUTE
+0xe1 0x00df #LATIN SMALL LETTER SHARP S
+0xe2 0x00d4 #LATIN CAPITAL LETTER O WITH CIRCUMFLEX
+0xe3 0x0143 #LATIN CAPITAL LETTER N WITH ACUTE
+0xe4 0x0144 #LATIN SMALL LETTER N WITH ACUTE
+0xe5 0x0148 #LATIN SMALL LETTER N WITH CARON
+0xe6 0x0160 #LATIN CAPITAL LETTER S WITH CARON
+0xe7 0x0161 #LATIN SMALL LETTER S WITH CARON
+0xe8 0x0154 #LATIN CAPITAL LETTER R WITH ACUTE
+0xe9 0x00da #LATIN CAPITAL LETTER U WITH ACUTE
+0xea 0x0155 #LATIN SMALL LETTER R WITH ACUTE
+0xeb 0x0170 #LATIN CAPITAL LETTER U WITH DOUBLE ACUTE
+0xec 0x00fd #LATIN SMALL LETTER Y WITH ACUTE
+0xed 0x00dd #LATIN CAPITAL LETTER Y WITH ACUTE
+0xee 0x0163 #LATIN SMALL LETTER T WITH CEDILLA
+0xef 0x00b4 #ACUTE ACCENT
+0xf0 0x00ad #SOFT HYPHEN
+0xf1 0x02dd #DOUBLE ACUTE ACCENT
+0xf2 0x02db #OGONEK
+0xf3 0x02c7 #CARON
+0xf4 0x02d8 #BREVE
+0xf5 0x00a7 #SECTION SIGN
+0xf6 0x00f7 #DIVISION SIGN
+0xf7 0x00b8 #CEDILLA
+0xf8 0x00b0 #DEGREE SIGN
+0xf9 0x00a8 #DIAERESIS
+0xfa 0x02d9 #DOT ABOVE
+0xfb 0x0171 #LATIN SMALL LETTER U WITH DOUBLE ACUTE
+0xfc 0x0158 #LATIN CAPITAL LETTER R WITH CARON
+0xfd 0x0159 #LATIN SMALL LETTER R WITH CARON
+0xfe 0x25a0 #BLACK SQUARE
+0xff 0x00a0 #NO-BREAK SPACE
+\1a
\ No newline at end of file
--- /dev/null
+#
+# Name: cp855_DOSCyrillic to Unicode table
+# Unicode version: 2.0
+# Table version: 2.00
+# Table format: Format A
+# Date: 04/24/96
+# Authors: Lori Brownell <loribr@microsoft.com>
+# K.D. Chang <a-kchang@microsoft.com>
+# General notes: none
+#
+# Format: Three tab-separated columns
+# Column #1 is the cp855_DOSCyrillic code (in hex)
+# Column #2 is the Unicode (in hex as 0xXXXX)
+# Column #3 is the Unicode name (follows a comment sign, '#')
+#
+# The entries are in cp855_DOSCyrillic order
+#
+0x00 0x0000 #NULL
+0x01 0x0001 #START OF HEADING
+0x02 0x0002 #START OF TEXT
+0x03 0x0003 #END OF TEXT
+0x04 0x0004 #END OF TRANSMISSION
+0x05 0x0005 #ENQUIRY
+0x06 0x0006 #ACKNOWLEDGE
+0x07 0x0007 #BELL
+0x08 0x0008 #BACKSPACE
+0x09 0x0009 #HORIZONTAL TABULATION
+0x0a 0x000a #LINE FEED
+0x0b 0x000b #VERTICAL TABULATION
+0x0c 0x000c #FORM FEED
+0x0d 0x000d #CARRIAGE RETURN
+0x0e 0x000e #SHIFT OUT
+0x0f 0x000f #SHIFT IN
+0x10 0x0010 #DATA LINK ESCAPE
+0x11 0x0011 #DEVICE CONTROL ONE
+0x12 0x0012 #DEVICE CONTROL TWO
+0x13 0x0013 #DEVICE CONTROL THREE
+0x14 0x0014 #DEVICE CONTROL FOUR
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE
+0x16 0x0016 #SYNCHRONOUS IDLE
+0x17 0x0017 #END OF TRANSMISSION BLOCK
+0x18 0x0018 #CANCEL
+0x19 0x0019 #END OF MEDIUM
+0x1a 0x001a #SUBSTITUTE
+0x1b 0x001b #ESCAPE
+0x1c 0x001c #FILE SEPARATOR
+0x1d 0x001d #GROUP SEPARATOR
+0x1e 0x001e #RECORD SEPARATOR
+0x1f 0x001f #UNIT SEPARATOR
+0x20 0x0020 #SPACE
+0x21 0x0021 #EXCLAMATION MARK
+0x22 0x0022 #QUOTATION MARK
+0x23 0x0023 #NUMBER SIGN
+0x24 0x0024 #DOLLAR SIGN
+0x25 0x0025 #PERCENT SIGN
+0x26 0x0026 #AMPERSAND
+0x27 0x0027 #APOSTROPHE
+0x28 0x0028 #LEFT PARENTHESIS
+0x29 0x0029 #RIGHT PARENTHESIS
+0x2a 0x002a #ASTERISK
+0x2b 0x002b #PLUS SIGN
+0x2c 0x002c #COMMA
+0x2d 0x002d #HYPHEN-MINUS
+0x2e 0x002e #FULL STOP
+0x2f 0x002f #SOLIDUS
+0x30 0x0030 #DIGIT ZERO
+0x31 0x0031 #DIGIT ONE
+0x32 0x0032 #DIGIT TWO
+0x33 0x0033 #DIGIT THREE
+0x34 0x0034 #DIGIT FOUR
+0x35 0x0035 #DIGIT FIVE
+0x36 0x0036 #DIGIT SIX
+0x37 0x0037 #DIGIT SEVEN
+0x38 0x0038 #DIGIT EIGHT
+0x39 0x0039 #DIGIT NINE
+0x3a 0x003a #COLON
+0x3b 0x003b #SEMICOLON
+0x3c 0x003c #LESS-THAN SIGN
+0x3d 0x003d #EQUALS SIGN
+0x3e 0x003e #GREATER-THAN SIGN
+0x3f 0x003f #QUESTION MARK
+0x40 0x0040 #COMMERCIAL AT
+0x41 0x0041 #LATIN CAPITAL LETTER A
+0x42 0x0042 #LATIN CAPITAL LETTER B
+0x43 0x0043 #LATIN CAPITAL LETTER C
+0x44 0x0044 #LATIN CAPITAL LETTER D
+0x45 0x0045 #LATIN CAPITAL LETTER E
+0x46 0x0046 #LATIN CAPITAL LETTER F
+0x47 0x0047 #LATIN CAPITAL LETTER G
+0x48 0x0048 #LATIN CAPITAL LETTER H
+0x49 0x0049 #LATIN CAPITAL LETTER I
+0x4a 0x004a #LATIN CAPITAL LETTER J
+0x4b 0x004b #LATIN CAPITAL LETTER K
+0x4c 0x004c #LATIN CAPITAL LETTER L
+0x4d 0x004d #LATIN CAPITAL LETTER M
+0x4e 0x004e #LATIN CAPITAL LETTER N
+0x4f 0x004f #LATIN CAPITAL LETTER O
+0x50 0x0050 #LATIN CAPITAL LETTER P
+0x51 0x0051 #LATIN CAPITAL LETTER Q
+0x52 0x0052 #LATIN CAPITAL LETTER R
+0x53 0x0053 #LATIN CAPITAL LETTER S
+0x54 0x0054 #LATIN CAPITAL LETTER T
+0x55 0x0055 #LATIN CAPITAL LETTER U
+0x56 0x0056 #LATIN CAPITAL LETTER V
+0x57 0x0057 #LATIN CAPITAL LETTER W
+0x58 0x0058 #LATIN CAPITAL LETTER X
+0x59 0x0059 #LATIN CAPITAL LETTER Y
+0x5a 0x005a #LATIN CAPITAL LETTER Z
+0x5b 0x005b #LEFT SQUARE BRACKET
+0x5c 0x005c #REVERSE SOLIDUS
+0x5d 0x005d #RIGHT SQUARE BRACKET
+0x5e 0x005e #CIRCUMFLEX ACCENT
+0x5f 0x005f #LOW LINE
+0x60 0x0060 #GRAVE ACCENT
+0x61 0x0061 #LATIN SMALL LETTER A
+0x62 0x0062 #LATIN SMALL LETTER B
+0x63 0x0063 #LATIN SMALL LETTER C
+0x64 0x0064 #LATIN SMALL LETTER D
+0x65 0x0065 #LATIN SMALL LETTER E
+0x66 0x0066 #LATIN SMALL LETTER F
+0x67 0x0067 #LATIN SMALL LETTER G
+0x68 0x0068 #LATIN SMALL LETTER H
+0x69 0x0069 #LATIN SMALL LETTER I
+0x6a 0x006a #LATIN SMALL LETTER J
+0x6b 0x006b #LATIN SMALL LETTER K
+0x6c 0x006c #LATIN SMALL LETTER L
+0x6d 0x006d #LATIN SMALL LETTER M
+0x6e 0x006e #LATIN SMALL LETTER N
+0x6f 0x006f #LATIN SMALL LETTER O
+0x70 0x0070 #LATIN SMALL LETTER P
+0x71 0x0071 #LATIN SMALL LETTER Q
+0x72 0x0072 #LATIN SMALL LETTER R
+0x73 0x0073 #LATIN SMALL LETTER S
+0x74 0x0074 #LATIN SMALL LETTER T
+0x75 0x0075 #LATIN SMALL LETTER U
+0x76 0x0076 #LATIN SMALL LETTER V
+0x77 0x0077 #LATIN SMALL LETTER W
+0x78 0x0078 #LATIN SMALL LETTER X
+0x79 0x0079 #LATIN SMALL LETTER Y
+0x7a 0x007a #LATIN SMALL LETTER Z
+0x7b 0x007b #LEFT CURLY BRACKET
+0x7c 0x007c #VERTICAL LINE
+0x7d 0x007d #RIGHT CURLY BRACKET
+0x7e 0x007e #TILDE
+0x7f 0x007f #DELETE
+0x80 0x0452 #CYRILLIC SMALL LETTER DJE
+0x81 0x0402 #CYRILLIC CAPITAL LETTER DJE
+0x82 0x0453 #CYRILLIC SMALL LETTER GJE
+0x83 0x0403 #CYRILLIC CAPITAL LETTER GJE
+0x84 0x0451 #CYRILLIC SMALL LETTER IO
+0x85 0x0401 #CYRILLIC CAPITAL LETTER IO
+0x86 0x0454 #CYRILLIC SMALL LETTER UKRAINIAN IE
+0x87 0x0404 #CYRILLIC CAPITAL LETTER UKRAINIAN IE
+0x88 0x0455 #CYRILLIC SMALL LETTER DZE
+0x89 0x0405 #CYRILLIC CAPITAL LETTER DZE
+0x8a 0x0456 #CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+0x8b 0x0406 #CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+0x8c 0x0457 #CYRILLIC SMALL LETTER YI
+0x8d 0x0407 #CYRILLIC CAPITAL LETTER YI
+0x8e 0x0458 #CYRILLIC SMALL LETTER JE
+0x8f 0x0408 #CYRILLIC CAPITAL LETTER JE
+0x90 0x0459 #CYRILLIC SMALL LETTER LJE
+0x91 0x0409 #CYRILLIC CAPITAL LETTER LJE
+0x92 0x045a #CYRILLIC SMALL LETTER NJE
+0x93 0x040a #CYRILLIC CAPITAL LETTER NJE
+0x94 0x045b #CYRILLIC SMALL LETTER TSHE
+0x95 0x040b #CYRILLIC CAPITAL LETTER TSHE
+0x96 0x045c #CYRILLIC SMALL LETTER KJE
+0x97 0x040c #CYRILLIC CAPITAL LETTER KJE
+0x98 0x045e #CYRILLIC SMALL LETTER SHORT U
+0x99 0x040e #CYRILLIC CAPITAL LETTER SHORT U
+0x9a 0x045f #CYRILLIC SMALL LETTER DZHE
+0x9b 0x040f #CYRILLIC CAPITAL LETTER DZHE
+0x9c 0x044e #CYRILLIC SMALL LETTER YU
+0x9d 0x042e #CYRILLIC CAPITAL LETTER YU
+0x9e 0x044a #CYRILLIC SMALL LETTER HARD SIGN
+0x9f 0x042a #CYRILLIC CAPITAL LETTER HARD SIGN
+0xa0 0x0430 #CYRILLIC SMALL LETTER A
+0xa1 0x0410 #CYRILLIC CAPITAL LETTER A
+0xa2 0x0431 #CYRILLIC SMALL LETTER BE
+0xa3 0x0411 #CYRILLIC CAPITAL LETTER BE
+0xa4 0x0446 #CYRILLIC SMALL LETTER TSE
+0xa5 0x0426 #CYRILLIC CAPITAL LETTER TSE
+0xa6 0x0434 #CYRILLIC SMALL LETTER DE
+0xa7 0x0414 #CYRILLIC CAPITAL LETTER DE
+0xa8 0x0435 #CYRILLIC SMALL LETTER IE
+0xa9 0x0415 #CYRILLIC CAPITAL LETTER IE
+0xaa 0x0444 #CYRILLIC SMALL LETTER EF
+0xab 0x0424 #CYRILLIC CAPITAL LETTER EF
+0xac 0x0433 #CYRILLIC SMALL LETTER GHE
+0xad 0x0413 #CYRILLIC CAPITAL LETTER GHE
+0xae 0x00ab #LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xaf 0x00bb #RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xb0 0x2591 #LIGHT SHADE
+0xb1 0x2592 #MEDIUM SHADE
+0xb2 0x2593 #DARK SHADE
+0xb3 0x2502 #BOX DRAWINGS LIGHT VERTICAL
+0xb4 0x2524 #BOX DRAWINGS LIGHT VERTICAL AND LEFT
+0xb5 0x0445 #CYRILLIC SMALL LETTER HA
+0xb6 0x0425 #CYRILLIC CAPITAL LETTER HA
+0xb7 0x0438 #CYRILLIC SMALL LETTER I
+0xb8 0x0418 #CYRILLIC CAPITAL LETTER I
+0xb9 0x2563 #BOX DRAWINGS DOUBLE VERTICAL AND LEFT
+0xba 0x2551 #BOX DRAWINGS DOUBLE VERTICAL
+0xbb 0x2557 #BOX DRAWINGS DOUBLE DOWN AND LEFT
+0xbc 0x255d #BOX DRAWINGS DOUBLE UP AND LEFT
+0xbd 0x0439 #CYRILLIC SMALL LETTER SHORT I
+0xbe 0x0419 #CYRILLIC CAPITAL LETTER SHORT I
+0xbf 0x2510 #BOX DRAWINGS LIGHT DOWN AND LEFT
+0xc0 0x2514 #BOX DRAWINGS LIGHT UP AND RIGHT
+0xc1 0x2534 #BOX DRAWINGS LIGHT UP AND HORIZONTAL
+0xc2 0x252c #BOX DRAWINGS LIGHT DOWN AND HORIZONTAL
+0xc3 0x251c #BOX DRAWINGS LIGHT VERTICAL AND RIGHT
+0xc4 0x2500 #BOX DRAWINGS LIGHT HORIZONTAL
+0xc5 0x253c #BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL
+0xc6 0x043a #CYRILLIC SMALL LETTER KA
+0xc7 0x041a #CYRILLIC CAPITAL LETTER KA
+0xc8 0x255a #BOX DRAWINGS DOUBLE UP AND RIGHT
+0xc9 0x2554 #BOX DRAWINGS DOUBLE DOWN AND RIGHT
+0xca 0x2569 #BOX DRAWINGS DOUBLE UP AND HORIZONTAL
+0xcb 0x2566 #BOX DRAWINGS DOUBLE DOWN AND HORIZONTAL
+0xcc 0x2560 #BOX DRAWINGS DOUBLE VERTICAL AND RIGHT
+0xcd 0x2550 #BOX DRAWINGS DOUBLE HORIZONTAL
+0xce 0x256c #BOX DRAWINGS DOUBLE VERTICAL AND HORIZONTAL
+0xcf 0x00a4 #CURRENCY SIGN
+0xd0 0x043b #CYRILLIC SMALL LETTER EL
+0xd1 0x041b #CYRILLIC CAPITAL LETTER EL
+0xd2 0x043c #CYRILLIC SMALL LETTER EM
+0xd3 0x041c #CYRILLIC CAPITAL LETTER EM
+0xd4 0x043d #CYRILLIC SMALL LETTER EN
+0xd5 0x041d #CYRILLIC CAPITAL LETTER EN
+0xd6 0x043e #CYRILLIC SMALL LETTER O
+0xd7 0x041e #CYRILLIC CAPITAL LETTER O
+0xd8 0x043f #CYRILLIC SMALL LETTER PE
+0xd9 0x2518 #BOX DRAWINGS LIGHT UP AND LEFT
+0xda 0x250c #BOX DRAWINGS LIGHT DOWN AND RIGHT
+0xdb 0x2588 #FULL BLOCK
+0xdc 0x2584 #LOWER HALF BLOCK
+0xdd 0x041f #CYRILLIC CAPITAL LETTER PE
+0xde 0x044f #CYRILLIC SMALL LETTER YA
+0xdf 0x2580 #UPPER HALF BLOCK
+0xe0 0x042f #CYRILLIC CAPITAL LETTER YA
+0xe1 0x0440 #CYRILLIC SMALL LETTER ER
+0xe2 0x0420 #CYRILLIC CAPITAL LETTER ER
+0xe3 0x0441 #CYRILLIC SMALL LETTER ES
+0xe4 0x0421 #CYRILLIC CAPITAL LETTER ES
+0xe5 0x0442 #CYRILLIC SMALL LETTER TE
+0xe6 0x0422 #CYRILLIC CAPITAL LETTER TE
+0xe7 0x0443 #CYRILLIC SMALL LETTER U
+0xe8 0x0423 #CYRILLIC CAPITAL LETTER U
+0xe9 0x0436 #CYRILLIC SMALL LETTER ZHE
+0xea 0x0416 #CYRILLIC CAPITAL LETTER ZHE
+0xeb 0x0432 #CYRILLIC SMALL LETTER VE
+0xec 0x0412 #CYRILLIC CAPITAL LETTER VE
+0xed 0x044c #CYRILLIC SMALL LETTER SOFT SIGN
+0xee 0x042c #CYRILLIC CAPITAL LETTER SOFT SIGN
+0xef 0x2116 #NUMERO SIGN
+0xf0 0x00ad #SOFT HYPHEN
+0xf1 0x044b #CYRILLIC SMALL LETTER YERU
+0xf2 0x042b #CYRILLIC CAPITAL LETTER YERU
+0xf3 0x0437 #CYRILLIC SMALL LETTER ZE
+0xf4 0x0417 #CYRILLIC CAPITAL LETTER ZE
+0xf5 0x0448 #CYRILLIC SMALL LETTER SHA
+0xf6 0x0428 #CYRILLIC CAPITAL LETTER SHA
+0xf7 0x044d #CYRILLIC SMALL LETTER E
+0xf8 0x042d #CYRILLIC CAPITAL LETTER E
+0xf9 0x0449 #CYRILLIC SMALL LETTER SHCHA
+0xfa 0x0429 #CYRILLIC CAPITAL LETTER SHCHA
+0xfb 0x0447 #CYRILLIC SMALL LETTER CHE
+0xfc 0x0427 #CYRILLIC CAPITAL LETTER CHE
+0xfd 0x00a7 #SECTION SIGN
+0xfe 0x25a0 #BLACK SQUARE
+0xff 0x00a0 #NO-BREAK SPACE
+
+\1a
\ No newline at end of file
--- /dev/null
+#
+# Name: cp857_DOSTurkish to Unicode table
+# Unicode version: 2.0
+# Table version: 2.00
+# Table format: Format A
+# Date: 04/24/96
+# Authors: Lori Brownell <loribr@microsoft.com>
+# K.D. Chang <a-kchang@microsoft.com>
+# General notes: none
+#
+# Format: Three tab-separated columns
+# Column #1 is the cp857_DOSTurkish code (in hex)
+# Column #2 is the Unicode (in hex as 0xXXXX)
+# Column #3 is the Unicode name (follows a comment sign, '#')
+#
+# The entries are in cp857_DOSTurkish order
+#
+0x00 0x0000 #NULL
+0x01 0x0001 #START OF HEADING
+0x02 0x0002 #START OF TEXT
+0x03 0x0003 #END OF TEXT
+0x04 0x0004 #END OF TRANSMISSION
+0x05 0x0005 #ENQUIRY
+0x06 0x0006 #ACKNOWLEDGE
+0x07 0x0007 #BELL
+0x08 0x0008 #BACKSPACE
+0x09 0x0009 #HORIZONTAL TABULATION
+0x0a 0x000a #LINE FEED
+0x0b 0x000b #VERTICAL TABULATION
+0x0c 0x000c #FORM FEED
+0x0d 0x000d #CARRIAGE RETURN
+0x0e 0x000e #SHIFT OUT
+0x0f 0x000f #SHIFT IN
+0x10 0x0010 #DATA LINK ESCAPE
+0x11 0x0011 #DEVICE CONTROL ONE
+0x12 0x0012 #DEVICE CONTROL TWO
+0x13 0x0013 #DEVICE CONTROL THREE
+0x14 0x0014 #DEVICE CONTROL FOUR
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE
+0x16 0x0016 #SYNCHRONOUS IDLE
+0x17 0x0017 #END OF TRANSMISSION BLOCK
+0x18 0x0018 #CANCEL
+0x19 0x0019 #END OF MEDIUM
+0x1a 0x001a #SUBSTITUTE
+0x1b 0x001b #ESCAPE
+0x1c 0x001c #FILE SEPARATOR
+0x1d 0x001d #GROUP SEPARATOR
+0x1e 0x001e #RECORD SEPARATOR
+0x1f 0x001f #UNIT SEPARATOR
+0x20 0x0020 #SPACE
+0x21 0x0021 #EXCLAMATION MARK
+0x22 0x0022 #QUOTATION MARK
+0x23 0x0023 #NUMBER SIGN
+0x24 0x0024 #DOLLAR SIGN
+0x25 0x0025 #PERCENT SIGN
+0x26 0x0026 #AMPERSAND
+0x27 0x0027 #APOSTROPHE
+0x28 0x0028 #LEFT PARENTHESIS
+0x29 0x0029 #RIGHT PARENTHESIS
+0x2a 0x002a #ASTERISK
+0x2b 0x002b #PLUS SIGN
+0x2c 0x002c #COMMA
+0x2d 0x002d #HYPHEN-MINUS
+0x2e 0x002e #FULL STOP
+0x2f 0x002f #SOLIDUS
+0x30 0x0030 #DIGIT ZERO
+0x31 0x0031 #DIGIT ONE
+0x32 0x0032 #DIGIT TWO
+0x33 0x0033 #DIGIT THREE
+0x34 0x0034 #DIGIT FOUR
+0x35 0x0035 #DIGIT FIVE
+0x36 0x0036 #DIGIT SIX
+0x37 0x0037 #DIGIT SEVEN
+0x38 0x0038 #DIGIT EIGHT
+0x39 0x0039 #DIGIT NINE
+0x3a 0x003a #COLON
+0x3b 0x003b #SEMICOLON
+0x3c 0x003c #LESS-THAN SIGN
+0x3d 0x003d #EQUALS SIGN
+0x3e 0x003e #GREATER-THAN SIGN
+0x3f 0x003f #QUESTION MARK
+0x40 0x0040 #COMMERCIAL AT
+0x41 0x0041 #LATIN CAPITAL LETTER A
+0x42 0x0042 #LATIN CAPITAL LETTER B
+0x43 0x0043 #LATIN CAPITAL LETTER C
+0x44 0x0044 #LATIN CAPITAL LETTER D
+0x45 0x0045 #LATIN CAPITAL LETTER E
+0x46 0x0046 #LATIN CAPITAL LETTER F
+0x47 0x0047 #LATIN CAPITAL LETTER G
+0x48 0x0048 #LATIN CAPITAL LETTER H
+0x49 0x0049 #LATIN CAPITAL LETTER I
+0x4a 0x004a #LATIN CAPITAL LETTER J
+0x4b 0x004b #LATIN CAPITAL LETTER K
+0x4c 0x004c #LATIN CAPITAL LETTER L
+0x4d 0x004d #LATIN CAPITAL LETTER M
+0x4e 0x004e #LATIN CAPITAL LETTER N
+0x4f 0x004f #LATIN CAPITAL LETTER O
+0x50 0x0050 #LATIN CAPITAL LETTER P
+0x51 0x0051 #LATIN CAPITAL LETTER Q
+0x52 0x0052 #LATIN CAPITAL LETTER R
+0x53 0x0053 #LATIN CAPITAL LETTER S
+0x54 0x0054 #LATIN CAPITAL LETTER T
+0x55 0x0055 #LATIN CAPITAL LETTER U
+0x56 0x0056 #LATIN CAPITAL LETTER V
+0x57 0x0057 #LATIN CAPITAL LETTER W
+0x58 0x0058 #LATIN CAPITAL LETTER X
+0x59 0x0059 #LATIN CAPITAL LETTER Y
+0x5a 0x005a #LATIN CAPITAL LETTER Z
+0x5b 0x005b #LEFT SQUARE BRACKET
+0x5c 0x005c #REVERSE SOLIDUS
+0x5d 0x005d #RIGHT SQUARE BRACKET
+0x5e 0x005e #CIRCUMFLEX ACCENT
+0x5f 0x005f #LOW LINE
+0x60 0x0060 #GRAVE ACCENT
+0x61 0x0061 #LATIN SMALL LETTER A
+0x62 0x0062 #LATIN SMALL LETTER B
+0x63 0x0063 #LATIN SMALL LETTER C
+0x64 0x0064 #LATIN SMALL LETTER D
+0x65 0x0065 #LATIN SMALL LETTER E
+0x66 0x0066 #LATIN SMALL LETTER F
+0x67 0x0067 #LATIN SMALL LETTER G
+0x68 0x0068 #LATIN SMALL LETTER H
+0x69 0x0069 #LATIN SMALL LETTER I
+0x6a 0x006a #LATIN SMALL LETTER J
+0x6b 0x006b #LATIN SMALL LETTER K
+0x6c 0x006c #LATIN SMALL LETTER L
+0x6d 0x006d #LATIN SMALL LETTER M
+0x6e 0x006e #LATIN SMALL LETTER N
+0x6f 0x006f #LATIN SMALL LETTER O
+0x70 0x0070 #LATIN SMALL LETTER P
+0x71 0x0071 #LATIN SMALL LETTER Q
+0x72 0x0072 #LATIN SMALL LETTER R
+0x73 0x0073 #LATIN SMALL LETTER S
+0x74 0x0074 #LATIN SMALL LETTER T
+0x75 0x0075 #LATIN SMALL LETTER U
+0x76 0x0076 #LATIN SMALL LETTER V
+0x77 0x0077 #LATIN SMALL LETTER W
+0x78 0x0078 #LATIN SMALL LETTER X
+0x79 0x0079 #LATIN SMALL LETTER Y
+0x7a 0x007a #LATIN SMALL LETTER Z
+0x7b 0x007b #LEFT CURLY BRACKET
+0x7c 0x007c #VERTICAL LINE
+0x7d 0x007d #RIGHT CURLY BRACKET
+0x7e 0x007e #TILDE
+0x7f 0x007f #DELETE
+0x80 0x00c7 #LATIN CAPITAL LETTER C WITH CEDILLA
+0x81 0x00fc #LATIN SMALL LETTER U WITH DIAERESIS
+0x82 0x00e9 #LATIN SMALL LETTER E WITH ACUTE
+0x83 0x00e2 #LATIN SMALL LETTER A WITH CIRCUMFLEX
+0x84 0x00e4 #LATIN SMALL LETTER A WITH DIAERESIS
+0x85 0x00e0 #LATIN SMALL LETTER A WITH GRAVE
+0x86 0x00e5 #LATIN SMALL LETTER A WITH RING ABOVE
+0x87 0x00e7 #LATIN SMALL LETTER C WITH CEDILLA
+0x88 0x00ea #LATIN SMALL LETTER E WITH CIRCUMFLEX
+0x89 0x00eb #LATIN SMALL LETTER E WITH DIAERESIS
+0x8a 0x00e8 #LATIN SMALL LETTER E WITH GRAVE
+0x8b 0x00ef #LATIN SMALL LETTER I WITH DIAERESIS
+0x8c 0x00ee #LATIN SMALL LETTER I WITH CIRCUMFLEX
+0x8d 0x0131 #LATIN SMALL LETTER DOTLESS I
+0x8e 0x00c4 #LATIN CAPITAL LETTER A WITH DIAERESIS
+0x8f 0x00c5 #LATIN CAPITAL LETTER A WITH RING ABOVE
+0x90 0x00c9 #LATIN CAPITAL LETTER E WITH ACUTE
+0x91 0x00e6 #LATIN SMALL LIGATURE AE
+0x92 0x00c6 #LATIN CAPITAL LIGATURE AE
+0x93 0x00f4 #LATIN SMALL LETTER O WITH CIRCUMFLEX
+0x94 0x00f6 #LATIN SMALL LETTER O WITH DIAERESIS
+0x95 0x00f2 #LATIN SMALL LETTER O WITH GRAVE
+0x96 0x00fb #LATIN SMALL LETTER U WITH CIRCUMFLEX
+0x97 0x00f9 #LATIN SMALL LETTER U WITH GRAVE
+0x98 0x0130 #LATIN CAPITAL LETTER I WITH DOT ABOVE
+0x99 0x00d6 #LATIN CAPITAL LETTER O WITH DIAERESIS
+0x9a 0x00dc #LATIN CAPITAL LETTER U WITH DIAERESIS
+0x9b 0x00f8 #LATIN SMALL LETTER O WITH STROKE
+0x9c 0x00a3 #POUND SIGN
+0x9d 0x00d8 #LATIN CAPITAL LETTER O WITH STROKE
+0x9e 0x015e #LATIN CAPITAL LETTER S WITH CEDILLA
+0x9f 0x015f #LATIN SMALL LETTER S WITH CEDILLA
+0xa0 0x00e1 #LATIN SMALL LETTER A WITH ACUTE
+0xa1 0x00ed #LATIN SMALL LETTER I WITH ACUTE
+0xa2 0x00f3 #LATIN SMALL LETTER O WITH ACUTE
+0xa3 0x00fa #LATIN SMALL LETTER U WITH ACUTE
+0xa4 0x00f1 #LATIN SMALL LETTER N WITH TILDE
+0xa5 0x00d1 #LATIN CAPITAL LETTER N WITH TILDE
+0xa6 0x011e #LATIN CAPITAL LETTER G WITH BREVE
+0xa7 0x011f #LATIN SMALL LETTER G WITH BREVE
+0xa8 0x00bf #INVERTED QUESTION MARK
+0xa9 0x00ae #REGISTERED SIGN
+0xaa 0x00ac #NOT SIGN
+0xab 0x00bd #VULGAR FRACTION ONE HALF
+0xac 0x00bc #VULGAR FRACTION ONE QUARTER
+0xad 0x00a1 #INVERTED EXCLAMATION MARK
+0xae 0x00ab #LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xaf 0x00bb #RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xb0 0x2591 #LIGHT SHADE
+0xb1 0x2592 #MEDIUM SHADE
+0xb2 0x2593 #DARK SHADE
+0xb3 0x2502 #BOX DRAWINGS LIGHT VERTICAL
+0xb4 0x2524 #BOX DRAWINGS LIGHT VERTICAL AND LEFT
+0xb5 0x00c1 #LATIN CAPITAL LETTER A WITH ACUTE
+0xb6 0x00c2 #LATIN CAPITAL LETTER A WITH CIRCUMFLEX
+0xb7 0x00c0 #LATIN CAPITAL LETTER A WITH GRAVE
+0xb8 0x00a9 #COPYRIGHT SIGN
+0xb9 0x2563 #BOX DRAWINGS DOUBLE VERTICAL AND LEFT
+0xba 0x2551 #BOX DRAWINGS DOUBLE VERTICAL
+0xbb 0x2557 #BOX DRAWINGS DOUBLE DOWN AND LEFT
+0xbc 0x255d #BOX DRAWINGS DOUBLE UP AND LEFT
+0xbd 0x00a2 #CENT SIGN
+0xbe 0x00a5 #YEN SIGN
+0xbf 0x2510 #BOX DRAWINGS LIGHT DOWN AND LEFT
+0xc0 0x2514 #BOX DRAWINGS LIGHT UP AND RIGHT
+0xc1 0x2534 #BOX DRAWINGS LIGHT UP AND HORIZONTAL
+0xc2 0x252c #BOX DRAWINGS LIGHT DOWN AND HORIZONTAL
+0xc3 0x251c #BOX DRAWINGS LIGHT VERTICAL AND RIGHT
+0xc4 0x2500 #BOX DRAWINGS LIGHT HORIZONTAL
+0xc5 0x253c #BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL
+0xc6 0x00e3 #LATIN SMALL LETTER A WITH TILDE
+0xc7 0x00c3 #LATIN CAPITAL LETTER A WITH TILDE
+0xc8 0x255a #BOX DRAWINGS DOUBLE UP AND RIGHT
+0xc9 0x2554 #BOX DRAWINGS DOUBLE DOWN AND RIGHT
+0xca 0x2569 #BOX DRAWINGS DOUBLE UP AND HORIZONTAL
+0xcb 0x2566 #BOX DRAWINGS DOUBLE DOWN AND HORIZONTAL
+0xcc 0x2560 #BOX DRAWINGS DOUBLE VERTICAL AND RIGHT
+0xcd 0x2550 #BOX DRAWINGS DOUBLE HORIZONTAL
+0xce 0x256c #BOX DRAWINGS DOUBLE VERTICAL AND HORIZONTAL
+0xcf 0x00a4 #CURRENCY SIGN
+0xd0 0x00ba #MASCULINE ORDINAL INDICATOR
+0xd1 0x00aa #FEMININE ORDINAL INDICATOR
+0xd2 0x00ca #LATIN CAPITAL LETTER E WITH CIRCUMFLEX
+0xd3 0x00cb #LATIN CAPITAL LETTER E WITH DIAERESIS
+0xd4 0x00c8 #LATIN CAPITAL LETTER E WITH GRAVE
+0xd5 #UNDEFINED
+0xd6 0x00cd #LATIN CAPITAL LETTER I WITH ACUTE
+0xd7 0x00ce #LATIN CAPITAL LETTER I WITH CIRCUMFLEX
+0xd8 0x00cf #LATIN CAPITAL LETTER I WITH DIAERESIS
+0xd9 0x2518 #BOX DRAWINGS LIGHT UP AND LEFT
+0xda 0x250c #BOX DRAWINGS LIGHT DOWN AND RIGHT
+0xdb 0x2588 #FULL BLOCK
+0xdc 0x2584 #LOWER HALF BLOCK
+0xdd 0x00a6 #BROKEN BAR
+0xde 0x00cc #LATIN CAPITAL LETTER I WITH GRAVE
+0xdf 0x2580 #UPPER HALF BLOCK
+0xe0 0x00d3 #LATIN CAPITAL LETTER O WITH ACUTE
+0xe1 0x00df #LATIN SMALL LETTER SHARP S
+0xe2 0x00d4 #LATIN CAPITAL LETTER O WITH CIRCUMFLEX
+0xe3 0x00d2 #LATIN CAPITAL LETTER O WITH GRAVE
+0xe4 0x00f5 #LATIN SMALL LETTER O WITH TILDE
+0xe5 0x00d5 #LATIN CAPITAL LETTER O WITH TILDE
+0xe6 0x00b5 #MICRO SIGN
+0xe7 #UNDEFINED
+0xe8 0x00d7 #MULTIPLICATION SIGN
+0xe9 0x00da #LATIN CAPITAL LETTER U WITH ACUTE
+0xea 0x00db #LATIN CAPITAL LETTER U WITH CIRCUMFLEX
+0xeb 0x00d9 #LATIN CAPITAL LETTER U WITH GRAVE
+0xec 0x00ec #LATIN SMALL LETTER I WITH GRAVE
+0xed 0x00ff #LATIN SMALL LETTER Y WITH DIAERESIS
+0xee 0x00af #MACRON
+0xef 0x00b4 #ACUTE ACCENT
+0xf0 0x00ad #SOFT HYPHEN
+0xf1 0x00b1 #PLUS-MINUS SIGN
+0xf2 #UNDEFINED
+0xf3 0x00be #VULGAR FRACTION THREE QUARTERS
+0xf4 0x00b6 #PILCROW SIGN
+0xf5 0x00a7 #SECTION SIGN
+0xf6 0x00f7 #DIVISION SIGN
+0xf7 0x00b8 #CEDILLA
+0xf8 0x00b0 #DEGREE SIGN
+0xf9 0x00a8 #DIAERESIS
+0xfa 0x00b7 #MIDDLE DOT
+0xfb 0x00b9 #SUPERSCRIPT ONE
+0xfc 0x00b3 #SUPERSCRIPT THREE
+0xfd 0x00b2 #SUPERSCRIPT TWO
+0xfe 0x25a0 #BLACK SQUARE
+0xff 0x00a0 #NO-BREAK SPACE
+
+\1a
\ No newline at end of file
--- /dev/null
+#
+# Name: cp860_DOSPortuguese to Unicode table
+# Unicode version: 2.0
+# Table version: 2.00
+# Table format: Format A
+# Date: 04/24/96
+# Authors: Lori Brownell <loribr@microsoft.com>
+# K.D. Chang <a-kchang@microsoft.com>
+# General notes: none
+#
+# Format: Three tab-separated columns
+# Column #1 is the cp860_DOSPortuguese code (in hex)
+# Column #2 is the Unicode (in hex as 0xXXXX)
+# Column #3 is the Unicode name (follows a comment sign, '#')
+#
+# The entries are in cp860_DOSPortuguese order
+#
+0x00 0x0000 #NULL
+0x01 0x0001 #START OF HEADING
+0x02 0x0002 #START OF TEXT
+0x03 0x0003 #END OF TEXT
+0x04 0x0004 #END OF TRANSMISSION
+0x05 0x0005 #ENQUIRY
+0x06 0x0006 #ACKNOWLEDGE
+0x07 0x0007 #BELL
+0x08 0x0008 #BACKSPACE
+0x09 0x0009 #HORIZONTAL TABULATION
+0x0a 0x000a #LINE FEED
+0x0b 0x000b #VERTICAL TABULATION
+0x0c 0x000c #FORM FEED
+0x0d 0x000d #CARRIAGE RETURN
+0x0e 0x000e #SHIFT OUT
+0x0f 0x000f #SHIFT IN
+0x10 0x0010 #DATA LINK ESCAPE
+0x11 0x0011 #DEVICE CONTROL ONE
+0x12 0x0012 #DEVICE CONTROL TWO
+0x13 0x0013 #DEVICE CONTROL THREE
+0x14 0x0014 #DEVICE CONTROL FOUR
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE
+0x16 0x0016 #SYNCHRONOUS IDLE
+0x17 0x0017 #END OF TRANSMISSION BLOCK
+0x18 0x0018 #CANCEL
+0x19 0x0019 #END OF MEDIUM
+0x1a 0x001a #SUBSTITUTE
+0x1b 0x001b #ESCAPE
+0x1c 0x001c #FILE SEPARATOR
+0x1d 0x001d #GROUP SEPARATOR
+0x1e 0x001e #RECORD SEPARATOR
+0x1f 0x001f #UNIT SEPARATOR
+0x20 0x0020 #SPACE
+0x21 0x0021 #EXCLAMATION MARK
+0x22 0x0022 #QUOTATION MARK
+0x23 0x0023 #NUMBER SIGN
+0x24 0x0024 #DOLLAR SIGN
+0x25 0x0025 #PERCENT SIGN
+0x26 0x0026 #AMPERSAND
+0x27 0x0027 #APOSTROPHE
+0x28 0x0028 #LEFT PARENTHESIS
+0x29 0x0029 #RIGHT PARENTHESIS
+0x2a 0x002a #ASTERISK
+0x2b 0x002b #PLUS SIGN
+0x2c 0x002c #COMMA
+0x2d 0x002d #HYPHEN-MINUS
+0x2e 0x002e #FULL STOP
+0x2f 0x002f #SOLIDUS
+0x30 0x0030 #DIGIT ZERO
+0x31 0x0031 #DIGIT ONE
+0x32 0x0032 #DIGIT TWO
+0x33 0x0033 #DIGIT THREE
+0x34 0x0034 #DIGIT FOUR
+0x35 0x0035 #DIGIT FIVE
+0x36 0x0036 #DIGIT SIX
+0x37 0x0037 #DIGIT SEVEN
+0x38 0x0038 #DIGIT EIGHT
+0x39 0x0039 #DIGIT NINE
+0x3a 0x003a #COLON
+0x3b 0x003b #SEMICOLON
+0x3c 0x003c #LESS-THAN SIGN
+0x3d 0x003d #EQUALS SIGN
+0x3e 0x003e #GREATER-THAN SIGN
+0x3f 0x003f #QUESTION MARK
+0x40 0x0040 #COMMERCIAL AT
+0x41 0x0041 #LATIN CAPITAL LETTER A
+0x42 0x0042 #LATIN CAPITAL LETTER B
+0x43 0x0043 #LATIN CAPITAL LETTER C
+0x44 0x0044 #LATIN CAPITAL LETTER D
+0x45 0x0045 #LATIN CAPITAL LETTER E
+0x46 0x0046 #LATIN CAPITAL LETTER F
+0x47 0x0047 #LATIN CAPITAL LETTER G
+0x48 0x0048 #LATIN CAPITAL LETTER H
+0x49 0x0049 #LATIN CAPITAL LETTER I
+0x4a 0x004a #LATIN CAPITAL LETTER J
+0x4b 0x004b #LATIN CAPITAL LETTER K
+0x4c 0x004c #LATIN CAPITAL LETTER L
+0x4d 0x004d #LATIN CAPITAL LETTER M
+0x4e 0x004e #LATIN CAPITAL LETTER N
+0x4f 0x004f #LATIN CAPITAL LETTER O
+0x50 0x0050 #LATIN CAPITAL LETTER P
+0x51 0x0051 #LATIN CAPITAL LETTER Q
+0x52 0x0052 #LATIN CAPITAL LETTER R
+0x53 0x0053 #LATIN CAPITAL LETTER S
+0x54 0x0054 #LATIN CAPITAL LETTER T
+0x55 0x0055 #LATIN CAPITAL LETTER U
+0x56 0x0056 #LATIN CAPITAL LETTER V
+0x57 0x0057 #LATIN CAPITAL LETTER W
+0x58 0x0058 #LATIN CAPITAL LETTER X
+0x59 0x0059 #LATIN CAPITAL LETTER Y
+0x5a 0x005a #LATIN CAPITAL LETTER Z
+0x5b 0x005b #LEFT SQUARE BRACKET
+0x5c 0x005c #REVERSE SOLIDUS
+0x5d 0x005d #RIGHT SQUARE BRACKET
+0x5e 0x005e #CIRCUMFLEX ACCENT
+0x5f 0x005f #LOW LINE
+0x60 0x0060 #GRAVE ACCENT
+0x61 0x0061 #LATIN SMALL LETTER A
+0x62 0x0062 #LATIN SMALL LETTER B
+0x63 0x0063 #LATIN SMALL LETTER C
+0x64 0x0064 #LATIN SMALL LETTER D
+0x65 0x0065 #LATIN SMALL LETTER E
+0x66 0x0066 #LATIN SMALL LETTER F
+0x67 0x0067 #LATIN SMALL LETTER G
+0x68 0x0068 #LATIN SMALL LETTER H
+0x69 0x0069 #LATIN SMALL LETTER I
+0x6a 0x006a #LATIN SMALL LETTER J
+0x6b 0x006b #LATIN SMALL LETTER K
+0x6c 0x006c #LATIN SMALL LETTER L
+0x6d 0x006d #LATIN SMALL LETTER M
+0x6e 0x006e #LATIN SMALL LETTER N
+0x6f 0x006f #LATIN SMALL LETTER O
+0x70 0x0070 #LATIN SMALL LETTER P
+0x71 0x0071 #LATIN SMALL LETTER Q
+0x72 0x0072 #LATIN SMALL LETTER R
+0x73 0x0073 #LATIN SMALL LETTER S
+0x74 0x0074 #LATIN SMALL LETTER T
+0x75 0x0075 #LATIN SMALL LETTER U
+0x76 0x0076 #LATIN SMALL LETTER V
+0x77 0x0077 #LATIN SMALL LETTER W
+0x78 0x0078 #LATIN SMALL LETTER X
+0x79 0x0079 #LATIN SMALL LETTER Y
+0x7a 0x007a #LATIN SMALL LETTER Z
+0x7b 0x007b #LEFT CURLY BRACKET
+0x7c 0x007c #VERTICAL LINE
+0x7d 0x007d #RIGHT CURLY BRACKET
+0x7e 0x007e #TILDE
+0x7f 0x007f #DELETE
+0x80 0x00c7 #LATIN CAPITAL LETTER C WITH CEDILLA
+0x81 0x00fc #LATIN SMALL LETTER U WITH DIAERESIS
+0x82 0x00e9 #LATIN SMALL LETTER E WITH ACUTE
+0x83 0x00e2 #LATIN SMALL LETTER A WITH CIRCUMFLEX
+0x84 0x00e3 #LATIN SMALL LETTER A WITH TILDE
+0x85 0x00e0 #LATIN SMALL LETTER A WITH GRAVE
+0x86 0x00c1 #LATIN CAPITAL LETTER A WITH ACUTE
+0x87 0x00e7 #LATIN SMALL LETTER C WITH CEDILLA
+0x88 0x00ea #LATIN SMALL LETTER E WITH CIRCUMFLEX
+0x89 0x00ca #LATIN CAPITAL LETTER E WITH CIRCUMFLEX
+0x8a 0x00e8 #LATIN SMALL LETTER E WITH GRAVE
+0x8b 0x00cd #LATIN CAPITAL LETTER I WITH ACUTE
+0x8c 0x00d4 #LATIN CAPITAL LETTER O WITH CIRCUMFLEX
+0x8d 0x00ec #LATIN SMALL LETTER I WITH GRAVE
+0x8e 0x00c3 #LATIN CAPITAL LETTER A WITH TILDE
+0x8f 0x00c2 #LATIN CAPITAL LETTER A WITH CIRCUMFLEX
+0x90 0x00c9 #LATIN CAPITAL LETTER E WITH ACUTE
+0x91 0x00c0 #LATIN CAPITAL LETTER A WITH GRAVE
+0x92 0x00c8 #LATIN CAPITAL LETTER E WITH GRAVE
+0x93 0x00f4 #LATIN SMALL LETTER O WITH CIRCUMFLEX
+0x94 0x00f5 #LATIN SMALL LETTER O WITH TILDE
+0x95 0x00f2 #LATIN SMALL LETTER O WITH GRAVE
+0x96 0x00da #LATIN CAPITAL LETTER U WITH ACUTE
+0x97 0x00f9 #LATIN SMALL LETTER U WITH GRAVE
+0x98 0x00cc #LATIN CAPITAL LETTER I WITH GRAVE
+0x99 0x00d5 #LATIN CAPITAL LETTER O WITH TILDE
+0x9a 0x00dc #LATIN CAPITAL LETTER U WITH DIAERESIS
+0x9b 0x00a2 #CENT SIGN
+0x9c 0x00a3 #POUND SIGN
+0x9d 0x00d9 #LATIN CAPITAL LETTER U WITH GRAVE
+0x9e 0x20a7 #PESETA SIGN
+0x9f 0x00d3 #LATIN CAPITAL LETTER O WITH ACUTE
+0xa0 0x00e1 #LATIN SMALL LETTER A WITH ACUTE
+0xa1 0x00ed #LATIN SMALL LETTER I WITH ACUTE
+0xa2 0x00f3 #LATIN SMALL LETTER O WITH ACUTE
+0xa3 0x00fa #LATIN SMALL LETTER U WITH ACUTE
+0xa4 0x00f1 #LATIN SMALL LETTER N WITH TILDE
+0xa5 0x00d1 #LATIN CAPITAL LETTER N WITH TILDE
+0xa6 0x00aa #FEMININE ORDINAL INDICATOR
+0xa7 0x00ba #MASCULINE ORDINAL INDICATOR
+0xa8 0x00bf #INVERTED QUESTION MARK
+0xa9 0x00d2 #LATIN CAPITAL LETTER O WITH GRAVE
+0xaa 0x00ac #NOT SIGN
+0xab 0x00bd #VULGAR FRACTION ONE HALF
+0xac 0x00bc #VULGAR FRACTION ONE QUARTER
+0xad 0x00a1 #INVERTED EXCLAMATION MARK
+0xae 0x00ab #LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xaf 0x00bb #RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xb0 0x2591 #LIGHT SHADE
+0xb1 0x2592 #MEDIUM SHADE
+0xb2 0x2593 #DARK SHADE
+0xb3 0x2502 #BOX DRAWINGS LIGHT VERTICAL
+0xb4 0x2524 #BOX DRAWINGS LIGHT VERTICAL AND LEFT
+0xb5 0x2561 #BOX DRAWINGS VERTICAL SINGLE AND LEFT DOUBLE
+0xb6 0x2562 #BOX DRAWINGS VERTICAL DOUBLE AND LEFT SINGLE
+0xb7 0x2556 #BOX DRAWINGS DOWN DOUBLE AND LEFT SINGLE
+0xb8 0x2555 #BOX DRAWINGS DOWN SINGLE AND LEFT DOUBLE
+0xb9 0x2563 #BOX DRAWINGS DOUBLE VERTICAL AND LEFT
+0xba 0x2551 #BOX DRAWINGS DOUBLE VERTICAL
+0xbb 0x2557 #BOX DRAWINGS DOUBLE DOWN AND LEFT
+0xbc 0x255d #BOX DRAWINGS DOUBLE UP AND LEFT
+0xbd 0x255c #BOX DRAWINGS UP DOUBLE AND LEFT SINGLE
+0xbe 0x255b #BOX DRAWINGS UP SINGLE AND LEFT DOUBLE
+0xbf 0x2510 #BOX DRAWINGS LIGHT DOWN AND LEFT
+0xc0 0x2514 #BOX DRAWINGS LIGHT UP AND RIGHT
+0xc1 0x2534 #BOX DRAWINGS LIGHT UP AND HORIZONTAL
+0xc2 0x252c #BOX DRAWINGS LIGHT DOWN AND HORIZONTAL
+0xc3 0x251c #BOX DRAWINGS LIGHT VERTICAL AND RIGHT
+0xc4 0x2500 #BOX DRAWINGS LIGHT HORIZONTAL
+0xc5 0x253c #BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL
+0xc6 0x255e #BOX DRAWINGS VERTICAL SINGLE AND RIGHT DOUBLE
+0xc7 0x255f #BOX DRAWINGS VERTICAL DOUBLE AND RIGHT SINGLE
+0xc8 0x255a #BOX DRAWINGS DOUBLE UP AND RIGHT
+0xc9 0x2554 #BOX DRAWINGS DOUBLE DOWN AND RIGHT
+0xca 0x2569 #BOX DRAWINGS DOUBLE UP AND HORIZONTAL
+0xcb 0x2566 #BOX DRAWINGS DOUBLE DOWN AND HORIZONTAL
+0xcc 0x2560 #BOX DRAWINGS DOUBLE VERTICAL AND RIGHT
+0xcd 0x2550 #BOX DRAWINGS DOUBLE HORIZONTAL
+0xce 0x256c #BOX DRAWINGS DOUBLE VERTICAL AND HORIZONTAL
+0xcf 0x2567 #BOX DRAWINGS UP SINGLE AND HORIZONTAL DOUBLE
+0xd0 0x2568 #BOX DRAWINGS UP DOUBLE AND HORIZONTAL SINGLE
+0xd1 0x2564 #BOX DRAWINGS DOWN SINGLE AND HORIZONTAL DOUBLE
+0xd2 0x2565 #BOX DRAWINGS DOWN DOUBLE AND HORIZONTAL SINGLE
+0xd3 0x2559 #BOX DRAWINGS UP DOUBLE AND RIGHT SINGLE
+0xd4 0x2558 #BOX DRAWINGS UP SINGLE AND RIGHT DOUBLE
+0xd5 0x2552 #BOX DRAWINGS DOWN SINGLE AND RIGHT DOUBLE
+0xd6 0x2553 #BOX DRAWINGS DOWN DOUBLE AND RIGHT SINGLE
+0xd7 0x256b #BOX DRAWINGS VERTICAL DOUBLE AND HORIZONTAL SINGLE
+0xd8 0x256a #BOX DRAWINGS VERTICAL SINGLE AND HORIZONTAL DOUBLE
+0xd9 0x2518 #BOX DRAWINGS LIGHT UP AND LEFT
+0xda 0x250c #BOX DRAWINGS LIGHT DOWN AND RIGHT
+0xdb 0x2588 #FULL BLOCK
+0xdc 0x2584 #LOWER HALF BLOCK
+0xdd 0x258c #LEFT HALF BLOCK
+0xde 0x2590 #RIGHT HALF BLOCK
+0xdf 0x2580 #UPPER HALF BLOCK
+0xe0 0x03b1 #GREEK SMALL LETTER ALPHA
+0xe1 0x00df #LATIN SMALL LETTER SHARP S
+0xe2 0x0393 #GREEK CAPITAL LETTER GAMMA
+0xe3 0x03c0 #GREEK SMALL LETTER PI
+0xe4 0x03a3 #GREEK CAPITAL LETTER SIGMA
+0xe5 0x03c3 #GREEK SMALL LETTER SIGMA
+0xe6 0x00b5 #MICRO SIGN
+0xe7 0x03c4 #GREEK SMALL LETTER TAU
+0xe8 0x03a6 #GREEK CAPITAL LETTER PHI
+0xe9 0x0398 #GREEK CAPITAL LETTER THETA
+0xea 0x03a9 #GREEK CAPITAL LETTER OMEGA
+0xeb 0x03b4 #GREEK SMALL LETTER DELTA
+0xec 0x221e #INFINITY
+0xed 0x03c6 #GREEK SMALL LETTER PHI
+0xee 0x03b5 #GREEK SMALL LETTER EPSILON
+0xef 0x2229 #INTERSECTION
+0xf0 0x2261 #IDENTICAL TO
+0xf1 0x00b1 #PLUS-MINUS SIGN
+0xf2 0x2265 #GREATER-THAN OR EQUAL TO
+0xf3 0x2264 #LESS-THAN OR EQUAL TO
+0xf4 0x2320 #TOP HALF INTEGRAL
+0xf5 0x2321 #BOTTOM HALF INTEGRAL
+0xf6 0x00f7 #DIVISION SIGN
+0xf7 0x2248 #ALMOST EQUAL TO
+0xf8 0x00b0 #DEGREE SIGN
+0xf9 0x2219 #BULLET OPERATOR
+0xfa 0x00b7 #MIDDLE DOT
+0xfb 0x221a #SQUARE ROOT
+0xfc 0x207f #SUPERSCRIPT LATIN SMALL LETTER N
+0xfd 0x00b2 #SUPERSCRIPT TWO
+0xfe 0x25a0 #BLACK SQUARE
+0xff 0x00a0 #NO-BREAK SPACE
+
+\1a
\ No newline at end of file
--- /dev/null
+#
+# Name: cp861_DOSIcelandic to Unicode table
+# Unicode version: 2.0
+# Table version: 2.00
+# Table format: Format A
+# Date: 04/24/96
+# Authors: Lori Brownell <loribr@microsoft.com>
+# K.D. Chang <a-kchang@microsoft.com>
+# General notes: none
+#
+# Format: Three tab-separated columns
+# Column #1 is the cp861_DOSIcelandic code (in hex)
+# Column #2 is the Unicode (in hex as 0xXXXX)
+# Column #3 is the Unicode name (follows a comment sign, '#')
+#
+# The entries are in cp861_DOSIcelandic order
+#
+0x00 0x0000 #NULL
+0x01 0x0001 #START OF HEADING
+0x02 0x0002 #START OF TEXT
+0x03 0x0003 #END OF TEXT
+0x04 0x0004 #END OF TRANSMISSION
+0x05 0x0005 #ENQUIRY
+0x06 0x0006 #ACKNOWLEDGE
+0x07 0x0007 #BELL
+0x08 0x0008 #BACKSPACE
+0x09 0x0009 #HORIZONTAL TABULATION
+0x0a 0x000a #LINE FEED
+0x0b 0x000b #VERTICAL TABULATION
+0x0c 0x000c #FORM FEED
+0x0d 0x000d #CARRIAGE RETURN
+0x0e 0x000e #SHIFT OUT
+0x0f 0x000f #SHIFT IN
+0x10 0x0010 #DATA LINK ESCAPE
+0x11 0x0011 #DEVICE CONTROL ONE
+0x12 0x0012 #DEVICE CONTROL TWO
+0x13 0x0013 #DEVICE CONTROL THREE
+0x14 0x0014 #DEVICE CONTROL FOUR
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE
+0x16 0x0016 #SYNCHRONOUS IDLE
+0x17 0x0017 #END OF TRANSMISSION BLOCK
+0x18 0x0018 #CANCEL
+0x19 0x0019 #END OF MEDIUM
+0x1a 0x001a #SUBSTITUTE
+0x1b 0x001b #ESCAPE
+0x1c 0x001c #FILE SEPARATOR
+0x1d 0x001d #GROUP SEPARATOR
+0x1e 0x001e #RECORD SEPARATOR
+0x1f 0x001f #UNIT SEPARATOR
+0x20 0x0020 #SPACE
+0x21 0x0021 #EXCLAMATION MARK
+0x22 0x0022 #QUOTATION MARK
+0x23 0x0023 #NUMBER SIGN
+0x24 0x0024 #DOLLAR SIGN
+0x25 0x0025 #PERCENT SIGN
+0x26 0x0026 #AMPERSAND
+0x27 0x0027 #APOSTROPHE
+0x28 0x0028 #LEFT PARENTHESIS
+0x29 0x0029 #RIGHT PARENTHESIS
+0x2a 0x002a #ASTERISK
+0x2b 0x002b #PLUS SIGN
+0x2c 0x002c #COMMA
+0x2d 0x002d #HYPHEN-MINUS
+0x2e 0x002e #FULL STOP
+0x2f 0x002f #SOLIDUS
+0x30 0x0030 #DIGIT ZERO
+0x31 0x0031 #DIGIT ONE
+0x32 0x0032 #DIGIT TWO
+0x33 0x0033 #DIGIT THREE
+0x34 0x0034 #DIGIT FOUR
+0x35 0x0035 #DIGIT FIVE
+0x36 0x0036 #DIGIT SIX
+0x37 0x0037 #DIGIT SEVEN
+0x38 0x0038 #DIGIT EIGHT
+0x39 0x0039 #DIGIT NINE
+0x3a 0x003a #COLON
+0x3b 0x003b #SEMICOLON
+0x3c 0x003c #LESS-THAN SIGN
+0x3d 0x003d #EQUALS SIGN
+0x3e 0x003e #GREATER-THAN SIGN
+0x3f 0x003f #QUESTION MARK
+0x40 0x0040 #COMMERCIAL AT
+0x41 0x0041 #LATIN CAPITAL LETTER A
+0x42 0x0042 #LATIN CAPITAL LETTER B
+0x43 0x0043 #LATIN CAPITAL LETTER C
+0x44 0x0044 #LATIN CAPITAL LETTER D
+0x45 0x0045 #LATIN CAPITAL LETTER E
+0x46 0x0046 #LATIN CAPITAL LETTER F
+0x47 0x0047 #LATIN CAPITAL LETTER G
+0x48 0x0048 #LATIN CAPITAL LETTER H
+0x49 0x0049 #LATIN CAPITAL LETTER I
+0x4a 0x004a #LATIN CAPITAL LETTER J
+0x4b 0x004b #LATIN CAPITAL LETTER K
+0x4c 0x004c #LATIN CAPITAL LETTER L
+0x4d 0x004d #LATIN CAPITAL LETTER M
+0x4e 0x004e #LATIN CAPITAL LETTER N
+0x4f 0x004f #LATIN CAPITAL LETTER O
+0x50 0x0050 #LATIN CAPITAL LETTER P
+0x51 0x0051 #LATIN CAPITAL LETTER Q
+0x52 0x0052 #LATIN CAPITAL LETTER R
+0x53 0x0053 #LATIN CAPITAL LETTER S
+0x54 0x0054 #LATIN CAPITAL LETTER T
+0x55 0x0055 #LATIN CAPITAL LETTER U
+0x56 0x0056 #LATIN CAPITAL LETTER V
+0x57 0x0057 #LATIN CAPITAL LETTER W
+0x58 0x0058 #LATIN CAPITAL LETTER X
+0x59 0x0059 #LATIN CAPITAL LETTER Y
+0x5a 0x005a #LATIN CAPITAL LETTER Z
+0x5b 0x005b #LEFT SQUARE BRACKET
+0x5c 0x005c #REVERSE SOLIDUS
+0x5d 0x005d #RIGHT SQUARE BRACKET
+0x5e 0x005e #CIRCUMFLEX ACCENT
+0x5f 0x005f #LOW LINE
+0x60 0x0060 #GRAVE ACCENT
+0x61 0x0061 #LATIN SMALL LETTER A
+0x62 0x0062 #LATIN SMALL LETTER B
+0x63 0x0063 #LATIN SMALL LETTER C
+0x64 0x0064 #LATIN SMALL LETTER D
+0x65 0x0065 #LATIN SMALL LETTER E
+0x66 0x0066 #LATIN SMALL LETTER F
+0x67 0x0067 #LATIN SMALL LETTER G
+0x68 0x0068 #LATIN SMALL LETTER H
+0x69 0x0069 #LATIN SMALL LETTER I
+0x6a 0x006a #LATIN SMALL LETTER J
+0x6b 0x006b #LATIN SMALL LETTER K
+0x6c 0x006c #LATIN SMALL LETTER L
+0x6d 0x006d #LATIN SMALL LETTER M
+0x6e 0x006e #LATIN SMALL LETTER N
+0x6f 0x006f #LATIN SMALL LETTER O
+0x70 0x0070 #LATIN SMALL LETTER P
+0x71 0x0071 #LATIN SMALL LETTER Q
+0x72 0x0072 #LATIN SMALL LETTER R
+0x73 0x0073 #LATIN SMALL LETTER S
+0x74 0x0074 #LATIN SMALL LETTER T
+0x75 0x0075 #LATIN SMALL LETTER U
+0x76 0x0076 #LATIN SMALL LETTER V
+0x77 0x0077 #LATIN SMALL LETTER W
+0x78 0x0078 #LATIN SMALL LETTER X
+0x79 0x0079 #LATIN SMALL LETTER Y
+0x7a 0x007a #LATIN SMALL LETTER Z
+0x7b 0x007b #LEFT CURLY BRACKET
+0x7c 0x007c #VERTICAL LINE
+0x7d 0x007d #RIGHT CURLY BRACKET
+0x7e 0x007e #TILDE
+0x7f 0x007f #DELETE
+0x80 0x00c7 #LATIN CAPITAL LETTER C WITH CEDILLA
+0x81 0x00fc #LATIN SMALL LETTER U WITH DIAERESIS
+0x82 0x00e9 #LATIN SMALL LETTER E WITH ACUTE
+0x83 0x00e2 #LATIN SMALL LETTER A WITH CIRCUMFLEX
+0x84 0x00e4 #LATIN SMALL LETTER A WITH DIAERESIS
+0x85 0x00e0 #LATIN SMALL LETTER A WITH GRAVE
+0x86 0x00e5 #LATIN SMALL LETTER A WITH RING ABOVE
+0x87 0x00e7 #LATIN SMALL LETTER C WITH CEDILLA
+0x88 0x00ea #LATIN SMALL LETTER E WITH CIRCUMFLEX
+0x89 0x00eb #LATIN SMALL LETTER E WITH DIAERESIS
+0x8a 0x00e8 #LATIN SMALL LETTER E WITH GRAVE
+0x8b 0x00d0 #LATIN CAPITAL LETTER ETH
+0x8c 0x00f0 #LATIN SMALL LETTER ETH
+0x8d 0x00de #LATIN CAPITAL LETTER THORN
+0x8e 0x00c4 #LATIN CAPITAL LETTER A WITH DIAERESIS
+0x8f 0x00c5 #LATIN CAPITAL LETTER A WITH RING ABOVE
+0x90 0x00c9 #LATIN CAPITAL LETTER E WITH ACUTE
+0x91 0x00e6 #LATIN SMALL LIGATURE AE
+0x92 0x00c6 #LATIN CAPITAL LIGATURE AE
+0x93 0x00f4 #LATIN SMALL LETTER O WITH CIRCUMFLEX
+0x94 0x00f6 #LATIN SMALL LETTER O WITH DIAERESIS
+0x95 0x00fe #LATIN SMALL LETTER THORN
+0x96 0x00fb #LATIN SMALL LETTER U WITH CIRCUMFLEX
+0x97 0x00dd #LATIN CAPITAL LETTER Y WITH ACUTE
+0x98 0x00fd #LATIN SMALL LETTER Y WITH ACUTE
+0x99 0x00d6 #LATIN CAPITAL LETTER O WITH DIAERESIS
+0x9a 0x00dc #LATIN CAPITAL LETTER U WITH DIAERESIS
+0x9b 0x00f8 #LATIN SMALL LETTER O WITH STROKE
+0x9c 0x00a3 #POUND SIGN
+0x9d 0x00d8 #LATIN CAPITAL LETTER O WITH STROKE
+0x9e 0x20a7 #PESETA SIGN
+0x9f 0x0192 #LATIN SMALL LETTER F WITH HOOK
+0xa0 0x00e1 #LATIN SMALL LETTER A WITH ACUTE
+0xa1 0x00ed #LATIN SMALL LETTER I WITH ACUTE
+0xa2 0x00f3 #LATIN SMALL LETTER O WITH ACUTE
+0xa3 0x00fa #LATIN SMALL LETTER U WITH ACUTE
+0xa4 0x00c1 #LATIN CAPITAL LETTER A WITH ACUTE
+0xa5 0x00cd #LATIN CAPITAL LETTER I WITH ACUTE
+0xa6 0x00d3 #LATIN CAPITAL LETTER O WITH ACUTE
+0xa7 0x00da #LATIN CAPITAL LETTER U WITH ACUTE
+0xa8 0x00bf #INVERTED QUESTION MARK
+0xa9 0x2310 #REVERSED NOT SIGN
+0xaa 0x00ac #NOT SIGN
+0xab 0x00bd #VULGAR FRACTION ONE HALF
+0xac 0x00bc #VULGAR FRACTION ONE QUARTER
+0xad 0x00a1 #INVERTED EXCLAMATION MARK
+0xae 0x00ab #LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xaf 0x00bb #RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xb0 0x2591 #LIGHT SHADE
+0xb1 0x2592 #MEDIUM SHADE
+0xb2 0x2593 #DARK SHADE
+0xb3 0x2502 #BOX DRAWINGS LIGHT VERTICAL
+0xb4 0x2524 #BOX DRAWINGS LIGHT VERTICAL AND LEFT
+0xb5 0x2561 #BOX DRAWINGS VERTICAL SINGLE AND LEFT DOUBLE
+0xb6 0x2562 #BOX DRAWINGS VERTICAL DOUBLE AND LEFT SINGLE
+0xb7 0x2556 #BOX DRAWINGS DOWN DOUBLE AND LEFT SINGLE
+0xb8 0x2555 #BOX DRAWINGS DOWN SINGLE AND LEFT DOUBLE
+0xb9 0x2563 #BOX DRAWINGS DOUBLE VERTICAL AND LEFT
+0xba 0x2551 #BOX DRAWINGS DOUBLE VERTICAL
+0xbb 0x2557 #BOX DRAWINGS DOUBLE DOWN AND LEFT
+0xbc 0x255d #BOX DRAWINGS DOUBLE UP AND LEFT
+0xbd 0x255c #BOX DRAWINGS UP DOUBLE AND LEFT SINGLE
+0xbe 0x255b #BOX DRAWINGS UP SINGLE AND LEFT DOUBLE
+0xbf 0x2510 #BOX DRAWINGS LIGHT DOWN AND LEFT
+0xc0 0x2514 #BOX DRAWINGS LIGHT UP AND RIGHT
+0xc1 0x2534 #BOX DRAWINGS LIGHT UP AND HORIZONTAL
+0xc2 0x252c #BOX DRAWINGS LIGHT DOWN AND HORIZONTAL
+0xc3 0x251c #BOX DRAWINGS LIGHT VERTICAL AND RIGHT
+0xc4 0x2500 #BOX DRAWINGS LIGHT HORIZONTAL
+0xc5 0x253c #BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL
+0xc6 0x255e #BOX DRAWINGS VERTICAL SINGLE AND RIGHT DOUBLE
+0xc7 0x255f #BOX DRAWINGS VERTICAL DOUBLE AND RIGHT SINGLE
+0xc8 0x255a #BOX DRAWINGS DOUBLE UP AND RIGHT
+0xc9 0x2554 #BOX DRAWINGS DOUBLE DOWN AND RIGHT
+0xca 0x2569 #BOX DRAWINGS DOUBLE UP AND HORIZONTAL
+0xcb 0x2566 #BOX DRAWINGS DOUBLE DOWN AND HORIZONTAL
+0xcc 0x2560 #BOX DRAWINGS DOUBLE VERTICAL AND RIGHT
+0xcd 0x2550 #BOX DRAWINGS DOUBLE HORIZONTAL
+0xce 0x256c #BOX DRAWINGS DOUBLE VERTICAL AND HORIZONTAL
+0xcf 0x2567 #BOX DRAWINGS UP SINGLE AND HORIZONTAL DOUBLE
+0xd0 0x2568 #BOX DRAWINGS UP DOUBLE AND HORIZONTAL SINGLE
+0xd1 0x2564 #BOX DRAWINGS DOWN SINGLE AND HORIZONTAL DOUBLE
+0xd2 0x2565 #BOX DRAWINGS DOWN DOUBLE AND HORIZONTAL SINGLE
+0xd3 0x2559 #BOX DRAWINGS UP DOUBLE AND RIGHT SINGLE
+0xd4 0x2558 #BOX DRAWINGS UP SINGLE AND RIGHT DOUBLE
+0xd5 0x2552 #BOX DRAWINGS DOWN SINGLE AND RIGHT DOUBLE
+0xd6 0x2553 #BOX DRAWINGS DOWN DOUBLE AND RIGHT SINGLE
+0xd7 0x256b #BOX DRAWINGS VERTICAL DOUBLE AND HORIZONTAL SINGLE
+0xd8 0x256a #BOX DRAWINGS VERTICAL SINGLE AND HORIZONTAL DOUBLE
+0xd9 0x2518 #BOX DRAWINGS LIGHT UP AND LEFT
+0xda 0x250c #BOX DRAWINGS LIGHT DOWN AND RIGHT
+0xdb 0x2588 #FULL BLOCK
+0xdc 0x2584 #LOWER HALF BLOCK
+0xdd 0x258c #LEFT HALF BLOCK
+0xde 0x2590 #RIGHT HALF BLOCK
+0xdf 0x2580 #UPPER HALF BLOCK
+0xe0 0x03b1 #GREEK SMALL LETTER ALPHA
+0xe1 0x00df #LATIN SMALL LETTER SHARP S
+0xe2 0x0393 #GREEK CAPITAL LETTER GAMMA
+0xe3 0x03c0 #GREEK SMALL LETTER PI
+0xe4 0x03a3 #GREEK CAPITAL LETTER SIGMA
+0xe5 0x03c3 #GREEK SMALL LETTER SIGMA
+0xe6 0x00b5 #MICRO SIGN
+0xe7 0x03c4 #GREEK SMALL LETTER TAU
+0xe8 0x03a6 #GREEK CAPITAL LETTER PHI
+0xe9 0x0398 #GREEK CAPITAL LETTER THETA
+0xea 0x03a9 #GREEK CAPITAL LETTER OMEGA
+0xeb 0x03b4 #GREEK SMALL LETTER DELTA
+0xec 0x221e #INFINITY
+0xed 0x03c6 #GREEK SMALL LETTER PHI
+0xee 0x03b5 #GREEK SMALL LETTER EPSILON
+0xef 0x2229 #INTERSECTION
+0xf0 0x2261 #IDENTICAL TO
+0xf1 0x00b1 #PLUS-MINUS SIGN
+0xf2 0x2265 #GREATER-THAN OR EQUAL TO
+0xf3 0x2264 #LESS-THAN OR EQUAL TO
+0xf4 0x2320 #TOP HALF INTEGRAL
+0xf5 0x2321 #BOTTOM HALF INTEGRAL
+0xf6 0x00f7 #DIVISION SIGN
+0xf7 0x2248 #ALMOST EQUAL TO
+0xf8 0x00b0 #DEGREE SIGN
+0xf9 0x2219 #BULLET OPERATOR
+0xfa 0x00b7 #MIDDLE DOT
+0xfb 0x221a #SQUARE ROOT
+0xfc 0x207f #SUPERSCRIPT LATIN SMALL LETTER N
+0xfd 0x00b2 #SUPERSCRIPT TWO
+0xfe 0x25a0 #BLACK SQUARE
+0xff 0x00a0 #NO-BREAK SPACE
+
+\1a
\ No newline at end of file
--- /dev/null
+#
+# Name: cp862_DOSHebrew to Unicode table
+# Unicode version: 2.0
+# Table version: 2.00
+# Table format: Format A
+# Date: 04/24/96
+# Authors: Lori Brownell <loribr@microsoft.com>
+# K.D. Chang <a-kchang@microsoft.com>
+# General notes: none
+#
+# Format: Three tab-separated columns
+# Column #1 is the cp862_DOSHebrew code (in hex)
+# Column #2 is the Unicode (in hex as 0xXXXX)
+# Column #3 is the Unicode name (follows a comment sign, '#')
+#
+# The entries are in cp862_DOSHebrew order
+#
+0x00 0x0000 #NULL
+0x01 0x0001 #START OF HEADING
+0x02 0x0002 #START OF TEXT
+0x03 0x0003 #END OF TEXT
+0x04 0x0004 #END OF TRANSMISSION
+0x05 0x0005 #ENQUIRY
+0x06 0x0006 #ACKNOWLEDGE
+0x07 0x0007 #BELL
+0x08 0x0008 #BACKSPACE
+0x09 0x0009 #HORIZONTAL TABULATION
+0x0a 0x000a #LINE FEED
+0x0b 0x000b #VERTICAL TABULATION
+0x0c 0x000c #FORM FEED
+0x0d 0x000d #CARRIAGE RETURN
+0x0e 0x000e #SHIFT OUT
+0x0f 0x000f #SHIFT IN
+0x10 0x0010 #DATA LINK ESCAPE
+0x11 0x0011 #DEVICE CONTROL ONE
+0x12 0x0012 #DEVICE CONTROL TWO
+0x13 0x0013 #DEVICE CONTROL THREE
+0x14 0x0014 #DEVICE CONTROL FOUR
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE
+0x16 0x0016 #SYNCHRONOUS IDLE
+0x17 0x0017 #END OF TRANSMISSION BLOCK
+0x18 0x0018 #CANCEL
+0x19 0x0019 #END OF MEDIUM
+0x1a 0x001a #SUBSTITUTE
+0x1b 0x001b #ESCAPE
+0x1c 0x001c #FILE SEPARATOR
+0x1d 0x001d #GROUP SEPARATOR
+0x1e 0x001e #RECORD SEPARATOR
+0x1f 0x001f #UNIT SEPARATOR
+0x20 0x0020 #SPACE
+0x21 0x0021 #EXCLAMATION MARK
+0x22 0x0022 #QUOTATION MARK
+0x23 0x0023 #NUMBER SIGN
+0x24 0x0024 #DOLLAR SIGN
+0x25 0x0025 #PERCENT SIGN
+0x26 0x0026 #AMPERSAND
+0x27 0x0027 #APOSTROPHE
+0x28 0x0028 #LEFT PARENTHESIS
+0x29 0x0029 #RIGHT PARENTHESIS
+0x2a 0x002a #ASTERISK
+0x2b 0x002b #PLUS SIGN
+0x2c 0x002c #COMMA
+0x2d 0x002d #HYPHEN-MINUS
+0x2e 0x002e #FULL STOP
+0x2f 0x002f #SOLIDUS
+0x30 0x0030 #DIGIT ZERO
+0x31 0x0031 #DIGIT ONE
+0x32 0x0032 #DIGIT TWO
+0x33 0x0033 #DIGIT THREE
+0x34 0x0034 #DIGIT FOUR
+0x35 0x0035 #DIGIT FIVE
+0x36 0x0036 #DIGIT SIX
+0x37 0x0037 #DIGIT SEVEN
+0x38 0x0038 #DIGIT EIGHT
+0x39 0x0039 #DIGIT NINE
+0x3a 0x003a #COLON
+0x3b 0x003b #SEMICOLON
+0x3c 0x003c #LESS-THAN SIGN
+0x3d 0x003d #EQUALS SIGN
+0x3e 0x003e #GREATER-THAN SIGN
+0x3f 0x003f #QUESTION MARK
+0x40 0x0040 #COMMERCIAL AT
+0x41 0x0041 #LATIN CAPITAL LETTER A
+0x42 0x0042 #LATIN CAPITAL LETTER B
+0x43 0x0043 #LATIN CAPITAL LETTER C
+0x44 0x0044 #LATIN CAPITAL LETTER D
+0x45 0x0045 #LATIN CAPITAL LETTER E
+0x46 0x0046 #LATIN CAPITAL LETTER F
+0x47 0x0047 #LATIN CAPITAL LETTER G
+0x48 0x0048 #LATIN CAPITAL LETTER H
+0x49 0x0049 #LATIN CAPITAL LETTER I
+0x4a 0x004a #LATIN CAPITAL LETTER J
+0x4b 0x004b #LATIN CAPITAL LETTER K
+0x4c 0x004c #LATIN CAPITAL LETTER L
+0x4d 0x004d #LATIN CAPITAL LETTER M
+0x4e 0x004e #LATIN CAPITAL LETTER N
+0x4f 0x004f #LATIN CAPITAL LETTER O
+0x50 0x0050 #LATIN CAPITAL LETTER P
+0x51 0x0051 #LATIN CAPITAL LETTER Q
+0x52 0x0052 #LATIN CAPITAL LETTER R
+0x53 0x0053 #LATIN CAPITAL LETTER S
+0x54 0x0054 #LATIN CAPITAL LETTER T
+0x55 0x0055 #LATIN CAPITAL LETTER U
+0x56 0x0056 #LATIN CAPITAL LETTER V
+0x57 0x0057 #LATIN CAPITAL LETTER W
+0x58 0x0058 #LATIN CAPITAL LETTER X
+0x59 0x0059 #LATIN CAPITAL LETTER Y
+0x5a 0x005a #LATIN CAPITAL LETTER Z
+0x5b 0x005b #LEFT SQUARE BRACKET
+0x5c 0x005c #REVERSE SOLIDUS
+0x5d 0x005d #RIGHT SQUARE BRACKET
+0x5e 0x005e #CIRCUMFLEX ACCENT
+0x5f 0x005f #LOW LINE
+0x60 0x0060 #GRAVE ACCENT
+0x61 0x0061 #LATIN SMALL LETTER A
+0x62 0x0062 #LATIN SMALL LETTER B
+0x63 0x0063 #LATIN SMALL LETTER C
+0x64 0x0064 #LATIN SMALL LETTER D
+0x65 0x0065 #LATIN SMALL LETTER E
+0x66 0x0066 #LATIN SMALL LETTER F
+0x67 0x0067 #LATIN SMALL LETTER G
+0x68 0x0068 #LATIN SMALL LETTER H
+0x69 0x0069 #LATIN SMALL LETTER I
+0x6a 0x006a #LATIN SMALL LETTER J
+0x6b 0x006b #LATIN SMALL LETTER K
+0x6c 0x006c #LATIN SMALL LETTER L
+0x6d 0x006d #LATIN SMALL LETTER M
+0x6e 0x006e #LATIN SMALL LETTER N
+0x6f 0x006f #LATIN SMALL LETTER O
+0x70 0x0070 #LATIN SMALL LETTER P
+0x71 0x0071 #LATIN SMALL LETTER Q
+0x72 0x0072 #LATIN SMALL LETTER R
+0x73 0x0073 #LATIN SMALL LETTER S
+0x74 0x0074 #LATIN SMALL LETTER T
+0x75 0x0075 #LATIN SMALL LETTER U
+0x76 0x0076 #LATIN SMALL LETTER V
+0x77 0x0077 #LATIN SMALL LETTER W
+0x78 0x0078 #LATIN SMALL LETTER X
+0x79 0x0079 #LATIN SMALL LETTER Y
+0x7a 0x007a #LATIN SMALL LETTER Z
+0x7b 0x007b #LEFT CURLY BRACKET
+0x7c 0x007c #VERTICAL LINE
+0x7d 0x007d #RIGHT CURLY BRACKET
+0x7e 0x007e #TILDE
+0x7f 0x007f #DELETE
+0x80 0x05d0 #HEBREW LETTER ALEF
+0x81 0x05d1 #HEBREW LETTER BET
+0x82 0x05d2 #HEBREW LETTER GIMEL
+0x83 0x05d3 #HEBREW LETTER DALET
+0x84 0x05d4 #HEBREW LETTER HE
+0x85 0x05d5 #HEBREW LETTER VAV
+0x86 0x05d6 #HEBREW LETTER ZAYIN
+0x87 0x05d7 #HEBREW LETTER HET
+0x88 0x05d8 #HEBREW LETTER TET
+0x89 0x05d9 #HEBREW LETTER YOD
+0x8a 0x05da #HEBREW LETTER FINAL KAF
+0x8b 0x05db #HEBREW LETTER KAF
+0x8c 0x05dc #HEBREW LETTER LAMED
+0x8d 0x05dd #HEBREW LETTER FINAL MEM
+0x8e 0x05de #HEBREW LETTER MEM
+0x8f 0x05df #HEBREW LETTER FINAL NUN
+0x90 0x05e0 #HEBREW LETTER NUN
+0x91 0x05e1 #HEBREW LETTER SAMEKH
+0x92 0x05e2 #HEBREW LETTER AYIN
+0x93 0x05e3 #HEBREW LETTER FINAL PE
+0x94 0x05e4 #HEBREW LETTER PE
+0x95 0x05e5 #HEBREW LETTER FINAL TSADI
+0x96 0x05e6 #HEBREW LETTER TSADI
+0x97 0x05e7 #HEBREW LETTER QOF
+0x98 0x05e8 #HEBREW LETTER RESH
+0x99 0x05e9 #HEBREW LETTER SHIN
+0x9a 0x05ea #HEBREW LETTER TAV
+0x9b 0x00a2 #CENT SIGN
+0x9c 0x00a3 #POUND SIGN
+0x9d 0x00a5 #YEN SIGN
+0x9e 0x20a7 #PESETA SIGN
+0x9f 0x0192 #LATIN SMALL LETTER F WITH HOOK
+0xa0 0x00e1 #LATIN SMALL LETTER A WITH ACUTE
+0xa1 0x00ed #LATIN SMALL LETTER I WITH ACUTE
+0xa2 0x00f3 #LATIN SMALL LETTER O WITH ACUTE
+0xa3 0x00fa #LATIN SMALL LETTER U WITH ACUTE
+0xa4 0x00f1 #LATIN SMALL LETTER N WITH TILDE
+0xa5 0x00d1 #LATIN CAPITAL LETTER N WITH TILDE
+0xa6 0x00aa #FEMININE ORDINAL INDICATOR
+0xa7 0x00ba #MASCULINE ORDINAL INDICATOR
+0xa8 0x00bf #INVERTED QUESTION MARK
+0xa9 0x2310 #REVERSED NOT SIGN
+0xaa 0x00ac #NOT SIGN
+0xab 0x00bd #VULGAR FRACTION ONE HALF
+0xac 0x00bc #VULGAR FRACTION ONE QUARTER
+0xad 0x00a1 #INVERTED EXCLAMATION MARK
+0xae 0x00ab #LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xaf 0x00bb #RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xb0 0x2591 #LIGHT SHADE
+0xb1 0x2592 #MEDIUM SHADE
+0xb2 0x2593 #DARK SHADE
+0xb3 0x2502 #BOX DRAWINGS LIGHT VERTICAL
+0xb4 0x2524 #BOX DRAWINGS LIGHT VERTICAL AND LEFT
+0xb5 0x2561 #BOX DRAWINGS VERTICAL SINGLE AND LEFT DOUBLE
+0xb6 0x2562 #BOX DRAWINGS VERTICAL DOUBLE AND LEFT SINGLE
+0xb7 0x2556 #BOX DRAWINGS DOWN DOUBLE AND LEFT SINGLE
+0xb8 0x2555 #BOX DRAWINGS DOWN SINGLE AND LEFT DOUBLE
+0xb9 0x2563 #BOX DRAWINGS DOUBLE VERTICAL AND LEFT
+0xba 0x2551 #BOX DRAWINGS DOUBLE VERTICAL
+0xbb 0x2557 #BOX DRAWINGS DOUBLE DOWN AND LEFT
+0xbc 0x255d #BOX DRAWINGS DOUBLE UP AND LEFT
+0xbd 0x255c #BOX DRAWINGS UP DOUBLE AND LEFT SINGLE
+0xbe 0x255b #BOX DRAWINGS UP SINGLE AND LEFT DOUBLE
+0xbf 0x2510 #BOX DRAWINGS LIGHT DOWN AND LEFT
+0xc0 0x2514 #BOX DRAWINGS LIGHT UP AND RIGHT
+0xc1 0x2534 #BOX DRAWINGS LIGHT UP AND HORIZONTAL
+0xc2 0x252c #BOX DRAWINGS LIGHT DOWN AND HORIZONTAL
+0xc3 0x251c #BOX DRAWINGS LIGHT VERTICAL AND RIGHT
+0xc4 0x2500 #BOX DRAWINGS LIGHT HORIZONTAL
+0xc5 0x253c #BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL
+0xc6 0x255e #BOX DRAWINGS VERTICAL SINGLE AND RIGHT DOUBLE
+0xc7 0x255f #BOX DRAWINGS VERTICAL DOUBLE AND RIGHT SINGLE
+0xc8 0x255a #BOX DRAWINGS DOUBLE UP AND RIGHT
+0xc9 0x2554 #BOX DRAWINGS DOUBLE DOWN AND RIGHT
+0xca 0x2569 #BOX DRAWINGS DOUBLE UP AND HORIZONTAL
+0xcb 0x2566 #BOX DRAWINGS DOUBLE DOWN AND HORIZONTAL
+0xcc 0x2560 #BOX DRAWINGS DOUBLE VERTICAL AND RIGHT
+0xcd 0x2550 #BOX DRAWINGS DOUBLE HORIZONTAL
+0xce 0x256c #BOX DRAWINGS DOUBLE VERTICAL AND HORIZONTAL
+0xcf 0x2567 #BOX DRAWINGS UP SINGLE AND HORIZONTAL DOUBLE
+0xd0 0x2568 #BOX DRAWINGS UP DOUBLE AND HORIZONTAL SINGLE
+0xd1 0x2564 #BOX DRAWINGS DOWN SINGLE AND HORIZONTAL DOUBLE
+0xd2 0x2565 #BOX DRAWINGS DOWN DOUBLE AND HORIZONTAL SINGLE
+0xd3 0x2559 #BOX DRAWINGS UP DOUBLE AND RIGHT SINGLE
+0xd4 0x2558 #BOX DRAWINGS UP SINGLE AND RIGHT DOUBLE
+0xd5 0x2552 #BOX DRAWINGS DOWN SINGLE AND RIGHT DOUBLE
+0xd6 0x2553 #BOX DRAWINGS DOWN DOUBLE AND RIGHT SINGLE
+0xd7 0x256b #BOX DRAWINGS VERTICAL DOUBLE AND HORIZONTAL SINGLE
+0xd8 0x256a #BOX DRAWINGS VERTICAL SINGLE AND HORIZONTAL DOUBLE
+0xd9 0x2518 #BOX DRAWINGS LIGHT UP AND LEFT
+0xda 0x250c #BOX DRAWINGS LIGHT DOWN AND RIGHT
+0xdb 0x2588 #FULL BLOCK
+0xdc 0x2584 #LOWER HALF BLOCK
+0xdd 0x258c #LEFT HALF BLOCK
+0xde 0x2590 #RIGHT HALF BLOCK
+0xdf 0x2580 #UPPER HALF BLOCK
+0xe0 0x03b1 #GREEK SMALL LETTER ALPHA
+0xe1 0x00df #LATIN SMALL LETTER SHARP S (GERMAN)
+0xe2 0x0393 #GREEK CAPITAL LETTER GAMMA
+0xe3 0x03c0 #GREEK SMALL LETTER PI
+0xe4 0x03a3 #GREEK CAPITAL LETTER SIGMA
+0xe5 0x03c3 #GREEK SMALL LETTER SIGMA
+0xe6 0x00b5 #MICRO SIGN
+0xe7 0x03c4 #GREEK SMALL LETTER TAU
+0xe8 0x03a6 #GREEK CAPITAL LETTER PHI
+0xe9 0x0398 #GREEK CAPITAL LETTER THETA
+0xea 0x03a9 #GREEK CAPITAL LETTER OMEGA
+0xeb 0x03b4 #GREEK SMALL LETTER DELTA
+0xec 0x221e #INFINITY
+0xed 0x03c6 #GREEK SMALL LETTER PHI
+0xee 0x03b5 #GREEK SMALL LETTER EPSILON
+0xef 0x2229 #INTERSECTION
+0xf0 0x2261 #IDENTICAL TO
+0xf1 0x00b1 #PLUS-MINUS SIGN
+0xf2 0x2265 #GREATER-THAN OR EQUAL TO
+0xf3 0x2264 #LESS-THAN OR EQUAL TO
+0xf4 0x2320 #TOP HALF INTEGRAL
+0xf5 0x2321 #BOTTOM HALF INTEGRAL
+0xf6 0x00f7 #DIVISION SIGN
+0xf7 0x2248 #ALMOST EQUAL TO
+0xf8 0x00b0 #DEGREE SIGN
+0xf9 0x2219 #BULLET OPERATOR
+0xfa 0x00b7 #MIDDLE DOT
+0xfb 0x221a #SQUARE ROOT
+0xfc 0x207f #SUPERSCRIPT LATIN SMALL LETTER N
+0xfd 0x00b2 #SUPERSCRIPT TWO
+0xfe 0x25a0 #BLACK SQUARE
+0xff 0x00a0 #NO-BREAK SPACE
+
+\1a
\ No newline at end of file
--- /dev/null
+#
+# Name: cp863_DOSCanadaF to Unicode table
+# Unicode version: 2.0
+# Table version: 2.00
+# Table format: Format A
+# Date: 04/24/96
+# Authors: Lori Brownell <loribr@microsoft.com>
+# K.D. Chang <a-kchang@microsoft.com>
+# General notes: none
+#
+# Format: Three tab-separated columns
+# Column #1 is the cp863_DOSCanadaF code (in hex)
+# Column #2 is the Unicode (in hex as 0xXXXX)
+# Column #3 is the Unicode name (follows a comment sign, '#')
+#
+# The entries are in cp863_DOSCanadaF order
+#
+0x00 0x0000 #NULL
+0x01 0x0001 #START OF HEADING
+0x02 0x0002 #START OF TEXT
+0x03 0x0003 #END OF TEXT
+0x04 0x0004 #END OF TRANSMISSION
+0x05 0x0005 #ENQUIRY
+0x06 0x0006 #ACKNOWLEDGE
+0x07 0x0007 #BELL
+0x08 0x0008 #BACKSPACE
+0x09 0x0009 #HORIZONTAL TABULATION
+0x0a 0x000a #LINE FEED
+0x0b 0x000b #VERTICAL TABULATION
+0x0c 0x000c #FORM FEED
+0x0d 0x000d #CARRIAGE RETURN
+0x0e 0x000e #SHIFT OUT
+0x0f 0x000f #SHIFT IN
+0x10 0x0010 #DATA LINK ESCAPE
+0x11 0x0011 #DEVICE CONTROL ONE
+0x12 0x0012 #DEVICE CONTROL TWO
+0x13 0x0013 #DEVICE CONTROL THREE
+0x14 0x0014 #DEVICE CONTROL FOUR
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE
+0x16 0x0016 #SYNCHRONOUS IDLE
+0x17 0x0017 #END OF TRANSMISSION BLOCK
+0x18 0x0018 #CANCEL
+0x19 0x0019 #END OF MEDIUM
+0x1a 0x001a #SUBSTITUTE
+0x1b 0x001b #ESCAPE
+0x1c 0x001c #FILE SEPARATOR
+0x1d 0x001d #GROUP SEPARATOR
+0x1e 0x001e #RECORD SEPARATOR
+0x1f 0x001f #UNIT SEPARATOR
+0x20 0x0020 #SPACE
+0x21 0x0021 #EXCLAMATION MARK
+0x22 0x0022 #QUOTATION MARK
+0x23 0x0023 #NUMBER SIGN
+0x24 0x0024 #DOLLAR SIGN
+0x25 0x0025 #PERCENT SIGN
+0x26 0x0026 #AMPERSAND
+0x27 0x0027 #APOSTROPHE
+0x28 0x0028 #LEFT PARENTHESIS
+0x29 0x0029 #RIGHT PARENTHESIS
+0x2a 0x002a #ASTERISK
+0x2b 0x002b #PLUS SIGN
+0x2c 0x002c #COMMA
+0x2d 0x002d #HYPHEN-MINUS
+0x2e 0x002e #FULL STOP
+0x2f 0x002f #SOLIDUS
+0x30 0x0030 #DIGIT ZERO
+0x31 0x0031 #DIGIT ONE
+0x32 0x0032 #DIGIT TWO
+0x33 0x0033 #DIGIT THREE
+0x34 0x0034 #DIGIT FOUR
+0x35 0x0035 #DIGIT FIVE
+0x36 0x0036 #DIGIT SIX
+0x37 0x0037 #DIGIT SEVEN
+0x38 0x0038 #DIGIT EIGHT
+0x39 0x0039 #DIGIT NINE
+0x3a 0x003a #COLON
+0x3b 0x003b #SEMICOLON
+0x3c 0x003c #LESS-THAN SIGN
+0x3d 0x003d #EQUALS SIGN
+0x3e 0x003e #GREATER-THAN SIGN
+0x3f 0x003f #QUESTION MARK
+0x40 0x0040 #COMMERCIAL AT
+0x41 0x0041 #LATIN CAPITAL LETTER A
+0x42 0x0042 #LATIN CAPITAL LETTER B
+0x43 0x0043 #LATIN CAPITAL LETTER C
+0x44 0x0044 #LATIN CAPITAL LETTER D
+0x45 0x0045 #LATIN CAPITAL LETTER E
+0x46 0x0046 #LATIN CAPITAL LETTER F
+0x47 0x0047 #LATIN CAPITAL LETTER G
+0x48 0x0048 #LATIN CAPITAL LETTER H
+0x49 0x0049 #LATIN CAPITAL LETTER I
+0x4a 0x004a #LATIN CAPITAL LETTER J
+0x4b 0x004b #LATIN CAPITAL LETTER K
+0x4c 0x004c #LATIN CAPITAL LETTER L
+0x4d 0x004d #LATIN CAPITAL LETTER M
+0x4e 0x004e #LATIN CAPITAL LETTER N
+0x4f 0x004f #LATIN CAPITAL LETTER O
+0x50 0x0050 #LATIN CAPITAL LETTER P
+0x51 0x0051 #LATIN CAPITAL LETTER Q
+0x52 0x0052 #LATIN CAPITAL LETTER R
+0x53 0x0053 #LATIN CAPITAL LETTER S
+0x54 0x0054 #LATIN CAPITAL LETTER T
+0x55 0x0055 #LATIN CAPITAL LETTER U
+0x56 0x0056 #LATIN CAPITAL LETTER V
+0x57 0x0057 #LATIN CAPITAL LETTER W
+0x58 0x0058 #LATIN CAPITAL LETTER X
+0x59 0x0059 #LATIN CAPITAL LETTER Y
+0x5a 0x005a #LATIN CAPITAL LETTER Z
+0x5b 0x005b #LEFT SQUARE BRACKET
+0x5c 0x005c #REVERSE SOLIDUS
+0x5d 0x005d #RIGHT SQUARE BRACKET
+0x5e 0x005e #CIRCUMFLEX ACCENT
+0x5f 0x005f #LOW LINE
+0x60 0x0060 #GRAVE ACCENT
+0x61 0x0061 #LATIN SMALL LETTER A
+0x62 0x0062 #LATIN SMALL LETTER B
+0x63 0x0063 #LATIN SMALL LETTER C
+0x64 0x0064 #LATIN SMALL LETTER D
+0x65 0x0065 #LATIN SMALL LETTER E
+0x66 0x0066 #LATIN SMALL LETTER F
+0x67 0x0067 #LATIN SMALL LETTER G
+0x68 0x0068 #LATIN SMALL LETTER H
+0x69 0x0069 #LATIN SMALL LETTER I
+0x6a 0x006a #LATIN SMALL LETTER J
+0x6b 0x006b #LATIN SMALL LETTER K
+0x6c 0x006c #LATIN SMALL LETTER L
+0x6d 0x006d #LATIN SMALL LETTER M
+0x6e 0x006e #LATIN SMALL LETTER N
+0x6f 0x006f #LATIN SMALL LETTER O
+0x70 0x0070 #LATIN SMALL LETTER P
+0x71 0x0071 #LATIN SMALL LETTER Q
+0x72 0x0072 #LATIN SMALL LETTER R
+0x73 0x0073 #LATIN SMALL LETTER S
+0x74 0x0074 #LATIN SMALL LETTER T
+0x75 0x0075 #LATIN SMALL LETTER U
+0x76 0x0076 #LATIN SMALL LETTER V
+0x77 0x0077 #LATIN SMALL LETTER W
+0x78 0x0078 #LATIN SMALL LETTER X
+0x79 0x0079 #LATIN SMALL LETTER Y
+0x7a 0x007a #LATIN SMALL LETTER Z
+0x7b 0x007b #LEFT CURLY BRACKET
+0x7c 0x007c #VERTICAL LINE
+0x7d 0x007d #RIGHT CURLY BRACKET
+0x7e 0x007e #TILDE
+0x7f 0x007f #DELETE
+0x80 0x00c7 #LATIN CAPITAL LETTER C WITH CEDILLA
+0x81 0x00fc #LATIN SMALL LETTER U WITH DIAERESIS
+0x82 0x00e9 #LATIN SMALL LETTER E WITH ACUTE
+0x83 0x00e2 #LATIN SMALL LETTER A WITH CIRCUMFLEX
+0x84 0x00c2 #LATIN CAPITAL LETTER A WITH CIRCUMFLEX
+0x85 0x00e0 #LATIN SMALL LETTER A WITH GRAVE
+0x86 0x00b6 #PILCROW SIGN
+0x87 0x00e7 #LATIN SMALL LETTER C WITH CEDILLA
+0x88 0x00ea #LATIN SMALL LETTER E WITH CIRCUMFLEX
+0x89 0x00eb #LATIN SMALL LETTER E WITH DIAERESIS
+0x8a 0x00e8 #LATIN SMALL LETTER E WITH GRAVE
+0x8b 0x00ef #LATIN SMALL LETTER I WITH DIAERESIS
+0x8c 0x00ee #LATIN SMALL LETTER I WITH CIRCUMFLEX
+0x8d 0x2017 #DOUBLE LOW LINE
+0x8e 0x00c0 #LATIN CAPITAL LETTER A WITH GRAVE
+0x8f 0x00a7 #SECTION SIGN
+0x90 0x00c9 #LATIN CAPITAL LETTER E WITH ACUTE
+0x91 0x00c8 #LATIN CAPITAL LETTER E WITH GRAVE
+0x92 0x00ca #LATIN CAPITAL LETTER E WITH CIRCUMFLEX
+0x93 0x00f4 #LATIN SMALL LETTER O WITH CIRCUMFLEX
+0x94 0x00cb #LATIN CAPITAL LETTER E WITH DIAERESIS
+0x95 0x00cf #LATIN CAPITAL LETTER I WITH DIAERESIS
+0x96 0x00fb #LATIN SMALL LETTER U WITH CIRCUMFLEX
+0x97 0x00f9 #LATIN SMALL LETTER U WITH GRAVE
+0x98 0x00a4 #CURRENCY SIGN
+0x99 0x00d4 #LATIN CAPITAL LETTER O WITH CIRCUMFLEX
+0x9a 0x00dc #LATIN CAPITAL LETTER U WITH DIAERESIS
+0x9b 0x00a2 #CENT SIGN
+0x9c 0x00a3 #POUND SIGN
+0x9d 0x00d9 #LATIN CAPITAL LETTER U WITH GRAVE
+0x9e 0x00db #LATIN CAPITAL LETTER U WITH CIRCUMFLEX
+0x9f 0x0192 #LATIN SMALL LETTER F WITH HOOK
+0xa0 0x00a6 #BROKEN BAR
+0xa1 0x00b4 #ACUTE ACCENT
+0xa2 0x00f3 #LATIN SMALL LETTER O WITH ACUTE
+0xa3 0x00fa #LATIN SMALL LETTER U WITH ACUTE
+0xa4 0x00a8 #DIAERESIS
+0xa5 0x00b8 #CEDILLA
+0xa6 0x00b3 #SUPERSCRIPT THREE
+0xa7 0x00af #MACRON
+0xa8 0x00ce #LATIN CAPITAL LETTER I WITH CIRCUMFLEX
+0xa9 0x2310 #REVERSED NOT SIGN
+0xaa 0x00ac #NOT SIGN
+0xab 0x00bd #VULGAR FRACTION ONE HALF
+0xac 0x00bc #VULGAR FRACTION ONE QUARTER
+0xad 0x00be #VULGAR FRACTION THREE QUARTERS
+0xae 0x00ab #LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xaf 0x00bb #RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xb0 0x2591 #LIGHT SHADE
+0xb1 0x2592 #MEDIUM SHADE
+0xb2 0x2593 #DARK SHADE
+0xb3 0x2502 #BOX DRAWINGS LIGHT VERTICAL
+0xb4 0x2524 #BOX DRAWINGS LIGHT VERTICAL AND LEFT
+0xb5 0x2561 #BOX DRAWINGS VERTICAL SINGLE AND LEFT DOUBLE
+0xb6 0x2562 #BOX DRAWINGS VERTICAL DOUBLE AND LEFT SINGLE
+0xb7 0x2556 #BOX DRAWINGS DOWN DOUBLE AND LEFT SINGLE
+0xb8 0x2555 #BOX DRAWINGS DOWN SINGLE AND LEFT DOUBLE
+0xb9 0x2563 #BOX DRAWINGS DOUBLE VERTICAL AND LEFT
+0xba 0x2551 #BOX DRAWINGS DOUBLE VERTICAL
+0xbb 0x2557 #BOX DRAWINGS DOUBLE DOWN AND LEFT
+0xbc 0x255d #BOX DRAWINGS DOUBLE UP AND LEFT
+0xbd 0x255c #BOX DRAWINGS UP DOUBLE AND LEFT SINGLE
+0xbe 0x255b #BOX DRAWINGS UP SINGLE AND LEFT DOUBLE
+0xbf 0x2510 #BOX DRAWINGS LIGHT DOWN AND LEFT
+0xc0 0x2514 #BOX DRAWINGS LIGHT UP AND RIGHT
+0xc1 0x2534 #BOX DRAWINGS LIGHT UP AND HORIZONTAL
+0xc2 0x252c #BOX DRAWINGS LIGHT DOWN AND HORIZONTAL
+0xc3 0x251c #BOX DRAWINGS LIGHT VERTICAL AND RIGHT
+0xc4 0x2500 #BOX DRAWINGS LIGHT HORIZONTAL
+0xc5 0x253c #BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL
+0xc6 0x255e #BOX DRAWINGS VERTICAL SINGLE AND RIGHT DOUBLE
+0xc7 0x255f #BOX DRAWINGS VERTICAL DOUBLE AND RIGHT SINGLE
+0xc8 0x255a #BOX DRAWINGS DOUBLE UP AND RIGHT
+0xc9 0x2554 #BOX DRAWINGS DOUBLE DOWN AND RIGHT
+0xca 0x2569 #BOX DRAWINGS DOUBLE UP AND HORIZONTAL
+0xcb 0x2566 #BOX DRAWINGS DOUBLE DOWN AND HORIZONTAL
+0xcc 0x2560 #BOX DRAWINGS DOUBLE VERTICAL AND RIGHT
+0xcd 0x2550 #BOX DRAWINGS DOUBLE HORIZONTAL
+0xce 0x256c #BOX DRAWINGS DOUBLE VERTICAL AND HORIZONTAL
+0xcf 0x2567 #BOX DRAWINGS UP SINGLE AND HORIZONTAL DOUBLE
+0xd0 0x2568 #BOX DRAWINGS UP DOUBLE AND HORIZONTAL SINGLE
+0xd1 0x2564 #BOX DRAWINGS DOWN SINGLE AND HORIZONTAL DOUBLE
+0xd2 0x2565 #BOX DRAWINGS DOWN DOUBLE AND HORIZONTAL SINGLE
+0xd3 0x2559 #BOX DRAWINGS UP DOUBLE AND RIGHT SINGLE
+0xd4 0x2558 #BOX DRAWINGS UP SINGLE AND RIGHT DOUBLE
+0xd5 0x2552 #BOX DRAWINGS DOWN SINGLE AND RIGHT DOUBLE
+0xd6 0x2553 #BOX DRAWINGS DOWN DOUBLE AND RIGHT SINGLE
+0xd7 0x256b #BOX DRAWINGS VERTICAL DOUBLE AND HORIZONTAL SINGLE
+0xd8 0x256a #BOX DRAWINGS VERTICAL SINGLE AND HORIZONTAL DOUBLE
+0xd9 0x2518 #BOX DRAWINGS LIGHT UP AND LEFT
+0xda 0x250c #BOX DRAWINGS LIGHT DOWN AND RIGHT
+0xdb 0x2588 #FULL BLOCK
+0xdc 0x2584 #LOWER HALF BLOCK
+0xdd 0x258c #LEFT HALF BLOCK
+0xde 0x2590 #RIGHT HALF BLOCK
+0xdf 0x2580 #UPPER HALF BLOCK
+0xe0 0x03b1 #GREEK SMALL LETTER ALPHA
+0xe1 0x00df #LATIN SMALL LETTER SHARP S
+0xe2 0x0393 #GREEK CAPITAL LETTER GAMMA
+0xe3 0x03c0 #GREEK SMALL LETTER PI
+0xe4 0x03a3 #GREEK CAPITAL LETTER SIGMA
+0xe5 0x03c3 #GREEK SMALL LETTER SIGMA
+0xe6 0x00b5 #MICRO SIGN
+0xe7 0x03c4 #GREEK SMALL LETTER TAU
+0xe8 0x03a6 #GREEK CAPITAL LETTER PHI
+0xe9 0x0398 #GREEK CAPITAL LETTER THETA
+0xea 0x03a9 #GREEK CAPITAL LETTER OMEGA
+0xeb 0x03b4 #GREEK SMALL LETTER DELTA
+0xec 0x221e #INFINITY
+0xed 0x03c6 #GREEK SMALL LETTER PHI
+0xee 0x03b5 #GREEK SMALL LETTER EPSILON
+0xef 0x2229 #INTERSECTION
+0xf0 0x2261 #IDENTICAL TO
+0xf1 0x00b1 #PLUS-MINUS SIGN
+0xf2 0x2265 #GREATER-THAN OR EQUAL TO
+0xf3 0x2264 #LESS-THAN OR EQUAL TO
+0xf4 0x2320 #TOP HALF INTEGRAL
+0xf5 0x2321 #BOTTOM HALF INTEGRAL
+0xf6 0x00f7 #DIVISION SIGN
+0xf7 0x2248 #ALMOST EQUAL TO
+0xf8 0x00b0 #DEGREE SIGN
+0xf9 0x2219 #BULLET OPERATOR
+0xfa 0x00b7 #MIDDLE DOT
+0xfb 0x221a #SQUARE ROOT
+0xfc 0x207f #SUPERSCRIPT LATIN SMALL LETTER N
+0xfd 0x00b2 #SUPERSCRIPT TWO
+0xfe 0x25a0 #BLACK SQUARE
+0xff 0x00a0 #NO-BREAK SPACE
+
+\1a
\ No newline at end of file
--- /dev/null
+#
+# Name: cp864_DOSArabic to Unicode table
+# Unicode version: 2.0
+# Table version: 2.00
+# Table format: Format A
+# Date: 04/24/96
+# Authors: Lori Brownell <loribr@microsoft.com>
+# K.D. Chang <a-kchang@microsoft.com>
+# General notes: none
+#
+# Format: Three tab-separated columns
+# Column #1 is the cp864_DOSArabic code (in hex)
+# Column #2 is the Unicode (in hex as 0xXXXX)
+# Column #3 is the Unicode name (follows a comment sign, '#')
+#
+# The entries are in cp864_DOSArabic order
+#
+0x00 0x0000 #NULL
+0x01 0x0001 #START OF HEADING
+0x02 0x0002 #START OF TEXT
+0x03 0x0003 #END OF TEXT
+0x04 0x0004 #END OF TRANSMISSION
+0x05 0x0005 #ENQUIRY
+0x06 0x0006 #ACKNOWLEDGE
+0x07 0x0007 #BELL
+0x08 0x0008 #BACKSPACE
+0x09 0x0009 #HORIZONTAL TABULATION
+0x0a 0x000a #LINE FEED
+0x0b 0x000b #VERTICAL TABULATION
+0x0c 0x000c #FORM FEED
+0x0d 0x000d #CARRIAGE RETURN
+0x0e 0x000e #SHIFT OUT
+0x0f 0x000f #SHIFT IN
+0x10 0x0010 #DATA LINK ESCAPE
+0x11 0x0011 #DEVICE CONTROL ONE
+0x12 0x0012 #DEVICE CONTROL TWO
+0x13 0x0013 #DEVICE CONTROL THREE
+0x14 0x0014 #DEVICE CONTROL FOUR
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE
+0x16 0x0016 #SYNCHRONOUS IDLE
+0x17 0x0017 #END OF TRANSMISSION BLOCK
+0x18 0x0018 #CANCEL
+0x19 0x0019 #END OF MEDIUM
+0x1a 0x001a #SUBSTITUTE
+0x1b 0x001b #ESCAPE
+0x1c 0x001c #FILE SEPARATOR
+0x1d 0x001d #GROUP SEPARATOR
+0x1e 0x001e #RECORD SEPARATOR
+0x1f 0x001f #UNIT SEPARATOR
+0x20 0x0020 #SPACE
+0x21 0x0021 #EXCLAMATION MARK
+0x22 0x0022 #QUOTATION MARK
+0x23 0x0023 #NUMBER SIGN
+0x24 0x0024 #DOLLAR SIGN
+0x25 0x066a #ARABIC PERCENT SIGN
+0x26 0x0026 #AMPERSAND
+0x27 0x0027 #APOSTROPHE
+0x28 0x0028 #LEFT PARENTHESIS
+0x29 0x0029 #RIGHT PARENTHESIS
+0x2a 0x002a #ASTERISK
+0x2b 0x002b #PLUS SIGN
+0x2c 0x002c #COMMA
+0x2d 0x002d #HYPHEN-MINUS
+0x2e 0x002e #FULL STOP
+0x2f 0x002f #SOLIDUS
+0x30 0x0030 #DIGIT ZERO
+0x31 0x0031 #DIGIT ONE
+0x32 0x0032 #DIGIT TWO
+0x33 0x0033 #DIGIT THREE
+0x34 0x0034 #DIGIT FOUR
+0x35 0x0035 #DIGIT FIVE
+0x36 0x0036 #DIGIT SIX
+0x37 0x0037 #DIGIT SEVEN
+0x38 0x0038 #DIGIT EIGHT
+0x39 0x0039 #DIGIT NINE
+0x3a 0x003a #COLON
+0x3b 0x003b #SEMICOLON
+0x3c 0x003c #LESS-THAN SIGN
+0x3d 0x003d #EQUALS SIGN
+0x3e 0x003e #GREATER-THAN SIGN
+0x3f 0x003f #QUESTION MARK
+0x40 0x0040 #COMMERCIAL AT
+0x41 0x0041 #LATIN CAPITAL LETTER A
+0x42 0x0042 #LATIN CAPITAL LETTER B
+0x43 0x0043 #LATIN CAPITAL LETTER C
+0x44 0x0044 #LATIN CAPITAL LETTER D
+0x45 0x0045 #LATIN CAPITAL LETTER E
+0x46 0x0046 #LATIN CAPITAL LETTER F
+0x47 0x0047 #LATIN CAPITAL LETTER G
+0x48 0x0048 #LATIN CAPITAL LETTER H
+0x49 0x0049 #LATIN CAPITAL LETTER I
+0x4a 0x004a #LATIN CAPITAL LETTER J
+0x4b 0x004b #LATIN CAPITAL LETTER K
+0x4c 0x004c #LATIN CAPITAL LETTER L
+0x4d 0x004d #LATIN CAPITAL LETTER M
+0x4e 0x004e #LATIN CAPITAL LETTER N
+0x4f 0x004f #LATIN CAPITAL LETTER O
+0x50 0x0050 #LATIN CAPITAL LETTER P
+0x51 0x0051 #LATIN CAPITAL LETTER Q
+0x52 0x0052 #LATIN CAPITAL LETTER R
+0x53 0x0053 #LATIN CAPITAL LETTER S
+0x54 0x0054 #LATIN CAPITAL LETTER T
+0x55 0x0055 #LATIN CAPITAL LETTER U
+0x56 0x0056 #LATIN CAPITAL LETTER V
+0x57 0x0057 #LATIN CAPITAL LETTER W
+0x58 0x0058 #LATIN CAPITAL LETTER X
+0x59 0x0059 #LATIN CAPITAL LETTER Y
+0x5a 0x005a #LATIN CAPITAL LETTER Z
+0x5b 0x005b #LEFT SQUARE BRACKET
+0x5c 0x005c #REVERSE SOLIDUS
+0x5d 0x005d #RIGHT SQUARE BRACKET
+0x5e 0x005e #CIRCUMFLEX ACCENT
+0x5f 0x005f #LOW LINE
+0x60 0x0060 #GRAVE ACCENT
+0x61 0x0061 #LATIN SMALL LETTER A
+0x62 0x0062 #LATIN SMALL LETTER B
+0x63 0x0063 #LATIN SMALL LETTER C
+0x64 0x0064 #LATIN SMALL LETTER D
+0x65 0x0065 #LATIN SMALL LETTER E
+0x66 0x0066 #LATIN SMALL LETTER F
+0x67 0x0067 #LATIN SMALL LETTER G
+0x68 0x0068 #LATIN SMALL LETTER H
+0x69 0x0069 #LATIN SMALL LETTER I
+0x6a 0x006a #LATIN SMALL LETTER J
+0x6b 0x006b #LATIN SMALL LETTER K
+0x6c 0x006c #LATIN SMALL LETTER L
+0x6d 0x006d #LATIN SMALL LETTER M
+0x6e 0x006e #LATIN SMALL LETTER N
+0x6f 0x006f #LATIN SMALL LETTER O
+0x70 0x0070 #LATIN SMALL LETTER P
+0x71 0x0071 #LATIN SMALL LETTER Q
+0x72 0x0072 #LATIN SMALL LETTER R
+0x73 0x0073 #LATIN SMALL LETTER S
+0x74 0x0074 #LATIN SMALL LETTER T
+0x75 0x0075 #LATIN SMALL LETTER U
+0x76 0x0076 #LATIN SMALL LETTER V
+0x77 0x0077 #LATIN SMALL LETTER W
+0x78 0x0078 #LATIN SMALL LETTER X
+0x79 0x0079 #LATIN SMALL LETTER Y
+0x7a 0x007a #LATIN SMALL LETTER Z
+0x7b 0x007b #LEFT CURLY BRACKET
+0x7c 0x007c #VERTICAL LINE
+0x7d 0x007d #RIGHT CURLY BRACKET
+0x7e 0x007e #TILDE
+0x7f 0x007f #DELETE
+0x80 0x00b0 #DEGREE SIGN
+0x81 0x00b7 #MIDDLE DOT
+0x82 0x2219 #BULLET OPERATOR
+0x83 0x221a #SQUARE ROOT
+0x84 0x2592 #MEDIUM SHADE
+0x85 0x2500 #FORMS LIGHT HORIZONTAL
+0x86 0x2502 #FORMS LIGHT VERTICAL
+0x87 0x253c #FORMS LIGHT VERTICAL AND HORIZONTAL
+0x88 0x2524 #FORMS LIGHT VERTICAL AND LEFT
+0x89 0x252c #FORMS LIGHT DOWN AND HORIZONTAL
+0x8a 0x251c #FORMS LIGHT VERTICAL AND RIGHT
+0x8b 0x2534 #FORMS LIGHT UP AND HORIZONTAL
+0x8c 0x2510 #FORMS LIGHT DOWN AND LEFT
+0x8d 0x250c #FORMS LIGHT DOWN AND RIGHT
+0x8e 0x2514 #FORMS LIGHT UP AND RIGHT
+0x8f 0x2518 #FORMS LIGHT UP AND LEFT
+0x90 0x03b2 #GREEK SMALL BETA
+0x91 0x221e #INFINITY
+0x92 0x03c6 #GREEK SMALL PHI
+0x93 0x00b1 #PLUS-OR-MINUS SIGN
+0x94 0x00bd #FRACTION 1/2
+0x95 0x00bc #FRACTION 1/4
+0x96 0x2248 #ALMOST EQUAL TO
+0x97 0x00ab #LEFT POINTING GUILLEMET
+0x98 0x00bb #RIGHT POINTING GUILLEMET
+0x99 0xfef7 #ARABIC LIGATURE LAM WITH ALEF WITH HAMZA ABOVE ISOLATED FORM
+0x9a 0xfef8 #ARABIC LIGATURE LAM WITH ALEF WITH HAMZA ABOVE FINAL FORM
+0x9b #UNDEFINED
+0x9c #UNDEFINED
+0x9d 0xfefb #ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM
+0x9e 0xfefc #ARABIC LIGATURE LAM WITH ALEF FINAL FORM
+0x9f #UNDEFINED
+0xa0 0x00a0 #NON-BREAKING SPACE
+0xa1 0x00ad #SOFT HYPHEN
+0xa2 0xfe82 #ARABIC LETTER ALEF WITH MADDA ABOVE FINAL FORM
+0xa3 0x00a3 #POUND SIGN
+0xa4 0x00a4 #CURRENCY SIGN
+0xa5 0xfe84 #ARABIC LETTER ALEF WITH HAMZA ABOVE FINAL FORM
+0xa6 #UNDEFINED
+0xa7 #UNDEFINED
+0xa8 0xfe8e #ARABIC LETTER ALEF FINAL FORM
+0xa9 0xfe8f #ARABIC LETTER BEH ISOLATED FORM
+0xaa 0xfe95 #ARABIC LETTER TEH ISOLATED FORM
+0xab 0xfe99 #ARABIC LETTER THEH ISOLATED FORM
+0xac 0x060c #ARABIC COMMA
+0xad 0xfe9d #ARABIC LETTER JEEM ISOLATED FORM
+0xae 0xfea1 #ARABIC LETTER HAH ISOLATED FORM
+0xaf 0xfea5 #ARABIC LETTER KHAH ISOLATED FORM
+0xb0 0x0660 #ARABIC-INDIC DIGIT ZERO
+0xb1 0x0661 #ARABIC-INDIC DIGIT ONE
+0xb2 0x0662 #ARABIC-INDIC DIGIT TWO
+0xb3 0x0663 #ARABIC-INDIC DIGIT THREE
+0xb4 0x0664 #ARABIC-INDIC DIGIT FOUR
+0xb5 0x0665 #ARABIC-INDIC DIGIT FIVE
+0xb6 0x0666 #ARABIC-INDIC DIGIT SIX
+0xb7 0x0667 #ARABIC-INDIC DIGIT SEVEN
+0xb8 0x0668 #ARABIC-INDIC DIGIT EIGHT
+0xb9 0x0669 #ARABIC-INDIC DIGIT NINE
+0xba 0xfed1 #ARABIC LETTER FEH ISOLATED FORM
+0xbb 0x061b #ARABIC SEMICOLON
+0xbc 0xfeb1 #ARABIC LETTER SEEN ISOLATED FORM
+0xbd 0xfeb5 #ARABIC LETTER SHEEN ISOLATED FORM
+0xbe 0xfeb9 #ARABIC LETTER SAD ISOLATED FORM
+0xbf 0x061f #ARABIC QUESTION MARK
+0xc0 0x00a2 #CENT SIGN
+0xc1 0xfe80 #ARABIC LETTER HAMZA ISOLATED FORM
+0xc2 0xfe81 #ARABIC LETTER ALEF WITH MADDA ABOVE ISOLATED FORM
+0xc3 0xfe83 #ARABIC LETTER ALEF WITH HAMZA ABOVE ISOLATED FORM
+0xc4 0xfe85 #ARABIC LETTER WAW WITH HAMZA ABOVE ISOLATED FORM
+0xc5 0xfeca #ARABIC LETTER AIN FINAL FORM
+0xc6 0xfe8b #ARABIC LETTER YEH WITH HAMZA ABOVE INITIAL FORM
+0xc7 0xfe8d #ARABIC LETTER ALEF ISOLATED FORM
+0xc8 0xfe91 #ARABIC LETTER BEH INITIAL FORM
+0xc9 0xfe93 #ARABIC LETTER TEH MARBUTA ISOLATED FORM
+0xca 0xfe97 #ARABIC LETTER TEH INITIAL FORM
+0xcb 0xfe9b #ARABIC LETTER THEH INITIAL FORM
+0xcc 0xfe9f #ARABIC LETTER JEEM INITIAL FORM
+0xcd 0xfea3 #ARABIC LETTER HAH INITIAL FORM
+0xce 0xfea7 #ARABIC LETTER KHAH INITIAL FORM
+0xcf 0xfea9 #ARABIC LETTER DAL ISOLATED FORM
+0xd0 0xfeab #ARABIC LETTER THAL ISOLATED FORM
+0xd1 0xfead #ARABIC LETTER REH ISOLATED FORM
+0xd2 0xfeaf #ARABIC LETTER ZAIN ISOLATED FORM
+0xd3 0xfeb3 #ARABIC LETTER SEEN INITIAL FORM
+0xd4 0xfeb7 #ARABIC LETTER SHEEN INITIAL FORM
+0xd5 0xfebb #ARABIC LETTER SAD INITIAL FORM
+0xd6 0xfebf #ARABIC LETTER DAD INITIAL FORM
+0xd7 0xfec1 #ARABIC LETTER TAH ISOLATED FORM
+0xd8 0xfec5 #ARABIC LETTER ZAH ISOLATED FORM
+0xd9 0xfecb #ARABIC LETTER AIN INITIAL FORM
+0xda 0xfecf #ARABIC LETTER GHAIN INITIAL FORM
+0xdb 0x00a6 #BROKEN VERTICAL BAR
+0xdc 0x00ac #NOT SIGN
+0xdd 0x00f7 #DIVISION SIGN
+0xde 0x00d7 #MULTIPLICATION SIGN
+0xdf 0xfec9 #ARABIC LETTER AIN ISOLATED FORM
+0xe0 0x0640 #ARABIC TATWEEL
+0xe1 0xfed3 #ARABIC LETTER FEH INITIAL FORM
+0xe2 0xfed7 #ARABIC LETTER QAF INITIAL FORM
+0xe3 0xfedb #ARABIC LETTER KAF INITIAL FORM
+0xe4 0xfedf #ARABIC LETTER LAM INITIAL FORM
+0xe5 0xfee3 #ARABIC LETTER MEEM INITIAL FORM
+0xe6 0xfee7 #ARABIC LETTER NOON INITIAL FORM
+0xe7 0xfeeb #ARABIC LETTER HEH INITIAL FORM
+0xe8 0xfeed #ARABIC LETTER WAW ISOLATED FORM
+0xe9 0xfeef #ARABIC LETTER ALEF MAKSURA ISOLATED FORM
+0xea 0xfef3 #ARABIC LETTER YEH INITIAL FORM
+0xeb 0xfebd #ARABIC LETTER DAD ISOLATED FORM
+0xec 0xfecc #ARABIC LETTER AIN MEDIAL FORM
+0xed 0xfece #ARABIC LETTER GHAIN FINAL FORM
+0xee 0xfecd #ARABIC LETTER GHAIN ISOLATED FORM
+0xef 0xfee1 #ARABIC LETTER MEEM ISOLATED FORM
+0xf0 0xfe7d #ARABIC SHADDA MEDIAL FORM
+0xf1 0x0651 #ARABIC SHADDAH
+0xf2 0xfee5 #ARABIC LETTER NOON ISOLATED FORM
+0xf3 0xfee9 #ARABIC LETTER HEH ISOLATED FORM
+0xf4 0xfeec #ARABIC LETTER HEH MEDIAL FORM
+0xf5 0xfef0 #ARABIC LETTER ALEF MAKSURA FINAL FORM
+0xf6 0xfef2 #ARABIC LETTER YEH FINAL FORM
+0xf7 0xfed0 #ARABIC LETTER GHAIN MEDIAL FORM
+0xf8 0xfed5 #ARABIC LETTER QAF ISOLATED FORM
+0xf9 0xfef5 #ARABIC LIGATURE LAM WITH ALEF WITH MADDA ABOVE ISOLATED FORM
+0xfa 0xfef6 #ARABIC LIGATURE LAM WITH ALEF WITH MADDA ABOVE FINAL FORM
+0xfb 0xfedd #ARABIC LETTER LAM ISOLATED FORM
+0xfc 0xfed9 #ARABIC LETTER KAF ISOLATED FORM
+0xfd 0xfef1 #ARABIC LETTER YEH ISOLATED FORM
+0xfe 0x25a0 #BLACK SQUARE
+0xff #UNDEFINED
+
+\1a
\ No newline at end of file
--- /dev/null
+#
+# Name: cp865_DOSNordic to Unicode table
+# Unicode version: 2.0
+# Table version: 2.00
+# Table format: Format A
+# Date: 04/24/96
+# Authors: Lori Brownell <loribr@microsoft.com>
+# K.D. Chang <a-kchang@microsoft.com>
+# General notes: none
+#
+# Format: Three tab-separated columns
+# Column #1 is the cp865_DOSNordic code (in hex)
+# Column #2 is the Unicode (in hex as 0xXXXX)
+# Column #3 is the Unicode name (follows a comment sign, '#')
+#
+# The entries are in cp865_DOSNordic order
+#
+0x00 0x0000 #NULL
+0x01 0x0001 #START OF HEADING
+0x02 0x0002 #START OF TEXT
+0x03 0x0003 #END OF TEXT
+0x04 0x0004 #END OF TRANSMISSION
+0x05 0x0005 #ENQUIRY
+0x06 0x0006 #ACKNOWLEDGE
+0x07 0x0007 #BELL
+0x08 0x0008 #BACKSPACE
+0x09 0x0009 #HORIZONTAL TABULATION
+0x0a 0x000a #LINE FEED
+0x0b 0x000b #VERTICAL TABULATION
+0x0c 0x000c #FORM FEED
+0x0d 0x000d #CARRIAGE RETURN
+0x0e 0x000e #SHIFT OUT
+0x0f 0x000f #SHIFT IN
+0x10 0x0010 #DATA LINK ESCAPE
+0x11 0x0011 #DEVICE CONTROL ONE
+0x12 0x0012 #DEVICE CONTROL TWO
+0x13 0x0013 #DEVICE CONTROL THREE
+0x14 0x0014 #DEVICE CONTROL FOUR
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE
+0x16 0x0016 #SYNCHRONOUS IDLE
+0x17 0x0017 #END OF TRANSMISSION BLOCK
+0x18 0x0018 #CANCEL
+0x19 0x0019 #END OF MEDIUM
+0x1a 0x001a #SUBSTITUTE
+0x1b 0x001b #ESCAPE
+0x1c 0x001c #FILE SEPARATOR
+0x1d 0x001d #GROUP SEPARATOR
+0x1e 0x001e #RECORD SEPARATOR
+0x1f 0x001f #UNIT SEPARATOR
+0x20 0x0020 #SPACE
+0x21 0x0021 #EXCLAMATION MARK
+0x22 0x0022 #QUOTATION MARK
+0x23 0x0023 #NUMBER SIGN
+0x24 0x0024 #DOLLAR SIGN
+0x25 0x0025 #PERCENT SIGN
+0x26 0x0026 #AMPERSAND
+0x27 0x0027 #APOSTROPHE
+0x28 0x0028 #LEFT PARENTHESIS
+0x29 0x0029 #RIGHT PARENTHESIS
+0x2a 0x002a #ASTERISK
+0x2b 0x002b #PLUS SIGN
+0x2c 0x002c #COMMA
+0x2d 0x002d #HYPHEN-MINUS
+0x2e 0x002e #FULL STOP
+0x2f 0x002f #SOLIDUS
+0x30 0x0030 #DIGIT ZERO
+0x31 0x0031 #DIGIT ONE
+0x32 0x0032 #DIGIT TWO
+0x33 0x0033 #DIGIT THREE
+0x34 0x0034 #DIGIT FOUR
+0x35 0x0035 #DIGIT FIVE
+0x36 0x0036 #DIGIT SIX
+0x37 0x0037 #DIGIT SEVEN
+0x38 0x0038 #DIGIT EIGHT
+0x39 0x0039 #DIGIT NINE
+0x3a 0x003a #COLON
+0x3b 0x003b #SEMICOLON
+0x3c 0x003c #LESS-THAN SIGN
+0x3d 0x003d #EQUALS SIGN
+0x3e 0x003e #GREATER-THAN SIGN
+0x3f 0x003f #QUESTION MARK
+0x40 0x0040 #COMMERCIAL AT
+0x41 0x0041 #LATIN CAPITAL LETTER A
+0x42 0x0042 #LATIN CAPITAL LETTER B
+0x43 0x0043 #LATIN CAPITAL LETTER C
+0x44 0x0044 #LATIN CAPITAL LETTER D
+0x45 0x0045 #LATIN CAPITAL LETTER E
+0x46 0x0046 #LATIN CAPITAL LETTER F
+0x47 0x0047 #LATIN CAPITAL LETTER G
+0x48 0x0048 #LATIN CAPITAL LETTER H
+0x49 0x0049 #LATIN CAPITAL LETTER I
+0x4a 0x004a #LATIN CAPITAL LETTER J
+0x4b 0x004b #LATIN CAPITAL LETTER K
+0x4c 0x004c #LATIN CAPITAL LETTER L
+0x4d 0x004d #LATIN CAPITAL LETTER M
+0x4e 0x004e #LATIN CAPITAL LETTER N
+0x4f 0x004f #LATIN CAPITAL LETTER O
+0x50 0x0050 #LATIN CAPITAL LETTER P
+0x51 0x0051 #LATIN CAPITAL LETTER Q
+0x52 0x0052 #LATIN CAPITAL LETTER R
+0x53 0x0053 #LATIN CAPITAL LETTER S
+0x54 0x0054 #LATIN CAPITAL LETTER T
+0x55 0x0055 #LATIN CAPITAL LETTER U
+0x56 0x0056 #LATIN CAPITAL LETTER V
+0x57 0x0057 #LATIN CAPITAL LETTER W
+0x58 0x0058 #LATIN CAPITAL LETTER X
+0x59 0x0059 #LATIN CAPITAL LETTER Y
+0x5a 0x005a #LATIN CAPITAL LETTER Z
+0x5b 0x005b #LEFT SQUARE BRACKET
+0x5c 0x005c #REVERSE SOLIDUS
+0x5d 0x005d #RIGHT SQUARE BRACKET
+0x5e 0x005e #CIRCUMFLEX ACCENT
+0x5f 0x005f #LOW LINE
+0x60 0x0060 #GRAVE ACCENT
+0x61 0x0061 #LATIN SMALL LETTER A
+0x62 0x0062 #LATIN SMALL LETTER B
+0x63 0x0063 #LATIN SMALL LETTER C
+0x64 0x0064 #LATIN SMALL LETTER D
+0x65 0x0065 #LATIN SMALL LETTER E
+0x66 0x0066 #LATIN SMALL LETTER F
+0x67 0x0067 #LATIN SMALL LETTER G
+0x68 0x0068 #LATIN SMALL LETTER H
+0x69 0x0069 #LATIN SMALL LETTER I
+0x6a 0x006a #LATIN SMALL LETTER J
+0x6b 0x006b #LATIN SMALL LETTER K
+0x6c 0x006c #LATIN SMALL LETTER L
+0x6d 0x006d #LATIN SMALL LETTER M
+0x6e 0x006e #LATIN SMALL LETTER N
+0x6f 0x006f #LATIN SMALL LETTER O
+0x70 0x0070 #LATIN SMALL LETTER P
+0x71 0x0071 #LATIN SMALL LETTER Q
+0x72 0x0072 #LATIN SMALL LETTER R
+0x73 0x0073 #LATIN SMALL LETTER S
+0x74 0x0074 #LATIN SMALL LETTER T
+0x75 0x0075 #LATIN SMALL LETTER U
+0x76 0x0076 #LATIN SMALL LETTER V
+0x77 0x0077 #LATIN SMALL LETTER W
+0x78 0x0078 #LATIN SMALL LETTER X
+0x79 0x0079 #LATIN SMALL LETTER Y
+0x7a 0x007a #LATIN SMALL LETTER Z
+0x7b 0x007b #LEFT CURLY BRACKET
+0x7c 0x007c #VERTICAL LINE
+0x7d 0x007d #RIGHT CURLY BRACKET
+0x7e 0x007e #TILDE
+0x7f 0x007f #DELETE
+0x80 0x00c7 #LATIN CAPITAL LETTER C WITH CEDILLA
+0x81 0x00fc #LATIN SMALL LETTER U WITH DIAERESIS
+0x82 0x00e9 #LATIN SMALL LETTER E WITH ACUTE
+0x83 0x00e2 #LATIN SMALL LETTER A WITH CIRCUMFLEX
+0x84 0x00e4 #LATIN SMALL LETTER A WITH DIAERESIS
+0x85 0x00e0 #LATIN SMALL LETTER A WITH GRAVE
+0x86 0x00e5 #LATIN SMALL LETTER A WITH RING ABOVE
+0x87 0x00e7 #LATIN SMALL LETTER C WITH CEDILLA
+0x88 0x00ea #LATIN SMALL LETTER E WITH CIRCUMFLEX
+0x89 0x00eb #LATIN SMALL LETTER E WITH DIAERESIS
+0x8a 0x00e8 #LATIN SMALL LETTER E WITH GRAVE
+0x8b 0x00ef #LATIN SMALL LETTER I WITH DIAERESIS
+0x8c 0x00ee #LATIN SMALL LETTER I WITH CIRCUMFLEX
+0x8d 0x00ec #LATIN SMALL LETTER I WITH GRAVE
+0x8e 0x00c4 #LATIN CAPITAL LETTER A WITH DIAERESIS
+0x8f 0x00c5 #LATIN CAPITAL LETTER A WITH RING ABOVE
+0x90 0x00c9 #LATIN CAPITAL LETTER E WITH ACUTE
+0x91 0x00e6 #LATIN SMALL LIGATURE AE
+0x92 0x00c6 #LATIN CAPITAL LIGATURE AE
+0x93 0x00f4 #LATIN SMALL LETTER O WITH CIRCUMFLEX
+0x94 0x00f6 #LATIN SMALL LETTER O WITH DIAERESIS
+0x95 0x00f2 #LATIN SMALL LETTER O WITH GRAVE
+0x96 0x00fb #LATIN SMALL LETTER U WITH CIRCUMFLEX
+0x97 0x00f9 #LATIN SMALL LETTER U WITH GRAVE
+0x98 0x00ff #LATIN SMALL LETTER Y WITH DIAERESIS
+0x99 0x00d6 #LATIN CAPITAL LETTER O WITH DIAERESIS
+0x9a 0x00dc #LATIN CAPITAL LETTER U WITH DIAERESIS
+0x9b 0x00f8 #LATIN SMALL LETTER O WITH STROKE
+0x9c 0x00a3 #POUND SIGN
+0x9d 0x00d8 #LATIN CAPITAL LETTER O WITH STROKE
+0x9e 0x20a7 #PESETA SIGN
+0x9f 0x0192 #LATIN SMALL LETTER F WITH HOOK
+0xa0 0x00e1 #LATIN SMALL LETTER A WITH ACUTE
+0xa1 0x00ed #LATIN SMALL LETTER I WITH ACUTE
+0xa2 0x00f3 #LATIN SMALL LETTER O WITH ACUTE
+0xa3 0x00fa #LATIN SMALL LETTER U WITH ACUTE
+0xa4 0x00f1 #LATIN SMALL LETTER N WITH TILDE
+0xa5 0x00d1 #LATIN CAPITAL LETTER N WITH TILDE
+0xa6 0x00aa #FEMININE ORDINAL INDICATOR
+0xa7 0x00ba #MASCULINE ORDINAL INDICATOR
+0xa8 0x00bf #INVERTED QUESTION MARK
+0xa9 0x2310 #REVERSED NOT SIGN
+0xaa 0x00ac #NOT SIGN
+0xab 0x00bd #VULGAR FRACTION ONE HALF
+0xac 0x00bc #VULGAR FRACTION ONE QUARTER
+0xad 0x00a1 #INVERTED EXCLAMATION MARK
+0xae 0x00ab #LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xaf 0x00a4 #CURRENCY SIGN
+0xb0 0x2591 #LIGHT SHADE
+0xb1 0x2592 #MEDIUM SHADE
+0xb2 0x2593 #DARK SHADE
+0xb3 0x2502 #BOX DRAWINGS LIGHT VERTICAL
+0xb4 0x2524 #BOX DRAWINGS LIGHT VERTICAL AND LEFT
+0xb5 0x2561 #BOX DRAWINGS VERTICAL SINGLE AND LEFT DOUBLE
+0xb6 0x2562 #BOX DRAWINGS VERTICAL DOUBLE AND LEFT SINGLE
+0xb7 0x2556 #BOX DRAWINGS DOWN DOUBLE AND LEFT SINGLE
+0xb8 0x2555 #BOX DRAWINGS DOWN SINGLE AND LEFT DOUBLE
+0xb9 0x2563 #BOX DRAWINGS DOUBLE VERTICAL AND LEFT
+0xba 0x2551 #BOX DRAWINGS DOUBLE VERTICAL
+0xbb 0x2557 #BOX DRAWINGS DOUBLE DOWN AND LEFT
+0xbc 0x255d #BOX DRAWINGS DOUBLE UP AND LEFT
+0xbd 0x255c #BOX DRAWINGS UP DOUBLE AND LEFT SINGLE
+0xbe 0x255b #BOX DRAWINGS UP SINGLE AND LEFT DOUBLE
+0xbf 0x2510 #BOX DRAWINGS LIGHT DOWN AND LEFT
+0xc0 0x2514 #BOX DRAWINGS LIGHT UP AND RIGHT
+0xc1 0x2534 #BOX DRAWINGS LIGHT UP AND HORIZONTAL
+0xc2 0x252c #BOX DRAWINGS LIGHT DOWN AND HORIZONTAL
+0xc3 0x251c #BOX DRAWINGS LIGHT VERTICAL AND RIGHT
+0xc4 0x2500 #BOX DRAWINGS LIGHT HORIZONTAL
+0xc5 0x253c #BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL
+0xc6 0x255e #BOX DRAWINGS VERTICAL SINGLE AND RIGHT DOUBLE
+0xc7 0x255f #BOX DRAWINGS VERTICAL DOUBLE AND RIGHT SINGLE
+0xc8 0x255a #BOX DRAWINGS DOUBLE UP AND RIGHT
+0xc9 0x2554 #BOX DRAWINGS DOUBLE DOWN AND RIGHT
+0xca 0x2569 #BOX DRAWINGS DOUBLE UP AND HORIZONTAL
+0xcb 0x2566 #BOX DRAWINGS DOUBLE DOWN AND HORIZONTAL
+0xcc 0x2560 #BOX DRAWINGS DOUBLE VERTICAL AND RIGHT
+0xcd 0x2550 #BOX DRAWINGS DOUBLE HORIZONTAL
+0xce 0x256c #BOX DRAWINGS DOUBLE VERTICAL AND HORIZONTAL
+0xcf 0x2567 #BOX DRAWINGS UP SINGLE AND HORIZONTAL DOUBLE
+0xd0 0x2568 #BOX DRAWINGS UP DOUBLE AND HORIZONTAL SINGLE
+0xd1 0x2564 #BOX DRAWINGS DOWN SINGLE AND HORIZONTAL DOUBLE
+0xd2 0x2565 #BOX DRAWINGS DOWN DOUBLE AND HORIZONTAL SINGLE
+0xd3 0x2559 #BOX DRAWINGS UP DOUBLE AND RIGHT SINGLE
+0xd4 0x2558 #BOX DRAWINGS UP SINGLE AND RIGHT DOUBLE
+0xd5 0x2552 #BOX DRAWINGS DOWN SINGLE AND RIGHT DOUBLE
+0xd6 0x2553 #BOX DRAWINGS DOWN DOUBLE AND RIGHT SINGLE
+0xd7 0x256b #BOX DRAWINGS VERTICAL DOUBLE AND HORIZONTAL SINGLE
+0xd8 0x256a #BOX DRAWINGS VERTICAL SINGLE AND HORIZONTAL DOUBLE
+0xd9 0x2518 #BOX DRAWINGS LIGHT UP AND LEFT
+0xda 0x250c #BOX DRAWINGS LIGHT DOWN AND RIGHT
+0xdb 0x2588 #FULL BLOCK
+0xdc 0x2584 #LOWER HALF BLOCK
+0xdd 0x258c #LEFT HALF BLOCK
+0xde 0x2590 #RIGHT HALF BLOCK
+0xdf 0x2580 #UPPER HALF BLOCK
+0xe0 0x03b1 #GREEK SMALL LETTER ALPHA
+0xe1 0x00df #LATIN SMALL LETTER SHARP S
+0xe2 0x0393 #GREEK CAPITAL LETTER GAMMA
+0xe3 0x03c0 #GREEK SMALL LETTER PI
+0xe4 0x03a3 #GREEK CAPITAL LETTER SIGMA
+0xe5 0x03c3 #GREEK SMALL LETTER SIGMA
+0xe6 0x00b5 #MICRO SIGN
+0xe7 0x03c4 #GREEK SMALL LETTER TAU
+0xe8 0x03a6 #GREEK CAPITAL LETTER PHI
+0xe9 0x0398 #GREEK CAPITAL LETTER THETA
+0xea 0x03a9 #GREEK CAPITAL LETTER OMEGA
+0xeb 0x03b4 #GREEK SMALL LETTER DELTA
+0xec 0x221e #INFINITY
+0xed 0x03c6 #GREEK SMALL LETTER PHI
+0xee 0x03b5 #GREEK SMALL LETTER EPSILON
+0xef 0x2229 #INTERSECTION
+0xf0 0x2261 #IDENTICAL TO
+0xf1 0x00b1 #PLUS-MINUS SIGN
+0xf2 0x2265 #GREATER-THAN OR EQUAL TO
+0xf3 0x2264 #LESS-THAN OR EQUAL TO
+0xf4 0x2320 #TOP HALF INTEGRAL
+0xf5 0x2321 #BOTTOM HALF INTEGRAL
+0xf6 0x00f7 #DIVISION SIGN
+0xf7 0x2248 #ALMOST EQUAL TO
+0xf8 0x00b0 #DEGREE SIGN
+0xf9 0x2219 #BULLET OPERATOR
+0xfa 0x00b7 #MIDDLE DOT
+0xfb 0x221a #SQUARE ROOT
+0xfc 0x207f #SUPERSCRIPT LATIN SMALL LETTER N
+0xfd 0x00b2 #SUPERSCRIPT TWO
+0xfe 0x25a0 #BLACK SQUARE
+0xff 0x00a0 #NO-BREAK SPACE
+
+\1a
\ No newline at end of file
--- /dev/null
+#
+# Name: cp866_DOSCyrillicRussian to Unicode table
+# Unicode version: 2.0
+# Table version: 2.00
+# Table format: Format A
+# Date: 04/24/96
+# Authors: Lori Brownell <loribr@microsoft.com>
+# K.D. Chang <a-kchang@microsoft.com>
+# General notes: none
+#
+# Format: Three tab-separated columns
+# Column #1 is the cp866_DOSCyrillicRussian code (in hex)
+# Column #2 is the Unicode (in hex as 0xXXXX)
+# Column #3 is the Unicode name (follows a comment sign, '#')
+#
+# The entries are in cp866_DOSCyrillicRussian order
+#
+0x00 0x0000 #NULL
+0x01 0x0001 #START OF HEADING
+0x02 0x0002 #START OF TEXT
+0x03 0x0003 #END OF TEXT
+0x04 0x0004 #END OF TRANSMISSION
+0x05 0x0005 #ENQUIRY
+0x06 0x0006 #ACKNOWLEDGE
+0x07 0x0007 #BELL
+0x08 0x0008 #BACKSPACE
+0x09 0x0009 #HORIZONTAL TABULATION
+0x0a 0x000a #LINE FEED
+0x0b 0x000b #VERTICAL TABULATION
+0x0c 0x000c #FORM FEED
+0x0d 0x000d #CARRIAGE RETURN
+0x0e 0x000e #SHIFT OUT
+0x0f 0x000f #SHIFT IN
+0x10 0x0010 #DATA LINK ESCAPE
+0x11 0x0011 #DEVICE CONTROL ONE
+0x12 0x0012 #DEVICE CONTROL TWO
+0x13 0x0013 #DEVICE CONTROL THREE
+0x14 0x0014 #DEVICE CONTROL FOUR
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE
+0x16 0x0016 #SYNCHRONOUS IDLE
+0x17 0x0017 #END OF TRANSMISSION BLOCK
+0x18 0x0018 #CANCEL
+0x19 0x0019 #END OF MEDIUM
+0x1a 0x001a #SUBSTITUTE
+0x1b 0x001b #ESCAPE
+0x1c 0x001c #FILE SEPARATOR
+0x1d 0x001d #GROUP SEPARATOR
+0x1e 0x001e #RECORD SEPARATOR
+0x1f 0x001f #UNIT SEPARATOR
+0x20 0x0020 #SPACE
+0x21 0x0021 #EXCLAMATION MARK
+0x22 0x0022 #QUOTATION MARK
+0x23 0x0023 #NUMBER SIGN
+0x24 0x0024 #DOLLAR SIGN
+0x25 0x0025 #PERCENT SIGN
+0x26 0x0026 #AMPERSAND
+0x27 0x0027 #APOSTROPHE
+0x28 0x0028 #LEFT PARENTHESIS
+0x29 0x0029 #RIGHT PARENTHESIS
+0x2a 0x002a #ASTERISK
+0x2b 0x002b #PLUS SIGN
+0x2c 0x002c #COMMA
+0x2d 0x002d #HYPHEN-MINUS
+0x2e 0x002e #FULL STOP
+0x2f 0x002f #SOLIDUS
+0x30 0x0030 #DIGIT ZERO
+0x31 0x0031 #DIGIT ONE
+0x32 0x0032 #DIGIT TWO
+0x33 0x0033 #DIGIT THREE
+0x34 0x0034 #DIGIT FOUR
+0x35 0x0035 #DIGIT FIVE
+0x36 0x0036 #DIGIT SIX
+0x37 0x0037 #DIGIT SEVEN
+0x38 0x0038 #DIGIT EIGHT
+0x39 0x0039 #DIGIT NINE
+0x3a 0x003a #COLON
+0x3b 0x003b #SEMICOLON
+0x3c 0x003c #LESS-THAN SIGN
+0x3d 0x003d #EQUALS SIGN
+0x3e 0x003e #GREATER-THAN SIGN
+0x3f 0x003f #QUESTION MARK
+0x40 0x0040 #COMMERCIAL AT
+0x41 0x0041 #LATIN CAPITAL LETTER A
+0x42 0x0042 #LATIN CAPITAL LETTER B
+0x43 0x0043 #LATIN CAPITAL LETTER C
+0x44 0x0044 #LATIN CAPITAL LETTER D
+0x45 0x0045 #LATIN CAPITAL LETTER E
+0x46 0x0046 #LATIN CAPITAL LETTER F
+0x47 0x0047 #LATIN CAPITAL LETTER G
+0x48 0x0048 #LATIN CAPITAL LETTER H
+0x49 0x0049 #LATIN CAPITAL LETTER I
+0x4a 0x004a #LATIN CAPITAL LETTER J
+0x4b 0x004b #LATIN CAPITAL LETTER K
+0x4c 0x004c #LATIN CAPITAL LETTER L
+0x4d 0x004d #LATIN CAPITAL LETTER M
+0x4e 0x004e #LATIN CAPITAL LETTER N
+0x4f 0x004f #LATIN CAPITAL LETTER O
+0x50 0x0050 #LATIN CAPITAL LETTER P
+0x51 0x0051 #LATIN CAPITAL LETTER Q
+0x52 0x0052 #LATIN CAPITAL LETTER R
+0x53 0x0053 #LATIN CAPITAL LETTER S
+0x54 0x0054 #LATIN CAPITAL LETTER T
+0x55 0x0055 #LATIN CAPITAL LETTER U
+0x56 0x0056 #LATIN CAPITAL LETTER V
+0x57 0x0057 #LATIN CAPITAL LETTER W
+0x58 0x0058 #LATIN CAPITAL LETTER X
+0x59 0x0059 #LATIN CAPITAL LETTER Y
+0x5a 0x005a #LATIN CAPITAL LETTER Z
+0x5b 0x005b #LEFT SQUARE BRACKET
+0x5c 0x005c #REVERSE SOLIDUS
+0x5d 0x005d #RIGHT SQUARE BRACKET
+0x5e 0x005e #CIRCUMFLEX ACCENT
+0x5f 0x005f #LOW LINE
+0x60 0x0060 #GRAVE ACCENT
+0x61 0x0061 #LATIN SMALL LETTER A
+0x62 0x0062 #LATIN SMALL LETTER B
+0x63 0x0063 #LATIN SMALL LETTER C
+0x64 0x0064 #LATIN SMALL LETTER D
+0x65 0x0065 #LATIN SMALL LETTER E
+0x66 0x0066 #LATIN SMALL LETTER F
+0x67 0x0067 #LATIN SMALL LETTER G
+0x68 0x0068 #LATIN SMALL LETTER H
+0x69 0x0069 #LATIN SMALL LETTER I
+0x6a 0x006a #LATIN SMALL LETTER J
+0x6b 0x006b #LATIN SMALL LETTER K
+0x6c 0x006c #LATIN SMALL LETTER L
+0x6d 0x006d #LATIN SMALL LETTER M
+0x6e 0x006e #LATIN SMALL LETTER N
+0x6f 0x006f #LATIN SMALL LETTER O
+0x70 0x0070 #LATIN SMALL LETTER P
+0x71 0x0071 #LATIN SMALL LETTER Q
+0x72 0x0072 #LATIN SMALL LETTER R
+0x73 0x0073 #LATIN SMALL LETTER S
+0x74 0x0074 #LATIN SMALL LETTER T
+0x75 0x0075 #LATIN SMALL LETTER U
+0x76 0x0076 #LATIN SMALL LETTER V
+0x77 0x0077 #LATIN SMALL LETTER W
+0x78 0x0078 #LATIN SMALL LETTER X
+0x79 0x0079 #LATIN SMALL LETTER Y
+0x7a 0x007a #LATIN SMALL LETTER Z
+0x7b 0x007b #LEFT CURLY BRACKET
+0x7c 0x007c #VERTICAL LINE
+0x7d 0x007d #RIGHT CURLY BRACKET
+0x7e 0x007e #TILDE
+0x7f 0x007f #DELETE
+0x80 0x0410 #CYRILLIC CAPITAL LETTER A
+0x81 0x0411 #CYRILLIC CAPITAL LETTER BE
+0x82 0x0412 #CYRILLIC CAPITAL LETTER VE
+0x83 0x0413 #CYRILLIC CAPITAL LETTER GHE
+0x84 0x0414 #CYRILLIC CAPITAL LETTER DE
+0x85 0x0415 #CYRILLIC CAPITAL LETTER IE
+0x86 0x0416 #CYRILLIC CAPITAL LETTER ZHE
+0x87 0x0417 #CYRILLIC CAPITAL LETTER ZE
+0x88 0x0418 #CYRILLIC CAPITAL LETTER I
+0x89 0x0419 #CYRILLIC CAPITAL LETTER SHORT I
+0x8a 0x041a #CYRILLIC CAPITAL LETTER KA
+0x8b 0x041b #CYRILLIC CAPITAL LETTER EL
+0x8c 0x041c #CYRILLIC CAPITAL LETTER EM
+0x8d 0x041d #CYRILLIC CAPITAL LETTER EN
+0x8e 0x041e #CYRILLIC CAPITAL LETTER O
+0x8f 0x041f #CYRILLIC CAPITAL LETTER PE
+0x90 0x0420 #CYRILLIC CAPITAL LETTER ER
+0x91 0x0421 #CYRILLIC CAPITAL LETTER ES
+0x92 0x0422 #CYRILLIC CAPITAL LETTER TE
+0x93 0x0423 #CYRILLIC CAPITAL LETTER U
+0x94 0x0424 #CYRILLIC CAPITAL LETTER EF
+0x95 0x0425 #CYRILLIC CAPITAL LETTER HA
+0x96 0x0426 #CYRILLIC CAPITAL LETTER TSE
+0x97 0x0427 #CYRILLIC CAPITAL LETTER CHE
+0x98 0x0428 #CYRILLIC CAPITAL LETTER SHA
+0x99 0x0429 #CYRILLIC CAPITAL LETTER SHCHA
+0x9a 0x042a #CYRILLIC CAPITAL LETTER HARD SIGN
+0x9b 0x042b #CYRILLIC CAPITAL LETTER YERU
+0x9c 0x042c #CYRILLIC CAPITAL LETTER SOFT SIGN
+0x9d 0x042d #CYRILLIC CAPITAL LETTER E
+0x9e 0x042e #CYRILLIC CAPITAL LETTER YU
+0x9f 0x042f #CYRILLIC CAPITAL LETTER YA
+0xa0 0x0430 #CYRILLIC SMALL LETTER A
+0xa1 0x0431 #CYRILLIC SMALL LETTER BE
+0xa2 0x0432 #CYRILLIC SMALL LETTER VE
+0xa3 0x0433 #CYRILLIC SMALL LETTER GHE
+0xa4 0x0434 #CYRILLIC SMALL LETTER DE
+0xa5 0x0435 #CYRILLIC SMALL LETTER IE
+0xa6 0x0436 #CYRILLIC SMALL LETTER ZHE
+0xa7 0x0437 #CYRILLIC SMALL LETTER ZE
+0xa8 0x0438 #CYRILLIC SMALL LETTER I
+0xa9 0x0439 #CYRILLIC SMALL LETTER SHORT I
+0xaa 0x043a #CYRILLIC SMALL LETTER KA
+0xab 0x043b #CYRILLIC SMALL LETTER EL
+0xac 0x043c #CYRILLIC SMALL LETTER EM
+0xad 0x043d #CYRILLIC SMALL LETTER EN
+0xae 0x043e #CYRILLIC SMALL LETTER O
+0xaf 0x043f #CYRILLIC SMALL LETTER PE
+0xb0 0x2591 #LIGHT SHADE
+0xb1 0x2592 #MEDIUM SHADE
+0xb2 0x2593 #DARK SHADE
+0xb3 0x2502 #BOX DRAWINGS LIGHT VERTICAL
+0xb4 0x2524 #BOX DRAWINGS LIGHT VERTICAL AND LEFT
+0xb5 0x2561 #BOX DRAWINGS VERTICAL SINGLE AND LEFT DOUBLE
+0xb6 0x2562 #BOX DRAWINGS VERTICAL DOUBLE AND LEFT SINGLE
+0xb7 0x2556 #BOX DRAWINGS DOWN DOUBLE AND LEFT SINGLE
+0xb8 0x2555 #BOX DRAWINGS DOWN SINGLE AND LEFT DOUBLE
+0xb9 0x2563 #BOX DRAWINGS DOUBLE VERTICAL AND LEFT
+0xba 0x2551 #BOX DRAWINGS DOUBLE VERTICAL
+0xbb 0x2557 #BOX DRAWINGS DOUBLE DOWN AND LEFT
+0xbc 0x255d #BOX DRAWINGS DOUBLE UP AND LEFT
+0xbd 0x255c #BOX DRAWINGS UP DOUBLE AND LEFT SINGLE
+0xbe 0x255b #BOX DRAWINGS UP SINGLE AND LEFT DOUBLE
+0xbf 0x2510 #BOX DRAWINGS LIGHT DOWN AND LEFT
+0xc0 0x2514 #BOX DRAWINGS LIGHT UP AND RIGHT
+0xc1 0x2534 #BOX DRAWINGS LIGHT UP AND HORIZONTAL
+0xc2 0x252c #BOX DRAWINGS LIGHT DOWN AND HORIZONTAL
+0xc3 0x251c #BOX DRAWINGS LIGHT VERTICAL AND RIGHT
+0xc4 0x2500 #BOX DRAWINGS LIGHT HORIZONTAL
+0xc5 0x253c #BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL
+0xc6 0x255e #BOX DRAWINGS VERTICAL SINGLE AND RIGHT DOUBLE
+0xc7 0x255f #BOX DRAWINGS VERTICAL DOUBLE AND RIGHT SINGLE
+0xc8 0x255a #BOX DRAWINGS DOUBLE UP AND RIGHT
+0xc9 0x2554 #BOX DRAWINGS DOUBLE DOWN AND RIGHT
+0xca 0x2569 #BOX DRAWINGS DOUBLE UP AND HORIZONTAL
+0xcb 0x2566 #BOX DRAWINGS DOUBLE DOWN AND HORIZONTAL
+0xcc 0x2560 #BOX DRAWINGS DOUBLE VERTICAL AND RIGHT
+0xcd 0x2550 #BOX DRAWINGS DOUBLE HORIZONTAL
+0xce 0x256c #BOX DRAWINGS DOUBLE VERTICAL AND HORIZONTAL
+0xcf 0x2567 #BOX DRAWINGS UP SINGLE AND HORIZONTAL DOUBLE
+0xd0 0x2568 #BOX DRAWINGS UP DOUBLE AND HORIZONTAL SINGLE
+0xd1 0x2564 #BOX DRAWINGS DOWN SINGLE AND HORIZONTAL DOUBLE
+0xd2 0x2565 #BOX DRAWINGS DOWN DOUBLE AND HORIZONTAL SINGLE
+0xd3 0x2559 #BOX DRAWINGS UP DOUBLE AND RIGHT SINGLE
+0xd4 0x2558 #BOX DRAWINGS UP SINGLE AND RIGHT DOUBLE
+0xd5 0x2552 #BOX DRAWINGS DOWN SINGLE AND RIGHT DOUBLE
+0xd6 0x2553 #BOX DRAWINGS DOWN DOUBLE AND RIGHT SINGLE
+0xd7 0x256b #BOX DRAWINGS VERTICAL DOUBLE AND HORIZONTAL SINGLE
+0xd8 0x256a #BOX DRAWINGS VERTICAL SINGLE AND HORIZONTAL DOUBLE
+0xd9 0x2518 #BOX DRAWINGS LIGHT UP AND LEFT
+0xda 0x250c #BOX DRAWINGS LIGHT DOWN AND RIGHT
+0xdb 0x2588 #FULL BLOCK
+0xdc 0x2584 #LOWER HALF BLOCK
+0xdd 0x258c #LEFT HALF BLOCK
+0xde 0x2590 #RIGHT HALF BLOCK
+0xdf 0x2580 #UPPER HALF BLOCK
+0xe0 0x0440 #CYRILLIC SMALL LETTER ER
+0xe1 0x0441 #CYRILLIC SMALL LETTER ES
+0xe2 0x0442 #CYRILLIC SMALL LETTER TE
+0xe3 0x0443 #CYRILLIC SMALL LETTER U
+0xe4 0x0444 #CYRILLIC SMALL LETTER EF
+0xe5 0x0445 #CYRILLIC SMALL LETTER HA
+0xe6 0x0446 #CYRILLIC SMALL LETTER TSE
+0xe7 0x0447 #CYRILLIC SMALL LETTER CHE
+0xe8 0x0448 #CYRILLIC SMALL LETTER SHA
+0xe9 0x0449 #CYRILLIC SMALL LETTER SHCHA
+0xea 0x044a #CYRILLIC SMALL LETTER HARD SIGN
+0xeb 0x044b #CYRILLIC SMALL LETTER YERU
+0xec 0x044c #CYRILLIC SMALL LETTER SOFT SIGN
+0xed 0x044d #CYRILLIC SMALL LETTER E
+0xee 0x044e #CYRILLIC SMALL LETTER YU
+0xef 0x044f #CYRILLIC SMALL LETTER YA
+0xf0 0x0401 #CYRILLIC CAPITAL LETTER IO
+0xf1 0x0451 #CYRILLIC SMALL LETTER IO
+0xf2 0x0404 #CYRILLIC CAPITAL LETTER UKRAINIAN IE
+0xf3 0x0454 #CYRILLIC SMALL LETTER UKRAINIAN IE
+0xf4 0x0407 #CYRILLIC CAPITAL LETTER YI
+0xf5 0x0457 #CYRILLIC SMALL LETTER YI
+0xf6 0x040e #CYRILLIC CAPITAL LETTER SHORT U
+0xf7 0x045e #CYRILLIC SMALL LETTER SHORT U
+0xf8 0x00b0 #DEGREE SIGN
+0xf9 0x2219 #BULLET OPERATOR
+0xfa 0x00b7 #MIDDLE DOT
+0xfb 0x221a #SQUARE ROOT
+0xfc 0x2116 #NUMERO SIGN
+0xfd 0x00a4 #CURRENCY SIGN
+0xfe 0x25a0 #BLACK SQUARE
+0xff 0x00a0 #NO-BREAK SPACE
+
+\1a
\ No newline at end of file
--- /dev/null
+#
+# Name: cp869_DOSGreek2 to Unicode table
+# Unicode version: 2.0
+# Table version: 2.00
+# Table format: Format A
+# Date: 04/24/96
+# Authors: Lori Brownell <loribr@microsoft.com>
+# K.D. Chang <a-kchang@microsoft.com>
+# General notes: none
+#
+# Format: Three tab-separated columns
+# Column #1 is the cp869_DOSGreek2 code (in hex)
+# Column #2 is the Unicode (in hex as 0xXXXX)
+# Column #3 is the Unicode name (follows a comment sign, '#')
+#
+# The entries are in cp869_DOSGreek2 order
+#
+0x00 0x0000 #NULL
+0x01 0x0001 #START OF HEADING
+0x02 0x0002 #START OF TEXT
+0x03 0x0003 #END OF TEXT
+0x04 0x0004 #END OF TRANSMISSION
+0x05 0x0005 #ENQUIRY
+0x06 0x0006 #ACKNOWLEDGE
+0x07 0x0007 #BELL
+0x08 0x0008 #BACKSPACE
+0x09 0x0009 #HORIZONTAL TABULATION
+0x0a 0x000a #LINE FEED
+0x0b 0x000b #VERTICAL TABULATION
+0x0c 0x000c #FORM FEED
+0x0d 0x000d #CARRIAGE RETURN
+0x0e 0x000e #SHIFT OUT
+0x0f 0x000f #SHIFT IN
+0x10 0x0010 #DATA LINK ESCAPE
+0x11 0x0011 #DEVICE CONTROL ONE
+0x12 0x0012 #DEVICE CONTROL TWO
+0x13 0x0013 #DEVICE CONTROL THREE
+0x14 0x0014 #DEVICE CONTROL FOUR
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE
+0x16 0x0016 #SYNCHRONOUS IDLE
+0x17 0x0017 #END OF TRANSMISSION BLOCK
+0x18 0x0018 #CANCEL
+0x19 0x0019 #END OF MEDIUM
+0x1a 0x001a #SUBSTITUTE
+0x1b 0x001b #ESCAPE
+0x1c 0x001c #FILE SEPARATOR
+0x1d 0x001d #GROUP SEPARATOR
+0x1e 0x001e #RECORD SEPARATOR
+0x1f 0x001f #UNIT SEPARATOR
+0x20 0x0020 #SPACE
+0x21 0x0021 #EXCLAMATION MARK
+0x22 0x0022 #QUOTATION MARK
+0x23 0x0023 #NUMBER SIGN
+0x24 0x0024 #DOLLAR SIGN
+0x25 0x0025 #PERCENT SIGN
+0x26 0x0026 #AMPERSAND
+0x27 0x0027 #APOSTROPHE
+0x28 0x0028 #LEFT PARENTHESIS
+0x29 0x0029 #RIGHT PARENTHESIS
+0x2a 0x002a #ASTERISK
+0x2b 0x002b #PLUS SIGN
+0x2c 0x002c #COMMA
+0x2d 0x002d #HYPHEN-MINUS
+0x2e 0x002e #FULL STOP
+0x2f 0x002f #SOLIDUS
+0x30 0x0030 #DIGIT ZERO
+0x31 0x0031 #DIGIT ONE
+0x32 0x0032 #DIGIT TWO
+0x33 0x0033 #DIGIT THREE
+0x34 0x0034 #DIGIT FOUR
+0x35 0x0035 #DIGIT FIVE
+0x36 0x0036 #DIGIT SIX
+0x37 0x0037 #DIGIT SEVEN
+0x38 0x0038 #DIGIT EIGHT
+0x39 0x0039 #DIGIT NINE
+0x3a 0x003a #COLON
+0x3b 0x003b #SEMICOLON
+0x3c 0x003c #LESS-THAN SIGN
+0x3d 0x003d #EQUALS SIGN
+0x3e 0x003e #GREATER-THAN SIGN
+0x3f 0x003f #QUESTION MARK
+0x40 0x0040 #COMMERCIAL AT
+0x41 0x0041 #LATIN CAPITAL LETTER A
+0x42 0x0042 #LATIN CAPITAL LETTER B
+0x43 0x0043 #LATIN CAPITAL LETTER C
+0x44 0x0044 #LATIN CAPITAL LETTER D
+0x45 0x0045 #LATIN CAPITAL LETTER E
+0x46 0x0046 #LATIN CAPITAL LETTER F
+0x47 0x0047 #LATIN CAPITAL LETTER G
+0x48 0x0048 #LATIN CAPITAL LETTER H
+0x49 0x0049 #LATIN CAPITAL LETTER I
+0x4a 0x004a #LATIN CAPITAL LETTER J
+0x4b 0x004b #LATIN CAPITAL LETTER K
+0x4c 0x004c #LATIN CAPITAL LETTER L
+0x4d 0x004d #LATIN CAPITAL LETTER M
+0x4e 0x004e #LATIN CAPITAL LETTER N
+0x4f 0x004f #LATIN CAPITAL LETTER O
+0x50 0x0050 #LATIN CAPITAL LETTER P
+0x51 0x0051 #LATIN CAPITAL LETTER Q
+0x52 0x0052 #LATIN CAPITAL LETTER R
+0x53 0x0053 #LATIN CAPITAL LETTER S
+0x54 0x0054 #LATIN CAPITAL LETTER T
+0x55 0x0055 #LATIN CAPITAL LETTER U
+0x56 0x0056 #LATIN CAPITAL LETTER V
+0x57 0x0057 #LATIN CAPITAL LETTER W
+0x58 0x0058 #LATIN CAPITAL LETTER X
+0x59 0x0059 #LATIN CAPITAL LETTER Y
+0x5a 0x005a #LATIN CAPITAL LETTER Z
+0x5b 0x005b #LEFT SQUARE BRACKET
+0x5c 0x005c #REVERSE SOLIDUS
+0x5d 0x005d #RIGHT SQUARE BRACKET
+0x5e 0x005e #CIRCUMFLEX ACCENT
+0x5f 0x005f #LOW LINE
+0x60 0x0060 #GRAVE ACCENT
+0x61 0x0061 #LATIN SMALL LETTER A
+0x62 0x0062 #LATIN SMALL LETTER B
+0x63 0x0063 #LATIN SMALL LETTER C
+0x64 0x0064 #LATIN SMALL LETTER D
+0x65 0x0065 #LATIN SMALL LETTER E
+0x66 0x0066 #LATIN SMALL LETTER F
+0x67 0x0067 #LATIN SMALL LETTER G
+0x68 0x0068 #LATIN SMALL LETTER H
+0x69 0x0069 #LATIN SMALL LETTER I
+0x6a 0x006a #LATIN SMALL LETTER J
+0x6b 0x006b #LATIN SMALL LETTER K
+0x6c 0x006c #LATIN SMALL LETTER L
+0x6d 0x006d #LATIN SMALL LETTER M
+0x6e 0x006e #LATIN SMALL LETTER N
+0x6f 0x006f #LATIN SMALL LETTER O
+0x70 0x0070 #LATIN SMALL LETTER P
+0x71 0x0071 #LATIN SMALL LETTER Q
+0x72 0x0072 #LATIN SMALL LETTER R
+0x73 0x0073 #LATIN SMALL LETTER S
+0x74 0x0074 #LATIN SMALL LETTER T
+0x75 0x0075 #LATIN SMALL LETTER U
+0x76 0x0076 #LATIN SMALL LETTER V
+0x77 0x0077 #LATIN SMALL LETTER W
+0x78 0x0078 #LATIN SMALL LETTER X
+0x79 0x0079 #LATIN SMALL LETTER Y
+0x7a 0x007a #LATIN SMALL LETTER Z
+0x7b 0x007b #LEFT CURLY BRACKET
+0x7c 0x007c #VERTICAL LINE
+0x7d 0x007d #RIGHT CURLY BRACKET
+0x7e 0x007e #TILDE
+0x7f 0x007f #DELETE
+0x80 #UNDEFINED
+0x81 #UNDEFINED
+0x82 #UNDEFINED
+0x83 #UNDEFINED
+0x84 #UNDEFINED
+0x85 #UNDEFINED
+0x86 0x0386 #GREEK CAPITAL LETTER ALPHA WITH TONOS
+0x87 #UNDEFINED
+0x88 0x00b7 #MIDDLE DOT
+0x89 0x00ac #NOT SIGN
+0x8a 0x00a6 #BROKEN BAR
+0x8b 0x2018 #LEFT SINGLE QUOTATION MARK
+0x8c 0x2019 #RIGHT SINGLE QUOTATION MARK
+0x8d 0x0388 #GREEK CAPITAL LETTER EPSILON WITH TONOS
+0x8e 0x2015 #HORIZONTAL BAR
+0x8f 0x0389 #GREEK CAPITAL LETTER ETA WITH TONOS
+0x90 0x038a #GREEK CAPITAL LETTER IOTA WITH TONOS
+0x91 0x03aa #GREEK CAPITAL LETTER IOTA WITH DIALYTIKA
+0x92 0x038c #GREEK CAPITAL LETTER OMICRON WITH TONOS
+0x93 #UNDEFINED
+0x94 #UNDEFINED
+0x95 0x038e #GREEK CAPITAL LETTER UPSILON WITH TONOS
+0x96 0x03ab #GREEK CAPITAL LETTER UPSILON WITH DIALYTIKA
+0x97 0x00a9 #COPYRIGHT SIGN
+0x98 0x038f #GREEK CAPITAL LETTER OMEGA WITH TONOS
+0x99 0x00b2 #SUPERSCRIPT TWO
+0x9a 0x00b3 #SUPERSCRIPT THREE
+0x9b 0x03ac #GREEK SMALL LETTER ALPHA WITH TONOS
+0x9c 0x00a3 #POUND SIGN
+0x9d 0x03ad #GREEK SMALL LETTER EPSILON WITH TONOS
+0x9e 0x03ae #GREEK SMALL LETTER ETA WITH TONOS
+0x9f 0x03af #GREEK SMALL LETTER IOTA WITH TONOS
+0xa0 0x03ca #GREEK SMALL LETTER IOTA WITH DIALYTIKA
+0xa1 0x0390 #GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
+0xa2 0x03cc #GREEK SMALL LETTER OMICRON WITH TONOS
+0xa3 0x03cd #GREEK SMALL LETTER UPSILON WITH TONOS
+0xa4 0x0391 #GREEK CAPITAL LETTER ALPHA
+0xa5 0x0392 #GREEK CAPITAL LETTER BETA
+0xa6 0x0393 #GREEK CAPITAL LETTER GAMMA
+0xa7 0x0394 #GREEK CAPITAL LETTER DELTA
+0xa8 0x0395 #GREEK CAPITAL LETTER EPSILON
+0xa9 0x0396 #GREEK CAPITAL LETTER ZETA
+0xaa 0x0397 #GREEK CAPITAL LETTER ETA
+0xab 0x00bd #VULGAR FRACTION ONE HALF
+0xac 0x0398 #GREEK CAPITAL LETTER THETA
+0xad 0x0399 #GREEK CAPITAL LETTER IOTA
+0xae 0x00ab #LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xaf 0x00bb #RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xb0 0x2591 #LIGHT SHADE
+0xb1 0x2592 #MEDIUM SHADE
+0xb2 0x2593 #DARK SHADE
+0xb3 0x2502 #BOX DRAWINGS LIGHT VERTICAL
+0xb4 0x2524 #BOX DRAWINGS LIGHT VERTICAL AND LEFT
+0xb5 0x039a #GREEK CAPITAL LETTER KAPPA
+0xb6 0x039b #GREEK CAPITAL LETTER LAMDA
+0xb7 0x039c #GREEK CAPITAL LETTER MU
+0xb8 0x039d #GREEK CAPITAL LETTER NU
+0xb9 0x2563 #BOX DRAWINGS DOUBLE VERTICAL AND LEFT
+0xba 0x2551 #BOX DRAWINGS DOUBLE VERTICAL
+0xbb 0x2557 #BOX DRAWINGS DOUBLE DOWN AND LEFT
+0xbc 0x255d #BOX DRAWINGS DOUBLE UP AND LEFT
+0xbd 0x039e #GREEK CAPITAL LETTER XI
+0xbe 0x039f #GREEK CAPITAL LETTER OMICRON
+0xbf 0x2510 #BOX DRAWINGS LIGHT DOWN AND LEFT
+0xc0 0x2514 #BOX DRAWINGS LIGHT UP AND RIGHT
+0xc1 0x2534 #BOX DRAWINGS LIGHT UP AND HORIZONTAL
+0xc2 0x252c #BOX DRAWINGS LIGHT DOWN AND HORIZONTAL
+0xc3 0x251c #BOX DRAWINGS LIGHT VERTICAL AND RIGHT
+0xc4 0x2500 #BOX DRAWINGS LIGHT HORIZONTAL
+0xc5 0x253c #BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL
+0xc6 0x03a0 #GREEK CAPITAL LETTER PI
+0xc7 0x03a1 #GREEK CAPITAL LETTER RHO
+0xc8 0x255a #BOX DRAWINGS DOUBLE UP AND RIGHT
+0xc9 0x2554 #BOX DRAWINGS DOUBLE DOWN AND RIGHT
+0xca 0x2569 #BOX DRAWINGS DOUBLE UP AND HORIZONTAL
+0xcb 0x2566 #BOX DRAWINGS DOUBLE DOWN AND HORIZONTAL
+0xcc 0x2560 #BOX DRAWINGS DOUBLE VERTICAL AND RIGHT
+0xcd 0x2550 #BOX DRAWINGS DOUBLE HORIZONTAL
+0xce 0x256c #BOX DRAWINGS DOUBLE VERTICAL AND HORIZONTAL
+0xcf 0x03a3 #GREEK CAPITAL LETTER SIGMA
+0xd0 0x03a4 #GREEK CAPITAL LETTER TAU
+0xd1 0x03a5 #GREEK CAPITAL LETTER UPSILON
+0xd2 0x03a6 #GREEK CAPITAL LETTER PHI
+0xd3 0x03a7 #GREEK CAPITAL LETTER CHI
+0xd4 0x03a8 #GREEK CAPITAL LETTER PSI
+0xd5 0x03a9 #GREEK CAPITAL LETTER OMEGA
+0xd6 0x03b1 #GREEK SMALL LETTER ALPHA
+0xd7 0x03b2 #GREEK SMALL LETTER BETA
+0xd8 0x03b3 #GREEK SMALL LETTER GAMMA
+0xd9 0x2518 #BOX DRAWINGS LIGHT UP AND LEFT
+0xda 0x250c #BOX DRAWINGS LIGHT DOWN AND RIGHT
+0xdb 0x2588 #FULL BLOCK
+0xdc 0x2584 #LOWER HALF BLOCK
+0xdd 0x03b4 #GREEK SMALL LETTER DELTA
+0xde 0x03b5 #GREEK SMALL LETTER EPSILON
+0xdf 0x2580 #UPPER HALF BLOCK
+0xe0 0x03b6 #GREEK SMALL LETTER ZETA
+0xe1 0x03b7 #GREEK SMALL LETTER ETA
+0xe2 0x03b8 #GREEK SMALL LETTER THETA
+0xe3 0x03b9 #GREEK SMALL LETTER IOTA
+0xe4 0x03ba #GREEK SMALL LETTER KAPPA
+0xe5 0x03bb #GREEK SMALL LETTER LAMDA
+0xe6 0x03bc #GREEK SMALL LETTER MU
+0xe7 0x03bd #GREEK SMALL LETTER NU
+0xe8 0x03be #GREEK SMALL LETTER XI
+0xe9 0x03bf #GREEK SMALL LETTER OMICRON
+0xea 0x03c0 #GREEK SMALL LETTER PI
+0xeb 0x03c1 #GREEK SMALL LETTER RHO
+0xec 0x03c3 #GREEK SMALL LETTER SIGMA
+0xed 0x03c2 #GREEK SMALL LETTER FINAL SIGMA
+0xee 0x03c4 #GREEK SMALL LETTER TAU
+0xef 0x0384 #GREEK TONOS
+0xf0 0x00ad #SOFT HYPHEN
+0xf1 0x00b1 #PLUS-MINUS SIGN
+0xf2 0x03c5 #GREEK SMALL LETTER UPSILON
+0xf3 0x03c6 #GREEK SMALL LETTER PHI
+0xf4 0x03c7 #GREEK SMALL LETTER CHI
+0xf5 0x00a7 #SECTION SIGN
+0xf6 0x03c8 #GREEK SMALL LETTER PSI
+0xf7 0x0385 #GREEK DIALYTIKA TONOS
+0xf8 0x00b0 #DEGREE SIGN
+0xf9 0x00a8 #DIAERESIS
+0xfa 0x03c9 #GREEK SMALL LETTER OMEGA
+0xfb 0x03cb #GREEK SMALL LETTER UPSILON WITH DIALYTIKA
+0xfc 0x03b0 #GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS
+0xfd 0x03ce #GREEK SMALL LETTER OMEGA WITH TONOS
+0xfe 0x25a0 #BLACK SQUARE
+0xff 0x00a0 #NO-BREAK SPACE
+
+\1a
\ No newline at end of file
--- /dev/null
+#\r
+# Name: cp874 to Unicode table\r
+# Unicode version: 2.0\r
+# Table version: 2.00\r
+# Table format: Format A\r
+# Date: 04/15/98\r
+#\r
+# Contact: cpxlate@microsoft.com\r
+#\r
+# General notes: none\r
+#\r
+# Format: Three tab-separated columns\r
+# Column #1 is the cp874 code (in hex)\r
+# Column #2 is the Unicode (in hex as 0xXXXX)\r
+# Column #3 is the Unicode name (follows a comment sign, '#')\r
+#\r
+# The entries are in cp874 order\r
+#\r
+0x00 0x0000 #NULL\r
+0x01 0x0001 #START OF HEADING\r
+0x02 0x0002 #START OF TEXT\r
+0x03 0x0003 #END OF TEXT\r
+0x04 0x0004 #END OF TRANSMISSION\r
+0x05 0x0005 #ENQUIRY\r
+0x06 0x0006 #ACKNOWLEDGE\r
+0x07 0x0007 #BELL\r
+0x08 0x0008 #BACKSPACE\r
+0x09 0x0009 #HORIZONTAL TABULATION\r
+0x0A 0x000A #LINE FEED\r
+0x0B 0x000B #VERTICAL TABULATION\r
+0x0C 0x000C #FORM FEED\r
+0x0D 0x000D #CARRIAGE RETURN\r
+0x0E 0x000E #SHIFT OUT\r
+0x0F 0x000F #SHIFT IN\r
+0x10 0x0010 #DATA LINK ESCAPE\r
+0x11 0x0011 #DEVICE CONTROL ONE\r
+0x12 0x0012 #DEVICE CONTROL TWO\r
+0x13 0x0013 #DEVICE CONTROL THREE\r
+0x14 0x0014 #DEVICE CONTROL FOUR\r
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE\r
+0x16 0x0016 #SYNCHRONOUS IDLE\r
+0x17 0x0017 #END OF TRANSMISSION BLOCK\r
+0x18 0x0018 #CANCEL\r
+0x19 0x0019 #END OF MEDIUM\r
+0x1A 0x001A #SUBSTITUTE\r
+0x1B 0x001B #ESCAPE\r
+0x1C 0x001C #FILE SEPARATOR\r
+0x1D 0x001D #GROUP SEPARATOR\r
+0x1E 0x001E #RECORD SEPARATOR\r
+0x1F 0x001F #UNIT SEPARATOR\r
+0x20 0x0020 #SPACE\r
+0x21 0x0021 #EXCLAMATION MARK\r
+0x22 0x0022 #QUOTATION MARK\r
+0x23 0x0023 #NUMBER SIGN\r
+0x24 0x0024 #DOLLAR SIGN\r
+0x25 0x0025 #PERCENT SIGN\r
+0x26 0x0026 #AMPERSAND\r
+0x27 0x0027 #APOSTROPHE\r
+0x28 0x0028 #LEFT PARENTHESIS\r
+0x29 0x0029 #RIGHT PARENTHESIS\r
+0x2A 0x002A #ASTERISK\r
+0x2B 0x002B #PLUS SIGN\r
+0x2C 0x002C #COMMA\r
+0x2D 0x002D #HYPHEN-MINUS\r
+0x2E 0x002E #FULL STOP\r
+0x2F 0x002F #SOLIDUS\r
+0x30 0x0030 #DIGIT ZERO\r
+0x31 0x0031 #DIGIT ONE\r
+0x32 0x0032 #DIGIT TWO\r
+0x33 0x0033 #DIGIT THREE\r
+0x34 0x0034 #DIGIT FOUR\r
+0x35 0x0035 #DIGIT FIVE\r
+0x36 0x0036 #DIGIT SIX\r
+0x37 0x0037 #DIGIT SEVEN\r
+0x38 0x0038 #DIGIT EIGHT\r
+0x39 0x0039 #DIGIT NINE\r
+0x3A 0x003A #COLON\r
+0x3B 0x003B #SEMICOLON\r
+0x3C 0x003C #LESS-THAN SIGN\r
+0x3D 0x003D #EQUALS SIGN\r
+0x3E 0x003E #GREATER-THAN SIGN\r
+0x3F 0x003F #QUESTION MARK\r
+0x40 0x0040 #COMMERCIAL AT\r
+0x41 0x0041 #LATIN CAPITAL LETTER A\r
+0x42 0x0042 #LATIN CAPITAL LETTER B\r
+0x43 0x0043 #LATIN CAPITAL LETTER C\r
+0x44 0x0044 #LATIN CAPITAL LETTER D\r
+0x45 0x0045 #LATIN CAPITAL LETTER E\r
+0x46 0x0046 #LATIN CAPITAL LETTER F\r
+0x47 0x0047 #LATIN CAPITAL LETTER G\r
+0x48 0x0048 #LATIN CAPITAL LETTER H\r
+0x49 0x0049 #LATIN CAPITAL LETTER I\r
+0x4A 0x004A #LATIN CAPITAL LETTER J\r
+0x4B 0x004B #LATIN CAPITAL LETTER K\r
+0x4C 0x004C #LATIN CAPITAL LETTER L\r
+0x4D 0x004D #LATIN CAPITAL LETTER M\r
+0x4E 0x004E #LATIN CAPITAL LETTER N\r
+0x4F 0x004F #LATIN CAPITAL LETTER O\r
+0x50 0x0050 #LATIN CAPITAL LETTER P\r
+0x51 0x0051 #LATIN CAPITAL LETTER Q\r
+0x52 0x0052 #LATIN CAPITAL LETTER R\r
+0x53 0x0053 #LATIN CAPITAL LETTER S\r
+0x54 0x0054 #LATIN CAPITAL LETTER T\r
+0x55 0x0055 #LATIN CAPITAL LETTER U\r
+0x56 0x0056 #LATIN CAPITAL LETTER V\r
+0x57 0x0057 #LATIN CAPITAL LETTER W\r
+0x58 0x0058 #LATIN CAPITAL LETTER X\r
+0x59 0x0059 #LATIN CAPITAL LETTER Y\r
+0x5A 0x005A #LATIN CAPITAL LETTER Z\r
+0x5B 0x005B #LEFT SQUARE BRACKET\r
+0x5C 0x005C #REVERSE SOLIDUS\r
+0x5D 0x005D #RIGHT SQUARE BRACKET\r
+0x5E 0x005E #CIRCUMFLEX ACCENT\r
+0x5F 0x005F #LOW LINE\r
+0x60 0x0060 #GRAVE ACCENT\r
+0x61 0x0061 #LATIN SMALL LETTER A\r
+0x62 0x0062 #LATIN SMALL LETTER B\r
+0x63 0x0063 #LATIN SMALL LETTER C\r
+0x64 0x0064 #LATIN SMALL LETTER D\r
+0x65 0x0065 #LATIN SMALL LETTER E\r
+0x66 0x0066 #LATIN SMALL LETTER F\r
+0x67 0x0067 #LATIN SMALL LETTER G\r
+0x68 0x0068 #LATIN SMALL LETTER H\r
+0x69 0x0069 #LATIN SMALL LETTER I\r
+0x6A 0x006A #LATIN SMALL LETTER J\r
+0x6B 0x006B #LATIN SMALL LETTER K\r
+0x6C 0x006C #LATIN SMALL LETTER L\r
+0x6D 0x006D #LATIN SMALL LETTER M\r
+0x6E 0x006E #LATIN SMALL LETTER N\r
+0x6F 0x006F #LATIN SMALL LETTER O\r
+0x70 0x0070 #LATIN SMALL LETTER P\r
+0x71 0x0071 #LATIN SMALL LETTER Q\r
+0x72 0x0072 #LATIN SMALL LETTER R\r
+0x73 0x0073 #LATIN SMALL LETTER S\r
+0x74 0x0074 #LATIN SMALL LETTER T\r
+0x75 0x0075 #LATIN SMALL LETTER U\r
+0x76 0x0076 #LATIN SMALL LETTER V\r
+0x77 0x0077 #LATIN SMALL LETTER W\r
+0x78 0x0078 #LATIN SMALL LETTER X\r
+0x79 0x0079 #LATIN SMALL LETTER Y\r
+0x7A 0x007A #LATIN SMALL LETTER Z\r
+0x7B 0x007B #LEFT CURLY BRACKET\r
+0x7C 0x007C #VERTICAL LINE\r
+0x7D 0x007D #RIGHT CURLY BRACKET\r
+0x7E 0x007E #TILDE\r
+0x7F 0x007F #DELETE\r
+0x80 0x20AC #EURO SIGN\r
+0x81 #UNDEFINED\r
+0x82 #UNDEFINED\r
+0x83 #UNDEFINED\r
+0x84 #UNDEFINED\r
+0x85 0x2026 #HORIZONTAL ELLIPSIS\r
+0x86 #UNDEFINED\r
+0x87 #UNDEFINED\r
+0x88 #UNDEFINED\r
+0x89 #UNDEFINED\r
+0x8A #UNDEFINED\r
+0x8B #UNDEFINED\r
+0x8C #UNDEFINED\r
+0x8D #UNDEFINED\r
+0x8E #UNDEFINED\r
+0x8F #UNDEFINED\r
+0x90 #UNDEFINED\r
+0x91 0x2018 #LEFT SINGLE QUOTATION MARK\r
+0x92 0x2019 #RIGHT SINGLE QUOTATION MARK\r
+0x93 0x201C #LEFT DOUBLE QUOTATION MARK\r
+0x94 0x201D #RIGHT DOUBLE QUOTATION MARK\r
+0x95 0x2022 #BULLET\r
+0x96 0x2013 #EN DASH\r
+0x97 0x2014 #EM DASH\r
+0x98 #UNDEFINED\r
+0x99 #UNDEFINED\r
+0x9A #UNDEFINED\r
+0x9B #UNDEFINED\r
+0x9C #UNDEFINED\r
+0x9D #UNDEFINED\r
+0x9E #UNDEFINED\r
+0x9F #UNDEFINED\r
+0xA0 0x00A0 #NO-BREAK SPACE\r
+0xA1 0x0E01 #THAI CHARACTER KO KAI\r
+0xA2 0x0E02 #THAI CHARACTER KHO KHAI\r
+0xA3 0x0E03 #THAI CHARACTER KHO KHUAT\r
+0xA4 0x0E04 #THAI CHARACTER KHO KHWAI\r
+0xA5 0x0E05 #THAI CHARACTER KHO KHON\r
+0xA6 0x0E06 #THAI CHARACTER KHO RAKHANG\r
+0xA7 0x0E07 #THAI CHARACTER NGO NGU\r
+0xA8 0x0E08 #THAI CHARACTER CHO CHAN\r
+0xA9 0x0E09 #THAI CHARACTER CHO CHING\r
+0xAA 0x0E0A #THAI CHARACTER CHO CHANG\r
+0xAB 0x0E0B #THAI CHARACTER SO SO\r
+0xAC 0x0E0C #THAI CHARACTER CHO CHOE\r
+0xAD 0x0E0D #THAI CHARACTER YO YING\r
+0xAE 0x0E0E #THAI CHARACTER DO CHADA\r
+0xAF 0x0E0F #THAI CHARACTER TO PATAK\r
+0xB0 0x0E10 #THAI CHARACTER THO THAN\r
+0xB1 0x0E11 #THAI CHARACTER THO NANGMONTHO\r
+0xB2 0x0E12 #THAI CHARACTER THO PHUTHAO\r
+0xB3 0x0E13 #THAI CHARACTER NO NEN\r
+0xB4 0x0E14 #THAI CHARACTER DO DEK\r
+0xB5 0x0E15 #THAI CHARACTER TO TAO\r
+0xB6 0x0E16 #THAI CHARACTER THO THUNG\r
+0xB7 0x0E17 #THAI CHARACTER THO THAHAN\r
+0xB8 0x0E18 #THAI CHARACTER THO THONG\r
+0xB9 0x0E19 #THAI CHARACTER NO NU\r
+0xBA 0x0E1A #THAI CHARACTER BO BAIMAI\r
+0xBB 0x0E1B #THAI CHARACTER PO PLA\r
+0xBC 0x0E1C #THAI CHARACTER PHO PHUNG\r
+0xBD 0x0E1D #THAI CHARACTER FO FA\r
+0xBE 0x0E1E #THAI CHARACTER PHO PHAN\r
+0xBF 0x0E1F #THAI CHARACTER FO FAN\r
+0xC0 0x0E20 #THAI CHARACTER PHO SAMPHAO\r
+0xC1 0x0E21 #THAI CHARACTER MO MA\r
+0xC2 0x0E22 #THAI CHARACTER YO YAK\r
+0xC3 0x0E23 #THAI CHARACTER RO RUA\r
+0xC4 0x0E24 #THAI CHARACTER RU\r
+0xC5 0x0E25 #THAI CHARACTER LO LING\r
+0xC6 0x0E26 #THAI CHARACTER LU\r
+0xC7 0x0E27 #THAI CHARACTER WO WAEN\r
+0xC8 0x0E28 #THAI CHARACTER SO SALA\r
+0xC9 0x0E29 #THAI CHARACTER SO RUSI\r
+0xCA 0x0E2A #THAI CHARACTER SO SUA\r
+0xCB 0x0E2B #THAI CHARACTER HO HIP\r
+0xCC 0x0E2C #THAI CHARACTER LO CHULA\r
+0xCD 0x0E2D #THAI CHARACTER O ANG\r
+0xCE 0x0E2E #THAI CHARACTER HO NOKHUK\r
+0xCF 0x0E2F #THAI CHARACTER PAIYANNOI\r
+0xD0 0x0E30 #THAI CHARACTER SARA A\r
+0xD1 0x0E31 #THAI CHARACTER MAI HAN-AKAT\r
+0xD2 0x0E32 #THAI CHARACTER SARA AA\r
+0xD3 0x0E33 #THAI CHARACTER SARA AM\r
+0xD4 0x0E34 #THAI CHARACTER SARA I\r
+0xD5 0x0E35 #THAI CHARACTER SARA II\r
+0xD6 0x0E36 #THAI CHARACTER SARA UE\r
+0xD7 0x0E37 #THAI CHARACTER SARA UEE\r
+0xD8 0x0E38 #THAI CHARACTER SARA U\r
+0xD9 0x0E39 #THAI CHARACTER SARA UU\r
+0xDA 0x0E3A #THAI CHARACTER PHINTHU\r
+0xDB #UNDEFINED\r
+0xDC #UNDEFINED\r
+0xDD #UNDEFINED\r
+0xDE #UNDEFINED\r
+0xDF 0x0E3F #THAI CURRENCY SYMBOL BAHT\r
+0xE0 0x0E40 #THAI CHARACTER SARA E\r
+0xE1 0x0E41 #THAI CHARACTER SARA AE\r
+0xE2 0x0E42 #THAI CHARACTER SARA O\r
+0xE3 0x0E43 #THAI CHARACTER SARA AI MAIMUAN\r
+0xE4 0x0E44 #THAI CHARACTER SARA AI MAIMALAI\r
+0xE5 0x0E45 #THAI CHARACTER LAKKHANGYAO\r
+0xE6 0x0E46 #THAI CHARACTER MAIYAMOK\r
+0xE7 0x0E47 #THAI CHARACTER MAITAIKHU\r
+0xE8 0x0E48 #THAI CHARACTER MAI EK\r
+0xE9 0x0E49 #THAI CHARACTER MAI THO\r
+0xEA 0x0E4A #THAI CHARACTER MAI TRI\r
+0xEB 0x0E4B #THAI CHARACTER MAI CHATTAWA\r
+0xEC 0x0E4C #THAI CHARACTER THANTHAKHAT\r
+0xED 0x0E4D #THAI CHARACTER NIKHAHIT\r
+0xEE 0x0E4E #THAI CHARACTER YAMAKKAN\r
+0xEF 0x0E4F #THAI CHARACTER FONGMAN\r
+0xF0 0x0E50 #THAI DIGIT ZERO\r
+0xF1 0x0E51 #THAI DIGIT ONE\r
+0xF2 0x0E52 #THAI DIGIT TWO\r
+0xF3 0x0E53 #THAI DIGIT THREE\r
+0xF4 0x0E54 #THAI DIGIT FOUR\r
+0xF5 0x0E55 #THAI DIGIT FIVE\r
+0xF6 0x0E56 #THAI DIGIT SIX\r
+0xF7 0x0E57 #THAI DIGIT SEVEN\r
+0xF8 0x0E58 #THAI DIGIT EIGHT\r
+0xF9 0x0E59 #THAI DIGIT NINE\r
+0xFA 0x0E5A #THAI CHARACTER ANGKHANKHU\r
+0xFB 0x0E5B #THAI CHARACTER KHOMUT\r
+0xFC #UNDEFINED\r
+0xFD #UNDEFINED\r
+0xFE #UNDEFINED\r
+0xFF #UNDEFINED\r
--- /dev/null
+# KOI8-R to Unicode translation (from RFC 1489)
+0x00 0x0000 # NULL
+0x01 0x0001 # START OF HEADING
+0x02 0x0002 # START OF TEXT
+0x03 0x0003 # END OF TEXT
+0x04 0x0004 # END OF TRANSMISSION
+0x05 0x0005 # ENQUIRY
+0x06 0x0006 # ACKNOWLEDGE
+0x07 0x0007 # BELL
+0x08 0x0008 # BACKSPACE
+0x09 0x0009 # HORIZONTAL TABULATION
+0x0a 0x000a # LINE FEED
+0x0b 0x000b # VERTICAL TABULATION
+0x0c 0x000c # FORM FEED
+0x0d 0x000d # CARRIAGE RETURN
+0x0e 0x000e # SHIFT OUT
+0x0f 0x000f # SHIFT IN
+0x10 0x0010 # DATA LINK ESCAPE
+0x11 0x0011 # DEVICE CONTROL ONE
+0x12 0x0012 # DEVICE CONTROL TWO
+0x13 0x0013 # DEVICE CONTROL THREE
+0x14 0x0014 # DEVICE CONTROL FOUR
+0x15 0x0015 # NEGATIVE ACKNOWLEDGE
+0x16 0x0016 # SYNCHRONOUS IDLE
+0x17 0x0017 # END OF TRANSMISSION BLOCK
+0x18 0x0018 # CANCEL
+0x19 0x0019 # END OF MEDIUM
+0x1a 0x001a # SUBSTITUTE
+0x1b 0x001b # ESCAPE
+0x1c 0x001c # FILE SEPARATOR
+0x1d 0x001d # GROUP SEPARATOR
+0x1e 0x001e # RECORD SEPARATOR
+0x1f 0x001f # UNIT SEPARATOR
+0x20 0x0020 # SPACE
+0x21 0x0021 # EXCLAMATION MARK
+0x22 0x0022 # QUOTATION MARK
+0x23 0x0023 # NUMBER SIGN
+0x24 0x0024 # DOLLAR SIGN
+0x25 0x0025 # PERCENT SIGN
+0x26 0x0026 # AMPERSAND
+0x27 0x0027 # APOSTROPHE
+0x28 0x0028 # LEFT PARENTHESIS
+0x29 0x0029 # RIGHT PARENTHESIS
+0x2a 0x002a # ASTERISK
+0x2b 0x002b # PLUS SIGN
+0x2c 0x002c # COMMA
+0x2d 0x002d # HYPHEN-MINUS
+0x2e 0x002e # FULL STOP
+0x2f 0x002f # SOLIDUS
+0x30 0x0030 # DIGIT ZERO
+0x31 0x0031 # DIGIT ONE
+0x32 0x0032 # DIGIT TWO
+0x33 0x0033 # DIGIT THREE
+0x34 0x0034 # DIGIT FOUR
+0x35 0x0035 # DIGIT FIVE
+0x36 0x0036 # DIGIT SIX
+0x37 0x0037 # DIGIT SEVEN
+0x38 0x0038 # DIGIT EIGHT
+0x39 0x0039 # DIGIT NINE
+0x3a 0x003a # COLON
+0x3b 0x003b # SEMICOLON
+0x3c 0x003c # LESS-THAN SIGN
+0x3d 0x003d # EQUALS SIGN
+0x3e 0x003e # GREATER-THAN SIGN
+0x3f 0x003f # QUESTION MARK
+0x40 0x0040 # COMMERCIAL AT
+0x41 0x0041 # LATIN CAPITAL LETTER A
+0x42 0x0042 # LATIN CAPITAL LETTER B
+0x43 0x0043 # LATIN CAPITAL LETTER C
+0x44 0x0044 # LATIN CAPITAL LETTER D
+0x45 0x0045 # LATIN CAPITAL LETTER E
+0x46 0x0046 # LATIN CAPITAL LETTER F
+0x47 0x0047 # LATIN CAPITAL LETTER G
+0x48 0x0048 # LATIN CAPITAL LETTER H
+0x49 0x0049 # LATIN CAPITAL LETTER I
+0x4a 0x004a # LATIN CAPITAL LETTER J
+0x4b 0x004b # LATIN CAPITAL LETTER K
+0x4c 0x004c # LATIN CAPITAL LETTER L
+0x4d 0x004d # LATIN CAPITAL LETTER M
+0x4e 0x004e # LATIN CAPITAL LETTER N
+0x4f 0x004f # LATIN CAPITAL LETTER O
+0x50 0x0050 # LATIN CAPITAL LETTER P
+0x51 0x0051 # LATIN CAPITAL LETTER Q
+0x52 0x0052 # LATIN CAPITAL LETTER R
+0x53 0x0053 # LATIN CAPITAL LETTER S
+0x54 0x0054 # LATIN CAPITAL LETTER T
+0x55 0x0055 # LATIN CAPITAL LETTER U
+0x56 0x0056 # LATIN CAPITAL LETTER V
+0x57 0x0057 # LATIN CAPITAL LETTER W
+0x58 0x0058 # LATIN CAPITAL LETTER X
+0x59 0x0059 # LATIN CAPITAL LETTER Y
+0x5a 0x005a # LATIN CAPITAL LETTER Z
+0x5b 0x005b # LEFT SQUARE BRACKET
+0x5c 0x005c # REVERSE SOLIDUS
+0x5d 0x005d # RIGHT SQUARE BRACKET
+0x5e 0x005e # CIRCUMFLEX ACCENT
+0x5f 0x005f # LOW LINE
+0x60 0x0060 # GRAVE ACCENT
+0x61 0x0061 # LATIN SMALL LETTER A
+0x62 0x0062 # LATIN SMALL LETTER B
+0x63 0x0063 # LATIN SMALL LETTER C
+0x64 0x0064 # LATIN SMALL LETTER D
+0x65 0x0065 # LATIN SMALL LETTER E
+0x66 0x0066 # LATIN SMALL LETTER F
+0x67 0x0067 # LATIN SMALL LETTER G
+0x68 0x0068 # LATIN SMALL LETTER H
+0x69 0x0069 # LATIN SMALL LETTER I
+0x6a 0x006a # LATIN SMALL LETTER J
+0x6b 0x006b # LATIN SMALL LETTER K
+0x6c 0x006c # LATIN SMALL LETTER L
+0x6d 0x006d # LATIN SMALL LETTER M
+0x6e 0x006e # LATIN SMALL LETTER N
+0x6f 0x006f # LATIN SMALL LETTER O
+0x70 0x0070 # LATIN SMALL LETTER P
+0x71 0x0071 # LATIN SMALL LETTER Q
+0x72 0x0072 # LATIN SMALL LETTER R
+0x73 0x0073 # LATIN SMALL LETTER S
+0x74 0x0074 # LATIN SMALL LETTER T
+0x75 0x0075 # LATIN SMALL LETTER U
+0x76 0x0076 # LATIN SMALL LETTER V
+0x77 0x0077 # LATIN SMALL LETTER W
+0x78 0x0078 # LATIN SMALL LETTER X
+0x79 0x0079 # LATIN SMALL LETTER Y
+0x7a 0x007a # LATIN SMALL LETTER Z
+0x7b 0x007b # LEFT CURLY BRACKET
+0x7c 0x007c # VERTICAL LINE
+0x7d 0x007d # RIGHT CURLY BRACKET
+0x7e 0x007e # TILDE
+0x7f 0x007f # DELETE
+0x80 0x2500 # BOX DRAWINGS LIGHT HORIZONTAL
+0x81 0x2502 # BOX DRAWINGS LIGHT VERTICAL
+0x82 0x250c # BOX DRAWINGS LIGHT DOWN AND RIGHT
+0x83 0x2510 # BOX DRAWINGS LIGHT DOWN AND LEFT
+0x84 0x2514 # BOX DRAWINGS LIGHT UP AND RIGHT
+0x85 0x2518 # BOX DRAWINGS LIGHT UP AND LEFT
+0x86 0x251c # BOX DRAWINGS LIGHT VERTICAL AND RIGHT
+0x87 0x2524 # BOX DRAWINGS LIGHT VERTICAL AND LEFT
+0x88 0x252c # BOX DRAWINGS LIGHT DOWN AND HORIZONTAL
+0x89 0x2534 # BOX DRAWINGS LIGHT UP AND HORIZONTAL
+0x8a 0x253c # BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL
+0x8b 0x2580 # UPPER HALF BLOCK
+0x8c 0x2584 # LOWER HALF BLOCK
+0x8d 0x2588 # FULL BLOCK
+0x8e 0x258c # LEFT HALF BLOCK
+0x8f 0x2590 # RIGHT HALF BLOCK
+0x90 0x2591 # LIGHT SHADE
+0x91 0x2592 # MEDIUM SHADE
+0x92 0x2593 # DARK SHADE
+0x93 0x2320 # TOP HALF INTEGRAL
+0x94 0x25a0 # BLACK SQUARE
+0x95 0x2219 # BULLET OPERATOR
+0x96 0x221a # SQUARE ROOT
+0x97 0x2248 # ALMOST EQUAL TO
+0x98 0x2264 # LESS-THAN OR EQUAL TO
+0x99 0x2265 # GREATER-THAN OR EQUAL TO
+0x9a 0x00a0 # NO-BREAK SPACE
+0x9b 0x2321 # BOTTOM HALF INTEGRAL
+0x9c 0x00b0 # DEGREE SIGN
+0x9d 0x00b2 # SUPERSCRIPT TWO
+0x9e 0x00b7 # MIDDLE DOT
+0x9f 0x00f7 # DIVISION SIGN
+0xa0 0x2550 # BOX DRAWINGS DOUBLE HORIZONTAL
+0xa1 0x2551 # BOX DRAWINGS DOUBLE VERTICAL
+0xa2 0x2552 # BOX DRAWINGS DOWN SINGLE AND RIGHT DOUBLE
+0xa3 0x0451 # CYRILLIC SMALL LETTER IO
+0xa4 0x2553 # BOX DRAWINGS DOWN DOUBLE AND RIGHT SINGLE
+0xa5 0x2554 # BOX DRAWINGS DOUBLE DOWN AND RIGHT
+0xa6 0x2555 # BOX DRAWINGS DOWN SINGLE AND LEFT DOUBLE
+0xa7 0x2556 # BOX DRAWINGS DOWN DOUBLE AND LEFT SINGLE
+0xa8 0x2557 # BOX DRAWINGS DOUBLE DOWN AND LEFT
+0xa9 0x2558 # BOX DRAWINGS UP SINGLE AND RIGHT DOUBLE
+0xaa 0x2559 # BOX DRAWINGS UP DOUBLE AND RIGHT SINGLE
+0xab 0x255a # BOX DRAWINGS DOUBLE UP AND RIGHT
+0xac 0x255b # BOX DRAWINGS UP SINGLE AND LEFT DOUBLE
+0xad 0x255c # BOX DRAWINGS UP DOUBLE AND LEFT SINGLE
+0xae 0x255d # BOX DRAWINGS DOUBLE UP AND LEFT
+0xaf 0x255e # BOX DRAWINGS VERTICAL SINGLE AND RIGHT DOUBLE
+0xb0 0x255f # BOX DRAWINGS VERTICAL DOUBLE AND RIGHT SINGLE
+0xb1 0x2560 # BOX DRAWINGS DOUBLE VERTICAL AND RIGHT
+0xb2 0x2561 # BOX DRAWINGS VERTICAL SINGLE AND LEFT DOUBLE
+0xb3 0x0401 # CYRILLIC CAPITAL LETTER IO
+0xb4 0x2562 # BOX DRAWINGS VERTICAL DOUBLE AND LEFT SINGLE
+0xb5 0x2563 # BOX DRAWINGS DOUBLE VERTICAL AND LEFT
+0xb6 0x2564 # BOX DRAWINGS DOWN SINGLE AND HORIZONTAL DOUBLE
+0xb7 0x2565 # BOX DRAWINGS DOWN DOUBLE AND HORIZONTAL SINGLE
+0xb8 0x2566 # BOX DRAWINGS DOUBLE DOWN AND HORIZONTAL
+0xb9 0x2567 # BOX DRAWINGS UP SINGLE AND HORIZONTAL DOUBLE
+0xba 0x2568 # BOX DRAWINGS UP DOUBLE AND HORIZONTAL SINGLE
+0xbb 0x2569 # BOX DRAWINGS DOUBLE UP AND HORIZONTAL
+0xbc 0x256a # BOX DRAWINGS VERTICAL SINGLE AND HORIZONTAL DOUBLE
+0xbd 0x256b # BOX DRAWINGS VERTICAL DOUBLE AND HORIZONTAL SINGLE
+0xbe 0x256c # BOX DRAWINGS DOUBLE VERTICAL AND HORIZONTAL
+0xbf 0x00a9 # COPYRIGHT SIGN
+0xc0 0x044e # CYRILLIC SMALL LETTER YU
+0xc1 0x0430 # CYRILLIC SMALL LETTER A
+0xc2 0x0431 # CYRILLIC SMALL LETTER BE
+0xc3 0x0446 # CYRILLIC SMALL LETTER TSE
+0xc4 0x0434 # CYRILLIC SMALL LETTER DE
+0xc5 0x0435 # CYRILLIC SMALL LETTER IE
+0xc6 0x0444 # CYRILLIC SMALL LETTER EF
+0xc7 0x0433 # CYRILLIC SMALL LETTER GHE
+0xc8 0x0445 # CYRILLIC SMALL LETTER HA
+0xc9 0x0438 # CYRILLIC SMALL LETTER I
+0xca 0x0439 # CYRILLIC SMALL LETTER SHORT I
+0xcb 0x043a # CYRILLIC SMALL LETTER KA
+0xcc 0x043b # CYRILLIC SMALL LETTER EL
+0xcd 0x043c # CYRILLIC SMALL LETTER EM
+0xce 0x043d # CYRILLIC SMALL LETTER EN
+0xcf 0x043e # CYRILLIC SMALL LETTER O
+0xd0 0x043f # CYRILLIC SMALL LETTER PE
+0xd1 0x044f # CYRILLIC SMALL LETTER YA
+0xd2 0x0440 # CYRILLIC SMALL LETTER ER
+0xd3 0x0441 # CYRILLIC SMALL LETTER ES
+0xd4 0x0442 # CYRILLIC SMALL LETTER TE
+0xd5 0x0443 # CYRILLIC SMALL LETTER U
+0xd6 0x0436 # CYRILLIC SMALL LETTER ZHE
+0xd7 0x0432 # CYRILLIC SMALL LETTER VE
+0xd8 0x044c # CYRILLIC SMALL LETTER SOFT SIGN
+0xd9 0x044b # CYRILLIC SMALL LETTER YERU
+0xda 0x0437 # CYRILLIC SMALL LETTER ZE
+0xdb 0x0448 # CYRILLIC SMALL LETTER SHA
+0xdc 0x044d # CYRILLIC SMALL LETTER E
+0xdd 0x0449 # CYRILLIC SMALL LETTER SHCHA
+0xde 0x0447 # CYRILLIC SMALL LETTER CHE
+0xdf 0x044a # CYRILLIC SMALL LETTER HARD SIGN
+0xe0 0x042e # CYRILLIC CAPITAL LETTER YU
+0xe1 0x0410 # CYRILLIC CAPITAL LETTER A
+0xe2 0x0411 # CYRILLIC CAPITAL LETTER BE
+0xe3 0x0426 # CYRILLIC CAPITAL LETTER TSE
+0xe4 0x0414 # CYRILLIC CAPITAL LETTER DE
+0xe5 0x0415 # CYRILLIC CAPITAL LETTER IE
+0xe6 0x0424 # CYRILLIC CAPITAL LETTER EF
+0xe7 0x0413 # CYRILLIC CAPITAL LETTER GHE
+0xe8 0x0425 # CYRILLIC CAPITAL LETTER HA
+0xe9 0x0418 # CYRILLIC CAPITAL LETTER I
+0xea 0x0419 # CYRILLIC CAPITAL LETTER SHORT I
+0xeb 0x041a # CYRILLIC CAPITAL LETTER KA
+0xec 0x041b # CYRILLIC CAPITAL LETTER EL
+0xed 0x041c # CYRILLIC CAPITAL LETTER EM
+0xee 0x041d # CYRILLIC CAPITAL LETTER EN
+0xef 0x041e # CYRILLIC CAPITAL LETTER O
+0xf0 0x041f # CYRILLIC CAPITAL LETTER PE
+0xf1 0x042f # CYRILLIC CAPITAL LETTER YA
+0xf2 0x0420 # CYRILLIC CAPITAL LETTER ER
+0xf3 0x0421 # CYRILLIC CAPITAL LETTER ES
+0xf4 0x0422 # CYRILLIC CAPITAL LETTER TE
+0xf5 0x0423 # CYRILLIC CAPITAL LETTER U
+0xf6 0x0416 # CYRILLIC CAPITAL LETTER ZHE
+0xf7 0x0412 # CYRILLIC CAPITAL LETTER VE
+0xf8 0x042c # CYRILLIC CAPITAL LETTER SOFT SIGN
+0xf9 0x042b # CYRILLIC CAPITAL LETTER YERU
+0xfa 0x0417 # CYRILLIC CAPITAL LETTER ZE
+0xfb 0x0428 # CYRILLIC CAPITAL LETTER SHA
+0xfc 0x042d # CYRILLIC CAPITAL LETTER E
+0xfd 0x0429 # CYRILLIC CAPITAL LETTER SHCHA
+0xfe 0x0427 # CYRILLIC CAPITAL LETTER CHE
+0xff 0x042a # CYRILLIC CAPITAL LETTER HARD SIGN
--- /dev/null
+# KOI8-U to Unicode translation (from RFC 2319)
+0x20 0x0020 # SPACE
+0x21 0x0021 # EXCLAMATION MARK
+0x22 0x0022 # QUOTATION MARK
+0x23 0x0023 # NUMBER SIGN
+0x24 0x0024 # DOLLAR SIGN
+0x25 0x0025 # PERCENT SIGN
+0x26 0x0026 # AMPERSAND
+0x27 0x0027 # APOSTROPHE
+0x28 0x0028 # LEFT PARENTHESIS
+0x29 0x0029 # RIGHT PARENTHESIS
+0x2A 0x002A # ASTERISK
+0x2B 0x002B # PLUS SIGN
+0x2C 0x002C # COMMA
+0x2D 0x002D # HYPHEN-MINUS
+0x2E 0x002E # FULL STOP
+0x2F 0x002F # SOLIDUS
+0x30 0x0030 # DIGIT ZERO
+0x31 0x0031 # DIGIT ONE
+0x32 0x0032 # DIGIT TWO
+0x33 0x0033 # DIGIT THREE
+0x34 0x0034 # DIGIT FOUR
+0x35 0x0035 # DIGIT FIVE
+0x36 0x0036 # DIGIT SIX
+0x37 0x0037 # DIGIT SEVEN
+0x38 0x0038 # DIGIT EIGHT
+0x39 0x0039 # DIGIT NINE
+0x3A 0x003A # COLON
+0x3B 0x003B # SEMICOLON
+0x3C 0x003C # LESS-THAN SIGN
+0x3D 0x003D # EQUALS SIGN
+0x3E 0x003E # GREATER-THAN SIGN
+0x3F 0x003F # QUESTION MARK
+0x40 0x0040 # COMMERCIAL AT
+0x41 0x0041 # LATIN CAPITAL LETTER A
+0x42 0x0042 # LATIN CAPITAL LETTER B
+0x43 0x0043 # LATIN CAPITAL LETTER C
+0x44 0x0044 # LATIN CAPITAL LETTER D
+0x45 0x0045 # LATIN CAPITAL LETTER E
+0x46 0x0046 # LATIN CAPITAL LETTER F
+0x47 0x0047 # LATIN CAPITAL LETTER G
+0x48 0x0048 # LATIN CAPITAL LETTER H
+0x49 0x0049 # LATIN CAPITAL LETTER I
+0x4A 0x004A # LATIN CAPITAL LETTER J
+0x4B 0x004B # LATIN CAPITAL LETTER K
+0x4C 0x004C # LATIN CAPITAL LETTER L
+0x4D 0x004D # LATIN CAPITAL LETTER M
+0x4E 0x004E # LATIN CAPITAL LETTER N
+0x4F 0x004F # LATIN CAPITAL LETTER O
+0x50 0x0050 # LATIN CAPITAL LETTER P
+0x51 0x0051 # LATIN CAPITAL LETTER Q
+0x52 0x0052 # LATIN CAPITAL LETTER R
+0x53 0x0053 # LATIN CAPITAL LETTER S
+0x54 0x0054 # LATIN CAPITAL LETTER T
+0x55 0x0055 # LATIN CAPITAL LETTER U
+0x56 0x0056 # LATIN CAPITAL LETTER V
+0x57 0x0057 # LATIN CAPITAL LETTER W
+0x58 0x0058 # LATIN CAPITAL LETTER X
+0x59 0x0059 # LATIN CAPITAL LETTER Y
+0x5A 0x005A # LATIN CAPITAL LETTER Z
+0x5B 0x005B # LEFT SQUARE BRACKET
+0x5C 0x005C # REVERSE SOLIDUS
+0x5D 0x005D # RIGHT SQUARE BRACKET
+0x5E 0x005E # CIRCUMFLEX ACCENT
+0x5F 0x005F # LOW LINE
+0x60 0x0060 # GRAVE ACCENT
+0x61 0x0061 # LATIN SMALL LETTER A
+0x62 0x0062 # LATIN SMALL LETTER B
+0x63 0x0063 # LATIN SMALL LETTER C
+0x64 0x0064 # LATIN SMALL LETTER D
+0x65 0x0065 # LATIN SMALL LETTER E
+0x66 0x0066 # LATIN SMALL LETTER F
+0x67 0x0067 # LATIN SMALL LETTER G
+0x68 0x0068 # LATIN SMALL LETTER H
+0x69 0x0069 # LATIN SMALL LETTER I
+0x6A 0x006A # LATIN SMALL LETTER J
+0x6B 0x006B # LATIN SMALL LETTER K
+0x6C 0x006C # LATIN SMALL LETTER L
+0x6D 0x006D # LATIN SMALL LETTER M
+0x6E 0x006E # LATIN SMALL LETTER N
+0x6F 0x006F # LATIN SMALL LETTER O
+0x70 0x0070 # LATIN SMALL LETTER P
+0x71 0x0071 # LATIN SMALL LETTER Q
+0x72 0x0072 # LATIN SMALL LETTER R
+0x73 0x0073 # LATIN SMALL LETTER S
+0x74 0x0074 # LATIN SMALL LETTER T
+0x75 0x0075 # LATIN SMALL LETTER U
+0x76 0x0076 # LATIN SMALL LETTER V
+0x77 0x0077 # LATIN SMALL LETTER W
+0x78 0x0078 # LATIN SMALL LETTER X
+0x79 0x0079 # LATIN SMALL LETTER Y
+0x7A 0x007A # LATIN SMALL LETTER Z
+0x7B 0x007B # LEFT CURLY BRACKET
+0x7C 0x007C # VERTICAL LINE
+0x7D 0x007D # RIGHT CURLY BRACKET
+0x7E 0x007E # TILDE
+0x80 0x2500 # BOX DRAWINGS LIGHT HORIZONTAL
+0x81 0x2502 # BOX DRAWINGS LIGHT VERTICAL
+0x82 0x250C # BOX DRAWINGS LIGHT DOWN AND RIGHT
+0x83 0x2510 # BOX DRAWINGS LIGHT DOWN AND LEFT
+0x84 0x2514 # BOX DRAWINGS LIGHT UP AND RIGHT
+0x85 0x2518 # BOX DRAWINGS LIGHT UP AND LEFT
+0x86 0x251C # BOX DRAWINGS LIGHT VERTICAL AND RIGHT
+0x87 0x2524 # BOX DRAWINGS LIGHT VERTICAL AND LEFT
+0x88 0x252C # BOX DRAWINGS LIGHT DOWN AND HORIZONTAL
+0x89 0x2534 # BOX DRAWINGS LIGHT UP AND HORIZONTAL
+0x8A 0x253C # BOX DRAWINGS LIGHT VERTICAL AND HORIZONTAL
+0x8B 0x2580 # UPPER HALF BLOCK
+0x8C 0x2584 # LOWER HALF BLOCK
+0x8D 0x2588 # FULL BLOCK
+0x8E 0x258C # LEFT HALF BLOCK
+0x8F 0x2590 # RIGHT HALF BLOCK
+0x90 0x2591 # LIGHT SHADE
+0x91 0x2592 # MEDIUM SHADE
+0x92 0x2593 # DARK SHADE
+0x93 0x2320 # TOP HALF INTEGRAL
+0x94 0x25A0 # BLACK SQUARE
+0x95 0x2022 # BULLET
+0x96 0x221A # SQUARE ROOT
+0x97 0x2248 # ALMOST EQUAL TO
+0x98 0x2264 # LESS-THAN OR EQUAL TO
+0x99 0x2265 # GREATER-THAN OR EQUAL TO
+0x9A 0x00A0 # NO-BREAK SPACE
+0x9B 0x2321 # BOTTOM HALF INTEGRAL
+0x9C 0x00B0 # DEGREE SIGN
+0x9D 0x00B2 # SUPERSCRIPT TWO
+0x9E 0x00B7 # MIDDLE DOT
+0x9F 0x00F7 # DIVISION SIGN
+0xA0 0x2550 # BOX DRAWINGS DOUBLE HORIZONTAL
+0xA1 0x2551 # BOX DRAWINGS DOUBLE VERTICAL
+0xA2 0x2552 # BOX DRAWINGS DOWN SINGLE AND RIGHT DOUBLE
+0xA3 0x0451 # CYRILLIC SMALL LETTER IO
+0xA4 0x0454 # CYRILLIC SMALL LETTER UKRAINIAN IE
+0xA5 0x2554 # BOX DRAWINGS DOUBLE DOWN AND RIGHT
+0xA6 0x0456 # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+0xA7 0x0457 # CYRILLIC SMALL LETTER YI
+0xA8 0x2557 # BOX DRAWINGS DOUBLE DOWN AND LEFT
+0xA9 0x2558 # BOX DRAWINGS UP SINGLE AND RIGHT DOUBLE
+0xAA 0x2559 # BOX DRAWINGS UP DOUBLE AND RIGHT SINGLE
+0xAB 0x255A # BOX DRAWINGS DOUBLE UP AND RIGHT
+0xAC 0x255B # BOX DRAWINGS UP SINGLE AND LEFT DOUBLE
+0xAD 0x0491 # CYRILLIC SMALL LETTER GHE WITH UPTURN
+0xAE 0x255D # BOX DRAWINGS DOUBLE UP AND LEFT
+0xAF 0x255E # BOX DRAWINGS VERTICAL SINGLE AND RIGHT DOUBLE
+0xB0 0x255F # BOX DRAWINGS VERTICAL DOUBLE AND RIGHT SINGLE
+0xB1 0x2560 # BOX DRAWINGS DOUBLE VERTICAL AND RIGHT
+0xB2 0x2561 # BOX DRAWINGS VERTICAL SINGLE AND LEFT DOUBLE
+0xB3 0x0401 # CYRILLIC CAPITAL LETTER IO
+0xB4 0x0404 # CYRILLIC CAPITAL LETTER UKRAINIAN IE
+0xB5 0x2563 # BOX DRAWINGS DOUBLE VERTICAL AND LEFT
+0xB6 0x0406 # CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+0xB7 0x0407 # CYRILLIC CAPITAL LETTER YI
+0xB8 0x2566 # BOX DRAWINGS DOUBLE DOWN AND HORIZONTAL
+0xB9 0x2567 # BOX DRAWINGS UP SINGLE AND HORIZONTAL DOUBLE
+0xBA 0x2568 # BOX DRAWINGS UP DOUBLE AND HORIZONTAL SINGLE
+0xBB 0x2569 # BOX DRAWINGS DOUBLE UP AND HORIZONTAL
+0xBC 0x256A # BOX DRAWINGS VERTICAL SINGLE AND HORIZONTAL DOUBLE
+0xBD 0x0490 # CYRILLIC CAPITAL LETTER GHE WITH UPTURN
+0xBE 0x256C # BOX DRAWINGS DOUBLE VERTICAL AND HORIZONTAL
+0xBF 0x00A9 # COPYRIGHT SIGN
+0xC0 0x044E # CYRILLIC SMALL LETTER YU
+0xC1 0x0430 # CYRILLIC SMALL LETTER A
+0xC2 0x0431 # CYRILLIC SMALL LETTER BE
+0xC3 0x0446 # CYRILLIC SMALL LETTER TSE
+0xC4 0x0434 # CYRILLIC SMALL LETTER DE
+0xC5 0x0435 # CYRILLIC SMALL LETTER IE
+0xC6 0x0444 # CYRILLIC SMALL LETTER EF
+0xC7 0x0433 # CYRILLIC SMALL LETTER GHE
+0xC8 0x0445 # CYRILLIC SMALL LETTER HA
+0xC9 0x0438 # CYRILLIC SMALL LETTER I
+0xCA 0x0439 # CYRILLIC SMALL LETTER SHORT I
+0xCB 0x043A # CYRILLIC SMALL LETTER KA
+0xCC 0x043B # CYRILLIC SMALL LETTER EL
+0xCD 0x043C # CYRILLIC SMALL LETTER EM
+0xCE 0x043D # CYRILLIC SMALL LETTER EN
+0xCF 0x043E # CYRILLIC SMALL LETTER O
+0xD0 0x043F # CYRILLIC SMALL LETTER PE
+0xD1 0x044F # CYRILLIC SMALL LETTER YA
+0xD2 0x0440 # CYRILLIC SMALL LETTER ER
+0xD3 0x0441 # CYRILLIC SMALL LETTER ES
+0xD4 0x0442 # CYRILLIC SMALL LETTER TE
+0xD5 0x0443 # CYRILLIC SMALL LETTER U
+0xD6 0x0436 # CYRILLIC SMALL LETTER ZHE
+0xD7 0x0432 # CYRILLIC SMALL LETTER VE
+0xD8 0x044C # CYRILLIC SMALL LETTER SOFT SIGN
+0xD9 0x044B # CYRILLIC SMALL LETTER YERU
+0xDA 0x0437 # CYRILLIC SMALL LETTER ZE
+0xDB 0x0448 # CYRILLIC SMALL LETTER SHA
+0xDC 0x044D # CYRILLIC SMALL LETTER E
+0xDD 0x0449 # CYRILLIC SMALL LETTER SHCHA
+0xDE 0x0447 # CYRILLIC SMALL LETTER CHE
+0xDF 0x044A # CYRILLIC SMALL LETTER HARD SIGN
+0xE0 0x042E # CYRILLIC CAPITAL LETTER YU
+0xE1 0x0410 # CYRILLIC CAPITAL LETTER A
+0xE2 0x0411 # CYRILLIC CAPITAL LETTER BE
+0xE3 0x0426 # CYRILLIC CAPITAL LETTER TSE
+0xE4 0x0414 # CYRILLIC CAPITAL LETTER DE
+0xE5 0x0415 # CYRILLIC CAPITAL LETTER IE
+0xE6 0x0424 # CYRILLIC CAPITAL LETTER EF
+0xE7 0x0413 # CYRILLIC CAPITAL LETTER GHE
+0xE8 0x0425 # CYRILLIC CAPITAL LETTER HA
+0xE9 0x0418 # CYRILLIC CAPITAL LETTER I
+0xEA 0x0419 # CYRILLIC CAPITAL LETTER SHORT I
+0xEB 0x041A # CYRILLIC CAPITAL LETTER KA
+0xEC 0x041B # CYRILLIC CAPITAL LETTER EL
+0xED 0x041C # CYRILLIC CAPITAL LETTER EM
+0xEE 0x041D # CYRILLIC CAPITAL LETTER EN
+0xEF 0x041E # CYRILLIC CAPITAL LETTER O
+0xF0 0x041F # CYRILLIC CAPITAL LETTER PE
+0xF1 0x042F # CYRILLIC CAPITAL LETTER YA
+0xF2 0x0420 # CYRILLIC CAPITAL LETTER ER
+0xF3 0x0421 # CYRILLIC CAPITAL LETTER ES
+0xF4 0x0422 # CYRILLIC CAPITAL LETTER TE
+0xF5 0x0423 # CYRILLIC CAPITAL LETTER U
+0xF6 0x0416 # CYRILLIC CAPITAL LETTER ZHE
+0xF7 0x0412 # CYRILLIC CAPITAL LETTER VE
+0xF8 0x042C # CYRILLIC CAPITAL LETTER SOFT SIGN
+0xF9 0x042B # CYRILLIC CAPITAL LETTER YERU
+0xFA 0x0417 # CYRILLIC CAPITAL LETTER ZE
+0xFB 0x0428 # CYRILLIC CAPITAL LETTER SHA
+0xFC 0x042D # CYRILLIC CAPITAL LETTER E
+0xFD 0x0429 # CYRILLIC CAPITAL LETTER SHCHA
+0xFE 0x0427 # CYRILLIC CAPITAL LETTER CHE
+0xFF 0x042A # CYRILLIC CAPITAL LETTER HARD SIGN
--- /dev/null
+#=======================================================================
+# File name: ARABIC.TXT
+#
+# Contents: Map (external version) from Mac OS Arabic
+# character set to Unicode 2.1 and later.
+#
+# Copyright: (c) 1994-2002, 2005 by Apple Computer, Inc., all rights
+# reserved.
+#
+# Contact: charsets@apple.com
+#
+# Changes:
+#
+# c02 2005-Apr-04 Update header comments. Matches internal xml
+# <c1.2> and Text Encoding Converter 2.0.
+# b3,c1 2002-Dec-19 Add comments about character display and
+# direction overrides. Update URLs, notes.
+# Matches internal utom<b4>.
+# b02 1999-Sep-22 Update contact e-mail address. Matches
+# internal utom<b1>, ufrm<b1>, and Text
+# Encoding Converter version 1.5.
+# n10 1998-Feb-05 Show required Unicode character
+# directionality in a different way. Matches
+# internal utom<n4>, ufrm<n21>, and Text
+# Encoding Converter version 1.3. Update
+# header comments; include information on
+# loose mapping of digits.
+# n07 1997-Jul-17 Update to match internal utom<n2>, ufrm<n17>:
+# Change standard mapping for 0xC0 from U+066D
+# to U+274A. Add direction overrides to
+# mappings for 0x25, 0x2C, 0x3B, 0x3F. Add
+# information on variants.
+# n03 1995-Apr-18 First version (after fixing some typos).
+# Matches internal ufrm<n11>.
+#
+# Standard header:
+# ----------------
+#
+# Apple, the Apple logo, and Macintosh are trademarks of Apple
+# Computer, Inc., registered in the United States and other countries.
+# Unicode is a trademark of Unicode Inc. For the sake of brevity,
+# throughout this document, "Macintosh" can be used to refer to
+# Macintosh computers and "Unicode" can be used to refer to the
+# Unicode standard.
+#
+# Apple Computer, Inc. ("Apple") makes no warranty or representation,
+# either express or implied, with respect to this document and the
+# included data, its quality, accuracy, or fitness for a particular
+# purpose. In no event will Apple be liable for direct, indirect,
+# special, incidental, or consequential damages resulting from any
+# defect or inaccuracy in this document or the included data.
+#
+# These mapping tables and character lists are subject to change.
+# The latest tables should be available from the following:
+#
+# <http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/>
+#
+# For general information about Mac OS encodings and these mapping
+# tables, see the file "README.TXT".
+#
+# Format:
+# -------
+#
+# Three tab-separated columns;
+# '#' begins a comment which continues to the end of the line.
+# Column #1 is the Mac OS Arabic code (in hex as 0xNN).
+# Column #2 is the corresponding Unicode (in hex as 0xNNNN),
+# possibly preceded by a tag indicating required directionality
+# (i.e. <LR>+0xNNNN or <RL>+0xNNNN).
+# Column #3 is a comment containing the Unicode name.
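The column layout described above can be read with a small parser. This is only a sketch: `parse_mapping_line` and `LINE_RE` are hypothetical names, the leading `+` diff marker is assumed to have been stripped already, and only the single-code-point form with an optional `<LR>`/`<RL>` tag described in this header is handled.

```python
import re

# Hypothetical pattern for one data line of the three-column format.
LINE_RE = re.compile(
    r"^(0x[0-9A-Fa-f]{2})\s+"   # column 1: Mac OS Arabic code, 0xNN
    r"(?:<(LR|RL)>\+)?"         # optional required-direction tag
    r"(0x[0-9A-Fa-f]{4})"       # column 2: Unicode code point, 0xNNNN
    r"(?:\s*#\s*(.*))?$"        # column 3: comment with the Unicode name
)


def parse_mapping_line(line):
    """Return (mac_code, direction, code_point, name).

    direction is "LR", "RL", or None when the line carries no tag.
    Blank lines and pure comment lines yield None.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    match = LINE_RE.match(stripped)
    if match is None:
        raise ValueError("unrecognized mapping line: %r" % line)
    mac, direction, uni, name = match.groups()
    return int(mac, 16), direction, int(uni, 16), (name or "").strip()
```

Returning the direction tag separately from the code point keeps the table usable both for strict round-trip conversion and for callers that choose to ignore directionality.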
+#
+# The entries are in Mac OS Arabic code order.
+#
+# Control character mappings are not shown in this table, following
+# the conventions of the standard UTC mapping tables. However, the
+# Mac OS Arabic character set uses the standard control characters at
+# 0x00-0x1F and 0x7F.
+#
+# Notes on Mac OS Arabic:
+# -----------------------
+#
+# This is a legacy Mac OS encoding; in the Mac OS X Carbon and Cocoa
+# environments, it is only supported via transcoding to and from
+# Unicode.
+#
+# 1. General
+#
+# The Mac OS Arabic character set is intended to cover Arabic as
+# used in North Africa, the Arabian peninsula, and the Levant. It
+# also contains several characters needed for Urdu and/or Farsi.
+#
+# The Mac OS Arabic character set is essentially a superset of ISO
+# 8859-6. The 8859-6 code points that are interpreted differently
+# in the Mac OS Arabic set are as follows:
+# 0xA0 is NO-BREAK SPACE in 8859-6 and right-left SPACE in Mac OS
+# Arabic; NO-BREAK SPACE is 0x81 in Mac OS Arabic.
+# 0xA4 is CURRENCY SIGN in 8859-6 and right-left DOLLAR SIGN in
+# Mac OS Arabic.
+# 0xAD is SOFT HYPHEN in 8859-6 and right-left HYPHEN-MINUS in
+# Mac OS Arabic.
+# ISO 8859-6 specifies that codes 0x30-0x39 can be rendered either
+# with European digit shapes or Arabic digit shapes. This is also
+# true in Mac OS Arabic, which determines from context which digit
+# shapes to use (see below).
+#
+# The Mac OS Arabic character set uses the C1 controls area and other
+# code points which are undefined in ISO 8859-6 for additional
+# graphic characters: additional Arabic letters for Farsi and Urdu,
+# some accented Roman letters for European languages (such as French),
+# and duplicates of some of the punctuation, symbols, and digits in
+# the ASCII block. The duplicate punctuation, symbol, and digit
+# characters have right-left directionality, while the ASCII versions
+# have left-right directionality. See the next section for more
+# information on this.
+#
+# Mac OS Arabic characters 0xEB-0xF2 are non-spacing/combining marks.
+#
+# 2. Directional characters and roundtrip fidelity
+#
+# The Mac OS Arabic character set was developed in 1986-1987. At that
+# time the bidirectional line layout algorithm used in the Mac OS
+# Arabic system was fairly simple; it used only a few direction
+# classes (instead of the 19 now used in the Unicode bidirectional
+# algorithm). In order to permit users to handle some tricky layout
+# problems, certain punctuation and symbol characters were encoded
+# twice, once with a left-right direction attribute and once with
+# a right-left direction attribute.
+#
+# For example, plus sign is encoded at 0x2B with a left-right
+# attribute, and at 0xAB with a right-left attribute. However, there
+# is only one PLUS SIGN character in Unicode. This leads to some
+# interesting problems when mapping between Mac OS Arabic and Unicode;
+# see below.
+#
+# A related problem is that even when a particular character is
+# encoded only once in Mac OS Arabic, it may have a different
+# direction attribute than the corresponding Unicode character.
+#
+# For example, the Mac OS Arabic character at 0x93 is HORIZONTAL
+# ELLIPSIS with strong right-left direction. However, the Unicode
+# character HORIZONTAL ELLIPSIS has direction class neutral.
+#
+# 3. Behavior of ASCII-range numbers in WorldScript
+#
+# Mac OS Arabic also has two sets of digit codes.
+#
+# The digits at 0x30-0x39 may be displayed using either European
+# digit forms or Arabic digit forms, depending on context. If there
+# is a "strong European" character such as a Latin letter on either
+# side of a sequence consisting of digits 0x30-0x39 and possibly comma
+# 0x2C or period 0x2E, then the characters will be displayed using
+# European forms (this happens even if there are neutral characters
+# between the digits and the strong European character). Otherwise, the
+# digits will be displayed using Arabic forms, the comma will be
+# displayed as Arabic thousands separator, and the period as Arabic
+# decimal separator. In any case, 0x2C, 0x2E, and 0x30-0x39 are always
+# left-right.
+#
+# The digits at 0xB0-0xB9 are always displayed using Arabic digit
+# shapes, and moreover, these digits always have strong right-left
+# directionality. These are mainly intended for special layout
+# purposes such as part numbers, etc.
+#
+# 4. Font variants
+#
+# The table in this file gives the Unicode mappings for the standard
+# Mac OS Arabic encoding. This encoding is supported by the Cairo font
+# (the system font for Arabic), and is the encoding supported by the
+# text processing utilities. However, the other Arabic fonts actually
+# implement slightly different encodings; this mainly affects the code
+# points 0xAA and 0xC0. For these code points the standard Mac OS
+# Arabic encoding has the following mappings:
+# 0xAA -> <RL>+0x002A ASTERISK, right-left
+# 0xC0 -> <RL>+0x274A EIGHT TEARDROP-SPOKED PROPELLER ASTERISK,
+# right-left
+# This mapping of 0xAA is consistent with the normal convention for
+# Mac OS Arabic and Hebrew that the right-left duplicates have codes
+# that are equal to the ASCII code of the left-right character plus
+# 0x80. However, in all of the other fonts, 0xAA is MULTIPLY SIGN, and
+# right-left ASTERISK may be at a different code point. The other
+# variants are described below.
+#
+# The TrueType variant is used for most of the Arabic TrueType fonts:
+# Baghdad, Geeza, Kufi, Nadeem. It differs from the standard variant
+# in the following way:
+# 0xAA -> <RL>+0x00D7 MULTIPLICATION SIGN, right-left
+# 0xC0 -> <RL>+0x002A ASTERISK, right-left
+#
+# The Thuluth variant is used for the Arabic Postscript-only fonts:
+# Thuluth and Thuluth bold. It differs from the standard variant in
+# the following way:
+# 0xAA -> <RL>+0x00D7 MULTIPLICATION SIGN, right-left
+# 0xC0 -> 0x066D ARABIC FIVE POINTED STAR
+#
+# The AlBayan variant is used for the Arabic TrueType font Al Bayan.
+# It differs from the standard variant in the following way:
+# 0x81 -> no mapping (glyph just has authorship information, etc.)
+# 0xA3 -> 0xFDFA ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM
+# 0xA4 -> 0xFDF2 ARABIC LIGATURE ALLAH ISOLATED FORM
+# 0xAA -> <RL>+0x00D7 MULTIPLICATION SIGN, right-left
+# 0xDC -> <RL>+0x25CF BLACK CIRCLE, right-left
+# 0xFC -> <RL>+0x25A0 BLACK SQUARE, right-left
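One way to handle the font variants listed above is to keep the standard table and apply a small per-variant override on top of it. The sketch below assumes a hypothetical table shape, a dict from Mac OS Arabic code to a `(direction, code_point)` pair; the names `VARIANT_OVERRIDES` and `table_for_variant` are illustrative, and the values are copied from the variant lists above.

```python
# Per-variant differences from the standard table, as listed in the header.
# A value of None means "no mapping" for that code point.
VARIANT_OVERRIDES = {
    "TrueType": {
        0xAA: ("RL", 0x00D7),  # MULTIPLICATION SIGN, right-left
        0xC0: ("RL", 0x002A),  # ASTERISK, right-left
    },
    "Thuluth": {
        0xAA: ("RL", 0x00D7),  # MULTIPLICATION SIGN, right-left
        0xC0: (None, 0x066D),  # ARABIC FIVE POINTED STAR
    },
    "AlBayan": {
        0x81: None,            # no mapping
        0xA3: (None, 0xFDFA),  # ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM
        0xA4: (None, 0xFDF2),  # ARABIC LIGATURE ALLAH ISOLATED FORM
        0xAA: ("RL", 0x00D7),  # MULTIPLICATION SIGN, right-left
        0xDC: ("RL", 0x25CF),  # BLACK CIRCLE, right-left
        0xFC: ("RL", 0x25A0),  # BLACK SQUARE, right-left
    },
}


def table_for_variant(standard, variant):
    """Return a copy of the standard table with one variant's overrides applied."""
    table = dict(standard)
    for code, mapping in VARIANT_OVERRIDES[variant].items():
        if mapping is None:
            table.pop(code, None)  # the variant defines no mapping here
        else:
            table[code] = mapping
    return table
```

Copying the standard table rather than mutating it lets a converter hold several variants at once.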
+#
+# Unicode mapping issues and notes:
+# ---------------------------------
+#
+# 1. Matching the direction of Mac OS Arabic characters
+#
+# When Mac OS Arabic encodes a character twice but with different
+# direction attributes for the two code points - as in the case of
+# plus sign mentioned above - we need a way to map both Mac OS Arabic
+# code points to Unicode and back again without loss of information.
+# With the plus sign, for example, mapping one of the Mac OS Arabic
+# characters to a code in the Unicode corporate use zone is
+# undesirable, since both of the plus sign characters are likely to
+# be used in text that is interchanged.
+#
+# The problem is solved with the use of direction override characters
+# and direction-dependent mappings. When mapping from Mac OS Arabic
+# to Unicode, we use direction overrides as necessary to force the
+# direction of the resulting Unicode characters.
+#
+# The required direction is indicated by a direction tag in the
+# mappings. A tag of <LR> means the corresponding Unicode character
+# must have a strong left-right context, and a tag of <RL> indicates
+# a right-left context.
+#
+# For example, the mapping of 0x2B is given as <LR>+0x002B; the
+# mapping of 0xAB is given as <RL>+0x002B. If we map an isolated
+# instance of 0x2B to Unicode, it should be mapped as follows (LRO
+# indicates LEFT-RIGHT OVERRIDE, PDF indicates POP DIRECTION
+# FORMATTING):
+#
+# 0x2B -> 0x202D (LRO) + 0x002B (PLUS SIGN) + 0x202C (PDF)
+#
+# When mapping several characters in a row that require direction
+# forcing, the overrides need only be used at the beginning and end.
+# For example:
+#
+# 0x24 0x20 0x28 0x29 -> 0x202D 0x0024 0x0020 0x0028 0x0029 0x202C
+#
+# If neutral characters that require direction forcing are already
+# between strong-direction characters with matching directionality,
+# then direction overrides need not be used. Direction overrides are
+# always needed to map the right-left digits at 0xB0-0xB9.
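The run-based use of overrides described above might be sketched as follows. The table shape (a dict from Mac OS Arabic code to a `(direction, code_point)` pair) and the function name are assumptions made for illustration. The sketch always wraps tagged runs; it does not implement the optimization of omitting overrides when the surrounding strong characters already supply the required direction. RLO (U+202E, RIGHT-TO-LEFT OVERRIDE) is the right-left counterpart of the LRO shown in the examples.

```python
LRO = "\u202D"  # LEFT-TO-RIGHT OVERRIDE
RLO = "\u202E"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def mac_to_unicode(data, table):
    """Convert Mac OS Arabic bytes to a Unicode string.

    table maps a Mac OS code to (direction, code_point), where direction
    is "LR", "RL", or None. Consecutive characters that require the same
    direction share one override at the start and one PDF at the end.
    """
    out = []
    current = None  # direction of the override run currently open, if any
    for byte in data:
        direction, code_point = table[byte]
        if direction != current:
            if current is not None:
                out.append(PDF)  # close the previous run
            if direction is not None:
                out.append(LRO if direction == "LR" else RLO)
            current = direction
        out.append(chr(code_point))
    if current is not None:
        out.append(PDF)
    return "".join(out)
```

Bytes missing from the table (for example the control characters, which the table does not list) raise `KeyError` in this sketch; a real converter would need a policy for them.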
+#
+# When mapping from Unicode to Mac OS Arabic, the Unicode
+# bidirectional algorithm should be used to determine resolved
+# direction of the Unicode characters. The mapping from Unicode to
+# Mac OS Arabic can then be disambiguated by the use of the resolved
+# direction:
+#
+# Unicode 0x002B -> Mac OS Arabic 0x2B (if L) or 0xAB (if R)
+#
+# However, this also means the direction override characters should
+# be discarded when mapping from Unicode to Mac OS Arabic (after
+# they have been used to determine resolved direction), since the
+# direction override information is carried by the code point itself.
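The reverse direction can be sketched under the assumption that resolved directions have already been computed by some implementation of the Unicode bidirectional algorithm, which the Python standard library does not provide. The input shape (pairs of character and resolved direction) and the `reverse` table shape are hypothetical.

```python
OVERRIDES = {"\u202D", "\u202E", "\u202C"}  # LRO, RLO, PDF


def unicode_to_mac(resolved, reverse):
    """Convert Unicode text to Mac OS Arabic bytes.

    resolved: iterable of (char, direction) pairs, direction "L" or "R",
              as produced by a separate bidi resolution step.
    reverse:  dict from code point to a dict of {tag: mac_code}, where tag
              is "LR", "RL", or None, mirroring the tags in the table.
    """
    out = bytearray()
    for char, direction in resolved:
        if char in OVERRIDES:
            continue  # already consumed while resolving direction
        variants = reverse[ord(char)]
        if len(variants) == 1:
            # Only one Mac OS code exists; resolved direction is not needed.
            out.append(next(iter(variants.values())))
        else:
            out.append(variants["LR" if direction == "L" else "RL"])
    return bytes(out)
```

Characters with a single Mac OS code, such as the ellipsis at 0x93, are mapped without consulting the resolved direction, which matches the note on HORIZONTAL ELLIPSIS later in this header.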
+#
+# Even when direction overrides are not needed for roundtrip
+# fidelity, they are sometimes used when mapping Mac OS Arabic
+# characters to Unicode in order to achieve similar text layout with
+# the resulting Unicode text. For example, the single Mac OS Arabic
+# ellipsis character has direction class right-left, and there is no
+# left-right version. However, the Unicode HORIZONTAL ELLIPSIS
+# character has direction class neutral (which means it may end up
+# with a resolved direction of left-right if surrounded by left-right
+# characters). When mapping the Mac OS Arabic ellipsis to Unicode, it
+# is surrounded with a direction override to help preserve proper
+# text layout. The resolved direction is not needed or used when
+# mapping the Unicode HORIZONTAL ELLIPSIS back to Mac OS Arabic.
+#
+# 2. Mapping the Mac OS Arabic digits
+#
+# The main table below contains mappings that should be used when
+# strict round-trip fidelity is required. However, for numeric
+# values, the mappings in that table will produce Unicode characters
+# that may appear different than the Mac OS Arabic text displayed on
+# a Mac OS system using WorldScript. This is because WorldScript
+# uses context-dependent display for the 0x30-0x39 digits.
+#
+# If roundtrip fidelity is not required, then the following
+# alternate mappings should be used when a sequence of 0x30-0x39
+# digits - possibly including 0x2C and 0x2E - occurs in an Arabic
+# context (that is, when the first "strong" character on either side
+# of the digit sequence is Arabic, or there is no strong character):
+#
+# 0x2C 0x066C # ARABIC THOUSANDS SEPARATOR
+# 0x2E 0x066B # ARABIC DECIMAL SEPARATOR
+# 0x30 0x0660 # ARABIC-INDIC DIGIT ZERO
+# 0x31 0x0661 # ARABIC-INDIC DIGIT ONE
+# 0x32 0x0662 # ARABIC-INDIC DIGIT TWO
+# 0x33 0x0663 # ARABIC-INDIC DIGIT THREE
+# 0x34 0x0664 # ARABIC-INDIC DIGIT FOUR
+# 0x35 0x0665 # ARABIC-INDIC DIGIT FIVE
+# 0x36 0x0666 # ARABIC-INDIC DIGIT SIX
+# 0x37 0x0667 # ARABIC-INDIC DIGIT SEVEN
+# 0x38 0x0668 # ARABIC-INDIC DIGIT EIGHT
+# 0x39 0x0669 # ARABIC-INDIC DIGIT NINE
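The alternate mappings above amount to a small translation table. In the sketch below, `map_numeric_run` is a hypothetical helper; deciding whether a run sits in an Arabic context is left to the caller, and direction tags are ignored because these mappings apply only when round-trip fidelity is not required.

```python
# Alternate (non-round-trip) mappings for numeric runs in an Arabic context.
ARABIC_NUMERIC = {
    0x2C: 0x066C,  # ARABIC THOUSANDS SEPARATOR
    0x2E: 0x066B,  # ARABIC DECIMAL SEPARATOR
}
# 0x30-0x39 map to ARABIC-INDIC DIGIT ZERO through NINE.
ARABIC_NUMERIC.update({0x30 + i: 0x0660 + i for i in range(10)})


def map_numeric_run(run, arabic_context):
    """Map a run of 0x30-0x39, 0x2C and 0x2E bytes to Unicode text.

    arabic_context is supplied by the caller, who has inspected the
    strong characters around the run.
    """
    if arabic_context:
        return "".join(chr(ARABIC_NUMERIC[byte]) for byte in run)
    return run.decode("ascii")
```

For example, `map_numeric_run(b"1,234.5", True)` yields the Arabic-Indic digits with the Arabic thousands and decimal separators, while passing `False` leaves the ASCII characters unchanged.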
+#
+# Details of mapping changes in each version:
+# -------------------------------------------
+#
+# Changes from version n03 to version n07:
+#
+# - Change mapping for 0xC0 from U+066D to U+274A.
+#
+# - Add direction overrides (required directionality) to mappings
+# for 0x25, 0x2C, 0x3B, 0x3F.
+#
+##################
+
+0x20 <LR>+0x0020 # SPACE, left-right
+0x21 <LR>+0x0021 # EXCLAMATION MARK, left-right
+0x22 <LR>+0x0022 # QUOTATION MARK, left-right
+0x23 <LR>+0x0023 # NUMBER SIGN, left-right
+0x24 <LR>+0x0024 # DOLLAR SIGN, left-right
+0x25 <LR>+0x0025 # PERCENT SIGN, left-right
+0x26 <LR>+0x0026 # AMPERSAND, left-right
+0x27 <LR>+0x0027 # APOSTROPHE, left-right
+0x28 <LR>+0x0028 # LEFT PARENTHESIS, left-right
+0x29 <LR>+0x0029 # RIGHT PARENTHESIS, left-right
+0x2A <LR>+0x002A # ASTERISK, left-right
+0x2B <LR>+0x002B # PLUS SIGN, left-right
+0x2C <LR>+0x002C # COMMA, left-right; in Arabic-script context, displayed as 0x066C ARABIC THOUSANDS SEPARATOR
+0x2D <LR>+0x002D # HYPHEN-MINUS, left-right
+0x2E <LR>+0x002E # FULL STOP, left-right; in Arabic-script context, displayed as 0x066B ARABIC DECIMAL SEPARATOR
+0x2F <LR>+0x002F # SOLIDUS, left-right
+0x30 0x0030 # DIGIT ZERO; in Arabic-script context, displayed as 0x0660 ARABIC-INDIC DIGIT ZERO
+0x31 0x0031 # DIGIT ONE; in Arabic-script context, displayed as 0x0661 ARABIC-INDIC DIGIT ONE
+0x32 0x0032 # DIGIT TWO; in Arabic-script context, displayed as 0x0662 ARABIC-INDIC DIGIT TWO
+0x33 0x0033 # DIGIT THREE; in Arabic-script context, displayed as 0x0663 ARABIC-INDIC DIGIT THREE
+0x34 0x0034 # DIGIT FOUR; in Arabic-script context, displayed as 0x0664 ARABIC-INDIC DIGIT FOUR
+0x35 0x0035 # DIGIT FIVE; in Arabic-script context, displayed as 0x0665 ARABIC-INDIC DIGIT FIVE
+0x36 0x0036 # DIGIT SIX; in Arabic-script context, displayed as 0x0666 ARABIC-INDIC DIGIT SIX
+0x37 0x0037 # DIGIT SEVEN; in Arabic-script context, displayed as 0x0667 ARABIC-INDIC DIGIT SEVEN
+0x38 0x0038 # DIGIT EIGHT; in Arabic-script context, displayed as 0x0668 ARABIC-INDIC DIGIT EIGHT
+0x39 0x0039 # DIGIT NINE; in Arabic-script context, displayed as 0x0669 ARABIC-INDIC DIGIT NINE
+0x3A <LR>+0x003A # COLON, left-right
+0x3B <LR>+0x003B # SEMICOLON, left-right
+0x3C <LR>+0x003C # LESS-THAN SIGN, left-right
+0x3D <LR>+0x003D # EQUALS SIGN, left-right
+0x3E <LR>+0x003E # GREATER-THAN SIGN, left-right
+0x3F <LR>+0x003F # QUESTION MARK, left-right
+0x40 0x0040 # COMMERCIAL AT
+0x41 0x0041 # LATIN CAPITAL LETTER A
+0x42 0x0042 # LATIN CAPITAL LETTER B
+0x43 0x0043 # LATIN CAPITAL LETTER C
+0x44 0x0044 # LATIN CAPITAL LETTER D
+0x45 0x0045 # LATIN CAPITAL LETTER E
+0x46 0x0046 # LATIN CAPITAL LETTER F
+0x47 0x0047 # LATIN CAPITAL LETTER G
+0x48 0x0048 # LATIN CAPITAL LETTER H
+0x49 0x0049 # LATIN CAPITAL LETTER I
+0x4A 0x004A # LATIN CAPITAL LETTER J
+0x4B 0x004B # LATIN CAPITAL LETTER K
+0x4C 0x004C # LATIN CAPITAL LETTER L
+0x4D 0x004D # LATIN CAPITAL LETTER M
+0x4E 0x004E # LATIN CAPITAL LETTER N
+0x4F 0x004F # LATIN CAPITAL LETTER O
+0x50 0x0050 # LATIN CAPITAL LETTER P
+0x51 0x0051 # LATIN CAPITAL LETTER Q
+0x52 0x0052 # LATIN CAPITAL LETTER R
+0x53 0x0053 # LATIN CAPITAL LETTER S
+0x54 0x0054 # LATIN CAPITAL LETTER T
+0x55 0x0055 # LATIN CAPITAL LETTER U
+0x56 0x0056 # LATIN CAPITAL LETTER V
+0x57 0x0057 # LATIN CAPITAL LETTER W
+0x58 0x0058 # LATIN CAPITAL LETTER X
+0x59 0x0059 # LATIN CAPITAL LETTER Y
+0x5A 0x005A # LATIN CAPITAL LETTER Z
+0x5B <LR>+0x005B # LEFT SQUARE BRACKET, left-right
+0x5C <LR>+0x005C # REVERSE SOLIDUS, left-right
+0x5D <LR>+0x005D # RIGHT SQUARE BRACKET, left-right
+0x5E <LR>+0x005E # CIRCUMFLEX ACCENT, left-right
+0x5F <LR>+0x005F # LOW LINE, left-right
+0x60 0x0060 # GRAVE ACCENT
+0x61 0x0061 # LATIN SMALL LETTER A
+0x62 0x0062 # LATIN SMALL LETTER B
+0x63 0x0063 # LATIN SMALL LETTER C
+0x64 0x0064 # LATIN SMALL LETTER D
+0x65 0x0065 # LATIN SMALL LETTER E
+0x66 0x0066 # LATIN SMALL LETTER F
+0x67 0x0067 # LATIN SMALL LETTER G
+0x68 0x0068 # LATIN SMALL LETTER H
+0x69 0x0069 # LATIN SMALL LETTER I
+0x6A 0x006A # LATIN SMALL LETTER J
+0x6B 0x006B # LATIN SMALL LETTER K
+0x6C 0x006C # LATIN SMALL LETTER L
+0x6D 0x006D # LATIN SMALL LETTER M
+0x6E 0x006E # LATIN SMALL LETTER N
+0x6F 0x006F # LATIN SMALL LETTER O
+0x70 0x0070 # LATIN SMALL LETTER P
+0x71 0x0071 # LATIN SMALL LETTER Q
+0x72 0x0072 # LATIN SMALL LETTER R
+0x73 0x0073 # LATIN SMALL LETTER S
+0x74 0x0074 # LATIN SMALL LETTER T
+0x75 0x0075 # LATIN SMALL LETTER U
+0x76 0x0076 # LATIN SMALL LETTER V
+0x77 0x0077 # LATIN SMALL LETTER W
+0x78 0x0078 # LATIN SMALL LETTER X
+0x79 0x0079 # LATIN SMALL LETTER Y
+0x7A 0x007A # LATIN SMALL LETTER Z
+0x7B <LR>+0x007B # LEFT CURLY BRACKET, left-right
+0x7C <LR>+0x007C # VERTICAL LINE, left-right
+0x7D <LR>+0x007D # RIGHT CURLY BRACKET, left-right
+0x7E 0x007E # TILDE
+#
+0x80 0x00C4 # LATIN CAPITAL LETTER A WITH DIAERESIS
+0x81 <RL>+0x00A0 # NO-BREAK SPACE, right-left
+0x82 0x00C7 # LATIN CAPITAL LETTER C WITH CEDILLA
+0x83 0x00C9 # LATIN CAPITAL LETTER E WITH ACUTE
+0x84 0x00D1 # LATIN CAPITAL LETTER N WITH TILDE
+0x85 0x00D6 # LATIN CAPITAL LETTER O WITH DIAERESIS
+0x86 0x00DC # LATIN CAPITAL LETTER U WITH DIAERESIS
+0x87 0x00E1 # LATIN SMALL LETTER A WITH ACUTE
+0x88 0x00E0 # LATIN SMALL LETTER A WITH GRAVE
+0x89 0x00E2 # LATIN SMALL LETTER A WITH CIRCUMFLEX
+0x8A 0x00E4 # LATIN SMALL LETTER A WITH DIAERESIS
+0x8B 0x06BA # ARABIC LETTER NOON GHUNNA
+0x8C <RL>+0x00AB # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK, right-left
+0x8D 0x00E7 # LATIN SMALL LETTER C WITH CEDILLA
+0x8E 0x00E9 # LATIN SMALL LETTER E WITH ACUTE
+0x8F 0x00E8 # LATIN SMALL LETTER E WITH GRAVE
+0x90 0x00EA # LATIN SMALL LETTER E WITH CIRCUMFLEX
+0x91 0x00EB # LATIN SMALL LETTER E WITH DIAERESIS
+0x92 0x00ED # LATIN SMALL LETTER I WITH ACUTE
+0x93 <RL>+0x2026 # HORIZONTAL ELLIPSIS, right-left
+0x94 0x00EE # LATIN SMALL LETTER I WITH CIRCUMFLEX
+0x95 0x00EF # LATIN SMALL LETTER I WITH DIAERESIS
+0x96 0x00F1 # LATIN SMALL LETTER N WITH TILDE
+0x97 0x00F3 # LATIN SMALL LETTER O WITH ACUTE
+0x98 <RL>+0x00BB # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK, right-left
+0x99 0x00F4 # LATIN SMALL LETTER O WITH CIRCUMFLEX
+0x9A 0x00F6 # LATIN SMALL LETTER O WITH DIAERESIS
+0x9B <RL>+0x00F7 # DIVISION SIGN, right-left
+0x9C 0x00FA # LATIN SMALL LETTER U WITH ACUTE
+0x9D 0x00F9 # LATIN SMALL LETTER U WITH GRAVE
+0x9E 0x00FB # LATIN SMALL LETTER U WITH CIRCUMFLEX
+0x9F 0x00FC # LATIN SMALL LETTER U WITH DIAERESIS
+0xA0 <RL>+0x0020 # SPACE, right-left
+0xA1 <RL>+0x0021 # EXCLAMATION MARK, right-left
+0xA2 <RL>+0x0022 # QUOTATION MARK, right-left
+0xA3 <RL>+0x0023 # NUMBER SIGN, right-left
+0xA4 <RL>+0x0024 # DOLLAR SIGN, right-left
+0xA5 0x066A # ARABIC PERCENT SIGN
+0xA6 <RL>+0x0026 # AMPERSAND, right-left
+0xA7 <RL>+0x0027 # APOSTROPHE, right-left
+0xA8 <RL>+0x0028 # LEFT PARENTHESIS, right-left
+0xA9 <RL>+0x0029 # RIGHT PARENTHESIS, right-left
+0xAA <RL>+0x002A # ASTERISK, right-left
+0xAB <RL>+0x002B # PLUS SIGN, right-left
+0xAC 0x060C # ARABIC COMMA
+0xAD <RL>+0x002D # HYPHEN-MINUS, right-left
+0xAE <RL>+0x002E # FULL STOP, right-left
+0xAF <RL>+0x002F # SOLIDUS, right-left
+0xB0 <RL>+0x0660 # ARABIC-INDIC DIGIT ZERO, right-left (need override)
+0xB1 <RL>+0x0661 # ARABIC-INDIC DIGIT ONE, right-left (need override)
+0xB2 <RL>+0x0662 # ARABIC-INDIC DIGIT TWO, right-left (need override)
+0xB3 <RL>+0x0663 # ARABIC-INDIC DIGIT THREE, right-left (need override)
+0xB4 <RL>+0x0664 # ARABIC-INDIC DIGIT FOUR, right-left (need override)
+0xB5 <RL>+0x0665 # ARABIC-INDIC DIGIT FIVE, right-left (need override)
+0xB6 <RL>+0x0666 # ARABIC-INDIC DIGIT SIX, right-left (need override)
+0xB7 <RL>+0x0667 # ARABIC-INDIC DIGIT SEVEN, right-left (need override)
+0xB8 <RL>+0x0668 # ARABIC-INDIC DIGIT EIGHT, right-left (need override)
+0xB9 <RL>+0x0669 # ARABIC-INDIC DIGIT NINE, right-left (need override)
+0xBA <RL>+0x003A # COLON, right-left
+0xBB 0x061B # ARABIC SEMICOLON
+0xBC <RL>+0x003C # LESS-THAN SIGN, right-left
+0xBD <RL>+0x003D # EQUALS SIGN, right-left
+0xBE <RL>+0x003E # GREATER-THAN SIGN, right-left
+0xBF 0x061F # ARABIC QUESTION MARK
+0xC0 <RL>+0x274A # EIGHT TEARDROP-SPOKED PROPELLER ASTERISK, right-left
+0xC1 0x0621 # ARABIC LETTER HAMZA
+0xC2 0x0622 # ARABIC LETTER ALEF WITH MADDA ABOVE
+0xC3 0x0623 # ARABIC LETTER ALEF WITH HAMZA ABOVE
+0xC4 0x0624 # ARABIC LETTER WAW WITH HAMZA ABOVE
+0xC5 0x0625 # ARABIC LETTER ALEF WITH HAMZA BELOW
+0xC6 0x0626 # ARABIC LETTER YEH WITH HAMZA ABOVE
+0xC7 0x0627 # ARABIC LETTER ALEF
+0xC8 0x0628 # ARABIC LETTER BEH
+0xC9 0x0629 # ARABIC LETTER TEH MARBUTA
+0xCA 0x062A # ARABIC LETTER TEH
+0xCB 0x062B # ARABIC LETTER THEH
+0xCC 0x062C # ARABIC LETTER JEEM
+0xCD 0x062D # ARABIC LETTER HAH
+0xCE 0x062E # ARABIC LETTER KHAH
+0xCF 0x062F # ARABIC LETTER DAL
+0xD0 0x0630 # ARABIC LETTER THAL
+0xD1 0x0631 # ARABIC LETTER REH
+0xD2 0x0632 # ARABIC LETTER ZAIN
+0xD3 0x0633 # ARABIC LETTER SEEN
+0xD4 0x0634 # ARABIC LETTER SHEEN
+0xD5 0x0635 # ARABIC LETTER SAD
+0xD6 0x0636 # ARABIC LETTER DAD
+0xD7 0x0637 # ARABIC LETTER TAH
+0xD8 0x0638 # ARABIC LETTER ZAH
+0xD9 0x0639 # ARABIC LETTER AIN
+0xDA 0x063A # ARABIC LETTER GHAIN
+0xDB <RL>+0x005B # LEFT SQUARE BRACKET, right-left
+0xDC <RL>+0x005C # REVERSE SOLIDUS, right-left
+0xDD <RL>+0x005D # RIGHT SQUARE BRACKET, right-left
+0xDE <RL>+0x005E # CIRCUMFLEX ACCENT, right-left
+0xDF <RL>+0x005F # LOW LINE, right-left
+0xE0 0x0640 # ARABIC TATWEEL
+0xE1 0x0641 # ARABIC LETTER FEH
+0xE2 0x0642 # ARABIC LETTER QAF
+0xE3 0x0643 # ARABIC LETTER KAF
+0xE4 0x0644 # ARABIC LETTER LAM
+0xE5 0x0645 # ARABIC LETTER MEEM
+0xE6 0x0646 # ARABIC LETTER NOON
+0xE7 0x0647 # ARABIC LETTER HEH
+0xE8 0x0648 # ARABIC LETTER WAW
+0xE9 0x0649 # ARABIC LETTER ALEF MAKSURA
+0xEA 0x064A # ARABIC LETTER YEH
+0xEB 0x064B # ARABIC FATHATAN
+0xEC 0x064C # ARABIC DAMMATAN
+0xED 0x064D # ARABIC KASRATAN
+0xEE 0x064E # ARABIC FATHA
+0xEF 0x064F # ARABIC DAMMA
+0xF0 0x0650 # ARABIC KASRA
+0xF1 0x0651 # ARABIC SHADDA
+0xF2 0x0652 # ARABIC SUKUN
+0xF3 0x067E # ARABIC LETTER PEH
+0xF4 0x0679 # ARABIC LETTER TTEH
+0xF5 0x0686 # ARABIC LETTER TCHEH
+0xF6 0x06D5 # ARABIC LETTER AE
+0xF7 0x06A4 # ARABIC LETTER VEH
+0xF8 0x06AF # ARABIC LETTER GAF
+0xF9 0x0688 # ARABIC LETTER DDAL
+0xFA 0x0691 # ARABIC LETTER RREH
+0xFB <RL>+0x007B # LEFT CURLY BRACKET, right-left
+0xFC <RL>+0x007C # VERTICAL LINE, right-left
+0xFD <RL>+0x007D # RIGHT CURLY BRACKET, right-left
+0xFE 0x0698 # ARABIC LETTER JEH
+0xFF 0x06D2 # ARABIC LETTER YEH BARREE
--- /dev/null
+#=======================================================================
+# File name: CENTEURO.TXT
+#
+# Contents: Map (external version) from Mac OS Central European
+# character set to Unicode 2.1 and later.
+#
+# Copyright: (c) 1995-2002, 2005 by Apple Computer, Inc., all rights
+# reserved.
+#
+# Contact: charsets@apple.com
+#
+# Changes:
+#
+# c02 2005-Apr-04 Update header comments. Matches internal xml
+# <c1.1> and Text Encoding Converter 2.0.
+# b3,c1 2002-Dec-19 Update URLs. Matches internal utom<b1>.
+# b02 1999-Sep-22 Update contact e-mail address. Matches
+# internal utom<b1>, ufrm<b1>, and Text
+# Encoding Converter version 1.5.
+# n05 1998-Feb-05 Update header comments to new format; no
+# mapping changes. Matches internal utom<n3>,
+# ufrm<n13>, and Text Encoding Converter
+# version 1.3.
+# n03 1995-Apr-15 First version (after fixing some typos).
+# Matches internal ufrm<n5>.
+#
+# Standard header:
+# ----------------
+#
+# Apple, the Apple logo, and Macintosh are trademarks of Apple
+# Computer, Inc., registered in the United States and other countries.
+# Unicode is a trademark of Unicode Inc. For the sake of brevity,
+# throughout this document, "Macintosh" can be used to refer to
+# Macintosh computers and "Unicode" can be used to refer to the
+# Unicode standard.
+#
+# Apple Computer, Inc. ("Apple") makes no warranty or representation,
+# either express or implied, with respect to this document and the
+# included data, its quality, accuracy, or fitness for a particular
+# purpose. In no event will Apple be liable for direct, indirect,
+# special, incidental, or consequential damages resulting from any
+# defect or inaccuracy in this document or the included data.
+#
+# These mapping tables and character lists are subject to change.
+# The latest tables should be available from the following:
+#
+# <http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/>
+#
+# For general information about Mac OS encodings and these mapping
+# tables, see the file "README.TXT".
+#
+# Format:
+# -------
+#
+# Three tab-separated columns;
+# '#' begins a comment which continues to the end of the line.
+# Column #1 is the Mac OS Central European code (in hex as 0xNN)
+# Column #2 is the corresponding Unicode (in hex as 0xNNNN)
+# Column #3 is a comment containing the Unicode name
+#
+# The entries are in Mac OS Central European code order.
+#
+# Control character mappings are not shown in this table, following
+# the conventions of the standard UTC mapping tables. However, the
+# Mac OS Central European character set uses the standard control
+# characters at 0x00-0x1F and 0x7F.
+#
+# Notes on Mac OS Central European:
+# ---------------------------------
+#
+# This is a legacy Mac OS encoding; in the Mac OS X Carbon and Cocoa
+# environments, it is only supported directly in programming
+# interfaces for QuickDraw Text, the Script Manager, and related
+# Text Utilities. For other purposes it is supported via transcoding
+# to and from Unicode.
+#
+# This character set is intended to cover the following languages:
+#
+# Polish, Czech, Slovak, Hungarian, Estonian, Latvian, Lithuanian
+#
+# These are written in Latin script, but using a different set of
+# accented characters than Mac OS Roman. The Mac OS Central
+# European character set also includes a number of characters
+# needed for the Mac OS user interface and localization (e.g.
+# ellipsis, bullet, copyright sign), several typographic
+# punctuation symbols, math symbols, etc. However, it has a
+# smaller set of punctuation and symbols than Mac OS Roman. All of
+# the characters in Mac OS Central European that are also in the
+# Mac OS Roman character set are at the same code point in both
+# character sets; this improves application compatibility.
+#
+# Note: This does not have the same letter repertoire as ISO
+# 8859-2 (Latin-2); each has some accented letters that the other
+# does not have.
+#
+# Unicode mapping issues and notes:
+# ---------------------------------
+#
+# Details of mapping changes in each version:
+# -------------------------------------------
+#
+##################
+
+0x20 0x0020 # SPACE
+0x21 0x0021 # EXCLAMATION MARK
+0x22 0x0022 # QUOTATION MARK
+0x23 0x0023 # NUMBER SIGN
+0x24 0x0024 # DOLLAR SIGN
+0x25 0x0025 # PERCENT SIGN
+0x26 0x0026 # AMPERSAND
+0x27 0x0027 # APOSTROPHE
+0x28 0x0028 # LEFT PARENTHESIS
+0x29 0x0029 # RIGHT PARENTHESIS
+0x2A 0x002A # ASTERISK
+0x2B 0x002B # PLUS SIGN
+0x2C 0x002C # COMMA
+0x2D 0x002D # HYPHEN-MINUS
+0x2E 0x002E # FULL STOP
+0x2F 0x002F # SOLIDUS
+0x30 0x0030 # DIGIT ZERO
+0x31 0x0031 # DIGIT ONE
+0x32 0x0032 # DIGIT TWO
+0x33 0x0033 # DIGIT THREE
+0x34 0x0034 # DIGIT FOUR
+0x35 0x0035 # DIGIT FIVE
+0x36 0x0036 # DIGIT SIX
+0x37 0x0037 # DIGIT SEVEN
+0x38 0x0038 # DIGIT EIGHT
+0x39 0x0039 # DIGIT NINE
+0x3A 0x003A # COLON
+0x3B 0x003B # SEMICOLON
+0x3C 0x003C # LESS-THAN SIGN
+0x3D 0x003D # EQUALS SIGN
+0x3E 0x003E # GREATER-THAN SIGN
+0x3F 0x003F # QUESTION MARK
+0x40 0x0040 # COMMERCIAL AT
+0x41 0x0041 # LATIN CAPITAL LETTER A
+0x42 0x0042 # LATIN CAPITAL LETTER B
+0x43 0x0043 # LATIN CAPITAL LETTER C
+0x44 0x0044 # LATIN CAPITAL LETTER D
+0x45 0x0045 # LATIN CAPITAL LETTER E
+0x46 0x0046 # LATIN CAPITAL LETTER F
+0x47 0x0047 # LATIN CAPITAL LETTER G
+0x48 0x0048 # LATIN CAPITAL LETTER H
+0x49 0x0049 # LATIN CAPITAL LETTER I
+0x4A 0x004A # LATIN CAPITAL LETTER J
+0x4B 0x004B # LATIN CAPITAL LETTER K
+0x4C 0x004C # LATIN CAPITAL LETTER L
+0x4D 0x004D # LATIN CAPITAL LETTER M
+0x4E 0x004E # LATIN CAPITAL LETTER N
+0x4F 0x004F # LATIN CAPITAL LETTER O
+0x50 0x0050 # LATIN CAPITAL LETTER P
+0x51 0x0051 # LATIN CAPITAL LETTER Q
+0x52 0x0052 # LATIN CAPITAL LETTER R
+0x53 0x0053 # LATIN CAPITAL LETTER S
+0x54 0x0054 # LATIN CAPITAL LETTER T
+0x55 0x0055 # LATIN CAPITAL LETTER U
+0x56 0x0056 # LATIN CAPITAL LETTER V
+0x57 0x0057 # LATIN CAPITAL LETTER W
+0x58 0x0058 # LATIN CAPITAL LETTER X
+0x59 0x0059 # LATIN CAPITAL LETTER Y
+0x5A 0x005A # LATIN CAPITAL LETTER Z
+0x5B 0x005B # LEFT SQUARE BRACKET
+0x5C 0x005C # REVERSE SOLIDUS
+0x5D 0x005D # RIGHT SQUARE BRACKET
+0x5E 0x005E # CIRCUMFLEX ACCENT
+0x5F 0x005F # LOW LINE
+0x60 0x0060 # GRAVE ACCENT
+0x61 0x0061 # LATIN SMALL LETTER A
+0x62 0x0062 # LATIN SMALL LETTER B
+0x63 0x0063 # LATIN SMALL LETTER C
+0x64 0x0064 # LATIN SMALL LETTER D
+0x65 0x0065 # LATIN SMALL LETTER E
+0x66 0x0066 # LATIN SMALL LETTER F
+0x67 0x0067 # LATIN SMALL LETTER G
+0x68 0x0068 # LATIN SMALL LETTER H
+0x69 0x0069 # LATIN SMALL LETTER I
+0x6A 0x006A # LATIN SMALL LETTER J
+0x6B 0x006B # LATIN SMALL LETTER K
+0x6C 0x006C # LATIN SMALL LETTER L
+0x6D 0x006D # LATIN SMALL LETTER M
+0x6E 0x006E # LATIN SMALL LETTER N
+0x6F 0x006F # LATIN SMALL LETTER O
+0x70 0x0070 # LATIN SMALL LETTER P
+0x71 0x0071 # LATIN SMALL LETTER Q
+0x72 0x0072 # LATIN SMALL LETTER R
+0x73 0x0073 # LATIN SMALL LETTER S
+0x74 0x0074 # LATIN SMALL LETTER T
+0x75 0x0075 # LATIN SMALL LETTER U
+0x76 0x0076 # LATIN SMALL LETTER V
+0x77 0x0077 # LATIN SMALL LETTER W
+0x78 0x0078 # LATIN SMALL LETTER X
+0x79 0x0079 # LATIN SMALL LETTER Y
+0x7A 0x007A # LATIN SMALL LETTER Z
+0x7B 0x007B # LEFT CURLY BRACKET
+0x7C 0x007C # VERTICAL LINE
+0x7D 0x007D # RIGHT CURLY BRACKET
+0x7E 0x007E # TILDE
+#
+0x80 0x00C4 # LATIN CAPITAL LETTER A WITH DIAERESIS
+0x81 0x0100 # LATIN CAPITAL LETTER A WITH MACRON
+0x82 0x0101 # LATIN SMALL LETTER A WITH MACRON
+0x83 0x00C9 # LATIN CAPITAL LETTER E WITH ACUTE
+0x84 0x0104 # LATIN CAPITAL LETTER A WITH OGONEK
+0x85 0x00D6 # LATIN CAPITAL LETTER O WITH DIAERESIS
+0x86 0x00DC # LATIN CAPITAL LETTER U WITH DIAERESIS
+0x87 0x00E1 # LATIN SMALL LETTER A WITH ACUTE
+0x88 0x0105 # LATIN SMALL LETTER A WITH OGONEK
+0x89 0x010C # LATIN CAPITAL LETTER C WITH CARON
+0x8A 0x00E4 # LATIN SMALL LETTER A WITH DIAERESIS
+0x8B 0x010D # LATIN SMALL LETTER C WITH CARON
+0x8C 0x0106 # LATIN CAPITAL LETTER C WITH ACUTE
+0x8D 0x0107 # LATIN SMALL LETTER C WITH ACUTE
+0x8E 0x00E9 # LATIN SMALL LETTER E WITH ACUTE
+0x8F 0x0179 # LATIN CAPITAL LETTER Z WITH ACUTE
+0x90 0x017A # LATIN SMALL LETTER Z WITH ACUTE
+0x91 0x010E # LATIN CAPITAL LETTER D WITH CARON
+0x92 0x00ED # LATIN SMALL LETTER I WITH ACUTE
+0x93 0x010F # LATIN SMALL LETTER D WITH CARON
+0x94 0x0112 # LATIN CAPITAL LETTER E WITH MACRON
+0x95 0x0113 # LATIN SMALL LETTER E WITH MACRON
+0x96 0x0116 # LATIN CAPITAL LETTER E WITH DOT ABOVE
+0x97 0x00F3 # LATIN SMALL LETTER O WITH ACUTE
+0x98 0x0117 # LATIN SMALL LETTER E WITH DOT ABOVE
+0x99 0x00F4 # LATIN SMALL LETTER O WITH CIRCUMFLEX
+0x9A 0x00F6 # LATIN SMALL LETTER O WITH DIAERESIS
+0x9B 0x00F5 # LATIN SMALL LETTER O WITH TILDE
+0x9C 0x00FA # LATIN SMALL LETTER U WITH ACUTE
+0x9D 0x011A # LATIN CAPITAL LETTER E WITH CARON
+0x9E 0x011B # LATIN SMALL LETTER E WITH CARON
+0x9F 0x00FC # LATIN SMALL LETTER U WITH DIAERESIS
+0xA0 0x2020 # DAGGER
+0xA1 0x00B0 # DEGREE SIGN
+0xA2 0x0118 # LATIN CAPITAL LETTER E WITH OGONEK
+0xA3 0x00A3 # POUND SIGN
+0xA4 0x00A7 # SECTION SIGN
+0xA5 0x2022 # BULLET
+0xA6 0x00B6 # PILCROW SIGN
+0xA7 0x00DF # LATIN SMALL LETTER SHARP S
+0xA8 0x00AE # REGISTERED SIGN
+0xA9 0x00A9 # COPYRIGHT SIGN
+0xAA 0x2122 # TRADE MARK SIGN
+0xAB 0x0119 # LATIN SMALL LETTER E WITH OGONEK
+0xAC 0x00A8 # DIAERESIS
+0xAD 0x2260 # NOT EQUAL TO
+0xAE 0x0123 # LATIN SMALL LETTER G WITH CEDILLA
+0xAF 0x012E # LATIN CAPITAL LETTER I WITH OGONEK
+0xB0 0x012F # LATIN SMALL LETTER I WITH OGONEK
+0xB1 0x012A # LATIN CAPITAL LETTER I WITH MACRON
+0xB2 0x2264 # LESS-THAN OR EQUAL TO
+0xB3 0x2265 # GREATER-THAN OR EQUAL TO
+0xB4 0x012B # LATIN SMALL LETTER I WITH MACRON
+0xB5 0x0136 # LATIN CAPITAL LETTER K WITH CEDILLA
+0xB6 0x2202 # PARTIAL DIFFERENTIAL
+0xB7 0x2211 # N-ARY SUMMATION
+0xB8 0x0142 # LATIN SMALL LETTER L WITH STROKE
+0xB9 0x013B # LATIN CAPITAL LETTER L WITH CEDILLA
+0xBA 0x013C # LATIN SMALL LETTER L WITH CEDILLA
+0xBB 0x013D # LATIN CAPITAL LETTER L WITH CARON
+0xBC 0x013E # LATIN SMALL LETTER L WITH CARON
+0xBD 0x0139 # LATIN CAPITAL LETTER L WITH ACUTE
+0xBE 0x013A # LATIN SMALL LETTER L WITH ACUTE
+0xBF 0x0145 # LATIN CAPITAL LETTER N WITH CEDILLA
+0xC0 0x0146 # LATIN SMALL LETTER N WITH CEDILLA
+0xC1 0x0143 # LATIN CAPITAL LETTER N WITH ACUTE
+0xC2 0x00AC # NOT SIGN
+0xC3 0x221A # SQUARE ROOT
+0xC4 0x0144 # LATIN SMALL LETTER N WITH ACUTE
+0xC5 0x0147 # LATIN CAPITAL LETTER N WITH CARON
+0xC6 0x2206 # INCREMENT
+0xC7 0x00AB # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xC8 0x00BB # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xC9 0x2026 # HORIZONTAL ELLIPSIS
+0xCA 0x00A0 # NO-BREAK SPACE
+0xCB 0x0148 # LATIN SMALL LETTER N WITH CARON
+0xCC 0x0150 # LATIN CAPITAL LETTER O WITH DOUBLE ACUTE
+0xCD 0x00D5 # LATIN CAPITAL LETTER O WITH TILDE
+0xCE 0x0151 # LATIN SMALL LETTER O WITH DOUBLE ACUTE
+0xCF 0x014C # LATIN CAPITAL LETTER O WITH MACRON
+0xD0 0x2013 # EN DASH
+0xD1 0x2014 # EM DASH
+0xD2 0x201C # LEFT DOUBLE QUOTATION MARK
+0xD3 0x201D # RIGHT DOUBLE QUOTATION MARK
+0xD4 0x2018 # LEFT SINGLE QUOTATION MARK
+0xD5 0x2019 # RIGHT SINGLE QUOTATION MARK
+0xD6 0x00F7 # DIVISION SIGN
+0xD7 0x25CA # LOZENGE
+0xD8 0x014D # LATIN SMALL LETTER O WITH MACRON
+0xD9 0x0154 # LATIN CAPITAL LETTER R WITH ACUTE
+0xDA 0x0155 # LATIN SMALL LETTER R WITH ACUTE
+0xDB 0x0158 # LATIN CAPITAL LETTER R WITH CARON
+0xDC 0x2039 # SINGLE LEFT-POINTING ANGLE QUOTATION MARK
+0xDD 0x203A # SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
+0xDE 0x0159 # LATIN SMALL LETTER R WITH CARON
+0xDF 0x0156 # LATIN CAPITAL LETTER R WITH CEDILLA
+0xE0 0x0157 # LATIN SMALL LETTER R WITH CEDILLA
+0xE1 0x0160 # LATIN CAPITAL LETTER S WITH CARON
+0xE2 0x201A # SINGLE LOW-9 QUOTATION MARK
+0xE3 0x201E # DOUBLE LOW-9 QUOTATION MARK
+0xE4 0x0161 # LATIN SMALL LETTER S WITH CARON
+0xE5 0x015A # LATIN CAPITAL LETTER S WITH ACUTE
+0xE6 0x015B # LATIN SMALL LETTER S WITH ACUTE
+0xE7 0x00C1 # LATIN CAPITAL LETTER A WITH ACUTE
+0xE8 0x0164 # LATIN CAPITAL LETTER T WITH CARON
+0xE9 0x0165 # LATIN SMALL LETTER T WITH CARON
+0xEA 0x00CD # LATIN CAPITAL LETTER I WITH ACUTE
+0xEB 0x017D # LATIN CAPITAL LETTER Z WITH CARON
+0xEC 0x017E # LATIN SMALL LETTER Z WITH CARON
+0xED 0x016A # LATIN CAPITAL LETTER U WITH MACRON
+0xEE 0x00D3 # LATIN CAPITAL LETTER O WITH ACUTE
+0xEF 0x00D4 # LATIN CAPITAL LETTER O WITH CIRCUMFLEX
+0xF0 0x016B # LATIN SMALL LETTER U WITH MACRON
+0xF1 0x016E # LATIN CAPITAL LETTER U WITH RING ABOVE
+0xF2 0x00DA # LATIN CAPITAL LETTER U WITH ACUTE
+0xF3 0x016F # LATIN SMALL LETTER U WITH RING ABOVE
+0xF4 0x0170 # LATIN CAPITAL LETTER U WITH DOUBLE ACUTE
+0xF5 0x0171 # LATIN SMALL LETTER U WITH DOUBLE ACUTE
+0xF6 0x0172 # LATIN CAPITAL LETTER U WITH OGONEK
+0xF7 0x0173 # LATIN SMALL LETTER U WITH OGONEK
+0xF8 0x00DD # LATIN CAPITAL LETTER Y WITH ACUTE
+0xF9 0x00FD # LATIN SMALL LETTER Y WITH ACUTE
+0xFA 0x0137 # LATIN SMALL LETTER K WITH CEDILLA
+0xFB 0x017B # LATIN CAPITAL LETTER Z WITH DOT ABOVE
+0xFC 0x0141 # LATIN CAPITAL LETTER L WITH STROKE
+0xFD 0x017C # LATIN SMALL LETTER Z WITH DOT ABOVE
+0xFE 0x0122 # LATIN CAPITAL LETTER G WITH CEDILLA
+0xFF 0x02C7 # CARON
--- /dev/null
+#=======================================================================
+# FTP file name: CYRILLIC.TXT
+#
+# Contents: Map (external version) from Mac OS Cyrillic
+# character set to Unicode 2.0
+#
+# Copyright: (c) 1995-1998 by Apple Computer, Inc., all rights
+# reserved.
+#
+# Contacts: Peter Edberg <pedberg@apple.com>
+# Julio Gonzalez <juliog@apple.com>
+# John Jenkins <jenkins@apple.com>
+#
+# Changes:
+#
+# n05 1998-Feb-05 Update header comments to new format; no
+# mapping changes. Matches internal utom<3>,
+# ufrm<13>, and Text Encoding Converter
+# version 1.3.
+# n03 1995-Apr-15 First version (after fixing some typos).
+# Matches internal ufrm<5>.
+#
+# Standard header:
+# ----------------
+#
+# Apple, the Apple logo, and Macintosh are trademarks of Apple
+# Computer, Inc., registered in the United States and other countries.
+# Unicode is a trademark of Unicode Inc. For the sake of brevity,
+# throughout this document, "Macintosh" can be used to refer to
+# Macintosh computers and "Unicode" can be used to refer to the
+# Unicode standard.
+#
+# Apple makes no warranty or representation, either express or
+# implied, with respect to these tables, their quality, accuracy, or
+# fitness for a particular purpose. In no event will Apple be liable
+# for direct, indirect, special, incidental, or consequential damages
+# resulting from any defect or inaccuracy in this document or the
+# accompanying tables.
+#
+# These mapping tables and character lists are subject to change.
+# The latest tables should be available from the following:
+#
+# <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/APPLE/>
+# <ftp://dev.apple.com/devworld/Technical_Documentation/Misc._Standards/>
+#
+# For general information about Mac OS encodings and these mapping
+# tables, see the file "README.TXT".
+#
+# Format:
+# -------
+#
+# Three tab-separated columns;
+# '#' begins a comment which continues to the end of the line.
+# Column #1 is the Mac OS Cyrillic code (in hex as 0xNN)
+# Column #2 is the corresponding Unicode (in hex as 0xNNNN)
+# Column #3 is a comment containing the Unicode name
+#
+# The entries are in Mac OS Cyrillic code order.
+#
+# The standard UTC mapping tables do not show control character
+# mappings. This copy of the table lists 0x00-0x1F explicitly but
+# omits 0x7F. The Mac OS Cyrillic character set uses the standard
+# control characters at 0x00-0x1F and 0x7F.
+#
+# Notes on Mac OS Cyrillic:
+# -------------------------
+#
+# The Mac OS Cyrillic encoding is used for most Cyrillic localized
+# system software except Ukrainian. A variant of Mac OS Cyrillic
+# is used for Ukrainian and for the Cyrillic Language Kit; it is
+# covered by a separate table.
+#
+# The Mac OS Cyrillic encoding has the same Cyrillic letter
+# repertoire as ISO 8859-5 (although not at the same code points).
+# This covers most of the Slavic languages written in Cyrillic
+# script.
+#
+# The Mac OS Cyrillic encoding also includes a number of characters
+# needed for the Mac OS user interface and localization (e.g.
+# ellipsis, bullet, copyright sign). All of the characters in Mac OS
+# Cyrillic that are also in the Mac OS Roman encoding are at the
+# same code point in both; this improves application compatibility.
+#
+# Unicode mapping issues and notes:
+# ---------------------------------
+#
+# Details of mapping changes in each version:
+# -------------------------------------------
+#
+##################
+
+0x00 0x0000 #NULL
+0x01 0x0001 #START OF HEADING
+0x02 0x0002 #START OF TEXT
+0x03 0x0003 #END OF TEXT
+0x04 0x0004 #END OF TRANSMISSION
+0x05 0x0005 #ENQUIRY
+0x06 0x0006 #ACKNOWLEDGE
+0x07 0x0007 #BELL
+0x08 0x0008 #BACKSPACE
+0x09 0x0009 #HORIZONTAL TABULATION
+0x0A 0x000A #LINE FEED
+0x0B 0x000B #VERTICAL TABULATION
+0x0C 0x000C #FORM FEED
+0x0D 0x000D #CARRIAGE RETURN
+0x0E 0x000E #SHIFT OUT
+0x0F 0x000F #SHIFT IN
+0x10 0x0010 #DATA LINK ESCAPE
+0x11 0x0011 #DEVICE CONTROL ONE
+0x12 0x0012 #DEVICE CONTROL TWO
+0x13 0x0013 #DEVICE CONTROL THREE
+0x14 0x0014 #DEVICE CONTROL FOUR
+0x15 0x0015 #NEGATIVE ACKNOWLEDGE
+0x16 0x0016 #SYNCHRONOUS IDLE
+0x17 0x0017 #END OF TRANSMISSION BLOCK
+0x18 0x0018 #CANCEL
+0x19 0x0019 #END OF MEDIUM
+0x1A 0x001A #SUBSTITUTE
+0x1B 0x001B #ESCAPE
+0x1C 0x001C #FILE SEPARATOR
+0x1D 0x001D #GROUP SEPARATOR
+0x1E 0x001E #RECORD SEPARATOR
+0x1F 0x001F #UNIT SEPARATOR
+0x20 0x0020 # SPACE
+0x21 0x0021 # EXCLAMATION MARK
+0x22 0x0022 # QUOTATION MARK
+0x23 0x0023 # NUMBER SIGN
+0x24 0x0024 # DOLLAR SIGN
+0x25 0x0025 # PERCENT SIGN
+0x26 0x0026 # AMPERSAND
+0x27 0x0027 # APOSTROPHE
+0x28 0x0028 # LEFT PARENTHESIS
+0x29 0x0029 # RIGHT PARENTHESIS
+0x2A 0x002A # ASTERISK
+0x2B 0x002B # PLUS SIGN
+0x2C 0x002C # COMMA
+0x2D 0x002D # HYPHEN-MINUS
+0x2E 0x002E # FULL STOP
+0x2F 0x002F # SOLIDUS
+0x30 0x0030 # DIGIT ZERO
+0x31 0x0031 # DIGIT ONE
+0x32 0x0032 # DIGIT TWO
+0x33 0x0033 # DIGIT THREE
+0x34 0x0034 # DIGIT FOUR
+0x35 0x0035 # DIGIT FIVE
+0x36 0x0036 # DIGIT SIX
+0x37 0x0037 # DIGIT SEVEN
+0x38 0x0038 # DIGIT EIGHT
+0x39 0x0039 # DIGIT NINE
+0x3A 0x003A # COLON
+0x3B 0x003B # SEMICOLON
+0x3C 0x003C # LESS-THAN SIGN
+0x3D 0x003D # EQUALS SIGN
+0x3E 0x003E # GREATER-THAN SIGN
+0x3F 0x003F # QUESTION MARK
+0x40 0x0040 # COMMERCIAL AT
+0x41 0x0041 # LATIN CAPITAL LETTER A
+0x42 0x0042 # LATIN CAPITAL LETTER B
+0x43 0x0043 # LATIN CAPITAL LETTER C
+0x44 0x0044 # LATIN CAPITAL LETTER D
+0x45 0x0045 # LATIN CAPITAL LETTER E
+0x46 0x0046 # LATIN CAPITAL LETTER F
+0x47 0x0047 # LATIN CAPITAL LETTER G
+0x48 0x0048 # LATIN CAPITAL LETTER H
+0x49 0x0049 # LATIN CAPITAL LETTER I
+0x4A 0x004A # LATIN CAPITAL LETTER J
+0x4B 0x004B # LATIN CAPITAL LETTER K
+0x4C 0x004C # LATIN CAPITAL LETTER L
+0x4D 0x004D # LATIN CAPITAL LETTER M
+0x4E 0x004E # LATIN CAPITAL LETTER N
+0x4F 0x004F # LATIN CAPITAL LETTER O
+0x50 0x0050 # LATIN CAPITAL LETTER P
+0x51 0x0051 # LATIN CAPITAL LETTER Q
+0x52 0x0052 # LATIN CAPITAL LETTER R
+0x53 0x0053 # LATIN CAPITAL LETTER S
+0x54 0x0054 # LATIN CAPITAL LETTER T
+0x55 0x0055 # LATIN CAPITAL LETTER U
+0x56 0x0056 # LATIN CAPITAL LETTER V
+0x57 0x0057 # LATIN CAPITAL LETTER W
+0x58 0x0058 # LATIN CAPITAL LETTER X
+0x59 0x0059 # LATIN CAPITAL LETTER Y
+0x5A 0x005A # LATIN CAPITAL LETTER Z
+0x5B 0x005B # LEFT SQUARE BRACKET
+0x5C 0x005C # REVERSE SOLIDUS
+0x5D 0x005D # RIGHT SQUARE BRACKET
+0x5E 0x005E # CIRCUMFLEX ACCENT
+0x5F 0x005F # LOW LINE
+0x60 0x0060 # GRAVE ACCENT
+0x61 0x0061 # LATIN SMALL LETTER A
+0x62 0x0062 # LATIN SMALL LETTER B
+0x63 0x0063 # LATIN SMALL LETTER C
+0x64 0x0064 # LATIN SMALL LETTER D
+0x65 0x0065 # LATIN SMALL LETTER E
+0x66 0x0066 # LATIN SMALL LETTER F
+0x67 0x0067 # LATIN SMALL LETTER G
+0x68 0x0068 # LATIN SMALL LETTER H
+0x69 0x0069 # LATIN SMALL LETTER I
+0x6A 0x006A # LATIN SMALL LETTER J
+0x6B 0x006B # LATIN SMALL LETTER K
+0x6C 0x006C # LATIN SMALL LETTER L
+0x6D 0x006D # LATIN SMALL LETTER M
+0x6E 0x006E # LATIN SMALL LETTER N
+0x6F 0x006F # LATIN SMALL LETTER O
+0x70 0x0070 # LATIN SMALL LETTER P
+0x71 0x0071 # LATIN SMALL LETTER Q
+0x72 0x0072 # LATIN SMALL LETTER R
+0x73 0x0073 # LATIN SMALL LETTER S
+0x74 0x0074 # LATIN SMALL LETTER T
+0x75 0x0075 # LATIN SMALL LETTER U
+0x76 0x0076 # LATIN SMALL LETTER V
+0x77 0x0077 # LATIN SMALL LETTER W
+0x78 0x0078 # LATIN SMALL LETTER X
+0x79 0x0079 # LATIN SMALL LETTER Y
+0x7A 0x007A # LATIN SMALL LETTER Z
+0x7B 0x007B # LEFT CURLY BRACKET
+0x7C 0x007C # VERTICAL LINE
+0x7D 0x007D # RIGHT CURLY BRACKET
+0x7E 0x007E # TILDE
+#
+0x80 0x0410 # CYRILLIC CAPITAL LETTER A
+0x81 0x0411 # CYRILLIC CAPITAL LETTER BE
+0x82 0x0412 # CYRILLIC CAPITAL LETTER VE
+0x83 0x0413 # CYRILLIC CAPITAL LETTER GHE
+0x84 0x0414 # CYRILLIC CAPITAL LETTER DE
+0x85 0x0415 # CYRILLIC CAPITAL LETTER IE
+0x86 0x0416 # CYRILLIC CAPITAL LETTER ZHE
+0x87 0x0417 # CYRILLIC CAPITAL LETTER ZE
+0x88 0x0418 # CYRILLIC CAPITAL LETTER I
+0x89 0x0419 # CYRILLIC CAPITAL LETTER SHORT I
+0x8A 0x041A # CYRILLIC CAPITAL LETTER KA
+0x8B 0x041B # CYRILLIC CAPITAL LETTER EL
+0x8C 0x041C # CYRILLIC CAPITAL LETTER EM
+0x8D 0x041D # CYRILLIC CAPITAL LETTER EN
+0x8E 0x041E # CYRILLIC CAPITAL LETTER O
+0x8F 0x041F # CYRILLIC CAPITAL LETTER PE
+0x90 0x0420 # CYRILLIC CAPITAL LETTER ER
+0x91 0x0421 # CYRILLIC CAPITAL LETTER ES
+0x92 0x0422 # CYRILLIC CAPITAL LETTER TE
+0x93 0x0423 # CYRILLIC CAPITAL LETTER U
+0x94 0x0424 # CYRILLIC CAPITAL LETTER EF
+0x95 0x0425 # CYRILLIC CAPITAL LETTER HA
+0x96 0x0426 # CYRILLIC CAPITAL LETTER TSE
+0x97 0x0427 # CYRILLIC CAPITAL LETTER CHE
+0x98 0x0428 # CYRILLIC CAPITAL LETTER SHA
+0x99 0x0429 # CYRILLIC CAPITAL LETTER SHCHA
+0x9A 0x042A # CYRILLIC CAPITAL LETTER HARD SIGN
+0x9B 0x042B # CYRILLIC CAPITAL LETTER YERU
+0x9C 0x042C # CYRILLIC CAPITAL LETTER SOFT SIGN
+0x9D 0x042D # CYRILLIC CAPITAL LETTER E
+0x9E 0x042E # CYRILLIC CAPITAL LETTER YU
+0x9F 0x042F # CYRILLIC CAPITAL LETTER YA
+0xA0 0x2020 # DAGGER
+0xA1 0x00B0 # DEGREE SIGN
+0xA2 0x00A2 # CENT SIGN
+0xA3 0x00A3 # POUND SIGN
+0xA4 0x00A7 # SECTION SIGN
+0xA5 0x2022 # BULLET
+0xA6 0x00B6 # PILCROW SIGN
+0xA7 0x0406 # CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
+0xA8 0x00AE # REGISTERED SIGN
+0xA9 0x00A9 # COPYRIGHT SIGN
+0xAA 0x2122 # TRADE MARK SIGN
+0xAB 0x0402 # CYRILLIC CAPITAL LETTER DJE
+0xAC 0x0452 # CYRILLIC SMALL LETTER DJE
+0xAD 0x2260 # NOT EQUAL TO
+0xAE 0x0403 # CYRILLIC CAPITAL LETTER GJE
+0xAF 0x0453 # CYRILLIC SMALL LETTER GJE
+0xB0 0x221E # INFINITY
+0xB1 0x00B1 # PLUS-MINUS SIGN
+0xB2 0x2264 # LESS-THAN OR EQUAL TO
+0xB3 0x2265 # GREATER-THAN OR EQUAL TO
+0xB4 0x0456 # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
+0xB5 0x00B5 # MICRO SIGN
+0xB6 0x2202 # PARTIAL DIFFERENTIAL
+0xB7 0x0408 # CYRILLIC CAPITAL LETTER JE
+0xB8 0x0404 # CYRILLIC CAPITAL LETTER UKRAINIAN IE
+0xB9 0x0454 # CYRILLIC SMALL LETTER UKRAINIAN IE
+0xBA 0x0407 # CYRILLIC CAPITAL LETTER YI
+0xBB 0x0457 # CYRILLIC SMALL LETTER YI
+0xBC 0x0409 # CYRILLIC CAPITAL LETTER LJE
+0xBD 0x0459 # CYRILLIC SMALL LETTER LJE
+0xBE 0x040A # CYRILLIC CAPITAL LETTER NJE
+0xBF 0x045A # CYRILLIC SMALL LETTER NJE
+0xC0 0x0458 # CYRILLIC SMALL LETTER JE
+0xC1 0x0405 # CYRILLIC CAPITAL LETTER DZE
+0xC2 0x00AC # NOT SIGN
+0xC3 0x221A # SQUARE ROOT
+0xC4 0x0192 # LATIN SMALL LETTER F WITH HOOK
+0xC5 0x2248 # ALMOST EQUAL TO
+0xC6 0x2206 # INCREMENT
+0xC7 0x00AB # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xC8 0x00BB # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xC9 0x2026 # HORIZONTAL ELLIPSIS
+0xCA 0x00A0 # NO-BREAK SPACE
+0xCB 0x040B # CYRILLIC CAPITAL LETTER TSHE
+0xCC 0x045B # CYRILLIC SMALL LETTER TSHE
+0xCD 0x040C # CYRILLIC CAPITAL LETTER KJE
+0xCE 0x045C # CYRILLIC SMALL LETTER KJE
+0xCF 0x0455 # CYRILLIC SMALL LETTER DZE
+0xD0 0x2013 # EN DASH
+0xD1 0x2014 # EM DASH
+0xD2 0x201C # LEFT DOUBLE QUOTATION MARK
+0xD3 0x201D # RIGHT DOUBLE QUOTATION MARK
+0xD4 0x2018 # LEFT SINGLE QUOTATION MARK
+0xD5 0x2019 # RIGHT SINGLE QUOTATION MARK
+0xD6 0x00F7 # DIVISION SIGN
+0xD7 0x201E # DOUBLE LOW-9 QUOTATION MARK
+0xD8 0x040E # CYRILLIC CAPITAL LETTER SHORT U
+0xD9 0x045E # CYRILLIC SMALL LETTER SHORT U
+0xDA 0x040F # CYRILLIC CAPITAL LETTER DZHE
+0xDB 0x045F # CYRILLIC SMALL LETTER DZHE
+0xDC 0x2116 # NUMERO SIGN
+0xDD 0x0401 # CYRILLIC CAPITAL LETTER IO
+0xDE 0x0451 # CYRILLIC SMALL LETTER IO
+0xDF 0x044F # CYRILLIC SMALL LETTER YA
+0xE0 0x0430 # CYRILLIC SMALL LETTER A
+0xE1 0x0431 # CYRILLIC SMALL LETTER BE
+0xE2 0x0432 # CYRILLIC SMALL LETTER VE
+0xE3 0x0433 # CYRILLIC SMALL LETTER GHE
+0xE4 0x0434 # CYRILLIC SMALL LETTER DE
+0xE5 0x0435 # CYRILLIC SMALL LETTER IE
+0xE6 0x0436 # CYRILLIC SMALL LETTER ZHE
+0xE7 0x0437 # CYRILLIC SMALL LETTER ZE
+0xE8 0x0438 # CYRILLIC SMALL LETTER I
+0xE9 0x0439 # CYRILLIC SMALL LETTER SHORT I
+0xEA 0x043A # CYRILLIC SMALL LETTER KA
+0xEB 0x043B # CYRILLIC SMALL LETTER EL
+0xEC 0x043C # CYRILLIC SMALL LETTER EM
+0xED 0x043D # CYRILLIC SMALL LETTER EN
+0xEE 0x043E # CYRILLIC SMALL LETTER O
+0xEF 0x043F # CYRILLIC SMALL LETTER PE
+0xF0 0x0440 # CYRILLIC SMALL LETTER ER
+0xF1 0x0441 # CYRILLIC SMALL LETTER ES
+0xF2 0x0442 # CYRILLIC SMALL LETTER TE
+0xF3 0x0443 # CYRILLIC SMALL LETTER U
+0xF4 0x0444 # CYRILLIC SMALL LETTER EF
+0xF5 0x0445 # CYRILLIC SMALL LETTER HA
+0xF6 0x0446 # CYRILLIC SMALL LETTER TSE
+0xF7 0x0447 # CYRILLIC SMALL LETTER CHE
+0xF8 0x0448 # CYRILLIC SMALL LETTER SHA
+0xF9 0x0449 # CYRILLIC SMALL LETTER SHCHA
+0xFA 0x044A # CYRILLIC SMALL LETTER HARD SIGN
+0xFB 0x044B # CYRILLIC SMALL LETTER YERU
+0xFC 0x044C # CYRILLIC SMALL LETTER SOFT SIGN
+0xFD 0x044D # CYRILLIC SMALL LETTER E
+0xFE 0x044E # CYRILLIC SMALL LETTER YU
+0xFF 0x00A4 # CURRENCY SIGN
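The three-column format described in the CYRILLIC.TXT header above (Mac OS code as 0xNN, Unicode as 0xNNNN, then a '#' comment) can be read with a few lines of C. The sketch below is illustrative only: parse_map_line is a hypothetical helper, not catdoc's actual charset loader. It assumes single-value entries, so it rejects the Unicode sequences and direction tags that some other Apple tables use. It tries to follow the coding standard at the top of this patch (no // comments, explicit signedness, no assumption that int can hold a Unicode character).

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of catdoc: parse one line of an Apple
 * mapping table of the form "0xNN <whitespace> 0xNNNN <whitespace> # name".
 * Returns 1 and fills *code and *ucs for a mapping line,
 * 0 for a blank or comment-only line, -1 for anything else.
 */
int parse_map_line(const char *line, unsigned int *code, unsigned long *ucs)
{
	char *end;
	unsigned long c, u;

	assert(line != NULL && code != NULL && ucs != NULL);
	while (*line == ' ' || *line == '\t')
		line++;
	if (*line == '\0' || *line == '#' || *line == '\n' || *line == '\r')
		return 0;
	/* Base 16 lets strtoul accept the optional "0x" prefix. */
	c = strtoul(line, &end, 16);
	if (end == line || c > 0xFFUL)
		return -1;
	line = end;
	/* strtoul skips the tab or spaces between the columns. */
	u = strtoul(line, &end, 16);
	if (end == line || u > 0xFFFFUL)
		return -1;
	/* "0xNNNN+0xNNNN" sequences are out of scope for this sketch. */
	if (*end == '+')
		return -1;
	*code = (unsigned int)c;
	*ucs = u;
	return 1;
}
```

A caller could feed each line of the file to this function and store *ucs in a 256-entry array of unsigned long indexed by *code, skipping lines for which it returns 0.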
--- /dev/null
+#=======================================================================
+# File name: GREEK.TXT
+#
+# Contents: Map (external version) from Mac OS Greek
+# character set to Unicode 2.1 and later.
+#
+# Copyright: (c) 1995-2002, 2005 by Apple Computer, Inc., all rights
+# reserved.
+#
+# Contact: charsets@apple.com
+#
+# Changes:
+#
+# c02 2005-Apr-05 Update header comments. Matches internal xml
+# <c1.1> and Text Encoding Converter 2.0.
+# b3,c1 2002-Dec-19 Update to match changes in Mac OS Greek
+# encoding for Mac OS 9.2.2 and later.
+# Update URLs, notes. Matches internal
+# utom<b3>.
+# b02 1999-Sep-22 Update contact e-mail address. Matches
+# internal utom<b1>, ufrm<b1>, and Text
+# Encoding Converter version 1.5.
+# n06 1998-Feb-05 Update to match internal utom<n4>, ufrm<n17>,
+# and Text Encoding Converter version 1.3:
+# Change mapping for 0xAF from U+0387 to its
+# canonical decomposition, U+00B7. Also
+# update header comments to new format.
+# n04 1995-Apr-15 First version (after fixing some typos).
+# Matches internal ufrm<n7>.
+#
+# Standard header:
+# ----------------
+#
+# Apple, the Apple logo, and Macintosh are trademarks of Apple
+# Computer, Inc., registered in the United States and other countries.
+# Unicode is a trademark of Unicode Inc. For the sake of brevity,
+# throughout this document, "Macintosh" can be used to refer to
+# Macintosh computers and "Unicode" can be used to refer to the
+# Unicode standard.
+#
+# Apple Computer, Inc. ("Apple") makes no warranty or representation,
+# either express or implied, with respect to this document and the
+# included data, its quality, accuracy, or fitness for a particular
+# purpose. In no event will Apple be liable for direct, indirect,
+# special, incidental, or consequential damages resulting from any
+# defect or inaccuracy in this document or the included data.
+#
+# These mapping tables and character lists are subject to change.
+# The latest tables should be available from the following:
+#
+# <http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/>
+#
+# For general information about Mac OS encodings and these mapping
+# tables, see the file "README.TXT".
+#
+# Format:
+# -------
+#
+# Three tab-separated columns;
+# '#' begins a comment which continues to the end of the line.
+# Column #1 is the Mac OS Greek code (in hex as 0xNN)
+# Column #2 is the corresponding Unicode (in hex as 0xNNNN)
+# Column #3 is a comment containing the Unicode name
+#
+# The entries are in Mac OS Greek code order.
+#
+# One of these mappings requires the use of a corporate character.
+# See the file "CORPCHAR.TXT" and notes below.
+#
+# Control character mappings are not shown in this table, following
+# the conventions of the standard UTC mapping tables. However, the
+# Mac OS Greek character set uses the standard control characters at
+# 0x00-0x1F and 0x7F.
+#
+# Notes on Mac OS Greek:
+# ----------------------
+#
+# This is a legacy Mac OS encoding; in the Mac OS X Carbon and Cocoa
+# environments, it is only supported via transcoding to and from
+# Unicode.
+#
+# Although a Mac OS script code is defined for Greek (smGreek = 6),
+# the Greek localized system does not currently use it (the font
+# family IDs are in the Mac OS Roman range). To determine if the
+# Greek encoding is being used when the script code is smRoman (0),
+# you must check if the system region code is 20, verGreece.
+#
+# The Mac OS Greek encoding is a superset of the repertoire of
+# ISO 8859-7 (although characters are not at the same code points),
+# except that LEFT & RIGHT SINGLE QUOTATION MARK replace the
+# MODIFIER LETTER REVERSED COMMA & APOSTROPHE (spacing versions of
+# Greek rough & smooth breathing marks) that are in ISO 8859-7.
+# The added characters in Mac OS Greek include more punctuation and
+# symbols and several accented Latin letters.
+#
+# Before Mac OS 9.2.2, code point 0x9C was SOFT HYPHEN (U+00AD), and
+# code point 0xFF was undefined. In Mac OS 9.2.2 and later versions,
+# SOFT HYPHEN was moved to 0xFF, and code point 0x9C was changed to be
+# EURO SIGN (U+20AC); the standard Apple fonts are updated for Mac OS
+# 9.2.2 to reflect this. There is a "no Euro sign" variant of the Mac
+# OS Greek encoding that uses the older mapping; this can be used for
+# older fonts.
+#
+# This "no Euro sign" variant of Mac OS Greek was the character set
+# used by Mac OS Greek systems before 9.2.2 except for system 6.0.7,
+# which used a variant character set but was quickly replaced with
+# Greek system 6.0.7.1 using the "no Euro sign" character set
+# documented here. Greek system 4.1 used a variant Greek set that had
+# ISO 8859-7 in 0xA0-0xFF (with some holes filled in with DTP
+# characters), and Mac OS Roman accented Roman letters in 0x80-0x9F.
+#
+# Unicode mapping issues and notes:
+# ---------------------------------
+#
+# Details of mapping changes in each version:
+# -------------------------------------------
+#
+# Changes from version b02 to version b03/c01:
+#
+# - The Mac OS Greek encoding changed for Mac OS 9.2.2 and later
+# as follows:
+# 0x9C, changed from 0x00AD SOFT HYPHEN to 0x20AC EURO SIGN
+# 0xFF, changed from undefined to 0x00AD SOFT HYPHEN
+#
+# Changes from version n04 to version n06:
+#
+# - Change mapping of 0xAF from U+0387 to its canonical
+# decomposition, U+00B7.
+#
+##################
+
+0x20 0x0020 # SPACE
+0x21 0x0021 # EXCLAMATION MARK
+0x22 0x0022 # QUOTATION MARK
+0x23 0x0023 # NUMBER SIGN
+0x24 0x0024 # DOLLAR SIGN
+0x25 0x0025 # PERCENT SIGN
+0x26 0x0026 # AMPERSAND
+0x27 0x0027 # APOSTROPHE
+0x28 0x0028 # LEFT PARENTHESIS
+0x29 0x0029 # RIGHT PARENTHESIS
+0x2A 0x002A # ASTERISK
+0x2B 0x002B # PLUS SIGN
+0x2C 0x002C # COMMA
+0x2D 0x002D # HYPHEN-MINUS
+0x2E 0x002E # FULL STOP
+0x2F 0x002F # SOLIDUS
+0x30 0x0030 # DIGIT ZERO
+0x31 0x0031 # DIGIT ONE
+0x32 0x0032 # DIGIT TWO
+0x33 0x0033 # DIGIT THREE
+0x34 0x0034 # DIGIT FOUR
+0x35 0x0035 # DIGIT FIVE
+0x36 0x0036 # DIGIT SIX
+0x37 0x0037 # DIGIT SEVEN
+0x38 0x0038 # DIGIT EIGHT
+0x39 0x0039 # DIGIT NINE
+0x3A 0x003A # COLON
+0x3B 0x003B # SEMICOLON
+0x3C 0x003C # LESS-THAN SIGN
+0x3D 0x003D # EQUALS SIGN
+0x3E 0x003E # GREATER-THAN SIGN
+0x3F 0x003F # QUESTION MARK
+0x40 0x0040 # COMMERCIAL AT
+0x41 0x0041 # LATIN CAPITAL LETTER A
+0x42 0x0042 # LATIN CAPITAL LETTER B
+0x43 0x0043 # LATIN CAPITAL LETTER C
+0x44 0x0044 # LATIN CAPITAL LETTER D
+0x45 0x0045 # LATIN CAPITAL LETTER E
+0x46 0x0046 # LATIN CAPITAL LETTER F
+0x47 0x0047 # LATIN CAPITAL LETTER G
+0x48 0x0048 # LATIN CAPITAL LETTER H
+0x49 0x0049 # LATIN CAPITAL LETTER I
+0x4A 0x004A # LATIN CAPITAL LETTER J
+0x4B 0x004B # LATIN CAPITAL LETTER K
+0x4C 0x004C # LATIN CAPITAL LETTER L
+0x4D 0x004D # LATIN CAPITAL LETTER M
+0x4E 0x004E # LATIN CAPITAL LETTER N
+0x4F 0x004F # LATIN CAPITAL LETTER O
+0x50 0x0050 # LATIN CAPITAL LETTER P
+0x51 0x0051 # LATIN CAPITAL LETTER Q
+0x52 0x0052 # LATIN CAPITAL LETTER R
+0x53 0x0053 # LATIN CAPITAL LETTER S
+0x54 0x0054 # LATIN CAPITAL LETTER T
+0x55 0x0055 # LATIN CAPITAL LETTER U
+0x56 0x0056 # LATIN CAPITAL LETTER V
+0x57 0x0057 # LATIN CAPITAL LETTER W
+0x58 0x0058 # LATIN CAPITAL LETTER X
+0x59 0x0059 # LATIN CAPITAL LETTER Y
+0x5A 0x005A # LATIN CAPITAL LETTER Z
+0x5B 0x005B # LEFT SQUARE BRACKET
+0x5C 0x005C # REVERSE SOLIDUS
+0x5D 0x005D # RIGHT SQUARE BRACKET
+0x5E 0x005E # CIRCUMFLEX ACCENT
+0x5F 0x005F # LOW LINE
+0x60 0x0060 # GRAVE ACCENT
+0x61 0x0061 # LATIN SMALL LETTER A
+0x62 0x0062 # LATIN SMALL LETTER B
+0x63 0x0063 # LATIN SMALL LETTER C
+0x64 0x0064 # LATIN SMALL LETTER D
+0x65 0x0065 # LATIN SMALL LETTER E
+0x66 0x0066 # LATIN SMALL LETTER F
+0x67 0x0067 # LATIN SMALL LETTER G
+0x68 0x0068 # LATIN SMALL LETTER H
+0x69 0x0069 # LATIN SMALL LETTER I
+0x6A 0x006A # LATIN SMALL LETTER J
+0x6B 0x006B # LATIN SMALL LETTER K
+0x6C 0x006C # LATIN SMALL LETTER L
+0x6D 0x006D # LATIN SMALL LETTER M
+0x6E 0x006E # LATIN SMALL LETTER N
+0x6F 0x006F # LATIN SMALL LETTER O
+0x70 0x0070 # LATIN SMALL LETTER P
+0x71 0x0071 # LATIN SMALL LETTER Q
+0x72 0x0072 # LATIN SMALL LETTER R
+0x73 0x0073 # LATIN SMALL LETTER S
+0x74 0x0074 # LATIN SMALL LETTER T
+0x75 0x0075 # LATIN SMALL LETTER U
+0x76 0x0076 # LATIN SMALL LETTER V
+0x77 0x0077 # LATIN SMALL LETTER W
+0x78 0x0078 # LATIN SMALL LETTER X
+0x79 0x0079 # LATIN SMALL LETTER Y
+0x7A 0x007A # LATIN SMALL LETTER Z
+0x7B 0x007B # LEFT CURLY BRACKET
+0x7C 0x007C # VERTICAL LINE
+0x7D 0x007D # RIGHT CURLY BRACKET
+0x7E 0x007E # TILDE
+#
+0x80 0x00C4 # LATIN CAPITAL LETTER A WITH DIAERESIS
+0x81 0x00B9 # SUPERSCRIPT ONE
+0x82 0x00B2 # SUPERSCRIPT TWO
+0x83 0x00C9 # LATIN CAPITAL LETTER E WITH ACUTE
+0x84 0x00B3 # SUPERSCRIPT THREE
+0x85 0x00D6 # LATIN CAPITAL LETTER O WITH DIAERESIS
+0x86 0x00DC # LATIN CAPITAL LETTER U WITH DIAERESIS
+0x87 0x0385 # GREEK DIALYTIKA TONOS
+0x88 0x00E0 # LATIN SMALL LETTER A WITH GRAVE
+0x89 0x00E2 # LATIN SMALL LETTER A WITH CIRCUMFLEX
+0x8A 0x00E4 # LATIN SMALL LETTER A WITH DIAERESIS
+0x8B 0x0384 # GREEK TONOS
+0x8C 0x00A8 # DIAERESIS
+0x8D 0x00E7 # LATIN SMALL LETTER C WITH CEDILLA
+0x8E 0x00E9 # LATIN SMALL LETTER E WITH ACUTE
+0x8F 0x00E8 # LATIN SMALL LETTER E WITH GRAVE
+0x90 0x00EA # LATIN SMALL LETTER E WITH CIRCUMFLEX
+0x91 0x00EB # LATIN SMALL LETTER E WITH DIAERESIS
+0x92 0x00A3 # POUND SIGN
+0x93 0x2122 # TRADE MARK SIGN
+0x94 0x00EE # LATIN SMALL LETTER I WITH CIRCUMFLEX
+0x95 0x00EF # LATIN SMALL LETTER I WITH DIAERESIS
+0x96 0x2022 # BULLET
+0x97 0x00BD # VULGAR FRACTION ONE HALF
+0x98 0x2030 # PER MILLE SIGN
+0x99 0x00F4 # LATIN SMALL LETTER O WITH CIRCUMFLEX
+0x9A 0x00F6 # LATIN SMALL LETTER O WITH DIAERESIS
+0x9B 0x00A6 # BROKEN BAR
+0x9C 0x20AC # EURO SIGN # before Mac OS 9.2.2, was SOFT HYPHEN
+0x9D 0x00F9 # LATIN SMALL LETTER U WITH GRAVE
+0x9E 0x00FB # LATIN SMALL LETTER U WITH CIRCUMFLEX
+0x9F 0x00FC # LATIN SMALL LETTER U WITH DIAERESIS
+0xA0 0x2020 # DAGGER
+0xA1 0x0393 # GREEK CAPITAL LETTER GAMMA
+0xA2 0x0394 # GREEK CAPITAL LETTER DELTA
+0xA3 0x0398 # GREEK CAPITAL LETTER THETA
+0xA4 0x039B # GREEK CAPITAL LETTER LAMDA
+0xA5 0x039E # GREEK CAPITAL LETTER XI
+0xA6 0x03A0 # GREEK CAPITAL LETTER PI
+0xA7 0x00DF # LATIN SMALL LETTER SHARP S
+0xA8 0x00AE # REGISTERED SIGN
+0xA9 0x00A9 # COPYRIGHT SIGN
+0xAA 0x03A3 # GREEK CAPITAL LETTER SIGMA
+0xAB 0x03AA # GREEK CAPITAL LETTER IOTA WITH DIALYTIKA
+0xAC 0x00A7 # SECTION SIGN
+0xAD 0x2260 # NOT EQUAL TO
+0xAE 0x00B0 # DEGREE SIGN
+0xAF 0x00B7 # MIDDLE DOT
+0xB0 0x0391 # GREEK CAPITAL LETTER ALPHA
+0xB1 0x00B1 # PLUS-MINUS SIGN
+0xB2 0x2264 # LESS-THAN OR EQUAL TO
+0xB3 0x2265 # GREATER-THAN OR EQUAL TO
+0xB4 0x00A5 # YEN SIGN
+0xB5 0x0392 # GREEK CAPITAL LETTER BETA
+0xB6 0x0395 # GREEK CAPITAL LETTER EPSILON
+0xB7 0x0396 # GREEK CAPITAL LETTER ZETA
+0xB8 0x0397 # GREEK CAPITAL LETTER ETA
+0xB9 0x0399 # GREEK CAPITAL LETTER IOTA
+0xBA 0x039A # GREEK CAPITAL LETTER KAPPA
+0xBB 0x039C # GREEK CAPITAL LETTER MU
+0xBC 0x03A6 # GREEK CAPITAL LETTER PHI
+0xBD 0x03AB # GREEK CAPITAL LETTER UPSILON WITH DIALYTIKA
+0xBE 0x03A8 # GREEK CAPITAL LETTER PSI
+0xBF 0x03A9 # GREEK CAPITAL LETTER OMEGA
+0xC0 0x03AC # GREEK SMALL LETTER ALPHA WITH TONOS
+0xC1 0x039D # GREEK CAPITAL LETTER NU
+0xC2 0x00AC # NOT SIGN
+0xC3 0x039F # GREEK CAPITAL LETTER OMICRON
+0xC4 0x03A1 # GREEK CAPITAL LETTER RHO
+0xC5 0x2248 # ALMOST EQUAL TO
+0xC6 0x03A4 # GREEK CAPITAL LETTER TAU
+0xC7 0x00AB # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xC8 0x00BB # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xC9 0x2026 # HORIZONTAL ELLIPSIS
+0xCA 0x00A0 # NO-BREAK SPACE
+0xCB 0x03A5 # GREEK CAPITAL LETTER UPSILON
+0xCC 0x03A7 # GREEK CAPITAL LETTER CHI
+0xCD 0x0386 # GREEK CAPITAL LETTER ALPHA WITH TONOS
+0xCE 0x0388 # GREEK CAPITAL LETTER EPSILON WITH TONOS
+0xCF 0x0153 # LATIN SMALL LIGATURE OE
+0xD0 0x2013 # EN DASH
+0xD1 0x2015 # HORIZONTAL BAR
+0xD2 0x201C # LEFT DOUBLE QUOTATION MARK
+0xD3 0x201D # RIGHT DOUBLE QUOTATION MARK
+0xD4 0x2018 # LEFT SINGLE QUOTATION MARK
+0xD5 0x2019 # RIGHT SINGLE QUOTATION MARK
+0xD6 0x00F7 # DIVISION SIGN
+0xD7 0x0389 # GREEK CAPITAL LETTER ETA WITH TONOS
+0xD8 0x038A # GREEK CAPITAL LETTER IOTA WITH TONOS
+0xD9 0x038C # GREEK CAPITAL LETTER OMICRON WITH TONOS
+0xDA 0x038E # GREEK CAPITAL LETTER UPSILON WITH TONOS
+0xDB 0x03AD # GREEK SMALL LETTER EPSILON WITH TONOS
+0xDC 0x03AE # GREEK SMALL LETTER ETA WITH TONOS
+0xDD 0x03AF # GREEK SMALL LETTER IOTA WITH TONOS
+0xDE 0x03CC # GREEK SMALL LETTER OMICRON WITH TONOS
+0xDF 0x038F # GREEK CAPITAL LETTER OMEGA WITH TONOS
+0xE0 0x03CD # GREEK SMALL LETTER UPSILON WITH TONOS
+0xE1 0x03B1 # GREEK SMALL LETTER ALPHA
+0xE2 0x03B2 # GREEK SMALL LETTER BETA
+0xE3 0x03C8 # GREEK SMALL LETTER PSI
+0xE4 0x03B4 # GREEK SMALL LETTER DELTA
+0xE5 0x03B5 # GREEK SMALL LETTER EPSILON
+0xE6 0x03C6 # GREEK SMALL LETTER PHI
+0xE7 0x03B3 # GREEK SMALL LETTER GAMMA
+0xE8 0x03B7 # GREEK SMALL LETTER ETA
+0xE9 0x03B9 # GREEK SMALL LETTER IOTA
+0xEA 0x03BE # GREEK SMALL LETTER XI
+0xEB 0x03BA # GREEK SMALL LETTER KAPPA
+0xEC 0x03BB # GREEK SMALL LETTER LAMDA
+0xED 0x03BC # GREEK SMALL LETTER MU
+0xEE 0x03BD # GREEK SMALL LETTER NU
+0xEF 0x03BF # GREEK SMALL LETTER OMICRON
+0xF0 0x03C0 # GREEK SMALL LETTER PI
+0xF1 0x03CE # GREEK SMALL LETTER OMEGA WITH TONOS
+0xF2 0x03C1 # GREEK SMALL LETTER RHO
+0xF3 0x03C3 # GREEK SMALL LETTER SIGMA
+0xF4 0x03C4 # GREEK SMALL LETTER TAU
+0xF5 0x03B8 # GREEK SMALL LETTER THETA
+0xF6 0x03C9 # GREEK SMALL LETTER OMEGA
+0xF7 0x03C2 # GREEK SMALL LETTER FINAL SIGMA
+0xF8 0x03C7 # GREEK SMALL LETTER CHI
+0xF9 0x03C5 # GREEK SMALL LETTER UPSILON
+0xFA 0x03B6 # GREEK SMALL LETTER ZETA
+0xFB 0x03CA # GREEK SMALL LETTER IOTA WITH DIALYTIKA
+0xFC 0x03CB # GREEK SMALL LETTER UPSILON WITH DIALYTIKA
+0xFD 0x0390 # GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
+0xFE 0x03B0 # GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS
+0xFF 0x00AD # SOFT HYPHEN # before Mac OS 9.2.2, was undefined
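The GREEK.TXT header above says that the "no Euro sign" variant differs from this table only at 0x9C and 0xFF. One possible way to support both variants from a single table is to apply a small override after the lookup. This is a sketch under assumptions: greek_no_euro_fixup and the UNDEFINED_CHAR sentinel are hypothetical names, not anything defined by catdoc or by Apple.

```c
#include <assert.h>

/* Hypothetical sentinel for "no mapping"; U+FFFF is a Unicode noncharacter. */
#define UNDEFINED_CHAR 0xFFFFUL

/*
 * Adjust a Unicode value taken from the table above for the
 * "no Euro sign" variant used before Mac OS 9.2.2:
 *   0x9C was SOFT HYPHEN (U+00AD) instead of EURO SIGN (U+20AC),
 *   0xFF was undefined instead of SOFT HYPHEN (U+00AD).
 * Every other code keeps the value passed in as ucs.
 */
unsigned long greek_no_euro_fixup(unsigned int code, unsigned long ucs)
{
	assert(code <= 0xFFU);
	if (code == 0x9CU)
		return 0x00ADUL;
	if (code == 0xFFU)
		return UNDEFINED_CHAR;
	return ucs;
}
```

Keeping the override separate from the table means the data above stays identical to the current Apple mapping, and older fonts only pay for two comparisons per character.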
--- /dev/null
+#=======================================================================
+# File name: HEBREW.TXT
+#
+# Contents: Map (external version) from Mac OS Hebrew
+# character set to Unicode 2.1 and later.
+#
+# Copyright: (c) 1995-2002, 2005 by Apple Computer, Inc., all rights
+# reserved.
+#
+# Contact: charsets@apple.com
+#
+# Changes:
+#
+# c02 2005-Apr-05 Update header comments; add section on
+# roundtrip considerations. Matches internal
+# xml <c1.4> and Text Encoding Converter 2.0.
+# b3,c1 2002-Dec-19 Don't require left-right context for digits
+# 0x30-0x39. Change mapping of 0x81 to use
+# decomposition. Reverse the mappings of 0xA8,
+# 0xA9. Update URLs, notes. Matches internal
+# utom<b7>.
+# b02 1999-Sep-22 Update contact e-mail address. Matches
+# internal utom<b1>, ufrm<b1>, and Text
+# Encoding Converter version 1.5.
+# n03 1998-Feb-05 Show required Unicode character
+# directionality in a different way. Update
+# mappings for 0xC0 and 0xDE to use
+# transcoding hints; matches internal utom<n6>,
+# ufrm<n20>, and Text Encoding Converter
+# version 1.3. Rewrite header comments.
+# n01 1995-Nov-15 First version. Matches internal ufrm<n8>.
+#
+# Standard header:
+# ----------------
+#
+# Apple, the Apple logo, and Macintosh are trademarks of Apple
+# Computer, Inc., registered in the United States and other countries.
+# Unicode is a trademark of Unicode Inc. For the sake of brevity,
+# throughout this document, "Macintosh" can be used to refer to
+# Macintosh computers and "Unicode" can be used to refer to the
+# Unicode standard.
+#
+# Apple Computer, Inc. ("Apple") makes no warranty or representation,
+# either express or implied, with respect to this document and the
+# included data, its quality, accuracy, or fitness for a particular
+# purpose. In no event will Apple be liable for direct, indirect,
+# special, incidental, or consequential damages resulting from any
+# defect or inaccuracy in this document or the included data.
+#
+# These mapping tables and character lists are subject to change.
+# The latest tables should be available from the following:
+#
+# <http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/>
+#
+# For general information about Mac OS encodings and these mapping
+# tables, see the file "README.TXT".
+#
+# Format:
+# -------
+#
+# Three tab-separated columns;
+# '#' begins a comment which continues to the end of the line.
+# Column #1 is the Mac OS Hebrew code (in hex as 0xNN).
+# Column #2 is the corresponding Unicode or Unicode sequence (in
+# hex as 0xNNNN, 0xNNNN+0xNNNN, etc.). Sequences of up to 3
+# Unicode characters are used here. A single Unicode character
+# may be preceded by a tag indicating required directionality
+# (i.e. <LR>+0xNNNN or <RL>+0xNNNN).
+# Column #3 is a comment containing the Unicode name.
+#
+# The entries are in Mac OS Hebrew code order.
+#
+# Some of these mappings require the use of corporate characters.
+# See the file "CORPCHAR.TXT" and notes below.
+#
+# Control character mappings are not shown in this table, following
+# the conventions of the standard UTC mapping tables. However, the
+# Mac OS Hebrew character set uses the standard control characters at
+# 0x00-0x1F and 0x7F.
+#
+# Notes on Mac OS Hebrew:
+# -----------------------
+#
+# This is a legacy Mac OS encoding; in the Mac OS X Carbon and Cocoa
+# environments, it is only supported via transcoding to and from
+# Unicode.
+#
+# 1. General
+#
+# The Mac OS Hebrew character set supports the Hebrew and Yiddish
+# languages. It incorporates the Hebrew letter repertoire of
+# ISO 8859-8, and uses the same code points for them, 0xE0-0xFA.
+# It also incorporates the ASCII character set. In addition, the
+# Mac OS Hebrew character set includes the following:
+#
+# - Hebrew points (nikud marks) at 0xC6, 0xCB-0xCF and 0xD8-0xDF.
+# These are non-spacing combining marks. Note that the RAFE point
+# at 0xD8 is not displayed correctly in some fonts, and cannot be
+# typed using the keyboard layouts in the current Hebrew localized
+# systems. Also note: The character given in Unicode as QAMATS
+# (U+05B8) actually refers to two different sounds, depending on
+# context. For example, when ALEF is followed by QAMATS, the QAMATS
+# can actually refer to two different sounds depending on the
+# following letters. The Mac OS Hebrew character set separately
+# encodes these two sounds for the same graphic shape, as "qamats"
+# (0xCB) and "qamats qatan" (0xDE). The "qamats" character is more
+# common, so it is mapped to the Unicode QAMATS; "qamats qatan" can
+# only be used with a limited number of characters, and it is
+# mapped using a corporate-zone variant tag (see below).
+#
+# - Various Hebrew ligatures at 0x81, 0xC0, 0xC7, 0xC8, 0xD6, and
+# 0xD7. Also note that the Yiddish YOD YOD PATAH ligature at 0x81
+# is missing in some fonts.
+#
+# - The NEW SHEQEL SIGN at 0xA6.
+#
+# - Latin characters with diacritics at 0x80 and 0x82-0x9F. However,
+# most of these cannot be typed using the keyboard layouts in the
+# Hebrew localized systems.
+#
+# - Right-left versions of certain ASCII punctuation, symbols and
+# digits: 0xA0-0xA5, 0xA7-0xBF, 0xFB-0xFF. See below.
+#
+# - Miscellaneous additional punctuation at 0xC1, 0xC9, 0xCA, and
+# 0xD0-0xD5. There is a variant of the Hebrew encoding in which
+# the LEFT SINGLE QUOTATION MARK at 0xD4 is replaced by FIGURE
+# SPACE. The glyphs for some of the other punctuation characters
+# are missing in some fonts.
+#
+# - Four obsolete characters at 0xC2-0xC5 known as canorals (not to
+# be confused with cantillation marks!). These were used for
+# manual positioning of nikud marks before System 7.1 (at which
+# point nikud positioning became automatic with WorldScript).
+#
+# 2. Directional characters and roundtrip fidelity
+#
+# The Mac OS Hebrew character set was developed around 1987. At that
+# time the bidirectional line layout algorithm used in the Mac OS
+# Hebrew system was fairly simple; it used only a few direction
+# classes (instead of the 19 now used in the Unicode bidirectional
+# algorithm). In order to permit users to handle some tricky layout
+# problems, certain punctuation, symbol, and digit characters have
+# duplicate code points, one with a left-right direction attribute and
+# the other with a right-left direction attribute.
+#
+# For example, plus sign is encoded at 0x2B with a left-right
+# attribute, and at 0xAB with a right-left attribute. However, there
+# is only one PLUS SIGN character in Unicode. This leads to some
+# interesting problems when mapping between Mac OS Hebrew and Unicode;
+# see below.
+#
+# A related problem is that even when a particular character is
+# encoded only once in Mac OS Hebrew, it may have a different
+# direction attribute than the corresponding Unicode character.
+#
+# For example, the Mac OS Hebrew character at 0xC9 is HORIZONTAL
+# ELLIPSIS with strong right-left direction. However, the Unicode
+# character HORIZONTAL ELLIPSIS has direction class neutral.
+#
+# 3. Font variants
+#
+# The table in this file gives the Unicode mappings for the standard
+# Mac OS Hebrew encoding. This encoding is supported by many of the
+# Apple fonts (including all of the fonts in the Hebrew Language Kit),
+# and is the encoding supported by the text processing utilities.
+# However, some TrueType fonts provided with the localized Hebrew
+# system implement a slightly different encoding; the difference is
+# only in one code point, 0xD4. For the standard variant, this is:
+# 0xD4 -> <RL>+0x2018 LEFT SINGLE QUOTATION MARK, right-left
+#
+# The TrueType variant is used by the following TrueType fonts from
+# the localized system: Caesarea, Carmel Book, Gilboa, Ramat Sharon,
+# and Sinai Book. For these, 0xD4 is as follows:
+# 0xD4 -> <RL>+0x2007 FIGURE SPACE, right-left
+#
+# Unicode mapping issues and notes:
+# ---------------------------------
+#
+# 1. Matching the direction of Mac OS Hebrew characters
+#
+# When Mac OS Hebrew encodes a character twice but with different
+# direction attributes for the two code points - as in the case of
+# plus sign mentioned above - we need a way to map both Mac OS Hebrew
+# code points to Unicode and back again without loss of information.
+# With the plus sign, for example, mapping one of the Mac OS Hebrew
+# characters to a code in the Unicode corporate use zone is
+# undesirable, since both of the plus sign characters are likely to
+# be used in text that is interchanged.
+#
+# The problem is solved with the use of direction override characters
+# and direction-dependent mappings. When mapping from Mac OS Hebrew
+# to Unicode, we use direction overrides as necessary to force the
+# direction of the resulting Unicode characters.
+#
+# The required direction is indicated by a direction tag in the
+# mappings. A tag of <LR> means the corresponding Unicode character
+# must have a strong left-right context, and a tag of <RL> indicates
+# a right-left context.
+#
+# For example, the mapping of 0x2B is given as <LR>+0x002B; the
+# mapping of 0xAB is given as <RL>+0x002B. If we map an isolated
+# instance of 0x2B to Unicode, it should be mapped as follows (LRO
+# indicates LEFT-RIGHT OVERRIDE, PDF indicates POP DIRECTION
+# FORMATTING):
+#
+# 0x2B -> 0x202D (LRO) + 0x002B (PLUS SIGN) + 0x202C (PDF)
+#
+# When mapping several characters in a row that require direction
+# forcing, the overrides need only be used at the beginning and end.
+# For example:
+#
+# 0x24 0x20 0x28 0x29 -> 0x202D 0x0024 0x0020 0x0028 0x0029 0x202C
+#
+# If neutral characters that require direction forcing are already
+# between strong-direction characters with matching directionality,
+# then direction overrides need not be used. Direction overrides are
+# always needed to map the right-left digits at 0xB0-0xB9.
+#
+# When mapping from Unicode to Mac OS Hebrew, the Unicode
+# bidirectional algorithm should be used to determine resolved
+# direction of the Unicode characters. The mapping from Unicode to
+# Mac OS Hebrew can then be disambiguated by the use of the resolved
+# direction:
+#
+# Unicode 0x002B -> Mac OS Hebrew 0x2B (if L) or 0xAB (if R)
+#
+# However, this also means the direction override characters should
+# be discarded when mapping from Unicode to Mac OS Hebrew (after
+# they have been used to determine resolved direction), since the
+# direction override information is carried by the code point itself.
+#
+# Even when direction overrides are not needed for roundtrip
+# fidelity, they are sometimes used when mapping Mac OS Hebrew
+# characters to Unicode in order to achieve similar text layout with
+# the resulting Unicode text. For example, the single Mac OS Hebrew
+# ellipsis character has direction class right-left, and there is no
+# left-right version. However, the Unicode HORIZONTAL ELLIPSIS
+# character has direction class neutral (which means it may end up
+# with a resolved direction of left-right if surrounded by left-right
+# characters). When mapping the Mac OS Hebrew ellipsis to Unicode, it
+# is surrounded with a direction override to help preserve proper
+# text layout. The resolved direction is not needed or used when
+# mapping the Unicode HORIZONTAL ELLIPSIS back to Mac OS Hebrew.
+#
+# 2. Use of corporate-zone Unicodes
+#
+# The goals in the mappings provided here are:
+# - Ensure roundtrip mapping from every character in the Mac OS
+# Hebrew character set to Unicode and back
+# - Use standard Unicode characters as much as possible, to
+# maximize interchangeability of the resulting Unicode text.
+# Whenever possible, avoid having content carried by private-use
+# characters.
+#
+# Some of the characters in the Mac OS Hebrew character set do not
+# correspond to distinct, single Unicode characters. To map these
+# and satisfy both goals above, we employ various strategies.
+#
+# a) If possible, use private use characters in combination with
+# standard Unicode characters to mark variants of the standard
+# Unicode character.
+#
+# Apple has defined a block of 32 corporate characters as "transcoding
+# hints." These are used in combination with standard Unicode characters
+# to force them to be treated in a special way for mapping to other
+# encodings; they have no other effect. Sixteen of these transcoding
+# hints are "grouping hints" - they indicate that the next 2-4 Unicode
+# characters should be treated as a single entity for transcoding. The
+# other sixteen transcoding hints are "variant tags" - they are like
+# combining characters, and can follow a standard Unicode character (or a
+# sequence consisting of a base character and other combining characters) to
+# cause it to be treated in a special way for transcoding. These always
+# terminate a combining-character sequence.
+#
+# Two transcoding hints are used in this mapping table: a grouping hint
+# and a variant tag:
+# hint:
+# 0xF86A group next 2 characters, right-left directionality
+# 0xF87F variant tag
+#
+# In Mac OS Hebrew, 0xC0 is a ligature for lamed holam. This can also
+# be represented in Mac OS Hebrew as 0xEC+0xDD, using separate
+# characters for lamed and holam. The latter sequence is mapped to
+# Unicode as 0x05DC+0x05B9, i.e. as the sequence HEBREW LETTER LAMED +
+# HEBREW POINT HOLAM. We want to map the ligature 0xC0 using the same
+# standard Unicode characters, but for round-trip fidelity we need to
+# distinguish it from the mapping of the sequence 0xEC+0xDD. Thus for
+# 0xC0 we use a grouping hint, and map as follows:
+#
+# 0xC0 -> 0xF86A+0x05DC+0x05B9
+#
+# The variant tag is used for "qamats qatan" to mark it as an alternate
+# for HEBREW POINT QAMATS, as follows:
+#
+# 0xDE -> 0x05B8+0xF87F
+#
+# b) Otherwise, use private use characters by themselves to map Mac OS
+# Hebrew characters which have no relationship to any standard Unicode
+# character.
+#
+# The following additional corporate zone Unicode characters are used
+# for this purpose here (to map the obsolete "canorals", see above):
+#
+# 0xF89B Hebrew canoral 1
+# 0xF89C Hebrew canoral 2
+# 0xF89D Hebrew canoral 3
+# 0xF89E Hebrew canoral 4
+#
+# 3. Roundtrip considerations when mapping to decomposed Unicode
+#
+# Both Mac OS Hebrew and Unicode provide multiple ways of representing
+# certain letter-and-point combinations. For example, HEBREW LETTER
+# VAV WITH HOLAM can be represented in Unicode as the single character
+# 0xFB4B or as the sequence 0x05D5 0x05B9; similarly, it can be
+# represented in Mac OS Hebrew as 0xC7 or as the sequence 0xE5 0xDD.
+# This leads to some roundtrip problems. First note that we have the
+# following mappings without such problems:
+#
+# Mac standard decomp. of reverse map
+# OS Unicode mapping std. mapping of decomp.
+# ---- ---------------------------------- ------------- -----------
+# 0xC6 0x05BC ... POINT DAGESH OR MAPIQ 0x05BC (same) 0xC6
+# 0xE5 0x05D5 ... LETTER VAV 0x05D5 (same) 0xE5
+# 0xDD 0x05B9 ... POINT HOLAM 0x05B9 (same) 0xDD
+#
+# However, those mappings above cause roundtrip problems for the
+# following mappings if they are decomposed:
+#
+# Mac standard decomp. of reverse map
+# OS Unicode mapping std. mapping of decomp.
+# ---- ---------------------------------- ------------- -----------
+# 0xC7 0xFB4B ... LETTER VAV WITH HOLAM 0x05D5 0x05B9 0xE5 0xDD
+# 0xC8 0xFB35 ... LETTER VAV WITH DAGESH 0x05D5 0x05BC 0xE5 0xC6
+#
+# One solution is to use a grouping transcoding hint with the two
+# decompositions above to mark the decomposed sequence for special
+# treatment in transcoding. This yields the following mappings to
+# decomposed Unicode:
+#
+# Mac decomposed
+# OS Unicode mapping
+# ---- --------------------
+# 0xC7 0xF86A 0x05D5 0x05B9
+# 0xC8 0xF86A 0x05D5 0x05BC
+#
+# Details of mapping changes in each version:
+# -------------------------------------------
+#
+# Changes from version b02 to version b03/c01:
+#
+# - Stop specifying left-right context for digits 0x30-0x39, since the
+# corresponding Unicodes 0x0030-0x0039 already have left-right
+# directionality.
+#
+# - Change mapping of 0x81 from 0xFB1F HEBREW LIGATURE YIDDISH YOD YOD
+# PATAH to its canonical decomposition 0x05F2+0x05B7 to improve
+# cross-platform compatibility (Windows doesn't handle 0xFB1F)
+#
+# - Interchange the mappings of 0xA8 and 0xA9 to obtain the correct
+# open/close behavior; they work differently than in Mac Arabic.
+# The old mapping was
+# 0xA8 <RL>+0x0028 # LEFT PARENTHESIS, right-left
+# 0xA9 <RL>+0x0029 # RIGHT PARENTHESIS, right-left
+# and the new mapping is
+# 0xA8 <RL>+0x0029 # RIGHT PARENTHESIS, right-left
+# 0xA9 <RL>+0x0028 # LEFT PARENTHESIS, right-left
+#
+# Changes from version n01 to version n03:
+#
+# - Change mapping for 0xC0 from single corporate character to
+# grouping hint plus standard Unicodes
+#
+# - Change mapping for 0xDE from single corporate character to
+# standard Unicode plus variant tag
+#
+##################
+
+0x20 <LR>+0x0020 # SPACE, left-right
+0x21 <LR>+0x0021 # EXCLAMATION MARK, left-right
+0x22 <LR>+0x0022 # QUOTATION MARK, left-right
+0x23 <LR>+0x0023 # NUMBER SIGN, left-right
+0x24 <LR>+0x0024 # DOLLAR SIGN, left-right
+0x25 <LR>+0x0025 # PERCENT SIGN, left-right
+0x26 0x0026 # AMPERSAND
+0x27 <LR>+0x0027 # APOSTROPHE, left-right
+0x28 <LR>+0x0028 # LEFT PARENTHESIS, left-right
+0x29 <LR>+0x0029 # RIGHT PARENTHESIS, left-right
+0x2A <LR>+0x002A # ASTERISK, left-right
+0x2B <LR>+0x002B # PLUS SIGN, left-right
+0x2C <LR>+0x002C # COMMA, left-right
+0x2D <LR>+0x002D # HYPHEN-MINUS, left-right
+0x2E <LR>+0x002E # FULL STOP, left-right
+0x2F <LR>+0x002F # SOLIDUS, left-right
+0x30 0x0030 # DIGIT ZERO
+0x31 0x0031 # DIGIT ONE
+0x32 0x0032 # DIGIT TWO
+0x33 0x0033 # DIGIT THREE
+0x34 0x0034 # DIGIT FOUR
+0x35 0x0035 # DIGIT FIVE
+0x36 0x0036 # DIGIT SIX
+0x37 0x0037 # DIGIT SEVEN
+0x38 0x0038 # DIGIT EIGHT
+0x39 0x0039 # DIGIT NINE
+0x3A <LR>+0x003A # COLON, left-right
+0x3B <LR>+0x003B # SEMICOLON, left-right
+0x3C <LR>+0x003C # LESS-THAN SIGN, left-right
+0x3D <LR>+0x003D # EQUALS SIGN, left-right
+0x3E <LR>+0x003E # GREATER-THAN SIGN, left-right
+0x3F <LR>+0x003F # QUESTION MARK, left-right
+0x40 0x0040 # COMMERCIAL AT
+0x41 0x0041 # LATIN CAPITAL LETTER A
+0x42 0x0042 # LATIN CAPITAL LETTER B
+0x43 0x0043 # LATIN CAPITAL LETTER C
+0x44 0x0044 # LATIN CAPITAL LETTER D
+0x45 0x0045 # LATIN CAPITAL LETTER E
+0x46 0x0046 # LATIN CAPITAL LETTER F
+0x47 0x0047 # LATIN CAPITAL LETTER G
+0x48 0x0048 # LATIN CAPITAL LETTER H
+0x49 0x0049 # LATIN CAPITAL LETTER I
+0x4A 0x004A # LATIN CAPITAL LETTER J
+0x4B 0x004B # LATIN CAPITAL LETTER K
+0x4C 0x004C # LATIN CAPITAL LETTER L
+0x4D 0x004D # LATIN CAPITAL LETTER M
+0x4E 0x004E # LATIN CAPITAL LETTER N
+0x4F 0x004F # LATIN CAPITAL LETTER O
+0x50 0x0050 # LATIN CAPITAL LETTER P
+0x51 0x0051 # LATIN CAPITAL LETTER Q
+0x52 0x0052 # LATIN CAPITAL LETTER R
+0x53 0x0053 # LATIN CAPITAL LETTER S
+0x54 0x0054 # LATIN CAPITAL LETTER T
+0x55 0x0055 # LATIN CAPITAL LETTER U
+0x56 0x0056 # LATIN CAPITAL LETTER V
+0x57 0x0057 # LATIN CAPITAL LETTER W
+0x58 0x0058 # LATIN CAPITAL LETTER X
+0x59 0x0059 # LATIN CAPITAL LETTER Y
+0x5A 0x005A # LATIN CAPITAL LETTER Z
+0x5B <LR>+0x005B # LEFT SQUARE BRACKET, left-right
+0x5C 0x005C # REVERSE SOLIDUS
+0x5D <LR>+0x005D # RIGHT SQUARE BRACKET, left-right
+0x5E 0x005E # CIRCUMFLEX ACCENT
+0x5F 0x005F # LOW LINE
+0x60 0x0060 # GRAVE ACCENT
+0x61 0x0061 # LATIN SMALL LETTER A
+0x62 0x0062 # LATIN SMALL LETTER B
+0x63 0x0063 # LATIN SMALL LETTER C
+0x64 0x0064 # LATIN SMALL LETTER D
+0x65 0x0065 # LATIN SMALL LETTER E
+0x66 0x0066 # LATIN SMALL LETTER F
+0x67 0x0067 # LATIN SMALL LETTER G
+0x68 0x0068 # LATIN SMALL LETTER H
+0x69 0x0069 # LATIN SMALL LETTER I
+0x6A 0x006A # LATIN SMALL LETTER J
+0x6B 0x006B # LATIN SMALL LETTER K
+0x6C 0x006C # LATIN SMALL LETTER L
+0x6D 0x006D # LATIN SMALL LETTER M
+0x6E 0x006E # LATIN SMALL LETTER N
+0x6F 0x006F # LATIN SMALL LETTER O
+0x70 0x0070 # LATIN SMALL LETTER P
+0x71 0x0071 # LATIN SMALL LETTER Q
+0x72 0x0072 # LATIN SMALL LETTER R
+0x73 0x0073 # LATIN SMALL LETTER S
+0x74 0x0074 # LATIN SMALL LETTER T
+0x75 0x0075 # LATIN SMALL LETTER U
+0x76 0x0076 # LATIN SMALL LETTER V
+0x77 0x0077 # LATIN SMALL LETTER W
+0x78 0x0078 # LATIN SMALL LETTER X
+0x79 0x0079 # LATIN SMALL LETTER Y
+0x7A 0x007A # LATIN SMALL LETTER Z
+0x7B <LR>+0x007B # LEFT CURLY BRACKET, left-right
+0x7C <LR>+0x007C # VERTICAL LINE, left-right
+0x7D <LR>+0x007D # RIGHT CURLY BRACKET, left-right
+0x7E 0x007E # TILDE
+#
+0x80 0x00C4 # LATIN CAPITAL LETTER A WITH DIAERESIS
+0x81 0x05F2+0x05B7 # HEBREW LIGATURE YIDDISH YOD YOD PATAH
+0x82 0x00C7 # LATIN CAPITAL LETTER C WITH CEDILLA
+0x83 0x00C9 # LATIN CAPITAL LETTER E WITH ACUTE
+0x84 0x00D1 # LATIN CAPITAL LETTER N WITH TILDE
+0x85 0x00D6 # LATIN CAPITAL LETTER O WITH DIAERESIS
+0x86 0x00DC # LATIN CAPITAL LETTER U WITH DIAERESIS
+0x87 0x00E1 # LATIN SMALL LETTER A WITH ACUTE
+0x88 0x00E0 # LATIN SMALL LETTER A WITH GRAVE
+0x89 0x00E2 # LATIN SMALL LETTER A WITH CIRCUMFLEX
+0x8A 0x00E4 # LATIN SMALL LETTER A WITH DIAERESIS
+0x8B 0x00E3 # LATIN SMALL LETTER A WITH TILDE
+0x8C 0x00E5 # LATIN SMALL LETTER A WITH RING ABOVE
+0x8D 0x00E7 # LATIN SMALL LETTER C WITH CEDILLA
+0x8E 0x00E9 # LATIN SMALL LETTER E WITH ACUTE
+0x8F 0x00E8 # LATIN SMALL LETTER E WITH GRAVE
+0x90 0x00EA # LATIN SMALL LETTER E WITH CIRCUMFLEX
+0x91 0x00EB # LATIN SMALL LETTER E WITH DIAERESIS
+0x92 0x00ED # LATIN SMALL LETTER I WITH ACUTE
+0x93 0x00EC # LATIN SMALL LETTER I WITH GRAVE
+0x94 0x00EE # LATIN SMALL LETTER I WITH CIRCUMFLEX
+0x95 0x00EF # LATIN SMALL LETTER I WITH DIAERESIS
+0x96 0x00F1 # LATIN SMALL LETTER N WITH TILDE
+0x97 0x00F3 # LATIN SMALL LETTER O WITH ACUTE
+0x98 0x00F2 # LATIN SMALL LETTER O WITH GRAVE
+0x99 0x00F4 # LATIN SMALL LETTER O WITH CIRCUMFLEX
+0x9A 0x00F6 # LATIN SMALL LETTER O WITH DIAERESIS
+0x9B 0x00F5 # LATIN SMALL LETTER O WITH TILDE
+0x9C 0x00FA # LATIN SMALL LETTER U WITH ACUTE
+0x9D 0x00F9 # LATIN SMALL LETTER U WITH GRAVE
+0x9E 0x00FB # LATIN SMALL LETTER U WITH CIRCUMFLEX
+0x9F 0x00FC # LATIN SMALL LETTER U WITH DIAERESIS
+0xA0 <RL>+0x0020 # SPACE, right-left
+0xA1 <RL>+0x0021 # EXCLAMATION MARK, right-left
+0xA2 <RL>+0x0022 # QUOTATION MARK, right-left
+0xA3 <RL>+0x0023 # NUMBER SIGN, right-left
+0xA4 <RL>+0x0024 # DOLLAR SIGN, right-left
+0xA5 <RL>+0x0025 # PERCENT SIGN, right-left
+0xA6 0x20AA # NEW SHEQEL SIGN
+0xA7 <RL>+0x0027 # APOSTROPHE, right-left
+0xA8 <RL>+0x0029 # RIGHT PARENTHESIS, right-left # close parenthesis
+0xA9 <RL>+0x0028 # LEFT PARENTHESIS, right-left # open parenthesis
+0xAA <RL>+0x002A # ASTERISK, right-left
+0xAB <RL>+0x002B # PLUS SIGN, right-left
+0xAC <RL>+0x002C # COMMA, right-left
+0xAD <RL>+0x002D # HYPHEN-MINUS, right-left
+0xAE <RL>+0x002E # FULL STOP, right-left
+0xAF <RL>+0x002F # SOLIDUS, right-left
+0xB0 <RL>+0x0030 # DIGIT ZERO, right-left (need override)
+0xB1 <RL>+0x0031 # DIGIT ONE, right-left (need override)
+0xB2 <RL>+0x0032 # DIGIT TWO, right-left (need override)
+0xB3 <RL>+0x0033 # DIGIT THREE, right-left (need override)
+0xB4 <RL>+0x0034 # DIGIT FOUR, right-left (need override)
+0xB5 <RL>+0x0035 # DIGIT FIVE, right-left (need override)
+0xB6 <RL>+0x0036 # DIGIT SIX, right-left (need override)
+0xB7 <RL>+0x0037 # DIGIT SEVEN, right-left (need override)
+0xB8 <RL>+0x0038 # DIGIT EIGHT, right-left (need override)
+0xB9 <RL>+0x0039 # DIGIT NINE, right-left (need override)
+0xBA <RL>+0x003A # COLON, right-left
+0xBB <RL>+0x003B # SEMICOLON, right-left
+0xBC <RL>+0x003C # LESS-THAN SIGN, right-left
+0xBD <RL>+0x003D # EQUALS SIGN, right-left
+0xBE <RL>+0x003E # GREATER-THAN SIGN, right-left
+0xBF <RL>+0x003F # QUESTION MARK, right-left
+0xC0 0xF86A+0x05DC+0x05B9 # Hebrew ligature lamed holam
+0xC1 <RL>+0x201E # DOUBLE LOW-9 QUOTATION MARK, right-left
+0xC2 0xF89B # Hebrew canoral 1
+0xC3 0xF89C # Hebrew canoral 2
+0xC4 0xF89D # Hebrew canoral 3
+0xC5 0xF89E # Hebrew canoral 4
+0xC6 0x05BC # HEBREW POINT DAGESH OR MAPIQ
+0xC7 0xFB4B # HEBREW LETTER VAV WITH HOLAM
+0xC8 0xFB35 # HEBREW LETTER VAV WITH DAGESH
+0xC9 <RL>+0x2026 # HORIZONTAL ELLIPSIS, right-left
+0xCA <RL>+0x00A0 # NO-BREAK SPACE, right-left
+0xCB 0x05B8 # HEBREW POINT QAMATS
+0xCC 0x05B7 # HEBREW POINT PATAH
+0xCD 0x05B5 # HEBREW POINT TSERE
+0xCE 0x05B6 # HEBREW POINT SEGOL
+0xCF 0x05B4 # HEBREW POINT HIRIQ
+0xD0 <RL>+0x2013 # EN DASH, right-left
+0xD1 <RL>+0x2014 # EM DASH, right-left
+0xD2 <RL>+0x201C # LEFT DOUBLE QUOTATION MARK, right-left
+0xD3 <RL>+0x201D # RIGHT DOUBLE QUOTATION MARK, right-left
+0xD4 <RL>+0x2018 # LEFT SINGLE QUOTATION MARK, right-left
+0xD5 <RL>+0x2019 # RIGHT SINGLE QUOTATION MARK, right-left
+0xD6 0xFB2A # HEBREW LETTER SHIN WITH SHIN DOT
+0xD7 0xFB2B # HEBREW LETTER SHIN WITH SIN DOT
+0xD8 0x05BF # HEBREW POINT RAFE
+0xD9 0x05B0 # HEBREW POINT SHEVA
+0xDA 0x05B2 # HEBREW POINT HATAF PATAH
+0xDB 0x05B1 # HEBREW POINT HATAF SEGOL
+0xDC 0x05BB # HEBREW POINT QUBUTS
+0xDD 0x05B9 # HEBREW POINT HOLAM
+0xDE 0x05B8+0xF87F # HEBREW POINT QAMATS, alternate form "qamats qatan"
+0xDF 0x05B3 # HEBREW POINT HATAF QAMATS
+0xE0 0x05D0 # HEBREW LETTER ALEF
+0xE1 0x05D1 # HEBREW LETTER BET
+0xE2 0x05D2 # HEBREW LETTER GIMEL
+0xE3 0x05D3 # HEBREW LETTER DALET
+0xE4 0x05D4 # HEBREW LETTER HE
+0xE5 0x05D5 # HEBREW LETTER VAV
+0xE6 0x05D6 # HEBREW LETTER ZAYIN
+0xE7 0x05D7 # HEBREW LETTER HET
+0xE8 0x05D8 # HEBREW LETTER TET
+0xE9 0x05D9 # HEBREW LETTER YOD
+0xEA 0x05DA # HEBREW LETTER FINAL KAF
+0xEB 0x05DB # HEBREW LETTER KAF
+0xEC 0x05DC # HEBREW LETTER LAMED
+0xED 0x05DD # HEBREW LETTER FINAL MEM
+0xEE 0x05DE # HEBREW LETTER MEM
+0xEF 0x05DF # HEBREW LETTER FINAL NUN
+0xF0 0x05E0 # HEBREW LETTER NUN
+0xF1 0x05E1 # HEBREW LETTER SAMEKH
+0xF2 0x05E2 # HEBREW LETTER AYIN
+0xF3 0x05E3 # HEBREW LETTER FINAL PE
+0xF4 0x05E4 # HEBREW LETTER PE
+0xF5 0x05E5 # HEBREW LETTER FINAL TSADI
+0xF6 0x05E6 # HEBREW LETTER TSADI
+0xF7 0x05E7 # HEBREW LETTER QOF
+0xF8 0x05E8 # HEBREW LETTER RESH
+0xF9 0x05E9 # HEBREW LETTER SHIN
+0xFA 0x05EA # HEBREW LETTER TAV
+0xFB <RL>+0x007D # RIGHT CURLY BRACKET, right-left
+0xFC <RL>+0x005D # RIGHT SQUARE BRACKET, right-left
+0xFD <RL>+0x007B # LEFT CURLY BRACKET, right-left
+0xFE <RL>+0x005B # LEFT SQUARE BRACKET, right-left
+0xFF <RL>+0x007C # VERTICAL LINE, right-left
--- /dev/null
+#=======================================================================
+# File name: ROMAN.TXT
+#
+# Contents: Map (external version) from Mac OS Roman
+# character set to Unicode 2.1 and later.
+#
+# Copyright: (c) 1994-2002, 2005 by Apple Computer, Inc., all rights
+# reserved.
+#
+# Contact: charsets@apple.com
+#
+# Changes:
+#
+# c02 2005-Apr-05 Update header comments. Matches internal xml
+# <c1.1> and Text Encoding Converter 2.0.
+# b4,c1 2002-Dec-19 Update URLs, notes. Matches internal
+# utom<b5>.
+# b03 1999-Sep-22 Update contact e-mail address. Matches
+# internal utom<b4>, ufrm<b3>, and Text
+# Encoding Converter version 1.5.
+# b02 1998-Aug-18 Encoding changed for Mac OS 8.5; change
+# mapping of 0xDB from CURRENCY SIGN to
+# EURO SIGN. Matches internal utom<b3>,
+# ufrm<b3>.
+# n08 1998-Feb-05 Minor update to header comments
+# n06 1997-Dec-14 Add warning about future changes to 0xDB
+# from CURRENCY SIGN to EURO SIGN. Clarify
+# some header information
+# n04 1997-Dec-01 Update to match internal utom<n3>, ufrm<n22>:
+# Change standard mapping for 0xBD from U+2126
+# to its canonical decomposition, U+03A9.
+# n03 1995-Apr-15 First version (after fixing some typos).
+# Matches internal ufrm<n9>.
+#
+# Standard header:
+# ----------------
+#
+# Apple, the Apple logo, and Macintosh are trademarks of Apple
+# Computer, Inc., registered in the United States and other countries.
+# Unicode is a trademark of Unicode Inc. For the sake of brevity,
+# throughout this document, "Macintosh" can be used to refer to
+# Macintosh computers and "Unicode" can be used to refer to the
+# Unicode standard.
+#
+# Apple Computer, Inc. ("Apple") makes no warranty or representation,
+# either express or implied, with respect to this document and the
+# included data, its quality, accuracy, or fitness for a particular
+# purpose. In no event will Apple be liable for direct, indirect,
+# special, incidental, or consequential damages resulting from any
+# defect or inaccuracy in this document or the included data.
+#
+# These mapping tables and character lists are subject to change.
+# The latest tables should be available from the following:
+#
+# <http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/>
+#
+# For general information about Mac OS encodings and these mapping
+# tables, see the file "README.TXT".
+#
+# Format:
+# -------
+#
+# Three tab-separated columns;
+# '#' begins a comment which continues to the end of the line.
+# Column #1 is the Mac OS Roman code (in hex as 0xNN)
+# Column #2 is the corresponding Unicode (in hex as 0xNNNN)
+# Column #3 is a comment containing the Unicode name
+#
+# The entries are in Mac OS Roman code order.
+#
+# One of these mappings requires the use of a corporate character.
+# See the file "CORPCHAR.TXT" and notes below.
+#
+# Control character mappings are not shown in this table, following
+# the conventions of the standard UTC mapping tables. However, the
+# Mac OS Roman character set uses the standard control characters at
+# 0x00-0x1F and 0x7F.
+#
+# Notes on Mac OS Roman:
+# ----------------------
+#
+# This is a legacy Mac OS encoding; in the Mac OS X Carbon and Cocoa
+# environments, it is only supported directly in programming
+# interfaces for QuickDraw Text, the Script Manager, and related
+# Text Utilities. For other purposes it is supported via transcoding
+# to and from Unicode.
+#
+# This character set is used for at least the following Mac OS
+# localizations: U.S., British, Canadian French, French, Swiss
+# French, German, Swiss German, Italian, Swiss Italian, Dutch,
+# Swedish, Norwegian, Danish, Finnish, Spanish, Catalan,
+# Portuguese, Brazilian, and the default International system.
+#
+# Variants of Mac OS Roman are used for Croatian, Icelandic,
+# Turkish, Romanian, and other encodings. Separate mapping tables
+# are available for these encodings.
+#
+# Before Mac OS 8.5, code point 0xDB was CURRENCY SIGN, and was
+# mapped to U+00A4. In Mac OS 8.5 and later versions, code point
+# 0xDB is changed to EURO SIGN and maps to U+20AC; the standard
+# Apple fonts are updated for Mac OS 8.5 to reflect this. There is
+# a "currency sign" variant of the Mac OS Roman encoding that still
+# maps 0xDB to U+00A4; this can be used for older fonts.
+#
+# Before Mac OS 8.5, the ROM bitmap versions of the fonts Chicago,
+# New York, Geneva, and Monaco did not implement the full Mac OS
+# Roman character set; they only supported character codes up to
+# 0xD8. The TrueType versions of these fonts have always implemented
+# the full character set, as with the bitmap and TrueType versions
+# of the other standard Roman fonts.
+#
+# In all Mac OS encodings, fonts such as Chicago which are used
+# as "system" fonts (for menus, dialogs, etc.) have four glyphs
+# at code points 0x11-0x14 for transient use by the Menu Manager.
+# These glyphs are not intended as characters for use in normal
+# text, and the associated code points are not generally
+# interpreted as associated with these glyphs; they are usually
+# interpreted (if at all) as the control codes DC1-DC4.
+#
+# Unicode mapping issues and notes:
+# ---------------------------------
+#
+# The following corporate zone Unicode character is used in this
+# mapping:
+#
+# 0xF8FF Apple logo
+#
+# NOTE: The graphic image associated with the Apple logo character
+# is not authorized for use without permission of Apple, and
+# unauthorized use might constitute trademark infringement.
+#
+# Details of mapping changes in each version:
+# -------------------------------------------
+#
+# Changes from version n08 to version b02:
+#
+# - Encoding changed for Mac OS 8.5; change mapping of 0xDB from
+# CURRENCY SIGN (U+00A4) to EURO SIGN (U+20AC).
+#
+# Changes from version n03 to version n04:
+#
+# - Change mapping of 0xBD from U+2126 to its canonical
+# decomposition, U+03A9.
+#
+##################
+
+0x20 0x0020 # SPACE
+0x21 0x0021 # EXCLAMATION MARK
+0x22 0x0022 # QUOTATION MARK
+0x23 0x0023 # NUMBER SIGN
+0x24 0x0024 # DOLLAR SIGN
+0x25 0x0025 # PERCENT SIGN
+0x26 0x0026 # AMPERSAND
+0x27 0x0027 # APOSTROPHE
+0x28 0x0028 # LEFT PARENTHESIS
+0x29 0x0029 # RIGHT PARENTHESIS
+0x2A 0x002A # ASTERISK
+0x2B 0x002B # PLUS SIGN
+0x2C 0x002C # COMMA
+0x2D 0x002D # HYPHEN-MINUS
+0x2E 0x002E # FULL STOP
+0x2F 0x002F # SOLIDUS
+0x30 0x0030 # DIGIT ZERO
+0x31 0x0031 # DIGIT ONE
+0x32 0x0032 # DIGIT TWO
+0x33 0x0033 # DIGIT THREE
+0x34 0x0034 # DIGIT FOUR
+0x35 0x0035 # DIGIT FIVE
+0x36 0x0036 # DIGIT SIX
+0x37 0x0037 # DIGIT SEVEN
+0x38 0x0038 # DIGIT EIGHT
+0x39 0x0039 # DIGIT NINE
+0x3A 0x003A # COLON
+0x3B 0x003B # SEMICOLON
+0x3C 0x003C # LESS-THAN SIGN
+0x3D 0x003D # EQUALS SIGN
+0x3E 0x003E # GREATER-THAN SIGN
+0x3F 0x003F # QUESTION MARK
+0x40 0x0040 # COMMERCIAL AT
+0x41 0x0041 # LATIN CAPITAL LETTER A
+0x42 0x0042 # LATIN CAPITAL LETTER B
+0x43 0x0043 # LATIN CAPITAL LETTER C
+0x44 0x0044 # LATIN CAPITAL LETTER D
+0x45 0x0045 # LATIN CAPITAL LETTER E
+0x46 0x0046 # LATIN CAPITAL LETTER F
+0x47 0x0047 # LATIN CAPITAL LETTER G
+0x48 0x0048 # LATIN CAPITAL LETTER H
+0x49 0x0049 # LATIN CAPITAL LETTER I
+0x4A 0x004A # LATIN CAPITAL LETTER J
+0x4B 0x004B # LATIN CAPITAL LETTER K
+0x4C 0x004C # LATIN CAPITAL LETTER L
+0x4D 0x004D # LATIN CAPITAL LETTER M
+0x4E 0x004E # LATIN CAPITAL LETTER N
+0x4F 0x004F # LATIN CAPITAL LETTER O
+0x50 0x0050 # LATIN CAPITAL LETTER P
+0x51 0x0051 # LATIN CAPITAL LETTER Q
+0x52 0x0052 # LATIN CAPITAL LETTER R
+0x53 0x0053 # LATIN CAPITAL LETTER S
+0x54 0x0054 # LATIN CAPITAL LETTER T
+0x55 0x0055 # LATIN CAPITAL LETTER U
+0x56 0x0056 # LATIN CAPITAL LETTER V
+0x57 0x0057 # LATIN CAPITAL LETTER W
+0x58 0x0058 # LATIN CAPITAL LETTER X
+0x59 0x0059 # LATIN CAPITAL LETTER Y
+0x5A 0x005A # LATIN CAPITAL LETTER Z
+0x5B 0x005B # LEFT SQUARE BRACKET
+0x5C 0x005C # REVERSE SOLIDUS
+0x5D 0x005D # RIGHT SQUARE BRACKET
+0x5E 0x005E # CIRCUMFLEX ACCENT
+0x5F 0x005F # LOW LINE
+0x60 0x0060 # GRAVE ACCENT
+0x61 0x0061 # LATIN SMALL LETTER A
+0x62 0x0062 # LATIN SMALL LETTER B
+0x63 0x0063 # LATIN SMALL LETTER C
+0x64 0x0064 # LATIN SMALL LETTER D
+0x65 0x0065 # LATIN SMALL LETTER E
+0x66 0x0066 # LATIN SMALL LETTER F
+0x67 0x0067 # LATIN SMALL LETTER G
+0x68 0x0068 # LATIN SMALL LETTER H
+0x69 0x0069 # LATIN SMALL LETTER I
+0x6A 0x006A # LATIN SMALL LETTER J
+0x6B 0x006B # LATIN SMALL LETTER K
+0x6C 0x006C # LATIN SMALL LETTER L
+0x6D 0x006D # LATIN SMALL LETTER M
+0x6E 0x006E # LATIN SMALL LETTER N
+0x6F 0x006F # LATIN SMALL LETTER O
+0x70 0x0070 # LATIN SMALL LETTER P
+0x71 0x0071 # LATIN SMALL LETTER Q
+0x72 0x0072 # LATIN SMALL LETTER R
+0x73 0x0073 # LATIN SMALL LETTER S
+0x74 0x0074 # LATIN SMALL LETTER T
+0x75 0x0075 # LATIN SMALL LETTER U
+0x76 0x0076 # LATIN SMALL LETTER V
+0x77 0x0077 # LATIN SMALL LETTER W
+0x78 0x0078 # LATIN SMALL LETTER X
+0x79 0x0079 # LATIN SMALL LETTER Y
+0x7A 0x007A # LATIN SMALL LETTER Z
+0x7B 0x007B # LEFT CURLY BRACKET
+0x7C 0x007C # VERTICAL LINE
+0x7D 0x007D # RIGHT CURLY BRACKET
+0x7E 0x007E # TILDE
+#
+0x80 0x00C4 # LATIN CAPITAL LETTER A WITH DIAERESIS
+0x81 0x00C5 # LATIN CAPITAL LETTER A WITH RING ABOVE
+0x82 0x00C7 # LATIN CAPITAL LETTER C WITH CEDILLA
+0x83 0x00C9 # LATIN CAPITAL LETTER E WITH ACUTE
+0x84 0x00D1 # LATIN CAPITAL LETTER N WITH TILDE
+0x85 0x00D6 # LATIN CAPITAL LETTER O WITH DIAERESIS
+0x86 0x00DC # LATIN CAPITAL LETTER U WITH DIAERESIS
+0x87 0x00E1 # LATIN SMALL LETTER A WITH ACUTE
+0x88 0x00E0 # LATIN SMALL LETTER A WITH GRAVE
+0x89 0x00E2 # LATIN SMALL LETTER A WITH CIRCUMFLEX
+0x8A 0x00E4 # LATIN SMALL LETTER A WITH DIAERESIS
+0x8B 0x00E3 # LATIN SMALL LETTER A WITH TILDE
+0x8C 0x00E5 # LATIN SMALL LETTER A WITH RING ABOVE
+0x8D 0x00E7 # LATIN SMALL LETTER C WITH CEDILLA
+0x8E 0x00E9 # LATIN SMALL LETTER E WITH ACUTE
+0x8F 0x00E8 # LATIN SMALL LETTER E WITH GRAVE
+0x90 0x00EA # LATIN SMALL LETTER E WITH CIRCUMFLEX
+0x91 0x00EB # LATIN SMALL LETTER E WITH DIAERESIS
+0x92 0x00ED # LATIN SMALL LETTER I WITH ACUTE
+0x93 0x00EC # LATIN SMALL LETTER I WITH GRAVE
+0x94 0x00EE # LATIN SMALL LETTER I WITH CIRCUMFLEX
+0x95 0x00EF # LATIN SMALL LETTER I WITH DIAERESIS
+0x96 0x00F1 # LATIN SMALL LETTER N WITH TILDE
+0x97 0x00F3 # LATIN SMALL LETTER O WITH ACUTE
+0x98 0x00F2 # LATIN SMALL LETTER O WITH GRAVE
+0x99 0x00F4 # LATIN SMALL LETTER O WITH CIRCUMFLEX
+0x9A 0x00F6 # LATIN SMALL LETTER O WITH DIAERESIS
+0x9B 0x00F5 # LATIN SMALL LETTER O WITH TILDE
+0x9C 0x00FA # LATIN SMALL LETTER U WITH ACUTE
+0x9D 0x00F9 # LATIN SMALL LETTER U WITH GRAVE
+0x9E 0x00FB # LATIN SMALL LETTER U WITH CIRCUMFLEX
+0x9F 0x00FC # LATIN SMALL LETTER U WITH DIAERESIS
+0xA0 0x2020 # DAGGER
+0xA1 0x00B0 # DEGREE SIGN
+0xA2 0x00A2 # CENT SIGN
+0xA3 0x00A3 # POUND SIGN
+0xA4 0x00A7 # SECTION SIGN
+0xA5 0x2022 # BULLET
+0xA6 0x00B6 # PILCROW SIGN
+0xA7 0x00DF # LATIN SMALL LETTER SHARP S
+0xA8 0x00AE # REGISTERED SIGN
+0xA9 0x00A9 # COPYRIGHT SIGN
+0xAA 0x2122 # TRADE MARK SIGN
+0xAB 0x00B4 # ACUTE ACCENT
+0xAC 0x00A8 # DIAERESIS
+0xAD 0x2260 # NOT EQUAL TO
+0xAE 0x00C6 # LATIN CAPITAL LETTER AE
+0xAF 0x00D8 # LATIN CAPITAL LETTER O WITH STROKE
+0xB0 0x221E # INFINITY
+0xB1 0x00B1 # PLUS-MINUS SIGN
+0xB2 0x2264 # LESS-THAN OR EQUAL TO
+0xB3 0x2265 # GREATER-THAN OR EQUAL TO
+0xB4 0x00A5 # YEN SIGN
+0xB5 0x00B5 # MICRO SIGN
+0xB6 0x2202 # PARTIAL DIFFERENTIAL
+0xB7 0x2211 # N-ARY SUMMATION
+0xB8 0x220F # N-ARY PRODUCT
+0xB9 0x03C0 # GREEK SMALL LETTER PI
+0xBA 0x222B # INTEGRAL
+0xBB 0x00AA # FEMININE ORDINAL INDICATOR
+0xBC 0x00BA # MASCULINE ORDINAL INDICATOR
+0xBD 0x03A9 # GREEK CAPITAL LETTER OMEGA
+0xBE 0x00E6 # LATIN SMALL LETTER AE
+0xBF 0x00F8 # LATIN SMALL LETTER O WITH STROKE
+0xC0 0x00BF # INVERTED QUESTION MARK
+0xC1 0x00A1 # INVERTED EXCLAMATION MARK
+0xC2 0x00AC # NOT SIGN
+0xC3 0x221A # SQUARE ROOT
+0xC4 0x0192 # LATIN SMALL LETTER F WITH HOOK
+0xC5 0x2248 # ALMOST EQUAL TO
+0xC6 0x2206 # INCREMENT
+0xC7 0x00AB # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xC8 0x00BB # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
+0xC9 0x2026 # HORIZONTAL ELLIPSIS
+0xCA 0x00A0 # NO-BREAK SPACE
+0xCB 0x00C0 # LATIN CAPITAL LETTER A WITH GRAVE
+0xCC 0x00C3 # LATIN CAPITAL LETTER A WITH TILDE
+0xCD 0x00D5 # LATIN CAPITAL LETTER O WITH TILDE
+0xCE 0x0152 # LATIN CAPITAL LIGATURE OE
+0xCF 0x0153 # LATIN SMALL LIGATURE OE
+0xD0 0x2013 # EN DASH
+0xD1 0x2014 # EM DASH
+0xD2 0x201C # LEFT DOUBLE QUOTATION MARK
+0xD3 0x201D # RIGHT DOUBLE QUOTATION MARK
+0xD4 0x2018 # LEFT SINGLE QUOTATION MARK
+0xD5 0x2019 # RIGHT SINGLE QUOTATION MARK
+0xD6 0x00F7 # DIVISION SIGN
+0xD7 0x25CA # LOZENGE
+0xD8 0x00FF # LATIN SMALL LETTER Y WITH DIAERESIS
+0xD9 0x0178 # LATIN CAPITAL LETTER Y WITH DIAERESIS
+0xDA 0x2044 # FRACTION SLASH
+0xDB 0x20AC # EURO SIGN
+0xDC 0x2039 # SINGLE LEFT-POINTING ANGLE QUOTATION MARK
+0xDD 0x203A # SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
+0xDE 0xFB01 # LATIN SMALL LIGATURE FI
+0xDF 0xFB02 # LATIN SMALL LIGATURE FL
+0xE0 0x2021 # DOUBLE DAGGER
+0xE1 0x00B7 # MIDDLE DOT
+0xE2 0x201A # SINGLE LOW-9 QUOTATION MARK
+0xE3 0x201E # DOUBLE LOW-9 QUOTATION MARK
+0xE4 0x2030 # PER MILLE SIGN
+0xE5 0x00C2 # LATIN CAPITAL LETTER A WITH CIRCUMFLEX
+0xE6 0x00CA # LATIN CAPITAL LETTER E WITH CIRCUMFLEX
+0xE7 0x00C1 # LATIN CAPITAL LETTER A WITH ACUTE
+0xE8 0x00CB # LATIN CAPITAL LETTER E WITH DIAERESIS
+0xE9 0x00C8 # LATIN CAPITAL LETTER E WITH GRAVE
+0xEA 0x00CD # LATIN CAPITAL LETTER I WITH ACUTE
+0xEB 0x00CE # LATIN CAPITAL LETTER I WITH CIRCUMFLEX
+0xEC 0x00CF # LATIN CAPITAL LETTER I WITH DIAERESIS
+0xED 0x00CC # LATIN CAPITAL LETTER I WITH GRAVE
+0xEE 0x00D3 # LATIN CAPITAL LETTER O WITH ACUTE
+0xEF 0x00D4 # LATIN CAPITAL LETTER O WITH CIRCUMFLEX
+0xF0 0xF8FF # Apple logo
+0xF1 0x00D2 # LATIN CAPITAL LETTER O WITH GRAVE
+0xF2 0x00DA # LATIN CAPITAL LETTER U WITH ACUTE
+0xF3 0x00DB # LATIN CAPITAL LETTER U WITH CIRCUMFLEX
+0xF4 0x00D9 # LATIN CAPITAL LETTER U WITH GRAVE
+0xF5 0x0131 # LATIN SMALL LETTER DOTLESS I
+0xF6 0x02C6 # MODIFIER LETTER CIRCUMFLEX ACCENT
+0xF7 0x02DC # SMALL TILDE
+0xF8 0x00AF # MACRON
+0xF9 0x02D8 # BREVE
+0xFA 0x02D9 # DOT ABOVE
+0xFB 0x02DA # RING ABOVE
+0xFC 0x00B8 # CEDILLA
+0xFD 0x02DD # DOUBLE ACUTE ACCENT
+0xFE 0x02DB # OGONEK
+0xFF 0x02C7 # CARON
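Reviewer note: each line of the charset map above carries a source byte, a Unicode value and a `#` comment. The following is a minimal sketch of reading one such line, assuming that layout holds throughout the file. `parse_map_line` is a hypothetical helper written for illustration, not a function taken from catdoc, and it uses `unsigned long` for the Unicode value so that it does not rely on `int` being wider than 16 bits.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not part of catdoc: parse one line of a charset
 * map in the "0xNN 0xNNNN # comment" form.
 * Returns 1 and fills *byte and *ucs on success; returns 0 for blank
 * lines, comment lines and malformed input. */
int parse_map_line(const char *line, unsigned int *byte, unsigned long *ucs)
{
	char *end;
	unsigned long b, u;

	assert(line != NULL && byte != NULL && ucs != NULL);
	while (*line == ' ' || *line == '\t') line++;
	if (*line == '#' || *line == '\0' || *line == '\r' || *line == '\n')
		return 0;
	/* Base 16 lets strtoul accept the optional "0x" prefix. */
	b = strtoul(line, &end, 16);
	if (end == line || b > 0xFFUL) return 0;
	line = end;
	u = strtoul(line, &end, 16);
	if (end == line || u > 0xFFFFUL) return 0;
	*byte = (unsigned int)b;
	*ucs = u;
	return 1;
}
```

A caller would typically feed every line of the file to this helper and store `ucs` in a 256-entry table indexed by `byte`, skipping lines for which it returns 0.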
--- /dev/null
+A1 "!`"
+A8 '\\"{}'
+AB "<<"
+AD "\\-"
+AE "\\circled{R}"
+B0 "${}^\\circ$"
+B1 "$\\pm$"
+B2 "${}^2$"
+B3 "${}^3$"
+B4 "\\'{}"
+B5 "$\\mu$"
+B7 "$\\cdot$"
+B9 "${}^1$"
+BA "${}^{0\\over{}}$"
+BB ">>"
+BC "$1\\over4$"
+BD "$1\\over2$"
+BE "$3\\over4$"
+BF "?`"
+C0 "\\`A"
+C1 "\\'A"
+C2 "\\^A"
+C3 "\\~A"
+C4 '\\"A'
+C5 "\\r{A}"
+C6 "\\AE{}"
+C7 "\\c{C}"
+C8 "\\`E"
+C9 "\\'E"
+CA "\\^E"
+CB '\\"E'
+CC "\\`I"
+CD "\\'I"
+CE "\\^I"
+CF '\\"I'
+D1 "\\~N"
+D2 "\\`O"
+D3 "\\'O"
+D4 "\\^O"
+D5 "\\~O"
+D6 '\\"O'
+D7 "$\\times$"
+D8 "\\O{}"
+D9 "\\`U"
+DA "\\'U"
+DB "\\^U"
+DC '\\"U'
+DD "\\'Y"
+DF "\\ss{}"
+E0 "\\`a"
+E1 "\\'a"
+E2 "\\^a"
+E3 "\\~a"
+E4 '\\"a'
+E5 "\\r{a}"
+E6 "\\ae{}"
+E7 "\\c{c}"
+E8 "\\`e"
+E9 "\\'e"
+EA "\\^e"
+EB '\\"e'
+EC "\\`{\\i}"
+ED "\\'{\\i}"
+EE "\\^{\\i}"
+EF '\\"{\\i}'
+F1 "\\~n"
+F2 "\\`o"
+F3 "\\'o"
+F4 "\\^o"
+F5 "\\~o"
+F6 '\\"o'
+F8 "\\o{}"
+F9 "\\`u"
+FA "\\'u"
+FB "\\^u"
+FC '\\"u'
+FD "\\'y"
+FF '\\"y'
+201A "`"
+201E "''"
+2026 "\\dots"
+2020 "\\dagger"
+2030 "${}^0/{}_{00}$"
+2039 "<"
+2018 "`"
+2019 "\'"
+201C "``"
+201D "''"
+2022 "\\bullet"
+2013 --
+2014 ---
+2122 "\\trademark"
+203A ">"
+2116 "\\No"
--- /dev/null
+1C "&"
+1E "\\\\\n"
+23 "\\#"
+26 "\\&"
+25 "\\%"
+24 "\\$"
+5F "\\_"
+7B "$\\{$"
+7D "$\\}$"
+5B "$\[$"
+5D "$\]$"
+5E "\\asciicircum{}"
+5C "$\\backslash$"
+7E "\\asciitilde{}"
+A9 "\\copyright{}"
+AD "\\-"
+A0 "~"
+FEFF "{}"
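Reviewer note: the two tables above pair a hexadecimal character code with the TeX text to emit in its place. The following is a sketch of how such a table could be consulted, under the assumption that `\\` inside the quoted strings stands for a single backslash once the file is read. The names `struct repl`, `find_replacement` and `tex_lookup` are hypothetical and only a handful of entries are copied, so this is an illustration rather than catdoc's own lookup code.

```c
#include <assert.h>
#include <stddef.h>

struct repl {
	unsigned long code;	/* character code from the first column */
	const char *text;	/* text emitted instead of the character */
};

/* A few entries from the special-character table, shown after escape
 * processing (assumed: "\\" in the file means one backslash). */
static const struct repl tex_special[] = {
	{ 0x23UL, "\\#" },
	{ 0x24UL, "\\$" },
	{ 0x25UL, "\\%" },
	{ 0x26UL, "\\&" },
	{ 0x5FUL, "\\_" },
	{ 0xA0UL, "~" }
};

/* Linear search; returns NULL when the code has no replacement. */
const char *find_replacement(const struct repl *table, size_t count,
	unsigned long code)
{
	size_t i;

	assert(table != NULL);
	for (i = 0; i < count; i++) {
		if (table[i].code == code) return table[i].text;
	}
	return NULL;
}

/* Convenience wrapper around the sample table above. */
const char *tex_lookup(unsigned long code)
{
	return find_replacement(tex_special,
		sizeof tex_special / sizeof tex_special[0], code);
}
```

A NULL result means the character has no entry and would be written out through the normal charset conversion instead.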
--- /dev/null
+#
+# Name: US-ASCII to Unicode
+# Unicode version: 1.1
+0x20 0x0020 # SPACE
+0x21 0x0021 # EXCLAMATION MARK
+0x22 0x0022 # QUOTATION MARK
+0x23 0x0023 # NUMBER SIGN
+0x24 0x0024 # DOLLAR SIGN
+0x25 0x0025 # PERCENT SIGN
+0x26 0x0026 # AMPERSAND
+0x27 0x0027 # APOSTROPHE
+0x28 0x0028 # LEFT PARENTHESIS
+0x29 0x0029 # RIGHT PARENTHESIS
+0x2A 0x002A # ASTERISK
+0x2B 0x002B # PLUS SIGN
+0x2C 0x002C # COMMA
+0x2D 0x002D # HYPHEN-MINUS
+0x2E 0x002E # FULL STOP
+0x2F 0x002F # SOLIDUS
+0x30 0x0030 # DIGIT ZERO
+0x31 0x0031 # DIGIT ONE
+0x32 0x0032 # DIGIT TWO
+0x33 0x0033 # DIGIT THREE
+0x34 0x0034 # DIGIT FOUR
+0x35 0x0035 # DIGIT FIVE
+0x36 0x0036 # DIGIT SIX
+0x37 0x0037 # DIGIT SEVEN
+0x38 0x0038 # DIGIT EIGHT
+0x39 0x0039 # DIGIT NINE
+0x3A 0x003A # COLON
+0x3B 0x003B # SEMICOLON
+0x3C 0x003C # LESS-THAN SIGN
+0x3D 0x003D # EQUALS SIGN
+0x3E 0x003E # GREATER-THAN SIGN
+0x3F 0x003F # QUESTION MARK
+0x40 0x0040 # COMMERCIAL AT
+0x41 0x0041 # LATIN CAPITAL LETTER A
+0x42 0x0042 # LATIN CAPITAL LETTER B
+0x43 0x0043 # LATIN CAPITAL LETTER C
+0x44 0x0044 # LATIN CAPITAL LETTER D
+0x45 0x0045 # LATIN CAPITAL LETTER E
+0x46 0x0046 # LATIN CAPITAL LETTER F
+0x47 0x0047 # LATIN CAPITAL LETTER G
+0x48 0x0048 # LATIN CAPITAL LETTER H
+0x49 0x0049 # LATIN CAPITAL LETTER I
+0x4A 0x004A # LATIN CAPITAL LETTER J
+0x4B 0x004B # LATIN CAPITAL LETTER K
+0x4C 0x004C # LATIN CAPITAL LETTER L
+0x4D 0x004D # LATIN CAPITAL LETTER M
+0x4E 0x004E # LATIN CAPITAL LETTER N
+0x4F 0x004F # LATIN CAPITAL LETTER O
+0x50 0x0050 # LATIN CAPITAL LETTER P
+0x51 0x0051 # LATIN CAPITAL LETTER Q
+0x52 0x0052 # LATIN CAPITAL LETTER R
+0x53 0x0053 # LATIN CAPITAL LETTER S
+0x54 0x0054 # LATIN CAPITAL LETTER T
+0x55 0x0055 # LATIN CAPITAL LETTER U
+0x56 0x0056 # LATIN CAPITAL LETTER V
+0x57 0x0057 # LATIN CAPITAL LETTER W
+0x58 0x0058 # LATIN CAPITAL LETTER X
+0x59 0x0059 # LATIN CAPITAL LETTER Y
+0x5A 0x005A # LATIN CAPITAL LETTER Z
+0x5B 0x005B # LEFT SQUARE BRACKET
+0x5C 0x005C # REVERSE SOLIDUS
+0x5D 0x005D # RIGHT SQUARE BRACKET
+0x5E 0x005E # CIRCUMFLEX ACCENT
+0x5F 0x005F # LOW LINE
+0x60 0x0060 # GRAVE ACCENT
+0x61 0x0061 # LATIN SMALL LETTER A
+0x62 0x0062 # LATIN SMALL LETTER B
+0x63 0x0063 # LATIN SMALL LETTER C
+0x64 0x0064 # LATIN SMALL LETTER D
+0x65 0x0065 # LATIN SMALL LETTER E
+0x66 0x0066 # LATIN SMALL LETTER F
+0x67 0x0067 # LATIN SMALL LETTER G
+0x68 0x0068 # LATIN SMALL LETTER H
+0x69 0x0069 # LATIN SMALL LETTER I
+0x6A 0x006A # LATIN SMALL LETTER J
+0x6B 0x006B # LATIN SMALL LETTER K
+0x6C 0x006C # LATIN SMALL LETTER L
+0x6D 0x006D # LATIN SMALL LETTER M
+0x6E 0x006E # LATIN SMALL LETTER N
+0x6F 0x006F # LATIN SMALL LETTER O
+0x70 0x0070 # LATIN SMALL LETTER P
+0x71 0x0071 # LATIN SMALL LETTER Q
+0x72 0x0072 # LATIN SMALL LETTER R
+0x73 0x0073 # LATIN SMALL LETTER S
+0x74 0x0074 # LATIN SMALL LETTER T
+0x75 0x0075 # LATIN SMALL LETTER U
+0x76 0x0076 # LATIN SMALL LETTER V
+0x77 0x0077 # LATIN SMALL LETTER W
+0x78 0x0078 # LATIN SMALL LETTER X
+0x79 0x0079 # LATIN SMALL LETTER Y
+0x7A 0x007A # LATIN SMALL LETTER Z
+0x7B 0x007B # LEFT CURLY BRACKET
+0x7C 0x007C # VERTICAL LINE
+0x7D 0x007D # RIGHT CURLY BRACKET
+0x7E 0x007E # TILDE
--- /dev/null
+Makefile
+semantic.cache
--- /dev/null
+#include <stdlib.h>
+#include <stdio.h>
+#include <dos.h>
+#include <langinfo.h>
+
+static char *badParam="";
+char* __get_dos_codepage(void) {
+ static char codePageName[10];
+ union REGS regs;
+ regs.x.ax=0x6601;
+ intdos(&regs,&regs);
+ sprintf(codePageName,"cp%d",(regs.x.bx & 0xFFFF));
+ if (regs.x.cflag) {
+ return badParam;
+ }
+ return codePageName;
+}
+char *nl_langinfo(nl_item item) {
+ if (item == CODESET) {
+ return __get_dos_codepage();
+ } else {
+ return badParam;
+ }
+}
+
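Reviewer note: the shim above answers only `nl_langinfo(CODESET)` and returns an empty string for anything it cannot resolve. The following is a sketch of how a caller might consume that result, assuming a POSIX system where `<langinfo.h>` and `<locale.h>` are available. `output_charset` is a hypothetical name used for illustration and does not appear in the patch.

```c
#include <assert.h>
#include <locale.h>
#include <langinfo.h>

/* Hypothetical helper: return the charset name reported by the system,
 * or the supplied fallback when none is available. The DOS shim signals
 * failure with an empty string, so both NULL and "" are treated as
 * "unknown". */
const char *output_charset(const char *fallback)
{
	const char *name;

	assert(fallback != NULL);
	/* Pick up the locale from the environment; if this fails the
	 * program simply stays in the "C" locale. */
	setlocale(LC_CTYPE, "");
	name = nl_langinfo(CODESET);
	if (name == NULL || *name == '\0') return fallback;
	return name;
}
```

With the DOS shim in place the same call would yield a name of the form `cp` followed by the active code page number, since that is what `__get_dos_codepage` formats.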
--- /dev/null
+#ifndef LANGINFO_H
+#define LANGINFO_H
+typedef int nl_item;
+#define CODESET 1
+
+char *nl_langinfo(nl_item item);
+#endif
--- /dev/null
+#include <stdio.h>
+#include <string.h>
+#include <time.h>
+#include "strftime.h"
+
+static const char *monthAbbr[]={
+ "Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"};
+size_t strftime(char *s,size_t max, const char *format,
+ const struct tm *tm) {
+ size_t i=0,j=0;
+ while(j<max && format[i]) {
+ if (format[i]!='%') {
+ s[j]=format[i];
+ i++;j++;
+ } else {
+ i++;
+ if (!format[i]) break; /* lone '%' at end of format: stop before reading past the terminator */
+ switch (format[i]) {
+ case '%': s[j++]='%';break;
+ case 'm': if ((j+2) >=max) return 0;
+ j+=sprintf(s+j,"%02d",tm->tm_mon+1);
+ break;
+ case 'd': if ((j+2) >=max) return 0;
+ j+=sprintf(s+j,"%02d",tm->tm_mday);
+ break;
+ case 'y': if ((j+2) >=max) return 0;
+ j+=sprintf(s+j,"%02d",tm->tm_year%100);
+ break;
+ case 'Y': if ((j+4) >=max) return 0;
+ j+=sprintf(s+j,"%d",tm->tm_year+1900);
+ break;
+ case 'b': if ((j+3)>=max) return 0;
+ strcpy(s+j,monthAbbr[tm->tm_mon]);
+ j+=3;
+ break;
+ case 'l': if ((j+2) >= max) return 0;
+ { int hour = tm->tm_hour;
+ if (hour>12) hour -=12;
+ if (hour == 0) hour = 12;
+ j+=sprintf(s+j,"%2d",hour);
+ break;
+ }
+ case 'p': if ((j+2) >= max) return 0;
+ if (tm->tm_hour >11) {
+ strcpy(s+j,"PM");
+ } else {
+ strcpy(s+j,"AM");
+ }
+ j+=2;
+ break;
+ case 'H': if ((j+2) >= max) return 0;
+ j+=sprintf(s+j,"%02d",tm->tm_hour);
+ break;
+ case 'M':if ((j+2) >= max) return 0;
+ j+=sprintf(s+j,"%02d",tm->tm_min);
+ break;
+ case 'S':if ((j+2) >= max) return 0;
+ j+=sprintf(s+j,"%02d",tm->tm_sec);
+ break;
+ default:
+ ;
+ }
+ i++;
+ }
+ }
+ if (j>=max) return 0;
+ else {
+ s[j]=0;
+ return j;
+ }
+}
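Reviewer note: the fallback above handles the conversions `%%`, `%m`, `%d`, `%y`, `%Y`, `%b`, `%l`, `%p`, `%H`, `%M` and `%S`, and returns 0 when the result does not fit. The following is a sketch of a caller that stays inside that subset, shown against the standard library `strftime` from `<time.h>` so that it compiles on its own. `format_stamp` is a hypothetical helper, and `%l` is left out because it is not an ISO C conversion.

```c
#include <assert.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: format a broken-down time using only conversions
 * that the fallback implementation also supports. Returns the number of
 * characters written, or 0 when the buffer is too small. */
size_t format_stamp(char *buf, size_t max, const struct tm *tm)
{
	assert(buf != NULL && tm != NULL);
	return strftime(buf, max, "%d-%b-%Y %H:%M:%S", tm);
}
```

Callers should treat a return value of 0 as a failure and must not use the buffer contents in that case, which matches how the fallback reports overflow.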
--- /dev/null
+#ifndef STRFTIME_H
+#define STRFTIME_H
+#include <stdlib.h>
+#include <time.h>
+
+size_t strftime(char *s,size_t max, const char *format,
+ const struct tm *tm);
+
+#endif
--- /dev/null
+#ifndef UNISTD_H
+#define UNISTD_H
+#include <dos.h>
+#include <io.h>
+extern int optind;
+extern char *optarg;
+extern int opterr;
+int getopt(int argc, char *argv[], char *optionS);
+#endif
--- /dev/null
+#! /bin/sh
+
+# Guess values for system-dependent variables and create Makefiles.
+# Generated automatically using autoconf version 2.13
+# Copyright (C) 1992, 93, 94, 95, 96 Free Software Foundation, Inc.
+#
+# This configure script is free software; the Free Software Foundation
+# gives unlimited permission to copy, distribute and modify it.
+
+# Defaults:
+ac_help=
+ac_default_prefix=/usr/local
+# Any additions from configure.in:
+ac_help="$ac_help
+ --with-wish=fullpath specify the full path to the Tk interpreter to
+ skip the check for it, which requires X"
+ac_help="$ac_help
+ --with-install-root=path
+ install catdoc into a directory other than the
+ one the compiled-in path points to"
+ac_help="$ac_help
+ --with-input=charset set default charset to expect in 8-bit word files"
+ac_help="$ac_help
+ --with-output=charset set default charset to output"
+ac_help="$ac_help
+ --disable-charset-check allow make in charsets directory to succeed
+ even if files for default charsets are not found"
+ac_help="$ac_help
+ --disable-wordview Do not install tcl-tk
+ wrapper"
+ac_help="$ac_help
+ --disable-langinfo Do not use system
+locale for output encoding"
+
+# Initialize some variables set by options.
+# The variables have the same names as the options, with
+# dashes changed to underlines.
+build=NONE
+cache_file=./config.cache
+exec_prefix=NONE
+host=NONE
+no_create=
+nonopt=NONE
+no_recursion=
+prefix=NONE
+program_prefix=NONE
+program_suffix=NONE
+program_transform_name=s,x,x,
+silent=
+site=
+srcdir=
+target=NONE
+verbose=
+x_includes=NONE
+x_libraries=NONE
+bindir='${exec_prefix}/bin'
+sbindir='${exec_prefix}/sbin'
+libexecdir='${exec_prefix}/libexec'
+datadir='${prefix}/share'
+sysconfdir='${prefix}/etc'
+sharedstatedir='${prefix}/com'
+localstatedir='${prefix}/var'
+libdir='${exec_prefix}/lib'
+includedir='${prefix}/include'
+oldincludedir='/usr/include'
+infodir='${prefix}/info'
+mandir='${prefix}/man'
+
+# Initialize some other variables.
+subdirs=
+MFLAGS= MAKEFLAGS=
+SHELL=${CONFIG_SHELL-/bin/sh}
+# Maximum number of lines to put in a shell here document.
+ac_max_here_lines=12
+
+ac_prev=
+for ac_option
+do
+
+ # If the previous option needs an argument, assign it.
+ if test -n "$ac_prev"; then
+ eval "$ac_prev=\$ac_option"
+ ac_prev=
+ continue
+ fi
+
+ case "$ac_option" in
+ -*=*) ac_optarg=`echo "$ac_option" | sed 's/[-_a-zA-Z0-9]*=//'` ;;
+ *) ac_optarg= ;;
+ esac
+
+ # Accept the important Cygnus configure options, so we can diagnose typos.
+
+ case "$ac_option" in
+
+ -bindir | --bindir | --bindi | --bind | --bin | --bi)
+ ac_prev=bindir ;;
+ -bindir=* | --bindir=* | --bindi=* | --bind=* | --bin=* | --bi=*)
+ bindir="$ac_optarg" ;;
+
+ -build | --build | --buil | --bui | --bu)
+ ac_prev=build ;;
+ -build=* | --build=* | --buil=* | --bui=* | --bu=*)
+ build="$ac_optarg" ;;
+
+ -cache-file | --cache-file | --cache-fil | --cache-fi \
+ | --cache-f | --cache- | --cache | --cach | --cac | --ca | --c)
+ ac_prev=cache_file ;;
+ -cache-file=* | --cache-file=* | --cache-fil=* | --cache-fi=* \
+ | --cache-f=* | --cache-=* | --cache=* | --cach=* | --cac=* | --ca=* | --c=*)
+ cache_file="$ac_optarg" ;;
+
+ -datadir | --datadir | --datadi | --datad | --data | --dat | --da)
+ ac_prev=datadir ;;
+ -datadir=* | --datadir=* | --datadi=* | --datad=* | --data=* | --dat=* \
+ | --da=*)
+ datadir="$ac_optarg" ;;
+
+ -disable-* | --disable-*)
+ ac_feature=`echo $ac_option|sed -e 's/-*disable-//'`
+ # Reject names that are not valid shell variable names.
+ if test -n "`echo $ac_feature| sed 's/[-a-zA-Z0-9_]//g'`"; then
+ { echo "configure: error: $ac_feature: invalid feature name" 1>&2; exit 1; }
+ fi
+ ac_feature=`echo $ac_feature| sed 's/-/_/g'`
+ eval "enable_${ac_feature}=no" ;;
+
+ -enable-* | --enable-*)
+ ac_feature=`echo $ac_option|sed -e 's/-*enable-//' -e 's/=.*//'`
+ # Reject names that are not valid shell variable names.
+ if test -n "`echo $ac_feature| sed 's/[-_a-zA-Z0-9]//g'`"; then
+ { echo "configure: error: $ac_feature: invalid feature name" 1>&2; exit 1; }
+ fi
+ ac_feature=`echo $ac_feature| sed 's/-/_/g'`
+ case "$ac_option" in
+ *=*) ;;
+ *) ac_optarg=yes ;;
+ esac
+ eval "enable_${ac_feature}='$ac_optarg'" ;;
+
+ -exec-prefix | --exec_prefix | --exec-prefix | --exec-prefi \
+ | --exec-pref | --exec-pre | --exec-pr | --exec-p | --exec- \
+ | --exec | --exe | --ex)
+ ac_prev=exec_prefix ;;
+ -exec-prefix=* | --exec_prefix=* | --exec-prefix=* | --exec-prefi=* \
+ | --exec-pref=* | --exec-pre=* | --exec-pr=* | --exec-p=* | --exec-=* \
+ | --exec=* | --exe=* | --ex=*)
+ exec_prefix="$ac_optarg" ;;
+
+ -gas | --gas | --ga | --g)
+ # Obsolete; use --with-gas.
+ with_gas=yes ;;
+
+ -help | --help | --hel | --he)
+ # Omit some internal or obsolete options to make the list less imposing.
+ # This message is too long to be a string in the A/UX 3.1 sh.
+ cat << EOF
+Usage: configure [options] [host]
+Options: [defaults in brackets after descriptions]
+Configuration:
+ --cache-file=FILE cache test results in FILE
+ --help print this message
+ --no-create do not create output files
+ --quiet, --silent do not print \`checking...' messages
+ --version print the version of autoconf that created configure
+Directory and file names:
+ --prefix=PREFIX install architecture-independent files in PREFIX
+ [$ac_default_prefix]
+ --exec-prefix=EPREFIX install architecture-dependent files in EPREFIX
+ [same as prefix]
+ --bindir=DIR user executables in DIR [EPREFIX/bin]
+ --sbindir=DIR system admin executables in DIR [EPREFIX/sbin]
+ --libexecdir=DIR program executables in DIR [EPREFIX/libexec]
+ --datadir=DIR read-only architecture-independent data in DIR
+ [PREFIX/share]
+ --sysconfdir=DIR read-only single-machine data in DIR [PREFIX/etc]
+ --sharedstatedir=DIR modifiable architecture-independent data in DIR
+ [PREFIX/com]
+ --localstatedir=DIR modifiable single-machine data in DIR [PREFIX/var]
+ --libdir=DIR object code libraries in DIR [EPREFIX/lib]
+ --includedir=DIR C header files in DIR [PREFIX/include]
+ --oldincludedir=DIR C header files for non-gcc in DIR [/usr/include]
+ --infodir=DIR info documentation in DIR [PREFIX/info]
+ --mandir=DIR man documentation in DIR [PREFIX/man]
+ --srcdir=DIR find the sources in DIR [configure dir or ..]
+ --program-prefix=PREFIX prepend PREFIX to installed program names
+ --program-suffix=SUFFIX append SUFFIX to installed program names
+ --program-transform-name=PROGRAM
+ run sed PROGRAM on installed program names
+EOF
+ cat << EOF
+Host type:
+ --build=BUILD configure for building on BUILD [BUILD=HOST]
+ --host=HOST configure for HOST [guessed]
+ --target=TARGET configure for TARGET [TARGET=HOST]
+Features and packages:
+ --disable-FEATURE do not include FEATURE (same as --enable-FEATURE=no)
+ --enable-FEATURE[=ARG] include FEATURE [ARG=yes]
+ --with-PACKAGE[=ARG] use PACKAGE [ARG=yes]
+ --without-PACKAGE do not use PACKAGE (same as --with-PACKAGE=no)
+ --x-includes=DIR X include files are in DIR
+ --x-libraries=DIR X library files are in DIR
+EOF
+ if test -n "$ac_help"; then
+ echo "--enable and --with options recognized:$ac_help"
+ fi
+ exit 0 ;;
+
+ -host | --host | --hos | --ho)
+ ac_prev=host ;;
+ -host=* | --host=* | --hos=* | --ho=*)
+ host="$ac_optarg" ;;
+
+ -includedir | --includedir | --includedi | --included | --include \
+ | --includ | --inclu | --incl | --inc)
+ ac_prev=includedir ;;
+ -includedir=* | --includedir=* | --includedi=* | --included=* | --include=* \
+ | --includ=* | --inclu=* | --incl=* | --inc=*)
+ includedir="$ac_optarg" ;;
+
+ -infodir | --infodir | --infodi | --infod | --info | --inf)
+ ac_prev=infodir ;;
+ -infodir=* | --infodir=* | --infodi=* | --infod=* | --info=* | --inf=*)
+ infodir="$ac_optarg" ;;
+
+ -libdir | --libdir | --libdi | --libd)
+ ac_prev=libdir ;;
+ -libdir=* | --libdir=* | --libdi=* | --libd=*)
+ libdir="$ac_optarg" ;;
+
+ -libexecdir | --libexecdir | --libexecdi | --libexecd | --libexec \
+ | --libexe | --libex | --libe)
+ ac_prev=libexecdir ;;
+ -libexecdir=* | --libexecdir=* | --libexecdi=* | --libexecd=* | --libexec=* \
+ | --libexe=* | --libex=* | --libe=*)
+ libexecdir="$ac_optarg" ;;
+
+ -localstatedir | --localstatedir | --localstatedi | --localstated \
+ | --localstate | --localstat | --localsta | --localst \
+ | --locals | --local | --loca | --loc | --lo)
+ ac_prev=localstatedir ;;
+ -localstatedir=* | --localstatedir=* | --localstatedi=* | --localstated=* \
+ | --localstate=* | --localstat=* | --localsta=* | --localst=* \
+ | --locals=* | --local=* | --loca=* | --loc=* | --lo=*)
+ localstatedir="$ac_optarg" ;;
+
+ -mandir | --mandir | --mandi | --mand | --man | --ma | --m)
+ ac_prev=mandir ;;
+ -mandir=* | --mandir=* | --mandi=* | --mand=* | --man=* | --ma=* | --m=*)
+ mandir="$ac_optarg" ;;
+
+ -nfp | --nfp | --nf)
+ # Obsolete; use --without-fp.
+ with_fp=no ;;
+
+ -no-create | --no-create | --no-creat | --no-crea | --no-cre \
+ | --no-cr | --no-c)
+ no_create=yes ;;
+
+ -no-recursion | --no-recursion | --no-recursio | --no-recursi \
+ | --no-recurs | --no-recur | --no-recu | --no-rec | --no-re | --no-r)
+ no_recursion=yes ;;
+
+ -oldincludedir | --oldincludedir | --oldincludedi | --oldincluded \
+ | --oldinclude | --oldinclud | --oldinclu | --oldincl | --oldinc \
+ | --oldin | --oldi | --old | --ol | --o)
+ ac_prev=oldincludedir ;;
+ -oldincludedir=* | --oldincludedir=* | --oldincludedi=* | --oldincluded=* \
+ | --oldinclude=* | --oldinclud=* | --oldinclu=* | --oldincl=* | --oldinc=* \
+ | --oldin=* | --oldi=* | --old=* | --ol=* | --o=*)
+ oldincludedir="$ac_optarg" ;;
+
+ -prefix | --prefix | --prefi | --pref | --pre | --pr | --p)
+ ac_prev=prefix ;;
+ -prefix=* | --prefix=* | --prefi=* | --pref=* | --pre=* | --pr=* | --p=*)
+ prefix="$ac_optarg" ;;
+
+ -program-prefix | --program-prefix | --program-prefi | --program-pref \
+ | --program-pre | --program-pr | --program-p)
+ ac_prev=program_prefix ;;
+ -program-prefix=* | --program-prefix=* | --program-prefi=* \
+ | --program-pref=* | --program-pre=* | --program-pr=* | --program-p=*)
+ program_prefix="$ac_optarg" ;;
+
+ -program-suffix | --program-suffix | --program-suffi | --program-suff \
+ | --program-suf | --program-su | --program-s)
+ ac_prev=program_suffix ;;
+ -program-suffix=* | --program-suffix=* | --program-suffi=* \
+ | --program-suff=* | --program-suf=* | --program-su=* | --program-s=*)
+ program_suffix="$ac_optarg" ;;
+
+ -program-transform-name | --program-transform-name \
+ | --program-transform-nam | --program-transform-na \
+ | --program-transform-n | --program-transform- \
+ | --program-transform | --program-transfor \
+ | --program-transfo | --program-transf \
+ | --program-trans | --program-tran \
+ | --progr-tra | --program-tr | --program-t)
+ ac_prev=program_transform_name ;;
+ -program-transform-name=* | --program-transform-name=* \
+ | --program-transform-nam=* | --program-transform-na=* \
+ | --program-transform-n=* | --program-transform-=* \
+ | --program-transform=* | --program-transfor=* \
+ | --program-transfo=* | --program-transf=* \
+ | --program-trans=* | --program-tran=* \
+ | --progr-tra=* | --program-tr=* | --program-t=*)
+ program_transform_name="$ac_optarg" ;;
+
+ -q | -quiet | --quiet | --quie | --qui | --qu | --q \
+ | -silent | --silent | --silen | --sile | --sil)
+ silent=yes ;;
+
+ -sbindir | --sbindir | --sbindi | --sbind | --sbin | --sbi | --sb)
+ ac_prev=sbindir ;;
+ -sbindir=* | --sbindir=* | --sbindi=* | --sbind=* | --sbin=* \
+ | --sbi=* | --sb=*)
+ sbindir="$ac_optarg" ;;
+
+ -sharedstatedir | --sharedstatedir | --sharedstatedi \
+ | --sharedstated | --sharedstate | --sharedstat | --sharedsta \
+ | --sharedst | --shareds | --shared | --share | --shar \
+ | --sha | --sh)
+ ac_prev=sharedstatedir ;;
+ -sharedstatedir=* | --sharedstatedir=* | --sharedstatedi=* \
+ | --sharedstated=* | --sharedstate=* | --sharedstat=* | --sharedsta=* \
+ | --sharedst=* | --shareds=* | --shared=* | --share=* | --shar=* \
+ | --sha=* | --sh=*)
+ sharedstatedir="$ac_optarg" ;;
+
+ -site | --site | --sit)
+ ac_prev=site ;;
+ -site=* | --site=* | --sit=*)
+ site="$ac_optarg" ;;
+
+ -srcdir | --srcdir | --srcdi | --srcd | --src | --sr)
+ ac_prev=srcdir ;;
+ -srcdir=* | --srcdir=* | --srcdi=* | --srcd=* | --src=* | --sr=*)
+ srcdir="$ac_optarg" ;;
+
+ -sysconfdir | --sysconfdir | --sysconfdi | --sysconfd | --sysconf \
+ | --syscon | --sysco | --sysc | --sys | --sy)
+ ac_prev=sysconfdir ;;
+ -sysconfdir=* | --sysconfdir=* | --sysconfdi=* | --sysconfd=* | --sysconf=* \
+ | --syscon=* | --sysco=* | --sysc=* | --sys=* | --sy=*)
+ sysconfdir="$ac_optarg" ;;
+
+ -target | --target | --targe | --targ | --tar | --ta | --t)
+ ac_prev=target ;;
+ -target=* | --target=* | --targe=* | --targ=* | --tar=* | --ta=* | --t=*)
+ target="$ac_optarg" ;;
+
+ -v | -verbose | --verbose | --verbos | --verbo | --verb)
+ verbose=yes ;;
+
+ -version | --version | --versio | --versi | --vers)
+ echo "configure generated by autoconf version 2.13"
+ exit 0 ;;
+
+ -with-* | --with-*)
+ ac_package=`echo $ac_option|sed -e 's/-*with-//' -e 's/=.*//'`
+ # Reject names that are not valid shell variable names.
+ if test -n "`echo $ac_package| sed 's/[-_a-zA-Z0-9]//g'`"; then
+ { echo "configure: error: $ac_package: invalid package name" 1>&2; exit 1; }
+ fi
+ ac_package=`echo $ac_package| sed 's/-/_/g'`
+ case "$ac_option" in
+ *=*) ;;
+ *) ac_optarg=yes ;;
+ esac
+ eval "with_${ac_package}='$ac_optarg'" ;;
+
+ -without-* | --without-*)
+ ac_package=`echo $ac_option|sed -e 's/-*without-//'`
+ # Reject names that are not valid shell variable names.
+ if test -n "`echo $ac_package| sed 's/[-a-zA-Z0-9_]//g'`"; then
+ { echo "configure: error: $ac_package: invalid package name" 1>&2; exit 1; }
+ fi
+ ac_package=`echo $ac_package| sed 's/-/_/g'`
+ eval "with_${ac_package}=no" ;;
+
+ --x)
+ # Obsolete; use --with-x.
+ with_x=yes ;;
+
+ -x-includes | --x-includes | --x-include | --x-includ | --x-inclu \
+ | --x-incl | --x-inc | --x-in | --x-i)
+ ac_prev=x_includes ;;
+ -x-includes=* | --x-includes=* | --x-include=* | --x-includ=* | --x-inclu=* \
+ | --x-incl=* | --x-inc=* | --x-in=* | --x-i=*)
+ x_includes="$ac_optarg" ;;
+
+ -x-libraries | --x-libraries | --x-librarie | --x-librari \
+ | --x-librar | --x-libra | --x-libr | --x-lib | --x-li | --x-l)
+ ac_prev=x_libraries ;;
+ -x-libraries=* | --x-libraries=* | --x-librarie=* | --x-librari=* \
+ | --x-librar=* | --x-libra=* | --x-libr=* | --x-lib=* | --x-li=* | --x-l=*)
+ x_libraries="$ac_optarg" ;;
+
+ -*) { echo "configure: error: $ac_option: invalid option; use --help to show usage" 1>&2; exit 1; }
+ ;;
+
+ *)
+ if test -n "`echo $ac_option| sed 's/[-a-z0-9.]//g'`"; then
+ echo "configure: warning: $ac_option: invalid host type" 1>&2
+ fi
+ if test "x$nonopt" != xNONE; then
+ { echo "configure: error: can only configure for one host and one target at a time" 1>&2; exit 1; }
+ fi
+ nonopt="$ac_option"
+ ;;
+
+ esac
+done
+
+if test -n "$ac_prev"; then
+ { echo "configure: error: missing argument to --`echo $ac_prev | sed 's/_/-/g'`" 1>&2; exit 1; }
+fi
+
+trap 'rm -fr conftest* confdefs* core core.* *.core $ac_clean_files; exit 1' 1 2 15
+
+# File descriptor usage:
+# 0 standard input
+# 1 file creation
+# 2 errors and warnings
+# 3 some systems may open it to /dev/tty
+# 4 used on the Kubota Titan
+# 6 checking for... messages and results
+# 5 compiler messages saved in config.log
+if test "$silent" = yes; then
+ exec 6>/dev/null
+else
+ exec 6>&1
+fi
+exec 5>./config.log
+
+echo "\
+This file contains any messages produced by compilers while
+running configure, to aid debugging if configure makes a mistake.
+" 1>&5
+
+# Strip out --no-create and --no-recursion so they do not pile up.
+# Also quote any args containing shell metacharacters.
+ac_configure_args=
+for ac_arg
+do
+ case "$ac_arg" in
+ -no-create | --no-create | --no-creat | --no-crea | --no-cre \
+ | --no-cr | --no-c) ;;
+ -no-recursion | --no-recursion | --no-recursio | --no-recursi \
+ | --no-recurs | --no-recur | --no-recu | --no-rec | --no-re | --no-r) ;;
+ *" "*|*" "*|*[\[\]\~\#\$\^\&\*\(\)\{\}\\\|\;\<\>\?]*)
+ ac_configure_args="$ac_configure_args '$ac_arg'" ;;
+ *) ac_configure_args="$ac_configure_args $ac_arg" ;;
+ esac
+done
+
+# NLS nuisances.
+# Only set these to C if already set. These must not be set unconditionally
+# because not all systems understand e.g. LANG=C (notably SCO).
+# Fixing LC_MESSAGES prevents Solaris sh from translating var values in `set'!
+# Non-C LC_CTYPE values break the ctype check.
+if test "${LANG+set}" = set; then LANG=C; export LANG; fi
+if test "${LC_ALL+set}" = set; then LC_ALL=C; export LC_ALL; fi
+if test "${LC_MESSAGES+set}" = set; then LC_MESSAGES=C; export LC_MESSAGES; fi
+if test "${LC_CTYPE+set}" = set; then LC_CTYPE=C; export LC_CTYPE; fi
+
+# confdefs.h avoids OS command line length limits that DEFS can exceed.
+rm -rf conftest* confdefs.h
+# AIX cpp loses on an empty file, so make sure it contains at least a newline.
+echo > confdefs.h
+
+# A filename unique to this package, relative to the directory that
+# configure is in, which we can look for to find out if srcdir is correct.
+ac_unique_file=acconfig.h
+
+# Find the source files, if location was not specified.
+if test -z "$srcdir"; then
+ ac_srcdir_defaulted=yes
+ # Try the directory containing this script, then its parent.
+ ac_prog=$0
+ ac_confdir=`echo $ac_prog|sed 's%/[^/][^/]*$%%'`
+ test "x$ac_confdir" = "x$ac_prog" && ac_confdir=.
+ srcdir=$ac_confdir
+ if test ! -r $srcdir/$ac_unique_file; then
+ srcdir=..
+ fi
+else
+ ac_srcdir_defaulted=no
+fi
+if test ! -r $srcdir/$ac_unique_file; then
+ if test "$ac_srcdir_defaulted" = yes; then
+ { echo "configure: error: can not find sources in $ac_confdir or .." 1>&2; exit 1; }
+ else
+ { echo "configure: error: can not find sources in $srcdir" 1>&2; exit 1; }
+ fi
+fi
+srcdir=`echo "${srcdir}" | sed 's%\([^/]\)/*$%\1%'`
+
+# Prefer explicitly selected file to automatically selected ones.
+if test -z "$CONFIG_SITE"; then
+ if test "x$prefix" != xNONE; then
+ CONFIG_SITE="$prefix/share/config.site $prefix/etc/config.site"
+ else
+ CONFIG_SITE="$ac_default_prefix/share/config.site $ac_default_prefix/etc/config.site"
+ fi
+fi
+for ac_site_file in $CONFIG_SITE; do
+ if test -r "$ac_site_file"; then
+ echo "loading site script $ac_site_file"
+ . "$ac_site_file"
+ fi
+done
+
+if test -r "$cache_file"; then
+ echo "loading cache $cache_file"
+ . $cache_file
+else
+ echo "creating cache $cache_file"
+ > $cache_file
+fi
+
+ac_ext=c
+# CFLAGS is not in ac_cpp because -g, -O, etc. are not valid cpp options.
+ac_cpp='$CPP $CPPFLAGS'
+ac_compile='${CC-cc} -c $CFLAGS $CPPFLAGS conftest.$ac_ext 1>&5'
+ac_link='${CC-cc} -o conftest${ac_exeext} $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS 1>&5'
+cross_compiling=$ac_cv_prog_cc_cross
+
+ac_exeext=
+ac_objext=o
+if (echo "testing\c"; echo 1,2,3) | grep c >/dev/null; then
+ # Stardent Vistra SVR4 grep lacks -e, says ghazi@caip.rutgers.edu.
+ if (echo -n testing; echo 1,2,3) | sed s/-n/xn/ | grep xn >/dev/null; then
+ ac_n= ac_c='
+' ac_t=' '
+ else
+ ac_n=-n ac_c= ac_t=
+ fi
+else
+ ac_n= ac_c='\c' ac_t=
+fi
+
+
+catdoc_version=0.94
+# Extract the first word of "gcc", so it can be a program name with args.
+set dummy gcc; ac_word=$2
+echo $ac_n "checking for $ac_word""... $ac_c" 1>&6
+echo "configure:549: checking for $ac_word" >&5
+if eval "test \"`echo '$''{'ac_cv_prog_CC'+set}'`\" = set"; then
+ echo $ac_n "(cached) $ac_c" 1>&6
+else
+ if test -n "$CC"; then
+ ac_cv_prog_CC="$CC" # Let the user override the test.
+else
+ IFS="${IFS= }"; ac_save_ifs="$IFS"; IFS=":"
+ ac_dummy="$PATH"
+ for ac_dir in $ac_dummy; do
+ test -z "$ac_dir" && ac_dir=.
+ if test -f $ac_dir/$ac_word; then
+ ac_cv_prog_CC="gcc"
+ break
+ fi
+ done
+ IFS="$ac_save_ifs"
+fi
+fi
+CC="$ac_cv_prog_CC"
+if test -n "$CC"; then
+ echo "$ac_t""$CC" 1>&6
+else
+ echo "$ac_t""no" 1>&6
+fi
+
+if test -z "$CC"; then
+ # Extract the first word of "cc", so it can be a program name with args.
+set dummy cc; ac_word=$2
+echo $ac_n "checking for $ac_word""... $ac_c" 1>&6
+echo "configure:579: checking for $ac_word" >&5
+if eval "test \"`echo '$''{'ac_cv_prog_CC'+set}'`\" = set"; then
+ echo $ac_n "(cached) $ac_c" 1>&6
+else
+ if test -n "$CC"; then
+ ac_cv_prog_CC="$CC" # Let the user override the test.
+else
+ IFS="${IFS= }"; ac_save_ifs="$IFS"; IFS=":"
+ ac_prog_rejected=no
+ ac_dummy="$PATH"
+ for ac_dir in $ac_dummy; do
+ test -z "$ac_dir" && ac_dir=.
+ if test -f $ac_dir/$ac_word; then
+ if test "$ac_dir/$ac_word" = "/usr/ucb/cc"; then
+ ac_prog_rejected=yes
+ continue
+ fi
+ ac_cv_prog_CC="cc"
+ break
+ fi
+ done
+ IFS="$ac_save_ifs"
+if test $ac_prog_rejected = yes; then
+ # We found a bogon in the path, so make sure we never use it.
+ set dummy $ac_cv_prog_CC
+ shift
+ if test $# -gt 0; then
+ # We chose a different compiler from the bogus one.
+ # However, it has the same basename, so the bogon will be chosen
+ # first if we set CC to just the basename; use the full file name.
+ shift
+ set dummy "$ac_dir/$ac_word" "$@"
+ shift
+ ac_cv_prog_CC="$@"
+ fi
+fi
+fi
+fi
+CC="$ac_cv_prog_CC"
+if test -n "$CC"; then
+ echo "$ac_t""$CC" 1>&6
+else
+ echo "$ac_t""no" 1>&6
+fi
+
+ if test -z "$CC"; then
+ case "`uname -s`" in
+ *win32* | *WIN32*)
+ # Extract the first word of "cl", so it can be a program name with args.
+set dummy cl; ac_word=$2
+echo $ac_n "checking for $ac_word""... $ac_c" 1>&6
+echo "configure:630: checking for $ac_word" >&5
+if eval "test \"`echo '$''{'ac_cv_prog_CC'+set}'`\" = set"; then
+ echo $ac_n "(cached) $ac_c" 1>&6
+else
+ if test -n "$CC"; then
+ ac_cv_prog_CC="$CC" # Let the user override the test.
+else
+ IFS="${IFS= }"; ac_save_ifs="$IFS"; IFS=":"
+ ac_dummy="$PATH"
+ for ac_dir in $ac_dummy; do
+ test -z "$ac_dir" && ac_dir=.
+ if test -f $ac_dir/$ac_word; then
+ ac_cv_prog_CC="cl"
+ break
+ fi
+ done
+ IFS="$ac_save_ifs"
+fi
+fi
+CC="$ac_cv_prog_CC"
+if test -n "$CC"; then
+ echo "$ac_t""$CC" 1>&6
+else
+ echo "$ac_t""no" 1>&6
+fi
+ ;;
+ esac
+ fi
+ test -z "$CC" && { echo "configure: error: no acceptable cc found in \$PATH" 1>&2; exit 1; }
+fi
+
+echo $ac_n "checking whether the C compiler ($CC $CFLAGS $LDFLAGS) works""... $ac_c" 1>&6
+echo "configure:662: checking whether the C compiler ($CC $CFLAGS $LDFLAGS) works" >&5
+
+ac_ext=c
+# CFLAGS is not in ac_cpp because -g, -O, etc. are not valid cpp options.
+ac_cpp='$CPP $CPPFLAGS'
+ac_compile='${CC-cc} -c $CFLAGS $CPPFLAGS conftest.$ac_ext 1>&5'
+ac_link='${CC-cc} -o conftest${ac_exeext} $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS 1>&5'
+cross_compiling=$ac_cv_prog_cc_cross
+
+cat > conftest.$ac_ext << EOF
+
+#line 673 "configure"
+#include "confdefs.h"
+
+int main(){return(0);}
+EOF
+if { (eval echo configure:678: \"$ac_link\") 1>&5; (eval $ac_link) 2>&5; } && test -s conftest${ac_exeext}; then
+ ac_cv_prog_cc_works=yes
+ # If we can't run a trivial program, we are probably using a cross compiler.
+ if (./conftest; exit) 2>/dev/null; then
+ ac_cv_prog_cc_cross=no
+ else
+ ac_cv_prog_cc_cross=yes
+ fi
+else
+ echo "configure: failed program was:" >&5
+ cat conftest.$ac_ext >&5
+ ac_cv_prog_cc_works=no
+fi
+rm -fr conftest*
+ac_ext=c
+# CFLAGS is not in ac_cpp because -g, -O, etc. are not valid cpp options.
+ac_cpp='$CPP $CPPFLAGS'
+ac_compile='${CC-cc} -c $CFLAGS $CPPFLAGS conftest.$ac_ext 1>&5'
+ac_link='${CC-cc} -o conftest${ac_exeext} $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS 1>&5'
+cross_compiling=$ac_cv_prog_cc_cross
+
+echo "$ac_t""$ac_cv_prog_cc_works" 1>&6
+if test $ac_cv_prog_cc_works = no; then
+ { echo "configure: error: installation or configuration problem: C compiler cannot create executables." 1>&2; exit 1; }
+fi
+echo $ac_n "checking whether the C compiler ($CC $CFLAGS $LDFLAGS) is a cross-compiler""... $ac_c" 1>&6
+echo "configure:704: checking whether the C compiler ($CC $CFLAGS $LDFLAGS) is a cross-compiler" >&5
+echo "$ac_t""$ac_cv_prog_cc_cross" 1>&6
+cross_compiling=$ac_cv_prog_cc_cross
+
+echo $ac_n "checking whether we are using GNU C""... $ac_c" 1>&6
+echo "configure:709: checking whether we are using GNU C" >&5
+if eval "test \"`echo '$''{'ac_cv_prog_gcc'+set}'`\" = set"; then
+ echo $ac_n "(cached) $ac_c" 1>&6
+else
+ cat > conftest.c <<EOF
+#ifdef __GNUC__
+ yes;
+#endif
+EOF
+if { ac_try='${CC-cc} -E conftest.c'; { (eval echo configure:718: \"$ac_try\") 1>&5; (eval $ac_try) 2>&5; }; } | egrep yes >/dev/null 2>&1; then
+ ac_cv_prog_gcc=yes
+else
+ ac_cv_prog_gcc=no
+fi
+fi
+
+echo "$ac_t""$ac_cv_prog_gcc" 1>&6
+
+if test $ac_cv_prog_gcc = yes; then
+ GCC=yes
+else
+ GCC=
+fi
+
+ac_test_CFLAGS="${CFLAGS+set}"
+ac_save_CFLAGS="$CFLAGS"
+CFLAGS=
+echo $ac_n "checking whether ${CC-cc} accepts -g""... $ac_c" 1>&6
+echo "configure:737: checking whether ${CC-cc} accepts -g" >&5
+if eval "test \"`echo '$''{'ac_cv_prog_cc_g'+set}'`\" = set"; then
+ echo $ac_n "(cached) $ac_c" 1>&6
+else
+ echo 'void f(){}' > conftest.c
+if test -z "`${CC-cc} -g -c conftest.c 2>&1`"; then
+ ac_cv_prog_cc_g=yes
+else
+ ac_cv_prog_cc_g=no
+fi
+rm -f conftest*
+
+fi
+
+echo "$ac_t""$ac_cv_prog_cc_g" 1>&6
+if test "$ac_test_CFLAGS" = set; then
+ CFLAGS="$ac_save_CFLAGS"
+elif test $ac_cv_prog_cc_g = yes; then
+ if test "$GCC" = yes; then
+ CFLAGS="-g -O2"
+ else
+ CFLAGS="-g"
+ fi
+else
+ if test "$GCC" = yes; then
+ CFLAGS="-O2"
+ else
+ CFLAGS=
+ fi
+fi
+
+case ${CC} in
+*djgpp*) ac_cv_c_bigendian=no
+ ac_cv_func_setvbuf_reversed=no
+;;
+*)
+;;
+esac
+echo $ac_n "checking whether byte ordering is bigendian""... $ac_c" 1>&6
+echo "configure:776: checking whether byte ordering is bigendian" >&5
+if eval "test \"`echo '$''{'ac_cv_c_bigendian'+set}'`\" = set"; then
+ echo $ac_n "(cached) $ac_c" 1>&6
+else
+ ac_cv_c_bigendian=unknown
+# See if sys/param.h defines the BYTE_ORDER macro.
+cat > conftest.$ac_ext <<EOF
+#line 783 "configure"
+#include "confdefs.h"
+#include <sys/types.h>
+#include <sys/param.h>
+int main() {
+
+#if !BYTE_ORDER || !BIG_ENDIAN || !LITTLE_ENDIAN
+ bogus endian macros
+#endif
+; return 0; }
+EOF
+if { (eval echo configure:794: \"$ac_compile\") 1>&5; (eval $ac_compile) 2>&5; }; then
+ rm -rf conftest*
+ # It does; now see whether it is defined to BIG_ENDIAN or not.
+cat > conftest.$ac_ext <<EOF
+#line 798 "configure"
+#include "confdefs.h"
+#include <sys/types.h>
+#include <sys/param.h>
+int main() {
+
+#if BYTE_ORDER != BIG_ENDIAN
+ not big endian
+#endif
+; return 0; }
+EOF
+if { (eval echo configure:809: \"$ac_compile\") 1>&5; (eval $ac_compile) 2>&5; }; then
+ rm -rf conftest*
+ ac_cv_c_bigendian=yes
+else
+ echo "configure: failed program was:" >&5
+ cat conftest.$ac_ext >&5
+ rm -rf conftest*
+ ac_cv_c_bigendian=no
+fi
+rm -f conftest*
+else
+ echo "configure: failed program was:" >&5
+ cat conftest.$ac_ext >&5
+fi
+rm -f conftest*
+if test $ac_cv_c_bigendian = unknown; then
+if test "$cross_compiling" = yes; then
+ { echo "configure: error: can not run test program while cross compiling" 1>&2; exit 1; }
+else
+ cat > conftest.$ac_ext <<EOF
+#line 829 "configure"
+#include "confdefs.h"
+int main () {
+ /* Are we little or big endian? From Harbison&Steele. */
+ union
+ {
+ long l;
+ char c[sizeof (long)];
+ } u;
+ u.l = 1;
+ return (u.c[sizeof (long) - 1] == 1);
+}
+EOF
+if { (eval echo configure:842: \"$ac_link\") 1>&5; (eval $ac_link) 2>&5; } && test -s conftest${ac_exeext} && (./conftest; exit) 2>/dev/null
+then
+ ac_cv_c_bigendian=no
+else
+ echo "configure: failed program was:" >&5
+ cat conftest.$ac_ext >&5
+ rm -fr conftest*
+ ac_cv_c_bigendian=yes
+fi
+rm -fr conftest*
+fi
+
+fi
+fi
+
+echo "$ac_t""$ac_cv_c_bigendian" 1>&6
+if test $ac_cv_c_bigendian = yes; then
+ cat >> confdefs.h <<\EOF
+#define WORDS_BIGENDIAN 1
+EOF
+
+fi
+
+ac_aux_dir=
+for ac_dir in $srcdir $srcdir/.. $srcdir/../..; do
+ if test -f $ac_dir/install-sh; then
+ ac_aux_dir=$ac_dir
+ ac_install_sh="$ac_aux_dir/install-sh -c"
+ break
+ elif test -f $ac_dir/install.sh; then
+ ac_aux_dir=$ac_dir
+ ac_install_sh="$ac_aux_dir/install.sh -c"
+ break
+ fi
+done
+if test -z "$ac_aux_dir"; then
+ { echo "configure: error: can not find install-sh or install.sh in $srcdir $srcdir/.. $srcdir/../.." 1>&2; exit 1; }
+fi
+ac_config_guess=$ac_aux_dir/config.guess
+ac_config_sub=$ac_aux_dir/config.sub
+ac_configure=$ac_aux_dir/configure # This should be Cygnus configure.
+
+# Find a good install program. We prefer a C program (faster),
+# so one script is as good as another. But avoid the broken or
+# incompatible versions:
+# SysV /etc/install, /usr/sbin/install
+# SunOS /usr/etc/install
+# IRIX /sbin/install
+# AIX /bin/install
+# AIX 4 /usr/bin/installbsd, which doesn't work without a -g flag
+# AFS /usr/afsws/bin/install, which mishandles nonexistent args
+# SVR4 /usr/ucb/install, which tries to use the nonexistent group "staff"
+# ./install, which can be erroneously created by make from ./install.sh.
+echo $ac_n "checking for a BSD compatible install""... $ac_c" 1>&6
+echo "configure:896: checking for a BSD compatible install" >&5
+if test -z "$INSTALL"; then
+if eval "test \"`echo '$''{'ac_cv_path_install'+set}'`\" = set"; then
+ echo $ac_n "(cached) $ac_c" 1>&6
+else
+ IFS="${IFS= }"; ac_save_IFS="$IFS"; IFS=":"
+ for ac_dir in $PATH; do
+ # Account for people who put trailing slashes in PATH elements.
+ case "$ac_dir/" in
+ /|./|.//|/etc/*|/usr/sbin/*|/usr/etc/*|/sbin/*|/usr/afsws/bin/*|/usr/ucb/*) ;;
+ *)
+ # OSF1 and SCO ODT 3.0 have their own names for install.
+ # Don't use installbsd from OSF since it installs stuff as root
+ # by default.
+ for ac_prog in ginstall scoinst install; do
+ if test -f $ac_dir/$ac_prog; then
+ if test $ac_prog = install &&
+ grep dspmsg $ac_dir/$ac_prog >/dev/null 2>&1; then
+ # AIX install. It has an incompatible calling convention.
+ :
+ else
+ ac_cv_path_install="$ac_dir/$ac_prog -c"
+ break 2
+ fi
+ fi
+ done
+ ;;
+ esac
+ done
+ IFS="$ac_save_IFS"
+
+fi
+ if test "${ac_cv_path_install+set}" = set; then
+ INSTALL="$ac_cv_path_install"
+ else
+ # As a last resort, use the slow shell script. We don't cache a
+ # path for INSTALL within a source directory, because that will
+ # break other packages using the cache if that directory is
+ # removed, or if the path is relative.
+ INSTALL="$ac_install_sh"
+ fi
+fi
+echo "$ac_t""$INSTALL" 1>&6
+
+# Use test -z because SunOS4 sh mishandles braces in ${var-val}.
+# It thinks the first close brace ends the variable substitution.
+test -z "$INSTALL_PROGRAM" && INSTALL_PROGRAM='${INSTALL}'
+
+test -z "$INSTALL_SCRIPT" && INSTALL_SCRIPT='${INSTALL_PROGRAM}'
+
+test -z "$INSTALL_DATA" && INSTALL_DATA='${INSTALL} -m 644'
+
+if test "$enable_wordview" != "no" ;then
+# Check whether --with-wish or --without-wish was given.
+if test "${with_wish+set}" = set; then
+ withval="$with_wish"
+ WISH=$withval
+else
+
+for ac_prog in wish wish8.1 wish8.2 wish8.3 wish8.4
+do
+# Extract the first word of "$ac_prog", so it can be a program name with args.
+set dummy $ac_prog; ac_word=$2
+echo $ac_n "checking for $ac_word""... $ac_c" 1>&6
+echo "configure:960: checking for $ac_word" >&5
+if eval "test \"`echo '$''{'ac_cv_path_WISH'+set}'`\" = set"; then
+ echo $ac_n "(cached) $ac_c" 1>&6
+else
+ case "$WISH" in
+ /*)
+ ac_cv_path_WISH="$WISH" # Let the user override the test with a path.
+ ;;
+ ?:/*)
+ ac_cv_path_WISH="$WISH" # Let the user override the test with a dos path.
+ ;;
+ *)
+ IFS="${IFS= }"; ac_save_ifs="$IFS"; IFS=":"
+ ac_dummy="$PATH"
+ for ac_dir in $ac_dummy; do
+ test -z "$ac_dir" && ac_dir=.
+ if test -f $ac_dir/$ac_word; then
+ ac_cv_path_WISH="$ac_dir/$ac_word"
+ break
+ fi
+ done
+ IFS="$ac_save_ifs"
+ ;;
+esac
+fi
+WISH="$ac_cv_path_WISH"
+if test -n "$WISH"; then
+ echo "$ac_t""$WISH" 1>&6
+else
+ echo "$ac_t""no" 1>&6
+fi
+
+test -n "$WISH" && break
+done
+
+if test -n "$WISH"; then
+echo $ac_n "checking if wish version is 8.1 or above""... $ac_c" 1>&6
+echo "configure:997: checking if wish version is 8.1 or above" >&5
+if eval "test \"`echo '$''{'GOOD_WISH'+set}'`\" = set"; then
+ echo $ac_n "(cached) $ac_c" 1>&6
+else
+ echo 'if {[info tclversion]>=7.6} {
+ puts -nonewline "yes"
+} else {
+ puts -nonewline "no"
+}
+exit'>tmp$$.tcl
+GOOD_WISH=`$WISH tmp$$.tcl`
+rm tmp$$.tcl
+
+fi
+
+echo "$ac_t""$GOOD_WISH" 1>&6
+if test "$GOOD_WISH" != yes; then
+enable_wordview=no
+fi
+fi
+
+fi
+
+fi
+# Check whether --with-install-root or --without-install-root was given.
+if test "${with_install_root+set}" = set; then
+ withval="$with_install_root"
+ installroot=$withval
+fi
+
+replsuffix=.replchars
+specsuffix=.specchars
+targetcharset=koi8-r
+sourcecharset=cp1251
+# Check whether --with-input or --without-input was given.
+if test "${with_input+set}" = set; then
+ withval="$with_input"
+ sourcecharset=$withval
+else
+ sourcecharset=cp1251
+fi
+
+# Check whether --with-output or --without-output was given.
+if test "${with_output+set}" = set; then
+ withval="$with_output"
+ targetcharset=$withval
+else
+ targetcharset=koi8-r
+fi
+
+if test "$targetcharset" = "utf-8"; then
+charsetcheck="$sourcecharset.txt"
+else
+charsetcheck="$sourcecharset.txt $targetcharset.txt"
+fi
+# Check whether --enable-charset-check or --disable-charset-check was given.
+if test "${enable_charset_check+set}" = set; then
+ enableval="$enable_charset_check"
+ if test "$enable_charset_check" = no;
+then
+ charsetcheck=""
+fi
+fi
+
+test -z "$manext" && manext=.1
+test -z "$man1dir" && man1dir=\${prefix}/man/man1
+if test -n "$WISH"; then
+# Check whether --enable-wordview or --disable-wordview was given.
+if test "${enable_wordview+set}" = set; then
+ enableval="$enable_wordview"
+ :
+fi
+
+fi
+if test "$enable_wordview" = no; then
+ installtargets=install-catdoc
+ buildtargets="catdoc xls2csv catppt"
+else
+ installtargets="install-catdoc install-wordview"
+ buildtargets="catdoc xls2csv catppt wordview"
+fi
+
+echo $ac_n "checking how to run the C preprocessor""... $ac_c" 1>&6
+echo "configure:1080: checking how to run the C preprocessor" >&5
+# On Suns, sometimes $CPP names a directory.
+if test -n "$CPP" && test -d "$CPP"; then
+ CPP=
+fi
+if test -z "$CPP"; then
+if eval "test \"`echo '$''{'ac_cv_prog_CPP'+set}'`\" = set"; then
+ echo $ac_n "(cached) $ac_c" 1>&6
+else
+ # This must be in double quotes, not single quotes, because CPP may get
+ # substituted into the Makefile and "${CC-cc}" will confuse make.
+ CPP="${CC-cc} -E"
+ # On the NeXT, cc -E runs the code through the compiler's parser,
+ # not just through cpp.
+ cat > conftest.$ac_ext <<EOF
+#line 1095 "configure"
+#include "confdefs.h"
+#include <assert.h>
+Syntax Error
+EOF
+ac_try="$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out"
+{ (eval echo configure:1101: \"$ac_try\") 1>&5; (eval $ac_try) 2>&5; }
+ac_err=`grep -v '^ *+' conftest.out | grep -v "^conftest.${ac_ext}\$"`
+if test -z "$ac_err"; then
+ :
+else
+ echo "$ac_err" >&5
+ echo "configure: failed program was:" >&5
+ cat conftest.$ac_ext >&5
+ rm -rf conftest*
+ CPP="${CC-cc} -E -traditional-cpp"
+ cat > conftest.$ac_ext <<EOF
+#line 1112 "configure"
+#include "confdefs.h"
+#include <assert.h>
+Syntax Error
+EOF
+ac_try="$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out"
+{ (eval echo configure:1118: \"$ac_try\") 1>&5; (eval $ac_try) 2>&5; }
+ac_err=`grep -v '^ *+' conftest.out | grep -v "^conftest.${ac_ext}\$"`
+if test -z "$ac_err"; then
+ :
+else
+ echo "$ac_err" >&5
+ echo "configure: failed program was:" >&5
+ cat conftest.$ac_ext >&5
+ rm -rf conftest*
+ CPP="${CC-cc} -nologo -E"
+ cat > conftest.$ac_ext <<EOF
+#line 1129 "configure"
+#include "confdefs.h"
+#include <assert.h>
+Syntax Error
+EOF
+ac_try="$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out"
+{ (eval echo configure:1135: \"$ac_try\") 1>&5; (eval $ac_try) 2>&5; }
+ac_err=`grep -v '^ *+' conftest.out | grep -v "^conftest.${ac_ext}\$"`
+if test -z "$ac_err"; then
+ :
+else
+ echo "$ac_err" >&5
+ echo "configure: failed program was:" >&5
+ cat conftest.$ac_ext >&5
+ rm -rf conftest*
+ CPP=/lib/cpp
+fi
+rm -f conftest*
+fi
+rm -f conftest*
+fi
+rm -f conftest*
+ ac_cv_prog_CPP="$CPP"
+fi
+ CPP="$ac_cv_prog_CPP"
+else
+ ac_cv_prog_CPP="$CPP"
+fi
+echo "$ac_t""$CPP" 1>&6
+
+echo $ac_n "checking for ANSI C header files""... $ac_c" 1>&6
+echo "configure:1160: checking for ANSI C header files" >&5
+if eval "test \"`echo '$''{'ac_cv_header_stdc'+set}'`\" = set"; then
+ echo $ac_n "(cached) $ac_c" 1>&6
+else
+ cat > conftest.$ac_ext <<EOF
+#line 1165 "configure"
+#include "confdefs.h"
+#include <stdlib.h>
+#include <stdarg.h>
+#include <string.h>
+#include <float.h>
+EOF
+ac_try="$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out"
+{ (eval echo configure:1173: \"$ac_try\") 1>&5; (eval $ac_try) 2>&5; }
+ac_err=`grep -v '^ *+' conftest.out | grep -v "^conftest.${ac_ext}\$"`
+if test -z "$ac_err"; then
+ rm -rf conftest*
+ ac_cv_header_stdc=yes
+else
+ echo "$ac_err" >&5
+ echo "configure: failed program was:" >&5
+ cat conftest.$ac_ext >&5
+ rm -rf conftest*
+ ac_cv_header_stdc=no
+fi
+rm -f conftest*
+
+if test $ac_cv_header_stdc = yes; then
+ # SunOS 4.x string.h does not declare mem*, contrary to ANSI.
+cat > conftest.$ac_ext <<EOF
+#line 1190 "configure"
+#include "confdefs.h"
+#include <string.h>
+EOF
+if (eval "$ac_cpp conftest.$ac_ext") 2>&5 |
+ egrep "memchr" >/dev/null 2>&1; then
+ :
+else
+ rm -rf conftest*
+ ac_cv_header_stdc=no
+fi
+rm -f conftest*
+
+fi
+
+if test $ac_cv_header_stdc = yes; then
+ # ISC 2.0.2 stdlib.h does not declare free, contrary to ANSI.
+cat > conftest.$ac_ext <<EOF
+#line 1208 "configure"
+#include "confdefs.h"
+#include <stdlib.h>
+EOF
+if (eval "$ac_cpp conftest.$ac_ext") 2>&5 |
+ egrep "free" >/dev/null 2>&1; then
+ :
+else
+ rm -rf conftest*
+ ac_cv_header_stdc=no
+fi
+rm -f conftest*
+
+fi
+
+if test $ac_cv_header_stdc = yes; then
+ # /bin/cc in Irix-4.0.5 gets non-ANSI ctype macros unless using -ansi.
+if test "$cross_compiling" = yes; then
+ :
+else
+ cat > conftest.$ac_ext <<EOF
+#line 1229 "configure"
+#include "confdefs.h"
+#include <ctype.h>
+#define ISLOWER(c) ('a' <= (c) && (c) <= 'z')
+#define TOUPPER(c) (ISLOWER(c) ? 'A' + ((c) - 'a') : (c))
+#define XOR(e, f) (((e) && !(f)) || (!(e) && (f)))
+int main () { int i; for (i = 0; i < 256; i++)
+if (XOR (islower (i), ISLOWER (i)) || toupper (i) != TOUPPER (i)) return 2;
+return 0; }
+
+EOF
+if { (eval echo configure:1240: \"$ac_link\") 1>&5; (eval $ac_link) 2>&5; } && test -s conftest${ac_exeext} && (./conftest; exit) 2>/dev/null
+then
+ :
+else
+ echo "configure: failed program was:" >&5
+ cat conftest.$ac_ext >&5
+ rm -fr conftest*
+ ac_cv_header_stdc=no
+fi
+rm -fr conftest*
+fi
+
+fi
+fi
+
+echo "$ac_t""$ac_cv_header_stdc" 1>&6
+if test $ac_cv_header_stdc = yes; then
+ cat >> confdefs.h <<\EOF
+#define STDC_HEADERS 1
+EOF
+
+fi
+
+for ac_hdr in unistd.h
+do
+ac_safe=`echo "$ac_hdr" | sed 'y%./+-%__p_%'`
+echo $ac_n "checking for $ac_hdr""... $ac_c" 1>&6
+echo "configure:1267: checking for $ac_hdr" >&5
+if eval "test \"`echo '$''{'ac_cv_header_$ac_safe'+set}'`\" = set"; then
+ echo $ac_n "(cached) $ac_c" 1>&6
+else
+ cat > conftest.$ac_ext <<EOF
+#line 1272 "configure"
+#include "confdefs.h"
+#include <$ac_hdr>
+EOF
+ac_try="$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out"
+{ (eval echo configure:1277: \"$ac_try\") 1>&5; (eval $ac_try) 2>&5; }
+ac_err=`grep -v '^ *+' conftest.out | grep -v "^conftest.${ac_ext}\$"`
+if test -z "$ac_err"; then
+ rm -rf conftest*
+ eval "ac_cv_header_$ac_safe=yes"
+else
+ echo "$ac_err" >&5
+ echo "configure: failed program was:" >&5
+ cat conftest.$ac_ext >&5
+ rm -rf conftest*
+ eval "ac_cv_header_$ac_safe=no"
+fi
+rm -f conftest*
+fi
+if eval "test \"`echo '$ac_cv_header_'$ac_safe`\" = yes"; then
+ echo "$ac_t""yes" 1>&6
+ ac_tr_hdr=HAVE_`echo $ac_hdr | sed 'y%abcdefghijklmnopqrstuvwxyz./-%ABCDEFGHIJKLMNOPQRSTUVWXYZ___%'`
+ cat >> confdefs.h <<EOF
+#define $ac_tr_hdr 1
+EOF
+
+else
+ echo "$ac_t""no" 1>&6
+fi
+done
+
+
+if test -z "$enable_langinfo"; then
+enable_langinfo=yes
+fi
+# Check whether --enable-langinfo or --disable-langinfo was given.
+if test "${enable_langinfo+set}" = set; then
+ enableval="$enable_langinfo"
+
+ if test "$enable_langinfo" = yes; then
+ ac_safe=`echo "langinfo.h" | sed 'y%./+-%__p_%'`
+echo $ac_n "checking for langinfo.h""... $ac_c" 1>&6
+echo "configure:1314: checking for langinfo.h" >&5
+if eval "test \"`echo '$''{'ac_cv_header_$ac_safe'+set}'`\" = set"; then
+ echo $ac_n "(cached) $ac_c" 1>&6
+else
+ cat > conftest.$ac_ext <<EOF
+#line 1319 "configure"
+#include "confdefs.h"
+#include <langinfo.h>
+EOF
+ac_try="$ac_cpp conftest.$ac_ext >/dev/null 2>conftest.out"
+{ (eval echo configure:1324: \"$ac_try\") 1>&5; (eval $ac_try) 2>&5; }
+ac_err=`grep -v '^ *+' conftest.out | grep -v "^conftest.${ac_ext}\$"`
+if test -z "$ac_err"; then
+ rm -rf conftest*
+ eval "ac_cv_header_$ac_safe=yes"
+else
+ echo "$ac_err" >&5
+ echo "configure: failed program was:" >&5
+ cat conftest.$ac_ext >&5
+ rm -rf conftest*
+ eval "ac_cv_header_$ac_safe=no"
+fi
+rm -f conftest*
+fi
+if eval "test \"`echo '$ac_cv_header_'$ac_safe`\" = yes"; then
+ echo "$ac_t""yes" 1>&6
+
+ echo $ac_n "checking for nl_langinfo(CODESET)""... $ac_c" 1>&6
+echo "configure:1342: checking for nl_langinfo(CODESET)" >&5
+cat > conftest.$ac_ext <<EOF
+#line 1344 "configure"
+#include "confdefs.h"
+#include <langinfo.h>
+int main() {
+nl_langinfo(CODESET);
+; return 0; }
+EOF
+if { (eval echo configure:1351: \"$ac_compile\") 1>&5; (eval $ac_compile) 2>&5; }; then
+ rm -rf conftest*
+
+ cat >> confdefs.h <<\EOF
+#define HAVE_LANGINFO 1
+EOF
+
+ enable_langinfo=yes
+
+else
+ echo "configure: failed program was:" >&5
+ cat conftest.$ac_ext >&5
+ rm -rf conftest*
+ enable_langinfo=no
+fi
+rm -f conftest*
+echo "$ac_t""$enable_langinfo" 1>&6
+
+else
+ echo "$ac_t""no" 1>&6
+enable_langinfo=no
+fi
+
+
+ fi
+fi
+
+
+
+echo $ac_n "checking for working const""... $ac_c" 1>&6
+echo "configure:1381: checking for working const" >&5
+if eval "test \"`echo '$''{'ac_cv_c_const'+set}'`\" = set"; then
+ echo $ac_n "(cached) $ac_c" 1>&6
+else
+ cat > conftest.$ac_ext <<EOF
+#line 1386 "configure"
+#include "confdefs.h"
+
+int main() {
+
+/* Ultrix mips cc rejects this. */
+typedef int charset[2]; const charset x;
+/* SunOS 4.1.1 cc rejects this. */
+char const *const *ccp;
+char **p;
+/* NEC SVR4.0.2 mips cc rejects this. */
+struct point {int x, y;};
+static struct point const zero = {0,0};
+/* AIX XL C 1.02.0.0 rejects this.
+ It does not let you subtract one const X* pointer from another in an arm
+ of an if-expression whose if-part is not a constant expression */
+const char *g = "string";
+ccp = &g + (g ? g-g : 0);
+/* HPUX 7.0 cc rejects these. */
+++ccp;
+p = (char**) ccp;
+ccp = (char const *const *) p;
+{ /* SCO 3.2v4 cc rejects this. */
+ char *t;
+ char const *s = 0 ? (char *) 0 : (char const *) 0;
+
+ *t++ = 0;
+}
+{ /* Someone thinks the Sun supposedly-ANSI compiler will reject this. */
+ int x[] = {25, 17};
+ const int *foo = &x[0];
+ ++foo;
+}
+{ /* Sun SC1.0 ANSI compiler rejects this -- but not the above. */
+ typedef const int *iptr;
+ iptr p = 0;
+ ++p;
+}
+{ /* AIX XL C 1.02.0.0 rejects this saying
+ "k.c", line 2.27: 1506-025 (S) Operand must be a modifiable lvalue. */
+ struct s { int j; const int *ap[3]; };
+ struct s *b; b->j = 5;
+}
+{ /* ULTRIX-32 V3.1 (Rev 9) vcc rejects this */
+ const int foo = 10;
+}
+
+; return 0; }
+EOF
+if { (eval echo configure:1435: \"$ac_compile\") 1>&5; (eval $ac_compile) 2>&5; }; then
+ rm -rf conftest*
+ ac_cv_c_const=yes
+else
+ echo "configure: failed program was:" >&5
+ cat conftest.$ac_ext >&5
+ rm -rf conftest*
+ ac_cv_c_const=no
+fi
+rm -f conftest*
+fi
+
+echo "$ac_t""$ac_cv_c_const" 1>&6
+if test $ac_cv_c_const = no; then
+ cat >> confdefs.h <<\EOF
+#define const
+EOF
+
+fi
+
+
+echo $ac_n "checking whether setvbuf arguments are reversed""... $ac_c" 1>&6
+echo "configure:1457: checking whether setvbuf arguments are reversed" >&5
+if eval "test \"`echo '$''{'ac_cv_func_setvbuf_reversed'+set}'`\" = set"; then
+ echo $ac_n "(cached) $ac_c" 1>&6
+else
+ if test "$cross_compiling" = yes; then
+ { echo "configure: error: can not run test program while cross compiling" 1>&2; exit 1; }
+else
+ cat > conftest.$ac_ext <<EOF
+#line 1465 "configure"
+#include "confdefs.h"
+#include <stdio.h>
+/* If setvbuf has the reversed format, exit 0. */
+main () {
+ /* This call has the arguments reversed.
+ A reversed system may check and see that the address of main
+ is not _IOLBF, _IONBF, or _IOFBF, and return nonzero. */
+ if (setvbuf(stdout, _IOLBF, (char *) main, BUFSIZ) != 0)
+ exit(1);
+ putc('\r', stdout);
+ exit(0); /* Non-reversed systems segv here. */
+}
+EOF
+if { (eval echo configure:1479: \"$ac_link\") 1>&5; (eval $ac_link) 2>&5; } && test -s conftest${ac_exeext} && (./conftest; exit) 2>/dev/null
+then
+ ac_cv_func_setvbuf_reversed=yes
+else
+ echo "configure: failed program was:" >&5
+ cat conftest.$ac_ext >&5
+ rm -fr conftest*
+ ac_cv_func_setvbuf_reversed=no
+fi
+rm -fr conftest*
+fi
+
+rm -f core core.* *.core
+fi
+
+echo "$ac_t""$ac_cv_func_setvbuf_reversed" 1>&6
+if test $ac_cv_func_setvbuf_reversed = yes; then
+ cat >> confdefs.h <<\EOF
+#define SETVBUF_REVERSED 1
+EOF
+
+fi
+
+
+if test "$GCC" = "yes"; then
+ CFLAGS="$CFLAGS -Wall"
+fi
+
+cat >> confdefs.h <<EOF
+#define SOURCE_CHARSET "$sourcecharset"
+EOF
+
+cat >> confdefs.h <<EOF
+#define TARGET_CHARSET "$targetcharset"
+EOF
+
+cat >> confdefs.h <<EOF
+#define SPEC_EXT "$specsuffix"
+EOF
+
+cat >> confdefs.h <<EOF
+#define REPL_EXT "$replsuffix"
+EOF
+
+cat >> confdefs.h <<EOF
+#define UNKNOWN_CHAR "?"
+EOF
+
+for ac_func in strdup strtol
+do
+echo $ac_n "checking for $ac_func""... $ac_c" 1>&6
+echo "configure:1530: checking for $ac_func" >&5
+if eval "test \"`echo '$''{'ac_cv_func_$ac_func'+set}'`\" = set"; then
+ echo $ac_n "(cached) $ac_c" 1>&6
+else
+ cat > conftest.$ac_ext <<EOF
+#line 1535 "configure"
+#include "confdefs.h"
+/* System header to define __stub macros and hopefully few prototypes,
+ which can conflict with char $ac_func(); below. */
+#include <assert.h>
+/* Override any gcc2 internal prototype to avoid an error. */
+/* We use char because int might match the return type of a gcc2
+ builtin and then its argument prototype would still apply. */
+char $ac_func();
+
+int main() {
+
+/* The GNU C library defines this for functions which it implements
+ to always fail with ENOSYS. Some functions are actually named
+ something starting with __ and the normal name is an alias. */
+#if defined (__stub_$ac_func) || defined (__stub___$ac_func)
+choke me
+#else
+$ac_func();
+#endif
+
+; return 0; }
+EOF
+if { (eval echo configure:1558: \"$ac_link\") 1>&5; (eval $ac_link) 2>&5; } && test -s conftest${ac_exeext}; then
+ rm -rf conftest*
+ eval "ac_cv_func_$ac_func=yes"
+else
+ echo "configure: failed program was:" >&5
+ cat conftest.$ac_ext >&5
+ rm -rf conftest*
+ eval "ac_cv_func_$ac_func=no"
+fi
+rm -f conftest*
+fi
+
+if eval "test \"`echo '$ac_cv_func_'$ac_func`\" = yes"; then
+ echo "$ac_t""yes" 1>&6
+ ac_tr_func=HAVE_`echo $ac_func | tr 'abcdefghijklmnopqrstuvwxyz' 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'`
+ cat >> confdefs.h <<EOF
+#define $ac_tr_func 1
+EOF
+
+else
+ echo "$ac_t""no" 1>&6
+fi
+done
+
+for ac_func in strftime
+do
+echo $ac_n "checking for $ac_func""... $ac_c" 1>&6
+echo "configure:1585: checking for $ac_func" >&5
+if eval "test \"`echo '$''{'ac_cv_func_$ac_func'+set}'`\" = set"; then
+ echo $ac_n "(cached) $ac_c" 1>&6
+else
+ cat > conftest.$ac_ext <<EOF
+#line 1590 "configure"
+#include "confdefs.h"
+/* System header to define __stub macros and hopefully few prototypes,
+ which can conflict with char $ac_func(); below. */
+#include <assert.h>
+/* Override any gcc2 internal prototype to avoid an error. */
+/* We use char because int might match the return type of a gcc2
+ builtin and then its argument prototype would still apply. */
+char $ac_func();
+
+int main() {
+
+/* The GNU C library defines this for functions which it implements
+ to always fail with ENOSYS. Some functions are actually named
+ something starting with __ and the normal name is an alias. */
+#if defined (__stub_$ac_func) || defined (__stub___$ac_func)
+choke me
+#else
+$ac_func();
+#endif
+
+; return 0; }
+EOF
+if { (eval echo configure:1613: \"$ac_link\") 1>&5; (eval $ac_link) 2>&5; } && test -s conftest${ac_exeext}; then
+ rm -rf conftest*
+ eval "ac_cv_func_$ac_func=yes"
+else
+ echo "configure: failed program was:" >&5
+ cat conftest.$ac_ext >&5
+ rm -rf conftest*
+ eval "ac_cv_func_$ac_func=no"
+fi
+rm -f conftest*
+fi
+
+if eval "test \"`echo '$ac_cv_func_'$ac_func`\" = yes"; then
+ echo "$ac_t""yes" 1>&6
+ ac_tr_func=HAVE_`echo $ac_func | tr 'abcdefghijklmnopqrstuvwxyz' 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'`
+ cat >> confdefs.h <<EOF
+#define $ac_tr_func 1
+EOF
+
+else
+ echo "$ac_t""no" 1>&6
+LIBOBJS="$LIBOBJS ${ac_func}.${ac_objext}"
+fi
+done
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+trap '' 1 2 15
+cat > confcache <<\EOF
+# This file is a shell script that caches the results of configure
+# tests run on this system so they can be shared between configure
+# scripts and configure runs. It is not useful on other systems.
+# If it contains results you don't want to keep, you may remove or edit it.
+#
+# By default, configure uses ./config.cache as the cache file,
+# creating it if it does not exist already. You can give configure
+# the --cache-file=FILE option to use a different cache file; that is
+# what configure does when it calls configure scripts in
+# subdirectories, so they share the cache.
+# Giving --cache-file=/dev/null disables caching, for debugging configure.
+# config.status only pays attention to the cache file if you give it the
+# --recheck option to rerun configure.
+#
+EOF
+# The following way of writing the cache mishandles newlines in values,
+# but we know of no workaround that is simple, portable, and efficient.
+# So, don't put newlines in cache variables' values.
+# Ultrix sh set writes to stderr and can't be redirected directly,
+# and sets the high bit in the cache file unless we assign to the vars.
+(set) 2>&1 |
+ case `(ac_space=' '; set | grep ac_space) 2>&1` in
+ *ac_space=\ *)
+ # `set' does not quote correctly, so add quotes (double-quote substitution
+ # turns \\\\ into \\, and sed turns \\ into \).
+ sed -n \
+ -e "s/'/'\\\\''/g" \
+ -e "s/^\\([a-zA-Z0-9_]*_cv_[a-zA-Z0-9_]*\\)=\\(.*\\)/\\1=\${\\1='\\2'}/p"
+ ;;
+ *)
+ # `set' quotes correctly as required by POSIX, so do not add quotes.
+ sed -n -e 's/^\([a-zA-Z0-9_]*_cv_[a-zA-Z0-9_]*\)=\(.*\)/\1=${\1=\2}/p'
+ ;;
+ esac >> confcache
+if cmp -s $cache_file confcache; then
+ :
+else
+ if test -w $cache_file; then
+ echo "updating cache $cache_file"
+ cat confcache > $cache_file
+ else
+ echo "not updating unwritable cache $cache_file"
+ fi
+fi
+rm -f confcache
+
+trap 'rm -fr conftest* confdefs* core core.* *.core $ac_clean_files; exit 1' 1 2 15
+
+test "x$prefix" = xNONE && prefix=$ac_default_prefix
+# Let make expand exec_prefix.
+test "x$exec_prefix" = xNONE && exec_prefix='${prefix}'
+
+# Any assignment to VPATH causes Sun make to only execute
+# the first set of double-colon rules, so remove it if not needed.
+# If there is a colon in the path, we need to keep it.
+if test "x$srcdir" = x.; then
+ ac_vpsub='/^[ ]*VPATH[ ]*=[^:]*$/d'
+fi
+
+trap 'rm -f $CONFIG_STATUS conftest*; exit 1' 1 2 15
+
+DEFS=-DHAVE_CONFIG_H
+
+# Without the "./", some shells look in PATH for config.status.
+: ${CONFIG_STATUS=./config.status}
+
+echo creating $CONFIG_STATUS
+rm -f $CONFIG_STATUS
+cat > $CONFIG_STATUS <<EOF
+#! /bin/sh
+# Generated automatically by configure.
+# Run this file to recreate the current configuration.
+# This directory was configured as follows,
+# on host `(hostname || uname -n) 2>/dev/null | sed 1q`:
+#
+# $0 $ac_configure_args
+#
+# Compiler output produced by configure, useful for debugging
+# configure, is in ./config.log if it exists.
+
+ac_cs_usage="Usage: $CONFIG_STATUS [--recheck] [--version] [--help]"
+for ac_option
+do
+ case "\$ac_option" in
+ -recheck | --recheck | --rechec | --reche | --rech | --rec | --re | --r)
+ echo "running \${CONFIG_SHELL-/bin/sh} $0 $ac_configure_args --no-create --no-recursion"
+ exec \${CONFIG_SHELL-/bin/sh} $0 $ac_configure_args --no-create --no-recursion ;;
+ -version | --version | --versio | --versi | --vers | --ver | --ve | --v)
+ echo "$CONFIG_STATUS generated by autoconf version 2.13"
+ exit 0 ;;
+ -help | --help | --hel | --he | --h)
+ echo "\$ac_cs_usage"; exit 0 ;;
+ *) echo "\$ac_cs_usage"; exit 1 ;;
+ esac
+done
+
+ac_given_srcdir=$srcdir
+ac_given_INSTALL="$INSTALL"
+
+trap 'rm -fr `echo "doc/Makefile charsets/Makefile src/Makefile Makefile doc/catdoc.1 doc/xls2csv.1 doc/wordview.1 doc/catppt.1 src/config.h" | sed "s/:[^ ]*//g"` conftest*; exit 1' 1 2 15
+EOF
+cat >> $CONFIG_STATUS <<EOF
+
+# Protect against being on the right side of a sed subst in config.status.
+sed 's/%@/@@/; s/@%/@@/; s/%g\$/@g/; /@g\$/s/[\\\\&%]/\\\\&/g;
+ s/@@/%@/; s/@@/@%/; s/@g\$/%g/' > conftest.subs <<\\CEOF
+$ac_vpsub
+$extrasub
+s%@SHELL@%$SHELL%g
+s%@CFLAGS@%$CFLAGS%g
+s%@CPPFLAGS@%$CPPFLAGS%g
+s%@CXXFLAGS@%$CXXFLAGS%g
+s%@FFLAGS@%$FFLAGS%g
+s%@DEFS@%$DEFS%g
+s%@LDFLAGS@%$LDFLAGS%g
+s%@LIBS@%$LIBS%g
+s%@exec_prefix@%$exec_prefix%g
+s%@prefix@%$prefix%g
+s%@program_transform_name@%$program_transform_name%g
+s%@bindir@%$bindir%g
+s%@sbindir@%$sbindir%g
+s%@libexecdir@%$libexecdir%g
+s%@datadir@%$datadir%g
+s%@sysconfdir@%$sysconfdir%g
+s%@sharedstatedir@%$sharedstatedir%g
+s%@localstatedir@%$localstatedir%g
+s%@libdir@%$libdir%g
+s%@includedir@%$includedir%g
+s%@oldincludedir@%$oldincludedir%g
+s%@infodir@%$infodir%g
+s%@mandir@%$mandir%g
+s%@CC@%$CC%g
+s%@INSTALL_PROGRAM@%$INSTALL_PROGRAM%g
+s%@INSTALL_SCRIPT@%$INSTALL_SCRIPT%g
+s%@INSTALL_DATA@%$INSTALL_DATA%g
+s%@WISH@%$WISH%g
+s%@CPP@%$CPP%g
+s%@LIBOBJS@%$LIBOBJS%g
+s%@specsuffix@%$specsuffix%g
+s%@replsuffix@%$replsuffix%g
+s%@buildtargets@%$buildtargets%g
+s%@installtargets@%$installtargets%g
+s%@targetcharset@%$targetcharset%g
+s%@sourcecharset@%$sourcecharset%g
+s%@man1dir@%$man1dir%g
+s%@manext@%$manext%g
+s%@charsetcheck@%$charsetcheck%g
+s%@installroot@%$installroot%g
+s%@catdoc_version@%$catdoc_version%g
+s%@WORDS_BIGENDIAN DEFS@%$WORDS_BIGENDIAN DEFS%g
+
+CEOF
+EOF
+
+cat >> $CONFIG_STATUS <<\EOF
+
+# Split the substitutions into bite-sized pieces for seds with
+# small command number limits, like on Digital OSF/1 and HP-UX.
+ac_max_sed_cmds=90 # Maximum number of lines to put in a sed script.
+ac_file=1 # Number of current file.
+ac_beg=1 # First line for current file.
+ac_end=$ac_max_sed_cmds # Line after last line for current file.
+ac_more_lines=:
+ac_sed_cmds=""
+while $ac_more_lines; do
+ if test $ac_beg -gt 1; then
+ sed "1,${ac_beg}d; ${ac_end}q" conftest.subs > conftest.s$ac_file
+ else
+ sed "${ac_end}q" conftest.subs > conftest.s$ac_file
+ fi
+ if test ! -s conftest.s$ac_file; then
+ ac_more_lines=false
+ rm -f conftest.s$ac_file
+ else
+ if test -z "$ac_sed_cmds"; then
+ ac_sed_cmds="sed -f conftest.s$ac_file"
+ else
+ ac_sed_cmds="$ac_sed_cmds | sed -f conftest.s$ac_file"
+ fi
+ ac_file=`expr $ac_file + 1`
+ ac_beg=$ac_end
+ ac_end=`expr $ac_end + $ac_max_sed_cmds`
+ fi
+done
+if test -z "$ac_sed_cmds"; then
+ ac_sed_cmds=cat
+fi
+EOF
+
+cat >> $CONFIG_STATUS <<EOF
+
+CONFIG_FILES=\${CONFIG_FILES-"doc/Makefile charsets/Makefile src/Makefile Makefile doc/catdoc.1 doc/xls2csv.1 doc/wordview.1 doc/catppt.1"}
+EOF
+cat >> $CONFIG_STATUS <<\EOF
+for ac_file in .. $CONFIG_FILES; do if test "x$ac_file" != x..; then
+ # Support "outfile[:infile[:infile...]]", defaulting infile="outfile.in".
+ case "$ac_file" in
+ *:*) ac_file_in=`echo "$ac_file"|sed 's%[^:]*:%%'`
+ ac_file=`echo "$ac_file"|sed 's%:.*%%'` ;;
+ *) ac_file_in="${ac_file}.in" ;;
+ esac
+
+ # Adjust a relative srcdir, top_srcdir, and INSTALL for subdirectories.
+
+ # Remove last slash and all that follows it. Not all systems have dirname.
+ ac_dir=`echo $ac_file|sed 's%/[^/][^/]*$%%'`
+ if test "$ac_dir" != "$ac_file" && test "$ac_dir" != .; then
+ # The file is in a subdirectory.
+ test ! -d "$ac_dir" && mkdir "$ac_dir"
+ ac_dir_suffix="/`echo $ac_dir|sed 's%^\./%%'`"
+ # A "../" for each directory in $ac_dir_suffix.
+ ac_dots=`echo $ac_dir_suffix|sed 's%/[^/]*%../%g'`
+ else
+ ac_dir_suffix= ac_dots=
+ fi
+
+ case "$ac_given_srcdir" in
+ .) srcdir=.
+ if test -z "$ac_dots"; then top_srcdir=.
+ else top_srcdir=`echo $ac_dots|sed 's%/$%%'`; fi ;;
+ /*) srcdir="$ac_given_srcdir$ac_dir_suffix"; top_srcdir="$ac_given_srcdir" ;;
+ *) # Relative path.
+ srcdir="$ac_dots$ac_given_srcdir$ac_dir_suffix"
+ top_srcdir="$ac_dots$ac_given_srcdir" ;;
+ esac
+
+ case "$ac_given_INSTALL" in
+ [/$]*) INSTALL="$ac_given_INSTALL" ;;
+ *) INSTALL="$ac_dots$ac_given_INSTALL" ;;
+ esac
+
+ echo creating "$ac_file"
+ rm -f "$ac_file"
+ configure_input="Generated automatically from `echo $ac_file_in|sed 's%.*/%%'` by configure."
+ case "$ac_file" in
+ *Makefile*) ac_comsub="1i\\
+# $configure_input" ;;
+ *) ac_comsub= ;;
+ esac
+
+ ac_file_inputs=`echo $ac_file_in|sed -e "s%^%$ac_given_srcdir/%" -e "s%:% $ac_given_srcdir/%g"`
+ sed -e "$ac_comsub
+s%@configure_input@%$configure_input%g
+s%@srcdir@%$srcdir%g
+s%@top_srcdir@%$top_srcdir%g
+s%@INSTALL@%$INSTALL%g
+" $ac_file_inputs | (eval "$ac_sed_cmds") > $ac_file
+fi; done
+rm -f conftest.s*
+
+# These sed commands are passed to sed as "A NAME B NAME C VALUE D", where
+# NAME is the cpp macro being defined and VALUE is the value it is being given.
+#
+# ac_d sets the value in "#define NAME VALUE" lines.
+ac_dA='s%^\([ ]*\)#\([ ]*define[ ][ ]*\)'
+ac_dB='\([ ][ ]*\)[^ ]*%\1#\2'
+ac_dC='\3'
+ac_dD='%g'
+# ac_u turns "#undef NAME" with trailing blanks into "#define NAME VALUE".
+ac_uA='s%^\([ ]*\)#\([ ]*\)undef\([ ][ ]*\)'
+ac_uB='\([ ]\)%\1#\2define\3'
+ac_uC=' '
+ac_uD='\4%g'
+# ac_e turns "#undef NAME" without trailing blanks into "#define NAME VALUE".
+ac_eA='s%^\([ ]*\)#\([ ]*\)undef\([ ][ ]*\)'
+ac_eB='$%\1#\2define\3'
+ac_eC=' '
+ac_eD='%g'
+
+if test "${CONFIG_HEADERS+set}" != set; then
+EOF
+cat >> $CONFIG_STATUS <<EOF
+ CONFIG_HEADERS="src/config.h"
+EOF
+cat >> $CONFIG_STATUS <<\EOF
+fi
+for ac_file in .. $CONFIG_HEADERS; do if test "x$ac_file" != x..; then
+ # Support "outfile[:infile[:infile...]]", defaulting infile="outfile.in".
+ case "$ac_file" in
+ *:*) ac_file_in=`echo "$ac_file"|sed 's%[^:]*:%%'`
+ ac_file=`echo "$ac_file"|sed 's%:.*%%'` ;;
+ *) ac_file_in="${ac_file}.in" ;;
+ esac
+
+ echo creating $ac_file
+
+ rm -f conftest.frag conftest.in conftest.out
+ ac_file_inputs=`echo $ac_file_in|sed -e "s%^%$ac_given_srcdir/%" -e "s%:% $ac_given_srcdir/%g"`
+ cat $ac_file_inputs > conftest.in
+
+EOF
+
+# Transform confdefs.h into a sed script conftest.vals that substitutes
+# the proper values into config.h.in to produce config.h. And first:
+# Protect against being on the right side of a sed subst in config.status.
+# Protect against being in an unquoted here document in config.status.
+rm -f conftest.vals
+cat > conftest.hdr <<\EOF
+s/[\\&%]/\\&/g
+s%[\\$`]%\\&%g
+s%#define \([A-Za-z_][A-Za-z0-9_]*\) *\(.*\)%${ac_dA}\1${ac_dB}\1${ac_dC}\2${ac_dD}%gp
+s%ac_d%ac_u%gp
+s%ac_u%ac_e%gp
+EOF
+sed -n -f conftest.hdr confdefs.h > conftest.vals
+rm -f conftest.hdr
+
+# This sed command replaces #undef with comments. This is necessary, for
+# example, in the case of _POSIX_SOURCE, which is predefined and required
+# on some systems where configure will not decide to define it.
+cat >> conftest.vals <<\EOF
+s%^[ ]*#[ ]*undef[ ][ ]*[a-zA-Z_][a-zA-Z_0-9]*%/* & */%
+EOF
+
+# Break up conftest.vals because some shells have a limit on
+# the size of here documents, and old seds have small limits too.
+
+rm -f conftest.tail
+while :
+do
+ ac_lines=`grep -c . conftest.vals`
+ # grep -c gives empty output for an empty file on some AIX systems.
+ if test -z "$ac_lines" || test "$ac_lines" -eq 0; then break; fi
+ # Write a limited-size here document to conftest.frag.
+ echo ' cat > conftest.frag <<CEOF' >> $CONFIG_STATUS
+ sed ${ac_max_here_lines}q conftest.vals >> $CONFIG_STATUS
+ echo 'CEOF
+ sed -f conftest.frag conftest.in > conftest.out
+ rm -f conftest.in
+ mv conftest.out conftest.in
+' >> $CONFIG_STATUS
+ sed 1,${ac_max_here_lines}d conftest.vals > conftest.tail
+ rm -f conftest.vals
+ mv conftest.tail conftest.vals
+done
+rm -f conftest.vals
+
+cat >> $CONFIG_STATUS <<\EOF
+ rm -f conftest.frag conftest.h
+ echo "/* $ac_file. Generated automatically by configure. */" > conftest.h
+ cat conftest.in >> conftest.h
+ rm -f conftest.in
+ if cmp -s $ac_file conftest.h 2>/dev/null; then
+ echo "$ac_file is unchanged"
+ rm -f conftest.h
+ else
+ # Remove last slash and all that follows it. Not all systems have dirname.
+ ac_dir=`echo $ac_file|sed 's%/[^/][^/]*$%%'`
+ if test "$ac_dir" != "$ac_file" && test "$ac_dir" != .; then
+ # The file is in a subdirectory.
+ test ! -d "$ac_dir" && mkdir "$ac_dir"
+ fi
+ rm -f $ac_file
+ mv conftest.h $ac_file
+ fi
+fi; done
+
+EOF
+cat >> $CONFIG_STATUS <<EOF
+
+EOF
+cat >> $CONFIG_STATUS <<\EOF
+
+exit 0
+EOF
+chmod +x $CONFIG_STATUS
+rm -fr confdefs* $ac_clean_files
+test "$no_create" = yes || ${CONFIG_SHELL-/bin/sh} $CONFIG_STATUS || exit 1
+
--- /dev/null
+dnl Process this file with autoconf to produce a configure script.
+AC_INIT(acconfig.h)
+catdoc_version=0.94.1
+dnl Checks for programs.
+AC_PROG_CC
+case ${CC} in
+*djgpp*) ac_cv_c_bigendian=no
+ ac_cv_func_setvbuf_reversed=no
+;;
+*)
+;;
+esac
+AC_C_BIGENDIAN
+AC_PROG_INSTALL
+if test "$enable_wordview" != "no" ;then
+AC_ARG_WITH(wish,[ --with-wish=fullpath allows to specify full path for Tk interpreter to
+ avoid checking, which requires X],
+WISH=$withval,[
+AC_PATH_PROGS(WISH,wish wish8.1 wish8.2 wish8.3 wish8.4)
+if test -n "$WISH"; then
+AC_CACHE_CHECK(checking if wish version is 8.1 or above ,GOOD_WISH,
+changequote(<<,>>)dnl
+<<changequote(<<,>>)dnl>>
+<<echo 'if {[info tclversion]>=7.6} {
+ puts -nonewline "yes"
+} else {
+ puts -nonewline "no"
+}
+exit'>tmp$$.tcl
+GOOD_WISH=`$WISH tmp$$.tcl`
+rm tmp$$.tcl >>
+<<changequote([, ])dnl>>
+changequote([, ])dnl
+)
+if test "$GOOD_WISH" != yes; then
+enable_wordview=no
+fi
+fi
+])
+fi
+AC_ARG_WITH(install-root,[ --with-install-root=path
+ allows to install catdoc into other directory
+ than compiled-in path points to],installroot=$withval,)
+replsuffix=.replchars
+specsuffix=.specchars
+targetcharset=koi8-r
+sourcecharset=cp1251
+AC_ARG_WITH(input,[ --with-input=charset set default charset to expect in 8-bit word files],sourcecharset=$withval,sourcecharset=cp1251)
+AC_ARG_WITH(output,[ --with-output=charset set default charset to output],targetcharset=$withval,targetcharset=koi8-r)
+if test "$targetcharset" = "utf-8"; then
+charsetcheck="$sourcecharset.txt"
+else
+charsetcheck="$sourcecharset.txt $targetcharset.txt"
+fi
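+dnl Illustrative only (not part of the build logic): the two options
+dnl above might be combined as
+dnl   ./configure --with-input=cp1252 --with-output=utf-8
+dnl in which case charsetcheck would contain just cp1252.txt, because
+dnl no charset file is checked for a utf-8 target.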
+AC_ARG_ENABLE(charset-check,[ --disable-charset-check allow make in charsets directory to succeed
+ even if files for default charsets are not found],[if test "$enable_charset_check" = no;
+then
+ charsetcheck=""
+fi])
+test -z "$manext" && manext=.1
+test -z "$man1dir" && man1dir=\${prefix}/man/man1
+if test -n "$WISH"; then
+AC_ARG_ENABLE(wordview,[ --disable-wordview Do not install tcl-tk
+ wrapper],[:],)
+fi
+if test "$enable_wordview" = no; then
+ installtargets=install-catdoc
+ buildtargets="catdoc xls2csv catppt"
+else
+ installtargets="install-catdoc install-wordview"
+ buildtargets="catdoc xls2csv catppt wordview"
+fi
+dnl Checks for libraries.
+
+dnl Checks for header files.
+AC_HEADER_STDC
+AC_CHECK_HEADERS(unistd.h)
+
+if test -z "$enable_langinfo"; then
+enable_langinfo=yes
+fi
+AC_ARG_ENABLE(langinfo,[ --disable-langinfo Do not use system
+locale for output encoding],[
+ if test "$enable_langinfo" = yes; then
+ AC_CHECK_HEADER(langinfo.h,[
+ AC_MSG_CHECKING([for nl_langinfo(CODESET)])
+AC_TRY_COMPILE([#include <langinfo.h>],[nl_langinfo(CODESET);],[
+ AC_DEFINE(HAVE_LANGINFO,1,[Define this if you have XPG4 compliant nl_langinfo, which accepts CODESET argument])
+ enable_langinfo=yes
+ ],[enable_langinfo=no])
+AC_MSG_RESULT([$enable_langinfo])
+ ],[enable_langinfo=no])
+
+ fi],[])
+
+
+dnl Checks for typedefs, structures, and compiler characteristics.
+AC_C_CONST
+
+dnl Checks for library functions.
+AC_FUNC_SETVBUF_REVERSED
+
+if test "$GCC" = "yes"; then
+ CFLAGS="-g -O2 -Wall"
+fi
+
+AC_DEFINE_UNQUOTED([SOURCE_CHARSET],"$sourcecharset",[Character encoding used by default for 8-bit source files])
+AC_DEFINE_UNQUOTED([TARGET_CHARSET],"$targetcharset",[Output character encoding used by default, if impossible to determine encoding from locale])
+AC_DEFINE_UNQUOTED([SPEC_EXT],"$specsuffix",[Suffix for files with special symbols map (ones to be replaced regardless of availability in target encoding)])
+AC_DEFINE_UNQUOTED([REPL_EXT],"$replsuffix",[Suffix for symbols replacement map (what to do with symbols, which are not available in the target encoding)])
+AC_DEFINE_UNQUOTED([UNKNOWN_CHAR],"?",[Symbol to represent character which is not available either in target encoding or in replacement map])
+AC_CHECK_FUNCS(strdup strtol)
+AC_REPLACE_FUNCS(strftime)
+AC_SUBST(specsuffix)
+AC_SUBST(replsuffix)
+AC_SUBST(buildtargets)
+AC_SUBST(installtargets)
+AC_SUBST(targetcharset)
+AC_SUBST(sourcecharset)
+AC_SUBST(man1dir)
+AC_SUBST(manext)
+AC_SUBST(charsetcheck)
+AC_SUBST(installroot)
+AC_SUBST(catdoc_version)
+AC_SUBST(CFLAGS)
+AC_SUBST(WORDS_BIGENDIAN DEFS)
+AC_CONFIG_HEADER(src/config.h)
+AC_OUTPUT(doc/Makefile charsets/Makefile src/Makefile Makefile doc/catdoc.1 doc/xls2csv.1 doc/wordview.1 doc/catppt.1)
--- /dev/null
+Makefile
+catdoc.1
+catdoc.txt
+catdoc.ps
+wordview.1
+xls2csv.1
+xls2csv.ps
+xls2csv.txt
--- /dev/null
+
+# Your C compiler and flags
+SHELL = /bin/sh
+installroot=@installroot@
+srcdir = @srcdir@
+top_srcdir = @top_srcdir@
+INSTALL = @INSTALL@
+VPATH = @srcdir@
+prefix = @prefix@
+exec_prefix = @exec_prefix@
+bindir = @bindir@
+libdir = @libdir@
+mandir = @man1dir@
+manext = @manext@
+%.txt: %.1
+ nroff -mandoc $< |col -bx >$@
+%.ps: %.1
+ groff -Tps -mandoc $< >$@
+
+all:
+ touch build
+
+dosdoc: catdoc.txt catdoc.ps xls2csv.txt xls2csv.ps catppt.txt catppt.ps
+clean:
+ rm build
+install: @installtargets@
+install-catdoc: catdoc.1 xls2csv.1 catppt.1
+ ../mkinstalldirs $(installroot)$(mandir)
+ $(INSTALL) -m 644 catdoc.1 $(installroot)$(mandir)/catdoc$(manext)
+ $(INSTALL) -m 644 xls2csv.1 $(installroot)$(mandir)/xls2csv$(manext)
+ $(INSTALL) -m 644 catppt.1 $(installroot)$(mandir)/catppt$(manext)
+install-wordview:
+ ../mkinstalldirs $(installroot)$(mandir)
+ $(INSTALL) -m 644 wordview.1 $(installroot)$(mandir)/wordview$(manext)
+# Following rules are primarily for developers use. I doubt that you
+# need to rebuild these versions of documentation
+distclean: catppt.txt catppt.ps catdoc.1 catdoc.txt catdoc.ps xls2csv.1 xls2csv.txt xls2csv.ps
+ rm Makefile
--- /dev/null
+.TH catdoc 1 "Version @catdoc_version@" "MS-Word reader"
+.SH NAME
+catdoc \- reads MS-Word file and puts its content as plain text on standard output
+.SH SYNOPSIS
+
+.BR catdoc " [" -vlu8btawxV "] [" -m "
+.IR number ]
+[
+.B -s
+.IR charset ]
+[
+.B -d
+.IR charset ]
+[
+.B -f
+.IR output-format ]
+.I file
+
+.SH DESCRIPTION
+
+.B catdoc
+behaves much like
+.BR cat (1)
+but it reads MS-Word file and produces human-readable text on standard output.
+Optionally it can use
+.BR latex (1)
+escape sequences for characters which have special meaning for LaTeX.
+It also makes some effort to recognize MS-Word tables, although it never
+tries to write correct headers for LaTeX tabular environment. Additional
+output formats, such as HTML, can be easily defined.
+.PP
+.B catdoc
+doesn't attempt to extract formatting information other than tables from
+MS-Word document, so different output modes mainly mean that different
+characters are escaped and different ways are used to represent characters
+missing from the output charset. See CHARACTER SUBSTITUTION below.
+
+.PP
+.B catdoc
+uses internal
+.BR unicode (4)
+representation of text, so it is able to convert texts when charset in
+source document doesn't match charset on target system.
+See CHARACTER SETS below.
+.PP
+If no file names are supplied,
+.B catdoc
+processes its standard input unless it is a terminal. It is unlikely that
+somebody could type a Word document from the keyboard, so if
+.B catdoc
+is invoked without arguments and stdin is not redirected, it prints a brief
+usage message and exits.
+Processing of standard input (even among other files) can be forced using
+dash '-' as file name.
+.PP
+By default,
+.B catdoc
+wraps lines which are more than 72 chars long and separates paragraphs by
+blank lines. This behavior can be turned off by the
+.B -w
+switch. In
+.I wide
+mode
+.B catdoc
+prints each paragraph as one long line, suitable for import into
+word processors which perform word wrapping themselves.
+
+
+.SH OPTIONS
+.TP 8
+.B -a
+- shortcut for -f ascii. Produces ASCII text as output.
+Separates table columns with TAB
+.TP 8
+.B -b
+- process broken MS-Word file. Normally,
+.B catdoc
+checks whether the first 8 bytes
+of the file are the Microsoft OLE signature. If so, it processes the file; otherwise
+it just copies it to stdout. This makes it possible to use
+.B catdoc
+as a filter for viewing all files with the
+.I .doc
+extension.
+.TP 8
+.BI -d charset
+- specifies destination charset name. Charset file has format described in
+CHARACTER SETS below and should have
+.B .txt
+extension and reside in
+.B catdoc
+library directory (@libdir@/catdoc). By default, the current
+locale charset is used if langinfo support is compiled in.
+.TP 8
+.BI -f format
+- specifies output format as described in CHARACTER SUBSTITUTION below.
+.B catdoc
+comes with two output formats - ascii and tex. You can add your own if you
+wish.
+.TP 8
+.B -l
+Causes
+.B catdoc
+to list names of available charsets to the stdout and exit successfully.
+.TP 8
+.BI -m number
+Specifies right margin for text (default 72).
+.B -m 0
+is equivalent to
+.B -w
+.TP 8
+.BI -s charset
+Specifies source charset (the one used in the Word document), if the Word document
+doesn't contain UTF-16 text. When reading RTF documents, it is
+typically not necessary, because RTF documents contain an ansicpg
+specification. But it can be set wrong by Word (I've seen RTF documents
+in Russian where cp1252 was specified). In this case this option
+takes precedence over the charset specified in the document. But the
+source_charset statement in the configuration file has lower priority
+than the charset in the document.
+.TP 8
+.B -t
+- shortcut for
+.BR "-f tex" .
+Converts all printable chars which have special meaning for
+.BR LaTeX (1)
+into appropriate control sequences. Separates table columns by
+.BR & .
+.TP 8
+.B -u
+- declares that the Word document contains a UNICODE (UTF-16) representation
+of text (as some Word-97 documents do). If catdoc fails to correctly process a Word document
+with the default charset, try this option.
+.TP 8
+.B -8
+- declares that the Word document is 8 bit. Use it in case catdoc
+recognizes the file format incorrectly.
+.TP 8
+.B -w
+disables word wrapping. By default
+.B catdoc
+output is split into lines not longer than 72 characters (or the number specified by the
+-m option) and paragraphs
+are separated by a blank line. With this option each paragraph is one
+long line.
+.TP 8
+.B -x
+causes catdoc to output unknown UNICODE characters as \exNNNN, instead
+of question marks.
+.TP 8
+.B -v
+causes catdoc to print some useless information about word document
+structure to stdout before actual start of text.
+.TP 8
+.B -V
+outputs catdoc version
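+.PP
+A few typical invocations, given as a sketch only (the file names
+.I report.doc
+and
+.I letter.doc
+are hypothetical, and the named charset files are assumed to be installed):
+.PP
+.nf
+catdoc -w report.doc
+catdoc -s cp1252 -d utf-8 letter.doc
+catdoc -t -m 60 report.doc
+.fi
+.PP
+The first prints every paragraph as one long line, the second reads
+8-bit text as cp1252 and writes UTF-8, and the third escapes characters
+special to LaTeX and wraps lines at 60 columns.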
+
+.SH CHARACTER SETS
+When processing MS-Word file
+.B catdoc
+uses information about two character sets, typically different:
+input and output. They are stored in plain text files in
+.B catdoc
+library directory. Character set files should contain two whitespace-separated
+hexadecimal numbers - 8-bit code in character set and 16-bit Unicode code.
+Anything from hash mark to end of line is ignored, as well as blank lines.
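+.PP
+For illustration, a fragment of a cp1251 charset file in this format
+might look as follows (the 0x prefix and the comment text follow the
+style of the mapping tables published on ftp.unicode.org; the header
+comment line is an addition made here for readability):
+.PP
+.nf
+# 8-bit code  Unicode
+0xC0  0x0410  # CYRILLIC CAPITAL LETTER A
+0xC1  0x0411  # CYRILLIC CAPITAL LETTER BE
+0xE0  0x0430  # CYRILLIC SMALL LETTER A
+.fi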
+
+.B catdoc
+distribution includes some of these character sets. Additional character set
+definitions, directly usable by
+.B catdoc
+can be obtained from ftp.unicode.org. Charset files have
+.B .txt
+suffix, which shouldn't be specified in command-line or configuration
+files.
+.PP
+Note that
+.B catdoc
+is distributed with Cyrillic charsets as default. If you are not
+Russian, you probably don't want it, and should reconfigure catdoc at
+compile time or in the runtime configuration file.
+.PP
+When dealing with documents with charsets other than default, remember
+that Microsoft never uses ISO charsets. While letters in, say cp1252 are
+at the same position as in ISO-8859-1, some punctuation signs would be
+lost, if you specify ISO-8859-1 as input charset. If you use cp1252,
+catdoc would deal with those signs as described in CHARACTER
+SUBSTITUTION below.
+
+.SH CHARACTER SUBSTITUTION
+.B catdoc
+converts MS-Word file into following internal Unicode representation:
+.TP 4
+1. Paragraphs are separated by ASCII Line Feed symbol (0x000A)
+.TP 4
+2. Table cells within row are separated by ASCII Field Separator symbol
+(0x001C)
+.TP 4
+3. Table rows are separated by ASCII Record Separator (0x001E)
+.TP 4
+4. All printable characters, including whitespace are represented with their
+respective UNICODE codes.
+.PP
+This UNICODE representation is subsequently converted into 8-bit text in
+target character set using following four-step algorithm:
+.TP 4
+1. List of special characters is searched for given Unicode character.
+If found, then appropriate multi-character sequence is output instead of
+character.
+.TP 4
+2. If there is an equivalent in target character set, it is output.
+.TP 4
+3. Otherwise, replacement list is searched and, if there is multi-character
+substitution for this UNICODE char, it is output.
+.TP 4
+4. If all above fails, "Unknown char" symbol (question mark) is output.
+.PP
+The list of special characters and the list of substitutions are character
+set-independent, because special chars should be escaped regardless of their
+existence in the target character set (usually they are part of US-ASCII, and
+therefore exist in any character set), while the replacement list is searched only
+for those characters which are not found in the target character set.
+.PP
+These lists are stored in
+.B catdoc
+library directory in files with prefix of format name. These files have
+following format:
+.PP
+Each line is either a comment (starting with a hash mark) or contains a
+hexadecimal UNICODE value, separated by whitespace from the string which
+is substituted for it. If the string contains no whitespace it
+can be used as is; otherwise it should be enclosed in single or double
+quotes. The usual backslash sequences like
+.IR '\en' , '\et'
+can be used in these strings.
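+.PP
+A hypothetical replacement map fragment is sketched below. The entries
+are illustrative and not copied from the distributed files, and the 0x
+prefix on the values is an assumption borrowed from the charset files:
+.PP
+.nf
+# Unicode  replacement
+0x00A9  (c)
+0x2026  ...
+0x2014  " - "
+.fi
+.PP
+The last entry is quoted because its replacement string contains
+whitespace.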
+
+
+.SH RUNTIME CONFIGURATION
+Upon startup catdoc reads its system-wide configuration file
+.RB ( catdocrc
+in the
+.B catdoc
+library directory) and then the
+user-specific configuration file
+.BR ${HOME}/.catdocrc .
+.PP
+These files can contain following directives:
+.TP 8
+.BI "source_charset = " charset-name
+Sets default source charset, which is used if no
+.B -s
+option is specified. Consult the configuration of a nearby Windows
+workstation to find the one you need.
+.TP 8
+.BI "target_charset = " charset-name
+Sets default output charset. You probably know which one you use.
+.TP 8
+.BI "charset_path = " directory-list
+colon-separated list of directories, which are searched for charset files.
+This allows you to install additional charsets in your home directory.
+If first directory component of path is ~ it is replaced by contents of
+.B HOME
+environment variable.
+On the MS-DOS platform, if a directory name starts with %s, it is replaced
+with the directory of the executable file. An empty element in the list (i.e. two
+consecutive colons) is considered the current directory.
+.TP 8
+.BI "map_path = " directory-list
+colon-separated list of directories, which are searched for special character
+map and replacement map.
+Same substitution rules as in
+.B charset_path
+are applied.
+.TP 8
+.BI "format = " "format name"
+Output format which would be used by default.
+.B catdoc
+comes with two formats -
+.BR ascii " and " tex
+but nothing prevents you from writing your own format (set two map files -
+special character map and replacement map).
+.TP 8
+.BI "unknown_char = " "character specification"
+sets character to output instead of unknown Unicode character (default '?').
+Character specification can have one of two forms: a character enclosed in
+single quotes or a hexadecimal code.
+.TP 8
+.BI "use_locale =" "(yes|no)"
+Enables or disables automatic selection of output charset (default
+.BR yes ),
+based on
+system locale settings (if enabled at compile time). If automatic
+detection is enabled, then output charset settings in the configuration
+files (but not on the command line) are ignored, and the current system
+locale charset is used instead. There is no automatic choice of input
+charset based on the locale language, because most modern Word files (since
+Word 97) are Unicode anyway.
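+.PP
+A minimal user configuration file built from the directives above,
+shown only as a sketch (the charset names and the
+.I ~/charsets
+directory are examples, not recommendations):
+.PP
+.nf
+source_charset = cp1252
+target_charset = utf-8
+charset_path = ~/charsets:@libdir@/catdoc
+format = ascii
+unknown_char = '?'
+use_locale = no
+.fi
+.PP
+With use_locale set to no, the target_charset line is honoured instead
+of the charset of the current locale.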
+
+.SH BUGS
+
+Doesn't handle
+fast-saves properly. Prints footnotes as separate paragraphs at the end of
+file, instead of producing correct LaTeX commands. Cannot distinguish
+between empty table cell and end of table row.
+
+
+
+.SH "SEE ALSO"
+
+.BR xls2csv (1),
+.BR cat (1),
+.BR strings (1),
+.BR utf (4),
+.BR unicode (4)
+
+.SH AUTHOR
+
+V.B.Wagner <vitus@45.free.net>
--- /dev/null
+.TH catppt 1 "Version @catdoc_version@" "MS-PowerPoint reader"
+.SH NAME
+catppt \- reads MS-PowerPoint file and puts its content on standard output
+.SH SYNOPSIS
+
+.BR "catppt " [ -lV ]
+.RB [ -b
+.IR " string " ]
+.RB [ -s
+.IR " charset " ]
+.RB [ -d
+.IR " charset " ]
+.I files
+
+.SH DESCRIPTION
+
+.B catppt
+reads MS-PowerPoint presentations and dumps their content to stdout.
+.SH "OPTIONS"
+.TP 8
+.BR -l
+list known charsets and exit successfully
+.TP 8
+.BI -b string
+slides break string. This string (by default - formfeed) would be output
+at the end of each slide page.
+
+.TP 8
+.BI -d charset
+- specifies destination charset name. Charset file has format described in
+CHARACTER SETS section of
+.BR catdoc (1)
+manual page. By default, current locale
+charset would be used if langinfo support was enabled at the compile time.
+
+.TP 8
+.BI -s charset
+- specifies source charset. Typically, PowerPoint files use UNICODE
+strings with known charsets, but for some reason you may wish to
+override it.
+
+.TP 8
+.B -V
+outputs version number
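+.PP
+For instance (a sketch; the file name
+.I slides.ppt
+is hypothetical, and utf-8 is assumed to be an available output charset):
+.PP
+.nf
+catppt -d utf-8 slides.ppt
+.fi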
+
+.SH "SEE ALSO"
+
+.BR cat (1),
+.BR catdoc (1),
+.BR xls2csv (1),
+.BR strings (1),
+.BR utf (4),
+.BR unicode (4)
+
+.SH AUTHOR
+
+Alex Ott <alexott@gmail.com>
+
--- /dev/null
+.TH wordview 1x "Version @catdoc_version@" "MS-Word reader"
+.SH NAME
+wordview \- displays text contained in MS-Word file in X window
+
+.SH SYNOPSIS
+.BR wordview " ["
+.IR filename "]"
+
+.SH DESCRIPTION
+
+.B wordview
+is a simple GUI wrapper around
+.BR catdoc (1)
+which allows browsing through a Word file interactively. It doesn't allow
+editing the file, but allows saving a plain text representation (or a version
+with some TeX commands) into a file.
+.PP
+If for some reason
+.B catdoc
+doesn't recognize file encoding properly,
+.B wordview
+allows specifying the encoding interactively.
+
+.SH OPTIONS
+.B wordview
+supports standard X options, supported by
+.BR wish (1)
+
+.SH X RESOURCES
+The following X resources can be used to customize the look of
+.BR wordview :
+
+.TP 8
+.B Wordview.Text.Background
+background color for main window
+.TP 8
+.B Wordview.Text.Foreground
+Foreground color for main window
+.TP 8
+.B Wordview.Text.selectBackground
+Background color of selected text
+.TP 8
+.B Wordview.Text.selectForeground
+Foreground color of selected text
+.TP 8
+.B Wordview.Text.Font
+Font to display text. We recommend using a fixed-width font, such as
+Courier, because
+.BR catdoc (1)
+is intended to convert Word into text. Either XLFD font names or
+Tk-style font specifications like
+.B {Courier 12pt}
+can be used for
+specifying font. If you use XLFD font names, usage of unicode
+(iso10646-1) fonts is recommended.
+.TP 8
+.B Wordview.Text.findMode
+How to search text. This option can have the value either
+.BR exact " or " regexp
+and specifies whether text is searched for an exact match or for a regular
+expression by default. This behavior can be toggled interactively via a
+checkbox in the search dialog.
+.TP 8
+.B Wordview.Text.findCase
+This boolean option controls whether search is case-sensitive.
+Default is no.
+.TP 8
+.B Wordview.Menu.highlightBackground
+Background color for highlighted menu item
+.TP 8
+.B Wordview.Menu.highlightThickness
+.TP 8
+.B Wordview.Menu.activeBackground
+Background color of active menu item.
+.TP 8
+.B Wordview.Menu.activeBorderWidth
+Width (in pixels) of border around highlighted menu item. Default
+is 0, which differs from Tk global default. See
+.BR options (n)
+for more details.
+.PP
+A lot of other resource options which affect behavior of standard Tk
+widgets can affect wordview. See Tcl/Tk manual pages for more
+information.
+
+.SH SEE ALSO
+.BR catdoc (1),
+.BR wish (1),
+.BR options (n)
+
+.SH AUTHOR
+Victor Wagner <vitus@45.free.net>.
+
+
--- /dev/null
+.TH xls2csv 1 "Version @catdoc_version@" "MS-Excel reader"
+.SH NAME
+xls2csv \- reads MS-Excel file and puts its content as comma-separated data on standard output
+.SH SYNOPSIS
+
+.BR "xls2csv " [ -xlV ]
+.RB [ -f
+.IR " format " ]
+.RB [ -b
+.IR " string " ]
+.RB [ -s
+.IR " charset " ]
+.RB [ -d
+.IR " charset " ]
+.RB [ -q
+.IR " number " ]
+.RB [ -c
+.IR " char" ]
+.I files
+
+.SH DESCRIPTION
+
+.B xls2csv
+reads MS-Excel spreadsheet and dumps its content as comma-separated
+values to stdout. Numbers are printed without delimiters, strings are
+enclosed in the double quotes. Double-quotes inside string are doubled.
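+.PP
+As an illustration of these rules (a sketch, assuming the default quote
+mode and separator), a row holding the number 42 and a string cell
+containing say "hi" would be expected to appear as:
+.PP
+.nf
+42,"say ""hi"""
+.fi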
+.SH "OPTIONS"
+.TP 8
+.BR -x
+print unknown Unicode chars as \exNNNN, rather than as question marks
+.TP 8
+.BR -l
+list known charsets and exit successfully
+.TP 8
+.BI -c char
+cell separator char. By default - comma.
+.TP 8
+.BI -b string
+sheet break string. This string (by default - formfeed) is output
+at the end of each workbook page. It is printed after the page,
+starting at the start of a line, but no linefeed is automatically added
+at the end of the string. Include a newline at the end of the sheet separator if
+you want it to appear on a separate line by itself.
+.TP 8
+.BI -g number
+number of decimal digits in the numbers. By default maximal double
+precision (system-dependent macro DBL_DIG) is used.
+.TP 8
+.BI -q number
+set quote mode. In quote mode 0 cell contents is never quoted.
+In quote mode 1 only strings which contain spaces, double quotes or
+commas are quoted.
+In quote mode 2 (default) all cells with type string are quoted.
+In quote mode 3 all cells are quoted.
+
+.TP 8
+.BI -d charset
+- specifies destination charset name. Charset file has format described in
+CHARACTER SETS section of
+.BR catdoc (1)
+manual page. By default, current locale
+charset would be used if langinfo support was enabled at the compile time.
+
+.TP 8
+.BI -s charset
+- specifies source charset. Typically, Excel files have CODE PAGE
+record, which denotes input charset, but for some reason you may wish to
+override it.
+.TP 8
+.BI -f format
+- specifies date/time format to use for output of all Excel date and
+time values. If this option is not specified, the format specified in
+the spreadsheet is used. On a POSIX system, any format allowed by
+.BR strftime (3)
+can be used as value of this option. Under MS-DOS
+.B xls2csv
+implements limited set of
+.B strftime
+formats, namely
+.BR m ", " d ", " y ", " Y ", " b ", " l ", " p ", " H ", " M ", " S .
+
+.TP 8
+.B -V
+outputs version number
+
+.SH FILES
+${HOME}/.catdocrc, catdoc charset files and substitution map files (see
+.BR catdoc (1)
+manual page for details).
+
+.SH "SEE ALSO"
+
+.BR cat (1),
+.BR catdoc (1),
+.BR strings (1),
+.BR utf (4),
+.BR unicode (4)
+
+.SH AUTHOR
+
+V.B.Wagner <vitus@45.free.net>, based on biffview by David Rysdam
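
The quote modes described under -q in the xls2csv manual page above can be sketched in C roughly as follows. This is only an illustration of the documented behaviour, not the code xls2csv actually uses: the helper names needs_quotes, put and format_cell are hypothetical, and the sketch assumes a comma separator and a caller-supplied output buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: should a cell be quoted under quote mode MODE? */
static int needs_quotes(int mode, int is_string, const char *text)
{
    switch (mode) {
    case 0:  return 0;                 /* mode 0: never quote */
    case 1:  return is_string          /* mode 1: only strings containing */
                 && strpbrk(text, " \",") != NULL; /* space, quote, comma */
    case 3:  return 1;                 /* mode 3: quote every cell */
    default: return is_string;         /* mode 2 (default): all strings */
    }
}

/* Append one character, keeping room for the terminating NUL. */
static int put(char *out, size_t outsz, size_t *n, char c)
{
    if (*n + 1 >= outsz) return -1;
    out[(*n)++] = c;
    return 0;
}

/* Hypothetical helper: format one cell into OUT, doubling embedded
 * double quotes when the cell is quoted. Returns the length written,
 * or -1 if the buffer is too small. */
static int format_cell(char *out, size_t outsz, int mode, int is_string,
                       const char *text)
{
    size_t n = 0;
    int quote = needs_quotes(mode, is_string, text);
    const char *p;

    if (outsz == 0) return -1;
    if (quote && put(out, outsz, &n, '"') != 0) return -1;
    for (p = text; *p != '\0'; p++) {
        if (quote && *p == '"' && put(out, outsz, &n, '"') != 0) return -1;
        if (put(out, outsz, &n, *p) != 0) return -1;
    }
    if (quote && put(out, outsz, &n, '"') != 0) return -1;
    out[n] = '\0';
    return (int)n;
}
```

Calling format_cell(buf, sizeof buf, 2, 1, text) would correspond to the default mode for a string cell; numeric cells pass is_string as 0 and are left bare unless mode 3 is selected.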
--- /dev/null
+#!/bin/sh
+#
+# install - install a program, script, or datafile
+# This comes from X11R5 (mit/util/scripts/install.sh).
+#
+# Copyright 1991 by the Massachusetts Institute of Technology
+#
+# Permission to use, copy, modify, distribute, and sell this software and its
+# documentation for any purpose is hereby granted without fee, provided that
+# the above copyright notice appear in all copies and that both that
+# copyright notice and this permission notice appear in supporting
+# documentation, and that the name of M.I.T. not be used in advertising or
+# publicity pertaining to distribution of the software without specific,
+# written prior permission. M.I.T. makes no representations about the
+# suitability of this software for any purpose. It is provided "as is"
+# without express or implied warranty.
+#
+# Calling this script install-sh is preferred over install.sh, to prevent
+# `make' implicit rules from creating a file called install from it
+# when there is no Makefile.
+#
+# This script is compatible with the BSD install script, but was written
+# from scratch. It can only install one file at a time, a restriction
+# shared with many OS's install programs.
+
+
+# set DOITPROG to echo to test this script
+
+# Don't use :- since 4.3BSD and earlier shells don't like it.
+doit="${DOITPROG-}"
+
+
+# put in absolute paths if you don't have them in your path; or use env. vars.
+
+mvprog="${MVPROG-mv}"
+cpprog="${CPPROG-cp}"
+chmodprog="${CHMODPROG-chmod}"
+chownprog="${CHOWNPROG-chown}"
+chgrpprog="${CHGRPPROG-chgrp}"
+stripprog="${STRIPPROG-strip}"
+rmprog="${RMPROG-rm}"
+mkdirprog="${MKDIRPROG-mkdir}"
+
+transformbasename=""
+transformarg=""
+instcmd="$mvprog"
+chmodcmd="$chmodprog 0755"
+chowncmd=""
+chgrpcmd=""
+stripcmd=""
+rmcmd="$rmprog -f"
+mvcmd="$mvprog"
+src=""
+dst=""
+dir_arg=""
+
+while [ x"$1" != x ]; do
+ case $1 in
+ -c) instcmd="$cpprog"
+ shift
+ continue;;
+
+ -d) dir_arg=true
+ shift
+ continue;;
+
+ -m) chmodcmd="$chmodprog $2"
+ shift
+ shift
+ continue;;
+
+ -o) chowncmd="$chownprog $2"
+ shift
+ shift
+ continue;;
+
+ -g) chgrpcmd="$chgrpprog $2"
+ shift
+ shift
+ continue;;
+
+ -s) stripcmd="$stripprog"
+ shift
+ continue;;
+
+ -t=*) transformarg=`echo $1 | sed 's/-t=//'`
+ shift
+ continue;;
+
+ -b=*) transformbasename=`echo $1 | sed 's/-b=//'`
+ shift
+ continue;;
+
+ *) if [ x"$src" = x ]
+ then
+ src=$1
+ else
+ # this colon is to work around a 386BSD /bin/sh bug
+ :
+ dst=$1
+ fi
+ shift
+ continue;;
+ esac
+done
+
+if [ x"$src" = x ]
+then
+ echo "install: no input file specified"
+ exit 1
+else
+ true
+fi
+
+if [ x"$dir_arg" != x ]; then
+ dst=$src
+ src=""
+
+ if [ -d $dst ]; then
+ instcmd=:
+ else
+ instcmd=mkdir
+ fi
+else
+
+# Waiting for this to be detected by the "$instcmd $src $dsttmp" command
+# might cause directories to be created, which would be especially bad
+# if $src (and thus $dsttmp) contains '*'.
+
+ if [ -f $src -o -d $src ]
+ then
+ true
+ else
+ echo "install: $src does not exist"
+ exit 1
+ fi
+
+ if [ x"$dst" = x ]
+ then
+ echo "install: no destination specified"
+ exit 1
+ else
+ true
+ fi
+
+# If destination is a directory, append the input filename; if your system
+# does not like double slashes in filenames, you may need to add some logic
+
+ if [ -d $dst ]
+ then
+ dst="$dst"/`basename $src`
+ else
+ true
+ fi
+fi
+
+## this sed command emulates the dirname command
+dstdir=`echo $dst | sed -e 's,[^/]*$,,;s,/$,,;s,^$,.,'`
+
+# Make sure that the destination directory exists.
+# this part is taken from Noah Friedman's mkinstalldirs script
+
+# Skip lots of stat calls in the usual case.
+if [ ! -d "$dstdir" ]; then
+defaultIFS='
+'
+IFS="${IFS-${defaultIFS}}"
+
+oIFS="${IFS}"
+# Some sh's can't handle IFS=/ for some reason.
+IFS='%'
+set - `echo ${dstdir} | sed -e 's@/@%@g' -e 's@^%@/@'`
+IFS="${oIFS}"
+
+pathcomp=''
+
+while [ $# -ne 0 ] ; do
+ pathcomp="${pathcomp}${1}"
+ shift
+
+ if [ ! -d "${pathcomp}" ] ;
+ then
+ $mkdirprog "${pathcomp}"
+ else
+ true
+ fi
+
+ pathcomp="${pathcomp}/"
+done
+fi
+
+if [ x"$dir_arg" != x ]
+then
+ $doit $instcmd $dst &&
+
+ if [ x"$chowncmd" != x ]; then $doit $chowncmd $dst; else true ; fi &&
+ if [ x"$chgrpcmd" != x ]; then $doit $chgrpcmd $dst; else true ; fi &&
+ if [ x"$stripcmd" != x ]; then $doit $stripcmd $dst; else true ; fi &&
+ if [ x"$chmodcmd" != x ]; then $doit $chmodcmd $dst; else true ; fi
+else
+
+# If we're going to rename the final executable, determine the name now.
+
+ if [ x"$transformarg" = x ]
+ then
+ dstfile=`basename $dst`
+ else
+ dstfile=`basename $dst $transformbasename |
+ sed $transformarg`$transformbasename
+ fi
+
+# don't allow the sed command to completely eliminate the filename
+
+ if [ x"$dstfile" = x ]
+ then
+ dstfile=`basename $dst`
+ else
+ true
+ fi
+
+# Make a temp file name in the proper directory.
+
+ dsttmp=$dstdir/#inst.$$#
+
+# Move or copy the file name to the temp name
+
+ $doit $instcmd $src $dsttmp &&
+
+ trap "rm -f ${dsttmp}" 0 &&
+
+# and set any options; do chmod last to preserve setuid bits
+
+# If any of these fail, we abort the whole thing. If we want to
+# ignore errors from any of these, just make sure not to ignore
+# errors from the above "$doit $instcmd $src $dsttmp" command.
+
+ if [ x"$chowncmd" != x ]; then $doit $chowncmd $dsttmp; else true;fi &&
+ if [ x"$chgrpcmd" != x ]; then $doit $chgrpcmd $dsttmp; else true;fi &&
+ if [ x"$stripcmd" != x ]; then $doit $stripcmd $dsttmp; else true;fi &&
+ if [ x"$chmodcmd" != x ]; then $doit $chmodcmd $dsttmp; else true;fi &&
+
+# Now rename the file to the real destination.
+
+ $doit $rmcmd -f $dstdir/$dstfile &&
+ $doit $mvcmd $dsttmp $dstdir/$dstfile
+
+fi &&
+
+
+exit 0
--- /dev/null
+#! /bin/sh
+# Common stub for a few missing GNU programs while installing.
+# Copyright (C) 1996, 1997 Free Software Foundation, Inc.
+# Franc,ois Pinard <pinard@iro.umontreal.ca>, 1996.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2, or (at your option)
+# any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
+# 02111-1307, USA.
+
+if test $# -eq 0; then
+ echo 1>&2 "Try \`$0 --help' for more information"
+ exit 1
+fi
+
+case "$1" in
+
+ -h|--h|--he|--hel|--help)
+ echo "\
+$0 [OPTION]... PROGRAM [ARGUMENT]...
+
+Handle \`PROGRAM [ARGUMENT]...' for when PROGRAM is missing, or return an
+error status if there is no known handling for PROGRAM.
+
+Options:
+ -h, --help display this help and exit
+ -v, --version output version information and exit
+
+Supported PROGRAM values:
+ aclocal touch file \`aclocal.m4'
+ autoconf touch file \`configure'
+ autoheader touch file \`config.h.in'
+ automake touch all \`Makefile.in' files
+ bison create \`y.tab.[ch]', if possible, from existing .[ch]
+ flex create \`lex.yy.c', if possible, from existing .c
+ lex create \`lex.yy.c', if possible, from existing .c
+ makeinfo touch the output file
+ yacc create \`y.tab.[ch]', if possible, from existing .[ch]"
+ ;;
+
+ -v|--v|--ve|--ver|--vers|--versi|--versio|--version)
+ echo "missing - GNU libit 0.0"
+ ;;
+
+ -*)
+ echo 1>&2 "$0: Unknown \`$1' option"
+ echo 1>&2 "Try \`$0 --help' for more information"
+ exit 1
+ ;;
+
+ aclocal)
+ echo 1>&2 "\
+WARNING: \`$1' is missing on your system. You should only need it if
+ you modified \`acinclude.m4' or \`configure.in'. You might want
+ to install the \`Automake' and \`Perl' packages. Grab them from
+ any GNU archive site."
+ touch aclocal.m4
+ ;;
+
+ autoconf)
+ echo 1>&2 "\
+WARNING: \`$1' is missing on your system. You should only need it if
+ you modified \`configure.in'. You might want to install the
+ \`Autoconf' and \`GNU m4' packages. Grab them from any GNU
+ archive site."
+ touch configure
+ ;;
+
+ autoheader)
+ echo 1>&2 "\
+WARNING: \`$1' is missing on your system. You should only need it if
+ you modified \`acconfig.h' or \`configure.in'. You might want
+ to install the \`Autoconf' and \`GNU m4' packages. Grab them
+ from any GNU archive site."
+ files=`sed -n 's/^[ ]*A[CM]_CONFIG_HEADER([^):]*:\([^)]*\)).*/\1/p' configure.in`
+ if test -z "$files"; then
+ files=`sed -n 's/^[ ]*A[CM]_CONFIG_HEADER(\([^):]*\)).*/\1/p' configure.in`
+ test -z "$files" || files="$files.in"
+ else
+ files=`echo "$files" | sed -e 's/:/ /g'`
+ fi
+ test -z "$files" && files="config.h.in"
+ touch $files
+ ;;
+
+ automake)
+ echo 1>&2 "\
+WARNING: \`$1' is missing on your system. You should only need it if
+ you modified \`Makefile.am', \`acinclude.m4' or \`configure.in'.
+ You might want to install the \`Automake' and \`Perl' packages.
+ Grab them from any GNU archive site."
+ find . -type f -name Makefile.am -print \
+ | sed 's/^\(.*\).am$/touch \1.in/' \
+ | sh
+ ;;
+
+ bison|yacc)
+ echo 1>&2 "\
+WARNING: \`$1' is missing on your system. You should only need it if
+ you modified a \`.y' file. You may need the \`Bison' package
+ in order for those modifications to take effect. You can get
+ \`Bison' from any GNU archive site."
+ rm -f y.tab.c y.tab.h
+ if [ $# -ne 1 ]; then
+ eval LASTARG="\${$#}"
+ case "$LASTARG" in
+ *.y)
+ SRCFILE=`echo "$LASTARG" | sed 's/y$/c/'`
+ if [ -f "$SRCFILE" ]; then
+ cp "$SRCFILE" y.tab.c
+ fi
+ SRCFILE=`echo "$LASTARG" | sed 's/y$/h/'`
+ if [ -f "$SRCFILE" ]; then
+ cp "$SRCFILE" y.tab.h
+ fi
+ ;;
+ esac
+ fi
+ if [ ! -f y.tab.h ]; then
+ echo >y.tab.h
+ fi
+ if [ ! -f y.tab.c ]; then
+ echo 'main() { return 0; }' >y.tab.c
+ fi
+ ;;
+
+ lex|flex)
+ echo 1>&2 "\
+WARNING: \`$1' is missing on your system. You should only need it if
+ you modified a \`.l' file. You may need the \`Flex' package
+ in order for those modifications to take effect. You can get
+ \`Flex' from any GNU archive site."
+ rm -f lex.yy.c
+ if [ $# -ne 1 ]; then
+ eval LASTARG="\${$#}"
+ case "$LASTARG" in
+ *.l)
+ SRCFILE=`echo "$LASTARG" | sed 's/l$/c/'`
+ if [ -f "$SRCFILE" ]; then
+ cp "$SRCFILE" lex.yy.c
+ fi
+ ;;
+ esac
+ fi
+ if [ ! -f lex.yy.c ]; then
+ echo 'main() { return 0; }' >lex.yy.c
+ fi
+ ;;
+
+ makeinfo)
+ echo 1>&2 "\
+WARNING: \`$1' is missing on your system. You should only need it if
+ you modified a \`.texi' or \`.texinfo' file, or any other file
+ indirectly affecting the aspect of the manual. The spurious
+ call might also be the consequence of using a buggy \`make' (AIX,
+ DU, IRIX). You might want to install the \`Texinfo' package or
+ the \`GNU make' package. Grab either from any GNU archive site."
+ file=`echo "$*" | sed -n 's/.*-o \([^ ]*\).*/\1/p'`
+ if test -z "$file"; then
+ file=`echo "$*" | sed 's/.* \([^ ]*\) *$/\1/'`
+ file=`sed -n '/^@setfilename/ { s/.* \([^ ]*\) *$/\1/; p; q; }' $file`
+ fi
+ touch $file
+ ;;
+
+ *)
+ echo 1>&2 "\
+WARNING: \`$1' is needed, and you do not seem to have it handy on your
+ system. You might have modified some files without having the
+ proper tools for further handling them. Check the \`README' file,
+ it often tells you about the needed prerequirements for installing
+ this package. You may also peek at any GNU archive site, in case
+ some other package would contain this missing \`$1' program."
+ exit 1
+ ;;
+esac
+
+exit 0
--- /dev/null
+#! /bin/sh
+# mkinstalldirs --- make directory hierarchy
+# Author: Noah Friedman <friedman@prep.ai.mit.edu>
+# Created: 1993-05-16
+# Public domain
+
+# $Id: mkinstalldirs,v 1.1 2006-02-24 17:44:06 vitus Exp $
+
+errstatus=0
+
+for file
+do
+ set fnord `echo ":$file" | sed -ne 's/^:\//#/;s/^://;s/\// /g;s/^#/\//;p'`
+ shift
+
+ pathcomp=
+ for d
+ do
+ pathcomp="$pathcomp$d"
+ case "$pathcomp" in
+ -* ) pathcomp=./$pathcomp ;;
+ esac
+
+ if test ! -d "$pathcomp"; then
+ echo "mkdir $pathcomp" 1>&2
+
+ mkdir "$pathcomp" || lasterr=$?
+
+ if test ! -d "$pathcomp"; then
+ errstatus=$lasterr
+ fi
+ fi
+
+ pathcomp="$pathcomp/"
+ done
+done
+
+exit $errstatus
+
+# mkinstalldirs ends here
--- /dev/null
+*-valgrind.log
+Data
+Makefile
+PRICE_LAN_ALL.XLS
+WordDocument
+aaa
+aaaa
+bbbb
+catdoc
+config.h
+core.*
+msole-excel.sheet.8-tmp1362961854
+semantic.cache
+test-ole
+test-ole.c
+test-oleparser.c
+wordview
+xls2csv
--- /dev/null
+# Your C compiler and flags
+SHELL = /bin/sh
+
+srcdir = @srcdir@
+top_srcdir = @top_srcdir@
+VPATH = @srcdir@
+prefix = @prefix@
+exec_prefix = @exec_prefix@
+installroot = @installroot@
+
+bindir = @bindir@
+libdir = @libdir@
+confdir = @sysconfdir@
+datadir = @datadir@
+mandir = @mandir@
+INSTALL = @INSTALL@
+WISH = @WISH@
+CC = @CC@
+COMPAT_OBJ=@LIBOBJS@
+build_targets = @buildtargets@
+install_targets = @installtargets@
+CHARSETPATH=${datadir}/catdoc
+# Flags to pass to your C compiler
+# remove -O2 on HP/UX 9. It is known to cause problems
+FLAGS=-I. @CFLAGS@ @DEFS@ -DCATDOC_VERSION=\"@catdoc_version@\" -DCHARSETPATH=\"${CHARSETPATH}\"
+
+#Compile-time configuration of catdoc itself. Edit according to taste
+
+#Directory, where all system-wide stuff resides
+DATA_DIR=$(datadir)/catdoc
+#directory, where catdoc binary would be placed
+#directory where to put man page
+
+#System-wide configuration file
+SYSTEMRC=$(confdir)/catdocrc
+
+#Per-user configuration file (in user's home directory)
+USERRC=.catdocrc
+
+# path where to search for charsets. may be colon-separated list of dirs
+
+# Extension for map of special character
+SPEC_EXT=@specsuffix@
+
+# Extension for map of substitutes (chars which are missing from target
+# charset)
+REPL_EXT=@replsuffix@
+
+# Target charset - one used on your system console or xterm
+TARGET_CHARSET=@targetcharset@
+
+# Source charset - one which your Word files are usually in unless they
+# are UNICODE
+SOURCE_CHARSET=@sourcecharset@
+
+# Character which would be printed instead of ones, not found in charset
+# or replacement map
+UNKNOWN_CHAR=?
+
+# END OF COMPILE-TIME CONFIGURATION
+
+DEFINES=-DSYSTEMRC=\"$(SYSTEMRC)\" -DUSERRC=\"$(USERRC)\"
+
+CFLAGS=$(FLAGS) $(DEFINES) @DEFS@
+
+COMMONOBJ=charsets.o substmap.o fileutil.o confutil.o numutils.o ole.o
+
+OBJ=catdoc.o reader.o writer.o analyze.o rtfread.o $(COMMONOBJ)
+
+OBJXLS=xls2csv.o sheet.o xlsparse.o $(COMMONOBJ)
+
+OBJPPT=catppt.o pptparse.o $(COMMONOBJ)
+#.c.o:
+# $(CC) -c $(CFLAGS) $*
+all: @buildtargets@
+catdoc: $(OBJ)
+ $(CC) -o catdoc $(OBJ)
+xls2csv: $(OBJXLS)
+ $(CC) -o xls2csv $(OBJXLS) -lm
+
+catppt: $(OBJPPT)
+ $(CC) -o catppt $(OBJPPT) -lm
+
+install: @installtargets@
+install-catdoc:catdoc xls2csv catppt
+ ../mkinstalldirs $(installroot)$(bindir)
+ $(INSTALL) -m 755 catdoc $(installroot)$(bindir)/catdoc
+ $(INSTALL) -m 755 xls2csv $(installroot)$(bindir)/xls2csv
+ $(INSTALL) -m 755 catppt $(installroot)$(bindir)/catppt
+install-wordview: wordview
+ ../mkinstalldirs $(installroot)$(bindir)
+ $(INSTALL) -m 755 wordview $(installroot)$(bindir)/wordview
+wordview: wordview.tcl
+ echo "#! $(WISH)" >wordview
+ echo set charset_lib "\"$(CHARSETPATH)\"">>wordview
+ cat wordview.tcl >>wordview
+ chmod 0755 wordview
+strftime.o: ../compat/strftime.c
+ $(CC) -c $(FLAGS) -o $@ $*
+clean:
+ rm -f *.o catdoc wordview xls2csv catppt
+distclean: clean
+ rm Makefile
--- /dev/null
+/*
+ Copyright 1998-2003 Victor Wagner
+ Copyright 2003 Alex Ott
+ This file is released under the GPL. Details can be
+ found in the file COPYING accompanying this distribution.
+*/
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+#include <stdio.h>
+#include <limits.h>
+#include <stdlib.h>
+#include <string.h>
+#include "catdoc.h"
+char ole_sign[]={0xD0,0xCF,0x11,0xE0,0xA1,0xB1,0x1A,0xE1,0};
+char rtf_sign[]="{\\rtf";
+char old_word_sign[]={0xdb,0xa5,0};
+char write_sign[]={0x31,0xBE,0};
+int verbose=0;
+/*********************************************************************
+ * Determines format of input file and calls parse_word_header or
+ * process_file if it is a word processor file, or copy_out if it is
+ * a plain text file.
+ * Returns non-zero on error.
+ ********************************************************************/
+int analyze_format(FILE *f) {
+ unsigned char buffer[129];
+ long offset=0;
+ FILE *new_file, *ole_file;
+ int ret_code=69;
+
+ if (!signature_check) {
+ /* forced parsing */
+ /* no autodetect possible. Assume 8-bit if not overridden on
+ * command line */
+ if (!get_unicode_char)
+ get_unicode_char=get_8bit_char;
+ return process_file(f,LONG_MAX);
+ }
+ catdoc_read(buffer,4,1,f);
+ buffer[4]=0;
+ if (strncmp(buffer,write_sign,2)==0) {
+ printf("[Windows Write file. Some garbage expected]\n");
+ get_unicode_char=get_8bit_char;
+ return process_file(f,LONG_MAX);
+ } else if (strncmp(buffer,rtf_sign,4)==0) {
+ return parse_rtf(f);
+ } else if (strncmp(buffer,old_word_sign,2)==0) {
+ fread(buffer+4,1,124,f);
+ return parse_word_header(buffer,f,128,0);
+ }
+ fread(buffer+4,1,4,f);
+ if (strncmp(buffer,ole_sign,8)==0) {
+ if ((new_file=ole_init(f, buffer, 8)) != NULL) {
+ set_ole_func();
+ while((ole_file=ole_readdir(new_file)) != NULL) {
+ int res=ole_open(ole_file);
+ if (res >= 0) {
+ if (strcmp(((oleEntry*)ole_file)->name , "WordDocument") == 0) {
+ offset=catdoc_read(buffer, 1, 128, ole_file);
+ ret_code=parse_word_header(buffer,ole_file,-offset,offset);
+ }
+ }
+ ole_close(ole_file);
+ }
+ set_std_func();
+ ole_finish();
+ } else {
+ fprintf(stderr,"Broken OLE file. Try using -b switch\n");
+ exit(1);
+ }
+ } else {
+ set_std_func();
+ copy_out(f,buffer);
+ return 0;
+ }
+
+ return ret_code;
+}
+#define fDot 0x0001
+#define fGlsy 0x0002
+#define fComplex 0x0004
+#define fPictures 0x0008
+#define fEncrypted 0x100
+#define fReadOnly 0x400
+#define fReserved 0x800
+#define fExtChar 0x1000
+
+/*******************************************************************/
+/* parses word file info block passed in buffer.
+ * Determines actual size of text stream and calls process_file
+ ********************************************************************/
+int parse_word_header(unsigned char * buffer,FILE *f,int offset,long curpos) {
+ int flags,charset, ret_code=0;
+ long textstart,textlen,i;
+ char buf[2];
+
+ if (verbose) {
+ printf("File Info block version %d\n",getshort(buffer,2));
+ printf("Found at file offset %ld (hex %lx)\n",curpos,curpos);
+ printf("Written by product version %d\n",getshort(buffer,4));
+ printf("Language %d\n",getshort(buffer,6));
+ }
+ flags = getshort(buffer,10);
+ if (verbose) {
+ if ((flags & fDot)) {
+ printf("This is template (DOT) file\n");
+ } else {
+ printf("This is document (DOC) file\n");
+ }
+ if (flags & fGlsy) {
+ printf("This is glossary file\n");
+ }
+ }
+ if (flags & fComplex) {
+ fprintf(stderr,"[This was fast-saved %2d times. Some information is lost]\n",
+ (flags & 0xF0)>>4);
+/* ret_code=69;*/
+ }
+ if (verbose) {
+ if (flags & fReadOnly) {
+ printf("File is meant to be read-only\n");
+ }
+ if (flags & fReserved) {
+ printf("File is write-reserved\n");
+ }
+ }
+ if (flags & fExtChar) {
+ if (verbose) {
+ printf ("File uses extended character set\n");
+ }
+ if (!get_unicode_char)
+ get_unicode_char=get_word8_char;
+
+ } else if (!get_unicode_char)
+ get_unicode_char=get_8bit_char;
+
+ if (verbose) {
+ if (buffer[18]) {
+ printf("File created on Macintosh\n");
+ } else {
+ printf("File created on Windows\n");
+ }
+ }
+ if (flags & fEncrypted) {
+ fprintf(stderr,"[File is encrypted. Encryption key = %08lx]\n",
+ getlong(buffer,14));
+ return 69;
+ }
+ if (verbose) {
+ charset=getshort(buffer,20);
+ if (charset&&charset !=256) {
+ printf("Using character set %d\n",charset);
+ } else {
+ printf("Using default character set\n");
+ }
+ }
+ /* skipping to textstart and computing textend */
+ textstart=getlong(buffer,24);
+ textlen=getlong(buffer,28)-textstart;
+ textstart+=offset;
+ if (verbose) {
+ printf ("Textstart = %ld (hex %lx)\n",textstart+curpos,textstart+curpos);
+ printf ("Textlen = %ld (hex %lx)\n",textlen,textlen);
+ }
+ for (i=0;i<textstart;i++) {
+ catdoc_read(buf, 1, 1, f);
+ if (catdoc_eof(f)) {
+ fprintf(stderr,"File ended before textstart. Probably it is broken. Try -b switch\n");
+ exit(1);
+ }
+ }
+ return process_file(f,textlen) || ret_code;
+}
+
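
The signature checks performed by analyze_format in analyze.c above can be illustrated with a small standalone classifier. This is a sketch under stated assumptions, not the catdoc implementation: the enum and the function name detect_format are hypothetical, and memcmp is substituted for the strncmp calls in the source so that the comparison is byte-exact regardless of char signedness. The signature bytes and the order of the checks are taken from analyze.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type; catdoc itself dispatches directly instead. */
enum doc_format { FMT_TEXT, FMT_WRITE, FMT_RTF, FMT_OLD_WORD, FMT_OLE };

/* Classify a file by its leading bytes, in the same order as
 * analyze_format: Write, RTF, old Word, then OLE; otherwise plain text. */
static enum doc_format detect_format(const unsigned char *buf, size_t len)
{
    static const unsigned char ole_sig[8] =
        {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    static const unsigned char old_word_sig[2] = {0xDB, 0xA5};
    static const unsigned char write_sig[2] = {0x31, 0xBE};
    static const char rtf_sig[] = "{\\rtf";

    if (len >= 2 && memcmp(buf, write_sig, 2) == 0) return FMT_WRITE;
    /* analyze.c compares only the first four bytes of the RTF signature */
    if (len >= 4 && memcmp(buf, rtf_sig, 4) == 0) return FMT_RTF;
    if (len >= 2 && memcmp(buf, old_word_sig, 2) == 0) return FMT_OLD_WORD;
    if (len >= 8 && memcmp(buf, ole_sig, 8) == 0) return FMT_OLE;
    return FMT_TEXT;
}
```

Because the function takes a length, a short read (for example a 3-byte file on a non-seekable pipe) falls through to FMT_TEXT rather than comparing uninitialised bytes.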
--- /dev/null
+/*
+ Copyright 1996-2003 Victor Wagner
+ Copyright 2003 Alex Ott
+ This file is released under the GPL. Details can be
+ found in the file COPYING accompanying this distribution.
+*/
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <ctype.h>
+#include "catdoc.h"
+
+void help(void);
+
+
+int signature_check = 1;
+int forced_charset = 0; /* Flag which forbids the RTF parser from overriding the charset */
+int wrap_margin = WRAP_MARGIN;
+int (*get_unicode_char)(FILE *f,long *offset,long fileend) =NULL;
+
+char *input_buffer, *output_buffer;
+#ifdef __WATCOMC__
+/* watcom doesn't provide way to access program args via global variable */
+/* so we would hack it ourselves in Borland-compatible way*/
+char **_argv;
+int _argc;
+#endif
+/**************************************************************/
+/* Main program */
+/* Processes options, reads charsets files and substitution */
+/* maps and passes all remaining args to processfile */
+/**************************************************************/
+int main(int argc, char **argv) {
+ FILE *f;
+ int c,i;
+ char *tempname;
+ short int *tmp_charset;
+ int stdin_processed=0;
+#ifdef __WATCOMC__
+ _argv=argv;
+ _argc=argc;
+#endif
+ read_config_file(SYSTEMRC);
+#ifdef USERRC
+ tempname=find_file(strdup(USERRC),getenv("HOME"));
+ if (tempname) {
+ read_config_file(tempname);
+ free(tempname);
+ }
+#endif
+#ifdef HAVE_LANGINFO
+ get_locale_charset();
+#endif
+ while ((c=getopt(argc,argv,"Vls:d:f:taubxv8wm:"))!=-1) {
+ switch (c) {
+ case 's':
+ check_charset(&source_csname,optarg);
+ forced_charset = 1;
+ break;
+ case 'd':
+ check_charset(&dest_csname,optarg);
+ break;
+ case 'f':
+ format_name=strdup(optarg);
+ break;
+ case 't':
+ format_name=strdup("tex");
+ break;
+ case 'a':
+ format_name=strdup("ascii");
+ break;
+ case 'u':
+ get_unicode_char = get_word8_char;
+ break;
+ case '8':
+ get_unicode_char = get_8bit_char;
+ break;
+ case 'v':
+ verbose=1;
+ break;
+ case 'w':
+ wrap_margin=0; /* No wrap */
+ break;
+ case 'm': {
+ char *endptr;
+ wrap_margin = strtol(optarg,&endptr,0);
+ if (*endptr) {
+ fprintf(stderr,"Invalid wrap margin value `%s'\n",optarg);
+ exit(1);
+ }
+ break;
+ }
+ case 'l': list_charsets(); exit(0);
+ case 'b': signature_check =0; break;
+ case 'x': unknown_as_hex = 1; break;
+ case 'V': printf("Catdoc Version %s\n",CATDOC_VERSION);
+ exit(0);
+ default:
+ help();
+ exit(1);
+ }
+ }
+ input_buffer=malloc(FILE_BUFFER);
+ if (!input_buffer) {
+ fprintf(stderr,"Input buffer not allocated\n");
+ }
+ source_charset = read_charset(source_csname);
+ if (!source_charset) exit(1);
+ if (strncmp(dest_csname,"utf-8",6)) {
+ tmp_charset = read_charset(dest_csname);
+ if (!tmp_charset) exit(1);
+ target_charset= make_reverse_map(tmp_charset);
+ free(tmp_charset);
+ } else {
+ target_charset = NULL;
+ }
+ spec_chars=read_substmap(stradd(format_name,SPEC_EXT));
+ if (!spec_chars) {
+ fprintf(stderr,"Cannot read substitution map %s%s\n",format_name,
+ SPEC_EXT);
+ exit(1);
+ }
+ replacements=read_substmap(stradd(format_name,REPL_EXT));
+ if (!replacements) {
+ fprintf(stderr,"Cannot read replacement map %s%s\n",format_name,
+ REPL_EXT);
+ exit(1);
+ }
+
+ if (LINE_BUF_SIZE-longest_sequence<=wrap_margin) {
+ fprintf(stderr,"wrap margin is too large. cannot proceed\n");
+ exit(1);
+ }
+ if (!isatty(fileno(stdout))) {
+ output_buffer=malloc(FILE_BUFFER);
+ if (output_buffer) {
+ if (setvbuf(stdout,output_buffer,_IOFBF,FILE_BUFFER)) {
+ perror("stdout");
+ }
+ } else {
+ fprintf(stderr,"output buffer not allocated\n");
+ }
+ }
+ set_std_func();
+ if (optind == argc) {
+ if (isatty(fileno(stdin))) {
+ help();
+ exit(0);
+ }
+ if (input_buffer) setvbuf(stdin,input_buffer,_IOFBF,FILE_BUFFER);
+ return analyze_format(stdin);
+ }
+ c=0;
+ for (i=optind;i<argc;i++) {
+ if (!strcmp(argv[i],"-")) {
+ if (stdin_processed) {
+ fprintf(stderr,"Cannot process stdin twice\n");
+ exit(1);
+ }
+ if (input_buffer) setvbuf(stdin,input_buffer,_IOFBF,FILE_BUFFER);
+ analyze_format(stdin);
+ stdin_processed=1;
+ } else {
+ f=fopen(argv[i],"rb");
+ if (!f) {
+ c=1;
+ perror("catdoc");
+ continue;
+ }
+ if (input_buffer) {
+ if (setvbuf(f,input_buffer,_IOFBF,FILE_BUFFER)) {
+ perror(argv[i]);
+ }
+ }
+ c=analyze_format(f);
+ fclose(f);
+ }
+ }
+ return c;
+}
+/************************************************************************/
+/* Displays help message */
+/************************************************************************/
+void help (void) {
+ printf("Usage:\n catdoc [-vu8btawxlV] [-m number] [-s charset] "
+ "[-d charset] [ -f format] files\n");
+}
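
The handling of the -m option in catdoc.c above (strtol followed by an end-pointer check, and later the comparison against LINE_BUF_SIZE minus longest_sequence) could be combined into one validating helper along these lines. This is a sketch only: parse_wrap_margin and its limit parameter are hypothetical names, and the rejection of empty, negative, and overflowing values is an extra assumption that the original code does not make.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a wrap margin given on the command line.
 * LIMIT plays the role of LINE_BUF_SIZE - longest_sequence; the margin
 * must be strictly smaller. Returns 0 and stores the value in *OUT on
 * success, -1 on any error. */
static int parse_wrap_margin(const char *arg, long limit, long *out)
{
    char *endptr;
    long value;

    errno = 0;
    value = strtol(arg, &endptr, 0);   /* base 0, as in catdoc.c */
    if (endptr == arg || *endptr != '\0')
        return -1;                     /* empty input or trailing garbage */
    if (errno == ERANGE || value < 0 || value >= limit)
        return -1;                     /* out of range for the line buffer */
    *out = value;
    return 0;
}
```

Since base 0 is passed to strtol, a margin such as 0x20 is accepted as 32, which matches what the existing option parsing would accept.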
--- /dev/null
+/*
+ Copyright 1998-2003 Victor Wagner
+ Copyright 2003 Alex Ott
+ This file is released under the GPL. Details can be
+ found in the file COPYING accompanying this distribution.
+*/
+#ifndef CATDOC_H
+#define CATDOC_H
+
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+
+/* There is some strange thing on aix */
+#if (defined(_AIX)||defined(___AIX)) && !defined(__unix)
+# define __unix 1
+#endif
+
+/* These include files are always available */
+#include <stdio.h>
+#include <ctype.h>
+
+/* This is our own file */
+#include "ole.h"
+
+/*
+ * User customization
+ *
+ */
+
+
+#if defined(__MSDOS__) || defined(_WIN32)
+/* MS-DOS doesn't like dot at first char and thinks that suffix
+ * should be separated by dot. So we'd call personal config catdoc.rc
+ */
+# define USERRC "catdoc.rc"
+/* In DOS, %s in path gets replaced with full path to executable including
+ trailing backslash.
+ */
+# ifndef SYSTEMRC
+# define SYSTEMRC "%s\\catdoc.rc"
+# endif
+# ifndef CHARSETPATH
+# define CHARSETPATH "%s\\charsets"
+# endif
+/* Function to add executable directory in place of %s in path.
+ Not usable in Unix, where an executable can have more than one
+ link and configuration files are usually kept separately from executables
+ */
+char *add_exe_path(const char* name);
+/* Separator of directories in list, such as PATH env var. */
+# define LIST_SEP ';'
+/* Separator of levels inside path */
+# define DIR_SEP '\\'
+#else
+/* On POSIX systems personal configuration files should start with dot*/
+# ifndef USERRC
+# define USERRC ".catdocrc"
+# endif
+
+# ifndef SYSTEMRC
+# define SYSTEMRC "/usr/local/lib/catdoc/catdocrc"
+# endif
+
+# ifndef CHARSETPATH
+# define CHARSETPATH "/usr/local/lib/catdoc"
+# endif
+/* Macro to add executable directory in place of %s in path.
+ Not usable in Unix, where an executable can have more than one
+ link and configuration files are usually kept separately from executables
+ */
+# define add_exe_path(name) name
+/* Separator of directories in list, such as PATH env var. */
+# define LIST_SEP ':'
+/* Separator of levels inside path */
+#define DIR_SEP '/'
+#endif
+
+/* Charset files distributed with catdoc always have .txt extension*/
+#ifndef CHARSET_EXT
+# define CHARSET_EXT ".txt"
+#endif
+
+/* Default charsets */
+#ifndef TARGET_CHARSET
+#if defined(__MSDOS__) || defined(_WIN32)
+#define TARGET_CHARSET "cp866"
+#else
+#define TARGET_CHARSET "koi8-r"
+#endif
+#endif
+
+#ifndef SOURCE_CHARSET
+#define SOURCE_CHARSET "cp1251"
+#endif
+
+#ifndef UNKNOWN_CHAR
+#define UNKNOWN_CHAR "?"
+#endif
+/* On MS-DOS and WIN32 files have to have 3-char extension */
+#if defined(__MSDOS__) || defined(_WIN32)
+# ifndef SPEC_EXT
+# define SPEC_EXT ".spc"
+# endif
+# ifndef REPL_EXT
+# define REPL_EXT ".rpl"
+# endif
+#else
+
+/* On other systems we'll rename them to something more readable */
+# ifndef SPEC_EXT
+# define SPEC_EXT ".specchars"
+# endif
+# ifndef REPL_EXT
+# define REPL_EXT ".replchars"
+# endif
+#endif
+#if defined(__MSDOS__) && !defined(__DJGPP__)
+/* Buffer sizes for 16-bit DOS program */
+#define PARAGRAPH_BUFFER 16384
+#define FILE_BUFFER 32256
+#define PATH_BUF_SIZE 80
+#else
+/* Buffers for 32-bit and more program */
+#define PARAGRAPH_BUFFER 262144
+#define FILE_BUFFER 262144
+#define PATH_BUF_SIZE 1024
+#endif
+
+/* Buffer for single line. Should be greater than wrap margin +
+ longest substitution sequence */
+#define LINE_BUF_SIZE 512
+/* Default value for wrap margin */
+#ifndef WRAP_MARGIN
+#define WRAP_MARGIN 72
+#endif
+/* variable (defined in catdoc.c) which holds actual value of wrap margin*/
+extern int wrap_margin;
+/*
+ * Public types, variables and procedures which should be available
+ * to all files in the program
+ */
+
+#ifdef __TURBOC__
+/* Turbo C defines broken isspace, which works only for us-ascii */
+#undef isspace
+#define isspace(c) ((unsigned char)(c) <=32)
+#endif
+
+/* Structure to store UNICODE -> target charset mappings */
+/* array of 256 pointers (which may be null) to arrays of 256 short ints
+ which contain 8-bit character codes or -1 if no matching char */
+typedef short int ** CHARSET;
+
+/* structure to store multicharacter substitution mapping */
+/* Array of 256 pointers to arrays of 256 pointers to string */
+/* configuration variables defined in catdoc.c */
+typedef char *** SUBSTMAP;
+
+extern short int *source_charset;
+extern char bad_char[]; /* defines one-symbol string to replace unknown unicode chars */
+extern char *source_csname;
+extern char *dest_csname;
+extern char *format_name;
+extern CHARSET target_charset;
+extern SUBSTMAP spec_chars;
+ /* Defines unicode chars which should be
+ replaced by strings before UNICODE->target charset
+ mappings are applied, e.g. TeX special chars like %
+ */
+extern SUBSTMAP replacements;
+ /* Defines unicode chars which could be
+ mapped to some character sequence if no
+ corresponding character exists in the target charset
+ i.e copyright sign */
+extern int verbose; /* if true, some additional information would be
+ printed. defined in analyze.c */
+extern int (*get_unicode_char)(FILE *f,long *offset,long fileend);
+/* pointer to function which gets
+ a char from stream */
+
+extern int get_utf16lsb (FILE *f,long *offset,long fileend);
+extern int get_utf16msb (FILE *f,long *offset,long fileend);
+extern int get_utf8 (FILE *f,long *offset,long fileend);
+extern int get_8bit_char (FILE *f,long *offset,long fileend);
+
+extern int get_word8_char (FILE *f,long *offset,long fileend);
+
+extern const char *charset_from_codepage(unsigned int codepage);
+extern short int *read_charset(const char *filename);
+extern CHARSET make_reverse_map (short int *charset);
+
+extern int to_unicode (short int *charset, int c) ;
+
+extern int from_unicode (CHARSET charset, int u) ;
+
+extern char* convert_char(int unicode_char);
+
+extern char* to_utf8(unsigned int uc);
+
+extern char* map_path, *charset_path;
+extern int signature_check;
+extern int unknown_as_hex;
+char *find_file(char *name, const char *path);
+char *stradd(const char *s1, const char *s2);
+void read_config_file(const char *filename);
+#ifdef HAVE_LANGINFO
+void get_locale_charset(void);
+#if defined(HAVE_STRFTIME) && !defined(__TURBOC__)
+void set_time_locale();
+#endif
+#endif
+SUBSTMAP read_substmap(char* filename);
+extern int longest_sequence;/* for checking which value of wrap_margin
+ can cause buffer overflow*/
+char *map_subst(SUBSTMAP map,int uc);
+
+int check_charset(char **filename,const char *charset);
+int process_file(FILE *f,long stop);
+void copy_out(FILE *f, char *header);
+void output_paragraph(unsigned short int *buffer) ;
+int parse_rtf(FILE *f);
+/* format recognition*/
+int analyze_format(FILE *f);
+void list_charsets(void);
+int parse_word_header(unsigned char *buffer,FILE *f,int offset,long curpos);
+/* large buffers for file IO*/
+extern char *input_buffer,*output_buffer;
+#ifndef HAVE_STRDUP
+ char *strdup(const char *s);
+#endif
+/* numeric conversions */
+long int getlong(unsigned char *buffer,int offset);
+unsigned long int getulong(unsigned char *buffer,int offset);
+unsigned int getshort(unsigned char *buffer,int offset);
+#endif
--- /dev/null
++-charsets.obj +-substmap.obj +-reader.obj +-writer.obj +-fileutil.obj &
++-langinfo.obj +-analyze.obj +-confutil.obj +-rtfread.obj +-numutils.obj &
++-sheet.obj +-xlsparse.obj +-ole.obj +-strftime.obj +-pptparse.obj
--- /dev/null
+/**
+ * @file catppt.c
+ * @author Alex Ott <alexott@gmail.com>
+ * @date 23 Dec 2004
+ * Version: $Id: catppt.c,v 1.1 2006-02-24 17:44:06 vitus Exp $
+ * Copyright: Alex Ott
+ *
+ * @brief main module for text extracting from .ppt
+ *
+ *
+ */
+
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+#include <stdio.h>
+#include <time.h>
+#include <math.h>
+#include <ctype.h>
+#include <string.h>
+#include <unistd.h>
+#include "ppt.h"
+#include "catdoc.h"
+#include <stdlib.h>
+#include "float.h"
+
+#ifdef __TURBOC__
+#define strcasecmp(a,b) strcmpi(a,b)
+#endif
+
+/**
+ * Displays help message
+ *
+ */
+void help (void) {
+ printf("Usage:\n catppt [-lV] [-s charset] [-d charset] files\n");
+}
+
+
+char *input_buffer, *output_buffer;
+
+/**
+ *
+ *
+ * @param argc
+ * @param argv
+ *
+ * @return
+ */
+int main(int argc, char *argv[]) {
+ FILE *input;
+ FILE *new_file, *ole_file;
+ char *filename =NULL;
+ short int *tmp_charset;
+ int c;
+ int i;
+ char *tempname;
+ read_config_file(SYSTEMRC);
+#ifdef USERRC
+ tempname=find_file(strdup(USERRC),getenv("HOME"));
+ if (tempname) {
+ read_config_file(tempname);
+ free(tempname);
+ }
+#endif
+#ifdef HAVE_LANGINFO
+ get_locale_charset();
+#endif
+
+ check_charset(&dest_csname,dest_csname);
+
+ while ((c=getopt(argc,argv,"Vls:d:p:"))!=-1) {
+ switch(c) {
+ case 'l':
+ list_charsets(); exit(0);
+ case 's':
+ check_charset(&source_csname,optarg);
+ source_charset=read_charset(source_csname);
+ break;
+ case 'd':
+ check_charset(&dest_csname,optarg);
+ break;
+ case 'V': printf("Catdoc Version %s\n",CATDOC_VERSION);
+ exit(0);
+ default:
+ help();
+ exit(1);
+ }
+ }
+ /* If we are using the system strftime, we need to set the LC_TIME
+ * locale category, unless the chosen charset differs from that of
+ * the system locale
+ */
+#if defined(HAVE_LANGINFO) && defined(HAVE_STRFTIME) && !defined(__TURBOC__)
+ set_time_locale();
+#endif
+ /* charset conversion init*/
+ input_buffer=malloc(FILE_BUFFER);
+ if (strcmp(dest_csname,"utf-8")) {
+ tmp_charset=read_charset(dest_csname);
+ if (!tmp_charset) {
+ fprintf(stderr,"Cannot load target charset %s\n",dest_csname);
+ exit(1);
+ }
+ target_charset=make_reverse_map(tmp_charset);
+ free(tmp_charset);
+ } else {
+ target_charset=NULL;
+ }
+ spec_chars=read_substmap(stradd("ascii",SPEC_EXT));
+ if (!spec_chars) {
+ fprintf(stderr,"Cannot read substitution map ascii%s\n",
+ SPEC_EXT);
+ exit(1);
+ }
+ replacements=read_substmap(stradd("ascii",REPL_EXT));
+ if (!replacements) {
+ fprintf(stderr,"Cannot read substitution map ascii%s\n",
+ REPL_EXT);
+ exit(1);
+ }
+ if (optind>=argc) {
+ if (isatty(fileno(stdin))) {
+ help();
+ exit(0);
+ }
+ do_ppt(stdin,"STDIN");
+ exit (0);
+ }
+ for (i=optind;i<argc;i++) {
+ filename = argv[i];
+ input=fopen(filename,"rb");
+ if (!input) {
+ perror(filename);
+ exit(1);
+ }
+ if ((new_file=ole_init(input, NULL, 0)) != NULL) {
+ set_ole_func();
+ while((ole_file=ole_readdir(new_file)) != NULL) {
+ int res=ole_open(ole_file);
+/* fprintf(stderr, "name = %s\n", ((oleEntry*)ole_file)->name); */
+ if (res >= 0) {
+ if (strcasecmp(((oleEntry*)ole_file)->name , "PowerPoint Document") == 0) {
+ do_ppt(ole_file,filename);
+ }
+ }
+ ole_close(ole_file);
+ }
+ set_std_func();
+ ole_finish();
+ fclose(new_file);
+ } else {
+ fprintf(stderr, "%s is not an OLE file or cannot be read\n", filename);
+ }
+ }
+ return 0;
+}
--- /dev/null
+/*
+ Copyright 1998-2003 Victor Wagner
+ Copyright 2003 Alex Ott
+ This file is released under the GPL. Details can be
+ found in the file COPYING accompanying this distribution.
+*/
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+#include <stdio.h>
+#include <string.h>
+#include <stdlib.h>
+#include "catdoc.h"
+
+char *charset_path=CHARSETPATH;
+char *source_csname=SOURCE_CHARSET, *dest_csname=TARGET_CHARSET;
+short int * source_charset;
+int unknown_as_hex=0;
+char bad_char[]=UNKNOWN_CHAR;
+CHARSET target_charset;
+/************************************************************************/
+/* Converts char in input charset into unicode representation */
+/* Should be converted to macro */
+/************************************************************************/
+int to_unicode (short int *charset, int c) {
+ return charset[c];
+}
+/************************************************************************/
+/* Search inverse charset record for given unicode char and returns */
+/* 0-255 char value if found, -1 otherwise */
+/************************************************************************/
+int from_unicode (CHARSET charset, int u) {
+ short int *p;
+ /* This is really an assignment, not a comparison */
+ if ((p=charset[(unsigned)u>>8])) {
+ return p[u & 0xff];
+ } else {
+ return -1;
+ }
+}
+/************************************************************************/
+/* Converts direct (charset -> unicode) to reverse map */
+/************************************************************************/
+CHARSET make_reverse_map(short int *charset) {
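+ CHARSET newmap;
+ int i,j,k,l;
+ short int *p;
+ if (! charset) {
+ return NULL;
+ }
+ /* allocate only after the argument check, so nothing leaks */
+ newmap=calloc(256, sizeof(short int *));
+ if (!newmap) {
+ fprintf(stderr,"Insufficient memory for charset\n");
+ exit(1);
+ }
+ for (i=0;i<256;i++) {
+ k= charset[i];
+ /* mask keeps the page index in 0..255 even if k is negative
+ and int is wider than 16 bits */
+ j= ((unsigned)k>>8) & 0xff;
+ if (!newmap[j]) {
+ /* each page holds 256 short ints, not 256 pointers */
+ newmap[j] = malloc(sizeof(short int)*256);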
+ if (!newmap[j]) {
+ fprintf(stderr,"Insufficient memory for charset\n");
+ exit(1);
+ }
+ for (l=0,p=newmap[j];l<256;l++,p++) *p=-1;
+ }
+ p=newmap[j];
+ p[k & 0xff]=i;
+ }
+ return newmap;
+}
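The two-level reverse table used by make_reverse_map and from_unicode can be sketched in isolation. The following is an illustration only, not catdoc's code: the names revmap, revmap_build, revmap_lookup and revmap_free are hypothetical, and the forward table is assumed to hold 256 unsigned 16-bit code points.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for catdoc's CHARSET: 256 pages of 256 entries. */
typedef short **revmap;

/* Releases a table built by revmap_build(); NULL is accepted. */
static void revmap_free(revmap map)
{
	int i;
	if (!map) return;
	for (i = 0; i < 256; i++) free(map[i]);
	free(map);
}

/* Builds a Unicode -> 8-bit table from a forward table of 256 code points.
 * Pages are allocated lazily. Returns NULL if memory runs out. */
static revmap revmap_build(const unsigned short *forward)
{
	revmap map;
	int i, l;
	assert(forward != NULL);
	map = malloc(256 * sizeof *map);
	if (!map) return NULL;
	for (i = 0; i < 256; i++) map[i] = NULL;
	for (i = 0; i < 256; i++) {
		unsigned int page = (forward[i] >> 8) & 0xffu;
		if (!map[page]) {
			map[page] = malloc(256 * sizeof **map);
			if (!map[page]) {
				revmap_free(map);
				return NULL;
			}
			/* -1 marks "no such character in this charset" */
			for (l = 0; l < 256; l++) map[page][l] = -1;
		}
		map[page][forward[i] & 0xffu] = (short)i;
	}
	return map;
}

/* Returns the 8-bit code for u, or -1 if the charset has no such character. */
static int revmap_lookup(revmap map, unsigned int u)
{
	const short *page;
	assert(map != NULL);
	page = map[(u >> 8) & 0xffu];
	return page ? page[u & 0xffu] : -1;
}
```

Only the pages that actually occur in the charset are allocated, so a typical 8-bit charset needs a handful of 512-byte pages rather than a full 64K-entry table.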
+
+/************************************************************************/
+/* Reads charset file (as got from ftp.unicode.org) and returns array of*/
+/* 256 short ints (malloced) mapping from charset to unicode */
+/************************************************************************/
+short int * read_charset(const char *filename) {
+ char *path;
+ FILE *f;
+ short int *new=calloc(sizeof(short int),256);
+ int c;
+ long int uc;
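+ if (!new) {
+ fprintf(stderr,"Insufficient memory for charset\n");
+ return NULL;
+ }
+ path= find_file(stradd(filename,CHARSET_EXT),charset_path);
+ if (!path) {
+ fprintf(stderr,"Cannot load charset %s - file not found\n",filename);
+ free(new);
+ return NULL;
+ }
+ f=fopen(path,"rb");
+
+ if (!f) {
+ perror(path);
+ free(path);
+ free(new);
+ return NULL;
+ }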
+ if (input_buffer)
+ setvbuf(f,input_buffer,_IOFBF,FILE_BUFFER);
+ /* defaults */
+ for (c=0;c<32;c++) {
+ new[c]=c;
+ }
+ while (!feof(f)) {
+ if (fscanf(f,"%i %li",&c,&uc)==2) {
+ if (c<0||c>255||uc<0||(uc>0xFEFE&& uc!=0xFFFE)) {
+ fprintf(stderr,"Invalid charset file %s\n",path);
+ fclose(f);
+ free(path);
+ free(new);
+ return NULL;
+ }
+ new[c]=uc;
+ }
+ while((fgetc(f)!='\n')&&!feof(f)) ;
+ }
+ fclose (f);
+ free(path);
+ return new;
+}
+
+
+/************************************************************************/
+/* Reads 8-bit char and converts it from source charset */
+/************************************************************************/
+
+int get_8bit_char (FILE *f,long *offset,long fileend)
+{
+ unsigned char buf;
+ if (catdoc_read(&buf, 1, 1, f)==0) return EOF;
+ (*offset)++;
+ return to_unicode(source_charset,buf);
+}
+
+
+/************************************************************************/
+/* Reads 16-bit unicode value. MS-Word runs on LSB-first machine only, */
+/* so always read LSB first and don't care about native byte order */
+/************************************************************************/
+
+int get_utf16lsb (FILE *f,long *offset,long fileend) {
+ unsigned char buf[2];
+ int result;
+ result=catdoc_read(buf, 1, 2, f);
+ if (result<0) {
+ perror("read:");
+ exit(1);
+ }
+ if (result !=2) {
+ return EOF;
+ }
+ (*offset)+=2;
+ return ((int)buf[0])|(((int)buf[1])<<8);
+}
+
+/************************************************************************/
+/* Reads 16-bit unicode value written in MSB order. For processing
+ * non-Word files. */
+/************************************************************************/
+int get_utf16msb (FILE *f,long *offset,long fileend) {
+ unsigned char buf[2];
+ int result;
+ result=catdoc_read(buf, 1, 2, f);
+ if (result<0) {
+ perror("read:");
+ exit(1);
+ }
+ if (result !=2) {
+ return EOF;
+ }
+ (*offset)+=2;
+ return ((int)buf[1])|(((int)buf[0])<<8);
+}
+
+int get_utf8 (FILE *f,long *offset,long fileend) {
+ unsigned char buf[3];
+ int d,c;
+ int result;
+ result=catdoc_read(buf, 1, 1, f);
+ if (result<0) {
+ perror("read");
+ exit(1);
+ }
+ if (result==0) return EOF;
+ c=buf[0];
+ d=0;
+ if (c<0x80)
+ return c;
+ if (c <0xC0)
+ return 0xfeff; /* skip corrupted sequences */
+ if (c <0xE0) {
+ if (catdoc_read(buf+1, 1, 1, f)<=0) return EOF;
+ return ((c & 0x1F)<<6 | ((char)buf[1] & 0x3F));
+ }
+ if (c <0xF0) {
+ if (catdoc_read(buf+1, 1, 2, f)!=2) return (int)EOF;
+ return ((c & 0x0F)<<12)|
+ ((buf[1] & 0x3f)<<6)|
+ (buf[2] & 0x3f);
+ }
+ return 0xFEFF;
+}
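The decoding rules applied by get_utf8 (one-, two- and three-byte sequences only) can be shown on an in-memory buffer. This is a sketch under stated assumptions: utf8_decode and UTF8_BAD are hypothetical names, continuation bytes are not validated (as in the reader above), and a long return type is used so the result does not depend on int being wider than 16 bits.

```c
#include <assert.h>
#include <stddef.h>

#define UTF8_BAD 0xFEFFL /* same placeholder value the reader returns */

/* Decodes one sequence of up to three bytes from buf[0..len-1].
 * Returns the code point, UTF8_BAD for a stray continuation byte or a
 * four-byte lead byte, or -1 if the buffer ends inside the sequence.
 * *used receives the number of bytes consumed; it is meaningful only
 * when the result is not -1. */
static long utf8_decode(const unsigned char *buf, size_t len, size_t *used)
{
	unsigned int c;
	assert(buf != NULL && used != NULL);
	*used = 0;
	if (len == 0) return -1;
	c = buf[0];
	*used = 1;
	if (c < 0x80) return (long)c;
	if (c < 0xC0) return UTF8_BAD;
	if (c < 0xE0) {
		if (len < 2) return -1;
		*used = 2;
		return ((long)(c & 0x1F) << 6) | (long)(buf[1] & 0x3F);
	}
	if (c < 0xF0) {
		if (len < 3) return -1;
		*used = 3;
		return ((long)(c & 0x0F) << 12)
			| ((long)(buf[1] & 0x3F) << 6)
			| (long)(buf[2] & 0x3F);
	}
	return UTF8_BAD;
}
```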
+
+/**************************************************************************/
+/* Converts unicode char to output charset sequence. Conversion has */
+/* three steps: 1. Special-character map is searched for the character */
+/* in case it is not allowed in the output format (% in TeX, < in HTML). */
+/* 2. Target charset is searched for this unicode char, if it wasn't */
+/* replaced. 3. If not found, the replacement map is searched. */
+/**************************************************************************/
+char *convert_char(int uc) {
+ static char plain_char[]="a"; /*placeholder for one-char sequences */
+ static char hexbuf[8];
+ char *mapped;
+ int c;
+ if ((mapped=map_subst(spec_chars,uc))) return mapped;
+ if (target_charset) {
+ c =from_unicode(target_charset,uc);
+ if (c>=0) {
+ *plain_char=c;
+ return plain_char;
+ }
+ if ((mapped = map_subst(replacements,uc))) return mapped;
+ if (unknown_as_hex) {
+ sprintf(hexbuf,"\\x%04X",(unsigned)uc);
+ /* This sprintf is safe, because uc is a unicode character code,
+ which cannot be greater than 0xFFFE. This is ensured by the
+ routines in reader.c
+ */
+ return hexbuf;
+ }
+ return bad_char;
+ } else {
+ /* NULL target charset means UTF-8 output */
+ return to_utf8(uc);
+ }
+}
+/******************************************************************/
+/* Converts given unicode character to the utf-8 sequence */
+/* in the static string buffer. Buffer will be overwritten upon */
+/* next call */
+/******************************************************************/
+char *to_utf8(unsigned int uc) {
+ static char utfbuffer[4]; /* it shouldn't overflow because we never deal
+ with chars greater than 65535*/
+ int count=0;
+ if (uc< 0x80) {
+ utfbuffer[0]=uc;
+ count=1;
+ } else {
+ if (uc < 0x800) {
+ utfbuffer[count++]=0xC0 | (uc >> 6);
+ } else {
+ utfbuffer[count++]=0xE0 | (uc >>12);
+ utfbuffer[count++]=0x80 | ((uc >>6) &0x3F);
+ }
+ utfbuffer[count++]=0x80 | (uc & 0x3F);
+ }
+ utfbuffer[count]=0;
+ return utfbuffer;
+}
+
+struct cp_map {
+ int codepage;
+ char *charset_name;
+};
+
+struct cp_map cp_to_charset [] = {
+ {10000,"mac-roman"},
+ {10001,"mac-japanese"},
+ {10002,"mac-tchinese"},
+ {10003,"mac-korean"},
+ {10004,"mac-arabic"},
+ {10005,"mac-hebrew"},
+ {10006,"mac-greek1"},
+ {10007,"mac-cyrillic"},
+ {10008,"mac-schinese"},
+ {10010,"mac-romania"},
+ {10017,"mac-ukraine"},
+ {10021,"mac-thai"},
+ {10029,"mac-centeuro"},
+ {10079,"mac-iselandic"},
+ {10081,"mac-turkish"},
+ {10082,"mac-croatia"},
+ {20866,"koi8-r"},
+ {28591,"8859-1"},
+ {28592,"8859-2"},
+ {28593,"8859-3"},
+ {28594,"8859-4"},
+ {28595,"8859-5"},
+ {28596,"8859-6"},
+ {28597,"8859-7"},
+ {28598,"8859-8"},
+ {28599,"8859-9"},
+ {28605,"8859-15"},
+ {65001,"utf-8"},
+ {0,NULL}};
+const char *charset_from_codepage(unsigned int codepage) {
+
+ static char buffer[7];
+ struct cp_map *cp;
+ if (codepage==1200||codepage==1201) {
+ /* For UCS2 */
+ return "";
+ } else
+ if (codepage<10000) {
+ sprintf(buffer,"cp%u",codepage);
+ return buffer;
+ } else {
+ for (cp = cp_to_charset;cp->codepage!=0&& cp->codepage!=codepage;cp++);
+ return cp->charset_name;
+ }
+}
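The lookup performed by charset_from_codepage can be illustrated with a shortened table. This is a sketch, not the project's code: cp_name, cp_names and cp_lookup are hypothetical names, the table holds only three of the entries listed above, and the result for small code pages is written into a caller-supplied buffer rather than a static one.

```c
#include <assert.h>
#include <stdio.h>

struct cp_name {
	unsigned int codepage;
	const char *name;
};

/* Shortened illustration table; a zero code page terminates it. */
static const struct cp_name cp_names[] = {
	{20866, "koi8-r"},
	{28591, "8859-1"},
	{65001, "utf-8"},
	{0, NULL}
};

/* Returns the charset name for a Windows code page, "" for UCS-2
 * (1200 and 1201), or NULL if the code page is unknown.
 * buf must hold at least 7 bytes ("cp9999" plus the terminator). */
static const char *cp_lookup(unsigned int codepage, char *buf)
{
	const struct cp_name *p;
	assert(buf != NULL);
	if (codepage == 1200 || codepage == 1201) return "";
	if (codepage < 10000) {
		sprintf(buf, "cp%u", codepage);
		return buf;
	}
	for (p = cp_names; p->codepage != 0; p++) {
		if (p->codepage == codepage) return p->name;
	}
	return NULL;
}
```

Callers have to handle the NULL result for code pages that are not in the table.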
--- /dev/null
+/* src/config.h.in. Generated automatically from configure.in by autoheader 2.13. */
+
+/* Define to empty if the keyword does not work. */
+#undef const
+
+/* Define if the setvbuf function takes the buffering type as its second
+ argument and the buffer pointer as the third, as on System V
+ before release 3. */
+#undef SETVBUF_REVERSED
+
+/* Define if you have the ANSI C header files. */
+#undef STDC_HEADERS
+
+/* Define if your processor stores words with the most significant
+ byte first (like Motorola and SPARC, unlike Intel and VAX). */
+#undef WORDS_BIGENDIAN
+
+/* Define if you have the strdup function. */
+#undef HAVE_STRDUP
+
+/* Define if you have the strftime function. */
+#undef HAVE_STRFTIME
+
+/* Define if you have the strtol function. */
+#undef HAVE_STRTOL
+
+/* Define if you have the <unistd.h> header file. */
+#undef HAVE_UNISTD_H
+
+/* Define this if you have XPG4 compliant nl_langinfo, which accepts CODESET argument */
+#undef HAVE_LANGINFO
+
+/* Character encoding used by default for 8-bit source files */
+#undef SOURCE_CHARSET
+
+/* Output character encoding used by default, if impossible to determine encoding from locale */
+#undef TARGET_CHARSET
+
+/* Suffix for files with special symbols map (ones to be replaced regardless of availability in target encoding) */
+#undef SPEC_EXT
+
+/* Suffix for symbols replacement map (what to do with symbols, which are not available in the target encoding) */
+#undef REPL_EXT
+
+/* Symbol to represent character which is not available either in target encoding or in replacement map */
+#undef UNKNOWN_CHAR
+
--- /dev/null
+/*
+ Copyright 1998-2003 Victor Wagner
+ Copyright 2003 Alex Ott
+ This file is released under the GPL. Details can be
+ found in the file COPYING accompanying this distribution.
+*/
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+#include <unistd.h>
+#include <ctype.h>
+#include <string.h>
+#include <stdlib.h>
+#ifdef HAVE_LANGINFO
+#include <langinfo.h>
+#ifndef __TURBOC__
+#include <locale.h>
+#endif
+#endif
+#include "catdoc.h"
+
+char *format_name="ascii";
+static int runtime_locale_check=1;
+/********************************************************************/
+/* Reads configuration file */
+/* */
+/********************************************************************/
+void read_config_file(const char* filename) {
+ FILE *f=fopen(add_exe_path(filename),"rb");
+ char *name,*value,line[1024],*c;
+ int lineno=0;
+ if (!f) return;
+ while (fgets(line,sizeof(line),f)) {
+ lineno++;
+ if ((c=strchr(line,'#'))) *c='\0';
+ name=line;
+ while (*name&&isspace(*name)) name++;
+ if (!*name) continue;
+ for (value=name;*value&&(isalnum(*value)||*value=='_'); value++);
+ if (*value=='=') {
+ *value=0;value++;
+ } else {
+ *value=0;value++;
+ while(*value&&isspace(*value)) value++;
+ if (*value++ != '=' ) {
+ fprintf(stderr,"Error %s(%d): name = value syntax expected\n",
+ filename,lineno);
+ exit(1);
+ }
+ while(*value&&isspace(*value)) value++;
+ }
+ for (c=value;*c&&!isspace(*c);c++);
+ if (value==c) {
+ fprintf(stderr,"Error %s(%d): name = value syntax expected\n",
+ filename,lineno);
+ exit(1);
+ }
+ *c=0;
+ if (!strcmp(name,"source_charset")) {
+ source_csname=strdup(value);
+ } else if (!strcmp(name,"target_charset")) {
+ dest_csname=strdup(value);
+ } else if (!strcmp(name,"format")) {
+ format_name=strdup(value);
+ } else if (!strcmp(name,"charset_path")) {
+ charset_path=strdup(value);
+ } else if (!strcmp(name,"map_path")) {
+ map_path = strdup(value);
+ } else if (!strcmp(name,"unknown_char")) {
+ if (*value=='"' && value[1] && value[2]=='"') value++;
+ if (*value=='\'' && value[1] && value[2]=='\'') value++;
+ bad_char[0] = *value;
+ } else if (!strcmp(name,"use_locale")) {
+ if (tolower(value[0])=='n') {
+ runtime_locale_check=0;
+ } else if (tolower(value[0])=='y') {
+ runtime_locale_check=1;
+ } else {
+ fprintf(stderr,"Error %s(%d): use_locale requires 'yes' or 'no'\n",
+ filename,lineno);
+ exit(1);
+ }
+
+ } else {
+ fprintf(stderr,"Invalid configuration directive in %s(%d): %s = %s\n",
+ filename,lineno,name,value);
+ exit(1);
+ }
+ }
+ fclose(f);
+}
+#ifdef HAVE_LANGINFO
+static char *locale_charset = NULL;
+/*********************************************************************/
+/* Determines output character set from current locale and puts it *
+ * into global variable dest_csname *
+ *********************************************************************/
+void get_locale_charset() {
+ char *codeset;
+ if (!runtime_locale_check) return;
+#ifndef __TURBOC__
+ if (!setlocale(LC_CTYPE,"")) return;
+#endif
+ codeset = nl_langinfo(CODESET);
+ if (!strncmp(codeset,"ISO",3)||!strncmp(codeset,"iso",3)) {
+ codeset+=3;
+ if (*codeset=='-') codeset++;
+ if (!strncmp(codeset,"646",3)) {
+ /* ISO 646 is another name for us-ascii */
+ check_charset(&dest_csname,"us-ascii") ;
+ } else {
+ if (check_charset(&dest_csname,codeset)) {
+ locale_charset = dest_csname;
+ }
+ }
+ } else if (!strcmp(codeset,"ANSI_X3.4-1968")) {
+ check_charset(&dest_csname,"us-ascii");
+ } else if (!strncmp(codeset,"ANSI",4)||!strncmp(codeset,"ansi",4)) {
+ char *newstr;
+ /* skip the "ANSI" prefix first, then an optional dash */
+ codeset+=4;
+ if (*codeset=='-') {
+ codeset++;
+ }
+ newstr = malloc(strlen(codeset)+2+1);
+ strcpy(newstr,"cp");
+ strcpy(newstr+2,codeset);
+ if (check_charset(&dest_csname,newstr)) {
+ locale_charset = dest_csname;
+ }
+ free(newstr);
+ } else if (!strncmp(codeset,"IBM",3)) {
+ char *newstr;
+ codeset+=3;
+ if (*codeset == '-') codeset++;
+ newstr=malloc(strlen(codeset)+2+1);
+ strcpy(newstr,"cp");
+ strcpy(newstr+2,codeset);
+ if (check_charset(&dest_csname, newstr)) {
+ locale_charset=dest_csname;
+ }
+ free(newstr);
+ } else {
+ char *i,*newstr = strdup(codeset);
+ for (i=newstr;*i;i++) {
+ *i=tolower(*i);
+ }
+ if (check_charset(&dest_csname,newstr)) {
+ locale_charset = dest_csname;
+ }
+ free(newstr); /* check_charset keeps its own copy */
+ }
+
+}
+#ifndef __TURBOC__
+void set_time_locale() {
+ if (!runtime_locale_check) return;
+ if (!locale_charset) return;
+ if (strcmp(locale_charset,dest_csname)!=0) return;
+ setlocale(LC_TIME,"");
+}
+#endif
+#endif
+#ifndef HAVE_STRDUP
+/* Implementation of strdup for systems which don't have it */
+char *strdup(const char *s) {
+ size_t size=strlen(s);
+ char *newstr;
+ newstr=malloc(size+1);
+ if (!newstr) return NULL;
+ return strcpy(newstr,s);
+}
+#endif
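The name mapping done by get_locale_charset (strip an ISO prefix, turn ANSI and IBM prefixes into cp, treat ISO 646 and ANSI_X3.4-1968 as us-ascii, lower-case the rest) can be tried without a locale. The sketch below is an assumption-laden illustration: normalize_codeset is a hypothetical name, it lower-cases the result in every branch, which is a simplification, and it does not check that a charset file exists as check_charset does.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Writes a catdoc-style charset name for a locale codeset into out.
 * Returns 1 on success, 0 if the result does not fit into outsize. */
static int normalize_codeset(const char *codeset, char *out, size_t outsize)
{
	const char *prefix = "";
	const char *rest;
	size_t plen, i;

	assert(codeset != NULL && out != NULL);
	rest = codeset;
	if (strcmp(codeset, "ANSI_X3.4-1968") == 0) {
		rest = "us-ascii";
	} else if (!strncmp(codeset, "ISO", 3) || !strncmp(codeset, "iso", 3)) {
		rest = codeset + 3;
		if (*rest == '-') rest++;
		if (!strncmp(rest, "646", 3)) rest = "us-ascii";
	} else if (!strncmp(codeset, "ANSI", 4) || !strncmp(codeset, "ansi", 4)) {
		prefix = "cp";
		rest = codeset + 4;
		if (*rest == '-') rest++;
	} else if (!strncmp(codeset, "IBM", 3)) {
		prefix = "cp";
		rest = codeset + 3;
		if (*rest == '-') rest++;
	}
	plen = strlen(prefix);
	if (plen + strlen(rest) + 1 > outsize) return 0;
	memcpy(out, prefix, plen);
	for (i = 0; rest[i] != '\0'; i++) {
		/* cast avoids undefined behaviour for negative char values */
		out[plen + i] = (char)tolower((unsigned char)rest[i]);
	}
	out[plen + i] = '\0';
	return 1;
}
```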
--- /dev/null
+/*
+ Copyright 1998-2003 Victor Wagner
+ Copyright 2003 Alex Ott
+ This file is released under the GPL. Details can be
+ found in the file COPYING accompanying this distribution.
+*/
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+#include <unistd.h>
+#include <stdio.h>
+#include <string.h>
+#include <sys/stat.h>
+#include <stdlib.h>
+#include "catdoc.h"
+#if defined(MSDOS) && !defined(__MSDOS__)
+#define __MSDOS__
+#endif
+#if defined(__MSDOS__) || defined(_WIN32)
+#include <dir.h>
+#include <dos.h>
+#else
+#include <glob.h>
+#endif
+
+
+/************************************************************************/
+/* Copies component of string starting at start and ending one char */
+/* before end into path_buf, expanding ~ if necessary */
+/************************************************************************/
+int prepare_path_buf(char *path_buf, const char *start, const char *end) {
+ if (*start == '~' && start[1] == DIR_SEP) {
+ char *home=getenv("HOME");
+ start++;
+ if (!home) {
+ if (end-start>=PATH_BUF_SIZE) return 0;
+ strncpy(path_buf,start,end-start);
+ path_buf[end-start]=0;
+ } else {
+ int l = strlen(home);
+ if (l+(end-start)>=PATH_BUF_SIZE) return 0;
+ strcpy(path_buf,home);
+ strncpy(path_buf+l,start,end-start);
+ path_buf[end-start+l]=0;
+ }
+ } else {
+ if (end-start>=PATH_BUF_SIZE) return 0;
+ strncpy(path_buf,start,end-start);
+ path_buf[end-start]=0;
+ }
+ /* Empty list element means current directory */
+ if (!*path_buf) {
+ path_buf[0]='.';
+ path_buf[1]=0;
+#ifdef __MSDOS__
+ } else {
+ strcpy(path_buf,add_exe_path(path_buf)); /* safe, because
+ add_exe_path knows about PATH_BUF_SIZE */
+#endif
+ }
+ return 1;
+
+}
+/************************************************************************/
+/* Searches for file name in specified list of directories. */
+/* Returns dynamically allocated full path, or NULL if nothing */
+/* appropriate is found. Expects name to be dynamically allocated and */
+/* frees it. */
+/************************************************************************/
+char *find_file(char *name, const char *path)
+{ const char *p;
+ char *q;
+ char path_buf[PATH_BUF_SIZE];
+ char dir_sep[2]={DIR_SEP,0};
+ for (p=path;p;p=q+1) {
+ q=strchr(p,LIST_SEP);
+
+ if (q) {
+ if (!prepare_path_buf(path_buf,p,q)) continue;
+ } else {
+ q--;
+ if (!prepare_path_buf(path_buf,p,p+strlen(p))) continue;
+ }
+ /* separator is always one char; check the length before appending */
+ if (strlen(path_buf)+1+strlen(name)>=PATH_BUF_SIZE)
+ continue; /* Ignore too deeply nested directories */
+ strcat(path_buf,dir_sep);
+ strcat(path_buf,name);
+ if (access(path_buf,0)==0) {
+ free(name);
+ return strdup(path_buf);
+ }
+ }
+ /* if we are here, nothing found */
+ free(name);
+ return NULL;
+}
+
+/************************************************************************/
+/* Searches for charset with given name and put pointer to malloced copy*/
+/* of its name into first arg if found. Otherwise leaves first arg */
+/* unchanged. Returns non-zero on success */
+/************************************************************************/
+int check_charset(char **filename,const char *charset) {
+ char *tmppath;
+ if (!strncmp(charset,"utf-8",6)) {
+ *filename=strdup("utf-8");
+ return 1;
+ }
+ tmppath=find_file(stradd(charset,CHARSET_EXT),charset_path);
+ if (tmppath&& *tmppath) {
+ *filename=strdup(charset);
+ free(tmppath);
+ return 1;
+ }
+ return 0;
+}
+
+/**********************************************************************/
+/* Returns malloced string containing concatenation of two */
+/* arguments */
+/**********************************************************************/
+char *stradd(const char *s1,const char *s2)
+{ char *res;
+ res=malloc(strlen(s1)+strlen(s2)+1);
+ if (!res) {
+ fprintf (stderr,"Out of memory!\n");
+ exit(1);
+ }
+ strcpy(res,s1);
+ strcat(res,s2);
+ return res;
+}
+
+
+/*
+ * In DOS, argv[0] contains the full path to the program, and it is customary
+ * to keep configuration files in the same directory as the program itself
+ */
+#ifdef __MSDOS__
+char *exe_dir(void) {
+ static char pathbuf[PATH_BUF_SIZE];
+ char *q;
+ strcpy(pathbuf,_argv[0]); /* DOS ensures, that our exe path is no
+ longer than PATH_BUF_SIZE*/
+ q=strrchr(pathbuf,DIR_SEP);
+ if (q) {
+ *q=0;
+ } else {
+ pathbuf[0]=0;
+ }
+ return pathbuf;
+}
+char *add_exe_path(const char *name) {
+ static char path[PATH_BUF_SIZE];
+ char *mypath=exe_dir();
+ /* No snprintf in Turbo C 2.0 library, so just check by hand
+ and exit if something goes wrong */
+ if (strchr(name,'%')) {
+ /* there is substitution */
+ if (strlen(name)-1+strlen(mypath)>=PATH_BUF_SIZE) {
+ fprintf(stderr,"Invalid config file. file name \"%s\" too long "
+ "after substitution\n",name);
+ exit(1);
+ }
+ sprintf(path,name,exe_dir());
+ return path;
+ } else {
+ return name;
+ }
+}
+#endif
+/*********************************************************************/
+/* Prints out list of available charsets, i.e. names without extension *
+ * of all .txt files in the charset path + internally-supported utf-8 *
+ ************************************************************************/
+
+void list_charsets(void) {
+ const char *p;
+ char *q;
+ char path_buf[PATH_BUF_SIZE];
+ char dir_sep[2]={DIR_SEP,0};
+#ifdef __MSDOS__
+ struct ffblk ffblock;
+ int res,col;
+#else
+ glob_t glob_buf;
+ int count,glob_flags=GLOB_ERR;
+#endif
+ char **ptr;
+ for (p=charset_path;p;p=q+1) {
+ q=strchr(p,LIST_SEP);
+
+ if (q) {
+ if (q-p>=PATH_BUF_SIZE) {
+ /* Oops, dir name too long, perhaps broken config file */
+ continue;
+ }
+ strncpy(path_buf,p,q-p);
+ path_buf[q-p]=0;
+ } else {
+ q--;
+ if (strlen(p)>=PATH_BUF_SIZE) continue;
+ strcpy(path_buf,p);
+ }
+ /* Empty list element means current directory */
+ if (!*path_buf) {
+ path_buf[0]='.';
+ path_buf[1]=0;
+#ifdef __MSDOS__
+ } else {
+ strcpy(path_buf,add_exe_path(path_buf)); /* safe, because
+ add_exe_path knows about PATH_BUF_SIZE */
+#endif
+ }
+ /* separator (one char) plus "*.txt" must fit; check before appending */
+ if (strlen(path_buf)+1+5>=PATH_BUF_SIZE)
+ continue; /* Ignore too deeply nested directories */
+ strcat(path_buf,dir_sep);
+ strcat(path_buf,"*.txt");
+#ifdef __MSDOS__
+ res=findfirst(path_buf,&ffblock,FA_RDONLY | FA_HIDDEN | FA_ARCH);
+ col=1;
+ printf("Available charsets:\n");
+ while (!res) {
+ char name[12],*src,*dest;
+ dest=name;
+ src=ffblock.ff_name;
+ for (dest=name,src=ffblock.ff_name;*src && *src !='.';dest++,src++)
+ *dest=tolower(*src);
+ *dest++=(col<5)?'\t':'\n';
+ if (++col>5) col=1;
+ *dest=0;
+ printf("%10s",name);
+ res=findnext(&ffblock);
+ }
+#else
+ switch (glob(path_buf,glob_flags,NULL,&glob_buf)) {
+ case 0:
+#ifdef GLOB_NOMATCH
+ case GLOB_NOMATCH:
+#endif
+ break;
+ default:
+ perror("catdoc");
+ exit(1);
+ }
+ glob_flags|=GLOB_APPEND;
+#endif
+ }
+#ifdef __MSDOS__
+ fputs("utf-8\n",stdout);
+#else
+ count=0;printf("Available charsets:");
+ for (ptr=glob_buf.gl_pathv;*ptr;ptr++) {
+ printf("%c",(count++)%5?'\t':'\n');
+ p=strrchr(*ptr,dir_sep[0]);
+ if (!p) continue;
+ p++;
+ if ((q=strchr(p,'.'))) *q=0;
+ fputs(p,stdout);
+ }
+ printf("%c",(count++)%5?'\t':'\n');
+ fputs("utf-8",stdout);
+ printf("\n");
+ globfree(&glob_buf);
+#endif
+}
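Both find_file and list_charsets walk a separator-delimited directory list and treat an empty element as the current directory. A sketch of that walk as a standalone helper follows; path_element is a hypothetical name, and the separator is passed in rather than taken from catdoc's LIST_SEP.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies the n-th (0-based) element of a sep-separated list into out.
 * An empty element yields ".". Returns 1 on success, 0 if there is no
 * such element or it does not fit into outsize bytes. */
static int path_element(const char *list, char sep, int n,
	char *out, size_t outsize)
{
	const char *start;
	const char *end;
	size_t len;

	assert(list != NULL && out != NULL && sep != '\0');
	start = list;
	while (n-- > 0) {
		start = strchr(start, sep);
		if (!start) return 0;
		start++; /* step over the separator */
	}
	end = strchr(start, sep);
	len = end ? (size_t)(end - start) : strlen(start);
	if (len == 0) {
		start = ".";
		len = 1;
	}
	if (len + 1 > outsize) return 0;
	memcpy(out, start, len);
	out[len] = '\0';
	return 1;
}
```

Checking the length before copying means the buffer is never written past its end, whatever the configuration file contains.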
--- /dev/null
+CC=tcc
+TCDIR=c:\tc
+LIB=$(TCDIR)\lib
+CFLAGS=-v -w -mc -DHAVE_LANGINFO -DHAVE_STRDUP -DCATDOC_VERSION="0.94.1"
+OBJ=charsets.obj substmap.obj reader.obj writer.obj fileutil.obj langinfo.obj analyze.obj confutil.obj rtfread.obj numutils.obj sheet.obj xlsparse.obj ole.obj strftime.obj pptparse.obj
+.c.obj:
+ $(CC) -v -c -mc -I../compat $(CFLAGS) $*
+all: catdoc.exe xls2csv.exe catppt.exe
+catdoc.exe: catdoc.obj catdoc.lib
+ $(CC) -v -mc catdoc.obj catdoc.lib
+xls2csv.exe: xls2csv.obj catdoc.lib
+ $(CC) -v -mc xls2csv.obj catdoc.lib
+catppt.exe: catppt.obj catdoc.lib
+ $(CC) -v -mc catppt.obj catdoc.lib
+langinfo.obj: ..\compat\langinfo.c
+ $(CC) -c -I../compat $(CFLAGS) ../compat/langinfo.c
+strftime.obj: ..\compat\strftime.c
+ $(CC) -c -I../compat $(CFLAGS) ../compat/strftime.c
+
+catdoc.lib: $(OBJ) catdoc.rsp
+ tlib /E catdoc @catdoc.rsp
+clean:
+ del *.obj
+ del *.exe
+ del *.lib
--- /dev/null
+/*****************************************************************/
+/* Utilities to convert various numeric types from the Windows */
+/* (Little endian) format to native types */
+/* */
+/* This file is part of catdoc project */
+/* (c) Victor Wagner 1996-2003, (c) Alex Ott 2003 */
+/*****************************************************************/
+
+
+/********************************************************************/
+/* Reads 2-byte LSB int from buffer at given offset in a
+ * platform-independent way
+ *********************************************************************/
+unsigned int getshort(unsigned char *buffer,int offset) {
+ return (unsigned short int)buffer[offset]|((unsigned short int)buffer[offset+1]<<8);
+}
+/********************************************************************/
+/* Reads 4-byte LSB int from buffer at given offset in an almost
+ * platform-independent way
+ *********************************************************************/
+long int getlong(unsigned char *buffer,int offset) {
+ return (long)buffer[offset]|((long)buffer[offset+1]<<8L)
+ |((long)buffer[offset+2]<<16L)|((long)buffer[offset+3]<<24L);
+}
+
+unsigned long int getulong(unsigned char *buffer,int offset) {
+ return (unsigned long)buffer[offset]|((unsigned long)buffer[offset+1]<<8L)
+ |((unsigned long)buffer[offset+2]<<16L)|((unsigned long)buffer[offset+3]<<24L);
+}
--- /dev/null
+/**
+ * @file ole.c
+ * @author Alex Ott, Victor B Wagner
+ * @date Wed Jun 11 12:33:01 2003
+ * Version: $Id: ole.c,v 1.1 2006-02-24 17:44:06 vitus Exp $
+ * Copyright: Victor B Wagner, 1996-2003 Alex Ott, 2003
+ *
+ * @brief Parsing structure of MS Office compound document
+ *
+ * This file is part of catdoc project
+ * and distributed under the GNU General Public License
+ *
+ */
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+
+#include "catdoc.h"
+
+#define min(a,b) ((a) < (b) ? (a) : (b))
+
+long int sectorSize, shortSectorSize;
+/* BBD Info */
+long int bbdNumBlocks;
+unsigned char *BBD=NULL;
+/* SBD Info */
+long int sbdNumber, sbdStart, sbdLen;
+unsigned char *SBD=NULL;
+oleEntry *rootEntry=NULL;
+/* Properties Info */
+long propCurNumber, propLen, propNumber, propStart;
+unsigned char *properties=NULL;
+long int fileLength=0;
+
+static unsigned char ole_sign[]={0xD0,0xCF,0x11,0xE0,0xA1,0xB1,0x1A,0xE1,0};
+
+
+/**
+ * Initializes ole structure
+ *
+ * @param f (FILE *) compound document file, positioned at bufSize
+ * byte. Might be pipe or socket
+ * @param buffer (void *) bytes already read from f
+ * @param bufSize number of bytes already read from f should be less
+ * than 512
+ *
+ * @return stream to pass to ole_readdir(), or NULL on error
+ */
+FILE* ole_init(FILE *f, void *buffer, size_t bufSize) {
+ unsigned char oleBuf[BBD_BLOCK_SIZE];
+ unsigned char *tmpBuf;
+ FILE *newfile;
+ int ret=0, i;
+ long int sbdMaxLen, sbdCurrent, propMaxLen, propCurrent, mblock, msat_size;
+ oleEntry *tEntry;
+
+ /* deleting old data (if it was allocated) */
+ ole_finish();
+
+ if (fseek(f,0,SEEK_SET) == -1) {
+ if ( errno == ESPIPE ) {
+ /* We got non-seekable file, create temp file */
+ if((newfile=tmpfile()) == NULL) {
+ perror("Can't create tmp file");
+ return NULL;
+ }
+ if (bufSize > 0) {
+ ret=fwrite(buffer, 1, bufSize, newfile);
+ if(ret != bufSize) {
+ perror("Can't write to tmp file");
+ return NULL;
+ }
+ }
+
+ while(!feof(f)){
+ ret=fread(oleBuf,1,BBD_BLOCK_SIZE,f);
+ fwrite(oleBuf, 1, ret, newfile);
+ }
+ fseek(newfile,0,SEEK_SET);
+ } else {
+ perror("Can't seek in file");
+ return NULL;
+ }
+ } else {
+ newfile=f;
+ }
+ fseek(newfile,0,SEEK_END);
+ fileLength=ftell(newfile);
+/* fprintf(stderr, "fileLength=%ld\n", fileLength); */
+ fseek(newfile,0,SEEK_SET);
+ ret=fread(oleBuf,1,BBD_BLOCK_SIZE,newfile);
+ if ( ret != BBD_BLOCK_SIZE ) {
+ return NULL;
+ }
+ if (memcmp(oleBuf,ole_sign,8) != 0) {
+ return NULL;
+ }
+ sectorSize = 1<<getshort(oleBuf,0x1e);
+ shortSectorSize=1<<getshort(oleBuf,0x20);
+
+/* Read BBD into memory */
+ bbdNumBlocks = getulong(oleBuf,0x2c);
+ if((BBD=malloc(bbdNumBlocks*sectorSize)) == NULL ) {
+ return NULL;
+ }
+
+ if((tmpBuf=malloc(MSAT_ORIG_SIZE)) == NULL ) {
+ return NULL;
+ }
+ memcpy(tmpBuf,oleBuf+0x4c,MSAT_ORIG_SIZE);
+ mblock=getlong(oleBuf,0x44);
+ msat_size=getlong(oleBuf,0x48);
+
+/* fprintf(stderr, "msat_size=%ld\n", msat_size); */
+
+ i=0;
+ while((mblock >= 0) && (i < msat_size)) {
+ unsigned char *newbuf;
+/* fprintf(stderr, "i=%d mblock=%ld\n", i, mblock); */
+ if ((newbuf=realloc(tmpBuf, sectorSize*(i+1)+MSAT_ORIG_SIZE)) != NULL) {
+ tmpBuf=newbuf;
+ } else {
+ perror("MSAT realloc error");
+ free(tmpBuf);
+ ole_finish();
+ return NULL;
+ }
+
+ fseek(newfile, 512+mblock*sectorSize, SEEK_SET);
+ if(fread(tmpBuf+MSAT_ORIG_SIZE+(sectorSize-4)*i,
+ 1, sectorSize, newfile) != sectorSize) {
+ fprintf(stderr, "Error reading MSAT!\n");
+ free(tmpBuf);
+ ole_finish();
+ return NULL;
+ }
+
+ i++;
+ mblock=getlong(tmpBuf, MSAT_ORIG_SIZE+(sectorSize-4)*i);
+ }
+
+/* fprintf(stderr, "bbdNumBlocks=%ld\n", bbdNumBlocks); */
+ for(i=0; i< bbdNumBlocks; i++) {
+ long int bbdSector=getlong(tmpBuf,4*i);
+
+ if (bbdSector >= fileLength/sectorSize || bbdSector < 0) {
+ fprintf(stderr, "Bad BBD entry!\n");
+ free(tmpBuf);
+ ole_finish();
+ return NULL;
+ }
+ fseek(newfile, 512+bbdSector*sectorSize, SEEK_SET);
+ if ( fread(BBD+i*sectorSize, 1, sectorSize, newfile) != sectorSize ) {
+ fprintf(stderr, "Can't read BBD!\n");
+ free(tmpBuf);
+ ole_finish();
+ return NULL;
+ }
+ }
+ free(tmpBuf);
+
+/* Read SBD into memory */
+ sbdLen=0;
+ sbdMaxLen=10;
+ sbdCurrent = sbdStart = getlong(oleBuf,0x3c);
+ if (sbdStart > 0) {
+ if((SBD=malloc(sectorSize*sbdMaxLen)) == NULL ) {
+ ole_finish();
+ return NULL;
+ }
+ while(1) {
+ fseek(newfile, 512+sbdCurrent*sectorSize, SEEK_SET);
+ fread(SBD+sbdLen*sectorSize, 1, sectorSize, newfile);
+ sbdLen++;
+ if (sbdLen >= sbdMaxLen) {
+ unsigned char *newSBD;
+
+ sbdMaxLen+=5;
+ if ((newSBD=realloc(SBD, sectorSize*sbdMaxLen)) != NULL) {
+ SBD=newSBD;
+ } else {
+ perror("SBD realloc error");
+ ole_finish();
+ return NULL;
+ }
+ }
+ sbdCurrent = getlong(BBD, sbdCurrent*4);
+ if(sbdCurrent < 0 ||
+ sbdCurrent >= fileLength/sectorSize)
+ break;
+ }
+ sbdNumber = (sbdLen*sectorSize)/shortSectorSize;
+/* fprintf(stderr, "sbdLen=%ld sbdNumber=%ld\n",sbdLen, sbdNumber); */
+ } else {
+ SBD=NULL;
+ }
+/* Read property catalog into memory */
+ propLen = 0;
+ propMaxLen = 5;
+ propCurrent = propStart = getlong(oleBuf,0x30);
+ if (propStart >= 0) {
+ if((properties=malloc(propMaxLen*sectorSize)) == NULL ) {
+ ole_finish();
+ return NULL;
+ }
+ while(1) {
+/* fprintf(stderr, "propCurrent=%ld\n",propCurrent); */
+ fseek(newfile, 512+propCurrent*sectorSize, SEEK_SET);
+ fread(properties+propLen*sectorSize,
+ 1, sectorSize, newfile);
+ propLen++;
+ if (propLen >= propMaxLen) {
+ unsigned char *newProp;
+
+ propMaxLen+=5;
+ if ((newProp=realloc(properties, propMaxLen*sectorSize)) != NULL)
+ properties=newProp;
+ else {
+ perror("Properties realloc error");
+ ole_finish();
+ return NULL;
+ }
+ }
+
+ propCurrent = getlong(BBD, propCurrent*4);
+ if(propCurrent < 0 ||
+ propCurrent >= fileLength/sectorSize ) {
+ break;
+ }
+ }
+/* fprintf(stderr, "propLen=%ld\n",propLen); */
+ propNumber = (propLen*sectorSize)/PROP_BLOCK_SIZE;
+ propCurNumber = 0;
+ } else {
+ ole_finish();
+ properties = NULL;
+ return NULL;
+ }
+
+
+/* Find Root Entry */
+ while((tEntry=(oleEntry*)ole_readdir(newfile)) != NULL) {
+ if (!tEntry->name[0]||strcmp(tEntry->name,"Root Entry") == 0) {
+ rootEntry=tEntry;
+ break;
+ }
+ ole_close((FILE*)tEntry);
+ }
+ propCurNumber = 0;
+ fseek(newfile, 0, SEEK_SET);
+ if (!rootEntry) {
+ fprintf(stderr,"Cannot find root entry in this file!\n");
+ ole_finish();
+ return NULL;
+ }
+ return newfile;
+}
+
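+/**
+ * Checks whether a directory entry has one of the known types
+ * (see oleType)
+ *
+ * @param oleBuf pointer to the raw directory entry
+ *
+ * @return nonzero if the type byte at offset 0x42 is 1, 2, 3 or 5
+ */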
+int rightOleType(unsigned char *oleBuf) {
+ return (oleBuf[0x42] == 1 || oleBuf[0x42] == 2 ||
+ oleBuf[0x42] == 3 || oleBuf[0x42] == 5 );
+}
+
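+/**
+ * Extracts the type of a directory entry
+ *
+ * @param oleBuf pointer to the raw directory entry
+ *
+ * @return type byte at offset 0x42, converted to oleType
+ */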
+oleType getOleType(unsigned char *oleBuf) {
+ return (oleType)((unsigned char)oleBuf[0x42]);
+}
+
+/**
+ * Reads next directory entry from file
+ *
+ * @param f stream returned by ole_init
+ *
+ * @return pointer to newly allocated oleEntry, cast to (FILE *), or
+ * NULL if there are no more entries or on error. Release it with
+ * ole_close
+ */
+FILE *ole_readdir(FILE *f) {
+ int i, nLen;
+ unsigned char *oleBuf;
+ oleEntry *e=NULL;
+ long int chainMaxLen, chainCurrent;
+
+ if ( properties == NULL || propCurNumber >= propNumber || f == NULL )
+ return NULL;
+ oleBuf=properties + propCurNumber*PROP_BLOCK_SIZE;
+ if( !rightOleType(oleBuf))
+ return NULL;
+ if ((e = (oleEntry*)malloc(sizeof(oleEntry))) == NULL) {
+ perror("Can\'t allocate memory");
+ return NULL;
+ }
+ e->dirPos=oleBuf;
+ e->type=getOleType(oleBuf);
+ e->file=f;
+ e->startBlock=getlong(oleBuf,0x74);
+ e->blocks=NULL;
+
+ nLen=getshort(oleBuf,0x40);
+ for (i=0 ; i < nLen /2; i++)
+ e->name[i]=(char)oleBuf[i*2];
+ e->name[i]='\0';
+ propCurNumber++;
+ e->length=getulong(oleBuf,0x78);
+/* Read sector chain for object */
+ chainMaxLen = 25;
+ e->numOfBlocks = 0;
+ chainCurrent = e->startBlock;
+ e->isBigBlock = (e->length >= 0x1000) || !strcmp(e->name, "Root Entry");
+/* fprintf(stderr, "e->name=%s e->length=%ld\n", e->name, e->length); */
+/* fprintf(stderr, "e->startBlock=%ld BBD=%p\n", e->startBlock, BBD); */
+ if (e->startBlock >= 0 &&
+ e->length >= 0 &&
+ (e->startBlock <=
+ fileLength/(e->isBigBlock ? sectorSize : shortSectorSize))) {
+ if((e->blocks=malloc(chainMaxLen*sizeof(long int))) == NULL ) {
+ free(e);
+ return NULL;
+ }
+ while(1) {
+/* fprintf(stderr, "chainCurrent=%ld\n", chainCurrent); */
+ e->blocks[e->numOfBlocks++] = chainCurrent;
+ if (e->numOfBlocks >= chainMaxLen) {
+ long int *newChain;
+ chainMaxLen+=25;
+ if ((newChain=realloc(e->blocks,
+ chainMaxLen*sizeof(long int))) != NULL)
+ e->blocks=newChain;
+ else {
+ perror("Properties realloc error");
+ free(e->blocks);
+ free(e);
+ return NULL;
+ }
+ }
+ if ( e->isBigBlock ) {
+ chainCurrent = getlong(BBD, chainCurrent*4);
+ } else if ( SBD != NULL ) {
+ chainCurrent = getlong(SBD, chainCurrent*4);
+ } else {
+ chainCurrent=-1;
+ }
+ if(chainCurrent <= 0 ||
+ chainCurrent >= ( e->isBigBlock ?
+ ((bbdNumBlocks*sectorSize)/4)
+ : ((sbdNumber*shortSectorSize)/4) ) ||
+ (e->numOfBlocks >
+ e->length/(e->isBigBlock ? sectorSize : shortSectorSize))) {
+/* fprintf(stderr, "chain End=%ld\n", chainCurrent); */
+ break;
+ }
+ }
+ }
+
+ if(e->length > (e->isBigBlock ? sectorSize : shortSectorSize)*e->numOfBlocks)
+ e->length = (e->isBigBlock ? sectorSize : shortSectorSize)*e->numOfBlocks;
+/* fprintf(stderr, "READDIR: e->name=%s e->numOfBlocks=%ld length=%ld\n", */
+/* e->name, e->numOfBlocks, e->length); */
+
+ return (FILE*)e;
+}
+
+/**
+ * Open stream, which corresponds to a directory entry returned by
+ * ole_readdir
+ *
+ * @param stream pointer to oleEntry, cast to (FILE *)
+ *
+ * @return 0 on success, -2 if the entry is not a stream
+ */
+int ole_open(FILE *stream) {
+ oleEntry *e=(oleEntry *)stream;
+ if ( e->type != oleStream)
+ return -2;
+
+ e->ole_offset=0;
+ e->file_offset= ftell(e->file);
+ return 0;
+}
+
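+/**
+ * Calculates the offset in the underlying file of a block of the stream
+ *
+ * @param e stream entry
+ * @param blk index of the block in the stream's block chain
+ *
+ * @return offset of the block from the beginning of the file
+ */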
+long int calcFileBlockOffset(oleEntry *e, long int blk) {
+ long int res;
+ if ( e->isBigBlock ) {
+ res=512+e->blocks[blk]*sectorSize;
+ } else {
+ long int sbdPerSector=sectorSize/shortSectorSize;
+ long int sbdSecNum=e->blocks[blk]/sbdPerSector;
+ long int sbdSecMod=e->blocks[blk]%sbdPerSector;
+/* fprintf(stderr, "calcoffset: e->name=%s e->numOfBlocks=%ld length=%ld sbdSecNum=%ld rootEntry->blocks=%p\n",
+ e->name, e->numOfBlocks, e->length, sbdSecNum, rootEntry->blocks);*/
+ res=512 + rootEntry->blocks[sbdSecNum]*sectorSize + sbdSecMod*shortSectorSize;
+ }
+ return res;
+}
+
+
+/**
+ * Reads data from an open OLE stream; interface-compatible with fread
+ *
+ * @param ptr pointer to the buffer to read into
+ * @param size size of one element
+ * @param nmemb number of elements
+ * @param stream pointer to oleEntry, cast to (FILE *)
+ *
+ * @return number of bytes read
+ */
+size_t ole_read(void *ptr, size_t size, size_t nmemb, FILE *stream) {
+ oleEntry *e = (oleEntry*)stream;
+ long int llen = size*nmemb, rread=0, i;
+ long int blockNumber, modBlock, toReadBlocks, toReadBytes, bytesInBlock;
+ long int ssize; /**< Size of block */
+ long int newoffset;
+ unsigned char *cptr = ptr;
+ if( e->ole_offset+llen > e->length )
+ llen= e->length - e->ole_offset;
+
+ ssize = (e->isBigBlock ? sectorSize : shortSectorSize);
+ blockNumber=e->ole_offset/ssize;
+/* fprintf(stderr, "blockNumber=%ld e->numOfBlocks=%ld llen=%ld\n", */
+/* blockNumber, e->numOfBlocks, llen); */
+ if ( blockNumber >= e->numOfBlocks || llen <=0 )
+ return 0;
+
+ modBlock=e->ole_offset%ssize;
+ bytesInBlock = ssize - modBlock;
+ if(bytesInBlock < llen) {
+ toReadBlocks = (llen-bytesInBlock)/ssize;
+ toReadBytes = (llen-bytesInBlock)%ssize;
+ } else {
+ toReadBlocks = toReadBytes = 0;
+ }
+/* fprintf(stderr, "llen=%ld toReadBlocks=%ld toReadBytes=%ld bytesInBlock=%ld blockNumber=%ld modBlock=%ld\n", */
+/* llen, toReadBlocks, toReadBytes, bytesInBlock, blockNumber, modBlock); */
+ newoffset = calcFileBlockOffset(e,blockNumber)+modBlock;
+ if (e->file_offset != newoffset) {
+ fseek(e->file, e->file_offset=newoffset, SEEK_SET);
+ }
+ rread=fread(ptr, 1, min(llen,bytesInBlock), e->file);
+ e->file_offset += rread;
+ for(i=0; i<toReadBlocks; i++) {
+ int readbytes;
+ blockNumber++;
+ newoffset = calcFileBlockOffset(e,blockNumber);
+ if (newoffset != e->file_offset)
+ fseek(e->file, e->file_offset=newoffset , SEEK_SET);
+ readbytes=fread(cptr+rread, 1, min(llen-rread, ssize), e->file);
+ rread +=readbytes;
+ e->file_offset +=readbytes;
+ }
+ if(toReadBytes > 0) {
+ int readbytes;
+ blockNumber++;
+ newoffset = calcFileBlockOffset(e,blockNumber);
+ fseek(e->file, e->file_offset=newoffset, SEEK_SET);
+ readbytes=fread(cptr+rread, 1, toReadBytes,e ->file);
+ rread +=readbytes;
+ e->file_offset +=readbytes;
+ }
+/* fprintf(stderr, "ole_offset=%ld rread=%ld llen=%ld\n",
+ e->ole_offset, rread, llen);*/
+ e->ole_offset+=rread;
+ return rread;
+}
+
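+/**
+ * Checks for end of OLE stream; interface-compatible with feof
+ *
+ * @param stream pointer to oleEntry, cast to (FILE *)
+ *
+ * @return nonzero if current position is at or past the end of stream
+ */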
+int ole_eof(FILE *stream) {
+ oleEntry *e=(oleEntry*)stream;
+/* fprintf(stderr, "EOF: e->ole_offset=%ld e->length=%ld\n",
+ e->ole_offset, e->length);*/
+ return (e->ole_offset >= e->length);
+}
+
+/**
+ * Frees BBD, SBD, the property catalog and the root entry
+ *
+ */
+void ole_finish(void) {
+ if ( BBD != NULL ) free(BBD);
+ if ( SBD != NULL ) free(SBD);
+ if ( properties != NULL ) free(properties);
+ if ( rootEntry != NULL ) ole_close((FILE*)rootEntry);
+ properties = SBD = BBD = NULL;
+ rootEntry = NULL;
+}
+
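+/**
+ * Frees directory entry returned by ole_readdir
+ *
+ * @param stream pointer to oleEntry, cast to (FILE *)
+ *
+ * @return 0 on success, -1 if stream is NULL
+ */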
+int ole_close(FILE *stream) {
+ oleEntry *e=(oleEntry*)stream;
+ if(e == NULL)
+ return -1;
+ if (e->blocks != NULL)
+ free(e->blocks);
+ free(e);
+ return 0;
+}
+
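+/**
+ * Sets position inside OLE stream; interface-compatible with fseek.
+ * The resulting position is clamped to the range [0, stream length]
+ *
+ * @param stream pointer to OLE stream structure
+ * @param offset offset in bytes, relative to whence
+ * @param whence SEEK_SET, SEEK_CUR or SEEK_END
+ *
+ * @return 0 on success, -1 on error
+ */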
+int ole_seek(FILE *stream, long offset, int whence) {
+ oleEntry *e=(oleEntry*)stream;
+ long int new_ole_offset=0, new_file_offset;
+ int ssize, modBlock, blockNumber;
+
+ switch(whence) {
+ case SEEK_SET:
+ new_ole_offset=offset;
+ break;
+
+ case SEEK_CUR:
+ new_ole_offset=e->ole_offset+offset;
+ break;
+
+ case SEEK_END:
+ new_ole_offset=e->length+offset;
+ break;
+
+ default:
+ errno=EINVAL;
+ return -1;
+ }
+ if(new_ole_offset<0)
+ new_ole_offset=0;
+ if(new_ole_offset >= e->length)
+ new_ole_offset=e->length;
+
+ ssize = (e->isBigBlock ? sectorSize : shortSectorSize);
+ blockNumber=new_ole_offset/ssize;
+ if ( blockNumber >= e->numOfBlocks )
+ return -1;
+
+ modBlock=new_ole_offset%ssize;
+ new_file_offset = calcFileBlockOffset(e,blockNumber)+modBlock;
+ fseek(e->file, e->file_offset=new_file_offset, SEEK_SET);
+ e->ole_offset=new_ole_offset;
+
+ return 0;
+}
+
+/**
+ * Tell position inside OLE stream
+ *
+ * @param stream pointer to OLE stream
+ *
+ * @return current position inside OLE stream
+ */
+long ole_tell(FILE *stream) {
+ oleEntry *e=(oleEntry*)stream;
+ return e->ole_offset;
+}
+
+
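+/**
+ * Pointers to the current input functions. They refer either to the
+ * ole_* functions or to their stdio counterparts
+ */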
+size_t (*catdoc_read)(void *ptr, size_t size, size_t nmemb, FILE *stream);
+int (*catdoc_eof)(FILE *stream);
+int (*catdoc_seek)(FILE *stream, long offset, int whence);
+long (*catdoc_tell)(FILE *stream);
+
+void set_ole_func(void) {
+ catdoc_read=ole_read;
+ catdoc_eof=ole_eof;
+ catdoc_seek=ole_seek;
+ catdoc_tell=ole_tell;
+}
+
+#ifdef feof
+/* feof is macro in Turbo C, so we need a real function to assign to
+ * pointer
+ */
+int my_feof(FILE *f) {
+ return feof(f);
+}
+#define FEOF my_feof
+#else
+#define FEOF feof
+#endif
+
+void set_std_func(void) {
+ catdoc_read=fread;
+ catdoc_eof=FEOF;
+ catdoc_seek=fseek;
+ catdoc_tell=ftell;
+}
+
--- /dev/null
+/**
+ * @file ole.h
+ * @author Alex Ott <ott@jet.msk.su>
+ * @date 03 Jun 2003
+ * Version: $Id: ole.h,v 1.1 2006-02-24 17:44:06 vitus Exp $
+ * Copyright: Alex Ott, 2003,
+ *
+ * @brief
+ *
+ *
+ */
+
+#ifndef _OLE_H
+#define _OLE_H 1
+
+#define BBD_BLOCK_SIZE 512
+#define SBD_BLOCK_SIZE 64
+#define PROP_BLOCK_SIZE 128
+#define OLENAMELENGHT 32
+#define MSAT_ORIG_SIZE 436
+
+typedef enum {
+ oleDir=1,
+ oleStream=2,
+ oleRootDir=5,
+ oleUnknown=3
+} oleType;
+
+typedef struct {
+ FILE *file;
+ char name[OLENAMELENGHT+1];
+ long int startBlock;
+ long int curBlock;
+ unsigned long int length;
+ long int ole_offset;
+ long int file_offset;
+ unsigned char *dirPos;
+ oleType type;
+ long int numOfBlocks;
+ long int *blocks; /**< array of blocks numbers */
+ int isBigBlock;
+} oleEntry;
+
+/**
+ * Functions
+ *
+ */
+FILE* ole_init(FILE *f, void *buffer, size_t bufSize);
+FILE *ole_readdir(FILE *f);
+int ole_seek(FILE *stream, long offset, int whence);
+long ole_tell(FILE *stream);
+int ole_eof(FILE *stream);
+size_t ole_read(void *ptr, size_t size, size_t nmemb, FILE *stream);
+int ole_open(FILE *);
+int ole_close(FILE *);
+void ole_finish(void);
+
+extern size_t (*catdoc_read)(void *ptr, size_t size, size_t nmemb, FILE *stream);
+extern int (*catdoc_eof)(FILE *stream);
+extern int (*catdoc_seek)(FILE *stream, long offset, int whence);
+extern long (*catdoc_tell)(FILE *stream);
+
+void set_ole_func(void);
+void set_std_func(void);
+
+#endif /* _OLE_H */
+
--- /dev/null
+/**
+ * @file ppt.h
+ * @author Alex Ott <alexott@gmail.com>
+ * @date 23 Dec 2004
+ * Version: $Id: ppt.h,v 1.1 2006-02-24 17:44:06 vitus Exp $
+ * Copyright: Alex Ott
+ *
+ * @brief definitions of .ppt processing functions
+ *
+ *
+ */
+
+#ifndef _PPT_H
+#define _PPT_H 1
+
+#include <stdio.h>
+#include <math.h>
+
+void do_ppt(FILE *input,char *filename);
+
+#endif /* _PPT_H */
+
--- /dev/null
+/**
+ * @file pptparse.c
+ * @author Alex Ott <alexott@gmail.com>
+ * @date 23 Dec 2004
+ * Version: $Id: pptparse.c,v 1.1 2006-02-24 17:44:06 vitus Exp $
+ * Copyright: Alex Ott
+ *
+ * @brief .ppt parsing routines
+ *
+ *
+ */
+
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+#include <stdlib.h>
+#include <string.h>
+#include <math.h>
+#include <time.h>
+#include "ppt.h"
+#include "catdoc.h"
+#include "ppttypes.h"
+
+static void process_item (int rectype, long reclen, FILE* input);
+
+#if !defined(min)
+#define min(x,y) ((x) < (y) ? (x) : (y))
+#endif
+
+
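+/**
+ * Reads .ppt records from input one by one and passes them to
+ * process_item
+ *
+ * @param input stream to read records from
+ * @param filename name of the input file (currently unused)
+ */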
+void do_ppt(FILE *input,char *filename) {
+ int itemsread=1;
+ int rectype;
+ long reclen;
+ unsigned char recbuf[8];
+
+ while(itemsread) {
+ itemsread = catdoc_read(recbuf, 1, 8, input);
+/* fprintf(stderr,"itemsread=%d: ",itemsread); */
+/* for(i=0; i<8; i++) */
+/* fprintf(stderr,"%02x ",recbuf[i]); */
+/* fprintf(stderr,"\n"); */
+
+ if (catdoc_eof(input)) {
+ process_item(DOCUMENT_END,0,input);
+ return;
+ }
+ if(itemsread < 8)
+ break;
+ rectype=getshort(recbuf,2);
+ reclen=getulong(recbuf,4);
+ if (reclen < 0) {
+ return;
+ }
+ process_item(rectype,reclen,input);
+ }
+}
+
+
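+/**
+ * Processes one record: prints text of text records to stdout,
+ * descends into container records and skips everything else
+ *
+ * @param rectype type of record (see ppttypes.h)
+ * @param reclen length of record data in bytes
+ * @param input stream positioned right after the record header
+ */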
+static void process_item (int rectype, long reclen, FILE* input) {
+ int i=0, u;
+ static char buf[2];
+
+ switch(rectype) {
+ case DOCUMENT_END:
+/* fprintf(stderr,"End of document, ended at %ld\n",catdoc_tell(input)); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case DOCUMENT:
+/* fprintf(stderr,"Start of document, reclen=%ld, started at %ld\n", reclen, */
+/* catdoc_tell(input)); */
+ break;
+
+ case DOCUMENT_ATOM:
+/* fprintf(stderr,"DocumentAtom, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case SLIDE:
+/* fprintf(stderr,"Slide, reclen=%ld\n", reclen); */
+/* fputs("---------------------------------------\n",stderr); */
+ break;
+
+ case SLIDE_ATOM:
+/* fprintf(stderr,"SlideAtom, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case SLIDE_BASE:
+/* fprintf(stderr,"SlideBase, reclen=%ld\n", reclen); */
+ break;
+
+ case SLIDE_BASE_ATOM:
+/* fprintf(stderr,"SlideBaseAtom, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case NOTES:
+/* fprintf(stderr,"Notes, reclen=%ld\n", reclen); */
+ break;
+
+ case NOTES_ATOM:
+/* fprintf(stderr,"NotesAtom, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case HEADERS_FOOTERS:
+/* fprintf(stderr,"HeadersFooters, reclen=%ld\n", reclen); */
+ break;
+
+ case HEADERS_FOOTERS_ATOM:
+/* fprintf(stderr,"HeadersFootersAtom, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case MAIN_MASTER:
+/* fprintf(stderr,"MainMaster, reclen=%ld\n", reclen); */
+ break;
+
+ case TEXT_BYTES_ATOM: {
+/* fprintf(stderr,"TextBytes, reclen=%ld\n", reclen); */
+ for(i=0; i < reclen; i++) {
+ catdoc_read(buf,1,1,input);
+ if((unsigned char)*buf!=0x0d)
+ fputs(convert_char((unsigned char)*buf),stdout);
+ else
+ fputc('\n',stdout);
+ }
+ fputc('\n',stdout);
+ }
+ break;
+
+ case TEXT_CHARS_ATOM:
+ case CSTRING: {
+ long text_len;
+
+/* fprintf(stderr,"CString, reclen=%ld\n", reclen); */
+ text_len=reclen/2;
+ for(i=0; i < text_len; i++) {
+ catdoc_read(buf,2,1,input);
+ u=(unsigned short)getshort(buf,0);
+ if(u!=0x0d)
+ fputs(convert_char(u),stdout);
+ else
+ fputc('\n',stdout);
+ }
+ fputc('\n',stdout);
+ }
+ break;
+
+ case USER_EDIT_ATOM:
+/* fprintf(stderr,"UserEditAtom, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case COLOR_SCHEME_ATOM:
+/* fprintf(stderr,"ColorSchemeAtom, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case PPDRAWING:
+/* fprintf(stderr,"PPDrawing, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case ENVIRONMENT:
+/* fprintf(stderr,"Environment, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case SSDOC_INFO_ATOM:
+/* fprintf(stderr,"SSDocInfoAtom, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case SSSLIDE_INFO_ATOM:
+/* fprintf(stderr,"SSSlideInfoAtom, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case PROG_TAGS:
+/* fprintf(stderr,"ProgTags, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case PROG_STRING_TAG:
+/* fprintf(stderr,"ProgStringTag, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case PROG_BINARY_TAG:
+/* fprintf(stderr,"ProgBinaryTag, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case LIST:
+/* fprintf(stderr,"List, reclen=%ld\n", reclen); */
+ break;
+
+ case SLIDE_LIST_WITH_TEXT:
+/* fprintf(stderr,"SlideListWithText, reclen=%ld\n", reclen); */
+/* fputs("---------------------------------------\n",stderr); */
+ break;
+
+ case PERSIST_PTR_INCREMENTAL_BLOCK:
+/* fprintf(stderr,"PersistPtrIncrementalBlock, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case EX_OLE_OBJ_STG:
+/* fprintf(stderr,"ExOleObjStg, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case PPDRAWING_GROUP:
+/* fprintf(stderr,"PpdrawingGroup, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case EX_OBJ_LIST:
+/* fprintf(stderr,"ExObjList, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case TX_MASTER_STYLE_ATOM:
+/* fprintf(stderr,"TxMasterStyleAtom, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case HANDOUT:
+/* fprintf(stderr,"Handout, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case SLIDE_PERSIST_ATOM:
+/* fprintf(stderr,"SlidePersistAtom, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case TEXT_HEADER_ATOM:
+/* fprintf(stderr,"TextHeaderAtom, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case TEXT_SPEC_INFO:
+/* fprintf(stderr,"TextSpecInfo, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ case STYLE_TEXT_PROP_ATOM:
+/* fprintf(stderr,"StyleTextPropAtom, reclen=%ld\n", reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;
+
+ /* case :
+ fprintf(stderr,", reclen=%ld\n", reclen);
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;*/
+
+ /* case :
+ fprintf(stderr,", reclen=%ld\n", reclen);
+ catdoc_seek(input, reclen, SEEK_CUR);
+ break;*/
+
+ default:
+/* fprintf(stderr,"Default action for rectype=%d reclen=%ld\n", */
+/* rectype, reclen); */
+ catdoc_seek(input, reclen, SEEK_CUR);
+
+ }
+
+}
--- /dev/null
+/**
+ * @file ppttypes.h
+ * @author Alex Ott <alexott@gmail.com>
+ * @date 26 Dec 2004
+ * Version: $Id: ppttypes.h,v 1.1 2006-02-24 17:44:06 vitus Exp $
+ * Copyright: Alex Ott
+ *
+ * @brief Enumerations for .ppt records
+ *
+ *
+ */
+
+#ifndef _PPTTYPES_H
+#define _PPTTYPES_H 1
+
+#define UNKNOWN 0
+#define DOCUMENT 1000
+#define DOCUMENT_ATOM 1001
+#define DOCUMENT_END 1002
+#define SLIDE_PERSIST 1003
+#define SLIDE_BASE 1004
+#define SLIDE_BASE_ATOM 1005
+#define SLIDE 1006
+#define SLIDE_ATOM 1007
+#define NOTES 1008
+#define NOTES_ATOM 1009
+#define ENVIRONMENT 1010
+#define SLIDE_PERSIST_ATOM 1011
+#define MAIN_MASTER 1016
+#define SSSLIDE_INFO_ATOM 1017
+#define SSDOC_INFO_ATOM 1025
+#define EX_OBJ_LIST 1033
+#define PPDRAWING_GROUP 1035
+#define PPDRAWING 1036
+#define LIST 2000
+#define COLOR_SCHEME_ATOM 2032
+#define TEXT_HEADER_ATOM 3999
+#define TEXT_CHARS_ATOM 4000
+#define STYLE_TEXT_PROP_ATOM 4001
+#define TX_MASTER_STYLE_ATOM 4003
+#define TEXT_BYTES_ATOM 4008
+#define TEXT_CISTYLE_ATOM 4008
+#define TEXT_SPEC_INFO 4010
+#define EX_OLE_OBJ_STG 4113
+#define CSTRING 4026
+#define HANDOUT 4041
+#define HEADERS_FOOTERS 4057
+#define HEADERS_FOOTERS_ATOM 4058
+#define SLIDE_LIST_WITH_TEXT 4080
+#define SLIDE_LIST 4084
+#define USER_EDIT_ATOM 4085
+#define PROG_TAGS 5000
+#define PROG_STRING_TAG 5001
+#define PROG_BINARY_TAG 5002
+#define PERSIST_PTR_INCREMENTAL_BLOCK 6002
+/* #define */
+/* #define */
+/* #define */
+/* #define */
+/* #define */
+
+#endif /* _PPTTYPES_H */
+
--- /dev/null
+/*****************************************************************/
+/* Reading routines for MS-Word, MS-Write and text files */
+/* */
+/* This file is part of catdoc project */
+/* (c) Victor Wagner 1996-2003, (c) Alex Ott 2003 */
+/*****************************************************************/
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+#include <string.h>
+#include <stdio.h>
+#include <ctype.h> /* ispunct */
+#include "catdoc.h"
+unsigned short int buffer[PARAGRAPH_BUFFER];
+static unsigned char read_buf[256];
+static int buf_is_unicode;
+
+/**************************************************************************/
+/* Just prints out content of input file. Called when file is not OLE */
+/* stream */
+/* Parameters - f - file to copy out. header - first few bytes of file, */
+/* which have been already read by format recognition code, but should */
+/* be output anyway */
+/**************************************************************************/
+void copy_out (FILE *f,char *header) {
+ char *buf=(char *)buffer;
+ int count,i;
+ long offset;
+ if (get_unicode_char == get_word8_char) {
+ /* non-word file and -u specified. Trying to guess which kind of
+ * unicode is used
+ */
+ if ((unsigned char)header[0]==0xFE && (unsigned char)header[1]==0xFF) {
+ get_unicode_char = get_utf16msb;
+ fputs(convert_char((unsigned char)header[2]<<8|(unsigned char)header[3]),stdout);
+ fputs(convert_char((unsigned char)header[4]<<8|(unsigned char)header[5]),stdout);
+ fputs(convert_char((unsigned char)header[6]<<8|(unsigned char)header[7]),stdout);
+ } else if ((unsigned char)header[0]!=0xFF ||
+ (unsigned char)header[1]!=0xFE) {
+ int c,j,d;
+ /* if it is not utf16, assume it is UTF8. We are told -u,
+ * aren't we */
+ get_unicode_char = get_utf8;
+ i=0;
+ while (i<8) {
+ c=(unsigned char)header[i++];
+ if (c >=0x80) {
+ if ( c<0xE0) {
+ c=(c & 0x1F);
+ count =1;
+ } else {
+ c=(c & 0xF);
+ count = 2;
+ }
+ for (j=0;j<count;j++) {
+ if (i<8) { /* header holds 8 bytes, header[7] is valid */
+ d=(unsigned char) header[i++];
+ } else {
+ d=fgetc(f);
+ }
+ c=c<<6 | (d & 0x3F);
+ }
+ }
+ fputs (convert_char(c),stdout);
+ }
+ } else {
+ get_unicode_char = get_utf16lsb;
+ fputs(convert_char((unsigned char)header[3]<<8|(unsigned char)header[2]),stdout);
+ fputs(convert_char((unsigned char)header[5]<<8|(unsigned char)header[4]),stdout);
+ fputs(convert_char((unsigned char)header[7]<<8|(unsigned char)header[6]),stdout);
+ }
+ while (!catdoc_eof(f)) {
+ i=get_unicode_char(f,&offset,0x7FFFFFFF);
+ if (i!=EOF) fputs(convert_char(i),stdout);
+ }
+ } else {
+ for (i=0;i<8;i++) {
+ fputs(convert_char(to_unicode(source_charset,(unsigned char)header[i])),stdout);
+ }
+ /* Assuming 8-bit input text */
+ while ((count = catdoc_read(buf,1,PARAGRAPH_BUFFER,f))) {
+ for (i=0;i<count;i++) {
+ fputs(convert_char(to_unicode(source_charset,
+ (unsigned char)buf[i])),stdout);
+ }
+ }
+ }
+}
+/**************************************************************************/
+/* process_file - main process engine. Reads word file using function, */
+/* pointed by get_unicode_char, searches for things which looks like */
+/* paragraphs and print them out */
+/**************************************************************************/
+int process_file(FILE *f,long stop) {
+ int bufptr;
+ int tabmode=0;
+ long offset=0;
+ int hyperlink_mode = 0;
+ unsigned short c;
+ /* Now we are starting to read with get_unicode_char */
+ while (!catdoc_eof(f) && offset<stop) {
+ bufptr = -1;
+ do {
+ c=get_unicode_char(f,&offset,stop);
+ /* Following symbols below 32 are allowed inside paragraph:
+ 0x0002 - footnote mark
+ 0x0007 - table separator (converted to tabmode)
+ 0x0009 - Horizontal tab ( printed as is)
+ 0x000B - hard return
+ 0x000C - page break
+ 0x000D - return - marks an end of paragraph
+ 0x001E - IS2 for some reason means short hyphen in Word.
+ 0x001F - soft hyphen in Word
+ 0x0013 - start embedded hyperlink
+ 0x0014 - separate hyperlink URL from text
+ 0x0015 - end embedded hyperlink
+ */
+ if (tabmode) {
+ tabmode=0;
+ if (c==0x007) {
+ buffer[++bufptr]=0x1E;
+ continue;
+ } else {
+ buffer[++bufptr]=0x1C;
+ }
+ }
+ if (c<32) {
+ switch (c) {
+ case 0x007:
+ tabmode = 1;
+ break;
+ case 0x000D:
+ case 0x000B:
+ buffer[++bufptr]=0x000A;
+ break;
+ case 0x000C:
+ buffer[++bufptr]=c;
+ break;
+ case 0x001E:
+ buffer[++bufptr]='-';
+ break;
+ case 0x0002: break;
+
+ case 0x001F:
+ buffer[++bufptr]=0xAD;/* translate to Unicode
+ soft hyphen */
+ break;
+ case 0x0009:
+ buffer[++bufptr]=c;
+ break;
+ case 0x0013:
+ hyperlink_mode=1;
+ buffer[++bufptr]=' ';
+ break;
+ case 0x0014:
+ hyperlink_mode = 0;
+ /*fall through */
+ case 0x0015:
+ /* just treat hyperlink separators as
+ * space */
+ buffer[++bufptr]=' ';
+ break;
+ case 0x0001: if (hyperlink_mode)
+ break;
+ /* else fall through */
+ default:
+ bufptr=-1; /* Any other control char - discard para*/
+ }
+ } else if (c != 0xfeff) {
+ /* skip ZERO WIDTH NO-BREAK SPACE (U+FEFF). Output anything
+ * else */
+ buffer[++bufptr]=c;
+ }
+ } while (bufptr<PARAGRAPH_BUFFER-2 &&
+ !catdoc_eof(f) &&
+ (bufptr<0 || buffer[bufptr]!=0x000a));
+ if (bufptr>0) {
+ buffer[++bufptr]=0;
+ output_paragraph(buffer);
+ }
+ }
+ return 0;
+}
+/**********************************************************************/
+/* Reads a character from MS-Word 97 and above file. Takes into *
+ * account the strange situation that unicode and non-unicode *
+ * 256-byte blocks could be intermixed in word file *
+ * *
+ * Parameters: *
+ * *
+ * f - file to read *
+ * offset - position of the character inside file (to determine *
+ * possible block boundaries) *
+ * fileend - offset at which the text ends *
+ **********************************************************************/
+int get_word8_char(FILE *f,long *offset,long fileend) {
+ int count,i,u;
+ unsigned char c; /* ispunct needs a value in unsigned char range */
+ if ((i=(*offset)%256) ==0) {
+ count=catdoc_read(read_buf,1,256,f);
+ memset(read_buf+count,0,256-count);
+ buf_is_unicode=0;
+ if (*offset+(long)count>fileend) {
+ count=fileend-*offset;
+ }
+ while (i<count) {
+ c=read_buf[i++];
+ if ((c==0x20|| c==0x0D||ispunct(c))&&i<count&&read_buf[i]==0) {
+ buf_is_unicode=1;
+ break;
+ }
+ i++;
+ }
+ i=0;
+ }
+ if (buf_is_unicode) {
+ u=read_buf[i] | read_buf[i+1]<<8;
+ (*offset)+=2;
+ } else {
+ u=to_unicode(source_charset,read_buf[i]);
+ (*offset)++;
+ }
+ return u;
+}
+
+
--- /dev/null
+/*****************************************************************/
+/* Reading routines for rtf files */
+/* */
+/* This file is part of catdoc project */
+/* (c) Victor Wagner 2003, (c) Alex Ott 2003 */
+/*****************************************************************/
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <ctype.h>
+#include <string.h>
+#include "catdoc.h"
+
+/********************************************************
+ * Datatypes declaration
+ *
+ */
+typedef enum {
+ RTF_CODEPAGE,
+ RTF_FONT_CHARSET,
+ RTF_UC,
+ RTF_UNICODE_CHAR,
+ RTF_CHAR,
+ RTF_PARA,
+ RTF_TABLE_START,
+ RTF_TABLE_END,
+ RTF_ROW,
+ RTF_CELL,
+ RTF_UNKNOWN,
+ RTF_OVERLAY,
+ RTF_PICT,
+ RTF_F,
+ RTF_AUTHOR,
+ RTF_FONTTBL,
+ RTF_INFO,
+ RTF_STYLESHEET,
+ RTF_COLORTBL,
+ RTF_LISTOVERRIDETABLE,
+ RTF_LISTTABLE,
+ RTF_RSIDTBL,
+ RTF_GENERATOR,
+ RTF_DATAFIELD,
+ RTF_LANG,
+ RTF_PARD,
+ RTF_TAB,
+ RTF_SPEC_CHAR,
+ RTF_EMDASH,
+ RTF_ENDASH,
+ RTF_EMSPACE,
+ RTF_ENSPACE,
+ RTF_BULLET,
+ RTF_LQUOTE,
+ RTF_RQUOTE,
+ RTF_LDBLQUOTE,
+ RTF_RDBLQUOTE,
+ RTF_ZWNONJOINER,
+} RTFTypes;
+
+typedef struct {
+ char *name;
+ RTFTypes type;
+} RTFTypeMap;
+
+RTFTypeMap rtf_types[]={
+ {"uc",RTF_UC},
+ {"ansicpg",RTF_CODEPAGE},
+ {"pard",RTF_PARD},
+ {"par",RTF_PARA},
+ {"cell",RTF_CELL},
+ {"row",RTF_ROW},
+ {"overlay",RTF_OVERLAY},
+ {"pict",RTF_PICT},
+ {"author",RTF_AUTHOR},
+ {"f",RTF_F},
+ {"fonttbl",RTF_FONTTBL},
+ {"info",RTF_INFO},
+ {"stylesheet",RTF_STYLESHEET},
+ {"colortbl",RTF_COLORTBL},
+ {"listtable",RTF_LISTTABLE},
+ {"listoverridetable",RTF_LISTOVERRIDETABLE},
+ {"rsidtbl",RTF_RSIDTBL},
+ {"generator",RTF_GENERATOR},
+ {"datafield",RTF_DATAFIELD},
+ {"lang",RTF_LANG},
+ {"tab",RTF_TAB},
+ {"emdash",RTF_EMDASH},
+ {"endash",RTF_ENDASH},
+ {"emspace",RTF_EMSPACE},
+ {"enspace",RTF_ENSPACE},
+ {"bullet",RTF_BULLET},
+ {"lquote",RTF_LQUOTE},
+ {"rquote",RTF_RQUOTE},
+ {"ldblquote",RTF_LDBLQUOTE},
+ {"rdblquote",RTF_RDBLQUOTE},
+ {"zwnj",RTF_ZWNONJOINER},
+/* {"",}, */
+/* {"",}, */
+ {"u",RTF_UNICODE_CHAR}
+};
+
+#define RTFNAMEMAXLEN 32
+#define RTFARGSMAXLEN 64
+
+/**
+ * Structure describing rtf command
+ *
+ */
+typedef struct {
+ RTFTypes type;
+ char name[RTFNAMEMAXLEN+1];
+ signed int numarg;
+/* void *args; */
+} RTFcommand;
+
+
+#define MAXFONTNAME 64
+/**
+ * Structure to describe font
+ *
+ */
+typedef struct {
+ int name;
+ char fontname[MAXFONTNAME+1];
+} RTFFont;
+
+/**
+ * Structure to describe style
+ *
+ */
+typedef struct {
+ int codepage;
+} RTFStyle;
+
+/**
+ * Structure to store values, local to rtf group
+ *
+ */
+typedef struct {
+ int uc; /**< How many symbols to skip */
+ RTFStyle* style; /**< current style */
+} RTFGroupData;
+
+/********************************************************
+ * Functions declaration
+ *
+ */
+
+extern int forced_charset;
+signed long getNumber(FILE *f);
+
+int getRtfCommand(FILE *f, RTFcommand *command );
+unsigned short int rtf_to_unicode(int code);
+RTFTypes getCommandType(char *name);
+signed int getCharCode(FILE *f);
+void rtfSetCharset(short int **charset_ptr,unsigned int codepage);
+
+/********************************************************
+ * Global data
+ *
+ */
+short int *current_charset;
+int rtf_level=0;
+
+/********************************************************
+ * Functions implementation
+ *
+ */
+extern unsigned short int buffer[];
+void add_to_buffer(int *bufptr,unsigned short int c) {
+ buffer[++(*bufptr)]=c;
+ if (*bufptr > PARAGRAPH_BUFFER-2) {
+ buffer[++(*bufptr)]=0;
+ output_paragraph(buffer);
+ *bufptr=-1;
+ }
+}
+
+void end_paragraph(int *bufptr) {
+ add_to_buffer(bufptr,0x000a);
+ add_to_buffer(bufptr,0);
+ output_paragraph(buffer);
+ *bufptr=-1;
+}
+
+/**
+ * Parses RTF file from file stream
+ *
+ * @param f - file stream descriptor
+ */
+int parse_rtf(FILE *f) {
+ int para_mode=0, data_skip_mode=0,i;
+ RTFGroupData *groups=NULL;
+ int group_count=0, group_store=20;
+ int bufptr=-1;
+ current_charset=source_charset;
+ fseek(f,0,SEEK_SET);
+ if((groups=(RTFGroupData*)calloc(group_store,sizeof(RTFGroupData))) == NULL ) {
+ perror("Can't allocate memory");
+ return 1;
+ }
+ groups[0].uc = 2; /* Default uc = 2 */
+ while ( !feof(f) ) {
+ int c = fgetc(f);
+ if ( feof( f ) )
+ break;
+ switch (c) {
+ case '\\': {
+ int code;
+ RTFcommand com;
+ if ((code=getRtfCommand(f, &com)) != 0)
+ break;
+ switch (com.type) {
+ case RTF_SPEC_CHAR:
+/* fprintf(stderr, "Spec Char found=%s and arg=%c\n", */
+/* com.name, com.numarg); */
+ if (com.numarg == '*' && data_skip_mode == 0) {
+ data_skip_mode=group_count;
+ } else if (com.numarg == '\r') {
+ end_paragraph(&bufptr);
+ } else if (com.numarg == '~') {
+ add_to_buffer(&bufptr,0xA0);/* NO-BREAK SPACE */
+ } else if (com.numarg == '-') {
+ add_to_buffer(&bufptr,0xAD);/* Optional hyphen */
+ }
+
+ break;
+ case RTF_EMDASH:
+ add_to_buffer(&bufptr,0x2014);/* EM DASH*/
+ break;
+ case RTF_ENDASH:
+ add_to_buffer(&bufptr,0x2013);break;
+ case RTF_BULLET:
+ add_to_buffer(&bufptr,0x2022);break;
+ case RTF_LQUOTE: add_to_buffer(&bufptr,0x2018);break;
+ case RTF_RQUOTE: add_to_buffer(&bufptr,0x2019);break;
+ case RTF_LDBLQUOTE: add_to_buffer(&bufptr,0x201C);break;
+ case RTF_RDBLQUOTE: add_to_buffer(&bufptr,0x201D);break;
+ case RTF_ZWNONJOINER: add_to_buffer(&bufptr,0xfeff);break;
+ case RTF_EMSPACE:
+ case RTF_ENSPACE:
+ add_to_buffer(&bufptr,' ');break;
+ case RTF_CHAR:
+/* fprintf(stderr, "RTF char %d\n", com.numarg); */
+ if (data_skip_mode == 0) {
+ add_to_buffer(&bufptr,rtf_to_unicode(com.numarg));
+ }
+ break;
+ case RTF_UC:
+ groups[group_count].uc=com.numarg;
+ break;
+ case RTF_TAB:
+ add_to_buffer(&bufptr,0x0009);
+ break;
+ case RTF_UNICODE_CHAR:
+ if (com.numarg < 0)
+ break;
+/* fprintf(stderr, "Unicode char %d\n", com.numarg); */
+ if (data_skip_mode == 0)
+ add_to_buffer(&bufptr,com.numarg);
+ i=groups[group_count].uc;
+ while((--i)>0)
+ fgetc(f);
+ break;
+ case RTF_PARA:
+ /*if (para_mode > 0) {*/
+ end_paragraph(&bufptr);
+ /*}*/
+ para_mode=group_count;
+ break;
+ case RTF_PICT:
+ case RTF_FONTTBL:
+ case RTF_INFO:
+ case RTF_COLORTBL:
+ case RTF_STYLESHEET:
+ case RTF_LISTTABLE:
+ case RTF_LISTOVERRIDETABLE:
+ case RTF_RSIDTBL:
+ case RTF_GENERATOR:
+ case RTF_DATAFIELD:
+ if (data_skip_mode == 0){
+ data_skip_mode=group_count;
+ }
+ break;
+ case RTF_LANG:
+/* fprintf(stderr, "Selected lang = %d\n",com.numarg); */
+ break;
+ case RTF_CODEPAGE:
+ rtfSetCharset(&current_charset,com.numarg);
+ default:
+/* fprintf(stderr, "Unknown command with name %s and arg=%d\n", */
+/* com.name, com.numarg); */
+ ;
+ }
+ break;
+ }
+ case '{':
+ group_count++;
+ if (group_count >= group_store ) {
+ group_store+=10;
+ if((groups=(RTFGroupData*)realloc(groups,
+ group_store*sizeof(RTFGroupData)))
+ == NULL ) {
+ perror("Can't allocate memory");
+ return 1;
+ }
+ }
+ if (para_mode)
+ add_to_buffer(&bufptr,0x20);
+ groups[group_count]=groups[group_count-1];
+ break;
+ case '}':
+ group_count--;
+ if(group_count < 0)
+ group_count=0;
+ if(para_mode > 0 && para_mode > group_count) {
+ /*add_to_buffer(&bufptr,0);
+ output_paragraph(buffer);
+ fprintf(stderr,"\nGROUP_END para_mode=%d group_count=%d bufptr=%d\n", para_mode,group_count,bufptr);
+ bufptr=-1;*/
+ para_mode=0;
+ }
+ if(data_skip_mode > group_count) {
+ data_skip_mode=0;
+ }
+ break;
+ default:
+ if (data_skip_mode == 0)
+ if (c != '\n' && c != '\r')
+ add_to_buffer(&bufptr,rtf_to_unicode(c));
+ }
+ }
+ if (bufptr>=0) {
+ add_to_buffer(&bufptr,'\n');
+ add_to_buffer(&bufptr,0);
+ output_paragraph(buffer);
+ }
+ free(groups);
+ return 0;
+}
+
+/**
+ * Convert text string to number
+ *
+ * @param f stream to read data from
+ *
+ * @return converted number
+ */
+signed long getNumber(FILE *f) {
+ int c,count=0;
+ char buf[RTFARGSMAXLEN];
+
+ while(isdigit(c=fgetc(f)) || c=='-') {
+ if(feof(f))
+ return -1;
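+ /* Never write past buf: keep RTFARGSMAXLEN-1 chars, drop the rest */
+ if(count < RTFARGSMAXLEN-1)
+ buf[count++]=(char)c;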
+ }
+ ungetc(c,f);
+ buf[count]='\0';
+ return strtol(buf, (char **)NULL, 10);
+}
+
+/**
+ * Parse command stream from rtf file and fill command structure
+ *
+ * @param f - rtf file stream
+ * @param command - pointer to RTFcommand structure to fill
+ *
+ * @return parse code not 0 - error, 0 - success
+ */
+int getRtfCommand(FILE *f, RTFcommand *command ) {
+ int c=fgetc(f);
+ if (isalpha(c)) {
+ int name_count=1;
+ command->name[0]=(char)c;
+ while(isalpha(c=fgetc(f)) && name_count < RTFNAMEMAXLEN) {
+ if(feof(f))
+ return 1;
+ command->name[name_count++]=(char)c;
+ }
+ command->name[name_count]='\0';
+ command->type=getCommandType(command->name);
+/* command->args=NULL; */
+ ungetc(c,f);
+ if (isdigit(c) || c == '-' )
+ command->numarg=getNumber(f);
+ else
+ command->numarg=0;
+ c=fgetc(f);
+ if(!(c==' ' || c=='\t'))
+ ungetc(c,f);
+ } else {
+ command->name[0]=(char)c;
+ command->name[1]='\0';
+/* command->args=NULL; */
+ if (c == '\'') {
+ command->type=RTF_CHAR;
+ command->numarg=getCharCode(f);
+ if(feof(f))
+ return -1;
+ } else {
+ command->type=RTF_SPEC_CHAR;
+ command->numarg=c;
+ }
+ }
+
+ return 0;
+}
+
+/**
+ * Converts char to unicode.
+ *
+ * @param code - integer code of char
+ *
+ * @return converted char
+ */
+unsigned short int rtf_to_unicode(int code) {
+ int cc=code;
+ if (code < 0 || (cc=to_unicode(current_charset, code)) < 0 ) return 0xFEFF;
+ return cc;
+}
+
+/**
+ * Convert name of RTF command to RTFType
+ *
+ * @param name name to convert
+ *
+ * @return RTFType, if unknown command, then return RTF_UNKNOWN
+ */
+RTFTypes getCommandType(char *name) {
+ int i, olen=sizeof(rtf_types)/sizeof(RTFTypeMap);
+ for (i = 0; i < olen ; i++) {
+ if ( strcmp(name,rtf_types[i].name) == 0 ) {
+ return rtf_types[i].type;
+ }
+ }
+ return RTF_UNKNOWN;
+}
+
+/**
+ * Return number representing char code in Hex
+ *
+ * @param f stream to read data from
+ *
+ * @return converted number
+ */
+signed int getCharCode(FILE *f) {
+ int c,count=0,i;
+ char buf[RTFARGSMAXLEN];
+ for(i=0;i<2; i++) {
+ if (isxdigit(c=fgetc(f))) { /* accept upper-case hex digits too */
+ if(feof(f))
+ return -1;
+ buf[count++]=(char)c;
+ } else
+ ungetc(c,f);
+ }
+
+ buf[count]='\0';
+ return strtol(buf, (char **)NULL, 16);
+}
+
+void rtfSetCharset(short int **charset_ptr,unsigned int codepage)
+{
+ /* Do not override charset if it is specified in the command line */
+ const char *charset_name;
+ char *save_buf = input_buffer;
+ if (forced_charset) return;
+ charset_name = charset_from_codepage(codepage);
+ check_charset(&source_csname,charset_name);
+ input_buffer=NULL;
+ *charset_ptr = read_charset(source_csname);
+ input_buffer = save_buf;
+}
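
As a hedged illustration of the two-digit hex decoding that getCharCode
performs on the stream, here is a standalone sketch that decodes the digits
of an RTF \'hh escape from an in-memory string. The helper names
hex_digit_value and hex_pair_value are hypothetical and are not part of
catdoc. Unlike the stream version, the sketch rejects anything that is not
exactly two hex digits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not one.
 * Assumes the letters a-f and A-F are contiguous in the character set. */
static int hex_digit_value(int c)
{
	if (c >= '0' && c <= '9') return c - '0';
	if (c >= 'a' && c <= 'f') return c - 'a' + 10;
	if (c >= 'A' && c <= 'F') return c - 'A' + 10;
	return -1;
}

/* Hypothetical helper: decode the two hex digits at s, or return -1. */
static int hex_pair_value(const char *s)
{
	int hi, lo;
	assert(s != NULL);
	hi = hex_digit_value((unsigned char)s[0]);
	if (hi < 0) return -1;	/* also stops at the terminating NUL */
	lo = hex_digit_value((unsigned char)s[1]);
	if (lo < 0) return -1;
	return hi * 16 + lo;
}
```

A caller would pass the two characters that follow the \' introducer and
treat a negative result as a malformed escape.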
--- /dev/null
+/*****************************************************************/
+/* Representation and handling of Excel worksheets in memory */
+/* */
+/* This file is part of catdoc project */
+/* (c) Victor Wagner 1998-2003, (c) Alex Ott 2003 */
+/*****************************************************************/
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+#include <stdlib.h>
+#include <ctype.h>
+#include <string.h>
+#include "xls.h"
+struct rowdescr *rowptr=NULL;
+int startrow=0,lastrow=0;
+char cell_separator = ',';
+int quote_mode = QUOTE_ALL_STRINGS;
+char *sheet_separator = "\f";
+/*
+ * Allocates cell for given row and col and returns pointer to pointer
+ * of cell contents
+ */
+unsigned char **allocate (int row,int col) {
+ unsigned int newrow,newcol;
+ if (row>=lastrow) {
+ newrow=(row/16+1)*16;
+ rowptr=realloc(rowptr,newrow*sizeof(struct rowdescr));
+ if (rowptr == NULL) {
+ perror("allocating sheet ");
+ exit(1);
+ }
+ memset(rowptr+lastrow,0,(newrow-lastrow)*sizeof(struct rowdescr));
+ lastrow=newrow;
+ }
+ if (col>=rowptr[row].end) {
+ newcol=(col/16+1)*16;
+ rowptr[row].cells=realloc(rowptr[row].cells,newcol*sizeof(char *));
+ if (rowptr[row].cells == NULL) {
+ perror("allocating row");
+ exit(1);
+ }
+ memset(rowptr[row].cells+rowptr[row].end,0,(newcol-rowptr[row].end)
+ *sizeof(char *));
+ rowptr[row].end=newcol;
+ }
+ if (col>rowptr[row].last) rowptr[row].last=col;
+ return (rowptr[row].cells+col);
+}
+/*
+ * Frees up all memory used by sheet
+ */
+void free_sheet(void) {
+ int i,j;
+ struct rowdescr *row;
+ unsigned char **col;
+ for (row=rowptr,i=0;i<lastrow;i++,row++) {
+ if (!row->cells) continue;
+ for (col=row->cells,j=0;j<row->end;j++,col++) {
+ if (*col) {
+ free(*col);
+ }
+ }
+ free (row->cells);
+ }
+ free (rowptr);
+ rowptr=NULL;
+ lastrow=0;
+}
+
+/*
+ * prints out one value with quoting
+ * uses global variable quote_mode
+ */
+void print_value(unsigned char *value)
+{
+ int i,len;
+ int quotes=0;
+ if (value != NULL) {
+ len=strlen((char *)value);
+ } else {
+ len = 0;
+ }
+ switch (quote_mode) {
+ case QUOTE_NEVER:
+ break;
+ case QUOTE_SPACES_ONLY:
+ for (i=0;i<len;i++) {
+ if (isspace(value[i]) || value[i]==cell_separator ||
+ value[i]=='"') {
+ quotes=1;
+ break;
+ }
+ }
+ break;
+ case QUOTE_ALL_STRINGS:
+ { char *endptr;
+ if (value != NULL) {
+ strtod((char *)value,&endptr);
+ /* Quote unless the whole value parsed as a number */
+ quotes=(*endptr != '\0');
+ }
+ break;
+ }
+ case QUOTE_EVERYTHING:
+ quotes = 1;
+ break;
+ }
+ if (quotes) {
+ fputc('\"',stdout);
+ for (i=0;i<len;i++) {
+ if (value[i]=='\"') {
+ fputc('\"',stdout);
+ fputc('\"',stdout);
+ } else {
+ fputc(value[i],stdout);
+ }
+ }
+ fputc('\"',stdout);
+ } else if (value != NULL) {
+ fputs((char *)value,stdout);
+ }
+}
+/*
+ * Prints sheet to stdout. Uses global variable cell_separator
+ */
+void print_sheet(void) {
+ int i,j,printed=0;
+ struct rowdescr *row;
+ unsigned char **col;
+ lastrow--;
+ while (lastrow>0&&!rowptr[lastrow].cells) lastrow--;
+ for(i=0,row=rowptr;i<=lastrow;i++,row++) {
+ if (row->cells) {
+ for (j=0,col=row->cells;j<=row->last;j++,col++) {
+ if (j){
+ fputc(cell_separator,stdout);
+ printed=1;
+ }
+ if (*col) {
+ print_value(*col);
+ printed=1;
+ }
+ }
+ if (printed) {
+ fputc('\n',stdout);
+ printed=0;
+ }
+ }
+ }
+ fputs(sheet_separator,stdout);
+}
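
The quoting rules used by print_value can be sketched on their own, assuming
an in-memory output buffer instead of stdout. The helpers is_number and
csv_quote below are hypothetical names, not catdoc functions. is_number
mirrors the strtod test used for QUOTE_ALL_STRINGS (note that strtod also
accepts forms such as "inf" or "1e5"), and csv_quote doubles embedded quotes
the way the quoted branch does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: nonzero when the whole string parses as a number. */
static int is_number(const char *s)
{
	char *end;
	assert(s != NULL);
	if (*s == '\0') return 0;	/* sketch choice: empty is not a number */
	strtod(s, &end);
	return *end == '\0';
}

/* Hypothetical helper: write s to out wrapped in double quotes, doubling
 * every embedded quote. Returns 0 on success, -1 if out is too small. */
static int csv_quote(const char *s, char *out, size_t outsize)
{
	size_t n = 0;
	assert(s != NULL && out != NULL);
	if (outsize < 3) return -1;
	out[n++] = '"';
	for (; *s; s++) {
		/* worst case: two chars now, plus closing quote and NUL */
		if (n + 4 > outsize) return -1;
		if (*s == '"') out[n++] = '"';
		out[n++] = *s;
	}
	out[n++] = '"';
	out[n] = '\0';
	return 0;
}
```

The size check is deliberately conservative: it reserves room for a doubled
quote on every character, so a buffer that is only just large enough may
still be rejected.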
--- /dev/null
+/*****************************************************************/
+/* Substitution maps to replace some unicode characters with */
+/* multicharacter sequences */
+/* */
+/* This file is part of catdoc project */
+/* (c) Victor Wagner 1998-2003, (c) Alex Ott 2003 */
+/*****************************************************************/
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+#include <string.h>
+#include <ctype.h>
+#include <stdlib.h>
+#include "catdoc.h"
+SUBSTMAP spec_chars, replacements;
+char *map_path=CHARSETPATH;
+void map_insert(SUBSTMAP map, int uc, const char *s) ;
+int longest_sequence=6;/* six is longest character sequence which can be
+ generated by catdoc internally*/
+
+/******************************************************************/
+/* checks for terminator of character sequence. If stop is something
+ * like quote - check for same char. If it is \n, check for any space
+ ********************************************************************/
+int isstop(char c, char stop) {
+ if (stop=='\n') {
+ return isspace(c);
+ } else {
+ return c==stop;
+ }
+}
+/************************************************************************/
+/* Reads substitution map file. */
+/************************************************************************/
+SUBSTMAP read_substmap(char* filename) {
+ FILE *f;
+ SUBSTMAP map=calloc(sizeof(char **),256);
+ char *path, line[1024], *p, *q;
+ char s[256];
+ char stopchar;
+ int escaped, lineno=0, i;
+ unsigned int uc;
+ path=find_file(filename,add_exe_path(map_path));
+ if (!path) {
+ free(map);
+ return(NULL);
+ }
+ if (!map) {
+ fprintf(stderr,"Insufficient memory\n");
+ exit(1);
+ }
+ f=fopen(path,"rb");
+ if (!f) {
+ perror("catdoc");
+ return NULL;
+ }
+ if (input_buffer)
+ setvbuf(f,input_buffer,_IOFBF,FILE_BUFFER);
+
+ while (!feof(f)) {
+ if (!fgets(line,1024,f)) continue;
+ lineno++;
+ /* parse line */
+
+ /* skip leading space */
+ for(p=line;*p && isspace(*p);p++);
+ /* if #, it is comment */
+ if (!*p ||
+#ifdef __MSDOS__
+ *p==0x1A || /* DOS has a strange habit of using ^Z as EOF */
+#endif
+ *p=='#') continue;
+ /* read hexadecimal code */
+ uc = strtol(p,&p,16);
+ if (!isspace(*p)|| uc<0 || uc>0xfffd) {
+ fprintf(stderr,"Error parsing %s(%d)\n",path,lineno);
+ continue;
+ }
+ /* skip space between code and sequence */
+ for(;*p && isspace(*p);p++);
+ if (!*p) continue;
+ switch (*p) {
+ case '\'':
+ case '"':
+ stopchar=*p;
+ break;
+ case '(':
+ stopchar=')';
+ break;
+ case '[':
+ stopchar=']';
+ break;
+ case '{':
+ stopchar='}';
+ break;
+ default:
+ p--;
+ stopchar='\n';
+ }
+ p++;
+ q=p;
+ escaped=0;
+ while (*q && (!isstop(*q,stopchar) || escaped)) {
+ if (escaped) {
+ escaped=0;
+ } else {
+ escaped= (*q=='\\');
+ }
+ q++;
+ }
+ if (*q!=stopchar && !(isspace(*q) && stopchar=='\n')) {
+ fprintf(stderr,"Error parsing %s(%d): unterminated sequence\n",
+ path,lineno);
+ continue;
+ }
+ /* Terminate the raw sequence, then expand backslash escapes into s */
+ *q=0;
+ for (q=s,i=0;*p && i<255;q++,i++) {
+ if (*p!='\\') {
+ *q=*p++;
+ } else {
+ switch (*(++p)) {
+ case 'n': *q='\n'; break;
+ case 'r': *q='\r'; break;
+ case 't': *q='\t'; break;
+ case 'b': *q='\b'; break;
+ case '\"': *q='\"'; break;
+ case '\'': *q='\''; break;
+ case '0': *q=strtol(p,&p,8); p--; break;
+ case '\\':
+ default:
+ *q=*p;
+ }
+ p++;
+ }
+ }
+ *q=0;
+ if (i>longest_sequence)
+ longest_sequence=i;
+ map_insert(map,uc,s);
+ }
+ fclose(f);
+ free(path);
+ return map;
+}
+
+/*************************************************************************/
+/* searches for string in the substitution map. Returns NULL if not found */
+/*************************************************************************/
+char * map_subst ( SUBSTMAP map, int uc) {
+ char **p=map[(unsigned)uc >>8];
+ if (!p) return NULL;
+ return p[uc & 0xff];
+}
+
+/*************************************************************************/
+/* inserts string + unicode code into map */
+/*************************************************************************/
+void map_insert(SUBSTMAP map, int uc, const char *s) {
+ SUBSTMAP p=map+((unsigned)uc>>8);
+
+ if (!*p) {
+ *p= calloc(sizeof(char*),256);
+ if (!*p) {
+ fprintf(stderr,"Insufficient memory\n");
+ exit(1);
+ }
+ }
+ (*p)[uc & 0xff] = strdup(s);
+}
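
The substitution map above is a two-level table: the high byte of the
character code selects a lazily allocated page and the low byte selects a
slot in it. A minimal sketch of that layout follows, assuming codes below
0x10000. TwoLevelMap, tlm_put and tlm_get are hypothetical names, and the
sketch stores the caller's pointer instead of copying the string with strdup
as map_insert does. Like the original, it relies on calloc producing null
pointers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical type: 256 pages, each allocated on first use. */
typedef struct {
	const char **pages[256];
} TwoLevelMap;

/* Store s under code. Returns 0 on success, -1 if out of memory. */
static int tlm_put(TwoLevelMap *m, unsigned int code, const char *s)
{
	unsigned int page = (code >> 8) & 0xffu;
	assert(m != NULL);
	if (m->pages[page] == NULL) {
		m->pages[page] = calloc(256, sizeof(const char *));
		if (m->pages[page] == NULL) return -1;
	}
	m->pages[page][code & 0xffu] = s;
	return 0;
}

/* Look up code. Returns NULL when the page or the slot is empty. */
static const char *tlm_get(const TwoLevelMap *m, unsigned int code)
{
	const char **page;
	assert(m != NULL);
	page = m->pages[(code >> 8) & 0xffu];
	return page ? page[code & 0xffu] : NULL;
}
```

The design keeps sparse maps cheap: a map that only touches a few Unicode
blocks allocates only the pages for those blocks.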
--- /dev/null
+# -* wish *-
+# fallback which allows me to run wordview.tcl without doing make
+package require Tcl 8.3
+
+if ![info exist charset_lib] {
+ set charset_lib /usr/local/lib/catdoc
+}
+option add *Text.Font {Courier 11} widgetDefault
+option add *Text.Background white widgetDefault
+option add *Text.Foreground black widgetDefault
+option add *Text.selectBackground black widgetDefault
+option add *Text.selectForeground white widgetDefault
+option add *Text.findMode exact widgetDefault
+option add *Text.findCase no widgetDefault
+option add *Menu.highlightBackground MidnightBlue widgetDefault
+option add *Menu.highlightThickness 0 widgetDefault
+option add *Menu.activeBackground MidnightBlue widgetDefault
+option add *Menu.activeForeground white widgetDefault
+option add *Menu.activeBorderWidth 0 widgetDefault
+menu .mainmenu
+. configure -menu .mainmenu
+.mainmenu add cascade -label File -menu [set m [menu .mainmenu.file]] -underline 0
+$m add command -label Open... -command load_file -accelerator Ctrl-O
+$m add command -label "Save As..." -command write_file -accelerator Ctrl-S -state disabled
+$m add separator
+$m add command -label Quit -command exit -accelerator Alt-F4
+set m [menu .mainmenu.edit -postcommand EditEnable]
+.mainmenu add cascade -label Edit -menu $m -underline 0 -state disabled
+$m add command -label Copy -command CopySel -accelerator Ctrl-C
+$m add separator
+$m add command -label "Select All" -accelerator Ctrl-A -command \
+ {.text tag add sel 0.0 end}
+.mainmenu add cascade -label Find -menu .mainmenu.search -underline 1 -state disabled
+set m [menu .mainmenu.search -postcommand EnableSearch]
+$m add command -label "Find..." -command FindDialog -accelerator Ctrl-F
+$m add command -label "Find Again" -accelerator F3 -command DoFind
+#
+# build charset menu
+#
+
+.mainmenu add cascade -state disabled -label Encoding -menu [set m [menu .mainmenu.encoding]]
+$m add radio -label Default -value Default -var in_charset
+$m add radio -label unicode -value unicode -var in_charset
+foreach l [glob [file join $charset_lib *.txt]] {
+ set n [file rootname [file tail $l]]
+ $m add radio -label $n -value $n -var in_charset
+}
+
+set in_charset Default
+# write_file passes out_charset to make_opt, so it needs a value too
+set out_charset Default
+
+trace var in_charset w reread
+set m [menu .mainmenu.help]
+.mainmenu add cascade -label Help -menu $m -underline 0
+$m add command -label "Manual page" -command [list show_help [file tail $argv0]]
+$m add command -label "Regular expressions" -command {show_help re_syntax}
+$m add separator
+$m add command -label "About..." -command AboutDialog
+
+
+
+text .text -width 80 -height 25 -xscrollcommand ".xs set" \
+ -yscrollcommand ".ys set" -wrap word \
+ -spacing3 2m
+.text tag configure sel -relief flat -borderwidth 0
+.text tag configure doc -lmargin1 0.2i -lmargin2 0
+scrollbar .ys -orient vert -command ".text yview"
+scrollbar .xs -orient horiz -command ".text xview"
+bind .text <F3> { if [info exists FindPattern] DoFind}
+bind .text <Control-O> load_file
+bind .text <Control-o> load_file
+bind .text <Control-S> {write_file}
+bind .text <Control-s> {write_file}
+bind .text <Control-F> FindDialog
+bind .text <Control-f> FindDialog
+grid .text .ys
+grid .xs x
+grid .text -sticky news
+grid .xs -sticky we
+grid .ys -sticky ns
+grid columnconfigure . 0 -weight 1
+grid columnconfigure . 1 -weight 0
+grid rowconfigure . 0 -weight 1
+grid rowconfigure . 1 -weight 0
+
+# Find options (All this can be tuned from dialog)
+set FindMode -[option get .text findMode FindMode] ;# no -regexp for novices
+set FindDir -forwards ;# Why not -backwards
+set FindCase -nocase ;# Leave it empty if you want to be case sensitive
+if {[option get .text findCase FindCase]} {
+ set FindCase ""
+}
+
+
+proc show_help {page} {
+ global argv0
+ if [winfo exists .man] {
+ wm deiconify .man
+ raise .man
+ .man.text delete 0.0 end
+ } else {
+ toplevel .man -class Man
+ wm title .man "[file tail $argv0] help: $page"
+ menu .man.menu
+ .man.menu add cascade -label File -menu [set m [menu .man.menu.file]]
+ .man configure -menu .man.menu
+ $m add command -label Close -command {destroy .man}
+ text .man.text -yscrollcommand {.man.y set}
+ scrollbar .man.y -command {.man.text yview} -orient vert
+ grid .man.text .man.y -sticky news
+ grid columnconfigure .man 0 -weight 1
+ grid columnconfigure .man 1 -weight 0
+ }
+ .man.text insert end [exec man $page 2>/dev/null | col -b ]
+}
+
+proc load_file {{name {}}} {
+global filename
+if ![string length $name] {set name [tk_getOpenFile -filetypes {
+{{Msword files} .doc}
+{{RTF files} .rtf}
+{{MS Write files} .wri}
+{{All files} *}} ]}
+if ![string length $name] return
+if ![file readable $name] {
+ return -code error "Cannot open file $name"
+}
+set filename $name
+.mainmenu entryconfigure Encoding -state normal
+.mainmenu.file entryconfigure "Save As..." -state normal
+.mainmenu entryconfigure "Edit" -state normal
+.mainmenu entryconfigure "Find" -state normal
+reread
+}
+
+proc make_opt {var flag} {
+ upvar #0 $var charset
+ switch $charset {
+ "Default" {return ""}
+ "unicode" {return "-u"}
+ default {return "$flag $charset"}
+ }
+}
+proc reread {args} {
+global filename in_charset out_charset
+
+set inopt [make_opt in_charset -s]
+set f [open "|[concat [list catdoc -w] $inopt [list -d utf-8 $filename]]" r]
+fconfigure $f -encoding utf-8
+.text configure -state normal
+.text delete 0.0 end
+.text insert 0.0 [read $f] doc
+.text mark set insert 1.0
+.text configure -state disabled
+.text see 1.0
+if [catch {close $f} msg] {
+ tk_messageBox -icon error -title error -message $msg -type ok
+ return
+}
+}
+proc write_file {{name {}}} {
+ global filename
+ if ![string length $name] {
+ set name [tk_getSaveFile -filetypes {
+ {{Text files} .txt}
+ {{LaTeX files} .tex}}]
+ }
+ if ![string length $name] return
+ if {[file extension $name]==".tex"} {
+ eval exec catdoc -t [make_opt in_charset -s] [make_opt out_charset -d]\
+ [list $filename] > [list $name]
+ } else {
+ eval exec catdoc [make_opt in_charset -s] [make_opt out_charset -d]\
+ [list $filename] > [list $name]
+ }
+}
+# -postcommand for Edit menu
+proc EditEnable {} {
+if [llength [.text tag ranges sel]] {
+ .mainmenu.edit entryconfigure Copy -state normal
+} else {
+ .mainmenu.edit entryconfigure Copy -state disabled
+}
+}
+proc CopySel {} {
+clipboard clear
+clipboard append -- [.text get sel.first sel.last]
+}
+proc FindDialog {} {
+make_transient .find "Find"
+frame .find.top
+label .find.top.l -text "Find"
+entry .find.top.e -width 30 -textvar FindPattern
+bind .find.top.e <Key-Return> ".find.b.find invoke"
+pack .find.top.l .find.top.e -side left
+FindOptionFrame
+frame .find.b
+button .find.b.find -text "Search" -command DoFind
+button .find.b.close -text "Close" -command "destroy .find"
+pack .find.b.find .find.b.close -side left -padx 20
+pack .find.top -pady 5 -anchor w -padx 10
+pack .find.opt -pady 10
+pack .find.b
+focus .find.top.e
+}
+proc EnableSearch {} {
+global FindPattern ReplaceString
+if ![info exists FindPattern] {
+ .mainmenu.search entryconfigure "Find Again" -state disabled
+} else {
+ .mainmenu.search entryconfigure "Find Again" -state normal
+}
+}
+proc make_transient {wpath title} {
+set x [expr [winfo rootx .]+[winfo width .]/3]
+set y [expr [winfo rooty .]+[winfo height .]/3]
+catch {destroy $wpath}
+toplevel $wpath
+wm transient $wpath .
+wm positionfrom $wpath program
+wm geometry $wpath +$x+$y
+wm title $wpath $title
+}
+proc FindOptionFrame {} {
+frame .find.opt
+checkbutton .find.opt.dir -variable FindDir -onvalue -backwards\
+ -offvalue -forwards -text Backward
+checkbutton .find.opt.regex -variable FindMode -onvalue\
+ -regex -offvalue -exact -text RegExp
+checkbutton .find.opt.case -variable FindCase -onvalue -nocase -offvalue {}\
+ -text "Ignore case"
+pack .find.opt.dir .find.opt.regex .find.opt.case -side left
+}
+proc DoFind {{quiet 0}} {
+global FindPattern FindMode FindDir FindCase
+if ![string length $FindPattern] {return 0}
+if {$FindDir=="-backwards"} {
+ set stopindex 0.0
+} else {
+ set stopindex end
+}
+set index [eval ".text search $FindCase $FindMode $FindDir -- \
+ [list $FindPattern] insert $stopindex"]
+if ![string length $index] {
+ if !$quiet {
+ tk_messageBox -type ok -title "Not found" -message "Pattern not found"
+ }
+ return 0
+} else {
+.text tag remove sel 0.0 end
+if {$FindMode=="-exact"} {
+.text tag add sel $index "$index + [string length $FindPattern] chars"
+} else {
+eval "regexp $FindCase --" [list $FindPattern [.text get "$index linestart"\
+ "$index lineend"] match]
+.text tag add sel $index "$index + [string length $match] chars"
+}
+.text mark set insert sel.last
+.text see $index
+.text see insert
+focus .text
+return 1
+}
+}
+proc AboutDialog {} {
+make_transient .about "About WordView"
+message .about.m -aspect 250 -text "MS-Word viewer for UNIX
+Copyright (c) by Victor B. Wagner 1997-98
+This program is distributed under
+GNU General Public License Version 2 or above
+Check http://www.gnu.org/copyleft/gpl.html for copying
+and warranty conditions" -justify center
+button .about.ok -text Ok -command {destroy .about}
+pack .about.m .about.ok
+}
+if [llength $argv] {
+ if {![file exist [lindex $argv 0]]} {
+ puts stderr "No such file: [lindex $argv 0]"
+ exit 1
+ }
+load_file [lindex $argv 0]
+}
+focus .text
--- /dev/null
+/*****************************************************************/
+/* Output of unicode buffers with conversion into target encoding*/
+/* And application of substitution maps */
+/* */
+/* This file is part of catdoc project */
+/* (c) Victor Wagner 1996-2003, (c) Alex Ott 2003 */
+/*****************************************************************/
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+#include <stdio.h>
+#include <ctype.h>
+#include <string.h>
+#include "catdoc.h"
+/************************************************************************/
+/* performs paragraph formatting if wrap_margin is non-zero */
+/* gets character sequence and appends it to buffer. If buffer is long */
+/* enough, prints its beginning out */
+/* Input parameter is pointer to the string which represents one unicode*/
+/* character after character set translation. See convert_char in */
+/* charset.c */
+/************************************************************************/
+static char outputbuffer[LINE_BUF_SIZE]="";
+void out_char(const char *chunk) {
+ static int bufpos=0;
+ int eol_flag=0;
+ const char *p; char *q;
+ if (!wrap_margin) {
+ fputs(chunk,stdout);
+ return;
+ }
+
+ for (q=outputbuffer+bufpos,p=chunk;
+ *p;
+ *(q++)=*(p++),bufpos++) {
+ if (*p=='\n') eol_flag=1;
+ }
+ *q=0;
+ /* The copy above is safe. The wrap margin setting
+ code in main.c ensures that wrap_margin is
+ less than LINE_BUF_SIZE-strlen(largest chunk)
+ */
+ if (eol_flag) {
+ /* End of paragraph */
+ char *q = map_subst(spec_chars,'\n');
+ fputs(outputbuffer,stdout);
+ *outputbuffer=0;
+ bufpos=0;
+ if (q) {
+ fputs(q,stdout);
+ } else {
+ fputc('\n',stdout);
+ }
+ } else if (bufpos>wrap_margin) {
+ char *q=outputbuffer,*p=outputbuffer+wrap_margin;
+
+ while (p>outputbuffer&&!isspace(*p)) p--;
+ if (p==outputbuffer) {
+ /*worst case - nowhere to wrap. Will use brute force */
+ fwrite(outputbuffer,wrap_margin,1,stdout);
+ fputc('\n',stdout);
+ p=outputbuffer+wrap_margin;
+ } else {
+ *p=0;p++;
+ fputs(outputbuffer,stdout);
+ fputc('\n',stdout);
+ }
+ for(q=outputbuffer;*p;p++,q++) *q=*p;
+ bufpos=q-outputbuffer;
+ *q=0;
+ }
+}
+
+/************************************************************************/
+/* Main output function.
+ * Programs which read word-processor files should accumulate paragraphs
+ * in the buffer as array of unicode 16-bit codes and pass to this
+ * function
+ ************************************************************************/
+void output_paragraph(unsigned short int *buffer) {
+ unsigned short int *p;
+ int countout=0;
+ for (p=buffer;*p;p++) {
+ out_char(convert_char(*p));
+ countout++;
+ }
+}
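
The line-breaking step in out_char searches backwards from the wrap margin
for whitespace and falls back to a hard break when none is found. A small
sketch of just that search, with a hypothetical helper name find_wrap_point
and the assumption that margin is smaller than the string length:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: index of the last whitespace character at or before
 * margin, searching backwards. Returns margin itself when the search reaches
 * index 0 without finding whitespace (the "brute force" case in out_char).
 * Requires margin < strlen(buf). */
static size_t find_wrap_point(const char *buf, size_t margin)
{
	size_t i = margin;
	assert(buf != NULL);
	while (i > 0 && !isspace((unsigned char)buf[i])) i--;
	return i == 0 ? margin : i;
}
```

The caller would print everything before the returned index, emit a newline,
and keep the remainder for the next line, skipping the whitespace character
itself when a soft break was found.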
--- /dev/null
+/*****************************************************************/
+/* Definition specific for Excel file handling */
+/* */
+/* This file is part of catdoc project */
+/* (c) Victor Wagner 2003, (c) Alex Ott 2003 */
+/*****************************************************************/
+
+#ifndef XLS_H
+#define XLS_H
+
+#include <stdio.h>
+#include <math.h>
+/* types of quoting */
+#define QUOTE_NEVER 0
+#define QUOTE_SPACES_ONLY 1
+#define QUOTE_ALL_STRINGS 2
+#define QUOTE_EVERYTHING 3
+
+struct rowdescr {
+ int last, end;
+ unsigned char **cells;
+};
+/* structure to hold parsed table */
+extern struct rowdescr *rowptr;
+extern int startrow;
+/* xls2csv-specific configuration */
+extern char cell_separator;
+extern int quote_mode;
+void print_sheet(void);
+void free_sheet(void);
+void print_value(unsigned char *value);
+char *format_double(char *rec,int offset,int format_code);
+char *format_int(int value,int format_code);
+char *format_rk(char *rec,short int format_code);
+char *gettypename(long rectype);
+void parse_sst(char *sstbuf,int bufsize);
+void process_item (int rectype, int reclen, char *rec);
+unsigned char **allocate(int row,int col);
+char *copy_unicode_string(unsigned char **src);
+char convert8to8(char *src,int count);
+char *convert16to8(char *src,int count);
+void do_table(FILE *input,char *filename);
+char *mark_string(char *instr);
+
+/* ---- added by Igor ---- */
+void SetFormatIdxUsed(int format_code);
+/* -- end added by Igor -- */
+
+
+#ifdef __TURBOC__
+#define rint(x) floor((x+0.5))
+#endif
+#endif
--- /dev/null
+/*****************************************************************/
+/* Main program for parsing XLS files */
+/* */
+/* This file is part of catdoc project */
+/* (c) David Rysdam 1998 */
+/* (c) Victor Wagner 1998-2003, (c) Alex Ott 2003 */
+/*****************************************************************/
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+#include <stdio.h>
+#include <time.h>
+#include <math.h>
+#include <ctype.h>
+#include <string.h>
+#include "xltypes.h"
+#include "catdoc.h"
+#include <stdlib.h>
+#include <unistd.h>
+#include "float.h"
+#include "xls.h"
+
+#ifdef __TURBOC__
+#define strcasecmp(a,b) strcmpi(a,b)
+#endif
+extern char *forced_date_format;
+extern char number_format[];
+extern char *sheet_separator;
+/************************************************************************/
+/* Displays help message */
+/************************************************************************/
+void help (void) {
+ printf("Usage:\n xls2csv [-xlV] [-g number] [-f date-format] [-b string] [-s charset] [-d charset] [-c char] [ -q number] files\n");
+}
+/* Defines unicode chars which should be
+ replaced by strings before UNICODE->target charset
+ mappings are applied i.e. TeX special chars like %
+ */
+char *input_buffer, *output_buffer;
+int main(int argc, char *argv[])
+{
+ FILE *input;
+ FILE *new_file, *ole_file;
+ char *filename =NULL;
+ short int *tmp_charset;
+ int c;
+ int i;
+ char *tempname;
+ read_config_file(SYSTEMRC);
+#ifdef USERRC
+ tempname=find_file(strdup(USERRC),getenv("HOME"));
+ if (tempname) {
+ read_config_file(tempname);
+ free(tempname);
+ }
+#endif
+#ifdef HAVE_LANGINFO
+ get_locale_charset();
+#endif
+
+ check_charset(&dest_csname,dest_csname);
+
+ while ((c=getopt(argc,argv,"Vlf:s:d:xq:c:b:g:p:"))!=-1) {
+ switch(c) {
+ case 'l':
+ list_charsets(); exit(0);
+ case 'x':
+ unknown_as_hex = 1; break;
+ case 's':
+ check_charset(&source_csname,optarg);
+ source_charset=read_charset(source_csname);
+ break;
+ case 'b':
+ sheet_separator= strdup(optarg);
+ break;
+ case 'd':
+ check_charset(&dest_csname,optarg);
+ break;
+ case 'q':
+ { char *errptr;
+ quote_mode = strtol(optarg,&errptr,0);
+ if ((errptr && *errptr)||quote_mode<0||quote_mode>3) {
+ fprintf(stderr,
+ "argument of -q should be number from 0 to 3\n");
+ exit(1);
+ }
+ }
+ break;
+ case 'c':
+ cell_separator = optarg[0];
+ break;
+ case 'f':
+ forced_date_format = strdup(optarg);
+ break;
+ case 'g':
+ { char *strend;
+ int digits = strtol(optarg,&strend,0);
+ if (*strend||digits<0||digits>DBL_DIG) {
+ fprintf(stderr,"value of -g option should be a number between 0 and %d, not '%s'\n", DBL_DIG, optarg);
+ exit(1);
+ }
+ sprintf(number_format,"%%.%dg",digits);
+ }
+ break;
+ case 'V': printf("Catdoc Version %s\n",CATDOC_VERSION);
+ exit(0);
+ default:
+ help();
+ exit(1);
+ }
+ }
+/* If we are using system strftime, we need to set LC_TIME locale
+ * category unless the chosen charset is not the same as the system locale
+ */
+#if defined(HAVE_LANGINFO) && defined(HAVE_STRFTIME) && !defined(__TURBOC__)
+ set_time_locale();
+#endif
+ /* charset conversion init*/
+ input_buffer=malloc(FILE_BUFFER);
+ if (strcmp(dest_csname,"utf-8")) {
+ tmp_charset=read_charset(dest_csname);
+ if (!tmp_charset) {
+ fprintf(stderr,"Cannot load target charset %s\n",dest_csname);
+ exit(1);
+ }
+ target_charset=make_reverse_map(tmp_charset);
+ free(tmp_charset);
+ } else {
+ target_charset=NULL;
+ }
+ spec_chars=read_substmap(stradd("ascii",SPEC_EXT));
+ if (!spec_chars) {
+ fprintf(stderr,"Cannot read substitution map ascii%s\n",
+ SPEC_EXT);
+ exit(1);
+ }
+ replacements=read_substmap(stradd("ascii",REPL_EXT));
+ if (!replacements) {
+ fprintf(stderr,"Cannot read substitution map ascii%s\n",
+ REPL_EXT);
+ exit(1);
+ }
+ if (optind>=argc) {
+ if (isatty(fileno(stdin))) {
+ help();
+ exit(0);
+ }
+ do_table(stdin,"STDIN");
+ exit (0);
+ }
+ for (i=optind;i<argc;i++) {
+ filename = argv[i];
+ input=fopen(filename,"rb");
+ if (!input) {
+ perror(filename);
+ exit(1);
+ }
+ if ((new_file=ole_init(input, NULL, 0)) != NULL) {
+ set_ole_func();
+ while((ole_file=ole_readdir(new_file)) != NULL) {
+ int res=ole_open(ole_file);
+/* fprintf(stderr, "name = %s\n", ((oleEntry*)ole_file)->name); */
+ if (res >= 0) {
+ if (strcasecmp(((oleEntry*)ole_file)->name , "Workbook") == 0
+ || strcasecmp(((oleEntry*)ole_file)->name,"Book") == 0) {
+ do_table(ole_file,filename);
+ }
+ }
+ ole_close(ole_file);
+ }
+ set_std_func();
+ ole_finish();
+ fclose(new_file);
+ } else {
+ fprintf(stderr, "%s is not an OLE file or an error occurred\n", filename);
+ }
+ }
+ return 0;
+}
--- /dev/null
+/*****************************************************************/
+/* BIFF-stream (excel file) parsing */
+/* */
+/* This file is part of catdoc project */
+/* (c) David Rysdam 1998 */
+/* (c) Victor Wagner 1998-2003, (c) Alex Ott 2003 */
+/*****************************************************************/
+#ifdef HAVE_CONFIG_H
+#include <config.h>
+#endif
+#include <stdlib.h>
+#include <string.h>
+#include "xls.h"
+#include "catdoc.h"
+#include "xltypes.h"
+#include <float.h>
+#include <math.h>
+#include <time.h>
+#ifndef HAVE_STRFTIME
+#include "../compat/strftime.h"
+#endif
+static unsigned char rec[MAX_MS_RECSIZE];
+int biff_version=0;
+short int *formatTable=NULL;
+char *forced_date_format = NULL;
+size_t formatTableIndex = 0;
+size_t formatTableSize = 0;
+double date_shift = 25569.0;
+#define FLT_FORMAT(a,b,c) a #b c
+#define MK_FORMAT(x) FLT_FORMAT("%.",x,"g")
+char number_format[8]=MK_FORMAT(DBL_DIG);
+
+void CleanUpFormatIdxUsed(void);
+
+void do_table(FILE *input,char *filename) {
+ long rectype;
+ long reclen,build_year=0,build_rel=0,offset=0;
+ int eof_flag=0;
+ int itemsread=1;
+ date_shift=25569.0; /* Windows 1900 date system */
+ CleanUpFormatIdxUsed();
+ while (itemsread) {
+ catdoc_read(rec,2,1,input);
+ biff_version=getshort(rec,0);
+ catdoc_read(rec,2,1,input);
+ reclen=getshort(rec,0);
+ if ( biff_version == 0x0809 || biff_version == 0x0409 ||
+ biff_version == 0x0209 || biff_version == 0x0009 ) {
+ if (reclen==8 || reclen==16) {
+ if (biff_version == 0x0809 ) {
+ itemsread=catdoc_read(rec,4,1,input);
+ build_year=getshort(rec+2,0);
+ build_rel=getshort(rec,0);
+ if(build_year > 5 ) {
+ itemsread=catdoc_read(rec,8,1,input);
+ biff_version=8;
+ offset=12;
+ }
+ else {
+ biff_version=7;
+ offset=4;
+ }
+ } else if (biff_version == 0x0209 ) {
+ biff_version=3;
+ offset=2;
+ } else if (biff_version == 0x0409 ) {
+ offset=2;
+ biff_version=4;
+ } else {
+ biff_version=2;
+ }
+ itemsread=catdoc_read(rec,reclen-offset,1,input);
+ break;
+ } else {
+ fprintf(stderr,"%s: Invalid BOF record\n",filename);
+ return;
+ }
+ } else {
+ itemsread=catdoc_read(rec,126,1,input);
+ }
+ }
+ if (catdoc_eof(input)) {
+ fprintf(stderr,"%s: No BOF record found\n",filename);
+ exit(1);
+ }
+ while(itemsread){
+ char buffer[2];
+ rectype = 0;
+ itemsread = catdoc_read(buffer, 2, 1, input);
+ if (catdoc_eof(input)) {
+ process_item(MSEOF,0,NULL);
+ return;
+ }
+
+ rectype=getshort(buffer,0);
+ if(itemsread == 0)
+ break;
+ reclen=0;
+
+ itemsread = catdoc_read(buffer, 2, 1, input);
+ reclen=getshort(buffer,0);
+ if (reclen && reclen <MAX_MS_RECSIZE &&reclen >0){
+ itemsread = catdoc_read(rec, 1, reclen, input);
+ rec[reclen] = '\0';
+ }
+ if(eof_flag) {
+ if (rectype != BOF) {
+ break;
+ }
+ }
+/* fprintf(stderr,"Rectype 0x%04X reclen=%d\n",rectype, reclen); */
+ process_item(rectype,reclen,rec);
+ if (rectype == MSEOF) {
+ eof_flag=1;
+ } else {
+ eof_flag=0;
+ }
+ }
+ return;
+}
+unsigned char **sst=NULL;/* Shared string table parsed into array of strings in
+ output encoding*/
+int sstsize = 0; /*Number of strings in SST*/
+unsigned char *sstBuffer=NULL; /*Unparsed sst to accumulate all its parts*/
+int sstBytes = 0; /*Size of SST Data, already accumulated in the buffer */
+int codepage=1251; /*default*/
+int prev_rectype=0;
+/* holds a pointer to formula value, because value itself would be in
+ * next biff record
+ */
+unsigned char **saved_reference = NULL;
+
+void process_item (int rectype, int reclen, char *rec) {
+ if (rectype != CONTINUE && prev_rectype == SST) {
+ /* we have accumulated unparsed SST, and now encountered
+ * another record, which indicates that SST is ended */
+ /* fprintf(stderr,"parse sst!\n");*/
+ parse_sst(sstBuffer,sstBytes);
+ }
+ switch (rectype) {
+ case FILEPASS: {
+ fprintf(stderr,"File is encrypted\n");
+ exit(69);
+ break;
+ }
+ case WRITEPROT: {
+ fprintf(stderr,"File is write protected\n");
+ break;
+ }
+
+ case 0x42: {
+ if (source_charset) break;
+ codepage=getshort(rec,0);
+ /*fprintf(stderr,"CODEPAGE %d\n",codepage); */
+ if (codepage!=1200) {
+ const char *cp = charset_from_codepage(codepage);
+ source_charset=read_charset(cp);
+ }
+ break;
+ }
+ case FORMAT: {
+ int format_code;
+ format_code=getshort(rec,0);
+ SetFormatIdxUsed(format_code);
+ /* this debug code prints format string */
+ /*
+ int i;
+ char *ptr;
+ fprintf(stderr,"Format %x \"",format_code);
+ if (rec[2] == reclen - 3 && rec[3] != 0) {
+ for (i=0,ptr=rec+3;i<rec[2];i++,ptr++) {
+ fputc(*ptr,stderr);
+ }
+ } else {
+ for (i=0,ptr=rec+5;i<rec[2];i++,ptr+=2) {
+ fputc(*ptr,stderr);
+ }
+ }
+ fprintf (stderr,"\"\n");
+ */
+ break;
+ }
+ case SST: {
+ /* Just copy SST into buffer, and wait until we get
+ * all CONTINUE records
+ */
+/* fprintf(stderr,"SST\n"); */
+ /* If an SST already exists, just drop it and start a new one */
+ if (sstBuffer != NULL)
+ free(sstBuffer);
+ if (sst != NULL)
+ free(sst);
+
+ sstBuffer=(char*)malloc(reclen);
+ sstBytes = reclen;
+ if (sstBuffer == NULL ) {
+ perror("SSTptr alloc error! ");
+ exit(1);
+ }
+ memcpy(sstBuffer,rec,reclen);
+ break;
+ }
+ case CONTINUE: {
+ if (prev_rectype != SST) {
+ return; /* to avoid changing of prev_rectype;*/
+ }
+ sstBuffer=realloc(sstBuffer,sstBytes+reclen);
+ if (sstBuffer == NULL ) {
+ perror("SSTptr realloc error! ");
+ exit(1);
+ }
+ memcpy(sstBuffer+sstBytes,rec,reclen);
+ sstBytes+=reclen;
+ return;
+ }
+ case LABEL: {
+ int row,col;
+ unsigned char **pcell;
+ unsigned char *src=(unsigned char *)rec+6;
+
+ saved_reference=NULL;
+ row = getshort(rec,0);
+ col = getshort(rec,2);
+ /* fprintf(stderr,"LABEL!\n"); */
+ pcell=allocate(row,col);
+ *pcell=copy_unicode_string(&src);
+ break;
+ }
+ case BLANK: { int row,col;unsigned char **pcell;
+ row = getshort(rec,0);
+ col = getshort(rec,2);
+ pcell=allocate(row,col);
+ *pcell=NULL;
+ break;
+ }
+ case MULBLANK: {
+ int row, startcol,endcol;
+ unsigned char **pcell;
+ row = getshort(rec,0);
+ startcol = getshort(rec,2);
+ endcol=getshort(rec,reclen-2);
+ pcell=allocate(row,endcol);
+ *pcell=NULL;
+ break;
+ }
+ case CONSTANT_STRING: {
+ int row = getshort(rec,0);
+ int col = getshort(rec,2);
+ unsigned char **pcell;
+ int string_no=getshort(rec,6);
+ if (!sst) {
+ fprintf(stderr,"CONSTANT_STRING before SST parsed\n");
+ exit(1);
+ }
+ /* fprintf(stderr,"col=%d row=%d no=%d\n",col,row,string_no); */
+
+ saved_reference=NULL;
+ pcell=allocate(row,col);
+ if (string_no>=sstsize|| string_no < 0 ) {
+ fprintf(stderr,"string index out of boundary\n");
+ exit(1);
+ } else if (sst[string_no] !=NULL) {
+ int len;
+ char *outptr;
+ len=strlen(sst[string_no]);
+ outptr=*pcell=malloc(len+1);
+ strcpy(outptr,sst[string_no]);
+ } else {
+ *pcell=malloc(1);
+ strcpy(*pcell,"");
+ }
+ break;
+ }
+ case 0x03:
+ case 0x103:
+ case 0x303:
+ case NUMBER: {
+ int row,col;
+ unsigned char **pcell;
+
+ saved_reference=NULL;
+ row = getshort(rec,0)-startrow;
+ col = getshort(rec,2);
+ pcell=allocate(row,col);
+ *pcell=strdup(format_double(rec,6,getshort(rec,4)));
+ break;
+ }
+ case INTEGER_CELL: {
+ int row,col;
+ unsigned char **pcell;
+
+ row = getshort(rec,0)-startrow;
+ col = getshort(rec,2);
+ pcell=allocate(row,col);
+ *pcell=strdup(format_int(getshort(rec,7),getshort(rec,4)));
+ break;
+
+ }
+ case RK: {
+ int row,col,format_code;
+ unsigned char **pcell;
+
+ saved_reference=NULL;
+ row = getshort(rec,0)-startrow;
+ col = getshort(rec,2);
+ pcell=allocate(row,col);
+ format_code = getshort(rec,4);
+ *pcell=strdup(format_rk(rec+6,format_code));
+ break;
+ }
+ case MULRK: {
+ int row,col,startcol,endcol,offset,format_code;
+ unsigned char **pcell;
+ row = getshort(rec,0)-startrow;
+ startcol = getshort(rec,2);
+ endcol = getshort(rec,reclen-2);
+ saved_reference=NULL;
+
+ for (offset=4,col=startcol;col<=endcol;offset+=6,col++) {
+ pcell=allocate(row,col);
+ format_code=getshort(rec,offset);
+ *pcell=strdup(format_rk(rec+offset+2,format_code));
+
+ }
+ break;
+ }
+ case FORMULA: {
+ int row,col;
+ unsigned char **pcell;
+ saved_reference=NULL;
+ row = getshort(rec,0)-startrow;
+ col = getshort(rec,2);
+ pcell=allocate(row,col);
+ if (((unsigned char)rec[12]==0xFF)&&(unsigned char)rec[13]==0xFF) {
+ /* not a floating point value */
+ if (rec[6]==1) {
+ /*boolean*/
+ char buf[2]="0";
+ buf[0]+=rec[9];
+ *pcell=strdup(buf);
+ } else if (rec[6]==2) {
+ /*error*/
+ char buf[6]="ERROR";
+ *pcell=strdup(buf);
+ } else if (rec[6]==0) {
+ saved_reference=pcell;
+ }
+ } else {
+ int format_code=getshort(rec,4);
+ *pcell=strdup(format_double(rec,6,format_code));
+ }
+ break;
+ }
+ case STRING: {
+ unsigned char *src=(unsigned char *)rec;
+ if (!saved_reference) {
+ fprintf(stderr,"String record without preceding string formula\n");
+ break;
+ }
+ *saved_reference=copy_unicode_string(&src);
+ break;
+ }
+ case BOF: {
+ if (rowptr) {
+ fprintf(stderr,"BOF when current sheet is not flushed\n");
+ free_sheet();
+ }
+ break;
+ }
+ case XF:
+ case 0x43: /* from Perl module Spreadsheet::ParseExcel */
+ {
+ short int formatIndex = getshort(rec,2);
+ /* we are interested only in format index here */
+ if (formatTableIndex >= formatTableSize) {
+ formatTable=realloc(formatTable,
+ (formatTableSize+=16)*sizeof(short int));
+
+ if (!formatTable) {
+ fprintf(stderr,"Out of memory for format table\n");
+ exit (1);
+ }
+ }
+ formatTable[formatTableIndex++] = formatIndex;
+ break;
+ }
+ case MS1904: /* Macintosh 1904 date system */
+ date_shift=24107.0;
+ break;
+
+
+ case MSEOF: {
+ if (!rowptr) break;
+ print_sheet();
+ free_sheet();
+ break;
+ }
+ case ROW: {
+ /* fprintf(stderr,"Row! %d %d %d\n",getshort(rec,0), getshort(rec+2,0),getshort(rec+4,0)); */
+ break;
+ }
+ case INDEX: {
+ /* fprintf(stderr,"INDEX! %d %d\n", getlong(rec+4,0), getlong(rec+8,0)); */
+ break;
+ }
+ default: {
+#if 0
+ fprintf(stderr,"Unknown record 0x%x\n length %d\n",rectype,reclen);
+#endif
+ }
+ }
+ prev_rectype=rectype;
+}
+
+/*
+ * Extracts a string from the SST and returns a malloc'ed copy of it
+ */
+char *copy_unicode_string (unsigned char **src) {
+ int count=0;
+ int flags = 0;
+ int start_offset=0;
+ int to_skip=0; /* used to count the length of the data
+ * that follows the end of the string */
+ int offset = 1; /* accounts for the variable length of the first field */
+ int charsize;
+ /* char *realstart=*src; */
+ char *dest; /* destination buffer the string is copied into */
+ char *s,*d,*c;
+
+ int i,u,l,len;
+
+ /* for(i=0;i<20;i++) */
+ /* fprintf(stderr,"%02x ",(*src)[i]); */
+ /* fprintf(stderr,"\n"); */
+
+ flags = *(*src+1+offset);
+ if (! ( flags == 0 || flags == 1 || flags == 8 || flags == 9 ||
+ flags == 4 || flags == 5 || flags == 0x0c || flags == 0x0d ) ) {
+ count=**src;
+ flags = *(*src+offset);
+ offset --;
+ flags = *(*src+1+offset);
+ if (! ( flags == 0 || flags == 1 || flags == 8 || flags == 9 ||
+ flags == 4 || flags == 5 || flags == 0x0c || flags == 0x0d ) ) {
+ /* fprintf(stderr,"Strange flags = %d, returning NULL\n", flags); */
+ return NULL;
+ }
+ }
+ else {
+ count=getshort(*src,0);
+ }
+ charsize=(flags &0x01) ? 2 : 1;
+
+ switch (flags & 12 ) {
+ case 0x0c: /* Far East with RichText formatting */
+ to_skip=4*getshort(*src,2+offset)+getlong(*src, 4+offset);
+ start_offset=2+offset+2+4;
+ /* fprintf(stderr,"Far East with RichText formating\n"); */
+ break;
+
+ case 0x08: /* With RichText formatting */
+ to_skip=4*getshort(*src,2+offset);
+ start_offset=2+offset+2;
+ /* fprintf(stderr,"With RichText formating %d\n",getshort(*src,2+offset)); */
+ break;
+
+ case 0x04: /* Far East */
+ to_skip=getlong(*src, 2+offset);
+ start_offset=2+offset+4;
+ /* fprintf(stderr,"Far East\n"); */
+ break;
+
+ default:
+ to_skip=0;
+ start_offset=2+offset;
+ /* fprintf(stderr,"Default string\n"); */
+ }
+
+ /* fprintf(stderr,"count=%d skip=%d start_offset=%d\n", */
+ /* count, to_skip, start_offset); */
+ /* and here we copy the string */
+ if ( (dest=malloc(count+1)) == NULL ) {
+ perror("Dest string alloc error");
+ *src+=(to_skip+start_offset+(count*charsize));
+ exit(1);
+ }
+ *src+=start_offset;
+ len = count;
+ *dest=0;l=0;
+ for (s=*src,d=dest,i=0;i<count;i++,s+=charsize) {
+ /* fprintf(stderr,"l=%d len=%d count=%d charsize=%d\n",l,len,count,charsize); */
+ if ( (charsize == 1 && (*s == 1 || *s == 0)) ||
+ (charsize == 2 && (*s == 1 || *s == 0) && *(s+1) != 4)) {
+ /* fprintf(stderr,"extchar (unicode)=%02x %02x\n",*s, *(s+1)); */
+ charsize=(*s &0x01) ? 2 : 1;
+ if (charsize == 2)
+ s-=1;
+ count++;
+ continue;
+ }
+ if ( charsize == 2 ){
+ u=(unsigned short)getshort(s,0);
+ c=convert_char(u);
+ /* fprintf(stderr,"char=%02x %02x\n", *s, *(s+1)); */
+ } else {
+ if (!source_charset) {
+ check_charset(&source_csname,source_csname);
+ /* fprintf(stderr,"charset=%s\n",source_csname);*/
+ source_charset=read_charset(source_csname);
+ }
+ u=(unsigned short)to_unicode(source_charset,(unsigned char)*s);
+ c=convert_char(u);
+ }
+ if (c != NULL) {
+ int dl = strlen(c);
+ while (l+dl>=len) {
+ len+=16;
+ dest=realloc(dest,len+1);
+ }
+ d=dest+l;
+ strcpy(d,c);
+ d+=dl;
+ l+=dl;
+ }
+ }
+ *src=s+to_skip;
+ return dest;
+}
+
+
+/*
+ * Format code is an index into the format table (which is the list of
+ * XF records in the file).
+ * The second word of an XF record is the format type index.
+ * A format index between 0x0E and 0x16, or between 0x2D and 0x2F, denotes
+ * a date if it is not used for an explicitly stored format.
+ * BuiltInDateFormatIdx converts a format index into an index of the explicit
+ * built-in date formats suitable for strftime.
+ */
+int BuiltInDateFormatIdx (int index) {
+ int offset;
+ offset=1; /* offset of date formats */
+ /* 0 is used as false -- format not found */
+ if ((index>= 0x0E) && (index<=0x16)) {
+ return offset+index-0x0E;
+ } else
+ if ((index>=0x2d) && (index<=0x2F)) {
+ return offset+index-0x2d+9;
+ } else if (index==0xa4) {
+ return 12+offset;
+ } else
+ return 0;
+}
+
+/*
+ * GetBuiltInDateFormat stores and returns
+ * built in xls2csv strftime formats.
+ */
+#define NUMOFDATEFORMATS 13
+char *GetBuiltInDateFormat(int dateindex) {
+ static char *formats[]={
+ /* reserved */ NULL, /* BuiltInDateFormatIdx uses dateindex=0 to flag that the format was not found */
+ /* 0x0E */ "%m-%d-%y", /* 01 */
+ /* 0x0F */ "%d-%b-%y", /* 02 */
+ /* 0x10 */ "%d-%b", /* 03 */
+ /* 0x11 */ "%b-%d", /* 04 */
+ /* 0x12 */ "%l:%M %p", /* 05 */
+ /* 0x13 */ "%l:%M:%S %p", /* 06 */
+ /* 0x14 */ "%H:%M", /* 07 */
+ /* 0x15 */ "%H:%M:%S", /* 08 */
+ /* 0x16 */ "%m-%d-%y %H:%M", /* 09 */
+ /* 0x2d */ "%M:%S", /* 10 */
+ /* 0x2e */ "%H:%M:%S", /* 11 */
+ /* 0x2f */ "%M:%S", /* 12 */
+ /* 0xa4 */ "%m.%d.%Y %l:%M:%S %p" /* 13 */
+ };
+ if (dateindex>0 && dateindex <= NUMOFDATEFORMATS) {
+ return formats[dateindex];
+ } else
+ return NULL;
+}
+
+/* Indexed by dateindex (1..NUMOFDATEFORMATS); slot 0 is unused */
+static char FormatIdxUsed[NUMOFDATEFORMATS+1];
+
+void CleanUpFormatIdxUsed(void) {
+ int i;
+ for (i=0;i<=NUMOFDATEFORMATS; i++)
+ FormatIdxUsed[i]=0;
+}
+
+/*
+ * A format index between 0x0E and 0x16, or between 0x2D and 0x2F, denotes
+ * a date when it refers to a built-in Excel97 format.
+ * Nevertheless, those indexes can be used for explicitly stored formats,
+ * which are not dates in general.
+ * SetFormatIdxUsed marks these formats as already used
+ * and excludes them from the list of built-in formats,
+ * preventing misformatting of the corresponding data.
+ */
+void SetFormatIdxUsed(int format_code) {
+ int dateindex;
+ /*fprintf(stderr,"Format idx %x to be set to dirty\n",format_code);
+ */
+ dateindex=BuiltInDateFormatIdx(format_code);
+ if (dateindex) {
+ FormatIdxUsed[dateindex]=1;
+ /*fprintf(stderr,"Date idx %d is set to be dirty\n",dateindex); */
+ }
+}
+
+/*
+ * A format index between 0x0E and 0x16, or between 0x2D and 0x2F, denotes
+ * a date when it refers to a built-in Excel97 format.
+ * Nevertheless, those indexes can be used for explicitly stored formats,
+ * which are not dates in general.
+ * SetFormatIdxUsed marks these formats as already used
+ * and excludes them from the list of built-in formats,
+ * preventing misformatting of the corresponding data.
+ * IsFormatIdxUsed tests for this case.
+ */
+char IsFormatIdxUsed(int format_code) {
+ int dateindex;
+ dateindex=BuiltInDateFormatIdx(format_code);
+ if (dateindex) {
+ /* fprintf(stderr,"Date idx %d is dirty\n",dateindex); */
+ return FormatIdxUsed[dateindex]==1;
+ }
+ else return 0;
+}
+
+
+/* Checks whether the format denoted by the given code is a date.
+ * Format code is an index into the format table (which is the list of
+ * XF records in the file).
+ * The second word of an XF record is the format type index.
+ * A format index between 0x0E and 0x16, or between 0x2D and 0x2F, denotes
+ * a date.
+ * If so, returns the strftime format for this date. Otherwise returns
+ * NULL.
+ */
+
+
+char *isDateFormat(int format_code) {
+ int index;
+ int dateindex;
+ if (format_code>=formatTableIndex) {
+ fprintf(stderr,"Format code %d is used before definition\n",format_code);
+ return NULL;
+ }
+
+ index = formatTable[format_code];
+ if (IsFormatIdxUsed(index)) {
+ fprintf(stderr,"Format %x is redefined\n",index);
+ /* this format is something user-defined --- not a standard built-in date*/
+ return NULL;
+ }
+ dateindex=BuiltInDateFormatIdx(index);
+ if (dateindex) {
+ if (forced_date_format) return forced_date_format;
+ return GetBuiltInDateFormat(dateindex);
+ } else
+ return NULL;
+}
+
+
+
+time_t float2date(double d);
+/*
+ * Extracts floating point value and formats it
+ */
+
+char *number2string(double d,short int format_code) {
+ static char buffer [128];
+ char *datefmt;
+ if ((datefmt=isDateFormat(format_code))!=NULL) {
+ time_t t = float2date(d);
+ strftime(buffer, 127,datefmt, gmtime(&t));
+ } else {
+ sprintf(buffer,number_format,d);
+ }
+ return buffer;
+}
+
+char *format_double(char *rec,int offset,int format_code) {
+ union { char cc[8];
+ double d;} dconv;
+ char *d,*s;
+ int i;
+# ifdef WORDS_BIGENDIAN
+ for(s=rec+offset+8,d=dconv.cc,i=0;
+ i<8;i++) *(d++)=*(--s);
+# else
+ for(s=rec+offset,d=dconv.cc,i=0;
+ i<8;i++) *(d++)=*(s++);
+# endif
+ return number2string(dconv.d,format_code);
+}
+
+/*
+ * Formats integer value into static buffer
+ */
+char *format_int(int value,int format_code) {
+ static char buffer[12];
+ sprintf(buffer,"%i",value);
+ return buffer;
+}
+/*
+ * Formats RK record
+ */
+char* format_rk(char *rec,short int format_code) {
+ double value=0.0;
+ int i;
+
+ if ( *(rec) & 0x02 )
+ {
+ value=(double)(getlong(rec,0)>>2);
+ }
+ else {
+ union { char cc[8];
+ double d;} dconv;
+ char *d,*s;
+ for(i=0;i<8;i++)
+ dconv.cc[i]='\0';
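+# ifdef WORDS_BIGENDIAN
+ for(s=rec+4,d=dconv.cc,i=0; i<4;i++)
+ *(d++)=*(--s);
+ /* clear the two RK flag bits, held in the lowest byte of the high word */
+ dconv.cc[3]=dconv.cc[3] & 0xfc;
+# else
+ for(s=rec,d=dconv.cc+4,i=0;
+ i<4;i++) *(d++)=*(s++);
+ /* clear the two RK flag bits, held in the lowest byte of the high word */
+ dconv.cc[4]=dconv.cc[4] & 0xfc;
+# endif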
+ value=dconv.d;
+ }
+ if ( *(rec) & 0x01 )
+ value=value*0.01;
+ return number2string(value,format_code);
+}
+
+
+/*
+ * Converts excel date into time_t
+ */
+time_t float2date(double f) {
+ /* Hacked version. Excel stores a date as a floating point count of days
+ * since 1.1.1900 or 1.1.1904.
+ * We subtract the value of 1.1.1970 and multiply
+ * by 86400, thus getting seconds since the epoch
+ */
+ return rint((f-date_shift)*86400);
+}
+
+/*
+ * Parses SST into array of strings
+ */
+void parse_sst(char *sstbuf,int bufsize) {
+ int i; /* index into sst */
+ unsigned char *curString; /* pointer into unparsed buffer*/
+ unsigned char *barrier=(unsigned char *)sstbuf+bufsize; /*pointer to end of buffer*/
+ unsigned char **parsedString;/*pointer into parsed array*/
+
+ sstsize = getlong(sstbuf+4,0);
+ sst=malloc(sstsize*sizeof(char *));
+
+ if (sst == NULL) {
+ perror("SST allocation error");
+ exit(1);
+ }
+ memset(sst,0,sstsize*sizeof(char *));
+ for (i=0,parsedString=sst,curString=sstbuf+8;
+ i<sstsize && curString<barrier; i++,parsedString++) {
+ /* fprintf(stderr,"copying %d string\n",i); */
+ *parsedString = copy_unicode_string(&curString);
+ }
+ /* fprintf(stderr,"end sst i=%d sstsize=%d\n",i,sstsize); */
+
+}
--- /dev/null
+/*
+ Copyright 1998 David Rysdam
+ Copyright 1998-2003 Victor Wagner
+ Copyright 2003 Alex Ott
+ This file is released under the GPL. Details can be
+ found in the file COPYING accompanying this distribution.
+*/
+
+#define MAX_MS_RECSIZE 18000
+
+#define MS_UNIX_DATE_DIFF (70*365.2422+1)
+
+#define DATE_FORMAT 14
+
+#define MS1904 0x22
+#define ADDIN 0x87
+#define ADDMENU 0xC2
+#define ARRAY 0x221
+#define AUTOFILTER 0x9E
+#define AUTOFILTERINFO 0x9D
+#define BACKUP 0x40
+#define BLANK 0x201
+#define BOF 0x809
+#define BOOKBOOL 0xDA
+#define BOOLERR 0x205
+#define BOTTOMMARGIN 0x29
+#define BOUNDSHEET 0x85
+#define CALCCOUNT 0x0C
+#define CALCMODE 0x0D
+#define CODEPAGE 0x42
+#define COLINFO 0x7D
+#define CONTINUE 0x3C
+#define COORDLIST 0xA9
+#define COUNTRY 0x8C
+#define CRN 0x5A
+#define DBCELL 0xD7
+#define DCON 0x50
+#define DCONNAME 0x52
+#define DCONREF 0x51
+#define DEFAULTROWHEIGHT 0x225
+#define DEFCOLWIDTH 0x55
+#define DELMENU 0xC3
+#define DELTA 0x10
+#define DIMENSIONS 0x200
+#define DOCROUTE 0xB8
+#define EDG 0x88
+#define MSEOF 0x0A
+#define EXTERNCOUNT 0x16
+#define EXTERNNAME 0x223
+#define EXTERNSHEET 0x17
+#define FILEPASS 0x2F
+#define FILESHARING 0x5B
+#define FILESHARING2 0x1A5
+#define FILTERMODE 0x9B
+#define FNGROUPCOUNT 0x9C
+#define FNGROUPNAME 0x9A
+#define FONT 0x231
+#define FONT2 0x31
+#define FOOTER 0x15
+#define FORMAT 0x41E
+#define FORMULA_RELATED 0x4BC
+#define DOUBLE_STREAM_FILE 0x161
+/*#define FORMULA 0x406 Microsoft docs wrong?*/
+#define FORMULA 0x06
+#define GCW 0xAB
+#define GRIDSET 0x82
+#define PROT4REVPASS 0x1BC
+#define GUTS 0x80
+#define HCENTER 0x83
+#define HEADER 0x14
+#define HIDEOBJ 0x8D
+#define HORIZONTALPAGEBREAKS 0x1B
+#define IMDATA 0x7F
+#define INDEX 0x20B
+#define INTERFACEEND 0xE2
+#define INTERFACEHDR 0xE1
+#define ITERATION 0x11
+#define LABEL 0x204
+#define LEFTMARGIN 0x26
+#define LHNGRAPH 0x95
+#define LHRECORD 0x94
+#define LPR 0x98
+#define MMS 0xC1
+#define MULBLANK 0xBE
+#define MULRK 0xBD
+#define NAME 0x218
+#define NOTE 0x1C
+#define NUMBER 0x203
+#define OBJ 0x5D
+#define OBJPROTECT 0x63
+#define OBPROJ 0xD3
+#define OLESIZE 0xDE
+#define PALETTE 0x92
+#define PANE 0x41
+#define PASSWORD 0x13
+#define PLS 0x4D
+#define PRECISION 0x0E
+#define PRINTGRIDLINES 0x2B
+#define PRINTHEADERS 0x2A
+#define PROTECT 0x12
+#define PUB 0x89
+#define RECIPNAME 0xB9
+#define REFMODE 0x0F
+#define RIGHTMARGIN 0x27
+#define RK 0x27E
+#define ROW 0x208
+#define RSTRING 0xD6
+#define SAVERECALC 0x5F
+#define SCENARIO 0xAF
+#define SCENMAN 0xAE
+#define SCENPROTECT 0xDD
+#define SCL 0xA0
+#define SELECTION 0x1D
+#define SETUP 0xA1
+#define SHRFMLA 0xBC
+#define SORT 0x90
+#define SOUND 0x96
+#define STANDARDWIDTH 0x99
+#define STRING 0x207
+#define STYLE 0x293
+#define SUB 0x91
+#define SXDI 0xC5
+#define SXEXT 0xDC
+#define SXIDSTM 0xD5
+#define SXIVD 0xB4
+#define SXLI 0xB5
+#define SXPI 0xB6
+#define SXSTRING 0xCD
+#define SXTBL 0xD0
+#define SXTBPG 0xD2
+#define SXTBRGIITM 0xD1
+#define SXVD 0xB1
+#define SXVI 0xB2
+#define SXVIEW 0xB0
+#define SXVS 0xE3
+#define TABID 0x13D
+#define TABIDCONF 0xEA
+#define TABLE 0x236
+#define TEMPLATE 0x60
+#define TOPMARGIN 0x28
+#define UDDESC 0xDF
+#define UNCALCED 0x5E
+#define VCENTER 0x84
+#define VERTICALPAGEBREAKS 0x1A
+#define WINDOW1 0x3D
+#define WINDOW2 0x23E
+#define WINDOWPROTECT 0x19
+#define WRITEACCESS 0x5C
+#define WRITEPROT 0x86
+#define WSBOOL 0x81
+#define XCT 0x59
+#define XF 0xE0
+#define SST 0xFC
+#define CONSTANT_STRING 0xFD
+#define REFRESHALL 0x1B7
+#define USESELFS 0x160
+#define EXTSST 0xFF
+/* Vitus additions */
+#define INTEGER_CELL 0x202