Iconv - library


Preamble

2011-08-04: Converting text from one character set to another can be difficult to understand for some - what is a character set? - but the Unix world has addressed this problem thoroughly. There exists a GNU project - libiconv - which does exactly this. Microsoft (MS) addresses the issue through UNICODE, which works well on most MS Windows machines, but is not completely cross-platform.

Just visiting the Unicode Consortium site in various browsers, on various platforms, will often show you the difference. But in general the web, one way or another, now supports most of the world's languages, or more precisely their character sets, and displays them in a 'native' form.

But I wanted to compile this library for WIN32, since several other projects I compile need it. I chose the last stable source at the time - http://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.13.1.tar.gz - circa 30-Jun-2009. I note this source is also now available through an anonymous checkout:

git clone git://git.savannah.gnu.org/libiconv.git [target_directory]

Over the years, I have developed a set of perl scripts, 'amsrcs' - see here - which do SOME of what the unix auto-tools do. That is, they read the 'configure.ac' file, and then the Makefile.am (or .in) set, to build up the project for MSVC. This 'amsrcs' script set outputs an MSVC6 build file set - a <project>.dsw, pointing to a number of DSP project files. This can be loaded into just about any version of MSVC, and converted to the format it uses.

Of course these UNIX type sources include (a) a file generated during the auto-tool processing, 'config.h', and (b) several other 'standard' UNIX include headers. So for 'windows' I have hand-prepared many of these, so as to avoid, if possible, having to 'amend' the source. Often these are just 'stubs', providing just 'enough' definitions to get the source to compile.

These will be included in the source zips provided. Feel free to replace, or amend these. They are just my 'estimation' of what is needed to successfully compile the source. The MS runtime library chosen for this port is Multithreaded static - that is /MT (release) and /MTd (debug). Always take care NOT to mix runtimes.

As always, source and binaries are supplied as is. No warranty of fitness for purpose is implied or intended. To the extent possible this WIN32 port is released under GPL version 2, or later, at your choice.



Downloads

WARNING: Take care when downloading, and using, binaries from the web.

Some downloads:
libiconf-src-01.zip - Full, modified SOURCE, excluding the 'msvc' folder
libiconf-win-01.zip - The 'msvc' folder, including the MSVC build files, and /MT static libraries in 'msvc/bin'
libiconv-1.13.1.tar.gz - Original source

Date        Link                    Size (bytes)  MD5
2011/08/04  libiconf-src-01.zip     5,127,253     6bea26eea7c9639437c922d2f741b035
2011/08/04  libiconf-win-01.zip     1,503,179     0178ba54035de74e8733adc754626c60
2010/09/21  libiconv-1.13.1.tar.gz  4,716,070     7ab33ebd26687c744a37264a330bbe9a
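
In keeping with the warning above, the MD5 values in the table can be checked after downloading. The following is only a minimal sketch in Python, and is not part of the supplied zips. The function names md5_of_file and matches_table are hypothetical, chosen just for illustration, and the path you pass in is assumed to be wherever you saved the file.

```python
import hashlib

# MD5 values copied from the table above.
EXPECTED_MD5 = {
    "libiconf-src-01.zip": "6bea26eea7c9639437c922d2f741b035",
    "libiconf-win-01.zip": "0178ba54035de74e8733adc754626c60",
    "libiconv-1.13.1.tar.gz": "7ab33ebd26687c744a37264a330bbe9a",
}


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at 'path', read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_table(path, name):
    """True if the MD5 of the file at 'path' equals the table value for 'name'."""
    return md5_of_file(path) == EXPECTED_MD5[name]
```

For example, matches_table("libiconf-src-01.zip", "libiconf-src-01.zip") should return True for an intact download sitting in the current folder. A matching MD5 shows the file arrived uncorrupted; it is not a guard against deliberate tampering.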

Usage:
- Download, and unzip the full source, preserving directories, into a folder of your choice.
  I suggest something like C:\Projects\libiconv-1.13.1, to show the version.
- In this folder, create a folder 'msvc', for example libiconv-1.13.1> md msvc, and then cd msvc
- Download the 'win' zip, and unzip it, preserving directories, into this 'msvc' folder
- Load libiconv.sln (or libiconv.dsw) into your version of MSVC. Allow it to convert to its own format.
- Run MSVC to build all targets.



Encodings

According to its web site, libiconv supports ALL the following encodings, some through UNICODE support -

European languages
ASCII, ISO-8859-{1,2,3,4,5,7,9,10,13,14,15,16}, KOI8-R, KOI8-U, KOI8-RU, CP{ 1250, 1251, 1252, 1253, 1254, 1257 }, CP{ 850, 866, 1131 }, Mac{ Roman, CentralEurope, Iceland, Croatian, Romania}, Mac{ Cyrillic, Ukraine, Greek, Turkish}, Macintosh
Semitic languages
ISO-8859-{6,8}, CP{ 1255, 1256 }, CP862, Mac{ Hebrew, Arabic }
Japanese
EUC-JP, SHIFT_JIS, CP932, ISO-2022-JP, ISO-2022-JP-2, ISO-2022-JP-1
Chinese
EUC-CN, HZ, GBK, CP936, GB18030, EUC-TW, BIG5, CP950, BIG5-HKSCS, BIG5-HKSCS:2001, BIG5-HKSCS:1999, ISO-2022-CN, ISO-2022-CN-EXT
Korean
EUC-KR, CP949, ISO-2022-KR, JOHAB
Armenian
ARMSCII-8
Georgian
Georgian-Academy, Georgian-PS
Tajik
KOI8-T
Kazakh
PT154, RK1048
Thai
ISO-8859-11, TIS-620, CP874, MacThai
Laotian
MuleLao-1, CP1133
Vietnamese
VISCII, TCVN, CP1258
Platform specifics
HP-ROMAN8, NEXTSTEP
Full Unicode
UTF-8, UCS-2, UCS-2BE, UCS-2LE, UCS-4, UCS-4BE, UCS-4LE, UTF-16, UTF-16BE, UTF-16LE, UTF-32, UTF-32BE, UTF-32LE, UTF-7, C99, JAVA
Full Unicode, in terms of uint16_t or uint32_t (with machine dependent endianness and alignment)
UCS-2-INTERNAL, UCS-4-INTERNAL
Locale dependent, in terms of `char' or `wchar_t' (with machine dependent endianness and alignment, and with OS and locale dependent semantics)
char, wchar_t - The empty encoding name "" is equivalent to "char": it denotes the locale dependent character encoding.

When configured with the option --enable-extra-encodings, it also provides support for a few extra encodings:

European languages
CP{ 437, 737, 775, 852, 853, 855, 857, 858, 860, 861, 863, 865, 869, 1125 }
Semitic languages
CP864
Japanese
EUC-JISX0213, Shift_JISX0213, ISO-2022-JP-3
Chinese
BIG5-2003 (experimental)
Turkmen
TDS565
Platform specifics
ATARIST, RISCOS-LATIN1

It can convert from any of these encodings to any other, through Unicode conversion.
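
The 'through Unicode' idea can be illustrated in a few lines. This is only a sketch using Python's built-in codecs as a stand-in, NOT libiconv itself, and the helper name convert_via_unicode is hypothetical. It decodes the source bytes into Unicode text, and then encodes that text in the target encoding, which is the same two-step pivot.

```python
def convert_via_unicode(data, from_encoding, to_encoding):
    """Convert bytes from one encoding to another, pivoting through Unicode."""
    text = data.decode(from_encoding)  # step 1: source bytes -> Unicode text
    return text.encode(to_encoding)    # step 2: Unicode text -> target bytes


# 'cafe' with an accented e, as ISO-8859-1 bytes (0xE9 is the accented e).
latin1_bytes = b"caf\xe9"
utf8_bytes = convert_via_unicode(latin1_bytes, "ISO-8859-1", "UTF-8")
print(utf8_bytes)  # prints b'caf\xc3\xa9'
```

Because every encoding only needs a mapping to and from Unicode, any pair can be converted without a dedicated table per pair. In this Python stand-in, a character that does not exist in the target encoding raises UnicodeEncodeError.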


