siconv converts data from one character set to another. I wrote it to work around a problem with the iconv utility that comes with glibc: that version appears to read an entire file into memory before converting it. When I needed to convert a 1GB file from iso-8859-1 to utf8, that just wouldn't do. The iconv() library function provides the necessary operations, but there wasn't a command line utility that simply wrapped it. Comments are welcome. If you find something interesting to do with this code, I'd like to hear about it. My email address is below.
All you need to do is pipe your data through it. It reads from stdin and writes to stdout. If you run it without any command line parameters, it converts from iso-8859-1 (Latin-1) to utf8. Otherwise, it converts from the character set named by the first command line parameter to the one named by the second. Anything other than zero or two command line parameters yields a usage message.
$ bzcat data.txt.bz | ./siconv > converted-data.txt
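For anyone curious what streaming through iconv() looks like, here is a minimal sketch of the idea. It is not the actual siconv source. The helper names pick_charsets and convert_stream, the buffer sizes, and the error handling are all assumptions made for illustration. The point is that only one small chunk is in memory at a time, so the input size doesn't matter.

```c
#include <errno.h>
#include <iconv.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: pick charsets the way the page describes.
 * No parameters selects the defaults, two parameters select from/to.
 * Returns 0 on success, -1 when a usage message should be printed. */
int pick_charsets(int argc, char **argv, const char **from, const char **to)
{
    if (argc == 1) {
        *from = "ISO-8859-1";   /* canonical spellings of the defaults */
        *to = "UTF-8";
        return 0;
    }
    if (argc == 3) {
        *from = argv[1];
        *to = argv[2];
        return 0;
    }
    return -1;
}

/* Hypothetical helper: convert `in` to `out` in fixed-size chunks, so
 * memory use stays constant however large the input is.
 * Returns 0 on success, -1 on any read, write, or conversion error. */
int convert_stream(FILE *in, FILE *out, const char *from, const char *to)
{
    char inbuf[4096], outbuf[16384];
    size_t pending = 0;     /* bytes of a split multibyte sequence */
    int rc = 0;

    iconv_t cd = iconv_open(to, from);  /* note the order: to, then from */
    if (cd == (iconv_t)-1)
        return -1;

    while (rc == 0) {
        size_t got = fread(inbuf + pending, 1, sizeof inbuf - pending, in);
        if (got == 0) {
            /* A read error, or input that ends mid-character, is an error. */
            if (ferror(in) || pending > 0)
                rc = -1;
            break;
        }
        char *inptr = inbuf;
        size_t inleft = pending + got;

        while (inleft > 0 && rc == 0) {
            char *outptr = outbuf;
            size_t outleft = sizeof outbuf;
            size_t r = iconv(cd, &inptr, &inleft, &outptr, &outleft);
            int err = errno;    /* save before fwrite can change it */
            size_t produced = sizeof outbuf - outleft;

            if (fwrite(outbuf, 1, produced, out) != produced)
                rc = -1;
            else if (r == (size_t)-1 && err == EINVAL)
                break;          /* sequence split across chunks: read more */
            else if (r == (size_t)-1 && err != E2BIG)
                rc = -1;        /* EILSEQ or another hard failure */
            /* On E2BIG the output was flushed above, so just loop again. */
        }
        memmove(inbuf, inptr, inleft);  /* carry the leftover bytes over */
        pending = inleft;
    }

    if (rc == 0) {
        /* Emit any closing shift sequence for stateful encodings. */
        char *outptr = outbuf;
        size_t outleft = sizeof outbuf;
        if (iconv(cd, NULL, NULL, &outptr, &outleft) == (size_t)-1) {
            rc = -1;
        } else {
            size_t produced = sizeof outbuf - outleft;
            if (fwrite(outbuf, 1, produced, out) != produced)
                rc = -1;
        }
    }
    iconv_close(cd);
    return rc;
}
```

A main() built on this would call pick_charsets(argc, argv, &from, &to), print a usage message if it returns -1, and otherwise call convert_stream(stdin, stdout, from, to). The EINVAL case is the one that matters for chunked reads: a multibyte character can straddle two chunks, so the unconverted tail is carried into the next read rather than treated as an error.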
http://www.technocage.com/~caskey/siconv You're soaking in it!
