[NZLUG] Character set translator passing illegal characters

Robin Sheat robin at kallisti.net.nz
Wed Jun 26 21:09:59 NZST 2013


On 26-06-13 17:12, Volker Kuhlmann wrote:
> Does anyone know of a character set conversion program that handles
> illegal characters by passing them through? I can't find one, and am a
> bit reluctant to re-invent the wheel.

What are you converting? I often use Encoding::FixLatin in Perl to clean
up Latin-1 data and turn it into Unicode. It does a pretty good
best-effort job. I pair it with a few regexes to clean up smart quotes
and other oddball things that end up in text (I found VT100 control
codes in one data set).

http://search.cpan.org/dist/Encoding-FixLatin/lib/Encoding/FixLatin.pm
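
For anyone who would rather not pull in Perl, the same general idea can
be sketched in Python. This is only an illustration of the approach, not
how Encoding::FixLatin is implemented: valid UTF-8 is kept, and any byte
that is not valid UTF-8 is reinterpreted as CP1252 (falling back to
Latin-1) rather than dropped. The names clean_text and legacy_fallback,
the smart-quote table, and the escape-sequence regex are all made up for
this example.

```python
import codecs
import re


def _legacy_fallback(err):
    """Decode error handler: reinterpret bytes that are not valid UTF-8."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    chars = []
    for b in bad:
        one = bytes([b])
        try:
            chars.append(one.decode("cp1252"))
        except UnicodeDecodeError:
            # A few bytes (e.g. 0x81) are undefined in CP1252;
            # Latin-1 maps every byte, so nothing is lost.
            chars.append(one.decode("latin-1"))
    # Return the replacement text and the position to resume decoding at.
    return "".join(chars), err.end


# "legacy_fallback" is a hypothetical handler name chosen for this sketch.
codecs.register_error("legacy_fallback", _legacy_fallback)

# Map typographic quotes to plain ASCII quotes (assumed cleanup policy).
_SMART_QUOTES = str.maketrans({
    "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"',
})

# Matches ANSI/VT100 CSI sequences such as ESC [ 1 m.
_CSI_SEQUENCE = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")


def clean_text(data: bytes) -> str:
    """Best-effort conversion of mixed UTF-8 / Latin-1 bytes to str."""
    text = data.decode("utf-8", errors="legacy_fallback")
    text = _CSI_SEQUENCE.sub("", text)
    return text.translate(_SMART_QUOTES)
```

Calling clean_text(b"caf\xe9 na\xc3\xafve") handles both the stray
Latin-1 byte and the valid UTF-8 sequence in the same string. Whether
passing bytes through as CP1252 is the right policy depends on where the
data came from, so treat the fallback choice as an assumption.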

Robin.
