From: Andrew Dunbar (hippietrail@yahoo.com)
Date: Tue May 07 2002 - 22:50:56 EDT
--- F J Franklin <F.J.Franklin@sheffield.ac.uk> wrote:
> On Tue, 7 May 2002, phearbear wrote:
> >>> How about an easy way to convert to UTF-8?
> >> You should be able to use the UT_UTF8String class...
> > Oops, I wrongly assumed it didn't support UCS4 yet.
>
> Support is there but incomplete. Byte sequences longer than 3 bytes
> will cause problems, and there isn't a UTF-8 -> UCS-4 conversion yet.
Sorry to keep whining about this, but it was all in
my huge lost Unicode patch from over a year ago. UTF-8
sequences can be up to 6 bytes long. We should
probably leave the conversion to iconv anyway, since we
have to handle things like overlong sequences, illegal
sequences, etc., and iconv should already handle those.
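To make the iconv suggestion concrete, here is a rough sketch of a UTF-8 -> UCS-4 conversion that leaves all validation to iconv. The helper name utf8_to_ucs4 is made up for illustration and is not an existing AbiWord function. It also assumes the platform's iconv accepts the encoding name "UCS-4LE" and takes a non-const char** input pointer, as glibc does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iconv.h>
#include <string>
#include <vector>

// Hypothetical helper (not an AbiWord API): decode UTF-8 into UCS-4 code
// points using iconv. Returns false if the converter is unavailable or if
// iconv rejects the input (illegal, overlong or truncated sequences).
bool utf8_to_ucs4(const std::string &utf8, std::vector<std::uint32_t> &out)
{
    out.clear();

    // Assumption: "UCS-4LE" is a known encoding name on this platform.
    iconv_t cd = iconv_open("UCS-4LE", "UTF-8");
    if (cd == (iconv_t)-1)
        return false;

    // Each input byte yields at most one code point, so 4 output bytes
    // per input byte is always enough room.
    std::vector<char> buf(utf8.size() * 4 + 4);
    char *in = const_cast<char *>(utf8.data());
    std::size_t in_left = utf8.size();
    char *dst = buf.data();
    std::size_t dst_left = buf.size();

    std::size_t rc = iconv(cd, &in, &in_left, &dst, &dst_left);
    iconv_close(cd);
    if (rc == static_cast<std::size_t>(-1))
        return false; // errno is EILSEQ or EINVAL for malformed input

    std::size_t produced = buf.size() - dst_left;
    assert(produced % 4 == 0);

    // Reassemble little-endian bytes so the result does not depend on the
    // byte order of the host.
    for (std::size_t i = 0; i < produced; i += 4) {
        auto byte = [&](std::size_t k) {
            return static_cast<std::uint32_t>(
                static_cast<unsigned char>(buf[i + k]));
        };
        out.push_back(byte(0) | (byte(1) << 8) | (byte(2) << 16) |
                      (byte(3) << 24));
    }
    return true;
}
```

The point of doing it this way is that our own code never has to decide what counts as a legal sequence; it only checks iconv's return value.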
I think my implementation used the ByteBuf class so
that it could handle UCS-2 and UCS-4 properly, without
all those null bytes being mistaken for string
terminators.
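The null-byte problem is easy to demonstrate. The sketch below uses a plain std::vector<unsigned char> as a stand-in for a length-carrying byte buffer; it does not use the real ByteBuf interface, and both function names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: serialize 16-bit code units as UCS-2 little-endian
// bytes into a buffer that carries its own length.
std::vector<unsigned char> to_ucs2le(const std::vector<std::uint16_t> &units)
{
    std::vector<unsigned char> bytes;
    bytes.reserve(units.size() * 2);
    for (std::uint16_t u : units) {
        bytes.push_back(static_cast<unsigned char>(u & 0xFF)); // low byte
        bytes.push_back(static_cast<unsigned char>(u >> 8));   // high byte
    }
    assert(bytes.size() == units.size() * 2);
    return bytes;
}

// Hypothetical helper: the length a C-string routine would report, i.e.
// the number of bytes before the first zero byte.
std::size_t length_seen_by_c_string_code(const std::vector<unsigned char> &bytes)
{
    if (bytes.empty())
        return 0;
    const void *nul = std::memchr(bytes.data(), 0, bytes.size());
    if (nul == nullptr)
        return bytes.size();
    return static_cast<std::size_t>(
        static_cast<const unsigned char *>(nul) - bytes.data());
}
```

For "Hi" (0x0048, 0x0069) the buffer holds the four bytes 48 00 69 00, but anything that treats it as a C string stops after the first byte. A buffer that knows its own size does not have that problem.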
Andrew Dunbar.
> Regards, Frank
>
> Francis James Franklin
> F.J.Franklin@shef.ac.uk
=====
http://linguaphile.sourceforge.net http://www.abisource.com
This archive was generated by hypermail 2.1.4 : Tue May 07 2002 - 22:53:39 EDT