Re: commit (HEAD): IMPORTANT - 32-bit UT_UCSChar

From: Andrew Dunbar (hippietrail@yahoo.com)
Date: Tue May 07 2002 - 22:50:56 EDT


    --- F J Franklin <F.J.Franklin@sheffield.ac.uk> wrote:
    > On Tue, 7 May 2002, phearbear wrote:
    > >>> How about an easy way to convert to UTF-8?
    > >> You should be able to use the UT_UTF8String class...
    > > Oops, I wrongly assumed it didn't support UCS-4 yet.
    >
    > Support is there but incomplete. Byte sequences longer than
    > 3 bytes will cause problems, and there isn't a UTF-8 -> UCS-4
    > conversion yet.

    Sorry to keep whining about this, but it was all in
    my huge lost Unicode patch over a year ago. UTF-8
    sequences can be up to 6 bytes long. We should
    probably leave it up to iconv anyway, since we have to
    handle things like overlong sequences, illegal
    sequences, etc., and iconv should already handle those.
    I think my implementation used the ByteBuf class so
    that it could handle UCS-2 and UCS-4 properly without
    worrying about all those null bytes looking like
    string terminators.
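
To make the overlong/illegal cases concrete, here is a rough sketch of
what a hand-rolled UTF-8 -> UCS-4 decoder would have to check. It is not
AbiWord code and not the lost patch: decode_one and utf8_to_ucs4 are
hypothetical names made up for illustration. It follows the six-byte
form of UTF-8 mentioned above (the later RFC 3629 definition stops at
four bytes and U+10FFFF), and it only covers truncated input, stray
continuation bytes, overlong forms, and surrogates.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: decode one UTF-8 sequence starting at s[pos].
// On success, stores the code point in 'out' and advances 'pos'.
// Returns false on truncated, illegal, or overlong input.
bool decode_one(const std::string& s, std::size_t& pos, std::uint32_t& out)
{
    if (pos >= s.size()) return false;
    const unsigned char lead = static_cast<unsigned char>(s[pos]);

    std::size_t len;
    std::uint32_t cp;
    if      (lead < 0x80) { len = 1; cp = lead; }
    else if (lead < 0xC0) { return false; }            // stray continuation byte
    else if (lead < 0xE0) { len = 2; cp = lead & 0x1F; }
    else if (lead < 0xF0) { len = 3; cp = lead & 0x0F; }
    else if (lead < 0xF8) { len = 4; cp = lead & 0x07; }
    else if (lead < 0xFC) { len = 5; cp = lead & 0x03; }
    else if (lead < 0xFE) { len = 6; cp = lead & 0x01; }
    else                  { return false; }            // 0xFE and 0xFF are never valid

    if (s.size() - pos < len) return false;            // truncated sequence

    for (std::size_t i = 1; i < len; ++i) {
        const unsigned char c = static_cast<unsigned char>(s[pos + i]);
        if ((c & 0xC0) != 0x80) return false;          // not a continuation byte
        cp = (cp << 6) | (c & 0x3F);
    }

    // Smallest code point that genuinely needs 'len' bytes; anything
    // below it is an overlong encoding and must be rejected.
    static const std::uint32_t min_value[7] =
        { 0, 0, 0x80, 0x800, 0x10000, 0x200000, 0x4000000 };
    if (cp < min_value[len]) return false;

    if (cp >= 0xD800 && cp <= 0xDFFF) return false;    // UTF-16 surrogates

    pos += len;
    out = cp;
    return true;
}

// Hypothetical helper: decode a whole UTF-8 string into UCS-4 values.
// Returns false as soon as any sequence fails to decode.
bool utf8_to_ucs4(const std::string& utf8, std::vector<std::uint32_t>& out)
{
    out.clear();
    std::size_t pos = 0;
    while (pos < utf8.size()) {
        std::uint32_t cp;
        if (!decode_one(utf8, pos, cp)) return false;
        out.push_back(cp);
    }
    return true;
}
```

Calling utf8_to_ucs4("\xC0\x80", v) returns false because C0 80 is an
overlong encoding of U+0000. The result is a std::vector that carries
its own length, so zero bytes inside the UCS-4 data are never mistaken
for a string terminator. Even this much bookkeeping is an argument for
leaving the job to iconv.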

    Andrew Dunbar.

    > Regards, Frank

    =====
    http://linguaphile.sourceforge.net http://www.abisource.com
