Re: Strings, Was: profile results for new UT_* implementations?


Subject: Re: Strings, Was: profile results for new UT_* implementations?
From: Joaquín Cuenca Abela (cuenca@pacaterie.u-psud.fr)
Date: Wed Jun 20 2001 - 06:03:45 CDT


On 20 Jun 2001 07:51:01 +0200, Mike Nordell wrote:
>
> Dom Lachowicz wrote:
>
> > Abi historically has always used UCS-2 internally to represent
> > strings, and as you note, we're beginning to run into problems with
> > that. Dealing with UTF-8 is no more pleasant than dealing with UCS-2
> > in my experience, but perhaps it is (much) more common in the
> > programming community as a whole.
>
> I'd say dealing with UTF-8 is _much_ more of a hell:
> A discussion Joaquin and I had about this in the back of the cab on the
> way to the .dk party concluded that, whatever format a document has on
> disk, having it in UCS-2 in memory should be _much_ easier to deal with
> (only indexing on unsigned chars) than UTF-8 (indexing on... oh, we can't
> index). At the time I believe we both felt it was the way to go. At least
> I still feel it's reasonable.

Yes, I think it's definitely the way to go. UTF-8 is too hard to deal
with in memory, but it's a good idea for saving.
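
To illustrate the indexing point, here is a minimal sketch in Python (not AbiWord code; the function names are made up for this example). It assumes well-formed UTF-8 and text that fits in 16-bit code units. With fixed-width units, the n-th character is a direct lookup; with UTF-8, the only way to find it is to walk the sequence from the start.

```python
def utf8_seq_len(lead):
    """Length in bytes of a UTF-8 sequence, judged from its lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError("not a UTF-8 lead byte: 0x%02X" % lead)


def ucs2_char_at(units, index):
    """Fixed-width units: the character at `index` is a direct O(1) lookup."""
    return units[index]


def utf8_char_at(data, index):
    """Variable-width bytes: skip `index` whole sequences first, O(n)."""
    pos = 0
    for _ in range(index):
        pos += utf8_seq_len(data[pos])
    length = utf8_seq_len(data[pos])
    return ord(data[pos:pos + length].decode("utf-8"))


text = "Joaquín"
units = [ord(c) for c in text]   # stand-in for a UCS-2 buffer
data = text.encode("utf-8")      # 8 bytes for 7 characters

last_from_units = ucs2_char_at(units, 6)   # the final 'n'
last_from_utf8 = utf8_char_at(data, 6)     # same character, found by scanning
raw_byte = data[6]                         # second byte of 'í', not a character
```

Indexing the UTF-8 bytes directly at position 6 lands in the middle of the two-byte sequence for 'í', which is why byte offsets and character offsets cannot be used interchangeably.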

Btw, it's not only MS that uses UCS-2; Java does too.

Cheers,

--
Joaquín Cuenca Abela
cuenca@celium.net


