Re: Making word delimiters locale sensitive

From: Andrew Dunbar (hippietrail@yahoo.com)
Date: Thu May 02 2002 - 00:01:25 EDT

    --- Jordi Mas <jmas@softcatala.org> wrote:
    > Hi guys,
    >
    > Inside the file abi\src\af\util\xp\ut_misc.cpp there
    > is a struct called
    > "s_word_delim".
    >
    > In order to make the Catalan spell checker work
    > properly with AbiWord
    > we need to hack this structure to include the "·"
    > character (not recognized
    > right now). Also, we need to hack
    > UT_isWordDelimiter() because
    > the '-' character can be part of a word in Catalan
    > (e.g. copiar-lo).

    The dash can be part of a word in English and French
    too. I wonder why it's not already there.

    > In my opinion, all these settings should be locale
    > sensitive and we should
    > move them from the code into an external file where
    > we define all these
    > settings so they can be easily modified for every
    > language.
    >
    > What do you think, guys? If we agree on how to do
    > this, I can make the
    > changes myself.

    Actually, it's a harder problem than this. In a
    multilingual context you need a function to find word
    boundaries, since some languages (Thai, Khmer,
    Japanese) make spaces between words either optional
    or illegal. So we need to call a function, and this
    function needs to be able to call a language-specific
    function. Most languages (or the default) can then
    use a single function which, in turn, can use the
    "word delimiter" method which, as you point out,
    needs to be extensible.

    A perfect little case for OO (:
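
    As a rough sketch of that layering (C++, and none of it is
    existing AbiWord code): WordBreaker, DelimiterWordBreaker,
    CatalanWordBreaker and defaultDelimiters() are hypothetical
    names, the delimiter set is an assumed example rather than the
    contents of s_word_delim, and char32_t stands in for whatever
    character type the real code uses. The default finder splits on
    an extensible delimiter set, and a Catalan subclass keeps '-'
    and '·' inside a word when they sit between two word characters.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Half-open [start, end) index range of one word in the text.
using WordRange = std::pair<std::size_t, std::size_t>;

// Hypothetical interface: one word-boundary finder per language.
class WordBreaker {
public:
    virtual ~WordBreaker() = default;
    virtual std::vector<WordRange> findWords(const std::u32string& text) const = 0;
};

// Assumed example delimiter set; the real list would come from
// s_word_delim or an external per-locale file.
inline std::set<char32_t> defaultDelimiters() {
    return {U' ', U'\t', U'\n', U',', U'.', U';', U':', U'!', U'?',
            U'-', U'\u00B7'};
}

// Default strategy for languages that separate words with delimiters.
class DelimiterWordBreaker : public WordBreaker {
public:
    explicit DelimiterWordBreaker(std::set<char32_t> delims)
        : m_delims(std::move(delims)) {}

    std::vector<WordRange> findWords(const std::u32string& text) const override {
        std::vector<WordRange> words;
        std::size_t i = 0;
        while (i < text.size()) {
            // Skip the delimiters in front of the next word.
            while (i < text.size() && isDelimiter(text, i)) ++i;
            const std::size_t start = i;
            // Consume word characters up to the next delimiter.
            while (i < text.size() && !isDelimiter(text, i)) ++i;
            if (i > start) words.emplace_back(start, i);
        }
        return words;
    }

protected:
    // Extension point: a language can make the decision depend on context.
    virtual bool isDelimiter(const std::u32string& text, std::size_t pos) const {
        return m_delims.count(text[pos]) != 0;
    }

private:
    std::set<char32_t> m_delims;
};

// Catalan: '-' and the middle dot join a word when they sit between
// two non-delimiter characters (copiar-lo, col·legi).
class CatalanWordBreaker : public DelimiterWordBreaker {
public:
    CatalanWordBreaker() : DelimiterWordBreaker(defaultDelimiters()) {}

protected:
    bool isDelimiter(const std::u32string& text, std::size_t pos) const override {
        const char32_t c = text[pos];
        if (c == U'-' || c == U'\u00B7') {
            const bool wordBefore =
                pos > 0 && !DelimiterWordBreaker::isDelimiter(text, pos - 1);
            const bool wordAfter =
                pos + 1 < text.size() &&
                !DelimiterWordBreaker::isDelimiter(text, pos + 1);
            return !(wordBefore && wordAfter);
        }
        return DelimiterWordBreaker::isDelimiter(text, pos);
    }
};
```

    With this sketch, CatalanWordBreaker().findWords() on
    U"copiar-lo col\u00B7legi" should give two words where the
    default finder gives four. A Thai or Japanese finder would
    override findWords() itself rather than isDelimiter(), since
    no delimiter set can do the job there.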

    Andrew Dunbar.

    > Thanks,
    >
    > Jordi,
    >

    =====
    http://linguaphile.sourceforge.net http://www.abisource.com



