Re: [Fribidi-discuss] fribidi and arabic joining

From: Tomas Frydrych (tomas@frydrych.uklinux.net)
Date: Tue Mar 26 2002 - 04:29:30 EST

Next message: Philippe DEFERT: "abi post 0.99.3 CVS March 25, IRIX 6.5"

Previous message: F J Franklin: "Re: A new start? [Re: Alan's excellent idea]"
Next in thread: Tomas Frydrych: "Re: [Fribidi-discuss] fribidi and arabic joining"
Reply: Tomas Frydrych: "Re: [Fribidi-discuss] fribidi and arabic joining"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]

Hi Behdad,

thanks, I found that very helpful.

> But things are not so easy, the Arabic Joining Alg.
> itself needs the "Left" and "Right" character of a character in text,
> which Left and Right are defined in the visual text, not logical, the
> left and right characters cannot be found easily from the next and
> previous character of the logical order, because of the override marks
> (LRO and RLO).

It's not just the LRO/RLO, but all embedding level boundaries, since
there the character on one (visual) side of the character being
shaped is unrelated to the next/prev. logical character.

> Example:
> <LRO> a b C D <RLE> f g H <PDF> x Y z <PDF>
> => <LRO> z Y x <RLE> f g H <PDF> D C b a <PDF>
> also <LRO> a b <RLO> f g <PDF> h <LRE> x y <PDF> Z <PDF>
> => <LRO> Z <LRE> x y <PDF> h <RLO> f g <PDF> b a <PDF>

If I have a logical sequence abcKLMxyz (caps = RTL, overall LTR),
then visual: abcMLKxyz, i.e., the visual left and right of K and M
cannot be worked out in the logical space because 'c' is in no way
related to 'x' and 'L', and 'x' is in no way related to 'c' and 'L'. If the
joining algorithm is defined in visual space, then I do not see how
you can get around having to completely reorder first.
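
To make the abcKLMxyz case concrete, here is a small Python sketch. It
is only a toy under my own assumptions: caps are RTL, lowercase is LTR,
the base direction is LTR, and there are no explicit embeddings or
overrides, so it is not the full Unicode Bidi algorithm and it is not
what fribidi does. The names toy_visual_order and visual_neighbours are
made up for the illustration.

```python
def toy_visual_order(logical):
    """Reverse each maximal run of RTL (uppercase) characters.

    Toy model only: base direction is LTR, uppercase = RTL,
    everything else = LTR, no embedding levels or overrides.
    """
    out = []
    i = 0
    while i < len(logical):
        j = i
        # Extend the run while the direction stays the same.
        while j < len(logical) and logical[j].isupper() == logical[i].isupper():
            j += 1
        run = logical[i:j]
        out.append(run[::-1] if logical[i].isupper() else run)
        i = j
    return "".join(out)


def visual_neighbours(logical, ch):
    """Return the (left, right) visual neighbours of ch, or None at an edge."""
    visual = toy_visual_order(logical)
    i = visual.index(ch)
    left = visual[i - 1] if i > 0 else None
    right = visual[i + 1] if i + 1 < len(visual) else None
    return left, right


print(toy_visual_order("abcKLMxyz"))        # abcMLKxyz
print(visual_neighbours("abcKLMxyz", "K"))  # ('L', 'x')
print(visual_neighbours("abcKLMxyz", "M"))  # ('c', 'L')
```

The visual neighbours of K are 'L' and 'x', while its logical
neighbours are 'c' and 'L'; the sketch only finds 'x' by reordering the
whole string first.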

> The first idea needs some work to prove the independency (which
> may not be true).
Shaping is _not_ independent of line breaking; here is why:

(1) Unidirectional text: the BIDI transformation does not change the
visual sequence of glyphs, but just the orientation of the coordinate
system from which it is observed, i.e., the context of individual
characters remains identical. Line breaking makes no difference
here, since it does not impact the visual ordering (the individual
lines are mere snapshots of segments of the 'original' single long
(visual) line).

(2) True bidi text: (1) applies to all characters except those on a
direction boundary. For the characters on the boundary, we have
two cases:
   (2a) the character starts an embedding level: the context is given
   by the next (logical) character, and the last (logical) character
   of the preceding embedding level.

   (2b) the character ends an embedding level: the context is given by
   the previous (logical) character, and the first (logical) character
   of the next embedding level.

Now, the line breaking here does not change the visual sequence
of the text if the line break falls within a segment whose visual
direction is identical to the base direction; it has exactly the same
effect as in case (1).

If the line break falls into a segment with a different visual direction
than the base direction, the visual order of glyphs is impacted (we
no longer have simple snapshots of segments of the original long
visual line). The question of whether this has any impact on the
shaping then boils down to one thing: is the contextual value of the
glyph immediately before the last embedding level change on the
previous line identical to the contextual value of the character on
which a line break is allowed? If it is (A), then the joining is
independent of line breaking. If it is not (B), then joining is affected
by line breaking:

abcKLM OPQ xyz => abcQPO MLK xyz
with a line break between M and O:
abcMLK
QPO xyz
i.e., the original line would have had medial Q and final M, the new
line has medial M and final Q.
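
The same example as a small Python sketch, again only a toy under my
own assumptions: caps are RTL, lowercase is LTR, the base direction is
LTR, a space counts as RTL only when the nearest letters on both sides
are RTL, and the space at the break is dropped. It is not the full
Unicode Bidi algorithm, and reorder_line and break_then_reorder are
names invented for the illustration.

```python
def rtl_flags(text):
    """Mark each character as RTL (True) or LTR (False) in the toy model."""
    flags = []
    for i, ch in enumerate(text):
        if ch.isalpha():
            flags.append(ch.isupper())
        else:
            # Neutral: RTL only if the nearest letters on both sides are RTL.
            before = next((c for c in reversed(text[:i]) if c.isalpha()), None)
            after = next((c for c in text[i + 1:] if c.isalpha()), None)
            flags.append(bool(before and after
                              and before.isupper() and after.isupper()))
    return flags


def reorder_line(logical):
    """Reverse each maximal RTL run of one line (base direction LTR)."""
    flags = rtl_flags(logical)
    out = []
    i = 0
    while i < len(logical):
        j = i
        while j < len(logical) and flags[j] == flags[i]:
            j += 1
        run = logical[i:j]
        out.append(run[::-1] if flags[i] else run)
        i = j
    return "".join(out)


def break_then_reorder(logical, break_at):
    """Break the logical text at index break_at, then reorder each line."""
    first = logical[:break_at].rstrip()
    second = logical[break_at:].lstrip()
    return [reorder_line(first), reorder_line(second)]


text = "abcKLM OPQ xyz"
print(reorder_line(text))           # abcQPO MLK xyz
print(break_then_reorder(text, 7))  # ['abcMLK', 'QPO xyz']
```

On the single long line the glyph next to 'c' is Q; after the break it
is M, so the two lines are not snapshots of the long visual line.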

In theory (B) is entirely legitimate, and so joining is not
independent of line breaking. In real life, the assumption that (A) is
true will often be satisfied, because we can expect a character
such as space on the embedding level boundary.

It seems to me that the only way to completely resolve the circular
dependence here is by defining the joining algorithm in logical
space.

Tomas

> But the second one which is a bit complex
> seems to produce the desired result. I will provide the test
> cases for different cases in another mail.
>
> [End of BiDi vs. Arabic Joining interaction material, the rest is
> fribidi related.]
>
> B. Our implementation of the Arabic Joining Algorithm is quite
> small and light, that will not harm the objectives of fribidi at all,
> but makes it much more useful, either the command line tool (that can
> be used to cat right to left files), and the library. Many
> applications that use fribidi do not support Arabic Joining as there
> is no light-weight implementation of it available, or the author just
> wanted it to work for hebrew. But with Arabic Joining in fribidi the
> developer can just easily turn the arabic joining on to work well for
> arabic too.
>
> C. The Pango is not a real solution for the audience of fribidi:
> fribidi has been ported to some mobile devices. Also fribidi has been
> used on linux console and xterm, that is not a good idea to use pango
> for arabic joining there. fribidi is mostly used for hebrew and
> arabic scripts, which their rendering will be completed with arabic
> joining algorithm, then we should not worry about other shaping
> matters, when shaping of all the Unicode characters is needed, the
> fribidi feature can be turned off.
>
> D. Using the Unicode Arabic Presentation Forms is also essential with
> Linux console, as the kernel maps the Unicode codepoints to glyphs,
> for other scripts like syriac which does not have the presentation
> forms in unicode, their presentation forms should be registered in the
> private area of unicode (H. Peter Anvin is responsible for registering
> them in linux), to be able to show them in linux console.
>
> E. About the overhead of it on fribidi, I believe that the
> hebrew community should not be so happy, but:
>
> 1. It can be fully turned out with a configure time
> option.
> 2. When compiled with Arabic Joining, by default its
> off, the developer should turn it on if needed.
> 3. I try to put it in a different binary to save the
> resources.
>
>
> I hope that with the above discussion there will be enough
> reasons for all of you to put it in fribidi.
>
> Yours,
> -- Behdad Esfahbod 6 Farvardin 1381, 2002 Mar 26
> <behdad at bamdad dot org> [Finger for Geek Code]
>
>
>
> _______________________________________________
> Fribidi-discuss mailing list
> Fribidi-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/fribidi-discuss
>

This archive was generated by hypermail 2.1.4 : Tue Mar 26 2002 - 04:33:37 EST