[xep-support] Re: RTL script, Arabic-Indic figures and hyphen

Alexey Gagarinov agagarinov at renderx.com
Sun Aug 19 22:27:58 PDT 2012


Hi Benoit,

>     The text inside the fo:inline element is not displayed correctly in a PDF rendered by XEP 4.19.
>
>       * I expect, reading right to left: "Arabic-word space Arabic-0 hyphen Arabic-1"
>       * but I see in the PDF: "Arabic-word space Arabic-1 hyphen Arabic-0"
>
>     So, 0 and 1 are switched. Other Unicode-compliant software displays the text on screen as I expect it.

This was a bug in XEP's implementation of the BIDI algorithm. It has been fixed.

>     If I replace the hyphen by an Arabic letter, then the figures are in the expected order, but a space or
>     comma also gives an incorrect order.

I believe you're mistaken about the comma.
XEP (both versions 4.19 and 4.20) displays the figures in the correct order.

The comma (,) belongs to the CS (Common Number Separator) class of the BIDI algorithm.
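
The class of any character can be checked outside XEP. The following is a minimal sketch that uses Python's standard unicodedata module; the `samples` and `classes` names are only illustrative and have nothing to do with XEP's own code.

```python
import unicodedata

# Characters discussed in this thread (illustrative selection).
samples = {
    "comma": ",",
    "full stop": ".",
    "hyphen-minus": "-",
    "space": " ",
    "European digit 1": "1",
    "Arabic-Indic digit 1": "\u0661",
}

# unicodedata.bidirectional() returns the bidirectional class as a string.
classes = {name: unicodedata.bidirectional(ch) for name, ch in samples.items()}

for name, cls in classes.items():
    print(f"{name:22} {cls}")
# comma and full stop are CS, hyphen-minus is ES, space is WS,
# the European digit is EN and the Arabic-Indic digit is AN.
```

Note that the hyphen-minus is ES (European Number Separator) and the space is WS (Whitespace), so neither of them is a common separator.
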
According to the BIDI algorithm (UAX #9, rule W4):
"A single common separator between two numbers of the same type changes to that type."

In other words, a single common separator between two Arabic (or two European) numbers should be treated as
part of one number.
This is probably more obvious for European numbers: 1,000,000 is a single number. The same applies to
Arabic numbers.
Because the digits and the separator then form one number, they are laid out as a unit, left to right,
even inside right-to-left text.
Note: '.' is also a CS class character, so 0.1 and 1,000,000 are both single numbers. 0,1 is also a single
number according to the Unicode BIDI algorithm.
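
Rule W4 can be sketched on its own in a few lines of Python. This is an illustration only, not XEP's implementation: the helper names `bidi_classes` and `apply_w4` are hypothetical, and the sketch applies W4 in isolation, skipping the other rules of UAX #9 that a full implementation runs before and after it.

```python
import unicodedata

def bidi_classes(text):
    """Return the Unicode bidirectional class of each character."""
    return [unicodedata.bidirectional(ch) for ch in text]

def apply_w4(classes):
    """Apply only rule W4 of UAX #9 to a list of bidi classes (hypothetical helper)."""
    out = list(classes)
    for i in range(1, len(classes) - 1):
        prev, cur, nxt = out[i - 1], classes[i], classes[i + 1]
        if cur == "ES" and prev == nxt == "EN":
            # A single European separator between two European numbers.
            out[i] = "EN"
        elif cur == "CS" and prev == nxt and prev in ("EN", "AN"):
            # A single common separator between two numbers of the same type.
            out[i] = prev
    return out

ARABIC_ZERO, ARABIC_ONE = "\u0660", "\u0661"

# Comma (CS) between two Arabic-Indic digits: absorbed into the number.
print(apply_w4(bidi_classes(ARABIC_ZERO + "," + ARABIC_ONE)))  # ['AN', 'AN', 'AN']

# Hyphen-minus (ES) between two Arabic-Indic digits: left unchanged by W4.
print(apply_w4(bidi_classes(ARABIC_ZERO + "-" + ARABIC_ONE)))  # ['AN', 'ES', 'AN']
```

With the comma, all three characters end up in the AN class and behave as one number. With the hyphen, W4 leaves the separator alone, so the two digits remain two separate numbers.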


Regards,
   Alexey Gagarinov
RenderX



