[xep-support] Rendering Unicode combined diacritic characters

Nikolai Grigoriev grig at renderx.com
Thu May 20 03:22:38 PDT 2004


Jim,

> I'm finding that when displaying a PDF file generated
> from XEP, Unicode combining diacritics don't
> render correctly.

You are right. XEP does not do any special processing
for combining diacritics: it just displays diacritic signs
as standalone characters. (They show above their
preceding characters only by virtue of their glyph metrics.)
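
To illustrate what "by virtue of their glyph metrics" means, here is a
minimal sketch in Python. It is not XEP code; the metric values and the
names GLYPHS and naive_layout are invented for the example. It assumes a
font whose combining acute has zero advance width and an outline drawn to
the left of its origin, which is a common design for combining marks.

```python
# Hypothetical metrics in font units; real values would come from the font.
GLYPHS = {
    "e": {"advance": 500, "x_min": 40},
    "\u0301": {"advance": 0, "x_min": -330},  # combining acute accent
}

def naive_layout(text):
    """Place each glyph at the current pen position, then advance the pen.

    Returns (character, x position of the left edge of its outline) pairs.
    No mark positioning is done: the mark is treated like any other glyph.
    """
    pen_x = 0
    placed = []
    for ch in text:
        glyph = GLYPHS[ch]
        placed.append((ch, pen_x + glyph["x_min"]))
        pen_x += glyph["advance"]
    return placed

print(naive_layout("e\u0301"))
```

With these made-up numbers the accent's outline starts at x = 170, inside
the space occupied by the "e". The offset is fixed in the mark's own glyph,
so it is the same whatever the base character is.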

I admit this is a drawback in XEP; but it is not an easy 
one to overcome. Proper diacritic placement requires
parsing of complex font structures; we don't have immediate 
plans in that direction.

> You'll notice that the barred i does have
> the acute accent over it (though a little too far
> left), but the dot on the i is retained (it shouldn't
> be retained when combined with an acute accent).
> This character is represented by x0268 x0301. 

I don't see how we could achieve this, unless there 
is a special glyph for barred dotless i. XEP certainly
cannot decompose glyph descriptions from fonts.
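
For what it is worth, detecting the case is the easier half. The sketch
below (Python, not XEP code) flags a base character that should lose its
dot, using the Unicode Soft_Dotted property and the canonical combining
class of the following marks. The set SOFT_DOTTED_SUBSET is a small
hand-copied subset for illustration only, and the function name
needs_dotless_base is made up. Even when the function returns True, the
font still has to supply a dotless glyph for the base.

```python
import unicodedata

# Partial, hand-copied subset of characters with the Unicode Soft_Dotted
# property. Assumption: a real implementation would load the full list from
# the Unicode Character Database (PropList.txt); Python's unicodedata module
# does not expose this property.
SOFT_DOTTED_SUBSET = {"\u0069", "\u006A", "\u012F", "\u0268"}

ABOVE = 230  # canonical combining class of marks placed above the base

def needs_dotless_base(base, marks):
    """Return True if `base` should be drawn without its dot.

    `marks` is the string of combining characters that follow the base.
    The dot is dropped when any of them attaches above the base.
    """
    return base in SOFT_DOTTED_SUBSET and any(
        unicodedata.combining(mark) == ABOVE for mark in marks
    )

print(needs_dotless_base("\u0268", "\u0301"))  # barred i + combining acute
```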

> Microsoft's Unicode rendering engine (Uniscribe)
> only recently became able to render these character
> combinations correctly. There are other Unicode
> rendering schemes, such as SIL's Graphite...

We don't use any third-party Unicode processors:
our past experience has made us very suspicious of
external components, as they seriously undermine
portability.

Best regards,
Nikolai Grigoriev
RenderX


----- Original Message ----- 
From: "Jim Skelton" <jimskelton at yahoo.com>
To: <xep-support at renderx.com>
Sent: Tuesday, May 18, 2004 3:36 AM
Subject: [xep-support] Rendering Unicode combined diacritic characters


I'm finding that when displaying a PDF file generated
from XEP, Unicode combining diacritics don't
render correctly. An example of this is at
http://www3.telus.net/osis/CNT.2john.pdf -- in the
title, 3rd word from the left, you'll notice that the
acute accent over the e dieresis is too low. This
character is represented by the two code points x00EB
x0301. The latter is a combining diacritic,
meaning that it overstrikes the preceding character.
In theory, the rendering engine should place the
combining diacritic at the correct height, depending
on the metrics of the base character. 

Another example of this is in the acute accent over
barred i, found in the first line of regular text, 2nd
word over. You'll notice that the barred i does have
the acute accent over it (though a little too far
left), but the dot on the i is retained (it shouldn't
be retained when combined with an acute accent). This
character is represented by x0268 x0301. 
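
For reference, the two sequences above can be inspected with Python's
standard unicodedata module. This is only a sketch for checking what the
code points are and whether Unicode defines a precomposed form for them;
it says nothing about how XEP processes text, and the helper name describe
is made up for the example.

```python
import unicodedata

def describe(text):
    """Return (code point, name, canonical combining class) per character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))
        for ch in text
    ]

e_seq = "\u00EB\u0301"  # e with diaeresis + combining acute
i_seq = "\u0268\u0301"  # i with stroke (barred i) + combining acute

for seq in (e_seq, i_seq):
    for row in describe(seq):
        print(row)
    # NFC composes a sequence only where Unicode defines a precomposed
    # character; otherwise the combining mark stays a separate code point.
    composed = unicodedata.normalize("NFC", seq)
    print("code points after NFC:", len(composed))
```

Both sequences remain two code points after NFC normalization, so
normalizing the input does not avoid the need to position the accent over
the base character.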

The embedded font used in this PDF file is a Unicode
font which contains many of the Latin character
subsets.

Is there something that can be done to render Unicode
combining diacritics correctly? I noticed that
Microsoft's Unicode rendering engine (Uniscribe) only
recently became able to render these character
combinations correctly. There are other Unicode
rendering schemes, such as SIL's Graphite...

--Jim

-------------------
(*) To unsubscribe, send a message with words 'unsubscribe xep-support'
in the body of the message to majordomo at renderx.com from the address
you are subscribed from.
(*) By using the Service, you expressly agree to these Terms of Service http://www.renderx.com/tos.html