[xep-support] Question about rendering character references

Eliot Kimber ekimber at innodata-isogen.com
Fri Jul 22 11:34:08 PDT 2005


Powell, Todd wrote:
> However, in all these cases, when the FO is pushed through XEP, the
> character doesn't show in the resulting PDF.
> 
> Any suggestions on what I'm doing wrong.

You need to use a font that has a glyph for that character. On Windows 
2K and XP you can use the Character Map utility to check whether the 
font you're using has a glyph for it. If you are specifying a font list 
that includes a font with the glyph, you may also need to change the 
font selection strategy to "character-by-character". See a recent 
response to this list regarding another font issue for details on font 
selection strategy.

The encoding you use doesn't matter--it's the same abstract character in 
the parsed XML.
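
As a rough illustration of that point, here is a minimal Python sketch 
(Python and the bullet character U+2022 are my choices for the example, 
not anything from your FO). It parses the same one-element document 
written three ways, as a numeric character reference, as UTF-8, and as 
UTF-16, and compares what the parser hands back:

```python
import xml.etree.ElementTree as ET

CHAR = "\u2022"  # BULLET, picked only for illustration

# Three serializations of the same tiny document.
as_reference = b"<p>&#x2022;</p>"
as_utf8 = '<?xml version="1.0" encoding="UTF-8"?><p>\u2022</p>'.encode("utf-8")
# Python's "utf-16" codec writes a byte order mark before the text.
as_utf16 = '<?xml version="1.0" encoding="UTF-16"?><p>\u2022</p>'.encode("utf-16")

# After parsing, each form yields the same abstract character.
texts = [ET.fromstring(doc).text for doc in (as_reference, as_utf8, as_utf16)]
print([f"U+{ord(t):04X}" for t in texts])
```

Whichever form you feed the formatter, what remains is the question of 
whether the selected font has a glyph for that code point.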

The reason your UTF-8 representation looked wrong but the UTF-16 version 
looked correct is that whatever you used to look at the UTF-8 version 
interpreted it as a single-byte encoding (ASCII or, more likely, a 
Windows code page), not UTF-8. This doesn't happen with UTF-16-encoded 
XML documents because they must start with the magic "byte order mark" 
that identifies them as UTF-16. UTF-8 doesn't require a byte order mark 
(but one can be used).
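
To make the byte order mark point concrete, here is a small Python 
sketch. The sample character and the choice of Windows-1252 ("cp1252") 
as the wrong encoding are assumptions for illustration; I don't know 
which encoding your viewer actually fell back to.

```python
import codecs

bullet = "\u2022"  # sample character, assumed for illustration

utf16_bytes = bullet.encode("utf-16")        # Python's "utf-16" codec writes a BOM first
utf8_bytes = bullet.encode("utf-8")          # plain UTF-8: no BOM
utf8_bom_bytes = bullet.encode("utf-8-sig")  # UTF-8 with the optional BOM

has_utf16_bom = utf16_bytes[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
has_utf8_bom = utf8_bytes.startswith(codecs.BOM_UTF8)

# Reading the UTF-8 bytes as a single-byte code page gives the "garbage" view:
# one character turns into three unrelated ones.
misread = utf8_bytes.decode("cp1252")

print(has_utf16_bom, has_utf8_bom, len(misread))
```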

UTF-8 is designed to be compatible with ASCII: the first 128 characters 
of Unicode (U+0000 through U+007F) are encoded in UTF-8 as the same 
single bytes as in ASCII. Thus, if the tool opening the file doesn't 
know it's UTF-8 and tries to guess, it might guess wrong if all it sees 
are ASCII characters. Most encoding guessers look at the first 1000 or 
so characters: if they find a non-ASCII UTF-8 sequence they conclude the 
file is UTF-8, and if they don't, they assume ASCII. But, as in your 
case, they can assume wrong, and then you see what you saw--apparent 
garbage characters.
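
A sketch of that kind of guesser, in Python. The function name 
guess_encoding and the 1000-byte window are hypothetical, chosen to 
mirror the description above; real tools use their own heuristics.

```python
import codecs

def guess_encoding(data: bytes, sample_size: int = 1000) -> str:
    """Naive guess based only on the first sample_size bytes (hypothetical helper)."""
    sample = data[:sample_size]
    if all(byte < 0x80 for byte in sample):
        return "ascii"  # nothing but ASCII bytes seen, so assume ASCII
    try:
        # final=False tolerates a multi-byte sequence cut off by the sample window
        codecs.getincrementaldecoder("utf-8")().decode(sample, final=False)
    except UnicodeDecodeError:
        return "unknown"
    return "utf-8"

# Non-ASCII character inside the sample window: guessed correctly.
early = "\u2022 item".encode("utf-8")
# Same character, but only after 2000 ASCII bytes: the guesser never sees it.
late = b"x" * 2000 + "\u2022".encode("utf-8")

print(guess_encoding(early), guess_encoding(late))
```

The second file is valid UTF-8, but a guesser like this labels it ASCII 
because the only non-ASCII character lies beyond the bytes it sampled.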

Using a tool like SC Unipad or Textpad, you can force the encoding on 
open and then get an accurate picture of your file's contents.

Cheers,

Eliot
-- 
W. Eliot Kimber
Professional Services
Innodata Isogen
9390 Research Blvd, #410
Austin, TX 78759
(512) 372-8155

ekimber at innodata-isogen.com
www.innodata-isogen.com
