[xep-support] PDF File size

Michael Sulyaev msulyaev at renderx.com
Mon Oct 20 23:56:18 PDT 2008


Justin Lipton wrote:
> Is there a way to make the rendered PDF ignore these redundant inlines 
> such that the PDF output would be identical in both cases?
> We're trying to get PDF unit tests working and this seems to be tripping 
> us up. Stripping them out as part of the FO generation would really 
> complicate our existing XSLT transforms.

Hello Justin,

Generated PDF files are not guaranteed to be byte-for-byte identical, 
even for the same input, if only because they contain date/time fields 
that change on every run. File size may differ as well: compression is 
not expected to produce output of equal size even when the inputs are 
of equal size.

I'd suggest not spending time on unit tests against the PDF itself, and 
using XEP Intermediate Format output for this purpose instead. These 
are text files (XML, in fact), so diffing is fast, and if you need to 
inspect the differences, xmldiff may help.
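
As a rough illustration of the diffing idea, here is a minimal sketch in 
Python (3.8 or later). It canonicalizes two XML documents so that 
attribute order and indentation no longer matter, then produces a 
line-based diff. The function name xml_diff and the element names in the 
usage note are hypothetical placeholders, not the actual XEP 
Intermediate Format vocabulary.

```python
import difflib
import xml.etree.ElementTree as ET


def xml_diff(expected_xml, actual_xml):
    """Return unified-diff lines between two XML strings.

    Both inputs are canonicalized first (C14N 2.0), which sorts
    attributes and, with strip_text=True, trims surrounding whitespace
    in text nodes, so purely cosmetic differences do not show up.
    """
    def normalize(text):
        canon = ET.canonicalize(text, strip_text=True)
        # Break between adjacent tags so the diff is line-oriented.
        return canon.replace("><", ">\n<").splitlines()

    return list(
        difflib.unified_diff(
            normalize(expected_xml),
            normalize(actual_xml),
            "expected",
            "actual",
            lineterm="",
        )
    )
```

An empty result means the two documents are equivalent after 
canonicalization, e.g. xml_diff('<doc><t x="1" y="2">Hi</t></doc>', 
'<doc> <t y="2" x="1">Hi</t> </doc>') returns []. In a unit test you 
would read the expected and actual intermediate files from disk and 
assert that the returned list is empty.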

Another approach is to use a rasterizer (e.g. Ghostscript) to produce 
raster images from the PDF files and compare them pixel by pixel, page 
by page. It sounds a bit complicated, but it works, and it can be 
extended to produce visible 'symmetric' difference images for pages 
that differ, if you need them.
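
A minimal sketch of the rasterize-and-compare idea, in Python with only 
the standard library. It assumes a Ghostscript executable named "gs" is 
on the PATH (on Windows the name is usually different) and uses the 
ppmraw output device because binary PPM is easy to parse by hand. The 
function names, the 72 dpi resolution and the choice of red for changed 
pixels are illustrative assumptions, not part of XEP or Ghostscript.

```python
import subprocess


def rasterize(pdf_path, out_pattern, dpi=72):
    """Render each PDF page to a PPM file, e.g. out_pattern='page-%03d.ppm'.

    Assumes Ghostscript is installed and callable as 'gs'.
    """
    subprocess.run(
        ["gs", "-q", "-dBATCH", "-dNOPAUSE", "-sDEVICE=ppmraw",
         "-r%d" % dpi, "-sOutputFile=" + out_pattern, pdf_path],
        check=True,
    )


def parse_ppm(data):
    """Parse binary (P6) PPM bytes; return (width, height, rgb_bytes)."""
    tokens, pos = [], 0
    while len(tokens) < 4:
        while data[pos:pos + 1].isspace():
            pos += 1
        if data[pos:pos + 1] == b"#":
            # Header comments run to the end of the line.
            pos = data.index(b"\n", pos) + 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])
    width, height, maxval = int(tokens[1]), int(tokens[2]), int(tokens[3])
    if tokens[0] != b"P6" or maxval > 255:
        raise ValueError("only 8-bit binary PPM (P6) is supported")
    # Exactly one whitespace byte separates the header from the pixels.
    pixels = data[pos + 1:pos + 1 + width * height * 3]
    if len(pixels) != width * height * 3:
        raise ValueError("truncated pixel data")
    return width, height, pixels


def diff_pages(ppm_a, ppm_b):
    """Compare two pages; return (changed_pixel_count, diff_image_ppm).

    The diff image is white where the pages agree and red where they
    differ in either direction.
    """
    wa, ha, pa = parse_ppm(ppm_a)
    wb, hb, pb = parse_ppm(ppm_b)
    if (wa, ha) != (wb, hb):
        raise ValueError("page sizes differ")
    out = bytearray(b"\xff" * len(pa))
    changed = 0
    for i in range(0, len(pa), 3):
        if pa[i:i + 3] != pb[i:i + 3]:
            out[i:i + 3] = b"\xff\x00\x00"
            changed += 1
    return changed, b"P6\n%d %d\n255\n" % (wa, ha) + bytes(out)
```

In a test you would call rasterize() on the reference PDF and on the 
freshly generated one, read the matching page files as bytes, and assert 
that diff_pages() reports zero changed pixels for every page. Writing 
the returned diff image to disk for failing pages gives a quick visual 
indication of where the layouts diverge.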

Regards,
Michael Sulyaev
RenderX
