[xep-support] Reduction of size of the generated PDF's

W. Eliot Kimber ekimber at innodata-isogen.com
Thu Oct 19 05:50:45 PDT 2006


paritosh.mahato at exxonmobil.com wrote:
> Hi,

> If you have any suggestions or documentation that would help us understand
> the technical details and how to resolve this, please reply.

David will probably have more to say, but it's difficult to know why your 
PDFs are so much bigger without seeing the details of the data going 
into the PDFs as well as the various PDF creation options. For example, 
do the big PDFs have bit-mapped (raster) graphics at high resolution? 
Are you embedding entire fonts without subsetting? Are the text streams 
in the PDF compressed or uncompressed?
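
As a rough way to answer those three questions without opening the file 
in a viewer, here is a minimal Python sketch that counts a few markers 
in the raw PDF bytes. It is a heuristic only: the function name and the 
dictionary keys are my own invention, it does not parse the PDF, and it 
will miss dictionaries that are stored inside compressed object streams.

```python
import re


def summarize_pdf_bytes(data: bytes) -> dict:
    """Count a few telltale markers in raw PDF bytes (heuristic only)."""
    return {
        # "stream" keyword followed by an end-of-line; the \b keeps
        # "endstream" from being counted as well.
        "streams": len(re.findall(rb"\bstream\r?\n", data)),
        # Streams compressed with Flate (zlib/deflate).
        "flate_filters": len(re.findall(rb"/FlateDecode\b", data)),
        # JPEG-encoded image data.
        "jpeg_filters": len(re.findall(rb"/DCTDecode\b", data)),
        # Raster image XObjects.
        "images": len(re.findall(rb"/Subtype\s*/Image\b", data)),
        # References to embedded font programs.
        "embedded_font_files": len(re.findall(rb"/FontFile[23]?\b", data)),
    }


# Usage (the file name is hypothetical):
#   with open("output.pdf", "rb") as f:
#       print(summarize_pdf_bytes(f.read()))
```

If "streams" is much larger than "flate_filters" plus "jpeg_filters", a 
good share of the streams are probably stored uncompressed.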

In general, the most efficient way to get minimal-sized PDFs is to 
generate the PDF, open it in Acrobat and save it again, turning on all 
the options that minimize size (which can include down-sampling 
high-resolution graphics).

Also, it is usually graphics that determine the final PDF size. Since 
the text content of the input and the PDF will be about the same, you 
should expect a PDF without graphics to be about the same size as the 
original XML data coming into it (assuming a more or less one-to-one 
mapping from input elements to output pages). That is, the space taken 
up by the markup in the XML will be more or less matched by all the 
non-text formatting data in the generated PDF, with the text content 
taking up about the same amount of space: no more than 2 or 3 times as 
much at most, depending on variables like the original encoding of the 
XML files (UTF-16 takes twice as much disk space as UTF-8 for non-Asian 
content) and how the PDF text streams are encoded in the PDF (they can 
be clear text or compressed, which can make a big difference).
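
To make the encoding and compression point concrete, here is a small 
Python sketch that measures the same text three ways. The function name 
and sample text are made up for illustration; zlib stands in for the 
Flate compression that PDF streams can use, and real documents will 
compress less dramatically than highly repetitive sample text.

```python
import zlib


def text_sizes(text: str) -> dict:
    """Return the byte size of text under different encodings."""
    utf8 = text.encode("utf-8")
    return {
        "utf8": len(utf8),
        # Little-endian UTF-16 without a byte-order mark.
        "utf16": len(text.encode("utf-16-le")),
        # Deflate-compressed UTF-8, comparable to a Flate-encoded stream.
        "utf8_deflated": len(zlib.compress(utf8)),
    }


sizes = text_sizes("Reduction of size of the generated PDFs. " * 200)
print(sizes)
```

For plain ASCII text the UTF-16 figure is exactly double the UTF-8 
figure, and the deflated figure is smaller than either.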

So if you generate the PDF without graphics and the resulting PDF is no 
more than twice the size of the input XML, you know it's the graphics 
that are causing your size problem. In that case you probably need to 
downsample your graphics to reduce their size or, if the graphics 
require that level of resolution for quality purposes, accept the size 
(but since you said your old PDFs were much smaller I assume that's not 
the case). Note that Distiller and Acrobat can automatically downsample 
graphics when they create PDF and I think do so by default, so that 
could account for most or all of the difference you're seeing.
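
The rule of thumb above can be written down as a quick check. This is 
only a sketch of that rule: the function name, the return labels, and 
the default threshold of 2.0 are my own choices, not anything XEP or 
Acrobat provides.

```python
def likely_cause(xml_size: int, pdf_no_graphics_size: int,
                 max_ratio: float = 2.0) -> str:
    """Apply the 'no more than twice the XML size' rule of thumb.

    Both sizes are in bytes. Returns "graphics" when the graphics-free
    PDF is within max_ratio of the XML size, otherwise "text-or-fonts".
    """
    if xml_size <= 0:
        raise ValueError("xml_size must be positive")
    ratio = pdf_no_graphics_size / xml_size
    return "graphics" if ratio <= max_ratio else "text-or-fonts"


# Usage (file names are hypothetical):
#   import os
#   print(likely_cause(os.path.getsize("input.xml"),
#                      os.path.getsize("output-no-graphics.pdf")))
```

A "text-or-fonts" result suggests looking at font embedding and stream 
compression before touching the graphics.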

Other things may be at work. For example, I had a client that wanted to 
have a gradient bar on the edges of their pages. The graphic they gave 
me was a 1.5Mbyte bitmap. I replaced it with a hand-coded 10-line EPS 
file that achieved the same result. That resulted in significantly 
smaller PDFs but had nothing to do with the composition engine.

There are various ways to automate the post-processing of PDFs using 
both Adobe-supplied and 3rd-party tools: see www.planetpdf.com for all 
the tools.

Cheers,

Eliot

-- 
W. Eliot Kimber
Professional Services
Innodata Isogen
9390 Research Blvd, #410
Austin, TX 78759
(214) 954-5198

ekimber at innodata-isogen.com
www.innodata-isogen.com
