[xep-support] Inter-document references

Jim Melton jim.melton at acm.org
Sat Apr 19 11:16:42 PDT 2003


Nikolai,

Your proposed strategy works wonderfully!  I am not posting my actual XSLT 
code because it is so specialized to my requirements that it probably 
wouldn't help anybody else.  However, if anybody reading this list wishes 
to see it, I have no objections to my sending it to them.

Your note outlining the strategy asked why I need a separate "symbol" file; 
one possible reason has come to light.  The 
documents in my suite of parts add up to about 300,000 lines of source text 
(several megabytes of XML).  An XSLT transformation from XML to XSL FO that 
formerly took under 10 seconds now takes about 45 seconds.  That is 
approximately a 5:1 reduction in speed.  I do not currently believe that 
this will be a significant problem for me (because these conversions are 
done at most several times a month, not several times a day).  However, I 
think that the strategy of using a separate "symbol" file would cause a 
much smaller loss of speed.
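
To illustrate what I mean by a "symbol" file, here is a minimal sketch (not 
my actual code) of a stylesheet that could generate one for each part.  It 
assumes the chapter/section/title/@id structure used in your sample; the 
"symbols" and "symbol" element names are invented for the example.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Emit one <symbol> (hypothetical element name) for every chapter
       or section that carries an id attribute -->
  <xsl:template match="/">
    <symbols>
      <xsl:for-each select="//*[@id][self::chapter or self::section]">
        <symbol id="{@id}" title="{title}">
          <!-- Hierarchical number of this chapter/section, e.g. 4.2 -->
          <xsl:attribute name="number">
            <xsl:number format="1.1" level="multiple"
                        count="chapter | section"/>
          </xsl:attribute>
        </symbol>
      </xsl:for-each>
    </symbols>
  </xsl:template>

</xsl:stylesheet>
```

The "real" stylesheet could then look up a reference with something like 
document('part2-symbols.xml')/symbols/symbol[@id = $ref] (the file name is 
hypothetical), so that only the small symbol files, rather than the full 
source of every part, would have to be loaded.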

Now, I have a few follow-on questions for both Nikolai and David:  The need 
to create "named destinations" in my PDF files in order to have my links 
open the PDF file at the correct place suggests that additional PDF code 
will be generated by XEP at certain places in the PDF file.

1) Will *every* place in my source XML file that has an ID attribute result 
in one of those "named destinations" in the corresponding PDF file?  Or 
will it be only those places for which I explicitly generate some XSL FO 
code (e.g., an extension in the rx: namespace)?  Or will it be only those 
places for which there is actually some reference?  (The last choice seems 
unlikely, because the references will be in a different document entirely.)

2) How "big" (e.g., how many bytes) will the PDF code for a single "named 
destination" be?  As David may recall, I have already encountered a problem of 
extreme PDF file sizes caused by vast numbers of internal references, which 
I have currently disabled (but would like to re-enable when possible).  If 
every element with an ID attribute results in adding 50 or 100 bytes (never 
mind 200 or 300 bytes!) to the PDF file, this is likely to be prohibitive!

3) Have you guys made any progress in reducing the size of the PDF code 
corresponding to a "hot link" within a single PDF file?  When I last used 
that so prolifically, it was something like 250 to 300 bytes per 
reference.  You told me then that you might be able to reduce to as little 
as 150 bytes, which would make a very significant difference to me.  Can 
you update me on this situation?

Again, MANY THANKS!
    Jim

At 10:47 2003-04-18 +0400 Friday, you wrote:
>Jim,
>
> > The only idea I've had so far is to write a new XSLT stylesheet that
> > processes each of my documents (one at a time, manually --- since I
> > haven't gotten around to worrying about scripting issues) and outputs
> > some sort of file containing nothing beyond the "values" of every
> > symbol in that document.  Then, when I'm processing document 5 through
> > my "real" stylesheet, it would somehow access all of the "symbol"
> > files and pick up the chapter/section number and title.
> >
> > Question 1) Does that seem like a reasonable approach?  Are there better
> > approaches that I have managed to overlook or suppress?
>
>I wonder why you need a separate "symbol" file. If all parts are similar
>in their structure, you can set up a generic template to extract
>section/chapter references from both local and remote documents. To
>automate this, you will need an index of all parts to map part numbers
>to file names, conveniently specified in a separate XML file. More or
>less like this (untested, just to show the idea):
>
>parts-index.xml:
>
><parts>
>    <part number="1" file="SQL-20030418-part1-rev3"/>
>    <part number="2" file="SQL-20030310-part2-rev12"/>
>    ...
></parts>
>
>
>Stylesheet (very roughly):
>
><!-- Global variable holding the <part> entries from parts-index.xml -->
><xsl:variable name="parts"
>               select="document('parts-index.xml')/parts/part"/>
>
><!-- Template to build section name -->
><xsl:template name="get-section-name">
>     <xsl:param name="root" select="/"/>
>     <xsl:param name="ref"/>
>
>     <xsl:for-each
>         select="$root//*[@id=$ref][self::chapter or self::section]">
>         <!-- Numbering is counted within the document that holds the
>              node; a variable is not a valid "from" pattern -->
>         <xsl:number format="1.1.1.1.1.1. " level="multiple"
>                     count="chapter | section"/>
>         <xsl:value-of select="title"/>
>     </xsl:for-each>
></xsl:template>
>
><!-- Local link: root is /, internal destination
>     (assumes a local docref carries no part attribute) -->
><xsl:template match="docref[not(@part)]">
>     <fo:basic-link color="blue"
>                    internal-destination="{@ref}">
>         <xsl:call-template name="get-section-name">
>             <xsl:with-param name="ref" select="@ref"/>
>         </xsl:call-template>
>     </fo:basic-link>
></xsl:template>
>
><!-- Remote link: root is retrieved by document(), external destination -->
><xsl:template match="docref[@part]">
>     <!-- Look up the file name of the part this docref points to -->
>     <xsl:variable name="filename"
>                   select="$parts[@number = current()/@part]/@file"/>
>     <fo:basic-link color="blue"
>         external-destination="url(file://{$filename}.pdf#{@ref})">
>         <xsl:call-template name="get-section-name">
>             <xsl:with-param name="root"
>                 select="document(concat($filename, '.xml'))"/>
>             <xsl:with-param name="ref" select="@ref"/>
>         </xsl:call-template>
>     </fo:basic-link>
></xsl:template>
>
> > Question 2) Would it be better if that "symbol" file were a plain text
> > file (how would I access such information in XSLT?) or an XML file
> > (something like <part2References><reference number="4"
> > title="Concepts"/><reference.../></part2References>)?  I assume that
> > the information contained in an XML file would be more easily accessed
> > in my XSLT.
>
>That's true. But my suggestion is to work directly from the source files,
>without intermediate data. Using the document() function, you can work
>conveniently with multiple source files.
>
> > Question 3) Once I have that information, how can I turn the text (e.g.,
> > Section 4.2, "Data types", in Part 2) into a "hot link" that will cause
> > Acrobat to open the file containing part 2 and position it at the start of
> > Section 4.2 --- or at least on the same page?  (I'm aware of the
> > external-destination attribute, but have not been successful at making
> > referenced PDF documents open at the right place.)
>
>This requires support for named destinations in PDF files produced
>by XEP. It is not yet available in the current version, but it is already
>implemented and under final testing, so it will be delivered to XEP
>users in a few weeks at most. The syntax will be as follows:
>
>external-destination="url(file://somefile.pdf#someplace)"
>
>will open PDF file 'somefile.pdf' in the same Acrobat window (without
>going to a browser) and jump to a named destination 'someplace' inside it.
>Named destinations will be created by @id attributes in the source FO;
>so the above URL would bring you to the same place as
>internal-destination="someplace" in somefile.fo. That's what I tried
>to show in the above sample code.
>
>(I can unveil a secret: the above syntax for file links works even in the
>current version. Named destinations are not created, however, so
>referenced PDF documents always open at the first page.)
>
>Best regards,
>Nikolai Grigoriev
>RenderX
>
>
>-------------------
>(*) To unsubscribe, send a message with words 'unsubscribe xep-support'
>in the body of the message to majordomo at renderx.com from the address
>you are subscribed from.
>(*) By using the Service, you expressly agree to these Terms of Service 
>http://www.renderx.com/tos.html

========================================================================
Jim Melton --- Editor of ISO/IEC 9075-* (SQL)     Phone: +1.801.942.0144
Oracle Corporation            Oracle Email: mailto:jim.melton at oracle.com
1930 Viscounti Drive          Standards email: mailto:jim.melton at acm.org
Sandy, UT 84093-1063              Personal email: mailto:jim at melton.name
USA                                                Fax : +1.801.942.3345
========================================================================
=  Facts are facts.  However, any opinions expressed are the opinions  =
=  only of myself and may or may not reflect the opinions of anybody   =
=  else with whom I may or may not have discussed the issues at hand.  =
========================================================================
