[xep-support] Selectable/searchable phrases in PDF in table column content spanning lines

David Clunie dclunie at dclunie.com
Fri Feb 21 05:32:48 PST 2014


Hi

In PDF table cells from the DocBook FO stylesheets rendered with xep,
if the words in a table cell spread across several lines, then they
are not kept together from the perspective of selection or searching
in PDF viewers.

I have not been able to improve this behavior using any of the FO
"keep-together" options (e.g., applied in "table.cell.block.properties"
template customizations). I use "keep-together.within-column"
successfully to prevent cells breaking across pages, but that does
not affect the described problem.

Using "keep-together.within-line" doesn't solve the problem either
(phrases still get split up); it also causes some tables to overflow
the page margins, so it is unusable for this.
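
For reference, the "keep-together.within-line" attempt amounts to
something like the sketch below. This is a minimal reconstruction, not
the exact customization that was tried: it assumes the attribute is set
in the same "table.cell.block.properties" template used for
"keep-together.within-column", and the stylesheet wrapper is added only
to make the fragment stand on its own.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Sketch only: assumes this overrides the DocBook FO stylesheets'
       table.cell.block.properties template in a customization layer. -->
  <xsl:template name="table.cell.block.properties">
    <!-- Asks the formatter not to break lines inside the cell's block.
         As described above, this overflowed the page margins and did
         not keep phrases together for searching. -->
    <xsl:attribute name="keep-together.within-line">always</xsl:attribute>
  </xsl:template>

</xsl:stylesheet>
```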

In short, I need some way of having the content wrap to fit within the
page (obviously) but remain together from the PDF encoding perspective,
so that phrases are searchable. This is critical for our use case: we
have a standard with many thousands of pages and need to be able to
search for phrases that are (long) names of data elements that are
present in tables and wrapped to fit on a page (e.g., as in the screen
shots, "Shared Functional Groups Sequence").

Since Word can do it, I know PDF can be encoded this way; the question
is how to get xep to do it.

The phrase isn't split in the FO (the text is contained in a single <fo:block>).

The attached screen shots show a phrase that wraps across lines within
a cell being selected in Acrobat, with "good" output from Word and
"bad" output from XEP.

The DocBook fragment for this row is:

<tr valign="top">
  <td align="left" colspan="1" rowspan="1">
   <para>Shared Functional Groups Sequence</para>
  </td>
  <td align="center" colspan="1" rowspan="1">
   <para>(5200,9229)</para>
  </td>
  <td align="center" colspan="1" rowspan="1">
   <para>1</para>
  </td>
  <td align="left" colspan="1" rowspan="1">
   <para>Sequence that contains the Functional Group Macros that are
    shared for all frames in this SOP Instance and Concatenation.</para>
   <note>
    <para>The contents of this sequence are the same in all SOP
     Instances that comprise a Concatenation.</para>
   </note>
   <para>Only a single Item shall be included in this sequence.</para>
   <para>See <xref linkend="sect_C.7.6.16.1.1" xrefstyle="select: label"/>
    for further explanation.</para>
  </td>
</tr>

The customization used is:

<xsl:template name="table.cell.block.properties">
   <xsl:attribute name="keep-together.within-column">always</xsl:attribute>
   <xsl:choose>
     <xsl:when test="ancestor::d:thead or ancestor::d:tfoot">
       <xsl:attribute name="font-weight">bold</xsl:attribute>
     </xsl:when>
     <!-- Make row headers bold too -->
     <xsl:when test="ancestor::d:tbody and
                     (ancestor::d:table[@rowheader = 'firstcol'] or
                     ancestor::d:informaltable[@rowheader = 'firstcol']) and
                     ancestor-or-self::d:entry[1][count(preceding-sibling::d:entry) = 0]">
       <xsl:attribute name="font-weight">bold</xsl:attribute>
     </xsl:when>
   </xsl:choose>
</xsl:template>

and the extract of the FO produced by the DocBook FO stylesheets is:

<fo:table-row>
  <fo:table-cell padding-start="2pt" padding-end="2pt"
      padding-top="2pt" padding-bottom="2pt" text-align="left"
      display-align="before" border-start-style="none"
      border-top-style="none" border-bottom-style="solid"
      border-bottom-width="0.5pt" border-bottom-color="black"
      border-end-style="solid" border-end-width="0.5pt"
      border-end-color="black">
    <fo:block keep-together.within-column="always">
      <fo:block space-before.optimum="1em" space-before.minimum="0.8em"
          space-before.maximum="1.2em">Shared Functional Groups Sequence</fo:block>
    </fo:block>
  </fo:table-cell>
  <fo:table-cell padding-start="2pt" padding-end="2pt"
      padding-top="2pt" padding-bottom="2pt" text-align="center"
      display-align="before" border-start-style="none"
      border-top-style="none" border-bottom-style="solid"
      border-bottom-width="0.5pt" border-bottom-color="black"
      border-end-style="solid" border-end-width="0.5pt"
      border-end-color="black">
    <fo:block keep-together.within-column="always">
      <fo:block space-before.optimum="1em" space-before.minimum="0.8em"
          space-before.maximum="1.2em">(5200,9229)</fo:block>
    </fo:block>
  </fo:table-cell>
  <fo:table-cell padding-start="2pt" padding-end="2pt"
      padding-top="2pt" padding-bottom="2pt" text-align="center"
      display-align="before" border-start-style="none"
      border-top-style="none" border-bottom-style="solid"
      border-bottom-width="0.5pt" border-bottom-color="black"
      border-end-style="solid" border-end-width="0.5pt"
      border-end-color="black">
    <fo:block keep-together.within-column="always">
      <fo:block space-before.optimum="1em" space-before.minimum="0.8em"
          space-before.maximum="1.2em">1</fo:block>
    </fo:block>
  </fo:table-cell>
  <fo:table-cell padding-start="2pt" padding-end="2pt"
      padding-top="2pt" padding-bottom="2pt" text-align="left"
      display-align="before" border-start-style="none"
      border-top-style="none" border-bottom-style="solid"
      border-bottom-width="0.5pt" border-bottom-color="black">
    <fo:block keep-together.within-column="always">
      <fo:block space-before.optimum="1em" space-before.minimum="0.8em"
          space-before.maximum="1.2em">Sequence that contains the Functional Group Macros that are shared for all frames in this SOP Instance and Concatenation.</fo:block>
      <fo:block id="idp140215214853168" space-before.minimum="0.8em"
          space-before.optimum="1em" space-before.maximum="1.2em"
          margin-left="0.25in" margin-right="0.25in">
        <fo:block keep-with-next.within-column="always" font-size="9pt"
            font-weight="bold" hyphenate="false">Note</fo:block>
        <fo:block>
          <fo:block space-before.optimum="1em" space-before.minimum="0.8em"
              space-before.maximum="1.2em">The contents of this sequence are the same in all SOP Instances that comprise a Concatenation.</fo:block>
        </fo:block>
      </fo:block>
      <fo:block space-before.optimum="1em" space-before.minimum="0.8em"
          space-before.maximum="1.2em">Only a single Item shall be included in this sequence.</fo:block>
      <fo:block space-before.optimum="1em" space-before.minimum="0.8em"
          space-before.maximum="1.2em">See <fo:basic-link internal-destination="sect_C.7.6.16.1.1"><fo:inline>Section C.7.6.16.1.1</fo:inline></fo:basic-link> for further explanation.</fo:block>
    </fo:block>
  </fo:table-cell>
</fo:table-row>

Thanks ... David




