[xep-support] Re: Creating embedded index in PDF for faster searching?

Mark Giffin mgiffin at earthlink.net
Wed May 7 19:02:12 PDT 2014


Sounds interesting! Do you have any idea what the official Adobe name 
for this feature is?

Mark

On 5/7/14 9:57 AM, David Clunie wrote:
> Hi Mark
>
> The feature I am describing is quite distinct from the separate
> "Catalog" feature that I think you are referring to, which produces
> separate index files, and is not what I want at all.
>
> Rather, I am referring to an embedded index within each PDF file.
>
> This greatly accelerates using the Find function when an individual
> PDF file is opened, as well as greatly accelerating the Search All
> PDF Documents in (folder) function when a bunch of files need to
> be searched, which allows the user to quickly find stuff without
> having to mess with configuration of separate catalogs.
>
> David
>
> On 4/30/14 8:56 PM, Mark Giffin wrote:
>> I don't think Word can do this. Adobe Acrobat Professional can do this
>> and I agree, the index it produces is vastly faster, and it will also
>> index a whole bunch of separate PDF files in one index. It's an old
>> feature (used to be called "Catalog") that Adobe doesn't seem to talk
>> about anymore. If you want to automate it you might look at Adobe
>> ExtendScript for Acrobat. ExtendScript is Adobe's JavaScript-based
>> scripting language for products like Photoshop, FrameMaker etc. but I
>> don't know if Acrobat supports it. But if it does you could probably
>> write a small script to kick off this Catalog indexing, and if you're
>> really lucky there may be a way to kick it off from the command line, so
>> you could incorporate it into your PDF build process.
>>
>> Mark Giffin
>> http://markgiffin.com/
>>
>> On 4/30/14 5:13 PM, David Clunie wrote:
>>> That's a bit disappointing. If Word can do it, it would be nice
>>> if RenderX could too (as a post-processing step if necessary),
>>> since doing it manually in Acrobat afterwards is painful, and
>>> I couldn't find a command line tool to do it.
>>>
>>> David
>>>
>>> On 4/20/14 5:45 PM, Kevin Brown wrote:
>>>> This is not supported by RenderX and there are no plans to add it. This is
>>>> an operation best performed after the entire document is created and not
>>>> "as" it is being created.
>>>>
>>>>
>>>> Kevin Brown
>>>> (650) 327-1000 Direct
>>>> (650) 328-8008 Fax
>>>> (925) 395-1772 Mobile
>>>> skype:kbrown01
>>>> kevin at renderx.com
>>>> sales at renderx.com
>>>> http://www.renderx.com
>>>>
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: xep-support-bounces at renderx.com
>>>> [mailto:xep-support-bounces at renderx.com] On Behalf Of David Clunie
>>>> Sent: Wednesday, April 16, 2014 6:10 AM
>>>> To: xep-support at renderx.com
>>>> Subject: [xep-support] Creating embedded index in PDF for faster
>>>> searching?
>>>>
>>>> Hi
>>>>
>>>> I am creating quite large PDF files that users frequently search within,
>>>> and the searches are relatively slow.
>>>>
>>>> I am using the ENABLE_ACCESSIBILITY option in xep.xml to create tagged PDF.
>>>>
>>>> If I load these into Acrobat and then use Advanced > Document
>>>> Processing > Manage Embedded Index > Create Index, then the result is a
>>>> MUCH faster search.
>>>>
>>>> However, I would rather generate these in the pipeline with XEP (or an
>>>> additional pass with some other command line tool if anyone knows of one).
>>>>
>>>> I couldn't find anything in the manual about this, or any obvious option.
>>>>
>>>> David
>>
>>
>>
>

