FoxTrot Search Forum
FoxTrot Search for macOS Forum

Home » Public Forums » FoxTrot Search User Forum » Indexing PDFs with lots of text
Re: Indexing PDFs with lots of text [message #1314 is a reply to message #1311] Sat, 27 November 2021 17:42 Go to previous messageGo to previous message
Grant Barrett
Messages: 35
Registered: October 2019
Member
I put together a script using pdftotext that counts characters in files. I ran it on 10 problem files. Six of the 10 are well under 5,000,000 characters, which means that something else is amiss. Here are the counts:

1554967
2167651
2293207
2458849
2903970
3388007
5458417
7954470
11318310
11481914

I have been, as you mention, option-clicking to look at the parsed text in FoxTrot. I have read the FAQ and taken all steps proposed there. I am now in the process of trying to determine what makes these other <5,000,000-character files problem files. I can make them, and the odd coverpage one, available to you in a private link.
 
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Previous Topic: BUG: Searching with OR operator and quotation mark
Next Topic: pdf documents with search highlights
Goto Forum:
  


Current Time: Mon Jul 07 15:58:39 GMT+2 2025