FoxTrot Search Forum
FoxTrot Search for macOS Forum

Broken words at the end of line (How to remove the carriage return)
Re: Broken words at the end of line [message #1275 is a reply to message #1274] Thu, 30 September 2021 15:27
FoxTrot Engineering
Messages: 420
Registered: April 2020
Senior Member
In fact, as far as I know, the PDF standard (in its recent versions) allows creating documents that handle hyphenated words correctly; for example, the German word "Drucker" can be displayed as "Druk-" (with a k) followed by "ker" (with a second k) on the next line, while still returning "Drucker" (with ck) to applications that process the content. However, I don't know how to create such a PDF file, and I can't find a sample file to download.
Your example file, as well as the PDF files I tried to generate using TextEdit, Pages, Word or LibreOffice, do not use this feature; they generate two distinct words separated by a hyphen-minus character (U+002D, the plain old -) or a hyphen character (U+2010). Xpdf (and probably Acrobat too) then uses a hack that deletes the last character of a line when it is a hyphen-minus, which will probably be fine in your case. However, I am quite surprised and disappointed that such a hack is still needed in 2021, for a file format that is nearly 30 years old.
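
To make the hack concrete, here is a minimal Python sketch of that kind of end-of-line dehyphenation. It is not Xpdf's actual code: the function name join_hyphenated_lines and the HYPHENS constant are made up for illustration, and treating U+2010 the same way as the hyphen-minus is my own assumption.

```python
# Hypothetical illustration only, not taken from Xpdf or any other PDF tool.
# Assumption: both hyphen-minus (U+002D) and hyphen (U+2010) mark a broken word.
HYPHENS = ("-", "\u2010")


def join_hyphenated_lines(lines):
    """Join extracted text lines into one string.

    A line ending with a hyphen is glued directly to the next line
    (the hyphen is dropped); other lines are joined with a single space.
    """
    parts = []
    last_index = len(lines) - 1
    for i, line in enumerate(lines):
        text = line.rstrip()
        if i != last_index and text.endswith(HYPHENS):
            # Drop the trailing hyphen and add no space before the next line.
            parts.append(text[:-1])
        else:
            parts.append(text)
            if i != last_index:
                parts.append(" ")
    return "".join(parts)


if __name__ == "__main__":
    print(join_hyphenated_lines(["how to remove the carriage re-", "turn"]))
    print(join_hyphenated_lines(["Druk-", "ker"]))
```

The first call rebuilds "return" as expected. The second one yields "Drukker" rather than "Drucker", and a legitimate compound such as "well-" / "known" would come out as "wellknown", which is why I call this a hack: the information needed to restore the original word is simply not in the file.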


Jérôme - FoxTrot Engineering
 