FoxTrot Search Forum
FoxTrot Search for macOS Forum

Can we search url links in html files using Gumbo, without indexing as txt file? [message #1853] Fri, 27 September 2024 11:05
Atlas
Messages: 140
Registered: August 2009
Senior Member
I have lots of HTML files with hyperlinks in them, and I can search for the names of the links.  However, I just realized that I cannot search for the URLs of the links inside HTML files.  This occurs when I use Gumbo to index HTML files.  Is the only solution to index the HTML file as raw text?  I could do that, but there are several drawbacks to this approach, as documented here: https://forum.foxtrot-search.com/index.php?t=msg&goto=1547&&srch=html#msg_1547
Re: Can we search url links in html files using Gumbo, without indexing as txt file? [message #1855 is a reply to message #1853] Fri, 27 September 2024 15:37
FoxTrot Engineering
Messages: 406
Registered: April 2020
Senior Member
Starting with version 8.0.4, link URLs will be indexed (as part of the other metadata, not the main contents) when using Gumbo for HTML files. The 8.0.4 beta should be released soon.
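
To illustrate the idea of keeping link URLs apart from the main contents, here is a minimal sketch. It is not FoxTrot's code and does not use Gumbo: it assumes Python's standard `html.parser` as a stand-in parser, and the names `LinkMetadataExtractor`, `extract`, `contents` and `link_urls` are made up for this example. It also does not skip `<script>` or `<style>` text, which a real indexer presumably would.

```python
from html.parser import HTMLParser


class LinkMetadataExtractor(HTMLParser):
    """Collects visible text and link URLs into two separate fields."""

    def __init__(self):
        super().__init__()
        self.text_parts = []  # visible text, standing in for the "main contents"
        self.link_urls = []   # href values, kept apart as "metadata"

    def handle_starttag(self, tag, attrs):
        # Record the href of each <a> tag, if it has one.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.link_urls.append(href)

    def handle_data(self, data):
        self.text_parts.append(data)


def extract(html):
    """Return the text and the link URLs as two independent fields."""
    parser = LinkMetadataExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so the text reads as one line.
    text = " ".join("".join(parser.text_parts).split())
    return {"contents": text, "link_urls": parser.link_urls}


doc = extract('<p>See the <a href="https://example.com/manual">manual</a> for details.</p>')
print(doc["contents"])   # See the manual for details.
print(doc["link_urls"])  # ['https://example.com/manual']
```

With this split, a search for "example.com" can match the `link_urls` field even though the URL never appears in `contents`.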

Jérôme - FoxTrot Engineering
Re: Can we search url links in html files using Gumbo, without indexing as txt file? [message #1856 is a reply to message #1855] Fri, 27 September 2024 19:01
Atlas
If the URL is indexed separately from the main file contents, will the URL be highlighted when viewing the HTML file in FoxTrot Preview?
Re: Can we search url links in html files using Gumbo, without indexing as txt file? [message #1857 is a reply to message #1856] Fri, 27 September 2024 19:18
FoxTrot Engineering
No, as the URL of a link is not displayed at all in the HTML preview. However, if you switch to plain text preview (either permanently using the toolbar popup menu, or temporarily by option-clicking the found file in the result list), then a header shows all the indexed metadata and yes, the found URL will be highlighted in this header. But there will be no indication of the display name of the URL, or of its location in the document.

A different implementation would be to insert the URL into the plain text content, after the link name (e.g. between brackets); however I don't think this should be the default behavior, and there are already too many advanced settings.
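
As a rough sketch of that alternative (again not FoxTrot's implementation, and using Python's standard `html.parser` instead of Gumbo as an assumption), the URL can be appended in brackets right after the link name while the plain text is being built. The names `InlineLinkText` and `html_to_text_with_urls` are hypothetical.

```python
from html.parser import HTMLParser


class InlineLinkText(HTMLParser):
    """Builds plain text where each link name is followed by its URL in brackets."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.open_hrefs = []  # hrefs of <a> tags that are not closed yet

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Push None when the anchor has no href, so pops stay balanced.
            self.open_hrefs.append(dict(attrs).get("href"))

    def handle_endtag(self, tag):
        if tag == "a" and self.open_hrefs:
            href = self.open_hrefs.pop()
            if href:
                # Emit the URL right after the link name.
                self.parts.append(f" [{href}]")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text_with_urls(html):
    """Return plain text with each link URL inserted after its link name."""
    parser = InlineLinkText()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so the text reads as one line.
    return " ".join("".join(parser.parts).split())


print(html_to_text_with_urls(
    '<p>See the <a href="https://example.com/manual">manual</a> for details.</p>'
))
# See the manual [https://example.com/manual] for details.
```

Because the URL becomes part of the text itself, a match on it would sit next to the link name, at the cost of text that differs from what the HTML preview displays.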


Jérôme - FoxTrot Engineering
Re: Can we search url links in html files using Gumbo, without indexing as txt file? [message #1858 is a reply to message #1857] Sat, 28 September 2024 09:03
Atlas
1.  I think it would make sense to embed the URL into the plain text content, because the URL IS part of the plain text content.  If we open the HTML file in a plain text editor, the URL is clearly visible.  This is actually a value-added feature I would pay for in upgrades, because I have quite a few HTML files, and I didn't realize until now that the URLs in them have been unsearchable.  However, I understand that this could confuse some people.

2.  If it takes only a bit more effort to embed the URL, then I would vote for that.  But if it's going to be very complex and error-prone, then I would rather have the originally proposed solution (index the URLs only as metadata).

Thanks for your consideration of this issue.