FoxTrot Search Forum
FoxTrot Search for macOS Forum

Re: Do html files rely on Spotlight index? [message #1353 is a reply to message #1352] Sat, 19 February 2022 10:51
FoxTrot Engineering
Messages: 420
Registered: April 2020
Senior Member
FoxTrot does not rely on Spotlight's index; it does, however, use Spotlight's metadata importers to extract indexable text from files. So yes, if for some reason a Spotlight importer does not work correctly, files handled by that importer won't be indexed correctly by either Spotlight or FoxTrot.
For HTML files, FoxTrot has a fallback extractor (Gumbo) that is used when Spotlight's importer does not return any data at all for a given HTML file (this sometimes happens). However, if Spotlight's importer returns some partial or garbled data, then FoxTrot indexes that.
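To make the fallback rule concrete, here is a minimal Python sketch of the decision described above. It is not FoxTrot's actual code: the function name and the two extractor callables are hypothetical stand-ins, and treating both `None` and an empty string as "no data at all" is an assumption.

```python
from typing import Callable, Optional

# An extractor takes a file path and returns the extracted plain text,
# or None / "" when it produced nothing. Both extractors are hypothetical
# stand-ins for Spotlight's HTML importer and the Gumbo-based extractor.
Extractor = Callable[[str], Optional[str]]


def extract_html_text(
    path: str,
    spotlight_importer: Extractor,
    fallback_extractor: Extractor,
) -> Optional[str]:
    """Return the text that would be indexed for an HTML file."""
    text = spotlight_importer(path)
    if text:
        # Any data at all is accepted as-is, even if partial or garbled.
        return text
    # Only a completely empty result triggers the fallback.
    return fallback_extractor(path)
```

The point of the sketch is the asymmetry: a partial or garbled result from the first extractor is never compared against the fallback, which is why a misbehaving importer can still lead to missing text in the index.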
You can option-click a file from FoxTrot's search result list (when searching by filename, for example) to see the plain text that was extracted from a given file.
What do you get for the file you attached, on both machines? Do these machines have the same version of macOS? A quick test here with your files shows very different results on macOS 10.14 (most visible text is actually indexed, as well as a bunch of base64-encoded images) and macOS 12 (no base64 data, but much of the text is missing). Interestingly, saving the source file from stackoverflow using the current version of SingleFile creates a file that can be extracted correctly on macOS 12.
FoxTrot 7.5 will allow using Gumbo instead of Spotlight's importer for all HTML files, or for specific HTML files. This will, however, require using Terminal.app, since Spotlight's importer is supposed to be fast and to give good results in most cases.


Jérôme - FoxTrot Engineering
 

