Re: BUG: Foxtrot looks like it is not indexing large files completely [message #1601 is a reply to message #1600]
Fri, 23 December 2022 19:18
FoxTrot Engineering
- It seems that Gumbo does indeed have some size limits when parsing large HTML files. I did not find any obvious setting to control this, and I can't tell what these limits are.
- .tsv and .csv files are currently not handled by FoxTrot's built-in text extractor. (That extractor is used for .txt files when the hidden preference PlainTextFileLimitMB or PlainTextPreferredEncodings is set, or when "prefer alternatives: plain text files: FoxTrot built-in" is enabled in the "manage third-party metadata importers" window.) You can, however, use the Aliases hidden preference to have these files handled as .txt files.
- There is a bug that can cause these settings (PlainTextFileLimitMB, PlainTextPreferredEncodings, and "prefer alternatives: plain text files: FoxTrot built-in") to have no effect. I am not sure when this bug was introduced. So even if you set both PlainTextFileLimitMB and Aliases for .tsv files, you may still hit the 10 MB limit of Spotlight's importer.
- No, FoxTrot does not rely on Spotlight itself, just on Spotlight's importers to extract text from various document formats.
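
To check which of your files could run into the 10 MB limit mentioned above, a small script along these lines may help. This is only a rough sketch and not part of FoxTrot: the function name files_over_limit and the extension list are made up for illustration, and it assumes that 1 MB means 1024 * 1024 bytes, which may not match how the importer counts.

```python
import os
import sys
from pathlib import Path

LIMIT_MB = 10  # the Spotlight importer limit mentioned in this post
EXTENSIONS = {".txt", ".tsv", ".csv"}  # illustrative choice, adjust as needed


def files_over_limit(root, limit_mb=LIMIT_MB, extensions=EXTENSIONS):
    """Yield (path, size_in_mb) for matching files larger than limit_mb.

    Assumption: 1 MB is taken as 1024 * 1024 bytes.
    """
    limit_bytes = limit_mb * 1024 * 1024
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.suffix.lower() not in extensions:
                continue
            try:
                size = path.stat().st_size
            except OSError:
                # Skip files that vanish or cannot be read.
                continue
            if size > limit_bytes:
                yield path, size / (1024 * 1024)


if __name__ == "__main__":
    # Usage: python check_sizes.py /path/to/indexed/folder
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path, size_mb in files_over_limit(root):
        print(f"{size_mb:8.1f} MB  {path}")
```

Any file it lists is one whose content may only be partially extracted while the bug described above is present.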
Jérôme - FoxTrot Engineering