Re: Display of results is in not formatted text [message #943 is a reply to message #941]
Thu, 28 November 2019 16:34
FoxTrot Engineering
Messages: 406 Registered: April 2020
> 2/ How can I know if I have too many documents (I noticed that the research engine is quite slow).
There is no arbitrary limit to the number of documents you index, nor to their size.
Regarding performance, are you talking about indexing speed or search speed? And what do you mean by "quite slow"?
Indexing a large collection of large files can take some time (possibly hours). Updating the index, however, should usually not take more than a few minutes, as most of the files have probably not changed since the last update. Version 7 (a beta version will be made available here soon) will be much faster than version 6 for indexing.
Some specific documents can deteriorate the index file, making indexing, updating and searching much slower than they should be. We call these "resource hogs" («boulets» in the French version), and you can check whether you have any at the bottom of the indexed locations list, in the "indexed data" pane of the "manage indices" window. Having a few resource hog files is generally not a problem, but having a lot of them is. Hog files can be large files containing non-linguistic text (logs, database dumps, large tables of numerical values, hexadecimal, base64 or other encoded data…), binary files with a text-file filename extension, PDF files (usually OCR'ed documents) where many words are either concatenated or split into multiple parts, files parsed using an incorrect character set, etc.
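If you want to look for likely hog files yourself before indexing, the description above can be turned into a rough check. The sketch below is only an illustration in Python and is not how FoxTrot detects resource hogs: the function names, the 25-character word limit and the 0.5 threshold are all assumptions chosen for the example. It flags data that contains NUL bytes (binary content behind a text extension) or in which more than half of the tokens do not look like ordinary words.

```python
import re

# Assumption for this sketch: an "ordinary word" is made of letters only,
# optionally joined by apostrophes or hyphens, and is at most 25 characters.
WORD_RE = re.compile(r"[^\W\d_]+(?:['\-][^\W\d_]+)*")
MAX_WORD_LEN = 25
PUNCTUATION = ".,;:!?()\"'"


def non_linguistic_ratio(text: str) -> float:
    """Return the fraction of whitespace-separated tokens that do not look like words."""
    tokens = text.split()
    if not tokens:
        return 0.0
    odd = 0
    for token in tokens:
        word = token.strip(PUNCTUATION)
        if len(word) > MAX_WORD_LEN or not WORD_RE.fullmatch(word):
            odd += 1
    return odd / len(tokens)


def looks_binary(data: bytes) -> bool:
    """Treat data containing NUL bytes as binary, whatever the file extension says."""
    return b"\x00" in data


def is_hog_candidate(data: bytes, threshold: float = 0.5) -> bool:
    """Flag data that is binary or mostly made of non-word tokens (hypothetical threshold)."""
    if looks_binary(data):
        return True
    text = data.decode("utf-8", errors="replace")
    return non_linguistic_ratio(text) > threshold
```

For example, is_hog_candidate(b"4f2a9c 00ff13 deadbeef 7e7e7e 0a0b0c") returns True, while an ordinary English sentence returns False. A file flagged this way is only a candidate; the list in the "indexed data" pane remains the reference.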
If searching is slow, make sure that the relevance slider (in the relevance categorizer, in the left column of the search window) is not set to "all". Some specific queries can be slow, e.g. when using wildcard (*) patterns that match a very large number of different words, when using [includes neighboring words] along with very common words, or when using some filters (e.g. regular expressions). Otherwise, searching should rarely take more than a few seconds.
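To see why a broad wildcard costs more than a narrow one, here is a small Python illustration. It says nothing about FoxTrot's internals: the vocabulary list and the expand_wildcard helper are made up for the example, and matching is done with the standard fnmatch module. The point is only that each word matched by the pattern becomes one more term the search has to look up.

```python
from fnmatch import fnmatchcase


def expand_wildcard(pattern: str, vocabulary: list[str]) -> list[str]:
    """Return every word in the (hypothetical) index vocabulary matching the pattern."""
    return [word for word in vocabulary if fnmatchcase(word, pattern)]


# Tiny made-up vocabulary; a real index holds far more distinct words.
vocabulary = ["index", "indexes", "indexing", "indices", "search", "searching", "slider"]

narrow = expand_wildcard("indexi*", vocabulary)  # specific prefix, few matches
broad = expand_wildcard("*s*", vocabulary)       # matches any word containing "s"

print(len(narrow), "term(s) for 'indexi*'")
print(len(broad), "term(s) for '*s*'")
```

On this toy list the narrow pattern expands to one word and the broad one to five; on a large index the gap can be far wider, which is why a more specific pattern usually searches faster.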
> 3/ Since the beginning I am trying to save my docs in pdf format as much as possible, thinking it would be the best for the efficiency/velocity of the research engine, is it the case?
Not necessarily; it really depends on the original format of your documents, and how you convert them to PDF. FoxTrot has some specific features when displaying PDF files (support of the table of contents, display of page thumbnails), but otherwise it should have decent support for other formats (docx, html…) and may index them faster.
Jérôme - FoxTrot Engineering