|
Re: Which is better? Several small/medium indexes or one huge/massive index? [message #676 is a reply to message #674] |
Thu, 01 February 2018 10:27 |
FoxTrot Engineering
Messages: 404 Registered: April 2020
|
Senior Member |
|
|
'Ajk Sanders' via foxtrot-search wrote:
> Which is better?
>
> Several small/medium indexes or one huge/massive index?
It depends…
If some of your indexed data is rarely modified (e.g. archives or reference documents) while other data is modified frequently, it is usually wise to handle them in separate FoxTrot indices; updating the index of frequently modified data will then be considerably faster.
The same goes if you index logically distinct sets of data and often know in advance which sets you want to search.
If you have an (i)Mac Pro with many cores and a fast SSD, indexing or updating multiple indices in parallel may also be faster than maintaining a single monolithic index.
In other cases, a single massive index should be an acceptable choice (as long as your hardware is adequate for the volume of data you index). We recently found a bug that currently limits the size of an index to 16 GB (or, more precisely, the size of one of the files inside the .ftindex package); this will be fixed in a later version.
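To see how close an existing index is to that limit, one option is to look at the largest file inside the package. The following is only a rough sketch, not a FoxTrot tool: it assumes the .ftindex package is an ordinary folder on disk (as macOS packages usually are), reads "16 GB" as 16 × 1024³ bytes, and uses hypothetical names (`largest_file`, `near_limit`) and an arbitrary 90% warning margin.

```python
import os

# Assumption: the "16 GB" limit is interpreted as 16 * 1024**3 bytes.
LIMIT_BYTES = 16 * 1024**3


def largest_file(package_path):
    """Return (size_in_bytes, path) of the largest regular file under package_path."""
    largest = (0, None)
    for root, _dirs, files in os.walk(package_path):
        for name in files:
            path = os.path.join(root, name)
            if os.path.islink(path):
                continue  # skip symbolic links, count real files only
            size = os.path.getsize(path)
            if size > largest[0]:
                largest = (size, path)
    return largest


def near_limit(package_path, limit=LIMIT_BYTES, margin=0.9):
    """Return (flag, size, path); flag is True if the largest file reaches margin * limit."""
    size, path = largest_file(package_path)
    return size >= limit * margin, size, path
```

Calling `near_limit("/path/to/MyIndex.ftindex")` (a placeholder path) returns a flag plus the size and location of the largest file, which gives a hint as to whether splitting that index is worth considering.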
By the way, what exactly is "huge/massive" for you? Of the 1.35 TB of data you mention, how much is actually textual data rather than video, images, etc.?
Jérôme - CTM Engineering
Jérôme - FoxTrot Engineering
|
|
|
Re: Which is better? Several small/medium indexes or one huge/massive index? [message #677 is a reply to message #674] |
Thu, 01 February 2018 10:53 |
jonathanalix via foxt
Messages: 51 Registered: May 2019
|
Member |
|
|
1.3 TB is almost all PDFs of textbooks, reference books, files etc.
Some are pure or true PDFs, others are scanned and have had OCR.
Indexing is set to ignore image, video and audio content (I uncheck those
boxes under "Indexed data--Index contents of files")
The indexed files are one folder of 155,580 items totalling 1.37 TB and
another folder of 147 GB.
I currently have 14 indexes running.
My biggest index file is currently 24.95 GB. This is from a folder of
80,000 items totalling 364 GB.
I don't know if that makes me a power user or an average user.
|
|
|
Re: Which is better? Several small/medium indexes or one huge/massive index? [message #678 is a reply to message #674] |
Thu, 01 February 2018 10:58 |
jonathanalix via foxt
Messages: 51 Registered: May 2019
|
Member |
|
|
1.3 TB is almost all PDFs of textbooks, reference books, files etc.
Some are pure or true PDFs, others are scanned and have had OCR.
Indexing is set to ignore image, video and audio content (I uncheck those
boxes under "Indexed data--Index contents of files")
The indexed files are one folder of 155,580 items totalling 1.37 TB and
another folder of 175 GB with 52,400 items.
I currently have 14 indexes running.
My biggest index file is currently 24.95 GB. This is from a folder of
80,000 items totalling 364 GB.
I don't know if that makes me a power user or an average user.
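For a rough sense of scale, the figures above give an index-to-data ratio that can be extrapolated to the whole 1.37 TB folder. This is only a back-of-the-envelope sketch: it assumes the ratio seen on the 364 GB folder holds for the rest of the data (it may not, since scanned and true PDFs yield different amounts of text) and takes 1.37 TB as 1370 GB. The function names are made up for the illustration.

```python
def index_ratio(index_gb, data_gb):
    """Fraction of the source data size that the index occupies."""
    return index_gb / data_gb


def estimate_index_gb(data_gb, ratio):
    """Extrapolate an index size, assuming the same ratio applies."""
    return data_gb * ratio


# Figures from this post: a 24.95 GB index built from a 364 GB folder.
ratio = index_ratio(24.95, 364)

# Assumption: 1.37 TB is taken as 1370 GB, and the ratio is uniform.
estimate = estimate_index_gb(1370, ratio)

print(f"index-to-data ratio: {ratio:.1%}")                 # index-to-data ratio: 6.9%
print(f"estimated single index size: {estimate:.0f} GB")   # estimated single index size: 94 GB
```

Under those assumptions a single index over the large folder would come to roughly 94 GB. That figure is for the whole index package, while the 16 GB limit mentioned earlier in the thread applies to one file inside the package, so the two numbers are not directly comparable.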
|
|
|
Re: Which is better? Several small/medium indexes or one huge/massive index? [message #764 is a reply to message #678] |
Sun, 03 June 2018 05:59 |
Des Bw
Messages: 26 Registered: June 2017
|
Junior Member |
|
|
That is a huge amount of data for a mortal.
I still feel overwhelmed by the 2 GB of data I have accumulated over the years (just
PDF files).
You are definitely at the highest end of the data manglers :p
On Thursday, February 1, 2018 at 10:58:28 AM UTC+1, Ajk Sanders wrote:
>
> 1.3 TB is almost all PDFs of textbooks, reference books, files etc.
> Some are pure or true PDFs, others are scanned and have had OCR.
>
> Indexing is set to ignore image, video and audio content (I uncheck those
> boxes under "Indexed data--Index contents of files")
>
> Total of files indexed are 1 folder of 155,580 items totalling 1.37TB and
> another folder of 175 GB for 52,400 items.
>
> I have currently 14 indexes running.
>
> My biggest index file is currently 24.95 GB. This is from a folder of
> 80,000 items totalling 364 GB.
>
> I don't know if that makes me a power user or an average user.
>
|
|
|