FoxTrot Search Forum
FoxTrot Search for macOS Forum

What counts as "Any Metadata"? [message #1530] Sun, 23 October 2022 08:21
Atlas
Messages: 106
Registered: August 2009
Senior Member
As an example, I want to find documents that contain the words "supply" and "demand" AND that have the character "‑" (non-breaking hyphen) either in the full path OR in the tag metadata. The point of this example is that I want to search for a string or character that could be in either the full path or the tag metadata. The ONLY DIFFERENCE between the following two queries is that the first one filters on "Full Path" and the second one filters on "Any metadata".

First query setup:

(1) First criterion: search "Contents, any metadata or file name" -> Foxtrot Query -> search string [supply demand].
(2) Second criterion: "Apply advanced filter" -> Full Path -> "Contain any of the strings" -> "Ignore Case" + "Ignore Composition" + "Multiple String", all other options turned off -> search string "‑"
(3) I changed the default multiple-string separator in advanced search to "," (a comma with no space before or after).

RESULT: 21 results

Second query setup:

(1) First criterion: search "Contents, any metadata or file name" -> Foxtrot Query -> search string [supply demand].
(2) Second criterion: "Apply advanced filter" -> Any metadata -> "Contain any of the strings" -> "Ignore Case" + "Ignore Composition" + "Multiple String", all other options turned off -> search string "‑"
(3) I changed the default multiple-string separator in advanced search to "," (a comma with no space before or after).

RESULT: 17 results


I don't understand why filtering by "Any metadata" gives me FEWER results than filtering by "Full Path". According to the web documentation on FoxTrot hidden preferences, "The name of the folder containing a file is normally indexed as part of the “other metadata”." This means that tag data and the full path name are both part of the metadata, right? So if I search for something only in "Full Path", it should give me fewer results than if I search for it in "Any metadata", right? Can you please clarify the expected behavior of "Any metadata"?

... or is only the parent folder name treated as metadata, and not the full path? But that shouldn't be the case, because I can search for strings that are part of the full path (and not part of the parent folder name) just fine when I use Foxtrot Query. And just to be sure, I turned on "defaults write com.ctmdev.FoxTrotShared UseRelativePathForParentFolder -bool YES", but the result is the same.
Re: What counts as "Any Metadata"? [message #1533 is a reply to message #1530] Mon, 24 October 2022 13:31
FoxTrot Engineering
Messages: 355
Registered: April 2020
Senior Member
Quote:
The name of the folder containing a file is normally indexed as part of the “other metadata”. This means that tag data and full path name are both part of metadata right?
No; the name of the folder containing the file is part of “other metadata”, not its full path. And if you have set the UseRelativePathForParentFolder hidden preference, then the relative path is part of “other metadata”, not the full path. If the "-" character exists in the path of the indexed folder itself, but not in the subpath of the file inside the indexed folder, then the behavior you describe seems normal.
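To illustrate why the two filters can return different counts, here is a minimal Python sketch. It is only a model of the behavior described above, not FoxTrot's actual implementation: the function name, the sample paths and tags, and the assumption that "any metadata" is simply the tags plus the parent folder name are all hypothetical.

```python
from pathlib import PurePosixPath

NB_HYPHEN = "\u2011"  # non-breaking hyphen, the character searched for in this thread


def any_metadata(file_path: str, tags: list[str]) -> str:
    """Hypothetical model: tags plus the parent folder NAME only (default behavior)."""
    parent_name = PurePosixPath(file_path).parent.name
    return " ".join(tags + [parent_name])


# Hypothetical files as (full path, tags). The hyphen sits in an ancestor
# folder of both files, but in the tags of only one of them.
files = [
    (f"/Users/Atlas/Econ{NB_HYPHEN}Notes/Markets/supply.pdf", []),
    (f"/Users/Atlas/Econ{NB_HYPHEN}Notes/Markets/demand.pdf", [f"macro{NB_HYPHEN}econ"]),
]

full_path_hits = [p for p, _ in files if NB_HYPHEN in p]
metadata_hits = [p for p, t in files if NB_HYPHEN in any_metadata(p, t)]

print(len(full_path_hits), len(metadata_hits))  # 2 1
```

In this model the "Full Path" filter matches both files, while the "Any metadata" filter matches only the tagged one, because the parent folder name "Markets" does not contain the character.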

Quote:
And just to be sure, I turned on "defaults write com.ctmdev.FoxTrotShared UseRelativePathForParentFolder -bool YES", but the result is the same
Make sure to relaunch FoxTrot and rebuild your index after setting / clearing this hidden preference.


Jérôme - FoxTrot Engineering
Re: What counts as "Any Metadata"? [message #1536 is a reply to message #1533] Mon, 24 October 2022 23:05
Atlas
Messages: 106
Registered: August 2009
Senior Member
I want to make sure I understand this. Suppose I turn on UseRelativePathForParentFolder. The folder I'm indexing contains string A in its name (so string A is in the "path of the indexed folder itself"). However, none of the subfolders or files in that folder contain string A in their names. Now if I search for all files with string A in "Any metadata", will I find the contents inside the subfolders of that folder?
Re: What counts as "Any Metadata"? [message #1537 is a reply to message #1536] Tue, 25 October 2022 19:16
FoxTrot Engineering
Messages: 355
Registered: April 2020
Senior Member
For example, if you added the folder "/Users/Atlas/Documents/Stuff" to the list of indexed locations, and you have a file whose full path is "/Users/Atlas/Documents/Stuff/Project A/Folder B/document.pdf", then its relative path is "Project A/Folder B", and its parent folder name is "Folder B".
By default, "any metadata" will include the parent folder name, i.e. "Folder B"; if you have set the UseRelativePathForParentFolder preference, then "any metadata" will instead include the relative path, i.e. "Project A/Folder B".
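The two values in this example can be reproduced with Python's pathlib. This is only a sketch of the definitions given above; the variable names are illustrative and it does not claim to mirror how FoxTrot computes them internally.

```python
from pathlib import PurePosixPath

indexed_folder = PurePosixPath("/Users/Atlas/Documents/Stuff")
file_path = PurePosixPath(
    "/Users/Atlas/Documents/Stuff/Project A/Folder B/document.pdf"
)

parent = file_path.parent

# Default: only the name of the folder that contains the file.
parent_folder_name = parent.name

# With UseRelativePathForParentFolder: the parent folder's path,
# relative to the indexed folder.
relative_path = parent.relative_to(indexed_folder).as_posix()

print(parent_folder_name)  # Folder B
print(relative_path)       # Project A/Folder B
```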

However, consider a file directly in the indexed folder instead of in one of its subfolders: "/Users/Atlas/Documents/Stuff/document.pdf". Its relative path should be empty, but in our current implementation (the UseRelativePathForParentFolder hidden preference is an unsupported and poorly tested feature), it seems that the relative path is in fact the full path, stripped of the file name: "/Users/Atlas/Documents/Stuff". This is probably a bug, as it is not consistent; however, I am not sure that an empty string would be useful. Using the name of the parent folder (which is also the indexed folder) instead would be more useful, but still not consistent.

One solution could be to redefine the relative path so it starts with the name of the indexed folder (e.g. "Stuff/Project A/Folder B"); but if a user indexes his whole home folder ("/Users/Atlas"), which is a common behavior, then his user name ("Atlas") will be stored in the metadata of every document, which would be an unwanted side effect. So I think we should just fix the issue by storing an empty string as the relative path for files directly in the indexed folder.
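The side effect of that alternative definition can be seen in a short Python sketch. The helper name prefixed_relative_path is hypothetical and exists only for illustration; it models the idea of a relative path that starts with the indexed folder's name, not any actual FoxTrot code.

```python
from pathlib import PurePosixPath


def prefixed_relative_path(indexed_folder: str, file_path: str) -> str:
    """Hypothetical alternative: relative path prefixed with the indexed folder's name."""
    indexed = PurePosixPath(indexed_folder)
    parent = PurePosixPath(file_path).parent
    return (PurePosixPath(indexed.name) / parent.relative_to(indexed)).as_posix()


# Indexing a specific folder: the prefix is harmless.
narrow = prefixed_relative_path(
    "/Users/Atlas/Documents/Stuff",
    "/Users/Atlas/Documents/Stuff/Project A/Folder B/document.pdf",
)

# Indexing the whole home folder: the user name ends up in every entry.
home = prefixed_relative_path(
    "/Users/Atlas",
    "/Users/Atlas/Documents/Stuff/document.pdf",
)

print(narrow)  # Stuff/Project A/Folder B
print(home)    # Atlas/Documents/Stuff
```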


Jérôme - FoxTrot Engineering
Re: What counts as "Any Metadata"? [message #1538 is a reply to message #1537] Wed, 26 October 2022 03:15
Atlas
Messages: 106
Registered: August 2009
Senior Member
Thanks for clarifying the following Foxtrot behaviors: When Foxtrot says it's using the "relative path", it's referring to the path RELATIVE TO THE INDEXED FOLDER, not relative to the home folder. Thus, if the user indexed the folder "/Users/Atlas/Documents/Stuff", then the relative path for the file "/Users/Atlas/Documents/Stuff/document.pdf" would be empty UNLESS the user turns on UseRelativePathForParentFolder.

Why I support indexing the fullpath:

As discussed in this earlier thread about indexing fullpath, I support the current implementation of UseRelativePathForParentFolder. In my case, the queries do perform as expected (the two queries above give equal results) after I turn on the hidden preference AND rebuild the index. Not rebuilding the index was user error on my part. My desired behavior is that when users turn on UseRelativePathForParentFolder, and they search for a string, that string could be anywhere in the fullpath of files. Thus, when users search for "Any Metadata", it means the string could appear in any extended attributes like tags or anywhere in the fullpath of files.

The reason for this is straightforward: Users often store relational data (data that tells users how contents are related to each other) using both hierarchical structures -- i.e. nested folders -- as well as non-hierarchical structures -- i.e. tags. Thus, when a user like myself wants to search for keywords that describe how contents are related to each other, we have to search for those keywords in both the fullpath and the tags. For example, a piece of content relating to the topic "automation" might be under a folder named "automation", or it might have a tag named "automation". Since the filter conditions of Foxtrot cannot be combined in complex boolean statements (mixtures of AND's and OR's), it's good to be able to search for the keyword in ONE Foxtrot search condition using "Any Metadata".

Why indexing fullpath is logically consistent

I understand the concern that allowing the full path to be indexed means users would get a lot of false positive results if they search for "the name of their indexed folder". This is an annoying behavior. However, there are two reasons why this is a logically consistent behavior that users would expect:

1. We would expect that all files relating to topic X would be under the folder named X. Thus, all contents under the folder "/Users/Atlas" are expected to be related to the topic "Atlas", which is why we would expect Foxtrot to return all files under the folder "/Users/Atlas" when the user asks Foxtrot for all files "relating to Atlas" by searching for the keyword "Atlas".

2. I'm sure a lot of users, like myself, store our documents in a big folder called "Dropbox". What happens when users search for documents relating to Dropbox and all they type is just "Dropbox"? Of course, they end up seeing ALL files stored in Dropbox, but this is obviously expected. Users know they need to form more specific queries so that they don't get so many false positives. In particular, I search for files relating specifically to Dropbox, by searching for the keyword "Dropbox" in the Contents field, and not just in all fields.


My current concern

Please don't "store an empty string as the relative path for files directly in the indexed folder". Due to size and organizational constraints, I use multiple smaller indexes rather than one massive indexed folder (which is recommended by Foxtrot). For example, I store my notes in a folder called "/Users/Atlas/Documents/Stuff/MyNotesX". I index this notes folder directly rather than indexing the parent folder, so that I can reduce my index size. If you store an empty string as the relative path for the files that I store directly inside this folder, then I wouldn't be able to find files stored directly under the MyNotesX folder by searching for the keyword "MyNotesX". Think of how strange it would be if the query returned files under ".../MyNotesX/Topic A/Topic B", but not directly under ".../MyNotesX", when I use the search term "MyNotesX". Clearly, contents relating to MyNotesX are UNDER the folder called "MyNotesX", so users would expect to see all files under that folder. Please keep the current behavior of storing the fullpath as the relative path if users CHOOSE TO TURN ON UseRelativePathForParentFolder. Another alternative is to store only the name of the indexed folder as the relative path for files stored directly under the indexed folder.

[Updated on: Wed, 26 October 2022 03:27]


Re: What counts as "Any Metadata"? [message #1544 is a reply to message #1538] Mon, 31 October 2022 06:45
Atlas
Messages: 106
Registered: August 2009
Senior Member
Is there any feedback from the developer on this? What's the current thinking on how to store the path metadata for files that are stored directly underneath the indexed folder when UseRelativePathForParentFolder is turned ON? Thanks in advance for giving the community transparency on your thought process.
Re: What counts as "Any Metadata"? [message #1546 is a reply to message #1544] Mon, 31 October 2022 08:50
FoxTrot Engineering
Messages: 355
Registered: April 2020
Senior Member
The inconsistency will be fixed: the "relative path of the parent folder" is defined as the full path of the parent folder, stripped of the path of the indexed folder, so if a file is found directly inside the indexed folder, the relative path of the parent folder will be empty. Perhaps this hidden preference could be fully supported in the future, although this is not our current plan.
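As a rough Python sketch of that definition (the function name relative_parent_path is hypothetical and this is not the actual FoxTrot code), a file directly inside the indexed folder yields an empty string, while a nested file yields its subpath:

```python
from pathlib import PurePosixPath


def relative_parent_path(indexed_folder: str, file_path: str) -> str:
    """Full path of the parent folder with the indexed folder's path stripped."""
    parent = PurePosixPath(file_path).parent
    rel = parent.relative_to(PurePosixPath(indexed_folder))
    # A file directly inside the indexed folder has no remaining parts,
    # so joining them gives the empty string.
    return "/".join(rel.parts)


indexed = "/Users/Atlas/Documents/Stuff"
nested = relative_parent_path(indexed, indexed + "/Project A/Folder B/document.pdf")
direct = relative_parent_path(indexed, indexed + "/document.pdf")

print(repr(nested))  # 'Project A/Folder B'
print(repr(direct))  # ''
```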

We will add a different hidden preference to store the full path of the parent folder, instead of its name or relative path. In my opinion this adds too many irrelevant words to the "other metadata" field, especially the home folder name (which is usually the user's name), so this will certainly never end up as a supported feature.


Jérôme - FoxTrot Engineering
Re: What counts as "Any Metadata"? [message #1554 is a reply to message #1546] Sun, 06 November 2022 15:27
Atlas
Messages: 106
Registered: August 2009
Senior Member
Thanks. I look forward to using the new hidden preference pane that will allow us to store the full path of the parent folder in metadata.