Re: Feature request: make selection of metadata importers a per-index option [message #1708 is a reply to message #1038]
Fri, 29 September 2023 09:45
madison437
Messages: 1 Registered: September 2023
Junior Member
It would be EXTREMELY useful to be able to choose the PDF parser on a per-index basis, along with whatever options are available for the chosen parser.
Also, it would be great to be able to specify arguments for xpdf's "pdftotext", or to use the Poppler version of "pdftotext" instead. With my OCR'd handwritten text, I'm finding that the result is better with Poppler, and with Poppler I don't have to use the "stream order" option, i.e. "-raw". Leaving that option off appears a little safer for getting the order of the text right, generally speaking.
It's true that there is a "Use Xpdf's "stream order" layout mode" checkbox when configuring xpdf, but it would be helpful if even that could be set on a per-index basis.
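To illustrate the idea, here is a rough Python sketch of how a per-index choice of "pdftotext" binary and arguments might look. This is not how the application works internally. The index names, the binary paths, and the INDEX_PARSER_OPTIONS table are all made up for the example. The only things taken from the real tools are the "-raw" flag and the convention that "-" as the output file sends the text to stdout.

```python
import subprocess
from typing import Dict, List

# Hypothetical per-index settings: index names and binary paths are
# assumptions for illustration, not real application configuration.
INDEX_PARSER_OPTIONS: Dict[str, dict] = {
    # xpdf build, with "stream order" (-raw) switched on for this index
    "typed-reports": {"binary": "/opt/xpdf/bin/pdftotext", "args": ["-raw"]},
    # Poppler build, default reading order (no -raw) for OCR'd handwriting
    "handwritten-ocr": {"binary": "/usr/bin/pdftotext", "args": []},
}

# Fallback when an index has no entry of its own
DEFAULT_OPTIONS = {"binary": "pdftotext", "args": []}


def build_pdftotext_command(index_name: str, pdf_path: str) -> List[str]:
    """Assemble the pdftotext command line for one PDF in a given index."""
    opts = INDEX_PARSER_OPTIONS.get(index_name, DEFAULT_OPTIONS)
    # "-" as the output file tells pdftotext to write the text to stdout
    return [opts["binary"], *opts["args"], pdf_path, "-"]


def extract_text(index_name: str, pdf_path: str) -> str:
    """Run the per-index parser and return the extracted text."""
    cmd = build_pdftotext_command(index_name, pdf_path)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

With something like this, one index could keep xpdf with "-raw" while another uses Poppler without it, which is the kind of per-index control I'm asking for.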
Thanks.