Suffix option (-o) deletes the input file extension from the output file name when segmenting AND recognizing a directory #680
Comments
On 25/01/22 06:52AM, PierreYvesJallud wrote:
When I process a single input file, the result is good:
`kraken -i myImgFile.jpg -o .ocr.txt segment mySegmentModel.mlmodel -bl ocr -m myReconizingModel.mlmodel`
Uhmm, this command isn't supposed to work and it doesn't on my system.
`-i` is for an explicit mapping between input and output files, `-I` is
for globbing multiple files and producing an output path with the suffix
defined by `-o`. The stripping of the last extension is by design.
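
The naming rule described above (strip the last extension, then append the `-o` suffix) can be sketched as follows. This is only an illustration of the described behaviour, not kraken's actual code; the helper names `batch_output_name` and `keep_extension_name` are hypothetical.

```python
import os.path


def batch_output_name(input_path: str, suffix: str) -> str:
    """Hypothetical helper (not kraken's code): drop the last extension
    of the input path, then append the suffix given with -o."""
    stem, _last_ext = os.path.splitext(input_path)
    return stem + suffix


def keep_extension_name(input_path: str, suffix: str) -> str:
    """Hypothetical helper: the naming the reporter expected, where the
    suffix is appended to the full input file name."""
    return input_path + suffix


if __name__ == "__main__":
    print(batch_output_name("myImgFile.jpg", ".ocr.txt"))    # myImgFile.ocr.txt
    print(keep_extension_name("myImgFile.jpg", ".ocr.txt"))  # myImgFile.jpg.ocr.txt
```

Only the last extension is removed, so under this sketch a name such as `scan.tar.gz` would become `scan.tar.ocr.txt`.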
Well 🙄... this combination of parameters was suggested in the eScriptorium forum. Except for the suffix problem, it works perfectly 😎 That's not a real obstacle for my work. I already have to modify the ALTO files (name and fileName) to integrate the result into eScriptorium, and that's pretty simple with a little script. So... you be the judge =)
On 25/01/22 11:47PM, PierreYvesJallud wrote:
Well 🙄... this combination of parameters has been suggested in the
[eScriptorium
forum](https://matrix.to/#/!kwucPfyPlAhwGUrUBl:gitter.im/$tXIu0SFH-cJohQsCsmRUI3E6OhE5WBtLnuaz4m67mgQ?via=gitter.im&via=matrix.org&via=uni-graz.at).
Except for the suffix problem, it works perfectly 😎
Yes, I remember that thread, and the commands are actually fine, except
that you're missing the second required part of the `-i` argument, so I
really have no idea how you got any file output at all with 5.3.0 (or any
other version, for that matter).
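
If the `.jpg` part should stay in the output name, one option is to pass an explicit input/output pair to `-i` for each file. The sketch below only builds and prints such command lines; it assumes that `-i` takes the input path followed by the output path (the "second required part" mentioned above), and the helper name `explicit_pair_command` is hypothetical. The model file names are the placeholders from the original report.

```python
import shlex


def explicit_pair_command(image: str, suffix: str,
                          seg_model: str, rec_model: str) -> list[str]:
    """Hypothetical helper: build a kraken argument list with an explicit
    input/output pair, keeping the full input name in the output name.
    Assumes -i takes two values: input path, then output path."""
    output = image + suffix  # e.g. myImgFile.jpg -> myImgFile.jpg.ocr.txt
    return [
        "kraken", "-i", image, output,
        "segment", "-i", seg_model, "-bl",
        "ocr", "-m", rec_model,
    ]


if __name__ == "__main__":
    cmd = explicit_pair_command(
        "myImgFile.jpg", ".ocr.txt",
        "mySegmentModel.mlmodel", "myReconizingModel.mlmodel",
    )
    # Print the command instead of running it, so it can be checked first.
    print(shlex.join(cmd))
```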
And yet... it turns ✨🪐✨ 🤓!
Hi all,
My environment:
kraken, version 5.3.0
Python 3.11.2
I'm not sure it's a bug, but when I run kraken as follows, the input file extension is dropped from the output file name:
`find pathToImgs/*.jpg | parallel kraken -I {} -o .ocr.txt segment -i mySegmentModel.mlmodel -bl ocr -m myReconizingModel.mlmodel`
If the input files are named like `myImgFile.jpg`, the recognition output files are named like `myImgFile.ocr.txt` (without `.jpg`).
I would have expected them to be named like `myImgFile.jpg.ocr.txt`.
When I process a single input file, the result is good:
`kraken -i myImgFile.jpg -o .ocr.txt segment mySegmentModel.mlmodel -bl ocr -m myReconizingModel.mlmodel`
The result file is `myImgFile.jpg.ocr.txt`. The jpg extension is retained. Is there an explanation? Did I make a mistake with the options 🤔?
Greetings