pdf_images.py: command not found #2

Open
pverkind opened this issue Dec 6, 2023 · 4 comments

Comments

@pverkind
Contributor

pverkind commented Dec 6, 2023

When trying to run acdc with a PDF instead of images, I immediately get the following error:

(mkdir -p images/0049Gotha.OrA1521; cd images/0049Gotha.OrA1521; bash -c "pdf_images.py ../../pdf/0049Gotha.OrA1521.pdf") && touch images/0049Gotha.OrA1521/_SUCCESS
bash: pdf_images.py: command not found
make: *** [Makefile:64: images/0049Gotha.OrA1521/_SUCCESS] Error 127

The script creates a folder for the images extracted from the PDF, but then fails to find pdf_images.py, even though it is in the bin folder of the acdc_train folder.
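For readers hitting the same error: exit status 127 is what bash returns when a bare command name cannot be found in any directory listed in `PATH`. The sketch below reproduces this with a throwaway script. The name `demo_tool` and the temporary directory are hypothetical stand-ins, not part of acdc_train.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a script that lives in a bin/ folder not on PATH.
workdir=$(mktemp -d)
mkdir -p "$workdir/bin"
printf '#!/bin/sh\necho "extracting images"\n' > "$workdir/bin/demo_tool"
chmod +x "$workdir/bin/demo_tool"

# Calling the bare name fails: bash only searches the directories in PATH.
bare_status=0
bash -c "demo_tool" 2>/dev/null || bare_status=$?

# Calling the script by its explicit path works.
path_status=0
path_output=$("$workdir/bin/demo_tool") || path_status=$?

echo "bare name exit status: $bare_status"
echo "explicit path exit status: $path_status"
rm -rf "$workdir"
```

The same lookup rule applies inside a make recipe, since each recipe line is handed to a shell.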

@pverkind
Contributor Author

pverkind commented Dec 6, 2023

Hacky solution: I included the absolute path to the pdf_images.py file in the Makefile.

@pverkind
Contributor Author

pverkind commented Dec 8, 2023

I now have the same issue with the other files in the bin folder. I worked around it the same way, by creating a variable in the Makefile that holds the absolute path to these scripts, but that should not be necessary.
This is on Ubuntu, Python 3.10, in a venv virtual environment.
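One possible alternative to hard-coding absolute paths per script is to prepend the bin folder to `PATH` once before invoking make, so that every script in it resolves by bare name. This is a minimal sketch under assumptions: `tool_a` and `tool_b` are hypothetical stand-ins for the scripts in bin, and the scripts must have the execute bit set.

```shell
#!/usr/bin/env bash
# Create two hypothetical executable scripts in a temporary bin/ folder.
workdir=$(mktemp -d)
mkdir -p "$workdir/bin"
for name in tool_a tool_b; do
  printf '#!/bin/sh\necho "%s ran"\n' "$name" > "$workdir/bin/$name"
  chmod +x "$workdir/bin/$name"
done

# Prepend the folder once; child processes such as make recipes inherit PATH.
export PATH="$workdir/bin:$PATH"

# Both scripts now resolve by bare name, as they would inside a recipe.
out_a=$(bash -c "tool_a")
out_b=$(bash -c "tool_b")
echo "$out_a"
echo "$out_b"
rm -rf "$workdir"
```

In this setup the equivalent would be an `export PATH=...` pointing at the bin folder of acdc_train, run in the shell (or venv activation step) before calling make. Whether the project intends the scripts to be found this way is not stated in this thread.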

@rohanchn

rohanchn commented Dec 8, 2023

Were you able to test the whole thing?

After the changes I made to the Makefile, I was able to run it with the original command. However, the $(ocr) variable in the final rule keeps using the base model for subsequent training iterations instead of updating to the fine-tuned model from the previous iteration.

It should be doing the latter, right?

Edit: I guess it is doing the former as the latter might lead to overfitting.

@rohanchn

rohanchn commented Dec 8, 2023

Also, it is still not clear to me: why does gen2-%.out/alto-union/alto.lis use lines from all three of print, gen1-print, and gen2-print?

Wouldn't there be duplicate lines in the final iteration because of this?
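One way to check for duplicates empirically is to sort the combined list and print only the repeated entries. The sketch below assumes the list is plain text with one entry per line, which this thread does not confirm. The file names and entries are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical line lists from two sources, sharing one entry.
workdir=$(mktemp -d)
printf '%s\n' page1/line1.xml page1/line2.xml > "$workdir/print.lis"
printf '%s\n' page1/line2.xml page2/line1.xml > "$workdir/gen1-print.lis"

# Concatenate the lists, as a union step might do.
cat "$workdir/print.lis" "$workdir/gen1-print.lis" > "$workdir/union.lis"

# uniq -d prints each entry that occurs more than once (input must be sorted).
dupes=$(sort "$workdir/union.lis" | uniq -d)

# sort -u keeps one copy of each entry; count what remains.
unique_count=$(sort -u "$workdir/union.lis" | wc -l | tr -d ' ')

echo "duplicated entries: $dupes"
echo "unique entries: $unique_count"
rm -rf "$workdir"
```

Running the same `sort ... | uniq -d` pipeline on the real alto.lis would show whether the union actually contains repeated lines.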


2 participants