ami-ocr ignores --gocr argument #4

Open
mdales opened this issue Jun 23, 2019 · 4 comments

@mdales

mdales commented Jun 23, 2019

Unlike the --tesseract argument, the --gocr argument isn't used, and the path is always taken as /usr/local/bin/gocr.

@mdales mdales added the bug Something isn't working label Jun 23, 2019
@mdales
Author

mdales commented Jun 23, 2019

You'll also need to check where /usr/local/bin/pngtopnm is if it's used.
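One way to avoid hard-coding either location is to look the tools up on the PATH when no explicit argument is given. The following is only a sketch: `ExecutableLocator` and its `find` method are hypothetical names, not part of ami-ocr, and the lookup rule (first executable match in PATH order) is an assumption about what is wanted.

```java
import java.io.File;
import java.util.Optional;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of ami-ocr: finds an executable by name on a PATH-style string. */
final class ExecutableLocator {

    /**
     * Returns the first executable file called name found in the directories
     * listed in pathValue (separated by File.pathSeparator), or empty if none matches.
     */
    static Optional<File> find(String name, String pathValue) {
        if (name == null || pathValue == null) {
            return Optional.empty();
        }
        for (String dir : pathValue.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue; // skip empty PATH entries
            }
            File candidate = new File(dir, name);
            if (candidate.isFile() && candidate.canExecute()) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String path = System.getenv("PATH");
        for (String tool : new String[] {"gocr", "pngtopnm"}) {
            String where = find(tool, path).map(File::getAbsolutePath).orElse("not found on PATH");
            System.out.println(tool + " -> " + where);
        }
    }
}
```

Running `main` prints where, if anywhere, `gocr` and `pngtopnm` are found on the current machine, which may help when reporting installation problems.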

@petermr
Member

petermr commented Jun 23, 2019

Thanks,
Yes, pngtopnm is essential; it's exactly the sort of installation issue that we have to deal with. Whereas tesseract reads png, gocr only reads pbm-style (PNM) input. Converting images is flaky in general, and this seems the best strategy:

*.png -> pngtopnm -> *.pnm -> gocr
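The chain above could be driven from Java roughly as follows. This is a minimal sketch, not the GOCRConverter code: `PngToGocrSketch` and its methods are hypothetical names, and it assumes pngtopnm writes the converted image to standard output (so the output is redirected into the .pnm file) and that gocr is given its input with -i and its output with -o.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the png -> pngtopnm -> pnm -> gocr chain; not the ami-ocr implementation. */
final class PngToGocrSketch {

    /** Command for step 1: pngtopnm reads the PNG and writes PNM to stdout. */
    static List<String> pngtopnmCommand(String pngtopnmPath, String pngFile) {
        return Arrays.asList(pngtopnmPath, pngFile);
    }

    /** Command for step 2: gocr reads the PNM file (-i) and writes its result to a file (-o). */
    static List<String> gocrCommand(String gocrPath, String pnmFile, String outputFile) {
        return Arrays.asList(gocrPath, "-i", pnmFile, "-o", outputFile);
    }

    /** Runs a command, sending stdout to stdoutTarget if given, and fails on a non-zero exit. */
    static void run(List<String> command, File stdoutTarget) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectError(ProcessBuilder.Redirect.INHERIT);
        if (stdoutTarget != null) {
            builder.redirectOutput(stdoutTarget);
        } else {
            builder.redirectOutput(ProcessBuilder.Redirect.INHERIT);
        }
        int exit = builder.start().waitFor();
        if (exit != 0) {
            throw new IOException(command.get(0) + " exited with status " + exit);
        }
    }

    /** Converts png to pnm, then runs gocr on the pnm file. Both tool paths are supplied by the caller. */
    static void convert(String pngtopnmPath, String gocrPath, File png, File pnm, File out)
            throws IOException, InterruptedException {
        run(pngtopnmCommand(pngtopnmPath, png.getPath()), pnm);
        run(gocrCommand(gocrPath, pnm.getPath(), out.getPath()), null);
    }
}
```

Passing both executable paths in as parameters keeps the install location out of the code, which is the point of this issue.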

I'll check the --gocr argument.

@petermr
Member

petermr commented Jun 23, 2019

The GOCRConverter builds the gocr command line as a list of arguments:

	private void runGocr(String inputFilename, String outputFilename) throws InterruptedException {
		List<String> gocrConfig = new ArrayList<>();
		gocrConfig.add(getProgram());
		
		gocrConfig.add("-o");
		gocrConfig.add(outputFilename);
		gocrConfig.add("-f");
		gocrConfig.add("XML");
		gocrConfig.add("-C");
		/* gocrConfig.add("0-9A-Za-z--[]().,%'="); */
		gocrConfig.add("0123456789"); // don't think this works
		gocrConfig.add("-m");
		gocrConfig.add(String.valueOf(
				/*DONT_DIVIDE_OVERLAPPING + */DONT_CONTEXT_CORRECT + CHARACTER_PACKING));

		gocrConfig.add("-i");
		gocrConfig.add(inputFilename);
//		System.out.println("GOCR command: "+gocrConfig);

		builder = new ProcessBuilder(gocrConfig);
		runBuilderAndCleanUp();
	}

    protected String getPngtopnmPath() {
    	return PNGTOPNM ;
    }

    protected String getProgram() {
    	return gocrPath ;
    }

Maybe gocrPath isn't being set in the GOCRConverter; will check tomorrow.
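For illustration, a setter/getter pair that uses the command-line value when present and falls back to a default otherwise might look like the sketch below. `OcrProgramPath` is a hypothetical stand-in, not the real GOCRConverter; the default of /usr/local/bin/gocr is taken from the report above, and treating a blank value as "not set" is an assumption.

```java
/** Hypothetical stand-in for the path handling in a converter; not the real GOCRConverter. */
final class OcrProgramPath {

    /** Default location mentioned in the issue report. */
    static final String DEFAULT_GOCR = "/usr/local/bin/gocr";

    private String gocrPath; // value passed via --gocr, or null if the option was not given

    void setGocrPath(String gocrPath) {
        this.gocrPath = gocrPath;
    }

    /** Returns the configured path, or the default when none (or only whitespace) was set. */
    String getProgram() {
        if (gocrPath == null || gocrPath.trim().isEmpty()) {
            return DEFAULT_GOCR;
        }
        return gocrPath;
    }

    public static void main(String[] args) {
        OcrProgramPath path = new OcrProgramPath();
        System.out.println(path.getProgram()); // default, nothing set yet
        path.setGocrPath("/opt/local/bin/gocr");
        System.out.println(path.getProgram()); // the value that was set
    }
}
```

With this shape, the bug described here would show up as the setter never being called, so getProgram() keeps returning the default.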

@petermr
Member

petermr commented Jun 23, 2019

Looks like the value wasn't being used. Have added a transfer in AMIOcrTool #268...

		if (gocrPath != null) {
			GOCRConverter gocrConverter = new GOCRConverter(this);
			if (replaceList.size() > 0) {
				gocrConverter.setReplaceList(replaceList);
			}
			
			try {
				gocrConverter.setGocrPath(gocrPath);
				gocrConverter.setImageFile(imageFile);
				gocrConverter.runGOCR();
			} catch (Exception e) {
				LOG.error("Cannot run GOCR", e);
				return;
			}	
//			processGOCR(imageDir);
