Problem with DocumentSelectionDescriptor version of runCitLabHtr #4

jscrane · 2018-06-13T12:11:21Z

I'm having a problem with this API (the other one, which takes the page range as a string, works fine).

Looks like the server isn't processing the page range properly.

If you can see it, job 346078 shows this. Digging into the "jobDataProps" I can see the following XML:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<documentSelectionDescriptor>
    <docId>27808</docId>
    <pageList>
        <pages>
            <pageId>1</pageId>
        </pages>
        <pages>
            <pageId>2</pageId>
        </pages>
        <pages>
            <pageId>3</pageId>
        </pages>
        <pages>
            <pageId>4</pageId>
        </pages>
        <pages>
            <pageId>5</pageId>
        </pages>
        <pages>
            <pageId>6</pageId>
        </pages>
        <pages>
            <pageId>7</pageId>
        </pages>
        <pages>
            <pageId>8</pageId>
        </pages>
        <pages>
            <pageId>9</pageId>
        </pages>
    </pageList>
</documentSelectionDescriptor>

In the GUI the pages column is blank for this job, and it finished immediately.

It's no big deal, as I can use the other form for now. Just FYI.

kahlep · 2018-06-13T15:04:03Z

The descriptor object is based on the values from TrpPage#getPageId instead of the pageNr.

Of course, the job should report an error when it is passed page IDs that do not belong to the document. At the moment it silently ignores them; I will fix this.
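As a rough illustration of the pageNr vs. pageId distinction, here is a minimal client-side sketch that translates page numbers into page IDs and rejects numbers the document does not contain. The `Page` class is a hypothetical stand-in (the real values would come from TrpPage#getPageId), and the pairing of page numbers 1 to 3 with the IDs below is assumed purely for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PageIdLookup {
    // Hypothetical stand-in for a page of a document; not the real TrpPage class.
    static final class Page {
        final int pageNr;
        final int pageId;

        Page(int pageNr, int pageId) {
            this.pageNr = pageNr;
            this.pageId = pageId;
        }
    }

    // Translate page numbers (1, 2, 3, ...) into the page IDs the descriptor expects.
    // Fails loudly on an unknown page number instead of skipping it.
    static List<Integer> toPageIds(List<Page> pages, List<Integer> pageNrs) {
        Map<Integer, Integer> idByNr = new HashMap<>();
        for (Page p : pages) {
            idByNr.put(p.pageNr, p.pageId);
        }
        List<Integer> ids = new ArrayList<>();
        for (int nr : pageNrs) {
            Integer id = idByNr.get(nr);
            if (id == null) {
                throw new IllegalArgumentException("No page with number " + nr);
            }
            ids.add(id);
        }
        return ids;
    }

    public static void main(String[] args) {
        // Assumed pairing of page numbers and page IDs, for illustration only.
        List<Page> pages = List.of(
                new Page(1, 798149), new Page(2, 798150), new Page(3, 798151));
        System.out.println(toPageIds(pages, List.of(1, 2, 3)));
    }
}
```

The resulting IDs are what would go into the `<pageId>` elements of the descriptor, rather than 1, 2, 3.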

jscrane · 2018-06-15T11:10:49Z

OK I've tried using the pageId, as follows:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<documentSelectionDescriptor>
    <docId>27808</docId>
    <pageList>
        <pages>
            <pageId>798149</pageId>
        </pages>
        <pages>
            <pageId>798150</pageId>
        </pages>
        <pages>
            <pageId>798151</pageId>
        </pages>
    </pageList>
</documentSelectionDescriptor>

And this time, the server says:

java.lang.NullPointerException
              at eu.transkribus.appserver.logic.jobs.standard.CITlabHtrJob.runHtr(CITlabHtrJob.java:202)
              at eu.transkribus.appserver.logic.jobs.standard.CITlabHtrJob.doProcess(CITlabHtrJob.java:140)
              at eu.transkribus.appserver.logic.jobs.abstractjobs.ATrpJobRunnable.run(ATrpJobRunnable.java:112)
              at eu.transkribus.appserver.logic.JobProcessStarter.run(JobProcessStarter.java:113)
              at eu.transkribus.appserver.logic.JobProcessStarter.main(JobProcessStarter.java:212)

(The modelId was 133, and the jobId was 347160.)

Apologies if I've done something stupid.

kahlep · 2018-06-18T09:44:07Z

Found the missing null check and added it. Thanks for reporting.
However, the HTR module cannot be updated quickly.
If you want to stick to the descriptor (which initially used int for the transcript ID), you would have to set a value <= 0 with PageDescriptor#setTsId to run the HTR on the current version of a page.
This is equivalent to using the page String.
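A minimal sketch of that workaround, assuming the descriptor is built client-side from a list of page IDs. The `PageDescriptor` class below is a hypothetical stand-in that models only the page ID and transcript ID; the real class may differ. The sentinel `-1` is an arbitrary choice among the values <= 0 mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

class CurrentVersionSelection {
    // Hypothetical stand-in for the real PageDescriptor; only the fields used here.
    static final class PageDescriptor {
        private final int pageId;
        private Integer tsId;

        PageDescriptor(int pageId) {
            this.pageId = pageId;
        }

        void setTsId(Integer tsId) {
            this.tsId = tsId;
        }

        Integer getTsId() {
            return tsId;
        }

        int getPageId() {
            return pageId;
        }
    }

    // Any value <= 0 is assumed to mean "use the current version of the page".
    static final int CURRENT_VERSION = -1;

    // Build one descriptor per page ID and set the transcript ID explicitly,
    // so that it is never left unset.
    static List<PageDescriptor> selectCurrentVersions(List<Integer> pageIds) {
        List<PageDescriptor> pages = new ArrayList<>();
        for (int pageId : pageIds) {
            PageDescriptor pd = new PageDescriptor(pageId);
            pd.setTsId(CURRENT_VERSION);
            pages.add(pd);
        }
        return pages;
    }

    public static void main(String[] args) {
        for (PageDescriptor pd : selectCurrentVersions(List.of(798149, 798150, 798151))) {
            System.out.println(pd.getPageId() + " -> tsId " + pd.getTsId());
        }
    }
}
```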

kahlep self-assigned this Jun 13, 2018
