increase coverage #2

Open
12 of 20 tasks
bertsky opened this issue May 3, 2023 · 0 comments


bertsky commented May 3, 2023

  • Confidence (unfortunately, Textract's single value conflates what PAGE keeps apart as Coords/@conf and TextEquiv/@conf)
  • TextType (HANDWRITING → @production=handwritten-printscript|handwritten-cursive, PRINTED → @production=printed)
  • support tables:
    • top-level TableRegion for TABLE block
    • recursive TextRegion for CELL block (i.e. ColumnIndexRoles/TableCellRole/@columnIndex, RowIndexRoles/TableCellRole/@rowIndex)
    • recursive TextRegion for MERGED_CELL block (i.e. ColumnSpanRoles/TableCellRole/@colSpan, RowSpanRoles/TableCellRole/@rowSpan) – diverging recursion between Textract and PAGE?
    • recursive TextRegion for TABLE_TITLE and TABLE_FOOTER block (i.e. Roles/TableCellRole/@header... or via ReadingOrder)
    • EntityTypes=STRUCTURED_TABLE|SEMI_STRUCTURED_TABLE (unclear how to represent in PAGE), TABLE_TITLE|TABLE_SECTION_TITLE|TABLE_FOOTER|TABLE_SUMMARY|COLUMN_HEADER (unclear how this looks and compares with the actual recursive BlockType)?
    • also via ordered groups in ReadingOrder?
    • unclear: LineItemGroup and LineItems
  • PageClassification/PageType (unclear, but probably Page/@type)
  • support forms
    • BlockType=KEY_VALUE_SET and EntityTypes=KEY|VALUE → unclear how to represent: TableRegion or recursive TextRegion? Labels/Label?
    • register KEY_VALUE_SET
    • represent in PAGE
  • support checkboxes within tables or forms
    • BlockType=SELECTION_ELEMENT and SelectionStatus=SELECTED|NOT_SELECTED → unclear how to represent
    • register SELECTION_ELEMENT
    • represent in PAGE
  • ignore query type
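
For the more mechanical items above (Confidence, TextType, and the CELL/MERGED_CELL indices), the mapping could look roughly like the sketch below. It is not the converter's actual code: plain dicts stand in for Textract blocks and PAGE attributes, and the function names `text_attributes` and `table_cell_role` are hypothetical. It assumes that Textract reports Confidence as a percentage while PAGE @conf ranges over 0..1, that Textract row and column indices start at 1 while PAGE TableCellRole indices start at 0, and it arbitrarily picks `handwritten-cursive` for HANDWRITING, since Textract does not distinguish the two handwritten styles.

```python
# Sketch only: plain dicts stand in for Textract blocks and PAGE attributes.
# Function names are hypothetical, not part of any existing converter.

# Assumption: Textract cannot tell the two handwritten styles apart,
# so the PAGE value chosen for HANDWRITING is an arbitrary default.
PRODUCTION_BY_TEXT_TYPE = {
    "PRINTED": "printed",
    "HANDWRITING": "handwritten-cursive",
}


def text_attributes(block):
    """Derive PAGE-style @production and @conf from a Textract block dict."""
    attrs = {}
    text_type = block.get("TextType")
    if text_type in PRODUCTION_BY_TEXT_TYPE:
        attrs["production"] = PRODUCTION_BY_TEXT_TYPE[text_type]
    confidence = block.get("Confidence")
    if confidence is not None:
        # Assumption: Textract gives a percentage, PAGE expects 0..1.
        # The one value would have to serve both Coords and TextEquiv.
        attrs["conf"] = confidence / 100.0
    return attrs


def table_cell_role(block):
    """Derive PAGE-style TableCellRole attributes from a CELL or MERGED_CELL block."""
    return {
        # Assumption: Textract indices are 1-based, PAGE indices 0-based.
        "rowIndex": block["RowIndex"] - 1,
        "columnIndex": block["ColumnIndex"] - 1,
        "rowSpan": block.get("RowSpan", 1),
        "colSpan": block.get("ColumnSpan", 1),
    }
```

How the resulting attributes are attached to recursive TextRegion elements, and how MERGED_CELL recursion lines up with PAGE, stays open as noted in the list.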