ami now carries out most of the required task, and it's my intention to prototype and test the full functionality in the next few days.
The result of running normami will be a large CTree and a set of html and csv files that can be re-used. The missing functionality includes:
Develop TableExtractor to identify table structure
TableExtractor should unify hocr and gocr output to a canonical table format.
TableExtractor will attempt to unify the cell content, according to a schema.
TableExtractor will apply simple heuristics to detect errors and add @class-based annotation
TableExtractor will emit CSV files or HTML for the various components of a plot (i.e. possibly several files)
Develop GraphExtractor to extract SVGLines from body.graphs
Develop ScaleExtractor to extract numeric scales
apply the results of GraphExtractor and ScaleExtractor to convert to a CSV with user coordinates.
synchronise tables and graphs to determine consistency of horizontal content lines
provide an aggregate view of gocr, hocr and graph values.
extract and parse summary data in tables (e.g. Overall P values).
allow parameterisation of hocr and gocr, as far as I understand it (e.g. to prepare argument lists with whitelists). However, both programs are very poorly documented and fragile, and I shall not research this. I may open Issues showing the possible tasks.
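As an illustration only, the conversion of extracted graph points to user coordinates could be sketched as below. This is not ami code: `AxisScale`, `to_user` and `points_to_csv` are hypothetical names, and the sketch assumes linear axes, each calibrated from two tick marks whose pixel positions and numeric labels are already known.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class AxisScale:
    """Hypothetical linear axis calibrated from two tick marks."""
    pixel0: float  # pixel position of the first tick
    value0: float  # numeric label of the first tick
    pixel1: float  # pixel position of the second tick
    value1: float  # numeric label of the second tick

    def to_user(self, pixel: float) -> float:
        """Map a pixel position to a user-coordinate value (linear axis assumed)."""
        if self.pixel1 == self.pixel0:
            raise ValueError("tick marks must have distinct pixel positions")
        slope = (self.value1 - self.value0) / (self.pixel1 - self.pixel0)
        return self.value0 + slope * (pixel - self.pixel0)


def points_to_csv(points, x_scale, y_scale):
    """Convert (x_pixel, y_pixel) pairs to a CSV string in user coordinates."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["x", "y"])
    for x_pixel, y_pixel in points:
        # Round to suppress floating-point noise in the output file.
        writer.writerow([round(x_scale.to_user(x_pixel), 6),
                         round(y_scale.to_user(y_pixel), 6)])
    return buffer.getvalue()


# Pixel y grows downwards in SVG, so the y axis has a negative slope.
x_scale = AxisScale(pixel0=100, value0=0.0, pixel1=300, value1=2.0)
y_scale = AxisScale(pixel0=400, value0=0.0, pixel1=200, value1=10.0)
csv_text = points_to_csv([(200, 300)], x_scale, y_scale)
```

Logarithmic or broken axes would need a different mapping; the two-tick calibration is only the simplest case.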
This data should then be sufficient for repurposing for clients.
PMR output will be CSV and HTML that try to replicate what is visible on the screen, with some indications of reliability.
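One possible shape for the reliability indication is sketched below, purely as an assumption about how it might look. It presumes the hocr and gocr outputs have already been parsed into rows of cell strings with identical dimensions; `merge_cells`, `to_html` and the class names `agree` and `disagree` are hypothetical, not part of ami.

```python
from html import escape


def merge_cells(hocr_rows, gocr_rows):
    """Pair up cells from two OCR readings and label each as agree/disagree."""
    merged = []
    for hocr_row, gocr_row in zip(hocr_rows, gocr_rows):
        row = []
        for hocr_text, gocr_text in zip(hocr_row, gocr_row):
            hocr_text, gocr_text = hocr_text.strip(), gocr_text.strip()
            if hocr_text == gocr_text:
                row.append((hocr_text, "agree"))
            else:
                # Keep both readings so a human can resolve the conflict.
                row.append((f"{hocr_text}|{gocr_text}", "disagree"))
        merged.append(row)
    return merged


def to_html(merged):
    """Render merged rows as an HTML table with class-based annotation."""
    lines = ["<table>"]
    for row in merged:
        cells = "".join(
            f'<td class="{css_class}">{escape(text)}</td>'
            for text, css_class in row
        )
        lines.append(f"  <tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


merged = merge_cells([["0.5", "12"]], [["0.5", "1Z"]])
html_text = to_html(merged)
```

A stylesheet could then highlight `disagree` cells so that doubtful values stand out when the HTML is viewed.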
== What PMR will not currently do ==
domain-specific analysis of results.
customisation of use
client-facing documentation
refinement of image analysis parameters
creation of corpora
develop JS, containers, servers for this project
implement software on client site.
respond to alternative corpora.
write a clean facility for normami (there is a lot of potential output from a run, especially when different parameters are being used.)
== What PMR will do ==
attempt to fix runtime bugs
mentor CG and MD on how to run programs