clemsage/NeuralDocumentClassification

Neural Document Classification

Document image classification with neural networks on a subset of the RVL-CDIP dataset [1].

Getting Started

The classification problem is tackled with two different approaches:

  • Visual approach over the image pixels, with dense-only and convolutional neural networks: skeleton.ipynb
  • Textual approach over the words recognized in the images (OCR), with bag-of-words and word embedding models: skeleton_ocr.ipynb
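
To make the two kinds of input concrete, here is a minimal sketch of how a document might be turned into features for each approach. It is not taken from the notebooks: the function names, the toy image, the toy word list, and the vocabulary are all hypothetical, and the notebooks may rely on a deep learning library rather than plain Python.

```python
from collections import Counter


def flatten_image(pixels):
    """Flatten a 2D grid of grayscale values (0-255) into one vector scaled to [0, 1].

    This is the input form a dense-only network would consume; a convolutional
    network would keep the 2D grid instead of flattening it.
    """
    return [value / 255.0 for row in pixels for value in row]


def bag_of_words(words, vocabulary):
    """Count how often each vocabulary term occurs among the recognized words."""
    counts = Counter(word.lower() for word in words)
    return [counts[term] for term in vocabulary]


# Hypothetical toy inputs, not drawn from RVL-CDIP.
toy_image = [[0, 255], [255, 0]]
toy_words = ["Invoice", "total", "amount", "total"]
toy_vocabulary = ["invoice", "letter", "total"]

image_features = flatten_image(toy_image)
text_features = bag_of_words(toy_words, toy_vocabulary)
```

In this sketch the bag-of-words vector discards word order and ignores words outside the vocabulary ("amount" here), which is the usual trade-off compared with word embedding models.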

Start with the visual approach: its notebook covers the computing environment setup and the dataset in more detail.

For a better experience, execute the notebooks within a Google Colab environment.

Authors

References

[1] A. W. Harley, A. Ufkes, K. G. Derpanis, "Evaluation of Deep Convolutional Nets for Document Image Classification and Retrieval," in ICDAR, 2015.
