text-mining

A Java-based library that extracts text from Microsoft Word for Windows binary documents, including Word 1.0/2.0/4.0/6.0/95/97/2000/XP/2003.

It also extracts text from fast-saved files.
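Because the supported versions span more than one on-disk container, a caller may want a cheap pre-check before handing a file to an extractor. The sketch below is not part of this library and does not reflect its internal detection logic. It only tests for the 8-byte signature of an OLE2 compound file, the container generally used by Word 6.0 and later binary documents. The class name `Ole2Sniffer` and the method `looksLikeOle2` are hypothetical names chosen for this illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of text-mining: checks the OLE2 file signature.
public class Ole2Sniffer {

    // The 8-byte signature found at the start of an OLE2 compound file.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    /** Returns true if the stream starts with the OLE2 signature. */
    public static boolean looksLikeOle2(InputStream in) throws IOException {
        byte[] head = new byte[OLE2_MAGIC.length];
        int read = 0;
        // A single read() may return fewer bytes than requested, so loop.
        while (read < head.length) {
            int n = in.read(head, read, head.length - read);
            if (n < 0) {
                return false; // stream ended before 8 bytes were available
            }
            read += n;
        }
        return Arrays.equals(head, OLE2_MAGIC);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java Ole2Sniffer <file>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(looksLikeOle2(in) ? "OLE2 container" : "not OLE2");
        }
    }
}
```

A `false` result does not mean the file is unsupported: the oldest formats in the list above (Word 1.0/2.0) predate the OLE2 container, so this check is only a hint about which family a file belongs to.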

Initially imported from: https://code.google.com/archive/p/text-mining/source/default/source

This version has the following improvements over the legacy project:

  • compatible with Apache POI (version 3.17)
  • mavenized project
  • requires Java 8
  • use of generics

This version is provided AS IS and is NOT actively maintained by Jalios.