HindiNER

Entity recognition in Hindi sentences

A tool to recognize pre-defined entities in Hindi sentences. It uses a conditional random field (CRF) with a set of hand-crafted features.

Instructions on how to use this module:

Clone the repo.
Install the dependencies listed in requirements.txt ('pip install -r requirements.txt').
In your Python app, import crf_NER.
To train on your dataset, call train(file location of train.txt, file location of test.txt, verbose=True). The test.txt and verbose arguments are optional. Example: trained_model = train('train.txt', 'test.txt', verbose=True)
The returned object can make predictions on new sentences through its predict_sent method. Example: trained_model.predict_sent('Hindi sentence here'). This returns a string containing the sequence of entity labels. '0' is the default label for words that belong to no entity.
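The steps above can be combined into one small helper. This is a minimal sketch, not part of the repo: the helper name `tag_sentence` is hypothetical, and it assumes that `train` can be imported directly from `crf_NER` and that the repo directory is on the Python path.

```python
def tag_sentence(sentence, train_path="train.txt", test_path="test.txt"):
    """Train a model on the given files and label one Hindi sentence.

    Hypothetical helper: only train() and predict_sent() come from the README.
    """
    # Imported inside the function so this file still loads when the
    # repo's crf_NER module is not on the path (assumed import form).
    from crf_NER import train

    # test_path and verbose are optional according to the README.
    model = train(train_path, test_path, verbose=True)

    # predict_sent returns a string of entity labels; '0' means no entity.
    return model.predict_sent(sentence)
```

Calling `tag_sentence('Hindi sentence here')` from inside the cloned repo would retrain the model on every call, so a real app would probably keep the trained object around and reuse it.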

In 'train.txt' and 'test.txt', each sentence must be followed, on the next line, by its entity labels. For example, in the training data provided with this repo ('train.txt'), the words of each sentence are classified into the entities source ('S'), destination ('D'), time ('T'), and number of bookings ('N'). The label line therefore has the form D:word,N:word,T:word
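To make the file layout concrete, here is a rough sketch of a reader for it. It is not the repo's own parser (see data_parser.py for that). It assumes that blank lines can be ignored, that sentence and label lines strictly alternate, and that words contain no commas or colons. The sample sentence is made up for illustration and is not taken from 'train.txt'.

```python
from typing import List, Tuple


def parse_labels(line: str) -> List[Tuple[str, str]]:
    """Turn 'D:word,N:word' into [('D', 'word'), ('N', 'word')]."""
    pairs = []
    for item in line.strip().split(","):
        if not item.strip():
            continue  # skip empty pieces such as a trailing comma
        label, _, word = item.partition(":")
        pairs.append((label.strip(), word.strip()))
    return pairs


def parse_examples(text: str) -> List[Tuple[str, List[Tuple[str, str]]]]:
    """Pair each sentence line with the label line that follows it."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    examples = []
    # Step through the lines two at a time: sentence, then labels.
    for i in range(0, len(lines) - 1, 2):
        examples.append((lines[i].strip(), parse_labels(lines[i + 1])))
    return examples


# Hypothetical sample in the described layout (not from the repo's data).
sample = "मुझे दिल्ली के लिए दो टिकट चाहिए\nD:दिल्ली,N:दो\n"
examples = parse_examples(sample)
```

Each parsed example is a (sentence, labels) pair, where labels is a list of (entity, word) tuples.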

Watch a demo here: https://youtu.be/CqZFeb_YhoI

Thanks to CFILT, IIT Bombay for the Hindi Wordnet database.
