ExplainableHateSpeechDetection

This repo hosts the code for the Constituent Rationale Explainability Framework (C-REF). The framework was developed to address the problem that hate speech classification systems do not provide human rationales for their predictions. More details on the motivation are in the paper (C-REF_DY.pdf), also hosted in this repo.

To use this repo, the classifier component of the model must first be fine-tuned on span constituents. For the HateXplain dataset used in the paper, this data is already generated; for similar datasets, use the preprocessing.py script under helpers/. Once the model is trained, the evaluator.py and inferencer.py scripts can be used.
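
As a rough illustration of what span-level training data can look like, the sketch below groups a per-token rationale mask into contiguous spans. It is not the logic of helpers/preprocessing.py, which may build constituents differently (for example, from a syntactic parse). The function name `rationale_spans` and the input layout (a token list plus a 0/1 mask of equal length) are assumptions made for this example only.

```python
from typing import List, Tuple


def rationale_spans(tokens: List[str], mask: List[int]) -> List[Tuple[int, int, str]]:
    """Hypothetical helper: return (start, end, text) for each maximal run of
    flagged tokens in `mask`. `end` is exclusive."""
    if len(tokens) != len(mask):
        raise ValueError("tokens and mask must have the same length")

    spans = []
    start = None  # index where the current run of flagged tokens began
    for i, flag in enumerate(mask):
        if flag and start is None:
            start = i  # a new span opens here
        elif not flag and start is not None:
            spans.append((start, i, " ".join(tokens[start:i])))
            start = None  # the span closed at the previous token
    if start is not None:
        # the final span runs to the end of the sequence
        spans.append((start, len(tokens), " ".join(tokens[start:])))
    return spans


# Placeholder tokens stand in for real text.
example_tokens = ["w0", "w1", "w2", "w3", "w4"]
example_mask = [0, 1, 1, 0, 1]
example_spans = rationale_spans(example_tokens, example_mask)
```

Each resulting span could then be paired with the post's label to form one training example for the classifier, though the pairing scheme actually used in this repo should be checked against the paper and the preprocessing script.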

This repo hosts hateful and vile language data; please peruse at your own risk.

