Fair-Forward/afrisent-semeval-2023

 
 

This repository contains the data for SemEval 2023 Shared Task 12: Sentiment Analysis in African Languages (AfriSenti-SemEval). More information can be found on the shared task and competition websites.

No. Language Country
1 Algerian Arabic (arq) Algeria
2 Amharic (amh) Ethiopia
3 Hausa (hau) Nigeria
4 Igbo (ibo) Nigeria
5 Kinyarwanda (kin) Rwanda
6 Moroccan Arabic/Darija (ary) Morocco
7 Mozambique Portuguese (pt-MZ) Mozambique
8 Nigerian Pidgin (pcm) Nigeria
9 Oromo (orm) Ethiopia
10 Swahili (swa) Kenya/Tanzania
11 Tigrinya (tir) Ethiopia
12 Twi (twi) Ghana
13 Xitsonga (tso) Mozambique
14 Yoruba (yor) Nigeria

Dataset

The AfriSenti dataset is available on Hugging Face and in the data folder of this repository.
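As a quick sanity check after downloading a split, the sketch below counts how often each sentiment label occurs. It assumes a tab-separated file with a header row and columns named `tweet` and `label`; these column names, and the placeholder rows used here, are assumptions for illustration and should be checked against the actual files in the data folder.

```python
import csv
import io
from collections import Counter


def label_distribution(tsv_file, label_column="label"):
    """Count label occurrences in a tab-separated split with a header row.

    `label_column` is an assumed column name; adjust it to match the file.
    """
    # QUOTE_NONE keeps quotation marks inside tweets from being parsed as CSV quoting.
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    return Counter(row[label_column] for row in reader)


# Synthetic placeholder rows, NOT real AfriSenti tweets.
sample = io.StringIO(
    "ID\ttweet\tlabel\n"
    "ex_1\tplaceholder text one\tpositive\n"
    "ex_2\tplaceholder text two\tnegative\n"
    "ex_3\tplaceholder text three\tpositive\n"
)

counts = label_distribution(sample)
print(counts)
```

To run this on a real split, pass a file opened with `open(path, encoding="utf-8", newline="")` instead of the in-memory sample, where `path` points to a file from the data folder.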

If you have used our dataset, please cite the following three papers: AfriSenti dataset paper, AfriSenti-SemEval task description paper (coming soon), and NaijaSenti paper.

@misc{muhammad2023afrisenti,
    title={{AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages}},
    author={Shamsuddeen Hassan Muhammad and Idris Abdulmumin and Abinew Ali Ayele and Nedjma Ousidhoum and David Ifeoluwa Adelani and Seid Muhie Yimam and Ibrahim Sa'id Ahmad and Meriem Beloucif and Saif Mohammad and Sebastian Ruder and Oumaima Hourrane and Pavel Brazdil and Felermino Dário Mário António Ali and Davis David and Salomey Osei and Bello Shehu Bello and Falalu Ibrahim and Tajuddeen Gwadabe and Samuel Rutunda and Tadesse Belay and Wendimu Baye Messelle and Hailu Beshada Balcha and Sisay Adugna Chala and Hagos Tesfahun Gebremichael and Bernard Opoku and Steven Arthur},
    year={2023},
    doi={10.48550/arXiv.2302.08956},
    url={https://arxiv.org/abs/2302.08956}
}

@inproceedings{muhammadSemEval2023,
    title = {{SemEval-2023 Task 12: Sentiment Analysis for African Languages (AfriSenti-SemEval)}},
    author = {Shamsuddeen Hassan Muhammad and Idris Abdulmumin and Seid Muhie Yimam and David Ifeoluwa Adelani and Ibrahim Sa'id Ahmad and Nedjma Ousidhoum and Abinew Ali Ayele and Saif M. Mohammad and Meriem Beloucif and Sebastian Ruder},
    booktitle = {Proceedings of the 17th {{International Workshop}} on {{Semantic Evaluation}} ({{SemEval-2023}})},
    publisher = {{Association for Computational Linguistics}},
    year = {2023}
}

@inproceedings{muhammad-etal-2022-naijasenti,
    title = "{N}aija{S}enti: A {N}igerian {T}witter Sentiment Corpus for Multilingual Sentiment Analysis",
    author = "Muhammad, Shamsuddeen Hassan  and Adelani, David Ifeoluwa  and Ruder, Sebastian  and Ahmad, Ibrahim Sa{'}id  and Abdulmumin, Idris  and Bello, Bello Shehu  and Choudhury, Monojit  and Emezue, Chris Chinenye  and Abdullahi, Saheed Salahudeen  and Aremu, Anuoluwapo  and Jorge, Al{\'\i}pio  and Brazdil, Pavel",
    booktitle = "Proceedings of the Thirteenth Language Resources and Evaluation Conference",
    month = jun,
    year = "2022",
    address = "Marseille, France",
    publisher = "European Language Resources Association",
    url = "https://aclanthology.org/2022.lrec-1.63",
    pages = "590--602",
}

AfriSenti-SemEval 2023 Shared Task

We provide the training, dev, and test sets for each task below.

Sample Tweets and Distribution

Sentiment Lexicon

We provide sentiment lexicons for some languages, which may be useful for the task.

Shared Task Starter kit

We provide a starter kit for the competition that produces a baseline result. You can open the starter kit in a Colab notebook and run the baseline system. Submitting the resulting output to CodaLab is a good way to confirm that you understand the submission format. You can then build your own system for the competition.

To run the Colab notebook, fork this repo first and click the "Open In Colab" badge on the forked version.

  • Task A: Open In Colab
  • Task B: Open In Colab

Funding Acknowledgements

This competition receives generous support from the Lacuna Fund.

License

Shield: CC BY 4.0

This work is licensed under a Creative Commons Attribution 4.0 International License.


About

AfriSenti-SemEval Shared Task 12: Sentiment Analysis for African Languages: https://afrisenti-semeval.github.io/
