The repository contains a from-scratch implementation of a Transformer encoder and its training on the emotion classification task (emotion dataset). The bert-base-uncased tokenizer was used.
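A minimal sketch of what such a from-scratch encoder block might look like in PyTorch, paired with the bert-base-uncased tokenizer (class name and hyperparameters here are illustrative assumptions, not the repository's exact code):

```python
import torch
import torch.nn as nn
from transformers import AutoTokenizer

class EncoderBlock(nn.Module):
    def __init__(self, dim=256, heads=4, ff_dim=512, dropout=0.1):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, dropout=dropout, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(dim, ff_dim), nn.GELU(), nn.Linear(ff_dim, dim))
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.drop = nn.Dropout(dropout)

    def forward(self, x, pad_mask=None):
        # Pre-norm self-attention followed by a position-wise feed-forward layer,
        # both wrapped in residual connections.
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, key_padding_mask=pad_mask)
        x = x + self.drop(attn_out)
        x = x + self.drop(self.ff(self.norm2(x)))
        return x

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = tokenizer(["i feel great today"], padding=True, return_tensors="pt")

# Token embeddings feed the encoder block; padding positions are masked out.
embed = nn.Embedding(tokenizer.vocab_size, 256)
block = EncoderBlock()
hidden = block(embed(batch["input_ids"]), pad_mask=batch["attention_mask"] == 0)
```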
The repository contains a comparison of models trained on the emotion classification task (emotion dataset):
- Logistic Regression and SVM ("classic" algorithms, trained on DistilBERT embeddings; see the sketch after this list),
- LSTM and BiLSTM (custom models written in PyTorch, trained on fastText embeddings),
- fine-tuned DistilBERT.
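A hedged sketch of the "classic" pipeline: sentence embeddings taken from a frozen DistilBERT encoder and fed to scikit-learn classifiers. The model names, pooling choice, and toy data below are assumptions, not the repository's exact code:

```python
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
encoder = AutoModel.from_pretrained("distilbert-base-uncased").eval()

def embed(texts):
    # Use the [CLS] (first-token) hidden state as a fixed-size sentence embedding.
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = encoder(**enc).last_hidden_state
    return out[:, 0, :].numpy()

# Toy example; in practice the full emotion dataset splits are embedded this way.
X_train = embed(["i feel happy", "this is awful"])
y_train = [1, 0]
LogisticRegression(max_iter=1000).fit(X_train, y_train)
SVC(kernel="rbf").fit(X_train, y_train)
```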
The repository contains a comparison of DistilBERT models trained on the emotion classification task (emotion dataset). The compared models are:
- DistilBERT with the first 6, 4, and 2 layers frozen (see the freezing sketch after this list),
- DistilBERT trained with bottleneck adapters,
- fully unfrozen DistilBERT.
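An illustrative sketch (assumed helper, not the repository's code) of freezing the first N transformer layers of DistilBERT before fine-tuning:

```python
from transformers import AutoModelForSequenceClassification

def freeze_first_layers(model, n_frozen):
    # Freeze the embeddings and the first n_frozen transformer blocks;
    # the remaining blocks and the classification head stay trainable.
    for p in model.distilbert.embeddings.parameters():
        p.requires_grad = False
    for layer in model.distilbert.transformer.layer[:n_frozen]:
        for p in layer.parameters():
            p.requires_grad = False
    return model

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=6
)
model = freeze_first_layers(model, n_frozen=6)  # likewise with 4 or 2
```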
The repository contains dataset preprocessing and XLM-RoBERTa fine-tuning on the named entity recognition task, using the 'de', 'fr', 'it', and 'en' subsets of the xtreme dataset. Cross-lingual transfer has also been examined.
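A hedged sketch of that NER setup: loading the PAN-X configurations of xtreme and an XLM-RoBERTa token-classification head. The label count and the exact dataset handling are assumptions based on the description above:

```python
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForTokenClassification

langs = ["de", "fr", "it", "en"]
panx = {lang: load_dataset("xtreme", name=f"PAN-X.{lang}") for lang in langs}

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForTokenClassification.from_pretrained(
    "xlm-roberta-base", num_labels=7  # O plus B/I tags for PER, ORG, LOC
)
```

Cross-lingual transfer can then be measured by fine-tuning on one language (e.g. 'de') and evaluating on the others without further training.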
The repository contains fine-tuning of the distilled Pegasus model distill-pegasus-cnn-16-4 on the abstractive summarization task (SAMSum dataset).
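A minimal sketch of loading that checkpoint and the SAMSum dataset for seq2seq fine-tuning; the full Hub id below (sshleifer/distill-pegasus-cnn-16-4) and the length limits are assumptions, not taken from the repository:

```python
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

ckpt = "sshleifer/distill-pegasus-cnn-16-4"  # assumed Hub id for the checkpoint
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForSeq2SeqLM.from_pretrained(ckpt)

samsum = load_dataset("samsum")

def preprocess(batch):
    # Tokenize dialogues as inputs and reference summaries as labels.
    inputs = tokenizer(batch["dialogue"], max_length=1024, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=128, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = samsum.map(preprocess, batched=True)
```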
The repository contains a short study of how decoding parameters (temperature, number of beams, top-k, top-p) affect the quality of text generated by a trained GPT-2 model.
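An illustrative sketch of how those decoding parameters can be varied with a GPT-2 model via `generate`; the prompt and parameter values are placeholders, not the study's settings:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The weather today", return_tensors="pt")

# Beam search: output shaped by the number of beams.
beam_out = model.generate(**inputs, max_new_tokens=40, num_beams=5, early_stopping=True)

# Sampling: output shaped by temperature, top-k and top-p.
sample_out = model.generate(
    **inputs, max_new_tokens=40, do_sample=True,
    temperature=0.8, top_k=50, top_p=0.9,
)
print(tokenizer.decode(sample_out[0], skip_special_tokens=True))
```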