This project aims to make DeepMind's Language Perceiver easy to use for multi-label text classification.
- Clone the repo
git clone https://github.com/DvdNss/nlp-perceiver
- Install requirements
pip install -r requirements.txt
Repository structure:
- data/: contains torch data files
- model/: contains models
- resource/: contains readme images
- source/: contains main scripts
  - databuilder.py: loads, transforms and saves datasets
  - train.py: training script
  - mapping.py: mapping functions
  - evaluate.py: evaluation script
  - pipeline.py: model pipeline (inference)
  - inference_example.py: inference use case
  - app.py: streamlit app script
- Set the correct mapping functions in source/mapping.py for a given dataset (a short usage sketch follows the snippet below)
from typing import List


# Map inputs
def map_inputs(row: dict):
    """
    Map inputs with a given format.

    :param row: dataset row
    :return: text used as model input
    """

    return row['text']


# Map targets
def map_targets(labels: List[int]):
    """
    Map targets with a given format.

    :param labels: list of label indices
    :return: dict containing a 28-dim multi-hot target vector
    """

    targets = [0] * 28
    for label in labels:
        targets[label] = 1

    return {'targets': targets}
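For reference, here is what these functions produce on a go_emotions-style row. The example row and label indices are made up, and the import assumes the functions live in source/mapping.py and are called from the source/ directory (as in the inference example further down).

```python
from mapping import map_inputs, map_targets

# Made-up go_emotions-style row: raw text plus a list of label indices
row = {'text': 'Thanks a lot, this made my day!', 'labels': [15, 17]}

print(map_inputs(row))             # 'Thanks a lot, this made my day!'
print(map_targets(row['labels']))  # dict with a 28-dim multi-hot list, 1s at indices 15 and 17
```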
- Build the torch files using the source/databuilder.py script
python source/databuilder.py --dataset go_emotions --split train+validation --output_dir data --max_size max_size
Once the script has finished running, there should be a .pt file in the output_dir for each split you selected (a quick way to inspect them is sketched below).
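If you want to sanity-check the generated files, they can be loaded back with torch. The exact structure of what databuilder.py saves is not documented here, so treat this as a sketch; the data/train.pt path is an assumption based on the command above.

```python
import torch

# Assumed output path for the train split; adjust to whatever databuilder.py actually wrote
train_data = torch.load('data/train.pt')

# Inspect what was saved (the exact type depends on databuilder.py)
print(type(train_data))
print(train_data)
```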
- Train your model using the source/train.py script
python source/train.py --train_data train_data --validation_data validation_data --batch_size batch_size --lr lr --epochs epochs --output_dir output_dir
A model will be saved in output_dir after each epoch, named output_dir/perceiver-e<epoch>-acc<eval_acc>.pt (a sketch for picking the best checkpoint follows).
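Since each checkpoint name embeds its evaluation accuracy, you can select the best one programmatically. This is a small sketch, not part of the repo; the model directory passed to it is just an example.

```python
import re
from pathlib import Path

# Checkpoints are named perceiver-e<epoch>-acc<eval_acc>.pt
PATTERN = re.compile(r'perceiver-e(\d+)-acc([\d.]+)\.pt')


def best_checkpoint(output_dir: str) -> Path:
    """Return the checkpoint path with the highest evaluation accuracy."""
    scored = []
    for path in Path(output_dir).glob('perceiver-e*-acc*.pt'):
        match = PATTERN.fullmatch(path.name)
        if match:
            scored.append((float(match.group(2)), path))
    if not scored:
        raise FileNotFoundError(f'No checkpoints found in {output_dir}')
    return max(scored)[1]


print(best_checkpoint('model'))
```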
- Evaluate your model using the source/evaluate.py script
python source/evaluate.py --model model_path --validation_data validation_data --batch_size batch_size
- Run inference using the source/pipeline.py script (see the use case in inference_example.py; a label-name sketch follows the snippet)
from pipeline import MultiLabelPipeline, inputs_to_dataset

model_path = '../model/perceiver-e2-acc0.pt'

# Load pipeline
pipeline = MultiLabelPipeline(model_path=model_path)

# Build a little dataset
inputs = ['This is a test.', 'Another test.', 'The final test.']

# Make inference
outputs = pipeline(inputs_to_dataset(inputs), batch_size=3)
print(outputs)
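The model was trained on go_emotions, so predictions refer to its 28 emotion classes. The exact output format of MultiLabelPipeline is not shown here, but if you end up with a 28-dimensional multi-hot or score vector per input, the class names can be recovered from the datasets library; a sketch with an illustrative prediction:

```python
from datasets import load_dataset

# go_emotions exposes its 28 class names through the dataset features
label_names = load_dataset('go_emotions', split='train').features['labels'].feature.names

# Illustrative 28-dim multi-hot prediction (not an actual pipeline output)
prediction = [0] * 28
prediction[15] = 1
prediction[17] = 1

print([name for name, flag in zip(label_names, prediction) if flag == 1])
```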
- Finally, run the Streamlit app
streamlit run app.py
David NAISSE - @LinkedIn - [email protected]