Major Cleanup #1 #252

Mehrad0711 · 2022-02-28T20:21:11Z

A cleanup for genienlp has been long overdue!

We've accumulated quite a bit of technical debt that we need to address. We have created a project dashboard to keep track of the issues we want to tackle. I've moved the cards this PR addresses to "In progress".

The goal of this series of cleanups is to simplify the code and make it more readable, easier to modify, and more inviting to outside collaborators.

There are no functional changes (besides dropping the almond_multilingual tasks and associated code), and tests are still passing.

We're dropping support for the old generic datasets and associated metrics that were inherited from the decaNLP code.
Those implementations are obsolete and should be replaced with what is now available through the datasets library. Please see issue. If you want support for a new dataset, feel free to submit a PR for it.

s-jse

Thanks, overall this is a good cleanup, but you should make a few changes before merging it. Please see the individual comments.
Also, have you tested this code with old non-multilingual models? What about the genie-k8s scripts, especially since you have removed training/prediction arguments?

genienlp/metrics.py

genienlp/models/base.py

genienlp/predict.py

genienlp/train.py

lgtm-com · 2022-03-02T22:26:44Z

This pull request fixes 3 alerts when merging 91c3031 into a5089ae - view on LGTM.com

fixed alerts:

2 for Except block handles 'BaseException'
1 for Wrong name for an argument in a call
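
For context, the first alert type refers to exception handlers that catch `BaseException` (for example through a bare `except:`), which also swallows `SystemExit` and `KeyboardInterrupt`. The sketch below is illustrative only: the function names are hypothetical and it is not taken from the genienlp diff.

```python
def run_broad(action):
    """Pattern the alert flags: catches everything, including SystemExit."""
    try:
        return action()
    except BaseException:  # same effect as a bare `except:`
        return None


def run_narrow(action):
    """Usual fix: catch Exception so interpreter-exit signals propagate."""
    try:
        return action()
    except Exception:
        return None


def request_exit():
    # Hypothetical action that asks the interpreter to exit.
    raise SystemExit(1)


def fail():
    # Hypothetical action that raises an ordinary error.
    raise ValueError("bad input")
```

With these definitions, `run_broad(request_exit)` silently returns `None`, while `run_narrow(request_exit)` lets `SystemExit` propagate to the caller. Both variants still handle the ordinary `ValueError` from `fail`.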

Mehrad0711 added 6 commits February 25, 2022 15:02

Remove almond_dialogue_multilingual_{nlu|nlg} tasks

db6923e

predict: remove code added for task with multiple subfolders, one per language

bc77aaa

metrics: remove obsolete metric implementations

aecd306

these were inherited from decaNLP but the implementations are old and no longer used. We can import similar metrics from HF/datasets library.

Remove almond multilingual tasks

c05d884

Remove obsolete generic tasks inherited from decanlp code

c49ff4b

Move validation code to appropriate model class

7485141

Mehrad0711 force-pushed the wip/mehrad/cleanups branch from ff40c58 to 83cbff2 Compare February 28, 2022 21:17

Mehrad0711 added 4 commits February 28, 2022 13:19

Remove duplicate code in transformer_sequence_classification

0a8bfa1

Remove data caching

96f6db7

it's almost never been used

predict: move loop outside of create_output_lines

1c301f2

Remove no longer needed arguments

5ccaa1a

Mehrad0711 force-pushed the wip/mehrad/cleanups branch from 83cbff2 to 5ccaa1a Compare February 28, 2022 21:19

Mehrad0711 added 5 commits February 28, 2022 13:22

Remove test_main_almond_multilingual

1b790f6

Minor fixes

5e27d18

Define wrapper model classes to avoid duplicate code in models

f9b1985

base: bug fixes

f26cd12

Drop removed nf1 metric

c4ac2cd

Mehrad0711 requested a review from s-jse February 28, 2022 22:53

s-jse requested changes Mar 1, 2022

genienlp/metrics.py (resolved)

genienlp/metrics.py (outdated, resolved)

genienlp/models/base.py (resolved)

genienlp/models/base.py (resolved)

genienlp/predict.py (resolved)

stanford-oval deleted a comment from lgtm-com bot Mar 1, 2022

Mehrad0711 added 2 commits March 1, 2022 20:38

metrics: add rouge score

939f9d0

Simplify validation code

8833a1b

Mehrad0711 force-pushed the wip/mehrad/cleanups branch from 3df2253 to 8833a1b Compare March 2, 2022 04:38

stanford-oval deleted a comment from lgtm-com bot Mar 2, 2022

s-jse reviewed Mar 2, 2022

genienlp/train.py (outdated, resolved)

genienlp/train.py (outdated, resolved)

Mehrad0711 added 2 commits March 2, 2022 12:49

train: validate --> validate_while_training

e2cad6b

generation_output --> validation_output

91c3031

Mehrad0711 force-pushed the wip/mehrad/cleanups branch from aa3210a to 91c3031 Compare March 2, 2022 20:57

s-jse approved these changes Mar 2, 2022

Mehrad0711 merged commit fb35bef into master Mar 2, 2022

Mehrad0711 deleted the wip/mehrad/cleanups branch March 3, 2022 20:24

Mehrad0711 mentioned this pull request Mar 3, 2022

Cleanups stanford-oval/genie-k8s#68

Merged
