Support for Distilbert #289


Merged

andrelmfarias merged 18 commits into master from the distilbert branch on Nov 13, 2019
Conversation

andrelmfarias
Collaborator

This PR adds support for Distilbert as mentioned in #197.

I will also be releasing a Distilbert model trained on SQuAD 1.1 using knowledge distillation, with bert-large-uncased-whole-word-masking-finetuned-squad as the teacher.
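For context on the distillation setup mentioned above, here is a minimal sketch (not the author's actual training code) of a teacher-student objective for SQuAD-style span prediction: the student's start/end logits are trained against both the gold answer spans and the teacher's softened logits. All function and argument names below are illustrative, not taken from this PR.

```python
# Hypothetical sketch of a knowledge-distillation loss for extractive QA.
# Nothing here comes from the PR diff; names and hyperparameters are illustrative.
import torch.nn.functional as F

def distillation_loss(student_start, student_end,      # student logits [batch, seq_len]
                      teacher_start, teacher_end,      # teacher logits [batch, seq_len]
                      start_positions, end_positions,  # gold span indices [batch]
                      T=2.0, alpha=0.5):
    # Soft targets: KL divergence between student and teacher distributions
    # at temperature T, scaled by T^2 as is conventional for distillation.
    def soft(s_logits, t_logits):
        return F.kl_div(
            F.log_softmax(s_logits / T, dim=-1),
            F.softmax(t_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T ** 2)

    # Hard targets: standard span cross-entropy on the gold start/end positions.
    hard = (F.cross_entropy(student_start, start_positions)
            + F.cross_entropy(student_end, end_positions)) / 2.0
    soft_part = (soft(student_start, teacher_start)
                 + soft(student_end, teacher_end)) / 2.0
    return alpha * soft_part + (1.0 - alpha) * hard
```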

From my experiments, this version of Distilbert achieves 80.1% EM and 87.5% F1-score (vs. 81.2% EM and 88.6% F1-score for our version of BERT), while being much faster and lighter.
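As a usage note (a sketch, not part of this PR's diff): the released reader should plug into the existing cdQA pipeline the same way the current BERT reader does. The QAPipeline import path, the reader argument, and the fit_retriever / predict calls below follow the cdQA README conventions of this period and may differ slightly across versions; the corpus DataFrame is a toy example.

```python
import pandas as pd
from cdqa.pipeline.cdqa_sklearn import QAPipeline

# Toy corpus in cdQA's expected format: one row per document,
# with a list of paragraph strings in the 'paragraphs' column.
df = pd.DataFrame({
    "title": ["DistilBERT"],
    "paragraphs": [[
        "DistilBERT is a smaller Transformer obtained by distilling BERT.",
        "It keeps most of BERT's SQuAD accuracy while being lighter to serve.",
    ]],
})

# Point the reader at the distilled model released with this PR
# (local path is illustrative).
cdqa_pipeline = QAPipeline(reader="distilbert_qa.joblib")

# Fit the retriever on the corpus, then query the full pipeline.
cdqa_pipeline.fit_retriever(df)
prediction = cdqa_pipeline.predict("Which model does this PR add support for?")
print(prediction)
```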

@codecov

codecov bot commented Oct 25, 2019

Codecov Report

Merging #289 into master will increase coverage by 23.2%.
The diff coverage is 58.33%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #289      +/-   ##
==========================================
+ Coverage   45.23%   68.44%   +23.2%     
==========================================
  Files          13       12       -1     
  Lines        1859     1255     -604     
==========================================
+ Hits          841      859      +18     
+ Misses       1018      396     -622
Impacted Files Coverage Δ
cdqa/utils/download.py 88.23% <ø> (ø) ⬆️
cdqa/pipeline/cdqa_sklearn.py 64.86% <0%> (-2.75%) ⬇️
cdqa/reader/bertqa_sklearn.py 62.28% <62.22%> (-1.18%) ⬇️
cdqa/utils/evaluation.py 81.45% <0%> (+8.87%) ⬆️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 50e1044...c7e8c5a. Read the comment docs.

@yunhaizhu

How did you train and dump "distilbert_qa.joblib"? It seems that many files for distilbert were not updated.

@andrelmfarias
Collaborator Author

andrelmfarias commented Oct 31, 2019

> How did you train and dump "distilbert_qa.joblib"?

distilbert_qa.joblib is an instance of BertQA, which has a torch model corresponding to the fine-tuned Distilbert model as an attribute. I trained this torch model using the official Hugging Face repo for distillation on SQuAD, with the only difference that I replaced bert-base-uncased with bert-large-uncased-whole-word-masking-finetuned-squad.

> it seems that many files for distilbert were not updated.

Which files are you talking about?
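To make the explanation above concrete, here is a small sketch of inspecting the released artifact. The attribute name model for the underlying torch network is an assumption based on the sklearn-style wrapper in cdqa/reader/bertqa_sklearn.py; check the class definition if your version differs.

```python
import joblib

# Load the released reader artifact (path is illustrative).
reader = joblib.load("distilbert_qa.joblib")

# The joblib file holds a BertQA estimator, not a bare torch checkpoint.
print(type(reader))        # expected: cdqa.reader.bertqa_sklearn.BertQA

# The fine-tuned DistilBERT network lives on the estimator as a torch module;
# the attribute name `model` is an assumption (see note above).
print(type(reader.model))  # expected: a DistilBERT question-answering module
```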

@yunhaizhu

> How did you train and dump "distilbert_qa.joblib"?
>
> distilbert_qa.joblib is an instance of BertQA, which has a torch model corresponding to the fine-tuned Distilbert model as an attribute. I trained this torch model using the official Hugging Face repo for distillation on SQuAD, with the only difference that I replaced bert-base-uncased with bert-large-uncased-whole-word-masking-finetuned-squad.

Thanks for the reply, got it.

> it seems that many files for distilbert were not updated.
>
> Which files are you talking about?

Oh, I checked bertqa_sklearn.py: the DistilBert model and corresponding classes are not used there.

@andrelmfarias andrelmfarias merged commit f68b92b into master Nov 13, 2019
@fmikaelian fmikaelian deleted the distilbert branch November 16, 2019 12:39