Support for Distilbert #289


Merged

andrelmfarias merged 18 commits into master from the distilbert branch on Nov 13, 2019
Conversation

andrelmfarias
Collaborator

This PR adds support for Distilbert as mentioned in #197.

I will also be releasing a Distilbert model trained on SQuAD 1.1 using knowledge distillation, with bert-large-uncased-whole-word-masking-finetuned-squad as the teacher.
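For context on the distillation setup mentioned above, here is a minimal sketch (not the author's actual training code) of a teacher-student objective for SQuAD-style span prediction: the student's start/end logits are trained against both the gold answer spans and the teacher's softened logits. All function and argument names below are illustrative, not taken from this PR.

```python
# Hypothetical sketch of a knowledge-distillation loss for extractive QA.
# Nothing here comes from the PR diff; names and hyperparameters are illustrative.
import torch.nn.functional as F

def distillation_loss(student_start, student_end,      # student logits [batch, seq_len]
                      teacher_start, teacher_end,      # teacher logits [batch, seq_len]
                      start_positions, end_positions,  # gold span indices [batch]
                      T=2.0, alpha=0.5):
    # Soft targets: KL divergence between student and teacher distributions
    # at temperature T, scaled by T^2 as is conventional for distillation.
    def soft(s_logits, t_logits):
        return F.kl_div(
            F.log_softmax(s_logits / T, dim=-1),
            F.softmax(t_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T ** 2)

    # Hard targets: standard span cross-entropy on the gold start/end positions.
    hard = (F.cross_entropy(student_start, start_positions)
            + F.cross_entropy(student_end, end_positions)) / 2.0
    soft_part = (soft(student_start, teacher_start)
                 + soft(student_end, teacher_end)) / 2.0
    return alpha * soft_part + (1.0 - alpha) * hard
```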

From my experiments, this version of Distilbert achieves 80.1% EM and 87.5% F1-score (vs. 81.2% EM and 88.6% F1-score for our version of BERT), while being much faster and lighter.
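As a usage note (a sketch, not part of this PR's diff): the released reader should plug into the existing cdQA pipeline the same way the current BERT reader does. The QAPipeline import path, the reader argument, and the fit_retriever / predict calls below follow the cdQA README conventions of this period and may differ slightly across versions; the corpus DataFrame is a toy example.

```python
import pandas as pd
from cdqa.pipeline.cdqa_sklearn import QAPipeline

# Toy corpus in cdQA's expected format: one row per document,
# with a list of paragraph strings in the 'paragraphs' column.
df = pd.DataFrame({
    "title": ["DistilBERT"],
    "paragraphs": [[
        "DistilBERT is a smaller Transformer obtained by distilling BERT.",
        "It keeps most of BERT's SQuAD accuracy while being lighter to serve.",
    ]],
})

# Point the reader at the distilled model released with this PR
# (local path is illustrative).
cdqa_pipeline = QAPipeline(reader="distilbert_qa.joblib")

# Fit the retriever on the corpus, then query the full pipeline.
cdqa_pipeline.fit_retriever(df)
prediction = cdqa_pipeline.predict("Which model does this PR add support for?")
print(prediction)
```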

@codecov

codecov bot commented Oct 25, 2019

Codecov Report

Merging #289 into master will increase coverage by 23.2%.
The diff coverage is 58.33%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #289      +/-   ##
==========================================
+ Coverage   45.23%   68.44%   +23.2%     
==========================================
  Files          13       12       -1     
  Lines        1859     1255     -604     
==========================================
+ Hits          841      859      +18     
+ Misses       1018      396     -622
Impacted Files Coverage Δ
cdqa/utils/download.py 88.23% <ø> (ø) ⬆️
cdqa/pipeline/cdqa_sklearn.py 64.86% <0%> (-2.75%) ⬇️
cdqa/reader/bertqa_sklearn.py 62.28% <62.22%> (-1.18%) ⬇️
cdqa/utils/evaluation.py 81.45% <0%> (+8.87%) ⬆️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 50e1044...c7e8c5a. Read the comment docs.

@yunhaizhu

How did you train and dump "distilbert_qa.joblib"? It seems that many files for distilbert were not updated.

@andrelmfarias
Collaborator Author

andrelmfarias commented Oct 31, 2019

> How did you train and dump "distilbert_qa.joblib"?

distilbert_qa.joblib is an instance of BertQA, which has a torch model corresponding to the fine-tuned Distilbert model as an attribute. I trained this torch model using the official Hugging Face repo for distillation on SQuAD, with the only difference that I replaced bert-base-uncased with bert-large-uncased-whole-word-masking-finetuned-squad.

> it seems that many files for distilbert were not updated.

Which files are you talking about?
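To make the explanation above concrete, here is a small sketch of inspecting the released artifact. The attribute name model for the underlying torch network is an assumption based on the sklearn-style wrapper in cdqa/reader/bertqa_sklearn.py; check the class definition if your version differs.

```python
import joblib

# Load the released reader artifact (path is illustrative).
reader = joblib.load("distilbert_qa.joblib")

# The joblib file holds a BertQA estimator, not a bare torch checkpoint.
print(type(reader))        # expected: cdqa.reader.bertqa_sklearn.BertQA

# The fine-tuned DistilBERT network lives on the estimator as a torch module;
# the attribute name `model` is an assumption (see note above).
print(type(reader.model))  # expected: a DistilBERT question-answering module
```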

@yunhaizhu

> How did you train and dump "distilbert_qa.joblib"?
>
> distilbert_qa.joblib is an instance of BertQA, which has a torch model corresponding to the fine-tuned Distilbert model as an attribute. I trained this torch model using the official Hugging Face repo for distillation on SQuAD, with the only difference that I replaced bert-base-uncased with bert-large-uncased-whole-word-masking-finetuned-squad.

Thanks for the reply, got it.

> it seems that many files for distilbert were not updated.
>
> Which files are you talking about?

Oh, I checked bertqa_sklearn.py: the DistilBert model and corresponding classes are not used there.

@andrelmfarias andrelmfarias merged commit f68b92b into master Nov 13, 2019
@fmikaelian fmikaelian deleted the distilbert branch November 16, 2019 12:39