Statistics of released dataset splits #7

Open
deepaknlp opened this issue Feb 15, 2019 · 5 comments

Comments

@deepaknlp

Hey @freesunshine0316,

Could you please confirm the train/dev/test sizes used in your experiments? With the released data, I am getting approximately 75500/17934/11805.
Is this correct?

Thanks

@freesunshine0316
Owner

I released the data I used. Sorry, I'm busy with something else at the moment. I assume your numbers are correct.

@Chevalier1024

I have the same question. The train/dev/test split in your GitHub repository is 71500/16758/11805, which differs from the splits reported in your paper (split1: 70,484/10,570/11,877; split2: 86,635/8,965/8,964). If convenient, could you please share the code or data for the train/dev/test splits? @deepaknlp @freesunshine0316

@deepaknlp
Author

deepaknlp commented Sep 18, 2019

@freesunshine0316
Split-2 link is here:
https://res.qyzhou.me/redistribute.zip

@freesunshine0316
Owner

freesunshine0316 commented Sep 18, 2019

Hi @Fengfeng1024

"Split-2" (released by Zhou et al.) exactly matches the reported statistics, and @deepaknlp just shared the link.
"Split-1" was originally released by Du et al., but we can't use it directly because it contains no information on the answer positions. As a result, we used their provided doclist-xxx.txt files to generate our own data (provided with this repository). However, we mistakenly reported their train/dev/test split in our paper.
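
For anyone checking their own copy of the data against the numbers quoted in this thread, here is a minimal sketch. It assumes one example per line in each split file, which may not match the actual file format. The helper names `count_examples` and `matching_splits` are hypothetical and are not part of this repository.

```python
from pathlib import Path

# Split sizes (train, dev, test) as quoted in this thread.
REPORTED_SPLITS = {
    "split1 (paper)": (70484, 10570, 11877),
    "split2 (paper)": (86635, 8965, 8964),
    "repository data (as counted by one commenter)": (71500, 16758, 11805),
}


def count_examples(path):
    """Count non-empty lines in a file.

    Assumption: one example per line. Adapt this if the data is stored
    in another format (for example, a single JSON array).
    """
    with Path(path).open(encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())


def matching_splits(sizes):
    """Return the names of quoted splits whose sizes equal `sizes` exactly."""
    sizes = tuple(sizes)
    return [name for name, reported in REPORTED_SPLITS.items() if reported == sizes]
```

Calling `matching_splits` with the three counts obtained from `count_examples` shows which of the quoted splits, if any, a local copy corresponds to. An empty list means the counts match none of them.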

@Chevalier1024

Thank you so much for your help.
