add an argument to use only labeled data in training process. #7

Merged: 5 commits merged into brain-research:master on Jul 14, 2019

Conversation

DoctorKey
Contributor

This may be a simple change that resolves the problem mentioned in #6.

@avital
Contributor

avital commented Nov 14, 2018

This looks good overall, with one concern: the change makes the effective batch size for the fully supervised experiments twice as large as in the original code.

Because batch size can affect experimental results, I think we should explicitly halve the batch size in the .yml files.

What do you think?
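
To make the batch-size concern concrete, here is a minimal sketch of the arithmetic. It assumes, as the comment above implies but the thread does not state outright, that the original pipeline filled roughly half of each 100-example batch with labeled examples. The function name `loss_batch_size` and the `labeled_fraction` parameter are hypothetical and do not come from the repository.

```python
def loss_batch_size(batch_size, labeled_fraction):
    """Number of examples per step that contribute to the supervised loss.

    labeled_fraction is an assumed share of labeled examples in each batch;
    it is an illustrative parameter, not a setting from the repository.
    """
    return int(batch_size * labeled_fraction)


# Original code (assumed): batch of 100, about half of it labeled.
original = loss_batch_size(100, 0.5)

# With the labeled-only option and an unchanged batch size of 100.
labeled_only = loss_batch_size(100, 1.0)

# With the labeled-only option and the batch size halved in the .yml files.
labeled_only_halved = loss_batch_size(50, 1.0)

print(original, labeled_only, labeled_only_halved)
```

Under these assumptions, halving the configured batch size brings the number of examples behind each loss computation back to the original value.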

@DoctorKey
Contributor Author

In the original code, the BN layers compute their statistics over a batch of 100. Simply halving the batch size keeps the effective batch size for the loss unchanged, but it changes the computation in the BN layers. The hyperparameters may need to be tuned again.
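
A rough illustration of why the BN computation changes with batch size: batch statistics estimated from 50 examples are noisier than those estimated from 100. The sketch below is a toy simulation on standard normal data using only the Python standard library. It is not the repository's BN implementation, and `batch_mean_spread` is a hypothetical helper.

```python
import random
import statistics


def batch_mean_spread(batch_size, num_batches=2000, seed=0):
    """Standard deviation of per-batch means over many simulated batches.

    Each batch holds batch_size draws from a standard normal distribution,
    standing in for one activation channel that BN would normalize.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(num_batches):
        batch = [rng.gauss(0.0, 1.0) for _ in range(batch_size)]
        means.append(sum(batch) / batch_size)
    return statistics.stdev(means)


spread_100 = batch_mean_spread(100)
spread_50 = batch_mean_spread(50)

# The smaller batch gives a noisier estimate of the mean.
print(spread_50 > spread_100)
```

In theory the spread of the batch mean scales with one over the square root of the batch size, so the batch of 50 should be noisier by a factor of about 1.4 in this toy setting. Whether that difference affects the reported results is the empirical question raised in this thread.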

@avital
Contributor

avital commented Nov 20, 2018

Here's what I propose we do:

Run all of the fully-supervised baselines with this pull request, in two ways: with batch size 100 and batch size 50. Hopefully the results end up being very close to what we have reported in the paper.

If the choice between batch size 50 and 100 matters, or if the results differ substantially from our original published results, we may want to re-tune the hyperparameters, which will take longer.

What do you think?

@DoctorKey
Contributor Author

OK. Good luck to you.

Review comment on lib/data_provider.py (outdated, resolved)
Review comment on train_model.py (outdated, resolved)
@avital
Contributor

avital commented Feb 7, 2019

Hi @DoctorKey, thanks again for the PR. I made some comments; please take a look. Also, have you run this code? Can you report the results for CIFAR-10 fully supervised and SVHN fully supervised before and after this change (the Table 1 runs)?

@JiechengZhao

JiechengZhao commented Feb 11, 2019

@DoctorKey Thanks for the code. We were also wondering about the effect of unlabeled data on the BN layers. We used your code to run a test on SVHN, and there was almost no change in the result. However, because of limited resources, we ran only part of the experiments.

avital and others added 3 commits February 13, 2019 13:00
@DoctorKey
Contributor Author

I ran the code when I made this PR. Because of limited resources, I only obtained the result for CIFAR-10 fully supervised. We found that the result was almost unchanged.

@diego898

diego898 commented Jul 8, 2019

hey @avital - just a friendly ping to see if this PR (and repo) are still planned to be merged. Thanks!

@avital avital merged commit 81fe4b2 into brain-research:master Jul 14, 2019
@avital
Contributor

avital commented Jul 14, 2019

Merged, thanks for the ping.

4 participants