
What does label shift exactly mean? #1

Open
luccachiang opened this issue Oct 19, 2022 · 2 comments

Comments

@luccachiang

Hello, I am reading your paper published at MICCAI. I am very interested in your wonderful work, but I find it difficult to understand the basic problem setting. I am wondering whether "label shift" means that the source and target class sets share exactly the same labels but the proportion of each class differs, or that the target class set contains the source one, i.e. there are unseen classes at test time.
In the paper, I notice the sentence "Label distribution shift and data distribution shift are two types of dataset shift, as introduced in previous work.", but I cannot find an exact definition in that work.
I may not be raising my question clearly, so let me take an example. Assume the training classes are DOG and CAT. Under the setting of your work, are the test classes DOG and CAT, the same as the training classes, or DOG, CAT, BIRD, and FISH, i.e. more than the training classes?
I really hope you can help me solve the question and will be very grateful! :-)

@WenaoMA
Collaborator

WenaoMA commented Oct 21, 2022

Thank you for your interest in our work. And yes, "label shift" means the proportion of each class is different; there are no unseen classes at test time.

Regarding the data distribution shift problem, it is caused by different vendors, acquisition parameters, or other factors in the imaging process, while the label distribution shift problem is inherently caused by different label distributions. The experimental results shown in Fig. 2 of our paper also demonstrate that even when there is no apparent data distribution shift between the training set and the test set, model performance can still be affected by the label distribution shift problem.

For example, in our problem setting, the class proportions are 0.9 and 0.1 for DOG and CAT on the training set, while on the test set they are 0.1 and 0.9 for DOG and CAT respectively (there are no unseen classes in the test set). Hope my answer helps you :)
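To make the DOG/CAT example above concrete, here is a small illustrative sketch (not from the paper; the helper names and the 10,000-sample size are my own choices) that simulates label shift: both splits contain exactly the same two classes, only the class proportions change.

```python
import random

def sample_labels(proportions, n, seed=0):
    """Draw n labels from a class -> proportion mapping."""
    rng = random.Random(seed)
    classes = list(proportions)
    weights = [proportions[c] for c in classes]
    return rng.choices(classes, weights=weights, k=n)

def empirical_proportions(labels):
    """Fraction of each class observed in a label list."""
    return {c: labels.count(c) / len(labels) for c in set(labels)}

# Label shift as described in the reply: same label set, different proportions.
train = sample_labels({"DOG": 0.9, "CAT": 0.1}, 10_000, seed=0)
test = sample_labels({"DOG": 0.1, "CAT": 0.9}, 10_000, seed=1)

print(empirical_proportions(train))  # roughly {'DOG': 0.9, 'CAT': 0.1}
print(empirical_proportions(test))   # roughly {'DOG': 0.1, 'CAT': 0.9}

# No unseen classes at test time: the class sets are identical.
assert set(train) == set(test)
```

A classifier trained on the first split implicitly learns the 0.9/0.1 prior, which is exactly what hurts it on the second split even when the images themselves look the same.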

@luccachiang
Author

Got it. Thanks for your kindness.
