
Will scaling the training set influence the conclusions? And why does this work "support" the idea of "weak-to-strong alignment"? #1

Open
yucc-leon opened this issue Oct 28, 2024 · 0 comments


Well done! My colleagues and I found similar results when training a domain-specific LLM, though we could not run experiments as clear and clean as yours.

After reading, however, it occurred to me that other works suggest millions of IFT examples can still be used to train domain LLMs, such as Huatuo-I/II (https://arxiv.org/abs/2304.06975, https://arxiv.org/pdf/2311.09774.pdf) and "Adapting Large Language Models via Reading Comprehension" from MSR. Why are the results so different, and what are the boundaries or preconditions that separate these two lines of work? There may be more to investigate here.

Also, would you explain why this paper supports "weak-to-strong alignment", especially if the weak model cannot identify the knowledge it has learned poorly?
