High quality few-shots from retrieved dataset #327

Open
rayendito opened this issue Sep 6, 2023 · 7 comments
Labels: enhancement, good first issue

Comments

@rayendito

Instead of only using examples given by the users in the user prompt, what if we try to use retrieved datasets (if applicable) as high-quality shots for the dataset generator?
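A minimal sketch of the idea, not tied to the project's actual classes: every name here (`build_few_shot_block`, `retrieved_rows`, the `input`/`output` keys, the prompt layout) is a hypothetical chosen for illustration. It assumes the retrieved dataset is available as a list of dicts and falls back to the user's own examples when nothing applicable was retrieved.

```python
import random
from typing import Optional


def build_few_shot_block(
    user_examples: list[str],
    retrieved_rows: Optional[list[dict[str, str]]] = None,
    num_shots: int = 3,
    seed: int = 0,
) -> str:
    """Format few-shot examples, preferring rows sampled from a retrieved dataset.

    All names and the text layout are assumptions for illustration only.
    """
    if not retrieved_rows:
        # No applicable dataset was retrieved: fall back to the user's examples.
        return "\n\n".join(user_examples[:num_shots])

    # Sample without replacement so the same row is never shown twice.
    rng = random.Random(seed)
    sampled = rng.sample(retrieved_rows, k=min(num_shots, len(retrieved_rows)))
    return "\n\n".join(
        f'input="{row["input"]}"\noutput="{row["output"]}"' for row in sampled
    )
```

The returned string could then be placed in the generator prompt where the user-provided examples normally go. Whether this actually improves the generated data would still need to be measured.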

@zhaochenyang20
Collaborator

Perfect idea. After we implement the autopilot dataset retriever, we can add your idea!

@zhaochenyang20 added the "good first issue" label on Sep 6, 2023
@neubig added the "enhancement" label on Sep 7, 2023
@neubig
Collaborator

neubig commented Sep 7, 2023

Actually, I think our existing dataset retriever is probably fine, so I don't think this issue is blocked by anything. @rayendito, if you'd like to take a stab at it, I could assign the issue to you!

@rayendito
Author

sure! i'll be happy to play around with it

@bilal-aamer

@neubig Sir, is this still being worked on by @rayendito? I would love to collaborate on this.

@neubig
Collaborator

neubig commented Jan 2, 2024

Hi @bilal-aamer, definitely go ahead and play around with this unless @rayendito has already finished!

@rayendito
Author

hi @bilal-aamer, I did a rough MVP implementation of this some months ago in https://github.com/rayendito/prompt2model/tree/sample-from-dataset, but I haven't opened a PR yet because we first need to check whether this method actually yields better results. I've been following the discussion on the #multilingual channel on Discord and chimed in a few times. I thought a core team member might already be working on this, so I mostly spectated because I didn't want to step on anyone's toes 😅 but if @neubig says go ahead, then go for it! I'm particularly interested in MT and am currently exploring a more specific, MT-only version of this issue.

@neubig
Collaborator

neubig commented Jan 2, 2024

Oh, go ahead! I think this is distinct methodologically from what @VanyaBK is looking at, so please go ahead and run whatever you want to.

And @bilal-aamer, maybe you could discuss with @rayendito on Discord and see if there's anything he could use help with. I'm happy to pitch in to the discussion.
