Added benchmarking results to main README.md #176

Merged
merged 19 commits into caikit:main on Sep 12, 2023

Conversation

olson-ibm
Contributor

Added a section to the main README.md for benchmarking. It links to a second README.md in ./benchmarking that contains a poorly formatted table summarizing the benchmarking results for Llama2-7b on 1 x A100 80GB on the CCC.

Also included under ./logs is the output from the individual benchmarks ("showing our work"), including all the parameters used for training, in the event someone wants to reproduce the results.

The poor formatting of the table appears to be an issue with how GitHub renders the markdown. I spent two hours adjusting the formatting, reading docs, and otherwise trying to get the column widths to change, but failed in every attempt. Several artifacts of those attempts are still left in the table formatting as evidence that nothing works to set the column widths so the table looks clean. That this is a systematic problem with GitHub's markdown rendering is also demonstrated by the table in the main README.md, which was not changed in this PR.

Comment on lines +9 to +10
| [2023-09-05](./logs/llama2-7b/20230905_183655.output) | 1 x A100 80GB | [Glue / RTE](https://huggingface.co/datasets/glue) | 1 | bfloat16 | 6 | 4096 | 350 | 21.325 | 0.22 | 1.65 | 4096 is the context size for Llama2 |
| [2023-09-05](./logs/llama2-7b/20230905_184809.output) | 1 x A100 80GB | [Glue / RTE](https://huggingface.co/datasets/glue) | 1 | bfloat16 | 6 | 1024 | 350 | 21.333 | 0.22 | 1.65 | batch size of 7 fails CUDA OOM |
Collaborator

I think RTE might have text long enough for a 4096 context length, but definitely good to note, and we can iterate later with a different dataset.

README.md Outdated
@@ -41,6 +41,10 @@ Prompt tuning - learning soft prompts. This is different from prompt engineering

The important difference between fine tuning and capabilities like prompt tuning/multi-task prompt tuning is that the latter doesn't change the base model's weights at all. So when you run inference for prompt-tuned models, you can have _n_ prompts to 1 base model, and just inject the prompt tensors you need when they're requested instead of having _n_ separate fine-tuned models.

### Benchmarking

[Benchmarks](./benchmarks/README.md) for tuning various models.
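
To illustrate the "n prompts to 1 base model" point in the quoted paragraph above, here is a minimal sketch of serving several tuned prompts against one shared base model. It is not part of this PR; it assumes the Hugging Face transformers and peft libraries (and that multiple prompt-tuning adapters can be loaded onto one base model), and the model name and adapter paths are hypothetical placeholders.

```python
# Minimal sketch (not part of this PR): several prompt-tuned tasks sharing one
# base model. Assumes Hugging Face transformers + peft; paths are hypothetical.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-hf"  # placeholder; requires model access
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id)

# Load the first tuned prompt, then register a second one on the same base.
# Only the small prompt tensors differ per task; the 7B weights are shared.
model = PeftModel.from_pretrained(base, "prompts/task_a", adapter_name="task_a")
model.load_adapter("prompts/task_b", adapter_name="task_b")

def generate(text: str, task: str) -> str:
    # Inject the prompt tensors for the requested task at inference time.
    model.set_adapter(task)
    inputs = tokenizer(text, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=20)
    return tokenizer.decode(output[0], skip_special_tokens=True)

print(generate("The movie was great. Sentiment:", "task_a"))
print(generate("Premise and hypothesis... Entailment:", "task_b"))
```

The base weights are loaded once; only the small prompt tensors are swapped per request, which is what distinguishes this from keeping _n_ separately fine-tuned copies of the model.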
Collaborator

nit: Can we add a small description here too (similar to the benchmarks README) so that people aren't looking for quality metrics when they look at this one?

Contributor Author

Rephrase as "Performance Benchmarking"?

Collaborator
@gkumbhat gkumbhat Sep 11, 2023

Yep. Rephrasing as Runtime Performance Benchmarking sounds good!

Signed-off-by: Joe Olson <[email protected]>
Collaborator
@gkumbhat gkumbhat left a comment

LGTM. Thanks @olson-ibm

@gkumbhat gkumbhat merged commit 091e271 into caikit:main Sep 12, 2023
4 checks passed