Story: Plots, Metrics for LLM support [2] #138
## Q&A

### Tasks variants
### Code

Preloads a series of models (DistilBERT family). Uses the `Trainer` class from the HF Transformers library, so the DVCLive HF callback can easily be implemented here. Latest DVCLive HF callback taken from #649.

```python
from dvclive.huggingface import DVCLiveCallback
from transformers import AutoModelForQuestionAnswering, TrainingArguments, Trainer

model = AutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")
...
...
training_args = TrainingArguments(
    output_dir="output",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    logging_strategy="epoch",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    save_strategy="epoch",
    weight_decay=0.01,
    push_to_hub=True,
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_squad["train"],
    eval_dataset=tokenized_squad["test"],
    tokenizer=tokenizer,
    data_collator=data_collator,
    post_process_function=post_processing_function,
    callbacks=[DVCLiveCallback(save_dvc_exp=True, log_model=True)],  # DVCLive
)
trainer.train()
```

### Metrics

During training, the model computes the loss. Evaluation returns a dictionary that can be logged through the DVC API:

```python
import evaluate

metric = evaluate.load("squad")
theoretical_answers = [
    {"id": ex["id"], "answers": ex["answers"]} for ex in small_eval_set
]
metrics_load_dvc = metric.compute(predictions=predicted_answers, references=theoretical_answers)
```

`metrics_load_dvc` returns `{'exact_match': 83.0, 'f1': 88.25}`.

As I understand it, once the model is fine-tuned we should see these values in each table row (in the DVC VSCode extension) per experiment. Important note: we can use the loss as a live metric during fine-tuning, while `exact_match` and `f1` would appear in the table once the fine-tuning process is finished.

```python
from dvclive import Live

with Live() as live:
    live.log_metric("exact_match", metrics_load_dvc.get('exact_match'))
    live.log_metric("f1", metrics_load_dvc.get('f1'))
```
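The two-phase pattern described above (per-epoch loss available live during fine-tuning, `exact_match`/`f1` only once fine-tuning finishes) can be sketched with a minimal stand-in logger. `FakeLive` and all values below are hypothetical illustrations, not the dvclive API:

```python
class FakeLive:
    """Minimal stand-in illustrating dvclive-style logging: per-step
    metrics during training, summary metrics written at the end."""
    def __init__(self):
        self.step = 0
        self.history = []   # (step, name, value) recorded each epoch
        self.summary = {}   # final value per metric name

    def log_metric(self, name, value):
        self.history.append((self.step, name, value))
        self.summary[name] = value

    def next_step(self):
        self.step += 1

live = FakeLive()
# phase 1: loss is known every epoch, so it can be plotted live
for loss in [0.91, 0.54, 0.38]:
    live.log_metric("loss", loss)
    live.next_step()
# phase 2: exact_match / f1 are only known after fine-tuning finishes
live.log_metric("exact_match", 83.0)
live.log_metric("f1", 88.25)
```

The split matters for the VSCode extension table: the loss column updates while the experiment runs, while the final-quality columns fill in only at the end.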
## Summarization

### Tasks variants

Summarization creates a shorter version of a document or an article that captures all the important information. Along with translation, it is another example of a task that can be formulated as a sequence-to-sequence task. Summarization can be:
I saw that @dberenbaum used this in his example, so not much to add. Maybe he could use some of the text above to complement his README.md, plus add an HTML Studio link to give it more reach in his repo? If Dave reads this at some point, please also consider renaming the repo to something more task-oriented + frameworks, so it can be indexed as such when users search, e.g. summarization_hf_dvc instead of seq2seqhf.

### Code

Works with certain model architectures (BART, Pegasus, T5 family). Uses the `Trainer` class from the HF Transformers library, so the DVCLive HF callback can easily be implemented here. Latest DVCLive HF callback taken from #649.

```python
from dvclive.huggingface import DVCLiveCallback
from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments, Seq2SeqTrainer

model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
...
...
training_args = Seq2SeqTrainingArguments(
    output_dir="my_awesome_billsum_model",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=4,
    predict_with_generate=True,
    fp16=True,
    push_to_hub=True,
)
trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_billsum["train"],
    eval_dataset=tokenized_billsum["test"],
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)
trainer.train()
```

### Metrics

During training, the model computes the loss. Evaluation returns a dictionary that can be logged through the DVC API:

```python
import evaluate

metric = evaluate.load("rouge")
```

As I understand it, once the model is fine-tuned we should see these values in each table row (in the DVC VSCode extension) per experiment. Since `metric` itself is not a scalar, the computed scores would be logged per key:

```python
from dvclive import Live

results = metric.compute(predictions=..., references=...)
with Live() as live:
    for name, value in results.items():
        live.log_metric(name, value)
```
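Because `log_metric` takes a metric name and a scalar value, the ROUGE result dict has to be flattened into one call per score. A minimal sketch of that mapping, using a hypothetical results dict in place of the `metric.compute(...)` output:

```python
# hypothetical stand-in for the dict evaluate's "rouge" metric returns
results = {"rouge1": 0.4712, "rouge2": 0.2301, "rougeL": 0.4189}

def to_log_calls(results):
    """Turn a metrics dict into the (name, value) pairs that would be
    passed to live.log_metric, rounded for table readability."""
    return [(name, round(float(value), 2)) for name, value in results.items()]
```

Each pair then becomes one `log_metric` call, so each ROUGE variant gets its own column in the experiments table rather than one opaque "rouge" entry.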
## Submission Type

## Context

Offer ML support for other use cases beyond translation, which could mostly imply text generation (Q&A, summarization, etc.).

Separated from #137 in order to focus on solving plots and one use case, while offering possible support for others.
## Impact

## Issue creator Goal

Leaving a placeholder here for possible questions. Will probably submit some Q&A coming from the discussion in the meantime.

Thanks!

## Tasks