This repository contains the lab work for the Coursera course "Generative AI with Large Language Models".
Perform dialog summarization using generative AI. Experiment with in-context learning (zero-shot, one-shot, and few-shot inference) and tune the associated generation configuration parameters at inference time to influence the results.
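
As a rough illustration of the difference between zero-shot, one-shot, and few-shot inference, the sketch below builds the prompt text for each case. The function name `build_prompt` and the "Dialogue:/Summary:" template are assumptions made for this example and are not taken from the course notebooks.

```python
def build_prompt(dialogue, examples=()):
    """Build a summarization prompt (hypothetical helper, not course code).

    `examples` is a sequence of (dialogue, summary) pairs that are shown to
    the model before the dialogue it should summarize:
    0 pairs = zero-shot, 1 pair = one-shot, 2 or more pairs = few-shot.
    """
    parts = []
    for example_dialogue, example_summary in examples:
        # Each worked example shows the model a completed input/output pair.
        parts.append(
            f"Dialogue:\n{example_dialogue}\n\nSummary:\n{example_summary}\n"
        )
    # The final block leaves the summary empty for the model to complete.
    parts.append(f"Dialogue:\n{dialogue}\n\nSummary:\n")
    return "\n".join(parts)


# Made-up dialogues, used only to show the three prompt shapes.
target = "A: Is the report ready?\nB: I will send it tonight."
shots = [
    ("A: Lunch at noon?\nB: Sure, see you then.", "A and B agree to have lunch at noon."),
    ("A: Did the build pass?\nB: Yes, all tests are green.", "B confirms the build passed."),
]

zero_shot_prompt = build_prompt(target)
one_shot_prompt = build_prompt(target, shots[:1])
few_shot_prompt = build_prompt(target, shots)
```

The resulting string would then be tokenized and passed to the model's generation call, where parameters such as `max_new_tokens`, `do_sample`, and `temperature` (Hugging Face generation settings) can be varied to see how they change the summary.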
Perform instruction fine-tuning on an existing LLM from Hugging Face, the Flan-T5 model. Explore both full fine-tuning and PEFT (Parameter-Efficient Fine-Tuning) methods such as LoRA (Low-Rank Adaptation), and evaluate the results using ROUGE metrics.
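
To give a sense of why LoRA is parameter-efficient, the sketch below compares the number of trainable parameters for one weight matrix under full fine-tuning and under LoRA. LoRA freezes the original `d_out x d_in` matrix and trains two small matrices of shapes `d_out x rank` and `rank x d_in` instead. The helper name `lora_param_counts` and the dimensions used are illustrative assumptions, not the actual Flan-T5 layer sizes or the settings used in the lab.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable parameter counts for one weight matrix.

    Hypothetical helper for illustration only.
    full: every entry of the d_out x d_in matrix is trained.
    lora: only the low-rank factors B (d_out x rank) and A (rank x d_in)
          are trained; the original matrix stays frozen.
    """
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


# Illustrative sizes only (assumed, not read from any model config).
full, lora = lora_param_counts(d_out=512, d_in=512, rank=8)
print(f"full fine-tuning: {full} trainable parameters")
print(f"LoRA (rank 8):    {lora} trainable parameters")
print(f"LoRA / full:      {lora / full:.4f}")
```

The ratio shrinks further as the matrix grows or the rank drops, which is the trade-off the rank setting controls.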
Further fine-tune a Flan-T5 model using reinforcement learning with a reward model, such as Meta AI's hate speech reward model, to generate less toxic summaries. Use Proximal Policy Optimization (PPO) to fine-tune and detoxify the model.
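
The core idea of PPO can be sketched with its clipped surrogate objective for a single sampled action (here, a generated token). This is a minimal illustration under assumed names and a commonly used clip range of 0.2; it is not the course's training code, and a real training loop would also need the reward model, advantage estimation, and batching.

```python
import math


def ppo_clipped_objective(logp_new, logp_old, advantage, clip_eps=0.2):
    """Clipped surrogate objective for one sample (illustrative sketch).

    logp_new:  log-probability of the action under the policy being updated
    logp_old:  log-probability under the policy that generated the sample
    advantage: how much better the action was than expected; in the
               detoxification setting this would be derived from the
               reward model's score
    clip_eps:  clip range (0.2 is an assumed, commonly used value)
    """
    # Probability ratio between the updated policy and the sampling policy.
    ratio = math.exp(logp_new - logp_old)
    # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Taking the minimum removes any incentive to move the policy far from
    # the one that produced the sample in a single update.
    return min(ratio * advantage, clipped_ratio * advantage)


# Unchanged policy: ratio is 1, so the objective equals the advantage.
unchanged = ppo_clipped_objective(logp_new=-1.0, logp_old=-1.0, advantage=2.0)

# Policy moved a lot toward a good action: the gain is capped by the clip.
capped = ppo_clipped_objective(logp_new=-0.5, logp_old=-1.0, advantage=1.0)
```

Training maximizes the average of this quantity over sampled tokens, so large policy shifts earn no extra credit once the ratio leaves the clip range.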