In this assignment, you will be looking at natural language generation (NLG), specifically the task of summarization. You will explore ways to generate text and see how fine-grained choices of decoding parameters can affect the generations.

You will not need to train any models in this assignment. A pretrained one is provided for you by Huggingface.

In Part 1, you will implement two decoding algorithms (greedy and beam search), as well as two sampling algorithms (top-p and top-k), to replicate (to some extent) what one would get when using Huggingface's `generate` function that you've played with during the Week 7 exercise session.
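
To make the target concrete, here is a minimal sketch of a manual greedy-decoding loop, assuming the Huggingface seq2seq API; the model name, prompt, and length cap are placeholders rather than the assignment's settings, and your `a3_decoding.py` implementation will likely need to handle more (batching, special tokens, etc.):

```python
# Minimal greedy-decoding sketch (illustrative, not a reference solution).
# "t5-small" and the prompt are placeholder assumptions.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

inputs = tokenizer("summarize: The quick brown fox jumps over the lazy dog.",
                   return_tensors="pt")

# Seq2seq decoding starts from the model's designated decoder start token.
decoder_ids = torch.tensor([[model.config.decoder_start_token_id]])

with torch.no_grad():
    for _ in range(50):  # placeholder cap on the number of new tokens
        logits = model(input_ids=inputs.input_ids,
                       attention_mask=inputs.attention_mask,
                       decoder_input_ids=decoder_ids).logits
        # Greedy step: pick the single most probable next token.
        next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        decoder_ids = torch.cat([decoder_ids, next_token], dim=-1)
        if next_token.item() == model.config.eos_token_id:
            break

print(tokenizer.decode(decoder_ids[0], skip_special_tokens=True))
```

Beam search generalizes this loop by keeping the `num_beams` highest-scoring partial hypotheses at each step rather than a single one, while top-k and top-p replace the argmax with sampling from a truncated distribution.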
For Part 2, you will analyze how varying specific parameters of decoding and sampling algorithms can qualitatively affect the generation.
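
As a hedged illustration of the kind of comparison Part 2 asks for, the sketch below varies the beam width and the top-p threshold through Huggingface's `generate`; the model, prompt, and parameter values are assumptions for illustration, not settings the assignment requires:

```python
# Illustrative decoding-parameter sweep (placeholder model, prompt, values).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
inputs = tokenizer("summarize: The quick brown fox jumps over the lazy dog.",
                   return_tensors="pt")

# Beam search: wider beams keep more partial hypotheses in play.
for num_beams in (1, 4, 8):
    out = model.generate(**inputs, max_new_tokens=50, num_beams=num_beams)
    print(f"beams={num_beams}:", tokenizer.decode(out[0], skip_special_tokens=True))

# Top-p (nucleus) sampling: a larger p admits a longer tail of candidate
# tokens, typically trading coherence for diversity.
for top_p in (0.5, 0.9, 0.99):
    out = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=top_p)
    print(f"top_p={top_p}:", tokenizer.decode(out[0], skip_special_tokens=True))
```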
For Part 3, you will answer some questions on interpreting automatic NLG evaluation metrics.
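
For context, ROUGE is the most common automatic metric for summarization; below is a hedged sketch of computing it with the `evaluate` library. The library choice and the toy strings are assumptions on our part, and the assignment's questions may involve different metrics or tooling:

```python
# Toy ROUGE computation (assumed tooling; the assignment's metrics may differ).
import evaluate  # requires: pip install evaluate rouge_score

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the cat sat on the mat"],       # system output
    references=["a cat was sitting on the mat"],  # human-written reference
)
print(scores)  # dict with rouge1 / rouge2 / rougeL / rougeLsum scores
```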

To hand in your deliverables, you will have to commit the following files to your GitHub Classroom repository:

✅ The Python files:

- `a3_decoding.py`
- `a3_sampling.py`
- `a3_utils.py`, if you added any helper functions

✅ This Jupyter notebook `a3_notebook.py`, with:

- the answers to Part 2 questions written out in their corresponding cells:
  - Answers to (2.1) questions
  - Answers to (2.2) questions
  - Answers to (2.3) questions
  - Answers to (2.4) questions
- the answers to Part 3 questions written out in its corresponding cell.

We expect the first part of the assignment, notably beam search, to take up the largest share of the total working time, so you can plan your workload accordingly. Keep in mind that this is just our expectation, not a guarantee.