# CNN_LSTM_Seq2Seq

Abstractive Text Summarization Using a Sequence-to-Sequence Model

## Project Overview

In contrast to extractive summarization, which copies sentences verbatim from the source, abstractive text summarization generates summaries by compressing the information in the input text in a lossy manner such that the main ideas are preserved. The advantage of abstractive summarization is that it can use words that do not appear in the text and reword the information to make the summaries more readable. This project uses a CNN-LSTM encoder and an LSTM decoder to generate headlines for articles from the Gigaword dataset. To improve the quality of the generated summaries, a Bahdanau attention mechanism, a pointer-generator network, and a beam-search inference decoder are added to the model.
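
For concreteness, here is a minimal PyTorch sketch of a CNN-LSTM encoder in this style: word embeddings are convolved with several kernel sizes and the concatenated feature maps are fed to an LSTM. The layer sizes follow the hyperparameter table below, but the module and variable names are illustrative assumptions, not the repository's actual code.

```python
import torch
import torch.nn as nn

class CNNLSTMEncoder(nn.Module):
    """Illustrative sketch: convolve over word embeddings with several
    kernel sizes, then run the concatenated features through an LSTM."""
    def __init__(self, embed_dim=300, num_filters=100,
                 kernel_sizes=(1, 3, 5), hidden_size=256):
        super().__init__()
        # One 1-D convolution per kernel size; padding preserves sequence length.
        self.convs = nn.ModuleList([
            nn.Conv1d(embed_dim, num_filters, k, padding=k // 2)
            for k in kernel_sizes
        ])
        self.lstm = nn.LSTM(num_filters * len(kernel_sizes),
                            hidden_size, num_layers=1, batch_first=True)

    def forward(self, embedded):          # embedded: (batch, seq_len, embed_dim)
        x = embedded.transpose(1, 2)      # Conv1d expects (batch, channels, seq_len)
        feats = [torch.relu(conv(x)) for conv in self.convs]
        x = torch.cat(feats, dim=1).transpose(1, 2)  # (batch, seq_len, features)
        outputs, (h, c) = self.lstm(x)    # outputs later feed the attention mechanism
        return outputs, (h, c)
```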

## Install

This project requires Python 3.6 with PyTorch and its usual scientific-Python dependencies installed.

You will also need software that can run and execute a Jupyter Notebook.

If you do not have Python installed yet, it is highly recommended that you install the Anaconda distribution of Python, which already includes most of these packages. Make sure that you select the Python 3.6 installer.

## Architecture

*(Architecture diagram: CNN-LSTM encoder feeding an LSTM decoder with Bahdanau attention and a pointer-generator network.)*
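
To complement the diagram, the sketch below shows how Bahdanau (additive) attention is typically wired between an encoder and decoder of this kind. The dimensions match the hyperparameter table below; the class and tensor names are assumptions for illustration, not the project's implementation.

```python
import torch
import torch.nn as nn

class BahdanauAttention(nn.Module):
    """Additive attention: score(s, h) = v^T tanh(W_s s + W_h h)."""
    def __init__(self, enc_dim=256, dec_dim=512, attn_dim=256):
        super().__init__()
        self.W_h = nn.Linear(enc_dim, attn_dim, bias=False)
        self.W_s = nn.Linear(dec_dim, attn_dim, bias=False)
        self.v = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, dec_state, enc_outputs):
        # dec_state: (batch, dec_dim); enc_outputs: (batch, src_len, enc_dim)
        scores = self.v(torch.tanh(
            self.W_h(enc_outputs) + self.W_s(dec_state).unsqueeze(1)
        )).squeeze(-1)                           # (batch, src_len)
        weights = torch.softmax(scores, dim=-1)  # attention distribution
        context = torch.bmm(weights.unsqueeze(1), enc_outputs).squeeze(1)
        return context, weights                  # context: (batch, enc_dim)
```

In a pointer-generator network, the same attention weights also serve as the copy distribution over source tokens, letting the decoder reproduce rare words directly from the input.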

## Hyperparameters

| Parameter | Value |
| --- | --- |
| Kernel sizes | [1, 3, 5] |
| Filter size | 100 |
| Encoder hidden units | 256 |
| Encoder layers | 1 |
| Decoder hidden units | 512 |
| Decoder layers | 1 |
| Beam width | 10 |
| Embedding | 300-d GloVe |
| Dropout | 0.5 |
| Loss function | torch.nn.CrossEntropyLoss |
| Optimizer | Adam |
| Learning rate | 0.001 |
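
Gathered into code, the table corresponds roughly to the following setup. The `config` dict and the stand-in module are illustrative, not the project's actual training script.

```python
import torch
import torch.nn as nn

# Hyperparameters from the table, gathered in one place.
config = dict(
    kernel_sizes=[1, 3, 5], num_filters=100,
    enc_hidden=256, enc_layers=1,
    dec_hidden=512, dec_layers=1,
    beam_width=10, embed_dim=300,  # 300-d GloVe vectors
    dropout=0.5, lr=0.001,
)

# Stand-in module so the optimizer line runs; in the real project this
# would be the full encoder-decoder model.
model = nn.LSTM(config["embed_dim"], config["enc_hidden"], batch_first=True)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=config["lr"])
```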

## Dataset

The model is trained on the Gigaword corpus available at [harvardnlp/sent-summary](https://github.com/harvardnlp/sent-summary). The dataset pairs the first sentence of each article (the input text) with its headline (the ground-truth summary).
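
A minimal way to pair the two parallel files might look like the sketch below; the file names follow the layout of the sent-summary release but should be treated as assumptions and adjusted to the actual download.

```python
def load_pairs(article_path, title_path, limit=None):
    """Read parallel files: one article first-sentence per line in one file,
    the matching headline on the same line number in the other."""
    with open(article_path, encoding="utf-8") as fa, \
         open(title_path, encoding="utf-8") as ft:
        pairs = [(a.strip(), t.strip()) for a, t in zip(fa, ft)]
    return pairs[:limit] if limit else pairs

# Hypothetical file names following the sent-summary release layout.
train_pairs = load_pairs("train.article.txt", "train.title.txt")
```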

## Results

The generated summaries achieved a ROUGE-1 score of 29.79, computed with the files2rouge tool.
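
files2rouge is driven from the command line; a typical invocation from Python might look like the following, with the system summaries passed first and the reference headlines second, one summary per line in each file. The file names here are illustrative.

```python
import subprocess

# Assumes files2rouge is installed and on PATH; system summaries come first,
# references second (one summary per line in each file).
subprocess.run(
    ["files2rouge", "generated_summaries.txt", "reference_headlines.txt"],
    check=True,
)
```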