Text-Summarization

  • Bold: papers already read

1) Survey Papers

Paper Summary
Text Summarization Techniques: A Brief Survey(2017)
A Survey on Methods of Abstractive Text Summarization(2014)
Recent automatic text summarization techniques: a survey(2017)
METHODOLOGIES AND TECHNIQUES FOR TEXT SUMMARIZATION: A SURVEY(2020)
A SURVEY OF RECENT TECHNIQUES IN AUTOMATIC TEXT SUMMARIZATION(2018)

2) Single Document Summarization

(1) Extractive Summarization

Graph-based Model

Paper Summary Reference
TextRank: Bringing Order into Texts(2004) https://lovit.github.io/nlp/2019/04/30/textrank/
Sentence Centrality Revisited for Unsupervised Summarization(2019) TextRank + BERT + Directed Graph

Autoencoder

Paper Summary Reference
Recursive Autoencoders for ITG-based Translation(2013)
Extractive Summarization using Continuous Vector Space Models(2014)

Neural Network

Paper Summary
CLASSIFY OR SELECT: NEURAL ARCHITECTURES FOR EXTRACTIVE DOCUMENT SUMMARIZATION(2015)
Neural Summarization by Extracting Sentences and Words(2016) * Seems closer to a hybrid approach (extractive first, then abstractive on top of the extracted content)
AttSum: Joint Learning of Focusing and Summarization with Neural Attention(2016)
SummaRuNNer: A Recurrent Neural Network based Sequence Model for Extractive Summarization of Documents(2017)
Neural Latent Extractive Document Summarization(2018)
Fine-tune BERT for Extractive Summarization(2019)
Extractive Summarization of Long Documents by Combining Global and Local Context(2019)
Unsupervised Extractive Summarization by Pre-training Hierarchical Transformers(2020)

(2) Abstractive Summarization

Attention

Paper Summary
A Neural Attention Model for Abstractive Sentence Summarization(2015)
Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond(2016)
Abstractive Sentence Summarization with Attentive Recurrent Neural Networks(2017)
Get To The Point: Summarization with Pointer-Generator Networks(2017)
Deep Communicating Agents for Abstractive Summarization(2018)
Bottom-Up Abstractive Summarization(2018)
Text Summarization with Pretrained Encoders(2019) uses BERT for abstractive summarization

3) Multi-Document Summarization

Paper Summary
GENERATING WIKIPEDIA BY SUMMARIZING LONG SEQUENCES(2018) * Extractive (select the important information) + Abstractive (generate the wiki article)
* Proposes a decoder-only Transformer architecture (T-D in the paper) -> works well on long sequences

4) Long Document Summarization

Paper Summary
A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents(2018)
Deep Communicating Agents for Abstractive Summarization(2018)
Extractive Summarization of Long Documents by Combining Global and Local Context(2019)

5) Language Models

The trend is to train general-purpose language models that can be reused across downstream tasks, rather than task-specific language models.

Paper Summary
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension(2019) * A model that combines a BERT-style encoder with a GPT-style decoder
* A seq2seq denoising autoencoder language model that learns to
        1) corrupt text with a noising function, and
        2) reconstruct the original text from the corrupted version.
* Effective for comprehension as well as text generation, achieving SOTA on a range of NLP tasks
PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization(2019)
Big Bird: Transformers for Longer Sequences(2020)
        - PPT
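
The BART note above (corrupt the text, then learn to restore it) can be sketched as follows. This is a minimal illustration, not BART's actual preprocessing: the function names, the `<mask>` string, and whitespace tokenization are all assumptions made for the example. The paper describes several noising schemes; only simplified token masking and a fixed-span version of text infilling are shown here.

```python
import random

MASK = "<mask>"  # placeholder symbol chosen for this sketch


def token_mask(tokens, mask_prob, rng):
    """Replace each token with MASK independently with probability mask_prob."""
    return [MASK if rng.random() < mask_prob else tok for tok in tokens]


def text_infill(tokens, start, length):
    """Replace the span tokens[start:start+length] with a single MASK token.

    The span is passed in explicitly here to keep the sketch deterministic;
    a real noising function would sample the span position and length.
    """
    return tokens[:start] + [MASK] + tokens[start + length:]


def make_training_pair(text, rng, mask_prob=0.3):
    """Return (corrupted input, reconstruction target) for one example."""
    tokens = text.split()  # whitespace tokenization is a simplification
    corrupted = token_mask(tokens, mask_prob, rng)
    return " ".join(corrupted), " ".join(tokens)


if __name__ == "__main__":
    rng = random.Random(0)
    noisy, target = make_training_pair("the cat sat on the mat", rng)
    print("input :", noisy)
    print("target:", target)
```

A seq2seq model would then be trained to map the corrupted input back to the target; that training loop is outside the scope of this sketch.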

6) Reinforcement Learning

Paper Summary
A Deep Reinforced Model for Abstractive Summarization(2017)
Improving Abstraction in Text Summarization(2018) - ML+RL ROUGE+Novel, with LM
- Must read
Deep Communicating Agents for Abstractive Summarization(2018) - DCA
- To read
Ranking Sentences for Extractive Summarization with Reinforcement Learning(2018)
Reward Learning for Efficient Reinforcement Learning in Extractive Document Summarisation(2019)
Better rewards yield better summaries: Learning to summarise without references(2019)
Fine-tuning language models from human preferences(2019)
Learning to summarize from human feedback(2020)
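The Summary Loop: Learning to Write Abstractive Summaries Without Examples(2020) 1. Mask key terms: M
- Masking selects k words using tf-idf
2. Summarize the source document with the summarizer: S
3. Using M and S, the coverage model fills the key terms back into the masked document: F
4. Compare the source document with F to compute the coverage score
5. Compute a fluency score for the summary
- Fluency is computed from a language model's probability
6. Optimize the summarizer with these scores; RL is used here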
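
Steps 1, 3, and 4 of the Summary Loop notes above can be sketched with a toy example. Everything here is an assumption made for illustration: the function names are hypothetical, the tf-idf variant is a simple smoothed one, and the "coverage model" is replaced by a trivial rule (a masked word is recovered only if it appears in the summary), whereas the paper uses a learned fill-in model. The summarizer, the fluency score, and the RL update are not shown.

```python
import math
from collections import Counter

MASK = "<mask>"  # placeholder symbol chosen for this sketch


def top_k_tfidf(doc_tokens, corpus, k):
    """Step 1: pick the k terms of the document with the highest tf-idf.

    corpus is a list of token sets, one per document (assumed input format).
    """
    tf = Counter(doc_tokens)
    n_docs = len(corpus)

    def idf(term):
        df = sum(1 for doc in corpus if term in doc)
        return math.log((1 + n_docs) / (1 + df))  # smoothed idf

    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(tf, key=lambda t: (-tf[t] * idf(t), t))
    return set(ranked[:k])


def mask_document(doc_tokens, key_terms):
    """Step 1 (cont.): replace every key term with MASK, giving M."""
    return [MASK if t in key_terms else t for t in doc_tokens]


def fill_from_summary(masked_tokens, original_tokens, summary_tokens):
    """Step 3 (toy stand-in): recover a slot only if its word is in the summary."""
    summary_vocab = set(summary_tokens)
    return [
        orig if m == MASK and orig in summary_vocab else m
        for m, orig in zip(masked_tokens, original_tokens)
    ]


def coverage_score(original_tokens, masked_tokens, filled_tokens):
    """Step 4: fraction of masked slots that were filled in correctly."""
    slots = [i for i, t in enumerate(masked_tokens) if t == MASK]
    if not slots:
        return 0.0
    hits = sum(1 for i in slots if filled_tokens[i] == original_tokens[i])
    return hits / len(slots)


if __name__ == "__main__":
    doc = "reactor leak reactor leak in the town".split()
    corpus = [set(doc), {"the", "town", "in"}, {"in", "the"}]
    masked = mask_document(doc, top_k_tfidf(doc, corpus, k=2))
    filled = fill_from_summary(masked, doc, "reactor problem in town".split())
    print(coverage_score(doc, masked, filled))
```

A summary that mentions more of the masked key terms lets more slots be recovered, so its coverage score is higher; that is the signal the notes describe feeding into the summarizer's optimization.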

7) Autoencoder

Paper Summary
SummAE: Zero-Shot Abstractive Text Summarization using Length-Agnostic Auto-Encoders(2019)
Sample Efficient Text Summarization Using a Single Pre-Trained Transformer(2019)
MeanSum: A Neural Model for Unsupervised Multi-document Abstractive Summarization(2019)

8) Metrics

9) Evaluation

Paper Summary
An Evaluation for Various Text Summarization Algorithms on Blog Summarization Dataset(2018)
Automatic Evaluation of Summaries Using N-gram Co-Occurrence Statistics

Some other fields that could be helpful

Grammatical Error Correction(GEC)

Paper Summary
Improving grammatical error correction via pre-training a copy-augmented architecture with unlabeled data(2019)
A Neural Grammatical Error Correction System Built On Better Pre-training and Sequential Transfer Learning(2019) Kakao, 2nd place, ACL 2019

Content Selection

Paper Summary
Exploring Content Selection in Summarization of Novel Chapters(2020)

Factual Correctness in Summarization

Paper Summary
Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports(2020)