Recipe summarization

This repo implements a sequence-to-sequence encoder-decoder model in Keras that summarizes a cocktail's ingredients and preparation instructions by predicting the cocktail's title. The code is based on Siraj Raval's How to Make a Text Summarizer and https://github.com/rtlee9/recipe-summarization
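For orientation, the sketch below shows the general shape of such a Keras encoder-decoder. The vocabulary size, embedding dimension, and hidden size are illustrative assumptions; the actual architecture is defined in src/train_seq2seq.py.

    from keras.models import Model
    from keras.layers import Input, LSTM, Dense, Embedding

    vocab_size = 40000  # assumed vocabulary size
    embed_dim = 100     # e.g. 100-d GloVe vectors (see Usage below)
    hidden_dim = 512    # assumed hidden size

    # Encoder: embed the ingredients+instructions text, keep the final LSTM state.
    enc_inputs = Input(shape=(None,))
    enc_embed = Embedding(vocab_size, embed_dim)(enc_inputs)
    _, state_h, state_c = LSTM(hidden_dim, return_state=True)(enc_embed)

    # Decoder: generate the title token by token, conditioned on the encoder state.
    dec_inputs = Input(shape=(None,))
    dec_embed = Embedding(vocab_size, embed_dim)(dec_inputs)
    dec_outputs = LSTM(hidden_dim, return_sequences=True)(
        dec_embed, initial_state=[state_h, state_c])
    dec_preds = Dense(vocab_size, activation='softmax')(dec_outputs)

    model = Model([enc_inputs, dec_inputs], dec_preds)
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')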

Data

We scraped 500 recipes from various websites for training. Each recipe consists of:

  • A cocktail title
  • A list of ingredients
  • Preparation instructions

The model was fitted on the recipe ingredients, instructions, and titles: the ingredients were concatenated, in their original order, with the instructions to form the model input, and the title served as the target summary.
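As a rough illustration of this flattening step, the snippet below shows how one recipe record might be turned into a (description, title) pair. The field names and join format are hypothetical; the project's actual preprocessing lives in src/tokenize_recipes.py.

    def to_training_pair(recipe):
        # Keep ingredients in their original order, then append the instructions.
        description = '; '.join(recipe['ingredients']) + '; ' + recipe['instructions']
        return description, recipe['title']

    pair = to_training_pair({
        'title': 'Cuba Libre',
        'ingredients': ['Light rum', 'Lime', 'Coca-Cola'],
        'instructions': 'Build all ingredients in a Collins glass filled with ice.',
    })
    # ('Light rum; Lime; Coca-Cola; Build all ingredients...', 'Cuba Libre')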

Training

The model was pre-trained on a food-recipe dataset and then fine-tuned on the cocktails dataset. Training consisted of several iterations, in which the learning rate was exponentially decreased and the ratio of flip augmentations was increased.
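The sketch below mimics that schedule with invented constants (the decay factor, iteration count, and flip ratios are assumptions, not the values used here); the real training loop is in src/train_seq2seq.py.

    initial_lr = 1e-3
    for iteration in range(8):
        lr = initial_lr * 0.5 ** iteration             # exponentially decremented
        flip_ratio = min(0.5, 0.05 * (iteration + 1))  # incremented each iteration
        # train_one_iteration(model, lr=lr, flip_ratio=flip_ratio)  # hypothetical helper
        print('iteration %d: lr=%.2e, flip_ratio=%.2f' % (iteration, lr, flip_ratio))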

Sampled outputs

Below are a few cherry-picked in-sample predictions from the model:

Example 1:

  • Generated: Late Cocktail
  • Original: After Dinner Cocktail
  • Recipe: Apricot brandy; Triple sec; Lime; Shake all ingredients (except lime wedge) with ice and strain into a cocktail glass. Add the wedge of lime and serve.

Example 2:

  • Generated: Island
  • Original: Cuba Libre
  • Recipe: Light rum; Lime; Coca-Cola; Build all ingredients in a Collins glass filled with ice. Garnish with lime wedge.

Example 3:

  • Generated: Elizabeth
  • Original: Kir Royale
  • Recipe: Creme de Cassis; Champagne; Pour Creme de Cassis in glass, gently pour champagne on top.

Usage (Python 3.6)

  • Clone repo: git clone https://github.com/burlamix/recipe-summarization.git
  • Initialize submodules: git submodule update --init --recursive
  • Install dependencies: pip install -r requirements.txt
  • Setup directories: python src/config.py
  • Download the dataset
  • Tokenize data: python src/tokenize_recipes.py
  • Initialize word embeddings with GloVe vectors (a conceptual sketch follows this list):
    • Get GloVe vectors: wget -P data http://nlp.stanford.edu/data/glove.6B.zip; unzip data/glove.6B.zip -d data
    • Initialize embeddings: python src/vocabulary-embedding.py
  • Train model: python src/train_seq2seq.py
  • Make predictions: use src/predict.ipynb
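For context on the embedding-initialization step above, here is a conceptual sketch of seeding an embedding matrix with pre-trained GloVe vectors, leaving out-of-vocabulary words randomly initialized. The file path, dimensionality, and stand-in vocabulary are assumptions; the project's actual logic is in src/vocabulary-embedding.py.

    import numpy as np

    embed_dim = 100  # assumes the 100-d file from glove.6B.zip
    glove = {}
    with open('data/glove.6B.100d.txt', encoding='utf-8') as f:
        for line in f:
            parts = line.rstrip().split(' ')
            glove[parts[0]] = np.asarray(parts[1:], dtype='float32')

    # Seed the embedding matrix: GloVe vector where available, small random
    # values otherwise. The vocabulary here is a stand-in for illustration.
    vocab = ['shake', 'strain', 'lime', 'cocktail']
    embedding = np.random.uniform(-0.05, 0.05, (len(vocab), embed_dim))
    for i, word in enumerate(vocab):
        if word in glove:
            embedding[i] = glove[word]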
