Releases · strangetom/ingredient-parser

06 Dec 07:29

strangetom

1.3.2

479d771

1.3.2 Latest

Processing

Fix bug that allowed fractions in the intermediate form (e.g. #1$2) to appear in the name, prep, comment, size and purpose fields of the ParsedIngredient output.
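
To illustrate the kind of clean-up this fix implies, here is a minimal sketch. It assumes the intermediate form writes the fraction n/d as #n$d, optionally preceded by a whole number (so 1#1$2 would stand for 1 1/2). The function name and regular expression are hypothetical and are not taken from the library.

```python
import re

# Assumption: the intermediate form encodes n/d as "#n$d", optionally
# preceded by a whole number, e.g. "1#1$2" for 1 1/2.
INTERMEDIATE_FRACTION = re.compile(r"(\d*)#(\d+)\$(\d+)")


def restore_fractions(text):
    """Replace intermediate-form fractions with readable fractions."""

    def _replace(match):
        whole, numerator, denominator = match.groups()
        fraction = f"{numerator}/{denominator}"
        # Keep the whole-number part, if any, separated by a space.
        return f"{whole} {fraction}" if whole else fraction

    return INTERMEDIATE_FRACTION.sub(_replace, text)


print(restore_fractions("cut into #1$2 inch pieces"))
```

Applying a step like this to every text field, not only to the amounts, is one way to keep the intermediate form out of the final output.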

29 Nov 14:51

strangetom

1.3.1

5eeb994

1.3.1

Warning

This version requires pint >=0.24.4

General

Support Python 3.13. Requires pint >= 0.24.4.

06 Nov 19:42

strangetom

1.3.0

bd0dbbe

1.3.0

Processing

Various minor improvements to feature generation.
Add PREPARED_INGREDIENT flag to IngredientAmount objects. This is used to indicate if the amount refers to the prepared ingredient (PREPARED_INGREDIENT=True) or the unprepared ingredient (PREPARED_INGREDIENT=False).
Add starting_index attribute to IngredientText objects, indicating the index of the token that starts the IngredientText.
Improve detection of composite amounts in sentences.

Add quantity_fractions keyword argument to parse_ingredient. When True, the quantity and quantity_max fields of IngredientAmount objects will be fractions.Fraction objects instead of floats. This allows fractions such as 1/3 to be represented exactly. The default is quantity_fractions=False, which returns quantities as floats, as before. For example:

>>> parse_ingredient("1 1/3 cups flour").amount[0]
IngredientAmount(
    quantity=1.333,
    quantity_max=1.333,
    unit=<Unit('cup')>,
    text='1 1/3 cups',
    ...
)
>>> parse_ingredient("1 1/3 cups flour", quantity_fractions=True).amount[0]
IngredientAmount(
    quantity=Fraction(4, 3),
    quantity_max=Fraction(4, 3),
    unit=<Unit('cup')>,
    text='1 1/3 cups',
    ...
)
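
The benefit of exact fractions can be shown with the standard library alone. The sketch below does not call ingredient-parser. It assumes float quantities are rounded to three decimal places, as the output above suggests, and compares summing three 1/3 cup portions in both representations.

```python
from fractions import Fraction

# Three portions of 1/3 cup. The float version is rounded to three decimal
# places (an assumption mirroring the rounded float shown above).
float_total = sum([round(1 / 3, 3)] * 3)
exact_total = sum([Fraction(1, 3)] * 3)

print(float_total == 1)  # False: the rounding error accumulates
print(exact_total == 1)  # True: Fraction arithmetic is exact
```

This matters mostly when quantities are scaled or summed, for example when doubling a recipe or combining shopping lists.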

Model

Addition of new dataset: tastecooking. This is a relatively small dataset, but includes a number of unique abbreviations for units and sizes.

29 Sep 12:20

strangetom

1.2.0

9321b8c

1.2.0

General

New optional keyword argument to extract foundation foods from the ingredient name. Foundation foods are the fundamental item of food, excluding any qualifiers or descriptive adjectives, e.g. for the name organic cucumber, the foundation food is cucumber.

See https://ingredient-parser.readthedocs.io/en/latest/guide/foundation.html for additional details.
Some minor post-processing fixes.

23 Aug 13:47

strangetom

1.1.2

f6c2b6f

1.1.2

Require NLTK >= 3.9.1, due to change in their resources format.

16 Aug 05:58

strangetom

1.1.1

ca5670f

1.1.1

Revert upgrade to NLTK 3.8.2 after 3.8.2 was removed from PyPI.

15 Aug 15:37

strangetom

1.1.0

3202d8c

1.1.0

General

Require NLTK >= 3.8.2 due to change in POS tagger weights format.

Model

Include new token features, which help improve performance:
- Word shape (e.g. cheese -> xxxxxx; Cheese -> Xxxxxx)
- N-gram (n=3, 4, 5) prefixes and suffixes of tokens
Add 15,000 new sentences to training data from AllRecipes. This dataset includes lots of branded ingredients, which the existing datasets were quite light on.
Tweaks to the model hyperparameters have yielded a model that is ~25% smaller, but with better performance than the previous model.
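
The word shape and n-gram features listed above can be sketched as follows. This is an illustrative implementation, not the library's feature code: the function names, the feature keys and the choice to skip affixes that would cover the whole token are all assumptions.

```python
import re


def word_shape(token: str) -> str:
    """Map uppercase letters to 'X' and lowercase letters to 'x'."""
    shape = re.sub(r"[A-Z]", "X", token)
    return re.sub(r"[a-z]", "x", shape)


def affix_features(token: str, sizes=(3, 4, 5)) -> dict:
    """Return prefix and suffix n-grams of the token for each n in sizes."""
    features = {}
    for n in sizes:
        # Assumption: only emit an affix when the token is longer than n,
        # otherwise the affix would simply be the whole token.
        if len(token) > n:
            features[f"prefix_{n}"] = token[:n]
            features[f"suffix_{n}"] = token[-n:]
    return features


print(word_shape("Cheese"))
print(affix_features("cheese"))
```

Features like these let a model generalise to unseen tokens that share capitalisation patterns or endings with tokens in the training data.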

Processing

Change processing of numbers written as words (e.g. 'one', 'two'). If the token is labelled as QTY, then the number will be converted to a digit (i.e. 'one' -> 1) or collapsed into a range (i.e. 'one or two' -> 1-2), otherwise the token is left unchanged.
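
The rule described above can be sketched like this. The lookup table, function name and the handling of the word 'or' are hypothetical. The sketch only shows the behaviour stated in the note: QTY-labelled number words become digits, 'a or b' between two quantities collapses to a range, and everything else is left alone.

```python
# Hypothetical lookup table; a real implementation would cover more words.
NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}


def convert_number_words(tokens, labels):
    """Convert QTY-labelled number words to digits and collapse 'a or b' ranges."""
    converted = []
    for token, label in zip(tokens, labels):
        if label == "QTY" and token.lower() in NUMBER_WORDS:
            converted.append(str(NUMBER_WORDS[token.lower()]))
        else:
            # Tokens with any other label are left unchanged.
            converted.append(token)

    result = []
    i = 0
    while i < len(converted):
        is_range = (
            i + 2 < len(converted)
            and labels[i] == "QTY"
            and labels[i + 2] == "QTY"
            and converted[i].isdigit()
            and converted[i + 2].isdigit()
            and converted[i + 1].lower() == "or"
        )
        if is_range:
            result.append(f"{converted[i]}-{converted[i + 2]}")
            i += 3
        else:
            result.append(converted[i])
            i += 1
    return result


print(convert_number_words(["one", "or", "two", "eggs"], ["QTY", "QTY", "QTY", "NAME"]))
```

Gating the conversion on the QTY label avoids rewriting number words that are part of a name or instruction, such as "cut in two".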

10 Aug 20:34

strangetom

1.0.1

f5c73ca

1.0.1

Warning

This version requires NLTK >=3.8.2

NLTK 3.8.2 changes the file format (from pickle to json) of the weights used by the part of speech tagger used in this project, to address some security concerns. This patch updates the NLTK resource checks performed when ingredient-parser is imported to check for the new json files, and downloads them if they are not present.

This version requires NLTK>=3.8.2.

17 Jun 15:44

strangetom

1.0.0

b058395

1.0.0

1.0

General

Improve performance when tagging multiple sentences. For large numbers of sentences (>1000), the performance improvement is ~100x.

Processing

Extend support for composite amounts of the form 1 cup plus 1 tablespoon or 1 cup minus 1 tablespoon. Previously the phrase plus/minus 1 tablespoon would be returned in the comment. Now the whole phrase is captured as a CompositeAmount object.
Fix cases where the incorrect pint.Unit would be returned, caused by pint interpreting the unit as something else e.g. "pinch" -> "pico-inch".
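
To make the plus/minus composite amounts concrete, here is a small stand-alone sketch. The classes below are hypothetical stand-ins, not the library's CompositeAmount, and the conversion table covers only the two units in the example (1 US cup = 16 US tablespoons).

```python
from dataclasses import dataclass
from fractions import Fraction

# US customary conversion: 1 cup = 16 tablespoons.
TABLESPOONS_PER_UNIT = {"cup": Fraction(16), "tablespoon": Fraction(1)}


@dataclass
class SimpleAmount:
    quantity: Fraction
    unit: str


@dataclass
class SimpleCompositeAmount:
    """Hypothetical stand-in for an amount made of two joined amounts."""

    first: SimpleAmount
    second: SimpleAmount
    subtractive: bool  # True for "minus", False for "plus"

    def total_tablespoons(self) -> Fraction:
        first = self.first.quantity * TABLESPOONS_PER_UNIT[self.first.unit]
        second = self.second.quantity * TABLESPOONS_PER_UNIT[self.second.unit]
        return first - second if self.subtractive else first + second


cup = SimpleAmount(Fraction(1), "cup")
tablespoon = SimpleAmount(Fraction(1), "tablespoon")
print(SimpleCompositeAmount(cup, tablespoon, subtractive=False).total_tablespoons())
print(SimpleCompositeAmount(cup, tablespoon, subtractive=True).total_tablespoons())
```

Keeping both parts together, along with whether they are added or subtracted, is what allows a combined total to be computed. That is not possible when the second part is left in the comment.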

27 May 16:43

strangetom

0.1.0-beta11

3a16425

0.1.0-beta11 Pre-release

General

Refactor package structure to make it more suitable for expansion to other languages.

Note: There aren't any plans to support other languages yet.

Model

Reduce duplication in training data
Introduce PURPOSE label for tokens that describe the purpose of the ingredient, such as for the dressing and for garnish.
Replace quantities with "!num" when determining the features for tokens so that the model doesn't need to learn all possible values quantities can take. This results in a small reduction in model size.
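
The "!num" substitution can be sketched as below. The regular expression is an assumption about what counts as a quantity (integers, decimals, simple fractions and ranges of these) and the function name is hypothetical. The library's own rule may differ.

```python
import re

# Assumption: a quantity is an integer, decimal or simple fraction,
# optionally followed by a hyphen and a second such number (a range).
NUMBER = r"\d+(?:\.\d+)?(?:/\d+)?"
QUANTITY = re.compile(rf"{NUMBER}(?:-{NUMBER})?")


def feature_token(token: str) -> str:
    """Return '!num' for quantity tokens, otherwise the token unchanged."""
    return "!num" if QUANTITY.fullmatch(token) else token


print([feature_token(t) for t in ["2", "1/2", "1-2", "cups", "flour"]])
```

Because every quantity maps to the same feature value, the model needs a single weight for "this token is a number" rather than one per distinct number seen in training.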

Processing

Various bug fixes to post-processing of tokens with labels NAME, COMMENT, PREP, PURPOSE, SIZE to correct punctuation and confidence calculations.
Modification of tokeniser to split full stops from the end of tokens. This helps the model avoid treating "token." and "token" as different cases to learn.
Add fallback functionality to parse_ingredient for cases where none of the tokens are labelled as NAME. This selects as the name the token with the highest confidence of being labelled NAME, even if a different label has a higher confidence for that token. This can be disabled by setting expect_name_in_output=False in parse_ingredient.
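
The NAME fallback can be sketched as follows. This is not the library's implementation: the function name and the per-token score dictionaries are hypothetical, and only the expect_name_in_output flag comes from the note above.

```python
def select_name(tokens, labels, scores, expect_name_in_output=True):
    """Return the tokens that make up the name.

    tokens: list of token strings
    labels: the most likely label for each token
    scores: one dict per token mapping label -> confidence (hypothetical shape)
    """
    if "NAME" in labels:
        return [t for t, label in zip(tokens, labels) if label == "NAME"]

    if not expect_name_in_output or not tokens:
        # Fallback disabled or nothing to choose from: no name is returned.
        return []

    # Fallback: pick the token whose NAME confidence is highest, even though
    # another label scored higher for that token.
    best = max(range(len(tokens)), key=lambda i: scores[i].get("NAME", 0.0))
    return [tokens[best]]


tokens = ["2", "cups", "chopped"]
labels = ["QTY", "UNIT", "PREP"]
scores = [
    {"QTY": 0.99, "NAME": 0.0},
    {"UNIT": 0.90, "NAME": 0.05},
    {"PREP": 0.60, "NAME": 0.40},
]
print(select_name(tokens, labels, scores))
```

Disabling the fallback is useful when the input may legitimately contain no ingredient name, in which case an empty name is more honest than a low-confidence guess.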
