forked from timo-liu/eng-syl

A Seq2Seq Model that syllabifies English words.


anowa-eng/eng-syl

 
 


English Syllabifier (eng_syl)

This is a GRU-based neural network designed for English word syllabification. The model was trained on data from the Wikimorph dataset.

Usage

Use the syllabify() function from the Syllable class to syllabify your words:

>>> from eng_syl.syllabify import Syllable
>>> syllabler = Syllable()
>>> syllabler.syllabify("chomsky")
'chom-sky'

syllabify() parameters

  • text: string - English word to be syllabified. Input should only contain alphabetic characters.

syllabify() returns the given word with hyphens inserted at syllable boundaries.
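Since the return value is a single hyphenated string, a small wrapper can turn it into a list of syllables and skip inputs that break the alphabetic-only rule. The following is a minimal sketch, not part of eng_syl: the helper names is_alphabetic, to_syllables, and syllabify_words are hypothetical, and the syllabifier is passed in as a plain callable so the helpers do not depend on the model being loaded.

```python
from typing import Callable, Dict, Iterable, List


def is_alphabetic(text: str) -> bool:
    """Return True if text is non-empty and contains only ASCII letters."""
    return text.isascii() and text.isalpha()


def to_syllables(hyphenated: str) -> List[str]:
    """Split a hyphenated result such as 'chom-sky' into ['chom', 'sky']."""
    return hyphenated.split("-")


def syllabify_words(
    words: Iterable[str],
    syllabify: Callable[[str], str],
) -> Dict[str, List[str]]:
    """Syllabify each valid word; non-alphabetic inputs are skipped.

    `syllabify` is assumed to behave like Syllable.syllabify, i.e. it
    takes one word and returns it with hyphens at syllable boundaries.
    """
    results: Dict[str, List[str]] = {}
    for word in words:
        if not is_alphabetic(word):
            continue  # hypothetical policy: silently skip invalid input
        results[word] = to_syllables(syllabify(word))
    return results
```

With the package installed, passing Syllable().syllabify as the second argument would route each word through the model. Skipping invalid words silently is only one possible policy; raising an error may suit stricter pipelines better.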

Onceler (Onset, Nucleus, Coda Segmenter)

The onc_split() function from the Onceler class splits single syllables into their constituent Onset, Nucleus, and Coda components.

>>> from eng_syl.onceler import Onceler
>>> lorax = Onceler()
>>> print(lorax.onc_split("sloan"))
sl-oa-n

onc_split() parameters

  • text: string - English single-syllable word or component to be segmented into Onset, Nucleus, and Coda. Input should only contain alphabetic characters.
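The hyphenated result can be unpacked into named parts for downstream use. This is a minimal sketch under one assumption: that the output always has three hyphen-separated fields, as in the 'sl-oa-n' example. How onc_split() represents a syllable with no onset or no coda is not documented here, so the hypothetical helper parse_onc returns None for any other shape rather than guessing.

```python
from typing import NamedTuple, Optional


class ONC(NamedTuple):
    """Onset, nucleus, and coda of a single syllable (hypothetical type)."""
    onset: str
    nucleus: str
    coda: str


def parse_onc(segmented: str) -> Optional[ONC]:
    """Parse a string like 'sl-oa-n' into ONC('sl', 'oa', 'n').

    Assumes exactly three hyphen-separated fields; any other shape
    returns None instead of being interpreted.
    """
    parts = segmented.split("-")
    if len(parts) != 3:
        return None
    return ONC(*parts)
```

For the example above, parse_onc("sl-oa-n") gives a tuple whose onset is "sl", nucleus is "oa", and coda is "n".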

Phonify (Grapheme sequence to IPA estimation)

The ipafy() function from the onc_to_phon class tries to approximate an IPA pronunciation from a sequence of graphemes.

>>> from eng_syl.phonify import onc_to_phon
>>> skibidi = onc_to_phon()
>>> print(skibidi.ipafy(['b', 'u', 'tt']))
bʌt

ipafy() parameters

  • sequence: list of strings - a sequence of viable English onsets, nuclei, and codas.
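Because ipafy() takes a list of grapheme segments and onc_split() produces a hyphenated string, the two can plausibly be chained. The sketch below shows one way to do that. The helper syllable_to_ipa is hypothetical, both functions are passed in as callables, and dropping empty segments is an assumption, since it is not documented whether ipafy() accepts empty strings.

```python
from typing import Callable, List


def syllable_to_ipa(
    syllable: str,
    onc_split: Callable[[str], str],
    ipafy: Callable[[List[str]], str],
) -> str:
    """Segment a single syllable, then estimate its IPA pronunciation.

    `onc_split` is assumed to return a hyphenated string such as
    'b-u-tt', and `ipafy` to accept a list such as ['b', 'u', 'tt'].
    """
    segmented = onc_split(syllable)
    # Assumption: empty fields (e.g. a missing onset) are dropped.
    segments = [part for part in segmented.split("-") if part]
    return ipafy(segments)
```

With the package installed, the call might look like syllable_to_ipa("butt", Onceler().onc_split, onc_to_phon().ipafy). Whether the segmenter's output always forms a valid input for ipafy() would need to be checked against the library.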
