v3.3.0

@github-actions github-actions released this 13 Nov 08:22
· 3 commits to develop since this release

Highlights

  • Offset correction in SudachiSplitFilter now works properly when used with a CharFilter (#149)
  • The SPI has changed in order to implement #149
    • New methods have been added to MorphemeAttribute
  • Add allow_empty_morpheme setting to the tokenizer (#151)
    • If false (default), when a single character is split into multiple morphemes (e.g. ㍿), every morpheme contains that character in its span.
    • If true, only the first morpheme contains the character, and the spans of the other morphemes may be empty.
      • Previously, the plugin always behaved as if this were set to true.
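
The difference between the two allow_empty_morpheme values can be illustrated with a toy model. The sketch below is not the plugin's code: the function name assign_spans, the split of ㍿ into 株式 and 会社, and the choice to place empty spans at the end offset of the character are all assumptions made for illustration.

```python
def assign_spans(char_start, char_end, morphemes, allow_empty_morpheme=False):
    """Toy model: assign a (start, end) span to each morpheme that was
    produced from a single source character occupying [char_start, char_end).

    Hypothetical helper for illustration only, not part of the plugin.
    """
    spans = []
    for index, surface in enumerate(morphemes):
        if index == 0 or not allow_empty_morpheme:
            # The morpheme covers the whole source character.
            spans.append((surface, char_start, char_end))
        else:
            # Assumption: an empty span is placed at the end of the character.
            spans.append((surface, char_end, char_end))
    return spans


# Assumed split of the single character "㍿" (offsets 0..1) into two morphemes.
default_spans = assign_spans(0, 1, ["株式", "会社"])
legacy_spans = assign_spans(0, 1, ["株式", "会社"], allow_empty_morpheme=True)

print(default_spans)
print(legacy_spans)
```

With the default (false), both morphemes report the span of the original character, so downstream consumers such as highlighters never see a zero-length token. With true, only the first morpheme carries the character.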