Releases: WorksApplications/Sudachi

Sudachi version 0.6.0-beta2

07 Jun 10:32
Pre-release
v0.6.0-beta2

Bumped the version to 0.6.0-beta2.

Sudachi version 0.6.0-beta1

03 Jun 02:18
Pre-release

Pre-release of 0.6.0.

Sudachi version 0.5.3

04 Nov 08:35
df81119

This release includes the following new features and a bug fix.

  • Changed the priority of user dictionaries
    • If the cost is the same, the words in the dictionary added later will take precedence
  • Fixed a bug where sentences were incorrectly separated by spaces.
  • Added a method to dump the internal structure as JSON
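
The tie-breaking rule for user dictionaries can be illustrated with a minimal sketch. This is not Sudachi's implementation: the `Candidate` class, its fields, and the `pick` method are hypothetical names invented here, and real analysis compares path costs over a lattice rather than single words. The sketch only shows the stated rule, that on equal cost the entry from the dictionary added later wins.

```java
import java.util.List;

final class TieBreakSketch {
    /** Hypothetical candidate word: surface form, cost, and index of the dictionary it came from. */
    static final class Candidate {
        final String surface;
        final int cost;
        final int dictionaryIndex; // 0 = first dictionary added, larger = added later

        Candidate(String surface, int cost, int dictionaryIndex) {
            this.surface = surface;
            this.cost = cost;
            this.dictionaryIndex = dictionaryIndex;
        }
    }

    /**
     * Returns the cheapest candidate, or null for an empty list.
     * Candidates are assumed to be listed in the order their dictionaries were added,
     * so comparing with <= lets a later candidate replace an earlier one of equal cost.
     */
    static Candidate pick(List<Candidate> candidates) {
        Candidate best = null;
        for (Candidate c : candidates) {
            if (best == null || c.cost <= best.cost) {
                best = c;
            }
        }
        return best;
    }
}
```

With two candidates of equal cost, `pick` returns the one from the later dictionary. A strictly cheaper candidate still wins regardless of which dictionary it came from.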

Sudachi version 0.5.2

13 Mar 07:44

This release includes the following new feature.

  • Added IgnoreYomiganaPlugin which removes yomigana in parentheses.
    • This feature is enabled by default
    • By default, a run of up to 4 hiragana characters is recognized as a reading (yomigana)
    • See sudachi.json for details
$ echo '徳島(とくしま)に行(い)く' | java -jar sudachi-0.5.2.jar
徳島(とくしま)  名詞,固有名詞,地名,一般,*,*     徳島
に      助詞,格助詞,*,*,*,*     に
行(い)く        動詞,非自立可能,*,*,五段-カ行,終止形-一般       行く
EOS
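
The idea behind yomigana removal can be approximated with a regular expression. The following is a rough sketch and not the IgnoreYomiganaPlugin code: the class name `YomiganaSketch`, the `strip` method, and the rule "a kanji followed by parenthesized hiragana" are assumptions made for illustration. It also differs from the output above in that it deletes the reading from the string, whereas Sudachi keeps the original surface and only ignores the reading during analysis.

```java
import java.util.regex.Pattern;

final class YomiganaSketch {
    /**
     * Removes parenthesized hiragana readings of at most maxLength characters
     * that directly follow a kanji. Both ASCII and full-width parentheses are accepted.
     */
    static String strip(String text, int maxLength) {
        if (maxLength < 1) {
            throw new IllegalArgumentException("maxLength must be at least 1");
        }
        // (?<=\p{IsHan})   the reading must come right after a Han character
        // [(\uFF08]        opening parenthesis, ASCII or full-width
        // \p{IsHiragana}   1..maxLength hiragana characters
        // [)\uFF09]        closing parenthesis, ASCII or full-width
        Pattern reading = Pattern.compile(
                "(?<=\\p{IsHan})[(\\uFF08]\\p{IsHiragana}{1," + maxLength + "}[)\\uFF09]");
        return reading.matcher(text).replaceAll("");
    }
}
```

Calling `YomiganaSketch.strip(text, 4)` mirrors the default limit of 4 characters. A longer reading in parentheses is left untouched.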

Sudachi version 0.5.1

25 Nov 10:00

This release includes the following new features.

  • Added a synonym group IDs field to the user dictionary
  • Added allowEmptyMorpheme to settings
    • Setting this property to false suppresses tokens of length 0
    • The default value is true
$ echo … | java -jar sudachi-0.5.1.jar -s '{"allowEmptyMorpheme":false}'
…       補助記号,句点,*,*,*,*   .
…       補助記号,句点,*,*,*,*   .
…       補助記号,句点,*,*,*,*   .
EOS

Sudachi version 0.5.0

04 Nov 03:11

This release includes the following new features.

  • Added a synonym group IDs field to support the Sudachi Synonym Dictionary
    • The dictionary format is new but backward compatible
  • Command line output can now be customized via plugins

Sudachi version 0.4.3

19 Jun 01:26

This release includes a bug fix.

  • Fix overrun with surrogate pairs
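
The release note does not describe the bug itself. As general background only, the sketch below shows why surrogate pairs invite index overruns in Java: a character outside the Basic Multilingual Plane occupies two `char` values, so code that advances by one `char` per character can land in the middle of a pair or step past the end of a buffer. The class and method names are hypothetical and unrelated to Sudachi's code.

```java
final class SurrogateSketch {
    /**
     * Returns the index of the char just past the code point that starts at index.
     * Advances by 2 for a surrogate pair and by 1 otherwise.
     */
    static int nextBoundary(String text, int index) {
        return index + Character.charCount(text.codePointAt(index));
    }

    /** Counts code points by walking boundaries instead of trusting String.length(). */
    static int countCodePoints(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i = nextBoundary(text, i)) {
            count++;
        }
        return count;
    }
}
```

For a string that starts with U+20BB7 (written `"\uD842\uDFB7"` in Java source), `length()` reports one more than the number of code points, which is exactly the gap that a per-`char` loop gets wrong.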

Sudachi version 0.4.2

29 May 02:03

This release includes a bug fix.

  • Fix buffer overrun with character normalization in Tokenizer#tokenize(Reader)

Sudachi version 0.4.1

26 May 07:36

This release includes a new method for sentence boundary detection.

  • Add Tokenizer#tokenizeSentences(Reader)

Sudachi version 0.4.0

05 Apr 04:51

This release includes a new sentence boundary detector and a bug fix.

  • Add a new sentence boundary detector
    • Add Tokenizer#tokenizeSentences
    • Add SentenceDetector
    • The CLI now performs sentence boundary disambiguation
  • Fix a bug causing normalized characters to be misaligned
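
To make the idea of sentence boundary detection concrete, here is a deliberately naive splitter. It is a stand-in technique, not Sudachi's SentenceDetector, and it does not call Tokenizer#tokenizeSentences: it only cuts after the full-width terminators 。, ！ and ？, while a real detector also has to handle cases such as quotations and parentheses. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class SentenceSplitSketch {
    /** Splits text after each 。, ！ or ？ and keeps the terminator with its sentence. */
    static List<String> split(String text) {
        List<String> sentences = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // U+3002 ideographic full stop, U+FF01 and U+FF1F full-width ! and ?
            if (c == '\u3002' || c == '\uFF01' || c == '\uFF1F') {
                sentences.add(text.substring(start, i + 1));
                start = i + 1;
            }
        }
        // Any trailing text without a terminator becomes the last sentence.
        if (start < text.length()) {
            sentences.add(text.substring(start));
        }
        return sentences;
    }
}
```

Each returned sentence could then be tokenized on its own, which is the general shape of what a sentence-aware tokenizer offers in a single call.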