SPDX-FileCopyrightText | SPDX-FileType | SPDX-License-Identifier |
---|---|---|
2025 PyThaiNLP Project | DOCUMENTATION | CC0-1.0 |
Notable changes between versions.
- For full release notes, see: https://github.com/PyThaiNLP/pythainlp/releases
- For detailed commit changes, see: https://github.com/PyThaiNLP/pythainlp/compare/v5.1.0...dev (select tags to compare)
[WIP]
- Add Thai Discourse Treebank postag #910
- Add Thai Universal Dependency Treebank postag #916
- Add Thai G2P v2 Grapheme-to-Phoneme model #923
- Add support for list of strings as input to sent_tokenize() #927
- Add pythainlp.tools.safe_print to handle UnicodeEncodeError on console #969
- Fix collate() to consider tonemark in ordering #926
- Fix nlpo3.load_dict() never printing an error message on failure #979
- Add conversion from Thai solar date to Thai lunar date #998
- Add Thai pangram text #1045
- Remove clause_tokenize #1024
- Add clause_tokenize warnings #1026
- Fix maiyamok() expanding the wrong word #962
- Fix: pythainlp.util.maiyamok does not duplicate words when more than one Maiyamok is used #917
- Fix: empty string ('') added when using word_tokenize with join_broken_num=True #912
- Fix: crfcut: Ensure splitting of sentences using terminal punctuation #905
- Fix: delay calling syllable_tokenize to avoid pycrfsuite ImportError #901