What preprocessing is required before calling spell_checker.check()? #27

Open
kwhkim opened this issue Nov 2, 2022 · 0 comments

kwhkim commented Nov 2, 2022

>>> from hanspell import spell_checker
>>> x = ' 관심 0&※ 간섭 있습니다'
>>> spell_checker.check(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/kwhkim/opt/miniconda3/envs/py38spacing/lib/python3.8/site-packages/hanspell/spell_checker.py", line 68, in check
    'checked': _remove_tags(html),
  File "/Users/kwhkim/opt/miniconda3/envs/py38spacing/lib/python3.8/site-packages/hanspell/spell_checker.py", line 27, in _remove_tags
    result = ''.join(ET.fromstring(text).itertext())
  File "/Users/kwhkim/opt/miniconda3/envs/py38spacing/lib/python3.8/xml/etree/ElementTree.py", line 1320, in XML
    parser.feed(text)
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 15

The problem appears to be the & character. Are there characters that cannot be used in the input string?
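
For reference, the failure in the traceback can be reproduced with the standard library alone, without calling hanspell. The sketch below assumes that `_remove_tags` wraps the returned text in a root element before handing it to `ET.fromstring`; the `<content>` wrapper and the helper names `strip_tags` and `escape_for_xml` are illustrative assumptions, not hanspell's API. It shows that a bare `&` is not well-formed XML and that escaping it lets the same text parse. Whether escaping the input before `spell_checker.check()` is enough depends on how the text is echoed back by the service, so treat this as a possible workaround to try rather than a confirmed fix.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def strip_tags(fragment: str) -> str:
    """Parse a fragment as XML and join its text nodes.

    Mirrors the failing line in the traceback; the <content> wrapper
    is an assumption made for this sketch.
    """
    root = ET.fromstring(f"<content>{fragment}</content>")
    return "".join(root.itertext())


def escape_for_xml(text: str) -> str:
    """Hypothetical preprocessing step: escape &, < and > as XML entities."""
    return escape(text)


raw = ' 관심 0&※ 간섭 있습니다'

# A bare '&' that does not start a valid entity makes the XML parser fail.
try:
    strip_tags(raw)
except ET.ParseError as err:
    print("unescaped input fails:", err)

# After escaping, the same text parses and round-trips to the original.
print(repr(strip_tags(escape_for_xml(raw))))
```

`xml.sax.saxutils.escape` also covers `<` and `>`, which would break the parser in the same way as `&`.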

kwhkim changed the title from "What preprocessing is required before calling spell_check.check()?" to "What preprocessing is required before calling spell_checker.check()?" Nov 2, 2022