Chapter 3, 3.4.3.1: wikiextractor problems #23

Open
ji90po opened this issue Sep 13, 2023 · 0 comments

I ran into quite a few installation problems (corpus: https://dumps.wikimedia.org/zhwiki/latest/).

  1. If you hit an error like the following:
    raise source.error('global flags not at the start '
    re.error: global flags not at the start of the expression at position 4

Be sure to downgrade Python to 3.10. My Anaconda install uses 3.11, and it kept failing with this error.

Example:
conda create --name py310 python=3.10
conda activate py310
pip install wikiextractor
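
For context on why the Python version matters here: since Python 3.11, the `re` module rejects inline global flags such as `(?i)` that are not at the start of a pattern, while 3.6 through 3.10 only emit a DeprecationWarning. The sketch below reproduces that behaviour with two hypothetical patterns (they are illustrations, not the actual pattern inside wikiextractor), and the helper name `compiles_cleanly` is made up for this example.

```python
import re
import warnings


def compiles_cleanly(pattern: str) -> bool:
    """Return True if `pattern` compiles with neither an error nor a warning."""
    re.purge()  # clear the compile cache so the pattern is parsed again
    with warnings.catch_warnings():
        # Treat the pre-3.11 DeprecationWarning as an exception too.
        warnings.simplefilter("error")
        try:
            re.compile(pattern)
        except (re.error, DeprecationWarning):
            return False
    return True


# Hypothetical patterns for illustration only.
print(compiles_cleanly("(?i)wiki"))  # global flag at the start of the pattern
print(compiles_cleanly("wiki(?i)"))  # global flag starting at index 4
```

On 3.10 the second pattern still compiles in normal use (with a warning), which is consistent with the downgrade making the error go away.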

  2. If python -m wikiextractor.WikiExtractor jawiki-latest-pages-articles.xml.bz2 starts and runs for a long time, printing lines such as
'...xxx pages ...
...xxx pages ...
...xxx pages ...'
and then suddenly fails with an error mentioning 'fork':

One fix:
pip install git+https://github.com/prokotg/wikiextractor

This rolls wikiextractor back from 3.0.6 to 3.0.4, which resolves the error.
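
To confirm which version ended up in the active environment after the reinstall, a small check with the standard library's `importlib.metadata` can help. This is a minimal sketch; `installed_version` is a hypothetical helper name, and the exact version string the fork reports is taken from the note above rather than verified here.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed wikiextractor version, or None if it is missing
# from the currently active environment (e.g. the wrong conda env).
print(installed_version("wikiextractor"))
```

Running it inside the py310 environment also catches the case where the package was installed into a different environment than the one being used.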

Then run the extraction again:
python -m wikiextractor.WikiExtractor jawiki-latest-pages-articles.xml.bz2

It now completes without the error.
