Comments-Mining-System-for-Scholar-Citations

Graduation project for undergraduates

Installation

Clone this repo to your local disk.

git clone https://github.com/Phimos/Comments-Mining-System-for-Scholar-Citations

Install citeminer with pip (run this from the repository root).

pip install -e .

P.S. Luminati is needed to crawl Google Scholar and other scholarly websites: https://luminati-china.biz/

Usage

Use a JSON config to specify the authors / publications to be mined.

Change the config path in pipeline.py. You can also run only specific steps if you want.

python pipeline.py

P.S. Sci-Hub will block the crawler after about 10 attempts. download.sh will retry every 5 minutes.
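The retry-every-5-minutes idea can be sketched in Python as below. This is only an illustration of the pattern, not the contents of download.sh; the helper name `retry_every` and its parameters are hypothetical and are not part of citeminer.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_every(
    task: Callable[[], T],
    interval_seconds: float = 300.0,  # 5 minutes, matching the note above
    max_attempts: Optional[int] = None,  # None means retry indefinitely
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `task` until it succeeds, waiting a fixed interval after each failure.

    Hypothetical helper: `task` stands in for whatever download step
    gets blocked; any exception it raises is treated as a failed attempt.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return task()
        except Exception:
            # Give up and re-raise once the attempt budget is exhausted.
            if max_attempts is not None and attempt >= max_attempts:
                raise
            sleep(interval_seconds)
```

Passing `sleep` as a parameter keeps the wait injectable, so the loop can be exercised without actually waiting 5 minutes between attempts.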

TODO

Search for TODO in the code files to find the TODO list. Some of the items may be fixed soon.
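One way to collect the TODO markers is a short script like the one below. It is a minimal sketch that assumes the TODOs live in `.py` files under the repository root; the function name `find_todos` is hypothetical and not part of citeminer.

```python
from pathlib import Path
from typing import Iterator, Tuple


def find_todos(root: str, pattern: str = "*.py") -> Iterator[Tuple[str, int, str]]:
    """Yield (path, line number, line text) for every line containing 'TODO'."""
    for path in sorted(Path(root).rglob(pattern)):
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            # Skip unreadable entries (e.g. a directory whose name matches the pattern).
            continue
        for lineno, line in enumerate(text.splitlines(), start=1):
            if "TODO" in line:
                yield str(path), lineno, line.strip()


if __name__ == "__main__":
    # Run from the repository root to list every TODO with its location.
    for path, lineno, line in find_todos("."):
        print(f"{path}:{lineno}: {line}")
```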

Useful Information