
2018 DataGrand Cup (达观杯) Long-Text Classification Intelligent Processing Challenge: 18th-place solution


hxl523/daguan-classify-2018

 
 


DataGrand Cup 2018


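The hyperparameters were not well tuned and the competition was rushed. The single model was never scored online and reached 0.784 offline; the final score was 0.791, ranking 18/3462. Since the rank is not high, I will not write much and will wait for the top teams to share. The approach is exactly what the code shows and is very simple.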

Download the data from DataGrand (达观数据) and place it in the data directory.

1. Environment

Environment/Library Version
Ubuntu 14.04.5 LTS
python 3.6
jupyter notebook 4.2.3
tensorflow-gpu 1.10.1
numpy 1.14.1
pandas 0.23.0
matplotlib 2.2.2
gensim 3.5.0
tqdm 4.24.0

2. Data preprocessing

Everything is written in Jupyter. Run src/preprocess/EDA.ipynb to generate the various files.

3. Baseline model training

Run from src/preprocess/:

python baseline-x-cv.py
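
The script name suggests a cross-validated baseline, but its contents are not reproduced in this README. As a rough illustration only, the sketch below shows one way to build stratified K-fold splits using only the standard library. The function name `stratified_kfold` and its parameters are hypothetical and are not taken from the repository.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, n_splits=5, seed=0):
    """Yield (train_idx, valid_idx) pairs with every class spread across folds.

    Hypothetical helper for illustration; not the repository's implementation.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal the shuffled indices of each class round-robin into the folds.
    folds = [[] for _ in range(n_splits)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % n_splits].append(idx)

    # Each fold serves once as the validation set; the rest form the training set.
    for k in range(n_splits):
        valid_idx = sorted(folds[k])
        train_idx = sorted(
            i for j, fold in enumerate(folds) if j != k for i in fold
        )
        yield train_idx, valid_idx


if __name__ == "__main__":
    toy_labels = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
    for train_idx, valid_idx in stratified_kfold(toy_labels, n_splits=4):
        print(len(train_idx), len(valid_idx))
```

Predictions made on each validation fold can be concatenated into out-of-fold predictions, which is the usual input for a later stacking step.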

4. Deep model training

Then train the model directly. Single-GPU run, with the model of your choice:

python train_predict.py --gpu 4 --option 5 --model convlstm --feature char

Multi-GPU training example:

python train_predict.py --gpu 4,5,6,7 --option 5 --model convlstm --feature char
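
How train_predict.py handles these flags is not shown in this README. The sketch below is a minimal, assumed way to parse the same four flags and expose the chosen GPUs through the `CUDA_VISIBLE_DEVICES` environment variable, which must be set before TensorFlow initializes its devices. The defaults, the `int` type for `--option`, and the helper names `parse_args` and `select_gpus` are all assumptions.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the flags used in the commands above (types and defaults are assumed)."""
    parser = argparse.ArgumentParser(
        description="Hypothetical stand-in for the train_predict.py command line"
    )
    parser.add_argument("--gpu", default="0",
                        help="comma-separated GPU ids, e.g. 4 or 4,5,6,7")
    parser.add_argument("--option", type=int, default=5)
    parser.add_argument("--model", default="convlstm")
    parser.add_argument("--feature", default="char")
    return parser.parse_args(argv)


def select_gpus(gpu_arg):
    """Restrict the process to the listed GPUs and return them as a list of ids."""
    gpu_ids = [g.strip() for g in gpu_arg.split(",") if g.strip()]
    # Only the listed devices will be visible to CUDA-based libraries.
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(gpu_ids)
    return gpu_ids


if __name__ == "__main__":
    args = parse_args()
    gpus = select_gpus(args.gpu)
    print("model=%s feature=%s gpus=%s" % (args.model, args.feature, gpus))
```

With this layout, one id selects single-GPU training and several ids select multi-GPU training; the number of entries returned by `select_gpus` can be used to decide how many replicas to build.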

5. Model ensembling and output

python stacking.py --gpu 1 --tfidf True --option 5

Stacking and pseudo-labeling are done together here; edit the code to choose whether to use pseudo-labels.
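
The README does not say how stacking.py selects pseudo-labels. As an illustration of the general idea only, the sketch below averages class probabilities from several models (a simple blend standing in for the real stacker), keeps test samples whose top probability clears a threshold, and appends them to the training set behind an on/off switch. All function names, the `use_pseudo_labels` switch, and the 0.95 threshold are assumptions.

```python
def average_probs(model_probs):
    """Average per-class probabilities over models (a plain blend, not true stacking)."""
    n_models = len(model_probs)
    n_samples = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    return [
        [sum(m[i][c] for m in model_probs) / n_models for c in range(n_classes)]
        for i in range(n_samples)
    ]


def select_pseudo_labels(probs, threshold=0.95):
    """Return (sample_index, predicted_class) for confident predictions only."""
    selected = []
    for i, row in enumerate(probs):
        best = max(range(len(row)), key=row.__getitem__)
        if row[best] >= threshold:
            selected.append((i, best))
    return selected


def extend_with_pseudo_labels(train_rows, test_texts, probs,
                              use_pseudo_labels=True, threshold=0.95):
    """Append confidently predicted test samples to the training rows if enabled."""
    if not use_pseudo_labels:
        return list(train_rows)
    extra = [(test_texts[i], label)
             for i, label in select_pseudo_labels(probs, threshold)]
    return list(train_rows) + extra


if __name__ == "__main__":
    model_a = [[0.98, 0.02], [0.60, 0.40]]
    model_b = [[0.96, 0.04], [0.30, 0.70]]
    blended = average_probs([model_a, model_b])
    train = [("doc a", 0), ("doc b", 1)]
    print(extend_with_pseudo_labels(train, ["test 0", "test 1"], blended))
```

Setting `use_pseudo_labels=False` returns the original training rows unchanged, which mirrors the choice the note above leaves to the reader.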

