Chinese-Text-Classification-Pytorch
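Chinese text classification with TextCNN, TextRNN, FastText, TextRCNN, BiLSTM_Attention, DPCNN, and Transformer, implemented in PyTorch and ready to use out of the box.
Introduction
Model descriptions and data flow: my blog
Data is fed to the models at the character level. The pretrained embeddings are the Sogou News Word+Character 300d vectors; download here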
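As a rough illustration of character-level input, the sketch below builds a character vocabulary and turns a title into a fixed-length list of ids. It is not the repository's code: the names `build_vocab` and `encode`, the `<PAD>` and `<UNK>` tokens, and the `pad_size` default are all assumptions made for this example.

```python
from collections import Counter

PAD, UNK = "<PAD>", "<UNK>"  # hypothetical special tokens


def build_vocab(texts, max_size=10000, min_freq=1):
    """Map each character to an integer id, most frequent first."""
    counts = Counter(ch for text in texts for ch in text)
    kept = [ch for ch, n in counts.most_common(max_size) if n >= min_freq]
    vocab = {ch: i for i, ch in enumerate(kept)}
    vocab[UNK] = len(vocab)  # id for characters not in the vocabulary
    vocab[PAD] = len(vocab)  # id used to fill short sequences
    return vocab


def encode(text, vocab, pad_size=32):
    """Split a title into characters, map to ids, then truncate or pad."""
    ids = [vocab.get(ch, vocab[UNK]) for ch in text][:pad_size]
    return ids + [vocab[PAD]] * (pad_size - len(ids))


titles = ["体育新闻", "财经新闻"]
vocab = build_vocab(titles)
print(encode("体育财经", vocab, pad_size=6))
```

The resulting ids would index rows of an embedding matrix, which is where 300-dimensional pretrained vectors such as the ones above would be loaded.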
Environment
python 3.7
pytorch 1.1
tqdm
sklearn
tensorboardX
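Chinese dataset
I extracted 200,000 news headlines from THUCNews and uploaded them to GitHub. Text length is between 20 and 30 characters. There are 10 classes with 20,000 headlines each.
Classes: finance, real estate, stocks, education, science and technology, society, politics, sports, games, entertainment.
Dataset split:
Using your own dataset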
python run.py --model TextCNN --word True
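The command above passes a model name and a `--word` switch. The sketch below shows one way such flags could be parsed and used to choose between character-level and word-level tokenization. It is not the repository's `run.py`: the `str2bool` and `get_tokenizer` helpers are hypothetical, and it assumes word-level input has already been segmented with spaces between words.

```python
import argparse


def str2bool(value):
    """Parse 'True'/'False' text; argparse's type=bool treats any non-empty string as True."""
    return str(value).lower() in ("true", "1", "yes")


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Chinese text classification")
    parser.add_argument("--model", type=str, required=True,
                        help="model name, e.g. TextCNN")
    parser.add_argument("--word", type=str2bool, default=False,
                        help="True for word level, False for character level")
    return parser.parse_args(argv)


def get_tokenizer(use_word):
    """Return a function that splits a line of text into tokens."""
    if use_word:
        # Assumption: words are pre-segmented and separated by spaces.
        return lambda text: text.split(" ")
    # Character level: every character is one token.
    return lambda text: list(text)


args = parse_args(["--model", "TextCNN", "--word", "True"])
tokenize = get_tokenizer(args.word)
print(args.model, tokenize("体育 新闻"))
```

Parsing the flag through a helper like `str2bool` avoids the case where `--word False` would otherwise be read as true.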
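Results
The BERT and ERNIE model code lives in a separate repository: Bert-Chinese-Text-Classification-Pytorch. I plan to add more post-BERT models later; stars are welcome.
Usage
Parameters
All models are in the models directory. Each model's hyperparameters and model definition live in the same file.
Corresponding papers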
[1] Convolutional Neural Networks for Sentence Classification
[2] Recurrent Neural Network for Text Classification with Multi-Task Learning
[3] Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification
[4] Recurrent Convolutional Neural Networks for Text Classification
[5] Bag of Tricks for Efficient Text Classification
[6] Deep Pyramid Convolutional Neural Networks for Text Categorization
[7] Attention Is All You Need