# TextClassification-Keras
This code repository implements a variety of **deep learning models** for **text classification** using the **Keras** framework, including **FastText**, **TextCNN**, **TextRNN**, **TextBiRNN**, **TextAttBiRNN**, **HAN**, **RCNN**, and **RCNNVariant**. In addition to the model implementations, a simplified application is included for each model.
- [English documents](README.md)
- [中文文档](README-ZH.md)
## Guidance
1. [Environment](#environment)
2. [Usage](#usage)
3. [Model](#model)
1. [FastText](#1-fasttext)
2. [TextCNN](#2-textcnn)
3. [TextRNN](#3-textrnn)
4. [TextBiRNN](#4-textbirnn)
5. [TextAttBiRNN](#5-textattbirnn)
6. [HAN](#6-han)
7. [RCNN](#7-rcnn)
8. [RCNNVariant](#8-rcnnvariant)
    9. [To Be Continued...](#to-be-continued)
4. [Reference](#reference)
## Environment
- Python 3.6
- NumPy 1.15.2
- Keras 2.2.0
- TensorFlow 1.8.0
## Usage
All code is located in the directory `/model`; each model has its own subdirectory containing the model definition and its application.
For example, the model and application of FastText are located under `/model/FastText`: the model part is `fast_text.py` and the application part is `main.py`.
## Model
### 1 FastText
FastText was proposed in the paper [Bag of Tricks for Efficient Text Classification](https://arxiv.org/pdf/1607.01759.pdf).
#### 1.1 Description in Paper
<p align="center">
<img src="image/FastText.png">
</p>
1. Using a look-up table, **bags of n-grams** are converted to **word representations**.
2. Word representations are **averaged** into a text representation, which is a hidden variable.
3. The text representation is in turn fed to a **linear classifier**.
4. Use the **softmax** function to compute the probability distribution over the predefined classes.
#### 1.2 Implementation Here
Network structure of FastText:
<p align="center">
<img src="image/FastText_network_structure.png">
</p>
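For illustration only, a minimal Keras sketch of a FastText-style classifier along these lines; the hyperparameter values (`max_features`, `maxlen`, `embedding_dims`, `num_classes`) are assumptions, not the repository's settings:

```python
from keras.models import Model
from keras.layers import Input, Embedding, GlobalAveragePooling1D, Dense

max_features = 5000   # size of the n-gram vocabulary (assumed value)
maxlen = 400          # padded sequence length (assumed value)
embedding_dims = 50   # dimension of the n-gram embeddings (assumed value)
num_classes = 2       # number of target classes (assumed value)

inputs = Input(shape=(maxlen,))                         # integer-encoded n-gram ids
x = Embedding(max_features, embedding_dims)(inputs)     # look-up table: n-grams -> embeddings
x = GlobalAveragePooling1D()(x)                         # average embeddings into one text representation
outputs = Dense(num_classes, activation='softmax')(x)   # linear classifier with softmax

model = Model(inputs=inputs, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
```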
### 2 TextCNN
TextCNN was proposed in the paper [Convolutional Neural Networks for Sentence Classification](http://www.aclweb.org/anthology/D14-1181).
#### 2.1 Description in Paper
<p align="center">
<img src="image/TextCNN.png">
</p>
1. Represent the sentence with **static and non-static channels**.
2. **Convolve** with multiple filter widths and feature maps.
3. Use **max-over-time pooling**.
4. Use a **fully connected layer** with **dropout** and a **softmax** output.
#### 2.2 Implementation Here
Network structure of TextCNN:
<p align="center">
<img src="image/TextCNN_network_structure.png">
</p>
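For illustration, a minimal single-channel TextCNN sketch in Keras; the paper's static/non-static channels are omitted, and the filter widths, filter count, and dropout rate are assumptions:

```python
from keras.models import Model
from keras.layers import Input, Embedding, Conv1D, GlobalMaxPooling1D, Concatenate, Dropout, Dense

max_features = 5000
maxlen = 400
embedding_dims = 50
num_classes = 2

inputs = Input(shape=(maxlen,))
embedding = Embedding(max_features, embedding_dims)(inputs)

# Convolve with several filter widths, then apply max-over-time pooling to each feature map.
pooled = []
for kernel_size in [3, 4, 5]:                            # assumed filter widths
    conv = Conv1D(filters=128, kernel_size=kernel_size, activation='relu')(embedding)
    pooled.append(GlobalMaxPooling1D()(conv))

x = Concatenate()(pooled)
x = Dropout(0.5)(x)
outputs = Dense(num_classes, activation='softmax')(x)    # fully connected layer with softmax output

model = Model(inputs=inputs, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
```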
### 3 TextRNN
TextRNN has been mentioned in the paper [Recurrent Neural Network for Text Classification with Multi-Task Learning](https://www.ijcai.org/Proceedings/16/Papers/408.pdf).
#### 3.1 Description in Paper
<p align="center">
<img src="image/TextRNN.png">
</p>
#### 3.2 Implementation Here
Network structure of TextRNN:
<p align="center">
<img src="image/TextRNN_network_structure.png">
</p>
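A minimal Keras sketch of a TextRNN-style model; the choice of LSTM and the layer sizes are assumptions:

```python
from keras.models import Model
from keras.layers import Input, Embedding, LSTM, Dense

max_features = 5000
maxlen = 400
embedding_dims = 50
num_classes = 2

inputs = Input(shape=(maxlen,))
x = Embedding(max_features, embedding_dims)(inputs)
x = LSTM(128)(x)                                         # last hidden state serves as the text representation
outputs = Dense(num_classes, activation='softmax')(x)

model = Model(inputs=inputs, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
```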
### 4 TextBiRNN
TextBiRNN is an improved model based on TextRNN: the RNN layer in the network is replaced with a bidirectional RNN layer, so that both the forward and the backward encoding information are taken into account. No related paper has been found yet.
Network structure of TextBiRNN:
<p align="center">
<img src="image/TextBiRNN_network_structure.png">
</p>
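The only structural change relative to the TextRNN sketch above is wrapping the recurrent layer in `Bidirectional`; again, the LSTM choice and sizes are assumptions:

```python
from keras.models import Model
from keras.layers import Input, Embedding, Bidirectional, LSTM, Dense

max_features = 5000
maxlen = 400
embedding_dims = 50
num_classes = 2

inputs = Input(shape=(maxlen,))
x = Embedding(max_features, embedding_dims)(inputs)
x = Bidirectional(LSTM(128))(x)                          # forward and backward final states are concatenated
outputs = Dense(num_classes, activation='softmax')(x)

model = Model(inputs=inputs, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
```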
### 5 TextAttBiRNN
TextAttBiRNN is an improved model that introduces an attention mechanism on top of TextBiRNN. Given the representation vectors produced by the bidirectional RNN encoder, the attention mechanism lets the model focus on the information most relevant to the decision. The attention mechanism was first proposed in the paper [Neural Machine Translation by Jointly Learning to Align and Translate](https://arxiv.org/pdf/1409.0473.pdf), and the implementation here follows the paper [Feed-Forward Networks with Attention Can Solve Some Long-Term Memory Problems](https://arxiv.org/pdf/1512.08756.pdf).
#### 5.1 Description in Paper
<p align="center">
<img src="image/FeedForwardAttention.png">
</p>
In the paper [Feed-Forward Networks with Attention Can Solve Some Long-Term Memory Problems](https://arxiv.org/pdf/1512.08756.pdf), the **feed forward attention** is simplified as follows,
<p align="center">
<img src="image/FeedForwardAttetion_fomular.png">
</p>
The learnable function `a` is implemented as a **feed-forward network**. In this formulation, attention can be seen as producing a fixed-length embedding `c` of the input sequence by computing an **adaptive weighted average** of the state sequence `h`.
#### 5.2 Implementation Here
The implementation of attention is not described here; please refer to the source code directly.
Network structure of TextAttBiRNN:
<p align="center">
<img src="image/TextAttBiRNN_network_structure.png">
</p>
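For illustration only, a minimal Keras sketch of a feed-forward attention layer in this spirit (the repository's own layer may differ in detail; see its source code): a learnable scoring function assigns one scalar per time step, the scores are normalized with a softmax over time, and the hidden states are averaged with these weights.

```python
from keras import backend as K
from keras.layers import Layer

class FeedForwardAttention(Layer):
    """Sketch of feed-forward attention: e_t = tanh(h_t . W + b), a = softmax(e), c = sum_t a_t * h_t."""

    def build(self, input_shape):
        # One scoring vector (and bias) shared across all time steps.
        self.W = self.add_weight(name='att_weight', shape=(input_shape[-1], 1),
                                 initializer='glorot_uniform', trainable=True)
        self.b = self.add_weight(name='att_bias', shape=(1,),
                                 initializer='zeros', trainable=True)
        super(FeedForwardAttention, self).build(input_shape)

    def call(self, h):
        # h has shape (batch, timesteps, features).
        e = K.tanh(K.dot(h, self.W) + self.b)             # unnormalized scores, shape (batch, timesteps, 1)
        e = K.exp(e - K.max(e, axis=1, keepdims=True))    # numerically stable softmax over the time axis
        a = e / K.sum(e, axis=1, keepdims=True)           # attention weights
        return K.sum(a * h, axis=1)                       # adaptive weighted average, shape (batch, features)

    def compute_output_shape(self, input_shape):
        return (input_shape[0], input_shape[-1])
```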
### 6 HAN
HAN was proposed in the paper [Hierarchical Attention Networks for Document Classification](http://www.aclweb.org/anthology/N16-1174).
#### 6.1 Description in Paper
<p align="center">
<img src="image/HAN.png">
</p>
1. **Word Encoder**. Each word is encoded by a **bidirectional GRU**; an annotation for a given word is obtained by concatenating the forward and backward hidden states, which summarizes the information of the whole sentence centered around that word.
2. **Word Attention**. A one-layer **MLP** followed by a softmax function computes normalized importance weights over the word annotations. The sentence vector is then computed as a **weighted sum** of the word annotations using these weights.
3. **Sentence Encoder**. Similar to the word encoder, a **bidirectional GRU** encodes the sentences to obtain an annotation for each sentence.
4. **Sentence Attention**. Similar to word attention, a one-layer **MLP** and a softmax function compute weights over the sentence annotations; a **weighted sum** of the sentence annotations then yields the document vector.
5. **Document Classification**. Use the **softmax** function to calculate the probability of all classes.
#### 6.2 Implementation Here
The implementation of attention here is based on FeedForwardAttention, which is the same as the attention in TextAttBiRNN.
Network structure of HAN:
<p align="center">
<img src="image/HAN_network_structure.png">
</p>
The TimeDistributed wrapper is used here, since the parameters of the Embedding, Bidirectional RNN, and Attention layers are expected to be shared across the time-step dimension.
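For illustration, a minimal hierarchical sketch along these lines, reusing the `FeedForwardAttention` layer sketched in the TextAttBiRNN section; the document shape, GRU sizes, and other hyperparameters are assumptions:

```python
from keras.models import Model
from keras.layers import Input, Embedding, Bidirectional, GRU, TimeDistributed, Dense
# FeedForwardAttention is the attention layer sketched in the TextAttBiRNN section above.

max_sentences = 15    # sentences per document (assumed value)
max_words = 100       # words per sentence (assumed value)
max_features = 5000
embedding_dims = 50
num_classes = 2

# Word encoder + word attention: maps one sentence (a sequence of word ids) to a sentence vector.
sentence_input = Input(shape=(max_words,))
x = Embedding(max_features, embedding_dims)(sentence_input)
x = Bidirectional(GRU(64, return_sequences=True))(x)
sentence_vector = FeedForwardAttention()(x)
sentence_encoder = Model(sentence_input, sentence_vector)

# Sentence encoder + sentence attention: the sentence encoder is applied to every sentence
# of the document via TimeDistributed, so its parameters are shared across sentences.
document_input = Input(shape=(max_sentences, max_words))
x = TimeDistributed(sentence_encoder)(document_input)
x = Bidirectional(GRU(64, return_sequences=True))(x)
document_vector = FeedForwardAttention()(x)
outputs = Dense(num_classes, activation='softmax')(document_vector)

model = Model(document_input, outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
```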
### 7 RCNN
RCNN was proposed in the paper [Recurrent Convolutional Neural Networks for Text Classification](https://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/view/9745/9552).
#### 7.1 Description in Paper
<p align="center">
<img src="image/RCNN.png">
</p>
1. **Word Representation Learning**. RCNN uses a recurrent structure, a **bi-directional recurrent neural network**, to capture the contexts. The word and its context are then combined to represent the word, and a **linear transformation** together with the `tanh` activation function is applied to the representation.
2. **Text Representation Learning**. Once the representations of all words are calculated, an element-wise **max-pooling** layer is applied to capture the most important information throughout the entire text. Finally, a **linear transformation** and the **softmax** function are applied.
#### 7.2 Implementation Here
Network structure of RCNN:
<p align="center">
<img src="image/RCNN_network_structure.png">
</p>
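For illustration only, a Keras sketch of an RCNN-style model with three aligned inputs (the current words, the left context shifted by one position, and the right context shifted by one position); the shared embedding, the use of `SimpleRNN`, and the layer sizes are assumptions, not necessarily the repository's choices:

```python
from keras import backend as K
from keras.models import Model
from keras.layers import (Input, Embedding, SimpleRNN, Concatenate, Dense,
                          Lambda, GlobalMaxPooling1D)

max_features = 5000
maxlen = 400
embedding_dims = 50
num_classes = 2
rnn_units = 128

# Three aligned inputs: the words themselves, the left context, and the right context.
input_current = Input(shape=(maxlen,))
input_left = Input(shape=(maxlen,))
input_right = Input(shape=(maxlen,))

embedder = Embedding(max_features, embedding_dims)       # shared embedding (assumption)
embed_current = embedder(input_current)
embed_left = embedder(input_left)
embed_right = embedder(input_right)

# Forward RNN over the left context and backward RNN over the right context.
c_left = SimpleRNN(rnn_units, return_sequences=True)(embed_left)
c_right = SimpleRNN(rnn_units, return_sequences=True, go_backwards=True)(embed_right)
c_right = Lambda(lambda x: K.reverse(x, axes=1))(c_right)   # restore time order after go_backwards

# Word representation: [left context; word embedding; right context], then a tanh transform.
x = Concatenate(axis=2)([c_left, embed_current, c_right])
x = Dense(128, activation='tanh')(x)                        # y_i = tanh(W x_i + b), shared over time steps

# Element-wise max-over-time pooling, then a linear transformation with softmax.
x = GlobalMaxPooling1D()(x)
outputs = Dense(num_classes, activation='softmax')(x)

model = Model(inputs=[input_current, input_left, input_right], outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
```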
### 8 RCNNVariant
RCNNVariant is an improved model based on RCNN with the following improvements. No related papers have been found yet.
1. The three inputs are replaced with a **single input**; the separate left- and right-context inputs are removed.
2. Use a **bidirectional LSTM/GRU** instead of a traditional RNN to encode the context.
3. Use a **multi-channel CNN** to represent the encoded text.