With the growing popularity of the internet, the ways in which people express emotions and communicate have become increasingly diverse, and most of these emotions are conveyed as text. Research on text sentiment classification falls into three main groups of methods: those based on sentiment dictionaries, on machine learning, and on deep learning. In recent years, many deep learning-based works have used TextCNN (text convolutional neural network) to extract semantic information from text for sentiment analysis. However, TextCNN considers only the sentence-length direction when extracting semantic information and ignores the semantic features between word vectors; in the pooling layer, it keeps only the maximum value of each feature map and discards all other information. Therefore, in this paper, we propose a convolutional neural network based on multiple convolutions and pooling for text sentiment classification (variable convolution and pooling convolution neural network, VCPCNN). This paper makes three contributions. First, a multiconvolution and pooling neural network is proposed as an extension of the TextCNN network structure. Second, four convolution operations are introduced along the word-embedding dimension (direction), which help to mine local features on the semantic dimensions of word vectors. Finally, average pooling is introduced in the pooling layer, which helps to preserve important information in the extracted features. Verification tests were carried out on four sentiment datasets, covering English sentiment polarity, Chinese sentiment polarity, Chinese subjective/objective sentiment, and Chinese multicategory sentiment. Our approach is effective: its results were up to 1.97% higher than those of the TextCNN network.
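The two ideas named in the abstract, convolving along the word-embedding direction in addition to the sentence direction, and keeping an average-pooled value alongside the max-pooled one, can be illustrated with a toy example. The following is a minimal pure-Python sketch, not the VCPCNN implementation: the abstract does not specify the four convolution operations, so the function names, kernel shapes, all-ones kernels, and the tiny sentence matrix here are all assumptions chosen only for illustration.

```python
# Toy sentence matrix: 3 words, each a 4-dimensional word vector (hypothetical values).
sentence = [
    [1, 0, 2, 1],
    [0, 1, 1, 0],
    [2, 1, 0, 3],
]

def conv_along_words(sentence, kernel):
    """TextCNN-style convolution: the kernel spans the full embedding width
    and slides over word positions (the sentence-length direction)."""
    h = len(kernel)                      # number of words covered by the kernel
    out = []
    for i in range(len(sentence) - h + 1):
        s = 0.0
        for r in range(h):
            for c in range(len(kernel[r])):
                s += sentence[i + r][c] * kernel[r][c]
        out.append(s)
    return out

def conv_along_embedding(sentence, kernel):
    """Illustrative variant: the kernel spans all words and slides over
    the embedding dimensions instead of over word positions."""
    w = len(kernel[0])                   # number of embedding dimensions covered
    out = []
    for j in range(len(sentence[0]) - w + 1):
        s = 0.0
        for r in range(len(sentence)):
            for c in range(w):
                s += sentence[r][j + c] * kernel[r][c]
        out.append(s)
    return out

def max_pool(feature_map):
    """Keep only the largest activation, as in TextCNN."""
    return max(feature_map)

def avg_pool(feature_map):
    """Keep the mean activation, which retains information from every position."""
    return sum(feature_map) / len(feature_map)

# All-ones kernels (an assumption) make the arithmetic easy to follow by hand.
word_kernel = [[1, 1, 1, 1], [1, 1, 1, 1]]    # 2 words x 4 dims
embed_kernel = [[1, 1], [1, 1], [1, 1]]       # 3 words x 2 dims

word_features = conv_along_words(sentence, word_kernel)
embed_features = conv_along_embedding(sentence, embed_kernel)

# Concatenate max- and average-pooled values from both directions
# into one feature vector that a classifier layer could consume.
pooled = [
    max_pool(word_features), avg_pool(word_features),
    max_pool(embed_features), avg_pool(embed_features),
]
```

With these values, the word-direction feature map has two entries and the embedding-direction one has three. Max pooling reduces each map to a single peak, while average pooling summarizes the whole map, so concatenating both gives the classifier more than the single strongest response.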