
A Study On Optimization Of Text Clustering Based On Convolutional Neural Network

Posted on: 2019-12-11
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Sun
GTID: 2428330590467470
Subject: Information and Communication Engineering
Abstract/Summary:
Text clustering is an important technique in text mining and is widely used in information retrieval. It has important application value in organizing and browsing large-scale text collections and in automatically generating text-level categorizations. Existing text clustering algorithms often ignore the semantic relatedness of the words in a text, and their clustering results are unstable. The main work of this thesis is as follows.

First, a text clustering algorithm based on a convolutional neural network (CNN) is proposed. The algorithm trains a word2vec model on a large-scale corpus to learn the semantic relationships between words, and transforms each text into a sparse original vector by concatenating its word vectors. It then uses a simple CNN with a single convolutional layer to extract text features. Finally, the texts are clustered with the k-means algorithm. Experimental results show that the algorithm is effective, with an accuracy of more than 75%.

Second, an optimized feedback neural algorithm for text clustering is proposed. First, the algorithm expands the vector representation of each word, starting from the pre-trained word vectors, by using a bidirectional recurrent neural network to capture the word's context semantics in the text. The expanded representation contains both the meaning and the context of the word, so the vector representation of the text is richer and more complete. Then, the algorithm improves the traditional k-means algorithm with a secondary partition adjustment to avoid getting trapped in a local optimum. Finally, the fuzziness of the clustering result is used as the loss function of the CNN, and the CNN is trained continuously so that feature extraction and text clustering influence and optimize each other. Experimental results show that the proposed algorithm reaches an accuracy of more than 80%, an effective improvement over existing text clustering algorithms.
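
The pipeline described in the abstract (concatenated word vectors, one convolutional layer, then k-means) can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the toy corpus, the random vectors standing in for trained word2vec embeddings, the untrained random filters, the window size, and all function names. The `fuzziness` function is likewise only one plausible reading (mean entropy of soft cluster assignments); the abstract does not give the thesis's actual definition.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed_text(tokens, vectors, max_len, dim):
    """Stack word vectors into a (max_len, dim) matrix, zero-padded."""
    mat = np.zeros((max_len, dim))
    for i, tok in enumerate(tokens[:max_len]):
        if tok in vectors:
            mat[i] = vectors[tok]
    return mat

def conv_features(mat, filters):
    """One convolution over word windows, ReLU, then max-over-time pooling.

    filters has shape (n_filters, window, dim); returns (n_filters,).
    """
    _, window, _ = filters.shape
    n_pos = mat.shape[0] - window + 1
    windows = np.stack([mat[i:i + window] for i in range(n_pos)])
    acts = np.einsum("pwd,fwd->pf", windows, filters)
    return np.maximum(acts, 0.0).max(axis=0)

def sq_dists(X, centers):
    """Squared Euclidean distance from every point to every center."""
    return ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (no secondary partition adjustment)."""
    init = np.random.default_rng(seed).choice(len(X), size=k, replace=False)
    centers = X[init].copy()
    for _ in range(n_iter):
        labels = sq_dists(X, centers).argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave empty clusters where they are
                centers[j] = X[labels == j].mean(axis=0)
    return sq_dists(X, centers).argmin(axis=1), centers

def fuzziness(X, centers):
    """ASSUMPTION: mean entropy of softmax(-distance) assignments."""
    d = sq_dists(X, centers)
    probs = np.exp(-(d - d.min(axis=1, keepdims=True)))
    probs /= probs.sum(axis=1, keepdims=True)
    return float(-(probs * np.log(probs + 1e-12)).sum(axis=1).mean())

# Toy corpus; random vectors stand in for trained word2vec embeddings.
docs = [
    "cat dog pet".split(),
    "dog pet cat".split(),
    "stock bond market".split(),
    "market stock bond".split(),
]
dim, max_len = 8, 5
vocab = sorted({w for d in docs for w in d})
vectors = {w: rng.normal(size=dim) for w in vocab}
filters = rng.normal(size=(16, 2, dim))  # 16 untrained filters, window of 2

features = np.stack(
    [conv_features(embed_text(d, vectors, max_len, dim), filters) for d in docs]
)
labels, centers = kmeans(features, k=2)
fuzz = fuzziness(features, centers)
print(labels, round(fuzz, 4))
```

In a feedback setup like the one the abstract describes, a quantity such as `fuzz` would be minimized with respect to the filter weights, so that clustering quality drives feature extraction; that training loop is omitted here because the abstract does not specify it.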
Keywords/Search Tags: deep learning, text clustering, convolutional neural network, feature extraction, feedback clustering