Deep Learning For Short Text Semantic Similarity Measures

Posted on: 2016-02-20  Degree: Master  Type: Thesis
Country: China  Candidate: X Y Chen  Full Text: PDF
GTID: 2308330476454965  Subject: Computer Science and Technology
Abstract/Summary:
With the growth of social networks and human-computer interaction technology, short text has become widespread on the Internet. As a basic technology for short-text processing, short-text semantic similarity measurement has broad prospects.

In this thesis, we analyze the features of short text and argue that measuring the semantic similarity of short texts requires both syntactic information and word embeddings. We therefore propose a short-text vectorization model that combines dependency features with word embeddings and, building on it, a multi-feature model for measuring short-text semantic similarity.

First, we study the theory and optimization of dependency parsing in order to find a parsing model that is both accurate and fast. We propose a transition-based structural dependency parsing model that uses the Yamada algorithm, and we test different transition sets, feature sets, and part-of-speech tag sets for this model. The accuracy of our model is very close to that of the state-of-the-art transition-based dependency parser, and it is the fastest among the parsers whose accuracy exceeds 85%.

Second, building on dependency parsing, we propose a short-text semantic vectorization model that uses deep learning and dependency features. In this model, we add a context vector to the neural network feature vectorization model and obtain the semantic representation of a short text via backpropagation. This representation effectively combines word meaning, syntactic information, and semantic information. We then propose a short-text similarity measurement model that combines TF-IDF features, dependency features, a topic model, a neural network language model, and our short-text vectorization model.

Finally, we test our model on short-text data from the Internet. The experiments show that our model achieves the highest accuracy among the compared models and that the dependency feature is important for measuring short-text semantic similarity.
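As a rough illustration of the multi-feature idea described in the abstract, the Python sketch below blends a TF-IDF cosine similarity with a cosine similarity over averaged word embeddings. It is not the thesis model: it covers only two of the listed features and omits the dependency features, the topic model, the neural network language model, and the learned short-text vectors. The function names, the toy embedding table, the smoothed IDF formula, and the equal weighting are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF (an assumed choice) keeps weights positive for terms shared by all documents.
        vectors.append({
            t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        })
    return vectors


def cosine_sparse(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def mean_embedding(tokens, emb, dim):
    """Average the embeddings of known tokens; zero vector if none are known."""
    vecs = [emb[t] for t in tokens if t in emb]
    if not vecs:
        return [0.0] * dim
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine_dense(a, b):
    """Cosine similarity of two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def combined_similarity(tokens_a, tokens_b, emb, dim, weight=0.5):
    """Weighted blend of lexical (TF-IDF) and embedding-based similarity."""
    tfidf_a, tfidf_b = tfidf_vectors([tokens_a, tokens_b])
    lexical = cosine_sparse(tfidf_a, tfidf_b)
    semantic = cosine_dense(
        mean_embedding(tokens_a, emb, dim),
        mean_embedding(tokens_b, emb, dim),
    )
    return weight * lexical + (1 - weight) * semantic


# Hypothetical 2-dimensional embeddings, for illustration only.
emb = {"cat": [1.0, 0.0], "dog": [0.9, 0.1], "car": [0.0, 1.0]}
print(combined_similarity(["cat"], ["dog"], emb, dim=2))
print(combined_similarity(["cat"], ["car"], emb, dim=2))
```

With these toy vectors, "cat" and "dog" share no tokens, so the lexical term contributes nothing and the score comes entirely from the embedding term. This is the kind of case where a purely lexical measure fails on short texts.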
Keywords/Search Tags: Dependency parsing, Semantic Similarity, Deep Learning, Neural Network