
Research On Short Text Classification Based On Word Distributed Representation

Posted on: 2016-07-13  Degree: Master  Type: Thesis
Country: China  Candidate: D P Jiang  Full Text: PDF
GTID: 2308330470967669  Subject: Computer application technology
Abstract/Summary:
With the Internet now ubiquitous, online news has become an important way for people to obtain news, and social platforms have become an important place for people to express opinions and comment. For some government departments, analyzing online public opinion in order to understand people's livelihood and make better policy has become increasingly important. Online news and social networking platforms carry many types of content, which must be classified according to requirements before public opinion in a particular area can be analyzed. To address this problem, this paper presents a short text classification framework based on word distributed representation. The main work is summarized as follows:
(1) An in-depth study of the traditional vector space model and of short text shows that the vector space model is well suited to modeling long text but represents short text poorly, because a short text contains only a few words. We therefore introduce word distributed representation for short text modeling.
(2) After studying the neural probabilistic language model, we propose a Weighted Continuous Bag of Words model based on Word2Vec; after studying topic models, we propose a word distributed representation based on LDA.
(3) We present a short text classification framework based on word distributed representation, focusing on two points: short text expansion based on word distributed representation, and collaborative representation of short text based on multiple types of word distributed representation.
(4) We constructed a short text classification dataset by crawling news titles and news content from the New York Times. Experiments on this dataset show the effectiveness of our methods.
Keywords/Search Tags: Short Text, Classification, Word2Vec, Word Distributed Representation
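The abstract does not give implementation details, so the following is only a minimal sketch of the two general ideas it names: expanding a short text with words that lie close to it in the embedding space, and representing the text as a weighted combination of word vectors. It is not the thesis's Weighted Continuous Bag of Words model or its LDA-based representation. The toy vectors in `VECTORS`, the function names `expand` and `text_vector`, and the optional per-word `weights` are all hypothetical, chosen for illustration; in practice the vectors would come from a trained model.

```python
import math

# Hypothetical 3-dimensional word vectors (illustration only).
# Real vectors would be learned, e.g. by a Word2Vec-style model.
VECTORS = {
    "stocks": [0.9, 0.1, 0.0],
    "market": [0.8, 0.2, 0.1],
    "shares": [0.85, 0.15, 0.05],
    "goal": [0.1, 0.9, 0.0],
    "match": [0.0, 0.8, 0.2],
    "league": [0.05, 0.85, 0.1],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand(tokens, vectors, k=1):
    """Append, for each known token, its k nearest vocabulary words
    that are not already part of the expanded text."""
    expanded = list(tokens)
    for tok in tokens:
        if tok not in vectors:
            continue  # out-of-vocabulary words cannot be expanded
        candidates = [w for w in vectors if w not in expanded]
        candidates.sort(key=lambda w: cosine(vectors[tok], vectors[w]),
                        reverse=True)
        expanded.extend(candidates[:k])
    return expanded


def text_vector(tokens, vectors, weights=None):
    """Weighted average of the vectors of all known tokens.
    Words missing from `weights` get weight 1.0 (an assumption)."""
    weights = weights or {}
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok in tokens:
        if tok in vectors:
            w = weights.get(tok, 1.0)
            total = [t + w * x for t, x in zip(total, vectors[tok])]
            weight_sum += w
    return [t / weight_sum for t in total] if weight_sum else total
```

A short title such as `["stocks"]` could first be passed through `expand` and then through `text_vector`; the resulting dense vector could be fed to any standard classifier. The weights might, for example, be TF-IDF scores, though the abstract does not say which weighting the thesis uses.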