Online Learning Technology On Abstract Extraction System In Short Text Stream

Posted on: 2016-11-27
Degree: Master
Type: Thesis
Country: China
Candidate: L X Xu
GTID: 2298330467493038
Subject: Signal and Information Processing
Abstract/Summary:
With the rapid development of the Internet, the volume of information online is growing explosively. Microblogs are becoming more and more popular because posts are easy to publish and spread widely. With a microblog, users can post comments and express their feelings at any time and from anywhere. Microblog content covers every field, including military affairs, politics, entertainment, science, and technology. This wealth of information has significant commercial value, but it also makes it hard for users to find the information they care about in time. Abstract extraction for short text streams helps users reach the information they are interested in within a huge short text stream in a timely way, and therefore has real practical significance.

Against this background, this thesis studies the principles and techniques behind an abstract extraction system for short text streams. It designs and implements abstract extraction algorithms for short text streams based on Single Pass clustering and on a topic model. The main contributions are as follows:

1. By analyzing the characteristics of microblog short texts, this thesis incorporates the structured information of microblogs and optimizes the feature vector representation of microblog short texts.
2. To address the sparse features of short texts, this thesis proposes an improved method for calculating the similarity between a short text and a cluster, which solves the similarity drift problem of short texts.
3. Based on the improved similarity calculation method, this thesis designs and implements an improved Single Pass abstract extraction algorithm for short text streams, which improves the accuracy of both clustering and abstract extraction.
4. Based on a study of the long tail effect of microblog topics, this thesis designs and implements an improved topic-model-based abstract extraction algorithm for short text streams, which effectively reduces the impact of the long tail effect on the abstract extraction results.
5. This thesis designs and implements an abstract extraction system for short text streams, which contains a data preprocessing module, an algorithm module, and other components. Further research and experiments will be carried out on this system.
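
As a rough illustration of the Single Pass idea named in the abstract, the sketch below assigns each incoming short text to its most similar existing cluster, or opens a new cluster when no similarity reaches a threshold. It is not the thesis's improved algorithm: the abstract does not specify the improved similarity measure or the feature representation, so plain cosine similarity against a summed bag-of-words centroid is used as a stand-in. The names `single_pass`, `Cluster`, and `cosine`, the whitespace tokenizer, and the threshold value of 0.3 are all assumptions made for this example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


class Cluster:
    """Hypothetical cluster holding its member texts and a summed centroid."""

    def __init__(self, text, vec):
        self.texts = [text]
        self.vecs = [vec]
        self.centroid = Counter(vec)  # running sum of member vectors

    def add(self, text, vec):
        self.texts.append(text)
        self.vecs.append(vec)
        self.centroid.update(vec)  # Counter.update adds counts

    def summary(self):
        # Extractive abstract: the member text closest to the centroid.
        best = max(range(len(self.texts)),
                   key=lambda i: cosine(self.vecs[i], self.centroid))
        return self.texts[best]


def single_pass(stream, threshold=0.3):
    """Cluster a stream of short texts in one pass (assumed threshold)."""
    clusters = []
    for text in stream:
        # Assumed tokenizer: lowercase and split on whitespace.
        vec = Counter(text.lower().split())
        best, best_sim = None, 0.0
        for c in clusters:
            sim = cosine(vec, c.centroid)
            if sim > best_sim:
                best, best_sim = c, sim
        if best is not None and best_sim >= threshold:
            best.add(text, vec)
        else:
            clusters.append(Cluster(text, vec))
    return clusters


if __name__ == "__main__":
    posts = [
        "earthquake hits city",
        "earthquake hits the city center",
        "new phone released today",
        "phone released today with new camera",
    ]
    for cluster in single_pass(posts):
        print(len(cluster.texts), "posts; abstract:", cluster.summary())
```

Because each text is compared only against cluster centroids and processed once, the cost per post grows with the number of clusters rather than with the number of posts seen so far, which is what makes this family of methods suitable for streams. The fixed threshold and the unweighted centroid are the simplest possible choices and are exactly where an improved text-to-cluster similarity, such as the one the thesis describes, would be substituted.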
Keywords/Search Tags: short text stream, similarity drift, incremental clustering, abstract extraction