Design And Implementation Of Micro-blog Based Semantic Retrieval Subsystem

Posted on: 2018-11-23
Degree: Master
Type: Thesis
Country: China
Candidate: Q Ye
GTID: 2348330518997010
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet, social networking platforms have become an important place for people to express their views and comments. Platforms such as Weibo produce large amounts of data every day. Rapid semantic retrieval and statistical analysis of this massive social network data helps to gain a real-time understanding of the general public's emotional tendencies towards certain events. The micro-blog big data platform analyzes public opinion, hot topics, and public sentiment, and predicts users' emotional attitudes toward, or evaluations of, certain topics according to the analysis results.

Based on the data and the analysis results of the micro-blog big data platform, this thesis studies and implements a semantic retrieval subsystem for retrieving Weibo data. The work consists of the following parts. In the first part, several word semantic similarity algorithms are studied, including word similarity computation based on HowNet and word similarity computation based on Tongyici Cilin. In the second part, to overcome the shortcomings of these traditional algorithms, a method based on distributed representation is proposed: each word is converted to a vector, and the similarity between vectors is used as the similarity between words. In addition, the results of traditional word semantic similarity computation are incorporated to a certain degree, to make the final similarity more accurate and more reasonable. In the third part, based on this new word semantic similarity method, each module of the semantic retrieval subsystem is designed and its functions are implemented.

Finally, tests based on Weibo data and the trained semantic model show that the semantic retrieval subsystem performs well in practical use.
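
The abstract does not state how the vector-based score and the traditional score are combined. As a minimal sketch only, one plausible reading is a weighted sum of the cosine similarity between two word vectors and a precomputed dictionary-based score (for example from HowNet or Tongyici Cilin). The function names, the `weight` parameter, and its default value are hypothetical and are not taken from the thesis.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_u * norm_v)


def combined_similarity(vec_a, vec_b, traditional_score, weight=0.8):
    """Blend vector similarity with a traditional similarity score.

    `traditional_score` is assumed to be a value in [0, 1] computed
    elsewhere by a dictionary-based method. `weight` is a hypothetical
    mixing parameter: 1.0 uses only the vectors, 0.0 only the
    traditional score.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    vector_score = cosine_similarity(vec_a, vec_b)
    return weight * vector_score + (1.0 - weight) * traditional_score
```

In a sketch like this, the weight would have to be tuned on real data; the abstract only says that traditional results are added "to a certain degree".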
Keywords/Search Tags: full text search, semantic similarity, distributed representation, word vector