The Research And Implementation Of Text Clustering Based On The Platform Of Micro-blog

Posted on: 2016-06-01
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liu
GTID: 2308330470476868
Subject: Software engineering
Abstract/Summary:
The adoption of Web 2.0 technologies has accelerated the arrival of the big-data era on the Internet. Micro-blogs have become a new medium for social networking, and their billions of users have greatly enlarged the volume of online data, which creates major challenges for storing, mining, and deeply analyzing that data. The diverse formats of micro-blog posts mean that the edited text often differs considerably from the original text information. Micro-blogs touch many fields, such as marketing management and advertising technology, and cover a wide range of content. The fragmentation of this content makes the resulting data complex in structure and massive in volume. In this context, quickly and accurately mining and analyzing useful information has become a major research focus.

In the research and development of micro-blog applications, topic clustering is a foundational step: by classifying and summarizing the content of the text, it enables deeper analysis of topics. Traditional data mining techniques process text by selecting keywords, so the returned data contains many duplicates and some results of little relevance, which wastes processing resources. Traditional keyword-based techniques therefore cannot meet these requirements. Intelligent processing of micro-blog text, built on the micro-blog platform, could be a good solution to this problem.

First, this thesis collects micro-blog text and preprocesses it. Second, feature words are selected from the text, and the thesis investigates how best to select them by exploiting the characteristics of short text. Third, an algorithm tailored to the characteristics of micro-blogs is studied to optimize the clustering results. Furthermore, the clustering results are visualized to make them easier to interpret. Finally, the effectiveness of the method is validated experimentally, and directions for further study of the problems raised in this thesis are outlined.
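
The feature-word selection step can be illustrated with a small sketch. The abstract does not state the exact scoring method; the keywords mention information gain, so the example below assumes that criterion and also assumes each post carries a provisional topic label (for instance, taken from a hashtag). The function names, the toy posts, and the labels are all hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """IG(term) = H(labels) - H(labels | term present or absent)."""
    with_term = [lab for doc, lab in zip(docs, labels) if term in doc]
    without_term = [lab for doc, lab in zip(docs, labels) if term not in doc]
    n = len(labels)
    conditional = ((len(with_term) / n) * entropy(with_term)
                   + (len(without_term) / n) * entropy(without_term))
    return entropy(labels) - conditional


def select_features(docs, labels, k):
    """Return the k terms with the highest information gain.

    Ties are broken alphabetically so the result is deterministic.
    """
    vocab = sorted(set().union(*docs))
    ranked = sorted(vocab,
                    key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]


# Toy data (hypothetical): each post is reduced to a set of tokens.
posts = [
    "new phone camera review",
    "phone battery review",
    "football match tonight",
    "football team wins match",
]
labels = ["tech", "tech", "sport", "sport"]
docs = [set(p.split()) for p in posts]

top_terms = select_features(docs, labels, 3)
print(top_terms)
```

Words that appear in every post of one topic and in none of the other receive the highest score, while words that appear in a single post score lower. The selected words could then serve as the dimensions of the feature vectors passed to the clustering step.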
Keywords/Search Tags: topic clustering, micro-blog, feature vector, visualization, information gain