Research Of Social Network Relation Discovery Algorithm Based On Topic Model

Posted on: 2017-02-10  Degree: Master  Type: Thesis
Country: China  Candidate: Y Yang
GTID: 2308330485985948  Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet in recent years, more and more people have become connected online, forming a giant social network. This network holds considerable commercial value: if we can mine the social relations among people from it, we can tailor business strategies such as friend recommendation and advertisement delivery to specific groups. Social network research also matters for information transmission. By analyzing the network, we can identify the key nodes through which information spreads and use them to limit the spread of negative information.

Research on social network relation discovery algorithms has already produced some results. The contribution of this thesis is a social network discovery algorithm based on a topic model: text classification is used to classify network news automatically, and the social relationships among network friends are then mined from the classification results.

Traditional text classification algorithms do not consider the topic information of a text; the model relies on word frequency information alone. This thesis adds the concept of a topic to the text classification model. It retains the surface-level information of the text while also taking into account the deeper semantics hidden in it, and combines the two levels to improve classification. The main work of this thesis is as follows:

1. LDA topic modeling and topic selection. I preprocess the original news texts, build an LDA model, and obtain its output: a topic distribution matrix and a term distribution matrix. I then use three methods (an independence test, a feature-word coverage test, and an information entropy test) to screen for the better topics, and compare these screening methods experimentally.
2. I combine the LDA model with the VSM model to propose the LDA_KMeans text classification algorithm, with an improved text similarity formula. I also combine the LDA model with the SVM model to propose the LDA_SVM text classification algorithm, with an improved feature weight formula.
3. Finally, I compare LDA classification, LDA_KMeans classification, SVM classification, and LDA_SVM classification experimentally, using recall, precision, F1 value, and other measures. The conclusion is that algorithms combined with the LDA model achieve higher classification accuracy than algorithms based on a single model, and that LDA_SVM performs best of the four. Based on the LDA_SVM classification results, I use Gephi to draw the social network relationship graph behind the texts.
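The information entropy screening and the combination of LDA with VSM mentioned above can be illustrated with a minimal Python sketch. The abstract does not give the thesis's actual formulas, so everything here is an assumption made for illustration: the rule that lower-entropy topics are the better ones, the threshold `max_entropy`, the weight `alpha`, and all function names are hypothetical.

```python
import math


def topic_entropy(term_probs):
    """Shannon entropy (in bits) of one topic's term distribution."""
    return -sum(p * math.log2(p) for p in term_probs if p > 0)


def select_topics(topic_term, max_entropy):
    """Indices of topics whose entropy is at most max_entropy.

    Assumption: a low-entropy topic concentrates on few terms and is
    therefore treated as more distinctive. The thesis may use another rule.
    """
    return [k for k, row in enumerate(topic_term)
            if topic_entropy(row) <= max_entropy]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def combined_similarity(words_a, words_b, topics_a, topics_b, alpha=0.5):
    """Hypothetical blend of word-level (VSM) and topic-level (LDA) similarity.

    alpha weights the word-frequency vectors; (1 - alpha) weights the
    per-document topic distributions.
    """
    return (alpha * cosine(words_a, words_b)
            + (1 - alpha) * cosine(topics_a, topics_b))


def f1_value(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0
```

For example, `select_topics(rows, 1.0)` would keep only the rows of a topic-term matrix whose entropy is at most one bit, and `combined_similarity` with `alpha=0.5` gives the word-level and topic-level views of two documents equal weight.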
Keywords/Search Tags: LDA, Social network, Topic model, Text classification