People Relation Extraction And People Relation Analysis Of Chinese Microblog Based On Machine Learning

Posted on: 2018-01-04
Degree: Master
Type: Thesis
Country: China
Candidate: G Zhou
Full Text: PDF
GTID: 2348330518995303
Subject: Information and Communication Engineering
Abstract/Summary:
With the rapid development of Internet technology, the study of social networks has become increasingly important for public opinion monitoring and business analysis, and it has been a hot research topic in recent years. Microblog is the largest social network community in China, and its language material differs from that of traditional media. In view of the characteristics of microblog text, this thesis studies how to improve people relation extraction in users' social scenarios and how to improve the ability to predict people relations. The main work and results are as follows:

The random forest algorithm was improved to suit microblog language material. The SVMDT_RFC algorithm model was designed to strengthen people relation extraction on fuzzy samples by introducing the SVM decision tree into the random forest algorithm. Using a node-splitting algorithm for the SVM decision tree based on the maximum classification interval, together with a random forest voting algorithm weighted by the classification interval, the model improves people relation extraction on fuzzy samples. Compared with SVM and random forest, the results indicate that the proposed method improves the accuracy of people relation extraction on fuzzy samples, and that it can be applied to medium-length and long texts.

A study of traditional people relation modeling methods shows that they do not reproduce real-world people relations faithfully enough. Taking advantage of the fact that microblog language material lends itself to text emotion analysis, a people relation modeling scheme that introduces emotion strength was designed.
In this scheme, users' attributes and behavior features are combined, users' emotion strength is analyzed by building an emotion dictionary and an expression dictionary, and the emotion feature is then introduced into the model to build a multi-dimensional model of people relations. The method simulates real people relations more accurately and improves the authenticity and validity of the model.

Based on the people relation model above, a people relation strength prediction scheme based on the multilayer perceptron was proposed and evaluated with 10-fold cross-validation. First, the scheme was compared with the decision tree model and the maximum entropy model; the results indicate that the method proposed in this thesis improves the accuracy of people relation prediction. Second, the features of the traditional relation model were compared with the proposed features; prediction accuracy increased after emotion was introduced, which supports the validity of the model. Finally, whereas traditional people relation strength prediction schemes express the result only as "strong" or "weak", the proposed scheme yields a multi-level quantized relation strength and so predicts the strength more precisely. Observing relations at different strength levels, and the influence of the emotion feature on the prediction, shows that multi-level quantization is useful for further study and analysis of relation strength.

The structure of the thesis and the content of each chapter are as follows:
Chapter 1 expounds the background and significance of microblog network research, and then analyzes the research status of people relation prediction and of relationship strength estimation.
Chapter 2 first introduces the people relation extraction procedure and the problems involved.
It then analyzes the problems of current relation extraction schemes.
Chapter 3 analyzes how to build the people relation extraction model and the shortcomings of the model, and then briefly introduces the relevant algorithms.
Chapter 4 analyzes the weakness of current people relation extraction, namely its insufficient performance on fuzzy samples. The SVM decision tree is introduced to improve the random forest algorithm, and a technical scheme for microblog people relation extraction based on the SVMDT_RFC algorithm is proposed.
Chapter 5 addresses the fact that people relation extraction provides a description of a relation but not its strength. First, the behavior-feature relation model is combined with attribute features, and the emotion feature is introduced. Then, to obtain a multi-level quantized output, a people relation prediction scheme based on the multilayer perceptron is introduced.
Chapter 6 summarizes the thesis, points out the shortcomings of the current work, and suggests directions for future improvement.
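The abstract mentions a random forest voting step weighted by the classification interval but does not give the formula. The following is only a minimal sketch of that idea, assuming each tree reports a predicted label together with a non-negative margin and that the margins are simply summed per label. The function names, the linear weighting, and the example labels are all hypothetical and are not taken from the thesis.

```python
from collections import Counter, defaultdict
from typing import Hashable, Iterable, List, Tuple

Vote = Tuple[Hashable, float]  # (predicted label, assumed margin >= 0)


def majority_vote(votes: Iterable[Vote]) -> Hashable:
    """Plain random forest voting: every tree counts once."""
    counts = Counter(label for label, _ in votes)
    if not counts:
        raise ValueError("no votes given")
    return counts.most_common(1)[0][0]


def margin_weighted_vote(votes: Iterable[Vote]) -> Hashable:
    """Assumed weighting: each tree's vote counts in proportion to its margin."""
    totals = defaultdict(float)
    for label, margin in votes:
        if margin < 0:
            raise ValueError("margin must be non-negative")
        totals[label] += margin
    if not totals:
        raise ValueError("no votes given")
    # The label with the largest accumulated margin wins.
    return max(totals, key=totals.get)


# Hypothetical votes from three trees on one ambiguous ("fuzzy") sample.
votes: List[Vote] = [("friend", 0.2), ("friend", 0.3), ("colleague", 0.9)]
print(majority_vote(votes))
print(margin_weighted_vote(votes))
```

In this made-up example two low-margin trees outvote one high-margin tree under plain majority voting, while the weighted vote follows the tree with the widest margin. That is the kind of case in which weighting by the classification interval could matter for fuzzy samples.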
Keywords/Search Tags: Chinese Microblog, People Relation Extraction, SVMDT_RFC, People Relationship Strength Prediction, Multilayer Perceptron