
Entity Profiling Algorithm Design And Implementation In Email Network

Posted on: 2015-10-13
Degree: Master
Type: Thesis
Country: China
Candidate: Q Liang
GTID: 2308330473453150
Subject: Information security
Abstract/Summary:
Social network entity profiling is one of the active research topics in social network analysis. In real social networks, it is usually hard to obtain personalized information about network entities, which makes it difficult to form an objective view of individuals and groups. By studying profile modeling methods for social networks, researchers can more easily understand the microscopic characteristics of social networks and thus make better use of social network technology to serve the community.

In this thesis, we choose a typical social network, the email communication network, as the research object. Drawing on theoretical results from complex networks and text data mining, we study the social network profiling problem from four aspects: community division, key node identification, node relationship prediction, and topic modeling on nodes. First, we use community division to find the social circle an entity belongs to. Second, we analyze the importance of nodes in the topology to deduce each node's position in the whole network. Then, based on the analysis of topological relations, we use link prediction to predict potential relationships and communication that may occur in the future. Finally, we use text topic modeling to extract topics from the entity's communication content, to learn which topics the entity is concerned with and which topics are hot in the network, and thereby profile the features of entities and groups.

The main contributions of this thesis include:

1. An optimized node relationship prediction algorithm based on ensemble learning. Our study finds that the correct results of some node relationship prediction algorithms are complementary. By transforming node relationship prediction into a classification problem, we use several classical algorithms as weak learners and combine them into a strong classifier with AdaBoost. Experimental results show that, on both the arXiv paper collaboration network dataset and the Enron email network, the proposed algorithm significantly improves accuracy and recall.

2. A set of entity profiling analysis modules for the email communication network, integrated into our own visual analysis system. Test results show that, using the proposed algorithms, the system can accurately extract the social circle an entity belongs to, identify the entity's position in the whole network, predict potential communication relationships, and discover hot topics, helping users better understand the social network and grasp its microscopic characteristics.
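The ensemble idea in the first contribution, treating link prediction as classification and boosting several classical predictors with AdaBoost, can be illustrated with a minimal sketch. The abstract does not name the classical algorithms or the weak-learner form, so everything here is an assumption: the four scores (common neighbors, Jaccard, Adamic-Adar, preferential attachment), the use of threshold stumps on a single score as weak learners, all function names, and the toy graph and labels, which are invented for illustration.

```python
import math

# Classical neighborhood-based link-prediction scores (assumed choices;
# the abstract does not say which classical algorithms were combined).
def common_neighbors(adj, u, v):
    return len(adj[u] & adj[v])

def jaccard(adj, u, v):
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0

def adamic_adar(adj, u, v):
    return sum(1.0 / math.log(len(adj[w]))
               for w in adj[u] & adj[v] if len(adj[w]) > 1)

def preferential_attachment(adj, u, v):
    return len(adj[u]) * len(adj[v])

SCORES = [common_neighbors, jaccard, adamic_adar, preferential_attachment]

def features(adj, pairs):
    """One row per node pair, one column per classical score."""
    return [[score(adj, u, v) for score in SCORES] for u, v in pairs]

def stump_predict(x, j, t, pol):
    """Weak learner: threshold a single score; output is +1 or -1."""
    return pol if x[j] >= t else -pol

def best_stump(X, y, w):
    """Return (error, feature, threshold, polarity) with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for pol in (1, -1):
                err = sum(wi for wi, x, yi in zip(w, X, y)
                          if stump_predict(x, j, t, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, pol)
    return best

def adaboost(X, y, rounds=5):
    """Discrete AdaBoost over the stumps; labels must be +1 / -1."""
    w = [1.0 / len(X)] * len(X)
    model = []
    for _ in range(rounds):
        err, j, t, pol = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, t, pol))
        # Raise the weight of misclassified pairs, lower it for correct ones.
        w = [wi * math.exp(-alpha * yi * stump_predict(x, j, t, pol))
             for wi, x, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Weighted vote of all stumps; +1 means a link is predicted."""
    vote = sum(alpha * stump_predict(x, j, t, pol) for alpha, j, t, pol in model)
    return 1 if vote >= 0 else -1

# Toy data, invented for illustration: two tight groups joined by the edge d-e.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"),
         ("e", "f"), ("e", "g"), ("f", "g"), ("f", "h"), ("g", "h"), ("d", "e")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

# Label +1 for pairs assumed to connect later, -1 for pairs that stay apart.
pairs = [("a", "d"), ("e", "h"), ("a", "e"), ("b", "e"), ("d", "f"), ("c", "h")]
labels = [1, 1, -1, -1, -1, -1]

X = features(adj, pairs)
model = adaboost(X, labels)
predictions = [predict(model, x) for x in X]
```

On this toy graph a single score already separates the two classes, so boosting adds nothing; the complementary effect the abstract describes would only appear on data where different scores are right for different pairs. On a real network such as the Enron email data, one would presumably compute features on an earlier snapshot, take labels from a later one, and evaluate on held-out pairs rather than on the training pairs.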
Keywords/Search Tags: social network analysis, data mining, community structure, link prediction, topic model