Research On Mining Community From Emails

Posted on: 2008-03-09
Degree: Master
Type: Thesis
Country: China
Candidate: X D Zhang
GTID: 2178360242999132
Subject: Computer Science and Technology
Abstract/Summary:
On a network, a person's electronic footprint, such as web pages, emails and documents, can indicate which domains that person is involved in and how deeply. Email, as one category of network data, has become a powerful platform for cooperation and knowledge sharing, and it is used throughout social, commercial and technical communication. Email is also closely tied to the study of social networks. It is typically used for dynamic communication within an organization, and this dynamic character blurs the organization's internal hierarchy and structure. As a result, when faced with a large volume of automatically collected emails, it is difficult to identify communities, and the cooperation and leadership relations within them, quickly and accurately among the many complicated links.

In this thesis, we take email as the object of study and investigate techniques for community mining. These include methods for discovering plausible organizations in a large volume of emails and for evaluating the rank, or importance, of each member. First, starting from an initial set of important email accounts, we expand the set of important nodes within a fixed number of steps, finding other nodes that are closely related to the nodes the user cares about. Traditional community partitioning methods can rarely finish on very large data sets within a reasonable time. We therefore improve the efficiency and accuracy of the partitioning methods for our particular application, and use the improved methods to construct the candidate community set that is evaluated next.
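The abstract does not spell out the expansion procedure, so the following is only a minimal sketch of one plausible reading: a breadth-first search from a seed set of accounts, cut off after a fixed number of hops. The names `build_graph`, `expand_seeds` and `max_steps`, and the treatment of each email as an undirected contact, are assumptions made for illustration and are not taken from the thesis.

```python
from collections import defaultdict, deque


def build_graph(messages):
    """Build an undirected contact graph from (sender, recipient) pairs."""
    graph = defaultdict(set)
    for sender, recipient in messages:
        if sender != recipient:
            graph[sender].add(recipient)
            graph[recipient].add(sender)
    return graph


def expand_seeds(graph, seeds, max_steps):
    """Return {account: hop distance} for accounts within max_steps of any seed."""
    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if distance[node] == max_steps:
            continue  # do not expand past the step limit
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


# Hypothetical toy data: a chain a-b-c-d-e plus an unrelated pair x-y.
messages = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("x", "y")]
graph = build_graph(messages)
reached = expand_seeds(graph, ["a"], max_steps=2)
```

With a step limit of 2, only `a`, `b` and `c` are reached. Bounding the search this way keeps the candidate set small before any more expensive partitioning is attempted.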
Next, to assign an importance rank value to each node in this set, we analyze the differences between web page links and email communication, and adapt Google's PageRank algorithm into the Importance Rank (IR) algorithm. IR is based on communication frequency and email topic, and is used to evaluate the rank of each member of the organization. Finally, we add the nodes with higher importance values to the important customer set. We carry out experiments and obtain the expected results, and the methods have been applied in a practical system. The experiments yield the important nodes that are closely related to those in the original set; combined with sampled data, these provide valuable information. The mining method identifies usable organizations with reasonable accuracy, and the IR algorithm ranks nodes with reasonable precision, so a node's cooperation and leadership relations can be estimated from these results.
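The abstract gives no formula for IR, so the sketch below is not the thesis's algorithm. It shows standard PageRank with edges weighted by message count, which is one simple way to bring communication frequency into the ranking. The function name `importance_rank`, the damping factor of 0.85 and the iteration count are assumptions, and the email-topic component is left out entirely.

```python
from collections import defaultdict


def importance_rank(messages, damping=0.85, iterations=50):
    """Frequency-weighted PageRank over a directed email graph.

    messages: iterable of (sender, recipient) pairs; a repeated pair
    increases the weight of that edge.
    """
    weight = defaultdict(lambda: defaultdict(float))
    nodes = set()
    for sender, recipient in messages:
        nodes.update((sender, recipient))
        if sender != recipient:
            weight[sender][recipient] += 1.0
    n = len(nodes)
    if n == 0:
        return {}
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by accounts that never send mail is spread evenly.
        dangling = sum(rank[v] for v in nodes if not weight.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for sender, out_edges in weight.items():
            total = sum(out_edges.values())
            for recipient, w in out_edges.items():
                # Each sender passes on rank in proportion to message count.
                new_rank[recipient] += damping * rank[sender] * w / total
        rank = new_rank
    return rank


# Hypothetical toy data: carol sends mail but never receives any.
messages = [("alice", "bob")] * 3 + [
    ("carol", "bob"), ("bob", "alice"), ("carol", "alice"),
]
ranks = importance_rank(messages)
```

In this toy data `carol` receives no mail and ends up with the lowest score. The ranks sum to 1, so they can be compared directly across accounts.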
Keywords/Search Tags: Community mining, target discovery, importance rank, relationship