
A Study On Technique Of Email Group Analysis

Posted on: 2011-01-11
Degree: Master
Type: Thesis
Country: China
Candidate: H B Wang
Full Text: PDF
GTID: 2178330338475908
Subject: Computer application technology
Abstract/Summary:
As an important part of digital-evidence analysis, email group analysis indirectly reveals the social relationships within a social group by examining its members' emails. The approach is to analyze the email group network in order to uncover the relationships among members, such as the community structure (which person belongs to which community) and the significance of each member within the group. Algorithms for detecting community structure in a network fall into two categories: divisive and agglomerative. Divisive algorithms find the communities by deleting the medium edges, that is, the edges that lie between different communities, whereas agglomerative algorithms find them by merging nodes with high similarity. The divisive approach usually gives more accurate results. Algorithms for assessing a member's significance generally rely on a single index. They are not accurate enough, because the indices they use are one-sided and do not represent significance well.
This thesis first surveys the literature on email-evidence analysis and on methods for mining social relationships in networks. We focus on existing algorithms for detecting community structure and for evaluating the significance of each member of a network, and identify their shortcomings. Building on this work, we propose new methods for detecting the community structure of an email group and for evaluating the significance of each member within it.
First, to detect the community structure of an email group, we propose a new divisive algorithm. A successful community-detection algorithm needs two things: a good index for measuring how medium an edge is, and a good mechanism for checking whether a subgraph is a community.
To find the medium edges that connect two different communities, we combine the concept of edge betweenness centrality with the communication frequency between two email accounts, yielding a new index called "Mediumness" that measures how medium an edge is. We also propose an algorithm for calculating the betweenness centrality of every edge. To check whether a subgraph is a community, we give a thorough definition of community: a subgraph that fits the definition is a community; otherwise it must be divided further by deleting the medium edges inside it. The complete algorithm is then described on the basis of these two points. In the experiment, we construct four kinds of simulated network, with twenty instances of each, and use our algorithm and another divisive algorithm to detect their community structure. The results show that our algorithm is more accurate. Furthermore, to assess how accurately Mediumness identifies medium edges, we construct additional simulated networks and run the two algorithms again. The results show that Mediumness is better at finding medium edges.
Second, to evaluate the significance of each member of an email group, we propose a new method based on the community-detection algorithm above. The method sets up an evaluation model in which the importance of each member is assessed comprehensively from several angles. We apply this method to the Enron email dataset, and the results show that, to a large extent, it can evaluate the significance of each member of an email group.
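The divisive procedure described in the abstract (score every edge, delete the most medium one, repeat) can be illustrated with a minimal Python sketch. It is not the thesis's algorithm. The abstract does not give the formula for Mediumness, so the score used here, edge betweenness divided by one plus the number of emails exchanged, is purely an assumption chosen so that rarely used, high-betweenness edges rank highest. Likewise, the thesis stops splitting when a subgraph meets its definition of a community; this sketch substitutes a fixed target number of communities `k`. The function names and the toy data are hypothetical.

```python
from collections import defaultdict, deque


def edge_betweenness(adj):
    """Edge betweenness for an undirected, unweighted graph (Brandes-style).

    adj maps each node to the set of its neighbours.
    """
    score = defaultdict(float)
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = defaultdict(float)
        sigma[s] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back to s.
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += c
                delta[v] += c
    # Every node pair was visited from both endpoints, so halve the totals.
    return {e: b / 2.0 for e, b in score.items()}


def components(adj):
    """Connected components of the graph, as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def split_communities(freq, k=2):
    """Delete the highest-scoring edge until the graph has k components.

    freq maps frozenset({a, b}) to the number of emails exchanged by a and b.
    ASSUMPTION: score = betweenness / (1 + frequency); the thesis's actual
    Mediumness formula is not stated in the abstract.
    ASSUMPTION: the fixed k replaces the thesis's community-definition check.
    """
    adj = {}
    for edge in freq:
        a, b = tuple(edge)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    while len(components(adj)) < k:
        bet = edge_betweenness(adj)
        if not bet:
            break  # no edges left to delete
        target = max(bet, key=lambda e: bet[e] / (1.0 + freq[e]))
        a, b = tuple(target)
        adj[a].discard(b)
        adj[b].discard(a)
    return components(adj)


# Toy data: two tightly connected triangles joined by one rarely used link.
emails = {
    frozenset(("a", "b")): 10, frozenset(("b", "c")): 10,
    frozenset(("a", "c")): 10, frozenset(("d", "e")): 10,
    frozenset(("e", "f")): 10, frozenset(("d", "f")): 10,
    frozenset(("c", "d")): 1,
}
communities = split_communities(emails, k=2)
print(sorted(sorted(c) for c in communities))
```

In the toy network the link between `c` and `d` carries every shortest path between the two triangles and is used only once, so it receives the highest score and is deleted first, leaving the two triangles as separate communities.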
Keywords/Search Tags: email group, community structure, significance of member