Research And Implementation Of Person Relationship Map Based On Co-Occurrence And Association Mining

Posted on:2020-04-17

Degree:Master

Type:Thesis

Country:China

Candidate:J Li

Full Text:PDF

GTID:2428330572493941

Subject:Computer technology

Abstract/Summary:

The pace of modern life keeps increasing, and readers find it difficult to set aside long stretches of time for reading. This thesis provides methods for quickly understanding the relationships among the characters in a text. The extracted data are used to identify the main characters of the whole work, and a character relationship map displays each character's circle of interpersonal relationships. This helps readers grasp how closely the characters are related before reading the full text, saving considerable reading time.

The thesis selects "White Deer" as its research object and studies it using co-occurrence analysis and association rule mining. The program is written in Python. Co-occurrence analysis extracts the name nodes in the text and assigns each a weight; at the same time, the weight of the edge between each pair of nodes is extracted from the corpus. A co-word matrix is then constructed from the extracted nodes and keyword pairs. To obtain a similarity matrix, similarity is measured with the Ochiai coefficient: the more closely two keywords are related, the larger the value and the higher the similarity.

Euclidean distance is the most intuitive measure of the straight-line distance between two points in two-dimensional space. SPSS is used to compute Euclidean distances on the co-word matrix: the larger the distance, the greater the difference; the smaller the distance, the higher the similarity. To analyze the clustering of the co-word matrix more thoroughly, both R-type and Q-type clustering are performed on it. R-type clustering reveals the closeness not only between individual variables but also between combinations of variables. Q-type clustering groups cases according to variable information, and the resulting dendrogram illustrates the outcome of the cluster analysis more clearly.

To draw the character relationship map, the text files holding the extracted node information and edge information are each converted to .CSV format and imported into Gephi, and the map is drawn according to the predefined requirements. The resulting map makes it more intuitive to analyze how close the characters are to one another.

Weka is used as an auxiliary tool for mining association rules, with the commonly used Apriori algorithm. Constructing the data set is an important step in Apriori: the whole text is treated as a database and split into chapters, the keywords appearing in each chapter form one record, and the keyword lists of all chapters together form the data set. The database is scanned multiple times to find frequent itemsets in the constructed data set, from which association rules between characters are derived.
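The pipeline summarized above can be illustrated with a minimal Python sketch. It is not the thesis's actual program: the `records` data, the placeholder names `A`, `B`, `C`, the function names, and the `min_support` threshold are all hypothetical, and the rule step covers only pairs of names (the 2-itemset stage of Apriori), not the full multi-pass algorithm. Each record is assumed to be the set of character names found in one text unit, such as a chapter.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical records: the set of character names found in each text unit.
records = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B"},
]

def cooccurrence(records):
    """Count records containing each name (node weight) and each pair (edge weight)."""
    node_counts = Counter()
    edge_counts = Counter()
    for names in records:
        node_counts.update(names)
        # Sort so that each unordered pair always maps to the same key.
        edge_counts.update(combinations(sorted(names), 2))
    return node_counts, edge_counts

def ochiai(node_counts, edge_counts):
    """Ochiai coefficient: co-occurrences / sqrt(count(a) * count(b))."""
    return {
        (a, b): c / sqrt(node_counts[a] * node_counts[b])
        for (a, b), c in edge_counts.items()
    }

def pair_rules(records, node_counts, edge_counts, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for frequent pairs."""
    n = len(records)
    rules = []
    for (a, b), c in edge_counts.items():
        support = c / n
        if support >= min_support:
            rules.append((a, b, support, c / node_counts[a]))
            rules.append((b, a, support, c / node_counts[b]))
    return rules

node_counts, edge_counts = cooccurrence(records)
similarity = ochiai(node_counts, edge_counts)
rules = pair_rules(records, node_counts, edge_counts)
```

In this sketch, `node_counts` and `edge_counts` play the role of the node and edge weights, `similarity` holds the nonzero off-diagonal entries of a similarity matrix, and `rules` lists pairwise rules with their support and confidence.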

Keywords/Search Tags:

Co-occurrence Analysis, Cluster Analysis, Person Relationship Diagram, Frequent Itemsets, Association Rules

Related items

1	An Algorithm And Context Analysis Of Mining Frequent Closet Itemsets
2	Research On The Method Of Condensing Association Rules
3	Research On Top-K Frequent Itemsets Datamining Algorithm
4	Research On Mining Algorithms Of Maximal Frequent Itemsets And Opened Frequent Itemsets
5	Research On Key Algorithms For Mining Frequent Patterns In Data Streams And Their Application In Simulation System
6	Research And Improvement On The Algorithms Of Mining Association Rules
7	Research And Application Of Frequent Itemsets Mining Algorithm
8	Research On Frequent Closed Itemsets Mining Algorithms
9	The Applied Research Of Customer Relationship Management Based On Association Rules
10	Mining Negative Association Rules Study Based On Negative Frequent Itemsets