
Research On The Method Of Subgraph Query Based On Kinship Network

Posted on: 2018-01-30
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhang
Full Text: PDF
GTID: 2348330515474729
Subject: Software engineering
Abstract/Summary:
A kinship network describes spousal relations, parent-child (birth) relations, and the relations between persons and households. Every more complicated relationship between people can be expressed through the transitive closure of five basic relationships: spouse-to-spouse, father-to-daughter, father-to-son, mother-to-son, and mother-to-daughter. The population database contains data at five different levels, and the village is the basic research unit of the kinship network. A village consists of households, and its data cover the families, the detailed information of each household, the basic relationships between people, and the relationships between households and persons. As the amount of data increases, the complexity of the relationships continues to rise, and finding a specific subgraph that meets a given requirement has become one of the difficulties of the study.

This paper is based on real data. It analyses and preprocesses the data, and identifies and corrects erroneous data in the kinship network, which improves the quality of the population data. To make the population data easier to manage, we add new node information and relationships. For the province's population data, we extend the dimensions of the original kinship network. The limited pattern matching method can solve only part of the matching problem, so this paper puts forward the oriented limited pattern subgraph query method, which improves the definition of the model and introduces a new match rule. Based on the definitions of different families in sociology and demography, the new method is used to query different family structures. The research content mainly includes the following aspects:

(1) Research on the kinship network data model. The structure of a kinship network is changeable and its relationships are complex. Batches of complex queries require a sound storage structure, but complex queries on a relational database are less efficient than expected, and their results are not intuitive. On the basis of real demographic data, and in line with the actual meaning expressed by the kinship network, this paper describes the kinship network using both table and graph structures. It analyses how the real population data are designed in a relational database and, drawing on the characteristics of graph structures, completes the transformation from the relational model to the graph model. Table-structured and graph-structured storage are used together to analyse and test the data quality of the network.

(2) Research on the redundant data screening algorithm. Because of data collection, migration of registered permanent residence, and data updates, the real population database contains many duplicate and near-duplicate nodes. That is, some people have multiple population codes, and the different population codes carry different kinship relations. Identifying and correcting erroneous data in the kinship network is therefore an urgent task. This paper uses the redundant data screening algorithm to delete duplicate nodes and to update the edges carried by the deleted nodes. To describe the evolution of households clearly, the original household-to-person edge is extended into two edge types: household-to-person and old household-to-person.

(3) The oriented limited pattern subgraph query method. Our laboratory group previously put forward the limited pattern matching method. It can accurately describe some specific structures, but it cannot describe complex connected substructures. This paper therefore puts forward the oriented limited pattern subgraph query method, which improves the definition of the original model and introduces a new match rule. The method can match, on demand, subgraphs that have a specific structure. For an incomplete family, for example, the method can be used to describe the search criteria: the selection principle for the subgraph is determined according to the definitions and standards of demography, and is then converted into matching criteria.

(4) Application of the oriented limited pattern subgraph query method. Starting from practical applications, this paper puts forward three kinds of subnet: the dense subnet, the subnet of disadvantaged families, and the abnormal subnet. The oriented limited pattern subgraph query method is applied to the kinship network to query these specific subgraphs. A selection principle is given for each subnet according to the definitions of demography; a specific model is generated from the selection principle; the model is converted into a match rule; and the rule is then used to query the subnet in the entire network. Finally, the results are visualized.
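
The two steps described in items (2) and (3) can be illustrated with a small Python sketch. It is not the thesis's algorithm: the abstract gives neither the screening procedure nor the match rules, so everything below is an assumption made for illustration. Edges are assumed to be (source, relation, target) triples using the five basic relation labels, the duplicate-to-canonical mapping is assumed to be known in advance, and "incomplete family" is read here as a parent who has children but no spouse edge. The function names `merge_duplicates` and `single_parent_families` are hypothetical.

```python
from collections import defaultdict

# Relation labels come from the abstract; the triple layout is an assumption.
PARENT_RELATIONS = {
    "father-to-son", "father-to-daughter",
    "mother-to-son", "mother-to-daughter",
}
SPOUSE = "spouse-to-spouse"


def merge_duplicates(edges, canonical):
    """Redirect edges of duplicate population codes to one surviving code.

    edges: iterable of (source, relation, target) triples.
    canonical: dict mapping each duplicate code to the code that is kept.
    """
    merged = set()
    for src, rel, dst in edges:
        src = canonical.get(src, src)
        dst = canonical.get(dst, dst)
        if src != dst:  # skip self-loops that merging could create
            merged.add((src, rel, dst))
    return merged


def single_parent_families(edges):
    """Return {parent: sorted children} for parents that have no spouse edge."""
    children = defaultdict(set)
    has_spouse = set()
    for src, rel, dst in edges:
        if rel in PARENT_RELATIONS:
            children[src].add(dst)
        elif rel == SPOUSE:
            has_spouse.update((src, dst))
    return {p: sorted(c) for p, c in children.items() if p not in has_spouse}


# Made-up sample data: "P4b" is a second population code for person "P4".
edges = [
    ("P1", "spouse-to-spouse", "P2"),
    ("P1", "father-to-son", "P3"),
    ("P2", "mother-to-son", "P3"),
    ("P4", "mother-to-daughter", "P5"),
    ("P4b", "mother-to-son", "P6"),
]
clean = merge_duplicates(edges, {"P4b": "P4"})
print(single_parent_families(clean))  # {'P4': ['P5', 'P6']}
```

Running the query before merging would report P4 and P4b as two separate one-child families, which shows why duplicate screening has to come before subgraph matching.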
Keywords/Search Tags: Kinship network, data pattern, error data screening, limited pattern, subgraph query