Anomalous Connected Subgraph Pattern Mining Of Multi-source Data

Posted on: 2022-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: L N Yuan
GTID: 2518306485994629
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of science and technology, data has grown explosively, and society has entered the era of big data. Data is becoming increasingly multi-source, heterogeneous, and complex. People can easily access and collect large amounts of data, but they also face data security and reliability problems such as information leakage. The behaviors behind such problems differ from the behavior patterns of normal users and are called abnormal behaviors or abnormal patterns. If these abnormal behaviors or patterns can be detected in the data in advance, great losses can be avoided. It is therefore of great significance to mine abnormal patterns in multi-source data accurately and efficiently. The main contents are as follows.

Firstly, to handle the multi-source, complex, and diverse structure of the data, a multi-source data processing framework based on complex networks is proposed. It transforms data features into node attributes and data dependencies into edges, constructing a complex network from the data for further study. This framework supports both the pattern mining algorithm and the text generation algorithm.

Secondly, to address the problem that abnormal connected subgraphs cannot be detected when the prior pattern is unknown and the data volume is large, an abnormal pattern mining algorithm, ACSU, is proposed. Data were crawled from Sky Eye Search, an aviation network, Baidu Migration, and the academic big data (journals and patents) of the School of Information and Electrical Engineering of Hebei University of Engineering; together these cover several types of data (directed, undirected, static, and dynamic). Network evaluation indices are used to describe the data characteristics at the level of overall structure and to reveal regularities of the networks. The ACSU algorithm is then used to mine and analyze the abnormal patterns of multiple data sets. The experimental results show that the data processing framework and the pattern mining algorithm can quickly extract data features, find significant abnormal patterns in the data, and run efficiently, so they can be applied in practice.

Thirdly, for text data, information must be obtained from node attributes. To address the long training time of traditional methods, a text generation method based on abnormal subgraph detection, BJ-ASGD, is proposed, which aims to extract key information from multi-source data. The text generation problem is first formally defined and transformed into a subgraph optimization problem; keywords are then extracted for subgraph detection. Experiments on three corpora, Daily Mail, CNN, and News-5, show that, compared with the baseline method, the proposed text generation algorithm greatly reduces training time and improves efficiency while maintaining accuracy.

The algorithms proposed in this thesis can be applied in a variety of real-life scenarios. In computer networks, attack subgraphs can be detected. In social networks, the discovered patterns can be used for topic discovery and abnormal user detection. They can also provide strong support for applications such as reading comprehension.
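
The mapping described in the abstract (data features become node attributes, data dependencies become edges) can be illustrated with a minimal Python sketch. This is not the ACSU algorithm, whose details the abstract does not give. The function names, the `score` attribute, and the threshold rule are all hypothetical, and the "abnormal connected subgraph" step is replaced here by a plain baseline: keep the nodes whose score exceeds a threshold and take the connected components of the induced subgraph. Network density stands in for the "network evaluation indices".

```python
from collections import deque


def build_network(records, dependencies):
    """Turn records into node attributes and dependencies into undirected edges.

    records: dict mapping node id -> dict of features (hypothetical layout).
    dependencies: iterable of (u, v) pairs between node ids.
    """
    attrs = {node: dict(features) for node, features in records.items()}
    adj = {node: set() for node in attrs}
    for u, v in dependencies:
        if u == v:
            continue  # ignore self-loops in this sketch
        for node in (u, v):
            attrs.setdefault(node, {})   # nodes seen only in edges get empty attributes
            adj.setdefault(node, set())
        adj[u].add(v)
        adj[v].add(u)
    return attrs, adj


def density(adj):
    """Density of a simple undirected graph: 2m / (n(n-1))."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    return 0.0 if n < 2 else 2 * m / (n * (n - 1))


def induced_subgraph(adj, keep):
    """Adjacency restricted to the nodes in `keep`."""
    return {node: adj[node] & keep for node in adj if node in keep}


def connected_components(adj):
    """Connected components found by breadth-first search."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        components.append(component)
    return components


def high_score_subgraphs(attrs, adj, threshold):
    """Baseline stand-in (an assumption, not ACSU): connected groups of
    nodes whose hypothetical 'score' attribute exceeds the threshold."""
    keep = {n for n, a in attrs.items() if a.get("score", 0.0) > threshold}
    return connected_components(induced_subgraph(adj, keep))


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    records = {
        "a": {"score": 0.9}, "b": {"score": 0.8}, "c": {"score": 0.1},
        "d": {"score": 0.95}, "e": {"score": 0.2},
    }
    dependencies = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
    attrs, adj = build_network(records, dependencies)
    print(density(adj))
    print(high_score_subgraphs(attrs, adj, threshold=0.5))
```

Thresholding node scores is only the simplest way to obtain candidate connected subgraphs; a method that works without a known prior pattern, as the abstract describes for ACSU, would need a different scoring and search strategy.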
Keywords/Search Tags: anomaly subgraph detection, complex network, data mining, graphlets, multi-source data