
Multi-source Heterogeneous Secure Data Processing Analysis Based On Multi-manifold Learning

Posted on: 2019-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: S Xiao
GTID: 2428330548487410
Subject: Engineering
Abstract/Summary:
The analysis of multi-source security data is the foundation of network security analysis and prediction, and fusion analysis of multi-source data is an important method for processing security data. Log data record changes in the status of a system. Manifold learning has been a widely used method of data dimensionality reduction and feature extraction over the last decade. It draws on computer science, mathematics, intelligent science, and cognitive science, and has become an active research direction in machine learning. Building on manifold learning, this thesis divides the fusion analysis of multi-source heterogeneous security data into three parts: multi-source data preprocessing, feature extraction, and security analysis.

The first part is the preprocessing stage, which deals with the multi-source security data themselves. Security data generally reside in network security devices. To reduce the heterogeneity of multi-source data in semantics, time, and space, and to remove dirty data, this thesis proposes a data preprocessing method based on a manifold learning algorithm. First, the data are filtered to identify noise data, together with other data cleaning operations. Then the manifold learning algorithm is used to reduce the number of data sources and the volume of data, together with other data reduction operations, to obtain high-quality data.

The second part is the security feature extraction stage, which extracts features from the preprocessed data. To analyze multi-source heterogeneous data sources and select reasonable features that reveal the essential characteristics of the data, a feature extraction method based on a multi-manifold learning algorithm is proposed. The method takes into account both the category attributes and the distance information of the multi-source data.

The third part is the security analysis stage, which analyzes the security of the extracted data features. The random forest algorithm is widely used because it is easy to construct, broadly applicable, and convenient to combine with other algorithms. However, the traditional random forest learning algorithm is time-consuming, tends to produce similar decision trees, and has low construction efficiency. Therefore, this thesis proposes a random forest construction method based on multi-manifold learning. It selects the essential attributes of the data to build the decision trees that make up the random forest, in order to improve the accuracy of the random forest and to avoid the effects of noise and over-fitting.
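
The cleaning step of the preprocessing stage can be illustrated with a small sketch. It only shows how records from two sources might be brought onto a common schema (shared event labels, UTC timestamps) with dirty and duplicate records dropped; it does not implement the manifold-learning-based reduction, and it is not the thesis's method. The two sources, their field names, the label mapping, and all function names are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical raw records from two sources with different field names and
# timestamp formats. None of these names or values come from the thesis.
FIREWALL_LOGS = [
    {"ts": 1546905600, "src": "10.0.0.5", "action": "DENY"},
    {"ts": 1546905600, "src": "10.0.0.5", "action": "DENY"},  # duplicate
    {"ts": None, "src": "10.0.0.9", "action": "ALLOW"},       # dirty: no time
]
IDS_ALERTS = [
    {"time": "2019-01-08T00:00:05+00:00", "source_ip": "10.0.0.7", "verdict": "alert"},
    {"time": "not-a-date", "source_ip": "10.0.0.8", "verdict": "alert"},  # dirty
]

# Semantic alignment: map source-specific vocabulary onto one label set.
EVENT_LABELS = {"DENY": "blocked", "ALLOW": "allowed", "alert": "suspicious"}


def normalize_firewall(rec):
    """Return (utc_time, source_ip, label), or None if the record is dirty."""
    ts, src, action = rec.get("ts"), rec.get("src"), rec.get("action")
    if not isinstance(ts, (int, float)) or not src or action not in EVENT_LABELS:
        return None
    # Temporal alignment: epoch seconds -> timezone-aware UTC datetime.
    return (datetime.fromtimestamp(ts, tz=timezone.utc), src, EVENT_LABELS[action])


def normalize_ids(rec):
    """Return (utc_time, source_ip, label), or None if the record is dirty."""
    try:
        when = datetime.fromisoformat(rec["time"]).astimezone(timezone.utc)
    except (KeyError, TypeError, ValueError):
        return None
    src, verdict = rec.get("source_ip"), rec.get("verdict")
    if not src or verdict not in EVENT_LABELS:
        return None
    return (when, src, EVENT_LABELS[verdict])


def merge_sources(firewall, ids):
    """Normalize both sources, drop dirty and duplicate rows, sort by time."""
    cleaned = set()  # a set removes exact duplicates after normalization
    for rec in firewall:
        row = normalize_firewall(rec)
        if row is not None:
            cleaned.add(row)
    for rec in ids:
        row = normalize_ids(rec)
        if row is not None:
            cleaned.add(row)
    return sorted(cleaned)


if __name__ == "__main__":
    for when, src, label in merge_sources(FIREWALL_LOGS, IDS_ALERTS):
        print(when.isoformat(), src, label)
```

In a fuller pipeline, the cleaned records would then be encoded as numeric feature vectors before any dimensionality reduction or classification is applied; that encoding is outside the scope of this sketch.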
Keywords/Search Tags: Security analysis, Multi-source heterogeneous data, Data preprocessing, Multi-manifold learning, Random forest