
Design And Application Of Efficient Storage System For Large-scale Dark Network Data

Posted on: 2022-04-29; Degree: Master; Type: Thesis
Country: China; Candidate: M Huang
GTID: 2518306728466254; Subject: Master of Engineering
Abstract/Summary:
Dark web data is characterized by large volume and multi-source heterogeneity, and traditional data storage can no longer process it efficiently. Storing and managing large-scale dark web data imposes several requirements, such as storage expansion, rapid data updates, and visualization. Building on research into Hadoop's distributed storage and processing technology, this thesis designs an efficient storage system for heterogeneous, multi-source dark web data to address the problems of large-scale dark web data storage and visualization [1]. The main research work of this thesis is as follows: (1) Preprocessing of the dark web data set. This mainly comprises data cleaning and data coding. Data cleaning consists mainly of deduplication. A cloud-computing-based big data deduplication method is adopted, which uses the computing capacity of cloud computing and the analysis and processing capabilities of big data to eliminate redundant and interfering information, reducing both retrieval time and storage space. GeoSOT (geographical coordinate subdivision grid with one-dimension integer coding on 2^n-tree) is used to establish the relationship between geographic space and physical storage. GeoSOT is a subdivision and coding method that aims to assign a globally unique geographic identifier to each slice of the Earth's surface, and it is the basis of data organization and storage. (2) Design of an efficient storage system. To improve the retrieval efficiency of dark web data [2], a NewSQL storage structure is used for data storage and scheduling management, and the relational database MySQL is combined with the non-relational database HBase to store dark web data. Based on the NewSQL storage structure, a data partitioning framework for the dark web is designed that maps spatial slices to storage cells. This architecture is based on the data partitioning mechanism of a spherical surface partitioning framework and builds a data partitioning association model based on uniform partitioning coding. The model relates the spatial information and query conditions of the dark web data to specific subdivisions of the Earth's surface, supporting effective storage and efficient querying of the data. (3) Visual representation of dark web data. Large-scale dark web data is visualized on web pages using ECharts (Enterprise Charts), in forms such as maps, line charts, heat maps, and bar charts, so that the data is presented more intuitively and concisely. Finally, functional and performance tests of the system demonstrate its storage and management of massive dark web data and show that the system achieves its expected goals.
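The idea of mapping a spatial slice to a storage cell through a one-dimensional integer code can be illustrated with a minimal sketch. It is not the GeoSOT algorithm or the thesis's actual partitioning framework: it uses a plain quadtree (Z-order) subdivision of the latitude/longitude rectangle as a simplified stand-in. The names `cell_code` and `row_key` and the row-key layout are hypothetical. The sketch relies only on the fact that HBase keeps rows sorted by row key, so records whose keys share a prefix are stored close together.

```python
def cell_code(lat: float, lon: float, level: int) -> int:
    """Return a one-dimensional integer code for the grid cell containing (lat, lon).

    Simplified quadtree (Z-order) coding, used here as a stand-in for GeoSOT:
    at each level the current cell is halved in latitude and longitude, and
    two bits record which quadrant the point falls in.
    """
    if level < 1:
        raise ValueError("level must be at least 1")
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("coordinates out of range")

    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    code = 0
    for _ in range(level):
        lat_mid = (lat_lo + lat_hi) / 2
        lon_mid = (lon_lo + lon_hi) / 2
        lat_bit = 1 if lat >= lat_mid else 0
        lon_bit = 1 if lon >= lon_mid else 0
        # Append the two quadrant bits to the code.
        code = (code << 2) | (lat_bit << 1) | lon_bit
        # Shrink the cell to the chosen quadrant.
        if lat_bit:
            lat_lo = lat_mid
        else:
            lat_hi = lat_mid
        if lon_bit:
            lon_lo = lon_mid
        else:
            lon_hi = lon_mid
    return code


def row_key(lat: float, lon: float, level: int, record_id: str) -> str:
    """Build a hypothetical row key: subdivision level, cell code, record id."""
    code = cell_code(lat, lon, level)
    width = (2 * level + 3) // 4  # hex digits needed to hold 2 * level bits
    return f"{level:02d}-{code:0{width}x}-{record_id}"


if __name__ == "__main__":
    # Two nearby points and one distant point, all at subdivision level 8.
    print(row_key(39.90, 116.40, 8, "rec-a"))
    print(row_key(39.95, 116.45, 8, "rec-b"))
    print(row_key(-33.87, 151.21, 8, "rec-c"))
```

Because the code of a cell at one level is a bit prefix of the codes of its child cells, a query over a spatial region can in principle be turned into a range scan over a contiguous block of keys. The real system may organize its keys differently.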
Keywords/Search Tags: Dark Web Data, Hadoop, Big Data Storage, Visual Display