Research On Massive Simulation Calculation Data Processing Technology On Power Grid

Posted on: 2018-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Du
Full Text: PDF
GTID: 2382330569485425
Subject: Computer technology
Abstract/Summary:
Power grid simulation calculation simulates actual circuits using various mathematical simulation software packages, and it produces a very large number of data files. Hadoop is a good choice for processing this data. However, HDFS is designed to store and process big data, whereas a single power grid simulation file is much smaller than 1 MB, which poses a great challenge for processing the data on HDFS: the metadata of the many simulation calculation files occupies the memory of the NameNode, and retrieving the simulation calculation files is difficult and inefficient.

To solve these problems, this thesis analyzes the architecture of HDFS, summarizes existing research results, and applies a series of data processing steps to the power grid simulation calculation files. Specifically, files of the same type are merged, and a multilayer index based on the Trie tree is built for them. Experiments then evaluate the design for processing these data on the Hadoop platform.

The main work of this thesis includes the following aspects. We study and analyze the data characteristics of the power grid simulation calculation files and, inspired by the Hadoop Archives technique, merge the files according to their file types. The file type then serves as the global index. We also build a local index for the files: we analyze their names and use a Trie tree built from the names as the first-level index, and then establish a local secondary index using the first letter of the file name. The experimental results show that the merging algorithm, combined with the multilayer index, effectively reduces NameNode memory usage in the Hadoop distributed file system, and that the multilayer index structure improves the retrieval efficiency of the simulation calculation files on Hadoop.
Keywords/Search Tags:Simulation calculation file, NameNode memory, Trie tree, multilayer index
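
The merging and indexing idea summarized in the abstract can be illustrated with a minimal in-memory sketch. It is not the thesis implementation: it does not touch HDFS, and several details are assumptions made only for illustration, namely that the file type is taken from the file extension, that the first-letter index sits between the type index and the Trie, and all names used here (`merge_by_type`, `lookup`, the sample file names).

```python
import os
from collections import defaultdict


class TrieNode:
    def __init__(self):
        self.children = {}    # next character -> TrieNode
        self.location = None  # (offset, length) if a file name ends here


class Trie:
    """Character trie mapping a file name to its position in a merged file."""

    def __init__(self):
        self.root = TrieNode()

    def insert(self, name, location):
        node = self.root
        for ch in name:
            node = node.children.setdefault(ch, TrieNode())
        node.location = location

    def find(self, name):
        node = self.root
        for ch in name:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.location


def merge_by_type(files):
    """Merge small files by type (assumed here: the extension).

    files: dict of file name -> bytes.
    Returns (merged, index) where
      merged: file type -> bytearray holding all files of that type,
      index:  file type -> first letter -> Trie of (offset, length).
    """
    merged = defaultdict(bytearray)
    index = defaultdict(dict)
    for name in sorted(files):
        data = files[name]
        stem, ext = os.path.splitext(name)
        blob = merged[ext]
        location = (len(blob), len(data))  # where this file starts, and its size
        blob.extend(data)
        trie = index[ext].setdefault(stem[:1], Trie())
        trie.insert(stem, location)
    return merged, index


def lookup(merged, index, name):
    """Return the bytes of one original file, or None if it is not indexed."""
    stem, ext = os.path.splitext(name)
    trie = index.get(ext, {}).get(stem[:1])  # global index, then first letter
    if trie is None:
        return None
    location = trie.find(stem)               # Trie lookup on the file name
    if location is None:
        return None
    offset, length = location
    return bytes(merged[ext][offset:offset + length])


if __name__ == "__main__":
    # Hypothetical file names and contents, for illustration only.
    sample = {"bus_001.dat": b"abc", "bus_002.dat": b"defg", "line_001.out": b"xy"}
    merged, index = merge_by_type(sample)
    print(len(sample), "small files ->", len(merged), "merged files")
    print(lookup(merged, index, "bus_002.dat"))
```

In this sketch the three small files collapse into two merged blobs, one per type, so a file system that tracks metadata per stored file would have fewer entries to hold, while the type, first-letter, and Trie layers narrow each lookup to a single offset and length.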