Research on Key Technologies of the Big Data Management Platform of Nantong Archives

Posted on: 2021-04-11
Degree: Master
Type: Thesis
Country: China
Candidate: Z C Yang
Full Text: PDF
GTID: 2438330626454558
Subject: Computer technology
Abstract/Summary:
Archives are among an enterprise's important information resources. As information becomes increasingly networked, the volume of data held in the Nantong archives far exceeds earlier levels, and the holdings include archives from all provinces, so the management and application of big data in the Nantong archives is particularly important. One key application is the real-time analysis of big data. The core of big data management for the Nantong archives must therefore achieve storage, automatic backup, and easy processing. Realizing these goals on a traditional relational database, however, creates bottlenecks in capacity, storage efficiency, processing performance, and query optimization.

The problems with the traditional approach can be summarized in three points. First, storage performance: as more data accumulates in a traditional relational database, scalability and performance decline, and semi-structured and unstructured data are stored poorly. Second, query efficiency: queries on tables with large data volumes are often inefficient and suffer high latency. Third, high concurrency: slow queries caused by many table joins in the relational database lead to high CPU load and an unresponsive server.

To address these deficiencies, this thesis proposes an optimized analysis platform for big data management of the Nantong archives. The main work includes:

(1) Big data management of the Nantong archives. A system architecture for the big data management platform is proposed. It covers the data management activities from source data collection to result output; its management modules coordinate with one another under a clear division of labor and together form a complete big data platform management system.

(2) Storage of source data for the Nantong archives. The non-relational database MongoDB is proposed. Compared with the original relational database MySQL, it writes faster and is markedly more effective at handling high concurrency and large data volumes.

(3) Query performance optimization of the big data management platform. Data optimization is the core of interactive query analysis on big data. This thesis optimizes the big data of the Nantong archives in turn by modifying the query plan and by applying a Spark Streaming-based algorithm, improving query speed and efficiency and meeting the needs of real-time processing scenarios.
Keywords/Search Tags: Nantong archives, big data, Spark Streaming
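
The abstract names Spark Streaming as the basis for real-time processing but gives no implementation details. The sketch below only illustrates the general micro-batch idea in plain Python: incoming records are grouped into small batches, and a running aggregate is updated after each one. It is not the thesis's algorithm and does not use the Spark API. The `province` field, the sample records, and the function names are hypothetical, and batches here are cut by record count, whereas Spark Streaming cuts them by time interval.

```python
from collections import Counter
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def micro_batches(records: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Group an incoming record stream into batches of at most batch_size."""
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def running_counts(
    records: Iterable[dict], batch_size: int, key: str = "province"
) -> Iterator[Dict[str, int]]:
    """Yield a snapshot of the running per-key totals after each batch.

    Records without the key are skipped, since semi-structured
    documents need not share a fixed schema.
    """
    totals: Counter = Counter()
    for batch in micro_batches(records, batch_size):
        totals.update(r[key] for r in batch if key in r)
        yield dict(totals)


# Hypothetical semi-structured archive records (illustrative only).
SAMPLE_RECORDS = [
    {"province": "Jiangsu", "title": "record A"},
    {"province": "Zhejiang"},
    {"title": "record without a province field"},
    {"province": "Jiangsu"},
    {"province": "Jiangsu", "pages": 12},
]
```

Because each snapshot is available as soon as its batch has been processed, a query can read an up-to-date aggregate without rescanning the full data set, which is the property that makes micro-batching attractive for near-real-time analysis.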