With the development of big data and artificial intelligence technology, the scale of data center services run by Internet companies and operators has grown sharply. This growth brings new problems for storing and analyzing network traffic in high-speed network environments. From the perspective of existing storage methods, as cloud networks and hardware equipment develop, relational databases cannot keep up with the receiving rate when storing massive data, which causes data loss during execution. From the perspective of storage configuration, large databases contain a large number of parameters that must be configured before running. Manual or default configurations cannot adapt to environmental changes, so the database cannot be kept at its optimal storage performance. This thesis studies the optimization of high-speed network storage in the DPDK environment and covers the following three aspects.

First, to address the inconsistent rates in DPDK-based in-memory database storage, a pipeline-based multi-batch parallel storage optimization method is proposed. High-speed network data is cached with the support of DPDK. The method performs preprocessing with RSS (receive-side scaling) and data distribution techniques, turning the existing single-queue, single-CPU execution into multi-queue, multi-CPU operation. Because the batch sizes for receiving data and for storing data differ, DPDK burst reception is combined with the Redis pipeline to reduce the difference in processing volume during storage. The experimental results show that the pipeline-based multi-batch parallel processing method can effectively improve the storage of large-scale data and guarantee storage I/O performance.

Second, to address the performance bottleneck caused by the data storage configuration, a parameter configuration optimization method based on a progressive perception model is proposed. The method builds regression models hierarchically from the working characteristics and gradually generates an overall model of the storage system using a gradient boosting tree algorithm. On the basis of this prediction model, a genetic algorithm iteratively searches for the optimal storage configuration scheme. Experiments show that the system model generated by the proposed method can effectively perceive system performance, and that the generated configuration scheme can effectively improve the execution time and throughput of storage.

Finally, a prototype system is designed for the traffic storage solutions proposed above. The multi-batch storage optimization algorithm and the parameter configuration optimization method are combined, the module is built as a secondary development on top of the open-source Zabbix, and a visual storage performance analysis and monitoring system is written with a web framework. Experimental results show that the pipeline-based multi-batch parallel storage method guarantees real-time data storage, and that the Redis-oriented storage parameter configuration method ensures the accuracy of the storage configuration and improves the efficiency of data storage.
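
As a rough illustration of the first contribution, the sketch below shows one way variable-size receive bursts could be regrouped into fixed-size batches before being written through a storage pipeline. It is not the thesis implementation: the function name `rebatch`, the parameter `pipeline_size`, and the representation of packets as `bytes` objects are all assumptions made for this example, and the DPDK and Redis calls themselves are left out.

```python
from typing import Iterable, Iterator, List


def rebatch(bursts: Iterable[List[bytes]], pipeline_size: int) -> Iterator[List[bytes]]:
    """Regroup variable-size receive bursts into fixed-size storage batches.

    `bursts` stands in for the packet lists returned by successive burst
    receives; `pipeline_size` is a hypothetical number of records sent per
    pipeline round trip. Both names are assumptions of this sketch.
    """
    if pipeline_size <= 0:
        raise ValueError("pipeline_size must be positive")

    pending: List[bytes] = []
    for burst in bursts:
        pending.extend(burst)
        # Emit full batches as soon as enough records have accumulated.
        while len(pending) >= pipeline_size:
            yield pending[:pipeline_size]
            pending = pending[pipeline_size:]

    # Flush the final partial batch so no received record is dropped.
    if pending:
        yield pending
```

Under these assumptions, each yielded batch would be handed to a single pipeline execution, so the number of storage round trips depends on `pipeline_size` rather than on how many packets each individual burst happened to return.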