
Research On Distributed Storage Management Method For Manufacturing Big Data

Posted on: 2018-12-22
Degree: Master
Type: Thesis
Country: China
Candidate: M Wang
Full Text: PDF
GTID: 2348330515989689
Subject: Computer software and theory
Abstract/Summary:
Germany's Industry 4.0 initiative marks the arrival of the big data era for the manufacturing industry. Throughout a product's life cycle, a large amount of structured, semi-structured and unstructured data is produced. This data is multimodal, high-throughput and strongly correlated. As a key part of the new generation of information technology, manufacturing big data is gradually becoming the core of the industrial revolution and an important factor in realizing intelligent production. The storage and management of manufacturing big data has therefore become a research hot spot. Distributed storage is the most common solution in this field. Existing methods for storing manufacturing big data, such as distributed storage and industrial big data management platforms, are inadequate for the following reasons: (1) data management is decentralized, so information sharing requires frequent communication between personnel; (2) they cannot adequately manage complex relationships between data; (3) existing management platforms are general-purpose systems and do not support the specific characteristics of manufacturing big data.

To address these shortcomings, this paper designs and implements a distributed storage system specifically for manufacturing big data. It uses the Object Deputy Database to manage the metadata and the association relations between data, and it uses HDFS to store the actual data files. The work of this paper mainly includes the following aspects: (1) Using the deputy relationship between the source class and the deputy class, we propose a modeling method for unstructured manufacturing data based on the object deputy model, which models the metadata, the composition relation, the constraint relation and the life cycle relation respectively. (2) Because manufacturing big data contains a large number of small files, and HDFS wastes storage space when storing small files, we optimize small file storage by aggregating small files according to their semantic relations and the space utilization after they are combined. (3) Because manufacturing big data is time-sensitive, we improve the replica management mechanism of HDFS. Based on a file's historical access frequency and the usage of storage space, we calculate the number of replicas the file requires and adjust the replica count dynamically. When a replica needs to be added, we choose the optimal storage node based on the working state of the nodes, the network overhead of copying the replica and the reading efficiency of the relevant users.

Finally, we deploy the proposed distributed storage system in a practical working environment to validate the function and performance of the proposed scheme. The experimental results show that the system's functions are correct and complete. They also demonstrate the effectiveness of the proposed method, which significantly improves the reading efficiency of the system.
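The abstract does not give the merging rule or the replica formula, so the following Python sketch is only an illustration of the two ideas in points (2) and (3), not the thesis's actual method. It assumes that a "semantic relation" can be reduced to a hypothetical `group` label per file, that files are merged with a first-fit-decreasing heuristic so each merged container fills one HDFS block as fully as possible, and that the replica target scales linearly with relative access frequency and is capped when disks are nearly full. The names `SmallFile`, `pack_small_files`, `utilization` and `replica_target`, and the 0.8 disk threshold, are all assumptions; only the 128 MB block size and the replication factor of 3 are HDFS defaults.

```python
from collections import defaultdict
from dataclasses import dataclass

BLOCK_SIZE = 128 * 1024 * 1024  # default HDFS block size in Hadoop 2.x and later
DEFAULT_REPLICAS = 3            # default HDFS replication factor


@dataclass(frozen=True)
class SmallFile:
    name: str
    size: int   # bytes
    group: str  # hypothetical semantic label, e.g. the part a file belongs to


def pack_small_files(files, block_size=BLOCK_SIZE):
    """Merge files that share a semantic group into containers of at most one
    block each, using first-fit decreasing to keep block utilization high."""
    by_group = defaultdict(list)
    for f in files:
        by_group[f.group].append(f)

    containers = []  # each container is a list of files stored as one HDFS file
    for group in sorted(by_group):
        bins = []  # each bin is [remaining_bytes, members]
        for f in sorted(by_group[group], key=lambda x: x.size, reverse=True):
            for b in bins:
                if f.size <= b[0]:
                    b[0] -= f.size
                    b[1].append(f)
                    break
            else:
                # No existing bin has room (or the file exceeds one block),
                # so the file starts a new container.
                bins.append([block_size - f.size, [f]])
        containers.extend(members for _, members in bins)
    return containers


def utilization(container, block_size=BLOCK_SIZE):
    """Fraction of one block that a merged container occupies."""
    return sum(f.size for f in container) / block_size


def replica_target(access_count, mean_access_count, disk_used_fraction,
                   base=DEFAULT_REPLICAS, min_replicas=1, max_replicas=6,
                   disk_threshold=0.8):
    """Assumed rule: scale the base replica count by how hot the file is
    relative to the average, then cap it at the base when disks are full."""
    if mean_access_count <= 0:
        return base
    target = round(base * access_count / mean_access_count)
    target = max(min_replicas, min(max_replicas, target))
    if disk_used_fraction > disk_threshold:
        target = min(target, base)
    return target
```

A caller would pass the file inventory to `pack_small_files`, write each returned container as a single HDFS file, and periodically recompute `replica_target` per file from access logs. Choosing the node for a new replica, which the abstract bases on node state, network overhead and reader locality, is left out because the abstract gives no weighting for those factors.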
Keywords/Search Tags: Manufacturing Big Data, Distributed Storage, Object Deputy Model, Association, Storage Optimization