
Design And Implementation Of Enterprise Unstructured Data Management System

Posted on: 2023-10-14    Degree: Master    Type: Thesis
Country: China    Candidate: H L Li    Full Text: PDF
GTID: 2558306914461014    Subject: Software engineering
Abstract/Summary:
With the spread of multimedia technology, the volume of unstructured data such as web pages, documents, and images has exploded. There is no general solution for storing and managing the different types of unstructured data. Choosing a storage strategy according to the characteristics of each data type is one approach, but it tends to create data silos. Eliminating data silos and managing unstructured data in a unified way is an important means of tapping the latent value of data and breaking down data barriers between business units. In response to these problems, this thesis first introduces the background of the subject, clarifies the two types of data silo faced by enterprises, analyzes which problems can be solved at the software level, and selects two typical kinds of unstructured data as examples for the design and implementation of an unstructured data management system. The two kinds of data are log data, which has time-series characteristics and large volume, and contract document data, which has high information density and small volume. Then, based on the actual business needs of the enterprise and its users, a requirements analysis is carried out, and the system is divided into four modules following a modular approach: data import, data retrieval, full lifecycle management, and background management. In implementing the data import module, this thesis proposes a log collection method that does not intrude on the internal code of the enterprise's Java systems. In implementing the data retrieval module, it designs a query language that is convenient for users. In implementing the lifecycle management module, it designs different strategies for the two kinds of data according to their characteristics, and extends the ElastAlert framework through secondary development to provide data monitoring and alerting. In implementing the background management module, an RBAC (role-based access control) strategy is used to make permission control in the system more convenient. The system is developed with a separated front-end and back-end architecture: Vue.js is used to build the user-facing pages, and the back end is built on the Tornado framework. The system has passed the relevant tests and is in practical use at an enterprise. By eliminating data silos to a certain extent and resolving the difficulties of ingesting, describing, and retrieving the two kinds of unstructured data, the system can lay a data foundation for subsequent research that depends heavily on data.
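The abstract does not describe how the RBAC strategy is implemented. As a rough illustration of the general idea only, the sketch below attaches permissions to roles and roles to users, so that an access check never refers to an individual user's rights directly. All names here (`Role`, `User`, `is_allowed`, and the permission strings) are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Role:
    """A named bundle of permissions (hypothetical model, not the thesis's)."""
    name: str
    permissions: Set[str] = field(default_factory=set)


@dataclass
class User:
    """A user holds roles; permissions are only reachable through roles."""
    name: str
    roles: List[Role] = field(default_factory=list)


def is_allowed(user: User, permission: str) -> bool:
    """Return True if any of the user's roles grants the permission."""
    return any(permission in role.permissions for role in user.roles)


# Illustrative roles and permission strings (assumptions for this sketch).
auditor = Role("auditor", {"log:search"})
admin = Role("admin", {"log:search", "contract:import", "user:manage"})

alice = User("alice", [auditor])
bob = User("bob", [admin])
```

With this model, changing what a group of users may do means editing one role rather than every user record, which is the convenience the abstract attributes to RBAC.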
Keywords/Search Tags: Unstructured Data, Data Management, Full Lifecycle Management, Data Silo