
Design And Implementation Of Universal Data Testing Platform Based On LAMP Framework

Posted on: 2016-02-17
Degree: Master
Type: Thesis
Country: China
Candidate: J P Zhao
Full Text: PDF
GTID: 2308330470955842
Subject: Software engineering
Abstract/Summary:
In the face of the opportunities and challenges of the big data era, research and practice in data quality assurance technology are essential. Traditional data testing methods are labor-intensive and not comprehensive. When the testing of (semi-)structured data is automated by means of structure detection, type learning and rule checking, the test results should be visualized to help data owners discover and analyze data problems, and rule code needs to be checked while it is being composed to ensure that data testing tasks are performed correctly. In addition, the data testing experience of business personnel is underused, although it could contribute directly to repairing data problems. Constantly changing data carries risks for those who use it, so a complete periodic data monitoring solution is required to warn data owners of problems in time and to provide clues for solving them. The universal data testing platform was created to support quality assurance for (semi-)structured data, including data testing, data repair and data monitoring.

Following a software engineering approach, the author completes the requirements analysis, design, implementation and testing of the data testing, data repair and data monitoring functions described in this thesis. First, the author analyzes the system tasks, the target users and the three key business processes (data testing, data repair and data monitoring), then identifies the system use cases, divides the system into modules and sub-function points, and clarifies both the functional and non-functional requirements of the system.
Next, working from the logical layered architecture, the JSON data interfaces and the database, the author completes the outline design of the system and implements the data testing report and data preview functions based on the LAMP framework, the PHP Yii framework and front-end technologies such as jQuery and Ajax. These functions visualize data structure, column types and quality indexes, which assists in analyzing data problems and composing user rules and thus makes full use of human cognitive abilities and the experience accumulated in specific business areas. Meanwhile, the author designs and implements a simple, general-purpose mechanism for checking Python code to ensure the quality of rule code. The author also adds a data repair mechanism to the rule computation process so as to improve data quality directly. Finally, the author designs and implements a complete monitoring solution for data quality indexes, including data file, coverage rate and error rate indexes, which can warn of data quality problems promptly. A data quality index report service and a coverage trend chart service based on Baidu ECharts are also provided to give data owners a global view of data quality and to offer clues for locating data problems.

The universal data testing platform currently provides data quality assurance services for more than 20 departments of the company every day. The data monitoring module has been live for almost half a year, and there are more than 40 data monitoring templates. By the afternoon of April 14, 2015, data monitoring computations had been performed 19,172 times and had spotted more than 30 valid problems while monitoring critical online data such as group-buying data and character relationship knowledge graph data.
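The coverage rate and error rate indexes used for monitoring can be illustrated with a minimal Python sketch. The abstract does not define these indexes, so the definitions here are assumptions: coverage rate is taken as the share of rows with a non-empty value, and error rate as the share of non-empty values that fail a user rule. The names `compute_indexes`, `check_thresholds` and `is_number` are hypothetical and are not taken from the platform.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class ColumnIndexes:
    """Quality indexes for one column (definitions are assumptions)."""
    total: int            # number of rows seen
    coverage_rate: float  # share of rows with a non-empty value
    error_rate: float     # share of non-empty values failing the rule


def is_number(text: str) -> bool:
    """Example user rule: the value must parse as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def compute_indexes(values: Iterable[Optional[str]],
                    rule: Callable[[str], bool]) -> ColumnIndexes:
    """Compute the assumed coverage and error rates for one column."""
    rows = list(values)
    present = [v for v in rows if v is not None and v.strip() != ""]
    failed = sum(1 for v in present if not rule(v))
    coverage = len(present) / len(rows) if rows else 0.0
    error = failed / len(present) if present else 0.0
    return ColumnIndexes(len(rows), coverage, error)


def check_thresholds(idx: ColumnIndexes,
                     min_coverage: float,
                     max_error: float) -> List[str]:
    """Return warning messages for indexes outside the given thresholds."""
    warnings = []
    if idx.coverage_rate < min_coverage:
        warnings.append(
            f"coverage rate {idx.coverage_rate:.2%} is below {min_coverage:.2%}")
    if idx.error_rate > max_error:
        warnings.append(
            f"error rate {idx.error_rate:.2%} is above {max_error:.2%}")
    return warnings


if __name__ == "__main__":
    prices = ["12.5", "", "abc", "30", None]
    indexes = compute_indexes(prices, is_number)
    for message in check_thresholds(indexes, min_coverage=0.9, max_error=0.1):
        print("WARNING:", message)
```

Running such a check periodically against each monitored data file, and storing the resulting indexes, would also supply the data points for a coverage trend chart.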
The corresponding features are simple and easy to use. They can promptly find low-quality data problems, including failures of data-grabbing templates, and then notify data owners and prompt them to follow up and solve the problems in time, thus safeguarding the product iterations and operational decisions that are based on the data.
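The abstract mentions a mechanism for checking Python rule code while it is being composed, but does not describe how it works. The sketch below shows one possible approach using the standard `ast` module: parse the code to catch syntax errors, require an entry-point function, and reject imports and a few risky built-ins. The entry-point name `check`, the `FORBIDDEN_NAMES` set and the function `check_rule_code` are all assumptions made for illustration.

```python
import ast
from typing import List

# Built-ins a rule is not allowed to reference (an assumed policy).
FORBIDDEN_NAMES = {"eval", "exec", "open", "__import__"}


def check_rule_code(source: str, entry_point: str = "check") -> List[str]:
    """Return a list of problems found in rule code; empty means it passed."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        # Nothing else can be inspected if the code does not parse.
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]

    problems = []

    # The rule must define the entry-point function at the top level.
    functions = {node.name for node in tree.body
                 if isinstance(node, ast.FunctionDef)}
    if entry_point not in functions:
        problems.append(f"missing top-level function '{entry_point}'")

    # Walk the whole tree looking for imports and forbidden names.
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            problems.append(
                f"line {node.lineno}: import statements are not allowed")
        elif isinstance(node, ast.Name) and node.id in FORBIDDEN_NAMES:
            problems.append(
                f"line {node.lineno}: use of '{node.id}' is not allowed")
    return problems


if __name__ == "__main__":
    rule = "def check(value):\n    return value.isdigit()\n"
    print(check_rule_code(rule) or "rule code passed")
```

A static check of this kind only inspects the code's structure and never executes it, so it can run on every edit in a rule editor. It does not prove that a rule is logically correct.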
Keywords/Search Tags: Data Quality, Data Testing, Data Repair, Data Monitoring, LAMP Framework, Yii Framework