Design And Implementation Of Automated Tool And Log Monitoring For Picture Cleaning

Posted on: 2021-04-05
Degree: Master
Type: Thesis
Country: China
Candidate: D Z Zhong
Full Text: PDF
GTID: 2428330614471563
Subject: Software engineering
Abstract/Summary:
Image recognition algorithms are applied more and more widely, and training an algorithm model requires a large amount of original image data as the training set and test set. The quality of the original data is uneven, and dirty data directly reduces the accuracy of the algorithm model, so data cleaning is necessary to obtain clean and accurate training data.

This paper first analyzes the user's functional requirements, chiefly that the system delivers clean images, and then the non-functional requirements on the system in terms of practicality, reliability, maintainability, scalability, etc. A summary design is then proposed for the overall system framework. The work mainly comprises two parts, a data cleaning pipeline and a monitoring alarm system. The data cleaning pipeline includes two modules, data cleaning and data use; the monitoring alarm system includes three modules, log monitoring, alarms, and result display. On the basis of this outline design, the paper gives a detailed design of the five modules. Unlike a traditional data cleaning script, this paper builds a complete system for data cleaning and monitoring alarms. The algorithms of the data cleaning process are mainly written in Python, and the data transmission link mainly uses shell scripts. The pipeline is built from Jenkins tasks that run in parallel, and the EFK software is used to construct the monitoring alarm system.

To clean and organize the original image data, this paper builds an automated pipeline based on Jenkins that covers duplicate removal, blur removal, no-content removal, tilt correction by rotation, and layout analysis. The pipeline moves dirty images out of the original image set, straightens tilted images, and recognizes the question types in each image, such as horizontal, vertical, and de-form. According to the needs of the user (the algorithm model builder), the images that contain the most of the required question types, called valuable images, are provided to the user as a training set. A portion of the dirty images is provided to the user as a test set.

To track the status of image data cleaning, this paper builds a log monitoring alarm system based on EFK. By collecting the cleaning process logs and monitoring them regularly, the system alerts the user to abnormal cleaning situations and displays the cleaning effect graphically. EFK continuously updates the log data during the cleaning process and generates a dynamic dashboard that shows cleaning results in real time.

After the system was built, the relevant attributes of the functional and non-functional requirements were tested to assess whether the quality meets the user's expectations. I was responsible for the construction of the pipeline, including writing the blur-removal, duplicate-removal, and no-content-removal algorithms and deploying the monitoring alarm system, as well as the later testing work.
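
As an illustration of the duplicate-removal stage and of the kind of log record a monitoring system could collect, the following is a minimal Python sketch. The abstract does not say how duplicates are detected or what the cleaning logs contain, so both parts are assumptions: duplicates are taken to be byte-identical files matched by SHA-256 digest, and the JSON log fields (stage, total, removed, kept) are hypothetical. The function names load_images, split_duplicates, and stage_summary are invented for this example and are not taken from the thesis.

```python
import hashlib
import json
from pathlib import Path


def load_images(directory):
    """Yield (file name, file bytes) for every file in a directory, in name order."""
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            yield path.name, path.read_bytes()


def split_duplicates(items):
    """Partition (name, data) pairs into kept names and duplicate names.

    Assumption: two images are duplicates only if their bytes are identical.
    The first name seen for a given digest is kept; later ones are duplicates.
    """
    seen = set()
    kept, duplicates = [], []
    for name, data in items:
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen:
            duplicates.append(name)
        else:
            seen.add(digest)
            kept.append(name)
    return kept, duplicates


def stage_summary(stage, total, removed):
    """Build one JSON log line for a cleaning stage (hypothetical field names)."""
    record = {
        "stage": stage,
        "total": total,
        "removed": removed,
        "kept": total - removed,
    }
    return json.dumps(record, sort_keys=True)
```

A stage script could call split_duplicates(load_images(some_directory)), move the duplicate files aside, and write the stage_summary line to a log file for a log shipper to pick up. Near-duplicate images (resized or re-encoded copies) would need a perceptual comparison instead of an exact digest, which this sketch does not attempt.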
Keywords/Search Tags:Data clean, Jenkins, Pipeline, ELK, Monitor alarm