Design And Implementation Of Image Automatic Cleaning And Labeling Platform

Posted on: 2021-01-04
Degree: Master
Type: Thesis
Country: China
Candidate: J D Li
Full Text: PDF
GTID: 2428330614971312
Subject: Software engineering
Abstract/Summary:
In recent years, with the development of science and technology, people have entered the era of Big Data and Artificial Intelligence. Computer Vision is one of the important fields of Artificial Intelligence and has a great impact on people's lives. Data, algorithms and computational power are the three elements driving the development of Computer Vision, and data plays an important role among them. Raw datasets generated in daily life are chaotic and unlabeled, so they need to be cleaned and labeled before an algorithm model can be trained. At present, the company's image datasets are mainly cleaned and labeled by hand, which is both inefficient and costly. The image automatic cleaning and labeling platform described in this paper cleans and labels image datasets automatically using machine learning algorithms. This is a more efficient, intelligent and low-cost way to process datasets, and it helps the company reduce cost and improve efficiency when processing image datasets.

During research and development, the author first participated in the feasibility analysis and requirements analysis of the project and established its overall goal. Based on the requirements analysis, the platform was divided into a user interaction subsystem and a dataset processing subsystem. The user interaction subsystem includes a user management module, a dataset management module and an operation management module. The dataset processing subsystem includes the image cleaning and labeling module. In the outline design stage, the author designed the platform architecture, the core system process, the database tables, and the image cleaning and labeling process. In the detailed design stage, the author used the Spring Boot framework, the Kafka message queue, the OpenCV library, the MySQL database, Object Storage Service and other technologies to design and develop the platform.

For dataset processing, in order to achieve efficient and intelligent image cleaning, the author uses the Laplacian algorithm to automatically remove blurred images, the Difference Hash algorithm to automatically remove similar images, and a Haar classifier to automatically remove images without faces. Automatic image labeling is implemented by calling the face recognition service of the Baidu AI Open Platform. For data storage, the platform stores heterogeneous data in different ways: the MySQL database holds ordinary business data, and Object Storage Service holds file data. For platform deployment, in order to achieve high service availability, the author uses Tomcat, Nginx and Keepalived to deploy the user interaction subsystem, and a Docker cluster to deploy the dataset processing subsystem. Finally, the author thoroughly tested the platform and deployed it online. The automatic image cleaning and labeling service provided by the platform greatly improves the efficiency of image cleaning and labeling within the company and helps the company save time and labor costs.
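The two cleaning checks named in the abstract, Laplacian-based blur detection and Difference Hash similarity, can be illustrated with a minimal sketch. This is not the thesis implementation, which is built on OpenCV: the version below is dependency-free Python operating on grayscale images given as nested lists, and all function names, the nearest-neighbour resize, and both thresholds (`100.0` for blur, `5` bits for similarity) are assumptions chosen for illustration.

```python
from typing import List, Sequence

Gray = Sequence[Sequence[float]]  # rows of grayscale intensities (assumed input format)


def laplacian_variance(img: Gray) -> float:
    """Variance of the 4-neighbour Laplacian response over interior pixels.

    Sharp images have strong edges, so the response varies a lot;
    blurred images give a low variance.
    """
    h, w = len(img), len(img[0])
    if h < 3 or w < 3:
        raise ValueError("image must be at least 3x3")
    responses: List[float] = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            responses.append(
                img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1]
                - 4 * img[y][x]
            )
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def is_blurred(img: Gray, threshold: float = 100.0) -> bool:
    # The threshold is a hypothetical default; it would need tuning per dataset.
    return laplacian_variance(img) < threshold


def resize_nearest(img: Gray, width: int, height: int) -> List[List[float]]:
    """Nearest-neighbour downsample, a stand-in for a proper image resize."""
    h, w = len(img), len(img[0])
    return [
        [img[r * h // height][c * w // width] for c in range(width)]
        for r in range(height)
    ]


def dhash(img: Gray, hash_size: int = 8) -> int:
    """Difference hash: one bit per horizontally adjacent pixel pair."""
    small = resize_nearest(img, hash_size + 1, hash_size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def is_similar(a: Gray, b: Gray, max_distance: int = 5) -> bool:
    # max_distance is a hypothetical cut-off for "near-duplicate".
    return hamming_distance(dhash(a), dhash(b)) <= max_distance
```

In a cleaning pass of this kind, an image would be discarded if `is_blurred` returns true, or if `is_similar` returns true against an image that has already been kept.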
Keywords/Search Tags:Image Automatic Labeling, Image Automatic Cleaning, Docker Cluster, Service High Availability, Message Queue