
Design And Implementation Of Deep Learning Container Cloud Platform Based On Docker And Kubernetes

Posted on: 2020-07-23
Degree: Master
Type: Thesis
Country: China
Candidate: S H Luo
GTID: 2428330575995066
Subject: Software engineering
Abstract/Summary:
With the rapid development of artificial intelligence, emerging technologies have become part of everyday life and national development, and people have directly felt their impact. Deep learning, as a major driver of artificial intelligence, plays an especially important role in this development. As far as the current state of China's computer science industry is concerned, its achievements in artificial intelligence still lag far behind those of the developed countries of Europe and America, and, lacking core technologies, the country is still in a catch-up phase. Several factors hold back progress, including the high cost of maintaining computing resources, a relatively high barrier to entry, and a shortage of talent. To lower the technical and cost barriers to artificial intelligence as far as possible, and to improve the efficiency of domestic SMEs and individual developers, this thesis uses container technology, container orchestration technology, and machine learning and deep learning frameworks such as TensorFlow, Caffe, and PyTorch to build a stable, easy-to-use, and easily extensible deep learning platform. The platform provides three functions: model development, model training, and model serving. It can provision a task development environment within seconds, launch batch model-training tasks with one click, and publish model services with one click. The platform also provides GPU time-sharing, task management, distributed storage, and task monitoring and alerting. In addition, it offers a variety of computing resources and can limit the amount of resources each task uses, making it a complete deep learning platform. I participated in the entire development process of the platform. My specific contributions are as follows: (1) assisted the product manager in analyzing user requirements and narrowing them down to the most critical ones; (2) worked with the architect on the overall design of the platform and investigated several key components in advance; (3) built and maintained the image registry and created the images; (4) designed and implemented the scheduling system Minion-proxy and the state machine; (5) integrated the storage system with the platform; (6) implemented the three model tasks. The system has been officially released and has more than 500 users. The platform is simple and easy to use: it frees users from complex environment configuration, improves resource utilization, and greatly reduces development costs. It also comes with a complete technical manual and examples for each stage, which makes it considerably easier to use.
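
The abstract mentions a task state machine but does not describe its states or transitions. The following is only a minimal sketch of what such a lifecycle could look like; the state names, the `Task` class, and the transition table are all hypothetical and are not taken from the thesis.

```python
from enum import Enum


class TaskState(Enum):
    # Hypothetical lifecycle states; the thesis abstract does not list them.
    PENDING = "pending"
    SCHEDULED = "scheduled"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


# Allowed transitions (an assumption made for illustration only).
TRANSITIONS = {
    TaskState.PENDING: {TaskState.SCHEDULED, TaskState.FAILED},
    TaskState.SCHEDULED: {TaskState.RUNNING, TaskState.FAILED},
    TaskState.RUNNING: {TaskState.SUCCEEDED, TaskState.FAILED},
    TaskState.SUCCEEDED: set(),  # terminal state
    TaskState.FAILED: set(),     # terminal state
}


class Task:
    """A development, training, or serving task tracked by the platform."""

    def __init__(self, name):
        self.name = name
        self.state = TaskState.PENDING
        self.history = [TaskState.PENDING]

    def advance(self, new_state):
        # Reject any transition that the table does not allow.
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {new_state.name}"
            )
        self.state = new_state
        self.history.append(new_state)
```

Keeping the allowed transitions in a single table means a scheduler component can reject out-of-order status updates in one place, for example a "running" report arriving for a task that has already failed.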
Keywords/Search Tags:Artificial Intelligence, Deep Learning, Container Technology, Distributed System