Multi-Sources Big Data Machine-Learning Cloud Platform

Posted on: 2019-02-22
Degree: Master
Type: Thesis
Country: China
Candidate: Z K Fan
Full Text: PDF
GTID: 2428330548977447
Subject: Computer technology
Abstract/Summary:
As machine learning continues to evolve, a steady stream of machine learning frameworks has lowered the learning cost for computer professionals and can help accomplish many data analysis tasks. However, machine learning still requires systematic expertise, and building a machine learning system involves many difficulties: 1) data volumes are large and data structures are complicated, which makes processing difficult; 2) data sources come from a variety of industry backgrounds, and there is a large gap in professional knowledge between those industries and the computer industry; 3) machine learning models and their hyperparameters are not selected scientifically and are tuned entirely by hand.

In response to these three issues, we built a cloud platform that supports multi-source big data. The platform provides simple interfaces and applications that remove the technical barriers between computer practitioners and data experts in other industries. It lets non-computer practitioners analyze data easily, and it helps professional practitioners gain a comprehensive understanding of the data in advance and select models and tune parameters automatically. The platform provides one-stop big data processing and analysis, including: 1) using a recent distributed data processing engine, analyzing the structure of the different data sources, and abstracting the mainstream data processing methods of the machine learning field, the platform automates the slicing and processing of multi-source big data so that big data with complex structure can be handled quickly; 2) by integrating the most popular machine learning frameworks and drawing on the design ideas of each, the platform offers a multi-level, multi-language, multi-module abstraction that greatly lowers the threshold for using these frameworks without losing their algorithmic processing power, which makes the machine learning tools much simpler for computer professionals to use; 3) we independently designed and implemented an automatic supervised machine learning algorithm that selects the model automatically and uses an integrated genetic algorithm (GA) and Bayesian optimizer to search the hyperparameter space quickly, greatly reducing the space and time complexity of the whole machine learning process. The GA and the Bayesian optimizer also keep the search independent, which helps ensure a good balance between locally and globally optimal solutions.
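
The genetic-algorithm part of the automatic model and hyperparameter search described above can be illustrated roughly as follows. This is only a minimal sketch under stated assumptions, not the thesis implementation: the search space (`model`, `depth`), the scoring function `toy_score`, and all function names are hypothetical, and `toy_score` stands in for a real cross-validated model score. The Bayesian optimizer that the platform combines with the GA is not shown.

```python
import random

# Hypothetical search space; model names and ranges are illustrative only.
SEARCH_SPACE = {
    "model": ["tree", "linear"],  # categorical choice of model family
    "depth": (1, 10),             # inclusive integer hyperparameter range
}


def toy_score(candidate):
    """Stand-in for a cross-validated score; highest at model='tree', depth=6."""
    bonus = 1.0 if candidate["model"] == "tree" else 0.5
    return bonus - 0.05 * abs(candidate["depth"] - 6)


def random_candidate(rng):
    """Sample one model/hyperparameter combination uniformly."""
    lo, hi = SEARCH_SPACE["depth"]
    return {"model": rng.choice(SEARCH_SPACE["model"]), "depth": rng.randint(lo, hi)}


def crossover(a, b, rng):
    """Build a child by taking each gene from one of the two parents."""
    return {key: rng.choice([a[key], b[key]]) for key in a}


def mutate(candidate, rng, rate=0.2):
    """Randomly perturb genes, keeping the depth inside its allowed range."""
    child = dict(candidate)
    if rng.random() < rate:
        child["model"] = rng.choice(SEARCH_SPACE["model"])
    if rng.random() < rate:
        lo, hi = SEARCH_SPACE["depth"]
        child["depth"] = min(hi, max(lo, child["depth"] + rng.choice([-1, 1])))
    return child


def genetic_search(generations=20, population_size=12, seed=0):
    """Evolve a population and return the best candidate found."""
    rng = random.Random(seed)
    population = [random_candidate(rng) for _ in range(population_size)]
    for _ in range(generations):
        # Elitist selection: the better half survives unchanged.
        population.sort(key=toy_score, reverse=True)
        parents = population[: population_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
    return max(population, key=toy_score)


best = genetic_search()
print(best, toy_score(best))
```

Because the better half of each generation is carried over unchanged, the best score in this sketch never decreases from one generation to the next. In a real system, each call to the scoring function would train and validate a model, which is why reducing the number of evaluated candidates matters.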
Keywords/Search Tags: AutoML, Big data processing, Distributed clustering, Genetic algorithm, Bayesian optimizer