Research And Implementation Of Maritime Big-data Analysis And Process Platform Based On HADOOP

Posted on: 2017-12-22
Degree: Master
Type: Thesis
Country: China
Candidate: M K Feng
GTID: 2348330518495638
Subject: Software engineering
Abstract/Summary:
Maritime accidents can be reduced or avoided by building models that predict the safety of water transport. Such models can be built by applying data mining to maritime data to identify the causes of waterway accidents. Maritime data is large in scale, covers a wide range of data types, has a low density of valuable information, and requires high-speed processing. These characteristics make it big data rather than traditional data, so researching and implementing a maritime big-data platform is highly valuable. Existing data mining tools such as Weka and SPSS handle data mining tasks effectively when the computing power of a single PC is adequate for the job, but they take too long when large volumes of data must be processed. Mahout provides a number of widely used machine learning algorithms that can run as distributed computations, but whenever the data changes, the complete data set has to be reprocessed. Meanwhile, maritime data keeps growing. Current data mining platforms cannot both perform distributed analysis and absorb data updates at the same time, so they cannot support a maritime data platform whose data increases continuously. In view of these characteristics, this thesis implements a maritime big-data analysis and processing platform on Hadoop with several common data mining algorithms: naive Bayes, DBSCAN, and Apriori. It also introduces incremental data detection, which improves processing efficiency by computing only on the data that has changed. Experiments show that the implemented scheme handles classification, clustering, and correlation analysis tasks efficiently and reduces time cost without degrading accuracy.
Keywords/Search Tags:Maritime data, Data mining, Hadoop, Incremental computation
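
The idea of incremental computation mentioned in the abstract can be illustrated with naive Bayes, whose model is just a set of counts. The following is a minimal single-process sketch in Python, not the thesis's Hadoop implementation; the class name `IncrementalNaiveBayes`, the feature values, and the toy records are all hypothetical. It shows that folding a new batch into existing counts yields the same model as retraining on the complete data.

```python
import math
from collections import defaultdict


class IncrementalNaiveBayes:
    """Categorical naive Bayes that stores only counts (hypothetical sketch).

    Because the model is a set of counts, a new batch can be folded in
    without rescanning records that were already processed.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # Laplace smoothing constant
        self.class_counts = defaultdict(int)    # label -> count
        self.feature_counts = defaultdict(int)  # (label, index, value) -> count
        self.feature_values = defaultdict(set)  # index -> values seen so far

    def update(self, records, labels):
        """Fold a batch of (record, label) pairs into the existing counts."""
        for x, y in zip(records, labels):
            self.class_counts[y] += 1
            for i, v in enumerate(x):
                self.feature_counts[(y, i, v)] += 1
                self.feature_values[i].add(v)

    def predict(self, x):
        """Return the label with the highest smoothed log-posterior."""
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for y, n_y in self.class_counts.items():
            score = math.log(n_y / total)
            for i, v in enumerate(x):
                # Number of distinct values for feature i, counting v itself.
                k = len(self.feature_values.get(i, set()) | {v})
                count = self.feature_counts.get((y, i, v), 0)
                score += math.log((count + self.alpha) / (n_y + self.alpha * k))
            if score > best_score:
                best_label, best_score = y, score
        return best_label


# Toy data (hypothetical): (weather, time of day) -> outcome.
batch1 = [("fog", "night"), ("clear", "day"), ("fog", "day")]
labels1 = ["accident", "safe", "accident"]
batch2 = [("clear", "night"), ("storm", "night")]
labels2 = ["safe", "accident"]

# Incremental model: process the old batch once, then only the new batch.
inc = IncrementalNaiveBayes()
inc.update(batch1, labels1)
inc.update(batch2, labels2)

# Reference model: retrain from scratch on the complete data.
full = IncrementalNaiveBayes()
full.update(batch1 + batch2, labels1 + labels2)
```

Since both models hold identical counts, they give identical predictions; the incremental one simply avoids touching `batch1` a second time. DBSCAN and Apriori need more elaborate update rules, which this sketch does not attempt.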