| As waterborne traffic becomes increasingly digital and intelligent, ship data has grown explosively, placing a heavy burden on traditional data processing and mining platforms. At the same time, massive ship data provides a foundation for intelligent applications in the ship field, and how to process and apply these data correctly has become a research hotspot in the field. On March 17, 2016, the Outline of the 13th Five-Year Plan for National Economic and Social Development of the People’s Republic of China was issued. Chapter 27 of the Outline, Implementing the National Big Data Strategy, proposes treating big data as a basic strategic resource, comprehensively implementing actions to promote big data development, accelerating the sharing, opening up, and application of data resources, and promoting industrial transformation and upgrading as well as innovation in social governance. Big data technology can therefore be expected to become a research hotspot in ship data processing and mining. At present, ship data come in many types and from many sources, and there is no unified data processing and mining platform through which they can interact; data silos, with each system operating on its own, remain widespread. Because the data sources are so varied, the single processing mode of traditional data processing and mining platforms cannot meet the actual needs of the shipping industry. To improve data processing and mining capability in the ship field, this paper proposes a general data processing and mining platform based on Spark and applies it to more specific processing and mining of ship AIS data. The main work is as follows: (1) First, a general data processing and mining platform based on Spark is designed. The platform is divided into three modules: a database module, a Spark data mining module, and a visualization module. Second, the platform's running environment, based on the Ubuntu system, and the software
configuration, including the JDK, Spark, and HBase, are built. (2) Using specific AIS data, module tests and overall tests of the platform are carried out, including database module tests with a small data volume and overall platform tests with a large data volume. Guided by the actual display at the front end, the platform parameters are adjusted repeatedly to achieve the best display of the mining results. (3) After the database receives AIS data, it first stores the data uniformly and performs basic pre-processing, separating usable data from unusable data so that only usable data enter later processing and mining; it then processes and mines the required data according to the corresponding requirements and links to the visualization module to display the results. (4) Ship data from 12 major coal ports along the coast of China and traffic data from the Sunda Strait, one of the world's busiest straits by traffic flow, are selected for data processing and mining in order to verify the practicability and efficiency of the platform. The results show that the platform works well and achieves the expected goal. |
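
The pre-processing in step (3), separating usable AIS records from unusable ones, might look roughly like the sketch below. The abstract does not state the platform's actual validity rules, so the record fields (`mmsi`, `lat`, `lon`, `sog`), the range checks, and the names `AisRecord`, `is_usable`, and `split_records` are all assumptions made for illustration. The sketch uses plain Python rather than Spark so that it runs without a cluster.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class AisRecord:
    """Hypothetical minimal AIS position record (field names are assumptions)."""
    mmsi: int    # ship identifier, assumed to be a 9-digit number
    lat: float   # latitude in degrees
    lon: float   # longitude in degrees
    sog: float   # speed over ground in knots


def is_usable(rec: AisRecord) -> bool:
    """Return True if the record passes the assumed range checks."""
    return (
        100_000_000 <= rec.mmsi <= 999_999_999  # assumed: exactly 9 digits
        and -90.0 <= rec.lat <= 90.0            # assumed: valid latitude
        and -180.0 <= rec.lon <= 180.0          # assumed: valid longitude
        and 0.0 <= rec.sog < 102.3              # assumed: 102.3 kn marks "not available"
    )


def split_records(
    records: Iterable[AisRecord],
) -> Tuple[List[AisRecord], List[AisRecord]]:
    """Partition records into (usable, unusable) lists, preserving input order."""
    usable: List[AisRecord] = []
    unusable: List[AisRecord] = []
    for rec in records:
        (usable if is_usable(rec) else unusable).append(rec)
    return usable, unusable


if __name__ == "__main__":
    # Made-up sample records; the first lies roughly in the Sunda Strait area.
    sample = [
        AisRecord(413000001, -5.9, 105.8, 12.4),
        AisRecord(413000002, 91.0, 181.0, 102.3),  # out-of-range position and speed
        AisRecord(123, 10.0, 20.0, 5.0),           # MMSI too short
    ]
    good, bad = split_records(sample)
    print(len(good), "usable,", len(bad), "unusable")
```

On a Spark platform the same predicate could be applied as a filter over a distributed dataset, with only the usable partition passed on to the mining stage; that wiring is not described in the abstract and is left out here.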