| Machine learning, a classic interdisciplinary subject that draws on many mathematical and computing disciplines, has been widely applied to complex engineering and field problems by the engineering and scientific communities. After more than half a century of uneven development, the adaptive, self-learning mechanisms for information analysis and processing found in machine learning algorithms, represented by deep learning, ensemble learning, Bayesian learning, support vector machines, and others, have produced breakthroughs in medicine, agriculture, and military science, and especially in computer vision and natural language processing. Marine science is a new direction for applied machine learning research, and physical oceanography is one branch of it. With heavy investment in marine research and the continuous improvement of physical observation tools and methods, more and more physical information about the ocean is becoming available. As a result, data-driven research methods are widely used in physical oceanography. How to use observed marine physical information to predict values at unobserved points is one of the problems facing researchers in marine science and computer information processing. This paper introduces hierarchical clustering, random forests, and the variational autoencoder to study the spatial distribution of ocean temperature in depth, and aims to improve the spatial resolution of existing ocean temperature data through machine learning, so as to provide data support for further study of the ocean thermocline. The main research contents of this paper are as follows: (1) To address the low resolution of existing ocean water temperature models, a high-resolution ocean water temperature model based on hierarchical clustering and random forest is proposed to refine the spatial distribution of existing ocean water temperature data. In this method, the data sets are first normalized to 
eliminate the influence of dimensional differences on the experiment. Second, a bottom-up (agglomerative) strategy is used for hierarchical clustering, dividing the data into five clusters. Finally, for each cluster, grid search is used to find the best parameters of a random forest model, and a random forest model is built for that cluster. Experimental results on the typical ocean temperature data set BOA_Argo show that the prediction accuracy of the proposed model is better than that of the traditional random forest model; in some local sea areas delineated by the clustering, model accuracy improves by about a factor of 10. By calculating and analyzing ocean temperature gradients for both the original data and the fine-grained data, some thin thermocline layers that could not be found in the original data distribution were identified by inversion, and the distribution pattern of the thermocline in this region was further determined. (2) To address a second problem, the sparsity and imbalance of ocean water temperature data, a heuristic high-resolution ocean water temperature model based on the variational autoencoder is proposed. The model combines supervised and unsupervised learning. First, the method takes the vertical observations of ocean water temperature in the data set as the input vector and uses a variational autoencoder to obtain the probability distribution of ocean water temperature over the vertical observation layers. Then, a heuristic network is constructed from the variational autoencoder network to address sample imbalance in the data set. Finally, the heuristic network and a deep learning regression network are combined to address sample sparsity in the data set. The experimental results show that the prediction accuracy of the proposed model improves by about 0.084, or about 47.8%, over that of the deep regression learning model used for simple prediction of ocean water 
temperature, and the proposed model alleviates the imbalance of the data sets to some extent. In theory, this model can also improve the spatial resolution of ocean water temperature to an arbitrary degree. |
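
The first method described above (normalization, bottom-up hierarchical clustering into five clusters, then a grid-searched random forest per cluster) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation. The synthetic data stands in for BOA_Argo, and the feature columns (longitude, latitude, depth), the min-max scaler, the Ward linkage, the parameter grid, and the nearest-neighbor routing of new points to clusters are all assumptions that the abstract does not specify.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for gridded temperature data (NOT BOA_Argo).
# Assumed feature columns: longitude, latitude, depth in meters.
n = 300
X = np.column_stack([
    rng.uniform(120, 180, n),
    rng.uniform(-30, 30, n),
    rng.uniform(0, 1000, n),
])
# Toy temperature field: warm at the surface, cooler with depth and latitude.
y = 28.0 * np.exp(-X[:, 2] / 300.0) - 0.05 * np.abs(X[:, 1]) + rng.normal(0, 0.1, n)

# Step 1: normalize so that no feature dominates because of its units.
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)

# Step 2: bottom-up (agglomerative) hierarchical clustering into five clusters.
# The linkage criterion is an assumption.
clusterer = AgglomerativeClustering(n_clusters=5, linkage="ward")
labels = clusterer.fit_predict(X_scaled)

# Step 3: grid-search one random forest per cluster (hypothetical small grid).
param_grid = {"n_estimators": [20, 50], "max_depth": [None, 8]}
models = {}
for k in np.unique(labels):
    mask = labels == k
    search = GridSearchCV(
        RandomForestRegressor(random_state=0),
        param_grid,
        cv=3,
        scoring="neg_mean_squared_error",
    )
    search.fit(X_scaled[mask], y[mask])
    models[k] = search.best_estimator_

# AgglomerativeClustering has no predict(), so new points are routed to the
# cluster of their nearest training point (an assumption of this sketch).
router = KNeighborsClassifier(n_neighbors=1).fit(X_scaled, labels)


def predict_temperature(points):
    """Predict temperature at unobserved (lon, lat, depth) points."""
    points_scaled = scaler.transform(points)
    clusters = router.predict(points_scaled)
    out = np.empty(len(points))
    for k in np.unique(clusters):
        mask = clusters == k
        out[mask] = models[k].predict(points_scaled[mask])
    return out
```

Calling `predict_temperature` on a denser grid of (longitude, latitude, depth) points than the training grid is how a sketch like this would produce finer-grained temperature estimates. Whether per-cluster models outperform a single global random forest depends on the data and is not demonstrated by this toy example.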