As an extreme weather phenomenon, a typhoon can cause significant harm to the affected areas through the winds at its instantaneous maximum wind speed. Because China's capacity to monitor typhoon maximum wind speed was limited in the early years, some wind speed grades are missing from the recorded typhoon maximum wind speed data, which reduces the accuracy of subsequent maximum wind speed prediction. Completing the missing decision labels in the semi-supervised information system of typhoon maximum wind speed therefore retains the feature information of these samples and provides more training samples for prediction. In addition, the semi-supervised typhoon maximum wind speed information system contains a large number of meteorological features related to wind speed grade, and the redundant features among them make prediction more difficult. Feature selection is a natural way to handle redundant features in such a high-dimensional feature space, and fuzzy rough set theory, as a practical feature selection tool, can deal effectively with the continuous data in this information system. Based on rough set learning, this thesis therefore studies semi-supervised feature selection to address these two problems, with the following results.

1. This thesis proposes a decision labeling algorithm based on local density. First, the local density factor algorithm is used to calculate the local distribution of the unlabeled samples in the sample space of the semi-supervised information system. Then, the degree of correlation between each unlabeled sample and each decision class is calculated with a purpose-designed weight formula. Each unlabeled sample is assigned to the decision class with which it has the highest degree of correlation, which completes the missing decision labels while reducing the probability of introducing class noise.

2. This thesis proposes an improved version of the feature selection algorithm based on the information gain ratio. The algorithm builds on fuzzy rough set theory and information theory and uses the gain ratio from the original algorithm to measure the uncertainty of features. Then, taking full account of both the internal and the external correlation of features, a new feature importance search strategy is designed to carry out feature selection, which further reduces the dimension of the feature space and improves the classification accuracy of the information system by 1.79%.

3. The two algorithms above are applied to the semi-supervised information system of typhoon maximum wind speed to realize semi-supervised feature selection for that system. To verify their effectiveness, this thesis first uses the local density-based decision labeling algorithm to complete the decision labels of five semi-supervised typhoon maximum wind speed information systems, and a series of comparative experiments identifies the best settings of the algorithm's two parameters. The results show that the proposed decision labeling algorithm is effective on the semi-supervised typhoon maximum wind speed information systems and labels samples better than three other decision labeling algorithms. Then, the internal and external correlation feature selection algorithm based on the information gain ratio is used to select features in the five typhoon maximum wind speed information systems whose decision labels have been completed. A series of experiments shows that the proposed feature selection algorithm is effective and has clear advantages over five other feature selection algorithms. Finally, the selected meteorological features are analyzed.
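
To make the gain ratio measure mentioned in the second contribution concrete, the following is a minimal sketch of the classical (crisp) information gain ratio computed on already discretized features. It is only an illustration under stated assumptions: the thesis works with a fuzzy rough set formulation and an internal/external correlation search strategy that are not reproduced here, and all function names, feature names, and sample values are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(feature, labels):
    """H(labels | feature): label entropy averaged over the feature's values."""
    total = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    return sum(len(g) / total * entropy(g) for g in groups.values())


def gain_ratio(feature, labels):
    """Information gain of a feature divided by the feature's own entropy."""
    split_info = entropy(feature)
    if split_info == 0.0:
        return 0.0  # a constant feature cannot separate the decision classes
    gain = entropy(labels) - conditional_entropy(feature, labels)
    return gain / split_info


def rank_features(table, labels):
    """Feature names sorted from highest to lowest gain ratio."""
    return sorted(table, key=lambda name: gain_ratio(table[name], labels),
                  reverse=True)


if __name__ == "__main__":
    # Hypothetical, hand-made sample: two discretized meteorological features
    # and a wind speed grade as the decision label.
    table = {
        "pressure": ["low", "low", "high", "high"],
        "humidity": ["wet", "dry", "wet", "dry"],
    }
    grades = ["strong", "strong", "weak", "weak"]
    for name in rank_features(table, grades):
        print(name, gain_ratio(table[name], grades))
```

Dividing the gain by the feature's own entropy penalizes features with many distinct values, which is the usual motivation for preferring the gain ratio over raw information gain. A ranking like this scores each feature on its own; it does not account for redundancy between features, which is the gap the thesis's search strategy is described as addressing.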