| As machine learning technology has matured, it has gradually been applied in many fields. More and more experts and scholars use it for research in medicine, especially in disease prediction. This paper analyzes gastric cancer data and uses the naive Bayes algorithm to predict the 5-year survival status of gastric cancer patients, with the aim of supporting medical personnel in diagnosis. Firstly, gastric cancer data were obtained from the SEER database, and trends in diagnosis rate, incidence rate, and 5-year survival were analyzed. Regression analysis was used to identify the factors affecting the survival status of gastric cancer patients, and the prognosis of patients under different surgical methods was described. The important factors affecting 5-year survival status were then analyzed further by calculating SHAP values for the gastric cancer features. Secondly, an R-NB model is proposed, which combines the Relief algorithm with a weighted naive Bayes (NB) model. The Relief algorithm assigns a weight to each feature attribute according to its impact on the class and outputs the attributes with larger weights together with those weights, which addresses the NB model's inability to select feature attributes and weights on its own. To further improve the prediction accuracy of the R-NB model, a GAR-NB model is constructed by adding a genetic algorithm to the R-NB model, and the initial weights are optimized by setting genetic algorithm parameters such as the crossover rate and mutation rate. Extensive experiments show that, under the same conditions, the model performs better when the crossover rate is 0.8 and the mutation rate is 0.06. Because the GAR-NB model is prone to local optima and can miss the global optimum, the particle swarm optimization (PSO) algorithm is introduced to improve it, yielding the PGAR-NB model. The number of particles, the number of iterations, and other PSO parameters affect the final prediction results of the PGAR-NB model. Extensive simulation experiments show that the prediction results are better when the number of particles is 32 and the number of iterations is 300. Experimental results verify that the accuracy, precision, and recall of the PGAR-NB model improve to 90.82%, 91.15%, and 91.22%, respectively, in predicting the 5-year survival status of gastric cancer patients. Finally, the parallelization of the three models was studied, and the 5-year survival status of gastric cancer patients was predicted by deploying them on a big data platform. Experimental results show that the PGAR-NB model on the distributed platform has the highest prediction accuracy, and its operating efficiency is 2.5 times higher than in single-machine mode, which demonstrates the superiority of the parallel PGAR-NB model and provides a new approach for medical staff studying the survival status of patients with gastric cancer. |
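
The R-NB idea described above (Relief-derived feature weights feeding a weighted naive Bayes classifier) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes categorical features, two classes, the basic Relief update with Hamming distance, Laplace smoothing, and weights applied as exponents on the conditional likelihoods. The names `relief_weights` and `WeightedNB` and the toy data are hypothetical, and the genetic algorithm and PSO weight-optimization stages are omitted.

```python
import math
from collections import Counter


def hamming(a, b):
    """Number of positions at which two categorical rows differ."""
    return sum(x != y for x, y in zip(a, b))


def relief_weights(X, y):
    """Basic Relief for two classes: reward features that differ on the
    nearest miss and penalize features that differ on the nearest hit."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        if not hits or not misses:
            continue
        hit = min(hits, key=lambda j: hamming(X[i], X[j]))
        miss = min(misses, key=lambda j: hamming(X[i], X[j]))
        for f in range(d):
            w[f] += ((X[i][f] != X[miss][f]) - (X[i][f] != X[hit][f])) / n
    # Assumption: negative weights are clipped so irrelevant features drop out.
    return [max(0.0, v) for v in w]


class WeightedNB:
    """Naive Bayes scoring log P(c) + sum_f w_f * log P(x_f | c)."""

    def __init__(self, weights, alpha=1.0):
        self.weights = weights
        self.alpha = alpha  # Laplace smoothing constant (assumed value)

    def fit(self, X, y):
        self.n = len(y)
        self.class_count = Counter(y)
        self.classes = sorted(self.class_count)
        # Distinct values seen per feature, used in the smoothing denominator.
        self.n_values = [len({row[f] for row in X}) for f in range(len(X[0]))]
        self.counts = Counter()
        for row, c in zip(X, y):
            for f, v in enumerate(row):
                self.counts[(f, v, c)] += 1
        return self

    def predict(self, row):
        best_class, best_score = None, -math.inf
        for c in self.classes:
            score = math.log(self.class_count[c] / self.n)
            for f, v in enumerate(row):
                num = self.counts.get((f, v, c), 0) + self.alpha
                den = self.class_count[c] + self.alpha * self.n_values[f]
                score += self.weights[f] * math.log(num / den)
            if score > best_score:
                best_class, best_score = c, score
        return best_class


# Hypothetical toy data: feature 0 determines the class, feature 1 is noise.
X = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y"), ("a", "x"), ("b", "y")]
y = [1, 1, 0, 0, 1, 0]

weights = relief_weights(X, y)
model = WeightedNB(weights).fit(X, y)
print(weights, model.predict(("a", "y")), model.predict(("b", "x")))
```

In this sketch the informative feature receives the larger weight, so the noisy feature contributes little or nothing to the class score. A GA or PSO stage, as in GAR-NB and PGAR-NB, would start from such weights and search for values that improve validation accuracy.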