| With the rapid development of big data technology, artificial intelligence algorithms are being applied in a growing number of fields. Meanwhile, colorectal cancer, a highly harmful disease, has risen to third place in the national cancer rankings, so it is essential to explore its causes and to pursue early prevention. This article uses machine learning algorithms to build a colorectal cancer warning model that helps doctors diagnose the disease and understand its risk factors. First, we collected data from colorectal cancer patients and a control group. The data were preprocessed with noise reduction, normalization, and multi-value attribute preprocessing, and features were selected through correlation analysis combined with medical advice. Then, the performance of several models on the dataset was compared, and the random forest algorithm was chosen as the prediction model. Finally, the relationships and influences among attributes were analyzed through attribute correlation, and the different effects of factors such as colon polyps, sedentary behavior, abdominal discomfort, age, and sweet food on the incidence of colorectal cancer were summarized. This lays a foundation for doctors' diagnosis and for individuals' own early prevention. The innovation of this article is twofold. It proposes three methods for preprocessing multi-value attributes and compares their effectiveness and application scenarios. It also proposes a random forest algorithm for multi-value attributes, based on the extended sim5 decision tree algorithm. |
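
The preprocessing steps named in the abstract can be illustrated with a minimal sketch. It assumes min-max scaling for normalization and a multi-hot expansion for multi-value attributes. The abstract does not name its three multi-value methods, so the expansion shown here is a common generic technique and not necessarily one of them. The function names and the toy data are hypothetical.

```python
from typing import Dict, List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Rescale numeric values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def multi_hot_encode(records: Sequence[Sequence[str]]) -> Dict[str, List[int]]:
    """Expand a multi-value attribute into one binary column per distinct value."""
    vocabulary = sorted({value for record in records for value in record})
    return {
        value: [1 if value in record else 0 for record in records]
        for value in vocabulary
    }


# Hypothetical toy data, not taken from the study.
ages = [40.0, 55.0, 70.0]
symptoms = [["abdominal discomfort"], [], ["abdominal discomfort", "bloating"]]

normalized_ages = min_max_normalize(ages)
symptom_columns = multi_hot_encode(symptoms)
```

Each record may carry zero, one, or several values for the same attribute, which is why it becomes a set of binary columns that a standard tree-based learner can consume.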