| Among the many applications and research areas of machine learning, deep learning has made breakthroughs in numerous fields, and the application of convolutional neural networks to image processing is one of the current hot topics. Data, algorithms and computing power are the three factors that determine whether machine learning can be applied effectively. At present, research and applications focus mainly on algorithms, but in many application areas the data are the limiting factor for the effectiveness of machine learning. In particular, where the amount of data is small and the data are not sufficiently well characterised, the size and quality of the data can hardly meet the requirements of machine learning algorithms. Retinopathy of prematurity (ROP) is a condition that arises when the blood vessels in the eye of a premature baby are not yet fully formed and are susceptible to various factors. Regular eye examinations, together with timely ROP diagnosis and treatment, are required to avoid serious complications and visual impairment. However, timely diagnosis of ROP remains a challenge owing to factors such as varying levels of physician proficiency and volume limitations. With the development of deep learning and convolutional neural networks, computer vision techniques are therefore increasingly used for the automatic diagnosis of ROP. Many researchers have applied deep learning to the automatic detection and diagnosis of ROP lesion areas and have achieved remarkable results, so computer vision technology has broad application prospects in the diagnosis and treatment of ROP. In this paper, improving data quality and enhancing data features improves the accuracy of the network model in recognising ROP. In medical diagnosis and recognition, positive disease samples are often a minority of the data, which can leave the trained network model with a high false-negative rate. The false-negative rate, i.e. the missed-diagnosis rate, is precisely what must be reduced in medical diagnosis and recognition, so data processing is important for medical data, where class imbalance is common. The dataset used in this paper is small, and its proportion of positive samples is much smaller than that of negative samples, so there is a serious class imbalance. This paper therefore expands the data size and strengthens the data features to enhance the training of the network model. The data are imaged, labelled, segmented and pre-processed. To address the poor performance of network models trained on the original data, we clean the data and retain only good-quality data at each step, and then improve the quality and scale of the data through undersampling and data augmentation, so that the performance of the network models on the training, validation and test sets improves. In particular, GoogLeNet achieved an overall accuracy of 95% on the test set, which is a satisfactory level. |
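
The abstract names undersampling as one remedy for class imbalance and the false-negative rate as the quantity to reduce, but does not spell out either procedure. The sketch below is only an illustration under stated assumptions: it assumes plain random undersampling of every class down to the size of the smallest class, and the usual definition FNR = FN / (FN + TP). The function names `undersample` and `false_negative_rate` and the label convention (1 = ROP positive, 0 = negative) are hypothetical and not taken from the paper.

```python
import random
from collections import defaultdict


def undersample(labels, seed=0):
    """Return sorted sample indices with every class cut to the minority size.

    Assumption: random undersampling; the paper does not state its exact scheme.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    # Size of the smallest class; larger classes are reduced to this size.
    target = min(len(indices) for indices in by_class.values())
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, target))  # sampling without replacement
    return sorted(kept)


def false_negative_rate(y_true, y_pred, positive=1):
    """FNR = FN / (FN + TP): the share of true positives the model missed."""
    predictions_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not predictions_for_positives:
        raise ValueError("y_true contains no positive samples")
    missed = sum(1 for p in predictions_for_positives if p != positive)
    return missed / len(predictions_for_positives)


if __name__ == "__main__":
    # Toy labels: 2 positives and 8 negatives (illustrative, not the paper's data).
    labels = [1, 1] + [0] * 8
    kept = undersample(labels)
    print(len(kept), sum(labels[i] for i in kept))
    print(false_negative_rate([1, 1, 0, 0], [1, 0, 0, 1]))
```

Undersampling discards majority-class samples, which shrinks an already small dataset; this may be why the abstract pairs it with data augmentation rather than using it alone.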