Lithology plays a vital role in evaluating the formation characteristics and the oil and gas content of a reservoir. Therefore, identifying reservoir lithology is of great significance to oil and gas exploration and development. However, conventional lithology identification methods based on machine learning are still in their infancy and have two main limitations: 1) the correlation of both lithology and the logging sequence along the depth direction is not fully considered; 2) the influence of differences in lithology distribution and feature distribution among the logging data of different oil wells is ignored. To overcome these shortcomings, this thesis takes the logging data of shale oil wells in the Jianghan district of China as the research object, starts from the analysis of the original logging sequence data, and adopts deep learning and semi-supervised learning as its research methods. It studies logging-sequence-data-driven lithology identification from three perspectives: preprocessing of the logging sequence data, lithology identification for one well, and lithology identification for different wells. The work focuses on problems such as limited identification accuracy and differences in logging data distribution, and provides a reference for improving the calculation accuracy of reservoir parameters and the evaluation of formation characteristics. First, to eliminate the influence of negative factors in the original logging sequence data on lithology identification, the correlation comparison method, min-max normalization and the MAHAKIL oversampling algorithm are adopted for relative depth correction, standardization and lithology class balancing of the logging sequence data, respectively. On this basis, fully considering both the depth change trend and the spatial neighborhood information of the logging sequence, a lithology identification method for one well based on MAHAKIL and a temporal convolutional network (TCN) is proposed to improve the memory capacity for depth history information and the computing capability. Furthermore, to address the differences in lithology distribution and feature distribution of the logging data, a lithology identification algorithm for different wells based on improved class-balanced self-training (CBST) and model ensemble (ME) is designed. The self-training method is used to adaptively learn the differences in data distribution, and the ME method is used to mitigate the randomness of the self-training model. Comparative experimental results show that, in the task of lithology identification for one well, the MAHAKIL algorithm effectively improves both the identification accuracy of the minority lithology classes and the overall identification accuracy, and the TCN model with optimal hyperparameters has the best overall performance. In the task of lithology identification for different wells, the proposed method based on improved CBST and ME effectively improves the identification accuracy of the TCN model on the two cross-well data sets: the average accuracy increased from 88.91% and 78.87% to 90.22% and 81.05%, respectively, and the standard deviation of accuracy decreased from 1.84% and 1.41% to 1.14% and 1.01%, respectively.
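
Two of the steps named above, min-max normalization of the logging curves and class-balanced selection of pseudo-labels for self-training, can be illustrated with a minimal sketch. This is a simplified illustration under stated assumptions, not the thesis's improved CBST: the function names `min_max_normalize` and `class_balanced_pseudo_labels` and the `portion` parameter are hypothetical, and the selection rule (keep the most confident fraction of samples within each predicted class) is only the basic class-balancing idea.

```python
import numpy as np


def min_max_normalize(x):
    """Scale each logging curve (column) to the range [0, 1]."""
    x = np.asarray(x, dtype=float)
    lo = x.min(axis=0)
    span = x.max(axis=0) - lo
    # Constant curves have zero span; map them to 0 instead of dividing by zero.
    span = np.where(span == 0, 1.0, span)
    return (x - lo) / span


def class_balanced_pseudo_labels(probs, portion=0.5):
    """Select pseudo-labels per predicted class rather than by one global threshold.

    probs:   (n_samples, n_classes) predicted class probabilities on the
             unlabeled target-well data (assumed input, e.g. from a TCN).
    portion: hypothetical fraction of each predicted class to keep.
    Returns an integer array with the pseudo-label, or -1 if not selected.
    """
    probs = np.asarray(probs, dtype=float)
    pred = probs.argmax(axis=1)   # most likely class per sample
    conf = probs.max(axis=1)      # confidence of that prediction
    labels = np.full(len(probs), -1)
    for c in np.unique(pred):
        idx = np.flatnonzero(pred == c)
        k = max(1, int(np.ceil(portion * len(idx))))
        # Keep the k most confident samples of this class.
        keep = idx[np.argsort(-conf[idx])[:k]]
        labels[keep] = c
    return labels
```

With a single global confidence threshold, a minority lithology class whose predictions are systematically less confident might receive no pseudo-labels at all; selecting per class keeps every predicted class represented in the next self-training round.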