Research On Integrated Lithology Identification Method For Unbalanced Data

Posted on:2023-05-12

Degree:Master

Type:Thesis

Country:China

Candidate:Y Lou

Full Text:PDF

GTID:2531307163989389

Subject:Computer Science and Technology

Abstract/Summary:

In the exploration and development of oil and gas reservoirs, lithology identification based on conventional logging data helps characterize the geology and is of great significance for predicting reservoir oil and gas. In actual logging data, however, the different lithology classes are unevenly represented. This imbalance lowers the identification accuracy of traditional lithology identification methods and makes them difficult to apply in practice.

To address the classification of unbalanced data, this paper first proposes UWBagging, a Bagging ensemble classification algorithm based on undersampling. First, multiple training subsets are drawn from the original data set by Bootstrap sampling. Second, the distribution structure is determined from the density peaks of the majority class, and each training subset is balanced by combining neighborhood features. Third, a base classifier is trained on each balanced data set, and its weight is computed from the out-of-bag data. Finally, the final model is generated by weighted voting. Experiments verify the effectiveness of the UWBagging algorithm.

Resampling can effectively balance a data set, but recognition of the minority class can still remain low. This paper therefore proposes HCBoost, a Boosting ensemble classification algorithm based on mixed sampling. First, the minority-class samples are clustered, and new minority-class samples are synthesized according to cluster density and sample weight. Second, the majority class is randomly undersampled, taking the sample weights into account, to obtain a balanced data set. Third, the balanced data set is used to train the base classifier C4.5, and the sample error rates are updated according to a cost function so that more attention is paid to minority samples. Finally, the final model is generated iteratively. Experiments verify the effectiveness of the HCBoost algorithm.

In this paper, the lithology sample set is constructed from logging data and logging lithology data, and two lithology identification models are built with the UWBagging and HCBoost algorithms. Experimental results show that both models effectively improve the accuracy of lithology identification.
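The Bagging pipeline outlined in the abstract (Bootstrap subsets, balancing by undersampling, out-of-bag weighting, weighted voting) can be illustrated with a minimal Python sketch. This is not the thesis's UWBagging implementation, and several parts are assumptions made only for illustration: plain random undersampling stands in for the density-peak and neighborhood-based balancing, a nearest-centroid rule stands in for the base classifier, the out-of-bag weight is taken to be balanced accuracy, and all function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def fit_centroids(X, y):
    """Stand-in base classifier (assumption): one mean vector per class."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}


def predict_one(centroids, row):
    """Return the class whose centroid is nearest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])),
    )


def undersample(indices, y, rng):
    """Random undersampling (assumption): cut each class to the smallest class size."""
    by_class = defaultdict(list)
    for i in indices:
        by_class[y[i]].append(i)
    n_min = min(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(rng.sample(members, n_min))
    return balanced


def oob_balanced_accuracy(model, X, y, indices):
    """Mean per-class recall on out-of-bag samples; 1.0 if there are none."""
    hits, totals = Counter(), Counter()
    for i in indices:
        totals[y[i]] += 1
        hits[y[i]] += int(predict_one(model, X[i]) == y[i])
    if not totals:
        return 1.0
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def fit_weighted_bagging(X, y, n_estimators=15, seed=0):
    """Train (model, weight) pairs on balanced Bootstrap subsets."""
    rng = random.Random(seed)
    n = len(X)
    ensemble = []
    for _ in range(n_estimators):
        boot = [rng.randrange(n) for _ in range(n)]      # Bootstrap sample
        oob = sorted(set(range(n)) - set(boot))          # out-of-bag indices
        idx = undersample(boot, y, rng)                  # balance the subset
        model = fit_centroids([X[i] for i in idx], [y[i] for i in idx])
        weight = oob_balanced_accuracy(model, X, y, oob)
        ensemble.append((model, weight))
    return ensemble


def predict_weighted(ensemble, row):
    """Weighted vote: each model adds its weight to the class it predicts."""
    votes = defaultdict(float)
    for model, weight in ensemble:
        votes[predict_one(model, row)] += weight
    return max(votes, key=votes.get)
```

Calling `fit_weighted_bagging(X, y)` on a list of feature vectors `X` and labels `y`, then `predict_weighted(ensemble, row)` on a new feature vector, yields the ensemble's label. Weighting by an out-of-bag score lets base classifiers that generalize better on unseen samples count for more in the vote, which is the role the abstract assigns to the out-of-bag data.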

Keywords/Search Tags:

Imbalanced data, Ensemble learning, Lithology identification

Related items

1	Research On Lithology Identification Method Based On CNN-LSTM Network And Ensemble Learning
2	Logging Sequence Data-driven Lithology Identification
3	A Lithology Identification Method Based On Multi-sensor Fusion
4	Research And Application Of Lithology Identification Based On Ensembling Learning
5	Sewage Treatment Fault Diagnosis And Software Development Based On Weighted Extreme Learning Machine Ensemble Algorithm
6	Lithology Identification Based On Convolutional Neural Network
7	Imbalanced Data Classification And Its Application In Wastewater Treatment System
8	Research On Lithology Recognition Method Based On Hyperspectral And Machine Learning
9	Research And Application Of Imbalanced Data Classification Based On Oversampling And Ant Colony Optimization Resampling
10	Research On Intelligent Lithology Classification Method