| In petroleum exploration, logging curve clustering is a regional, macroscopic study of stratigraphic distribution. Wells that fall into the same cluster can be analyzed with the same method, which breaks with the traditional "one-hole view" of logging and helps log interpreters form a comprehensive stratigraphic analysis. Logging curve clustering therefore gives practitioners a new analysis tool and more options for logging curve analysis, which is of practical significance. Logging curve clustering usually includes three steps: preprocessing, similarity calculation, and curve clustering. Preprocessing aligns the curves in depth and compresses the curve data. Similarity calculation is based on a dynamic cosine criterion. Curve clustering is performed with an improved K-means algorithm. Classical similarity measures use vector size or vector depth information to calculate the vector matching contribution value, but they have a problem: a vector has limited impact on the curve similarity result when its angle is too large or too small. The dynamic cosine criterion extends the vector matching contribution value to depend on both the vector size and the angle between the vector and the depth, which makes the curve similarity result more reasonable. Because the final number of clusters is difficult to determine before clustering logging curves, an improved K-means algorithm is proposed that changes the objective function from minimum distance to maximum benefit. The improved algorithm adjusts the number of clusters automatically during iteration, so the number need not be specified before the algorithm starts, which avoids the negative impact of an unreasonable K value on the clustering results of the standard K-means algorithm. Four sets of clusters are obtained by 
using the improved K-means algorithm to cluster the logs from the different regions. When the Jaccard similarity is calculated between the cluster set containing each well and the similar set, it reaches 65% for more than 70% of the wells and exceeds 80% for more than half of the wells. This high Jaccard similarity among the elements of a cluster set indicates that similar elements are assigned to the same set and supports the rationality of the clustering results of the improved K-means algorithm. The clustering results of the improved K-means algorithm differ little from those of the K-means algorithm with a reasonable K value: the consistency rate of the two results reaches 90%. For the original K-means algorithm, the difference in clustering results tends to increase as the difference in the K value increases; when the K values differ by 2, the consistency rate is only around 85%. This gap in consistency rates indicates the necessity of improving the K-means algorithm. |
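
The Jaccard check described above can be illustrated with a minimal Python sketch. It shows only the standard Jaccard definition, |A ∩ B| / |A ∪ B|, applied to one well's cluster set and similar set. The well identifiers and the contents of both sets are hypothetical, and the abstract does not state how the similar set is built, so this is an assumption-laden illustration rather than the authors' procedure.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty sets count as identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical example: the cluster that contains well "W1" and a
# separately derived set of wells judged similar to "W1".
cluster_set = {"W1", "W2", "W3", "W4"}
similar_set = {"W1", "W2", "W3", "W5"}

score = jaccard(cluster_set, similar_set)
print(f"Jaccard similarity: {score:.2f}")  # 3 shared wells out of 5 distinct wells
```

Repeating this for every well and counting how many scores pass a threshold would give the kind of summary reported in the abstract, such as the share of wells whose similarity reaches 65%.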