
Application Of Small Sample Data Mining In Soil Classification

Posted on: 2022-10-23   Degree: Master   Type: Thesis
Country: China   Candidate: S Zhang   Full Text: PDF
GTID: 2506306320985569   Subject: Engineering
Abstract/Summary:
Forensic geology is a modern discipline that applies the knowledge and techniques of geology to forensic evidence. Soil is one of its most important types of examination material: by examining soil evidence, investigators can infer the source of a soil sample and so obtain clues and directions for solving a case. However, the relevant domestic departments have collected little data on typical soils across the country, and most analysis of soil evidence is still at the stage of manual comparison. Building on related domestic and international research, this thesis therefore analyzes the characteristics of actual soil sample data, designs a comprehensive classification workflow, and constructs a multi-data intelligent analysis system for forensic geology. The system aims to classify soil evidence comprehensively across multiple indicators and to implement multi-dimensional visualization of the soil data. The main work of the thesis is as follows:

(1) Based on a review of the development of forensic geology in China and abroad, the types and distribution characteristics of the soil data sets provided by the research project were analyzed, and the concept of a "scattered cloud clusters" distribution was proposed. In addition, commonly used data mining techniques and small-sample augmentation techniques were analyzed.

(2) The limited number of soil samples in forensic geology leads to insufficient training data and over-fitting of classification models, which severely restricts classification accuracy. After analyzing the advantages and disadvantages of existing data augmentation algorithms, a new data augmentation algorithm, C-SMOTE, was proposed and compared with other augmentation algorithms. Experiments were carried out on six types of soil indicator data. The results showed that training data augmented by the C-SMOTE algorithm improved classification accuracy significantly.

(3) Traditional data visualization often fails to meet the needs of multi-dimensional data. Based on the characteristics of soil data, the principles of multi-dimensional visualization methods such as parallel coordinate plots and RadViz plots were studied. The concept of information entropy was introduced into Chernoff faces, which made different classes of high-dimensional data easier to distinguish and gave users technical support for discovering the information hidden in multi-dimensional data.

(4) Soil data was divided into character data and numerical data according to its indicators. In the overall classification workflow of the system, the character data is classified first in order to narrow down the candidate types. The numerical data is then classified, with the training set expanded by the C-SMOTE algorithm to improve the classification accuracy of each indicator. Finally, following the idea of ensemble learning, the final traceability result for a soil sample is determined by voting. Experimental results showed that this workflow further improved classification accuracy beyond what the C-SMOTE algorithm achieved alone.

Finally, a software engineering process was followed to complete the requirements analysis, outline design, and detailed design of the system, and to develop the multi-data intelligent analysis system for forensic geology.
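The abstract does not give the details of C-SMOTE or of the voting scheme, so the following is only a rough sketch of the two general ideas it names: SMOTE-style interpolation between a small-class sample and one of its nearest neighbours, and a majority vote over per-indicator predictions. It is standard SMOTE-like oversampling, not the thesis's C-SMOTE, and every name in it (`smote_like_oversample`, `majority_vote`, the toy samples and region labels) is hypothetical.

```python
import random
from collections import Counter


def smote_like_oversample(samples, n_new, k=3, seed=0):
    """Create n_new synthetic points by SMOTE-style interpolation.

    Assumption: this is generic SMOTE-like oversampling for one class,
    NOT the C-SMOTE algorithm from the thesis, whose details are not
    given in the abstract.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(samples) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # Rank the other samples by squared Euclidean distance to the base.
        others = [s for s in samples if s is not base]
        others.sort(key=lambda s: sum((a - b) ** 2 for a, b in zip(base, s)))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        t = rng.random()
        synthetic.append([a + t * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical two-feature measurements for one soil class.
    soil_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for point in smote_like_oversample(soil_class, n_new=3):
        print(point)

    # Hypothetical per-indicator predictions for one unknown sample.
    print(majority_vote(["region_A", "region_B", "region_A"]))
```

Because each synthetic point lies on a segment between two real samples, the augmented set stays inside the range of the observed data, which is the usual motivation for interpolation-based augmentation when samples are scarce.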
Keywords/Search Tags: Forensic Geology, Data Mining, Small Sample Data Augmentation, Multi-Dimension Data Visualization, Soil Data Classification