Research On Urban Landmarks Extraction Based On Random Forests Classifier

Posted on:2021-03-17

Degree:Master

Type:Thesis

Country:China

Candidate:Y Gong

Full Text:PDF

GTID:2480306293452394

Subject:Cartography and Geographic Information System

Abstract/Summary:

Urban landmarks are spatial features that are more salient than their neighbors in structure, cognition, or appearance. They play a significant role as spatial references and are important in spatial cognition and way-finding. At present, landmarks are mainly extracted manually, which is time-consuming and labor-intensive. A growing number of scholars are studying the automatic extraction of landmarks, but current extraction methods cannot satisfy the needs of navigation. This study introduces a random forest classifier for urban landmark extraction and extracts landmarks from basic geographic information databases. The urban feature database is highly imbalanced: non-landmarks far outnumber urban landmarks, which can lead to low classification accuracy for urban landmarks. To address the low recognition rate for urban landmarks, this paper works on two aspects, data and algorithms, to reduce the influence of imbalanced data on the classifier.

This paper selects 15 salience indicators for urban features to construct a feature space. These indicators are available from basic geographic information databases or social sensing data, and can be divided into three categories: structural, cognitive, and perceptual. From the data standpoint, to address the imbalance of the urban POI dataset, random oversampling (ROS), SMOTE, and ADASYN are applied to reduce the imbalance ratio. After obtaining three balanced datasets with ROS, SMOTE, and ADASYN, we apply the random forest algorithm to extract urban landmarks. To determine the best feature set, we evaluate the importance of each feature and test different combinations of features. The results show that the improved algorithm based on oversampling performs well in urban landmark extraction: recall and AUC both exceed 90%, and ROS is the best-performing oversampling method. In addition, we obtain the best combination of indicators for the model, which can help reduce the difficulty of data collection.

From the algorithm design standpoint, to improve precision on the minority class, this paper proposes a cost-sensitive random forest. The class distribution is added to the cost function, and each sample is given a corresponding weight according to its spatial scale. To improve the random forest, each decision tree is weighted by its classification performance. The results show that, compared with random forest and a cost-sensitive decision tree, the cost-sensitive random forest achieves higher classification precision, with recall and AUC both above 90%. Additionally, this method is suitable for small datasets, which can help reduce the manual labeling workload.
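The two ways of handling imbalance described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation. The toy feature vectors, the function names `random_oversample` and `balanced_class_weights`, and the weighting formula n_samples / (n_classes * class_count) are assumptions made for illustration; the thesis additionally weights samples by spatial scale and trees by classification performance, which is not reproduced here.

```python
import random
from collections import Counter


def random_oversample(X, y, minority_label, seed=0):
    """Random oversampling (ROS): duplicate randomly chosen minority
    samples until the minority class matches the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    minority_idx = [i for i, label in enumerate(y) if label == minority_label]
    n_needed = target - len(minority_idx)
    # Draw indices with replacement from the minority class only.
    extra = [rng.choice(minority_idx) for _ in range(n_needed)]
    X_bal = list(X) + [X[i] for i in extra]
    y_bal = list(y) + [minority_label] * n_needed
    return X_bal, y_bal


def balanced_class_weights(y):
    """Cost-sensitive alternative: weight each class inversely to its
    frequency, so errors on the rare class cost more (assumed formula)."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}


# Hypothetical toy data: 8 non-landmarks (label 0) and 2 landmarks (label 1),
# each described by two made-up salience indicators.
X = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4],
     [0.1, 0.3], [0.4, 0.2], [0.3, 0.1], [0.2, 0.2],
     [0.9, 0.8], [0.8, 0.9]]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y, minority_label=1)
weights = balanced_class_weights(y)

print(Counter(y_bal))  # class counts after oversampling
print(weights)         # per-class misclassification weights
```

Either output could then be passed to a random forest: the balanced dataset as training data, or the weights as per-class costs during training.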

Keywords/Search Tags:

urban landmarks, salience, random forest, class imbalance, cost-sensitive ensemble

Related items

1	Urban Landmarks Extraction Based On Basic Geographic Information Database
2	A Novel Approach To Product Quality Control In Industry Based On Ensemble Learning
3	Imbalanced Learning Based On Undersampling Technique And Rotation Forest
4	Credit Rating For Online Lending Based On Cost-sensitive Classification And Ensemble Learning
5	Research On Pattern Recognition And Application Of Quality Control Chart Considering Class Unbalanced Data
6	Protein Subcellular Localization Based On Feature Selection And Cost-Sensitive Learning
7	Prediction Of Protein SUMO Modification Sites Based On Cost-sensitive Learning
8	Classification Of Ensemble Algorithm In Gene Expression Data
9	Urban Air Quality Prediction Model Based On Improved Random Forest Algorithm
10	Prediction Research Of Protein-Protein Interaction Based On Ensemble Of Support Vector Machine And Random Forest