Drawing on the toponym gazetteers and toponymic data compilations published from the first toponymic survey, this study collected more than 1 million toponym records and extracted 6,756 toponymic words. Creatively borrowing the main idea of the TF-IDF algorithm, it calculated and ranked the usage frequency of these words and produced a preliminary classification table of toponymic words. The table divides the words into two classes. There are 478 first-class words with a TF-IDF value of 0, which are widely used throughout the country. There are 6,278 second-class words with a TF-IDF value greater than 0. Among these, the greater the TF-IDF value, the higher the regional popularity: such words are commonly used in the names of geographical entities, less often in the north of China and more often in the south. The smaller the TF-IDF value, the lower the regional popularity: such words are rarely used and generally serve as the proper-name element of a toponym. This paper is divided into four parts. The first part, the introduction, defines the concepts of toponyms and toponymic words, explains the significance of research on classifying toponymic words, reviews the state of research on toponyms, toponymic words, and their classification, and sets out the research methods of this paper. The second part introduces the classification method, classification principle, and classification results for toponymic words based on the TF-IDF algorithm. The third part summarizes the results. The fourth part summarizes the research method and its advantages and disadvantages. The classification of toponymic words is of great significance for raising the standing of toponymic words, establishing their position, clarifying their background, and promoting both the standardization of toponymic words and the development of research on them.
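The two-class split described above can be illustrated with a minimal sketch. It is not the study's actual procedure: it assumes each region's toponymic words are pooled into one "document", that a word's score is its largest tf × idf over all regions, and that idf is the natural log of (number of regions / number of regions containing the word). The function name `classify_words` and the toy data are hypothetical. Under these assumptions, a word found in every region has idf = 0 and therefore a score of 0, which corresponds to the first class.

```python
import math
from collections import Counter


def classify_words(region_words):
    """Split words into two classes from a TF-IDF-style score.

    region_words: dict mapping a region name to the list of toponymic
    words observed in that region (assumed input format).
    """
    n_regions = len(region_words)
    # Word counts per region (each region acts as one "document").
    counts = {region: Counter(words) for region, words in region_words.items()}

    # Document frequency: in how many regions does each word occur?
    doc_freq = Counter()
    for region_counts in counts.values():
        doc_freq.update(region_counts.keys())

    scores = {}
    for word, df in doc_freq.items():
        idf = math.log(n_regions / df)  # 0.0 when the word occurs in every region
        best = 0.0
        for region_counts in counts.values():
            tf = region_counts[word] / sum(region_counts.values())
            best = max(best, tf * idf)  # assumption: keep the highest regional score
        scores[word] = best

    # First class: score 0 (nationwide). Second class: score > 0, highest first.
    first_class = sorted(w for w, s in scores.items() if s == 0)
    second_class = sorted((w for w, s in scores.items() if s > 0),
                          key=lambda w: -scores[w])
    return first_class, second_class, scores


# Invented toy data, not taken from the survey.
toy = {
    "north": ["cun", "cun", "tun", "shan"],
    "south": ["cun", "chong", "chong", "shan"],
    "west": ["cun", "shan", "gou"],
}
first_class, second_class, scores = classify_words(toy)
print(first_class)
print(second_class)
```

In this toy input, the words that occur in all three regions land in the first class, while the remaining words are ranked by how strongly they concentrate in a single region.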