
Community Classification Based On Tourism Information

Posted on: 2016-12-17
Degree: Master
Type: Thesis
Country: China
Candidate: C Li
Full Text: PDF
GTID: 2208330473961431
Subject: Computer system architecture
Abstract/Summary:
With rising living standards, travel has become a preferred way to spend public holidays, and demand for travel information services has grown accordingly. Before a trip, travelers prepare online: they look up sightseeing information, share travel stories, or search for tour groups. In recent years, microblogging has become a very popular form of social networking. Its distinctive features, such as brevity, timely publishing, and online interaction, attracted a large number of users in a very short period of time. In particular, many travelers obtain or post travel information through microblogs. This activity generates a large volume of short-text messages, each within 150 characters, which together form a huge and complex network. By applying data mining techniques such as association and clustering algorithms to these short messages, we can explore the travel information network, identify groups of travelers, and predict trends in travel activity. This is significant for providing travel service information, recommending suitable routes, and predicting peak travel times.

This thesis clusters microblog short texts with different central ideas into different groups according to Hamming distance. The travel information network is then divided in accordance with its community structure. Finally, users who post messages with the same ideas are grouped around one community center. Online community division is a hot topic in social network research, and complex text networks are one of the fields of complex network research.
Therefore, this thesis combines these two areas in its research and experiments. It conducts the following studies on the division of the travel information network.

(1) Study and analysis of social network theory, especially complex networks. The study covers notation, network analysis, methodology, and network communities. On network communities, the thesis focuses on community stability, centrality, and small-world networks, with particular attention to the key techniques of typical complex networks and complex text networks. Texts are described in terms of the features of social networks and complex networks.

(2) Assessment of and experiments on short text clustering based on SimHash. Building on the theoretical study above, a complex text network is constructed after text preprocessing, which includes word segmentation and denoising. Using an improved SimHash method that takes into account the features of social networks and complex text networks, the thesis proposes a new text clustering algorithm, runs clustering experiments on a complete experimental data set, and evaluates the algorithm.

(3) Research and design of travel information network analysis. On the basis of the short text clustering, community-center-node and important-node analysis algorithms are used to divide a microblog network dataset into communities, and the feasibility and accuracy of the division are evaluated. Finally, the computation is parallelized and conclusions are drawn.
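The abstract gives no details of the improved SimHash method, so the following is only a minimal sketch of plain SimHash fingerprinting with Hamming-distance grouping, not the thesis's algorithm. The function names, the MD5-based token hash, the 64-bit fingerprint width, the greedy single-pass grouping, and the threshold `max_distance=3` are all assumptions made for illustration. Tokens are assumed to come from an earlier word segmentation step.

```python
import hashlib
from collections import Counter

BITS = 64  # fingerprint width (assumption; the abstract does not state one)


def simhash(tokens):
    """Return a 64-bit plain SimHash fingerprint for a list of tokens."""
    weights = [0] * BITS
    for token, count in Counter(tokens).items():
        # Hash each token to 64 bits; MD5 is an arbitrary illustrative choice.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        for i in range(BITS):
            # Add the token frequency where the bit is 1, subtract it where 0.
            if (h >> i) & 1:
                weights[i] += count
            else:
                weights[i] -= count
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def cluster(fingerprints, max_distance=3):
    """Greedy single-pass grouping (hypothetical, for illustration only).

    Each fingerprint joins the first cluster whose representative lies
    within max_distance bits; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of input indices.
    """
    clusters = []  # pairs of (representative fingerprint, member indices)
    for idx, fp in enumerate(fingerprints):
        for rep, members in clusters:
            if hamming(rep, fp) <= max_distance:
                members.append(idx)
                break
        else:
            clusters.append((fp, [idx]))
    return [members for _, members in clusters]
```

Under these assumptions, each microblog message would be segmented into tokens, passed through `simhash`, and the resulting fingerprints handed to `cluster`. Users whose messages fall into the same cluster could then be candidates for the same community. The greedy pass depends on input order, so a real evaluation would likely need a more robust grouping step.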
Keywords/Search Tags: Complex Network, Community Division, SimHash Algorithm, Short Text Clustering, Tourist Information Service