Research On Resident Behaviour Analysis Methods Based On Data Mining

Posted on: 2024-01-18

Degree: Master

Type: Thesis

Country: China

Candidate: C Ma

GTID: 2568307157999769

Subject: Computer Science and Technology

Abstract/Summary:

With the development of science, technology and society, daily life has become increasingly varied, and the volume of location-based social media check-in data generated by residents has grown exponentially. This check-in data contains rich temporal and spatial information as well as information about residents’ preferences. Collecting and analyzing it properly is important for urban planners and their strategic partners when formulating urban management policies. On the one hand, policies can be introduced according to residents’ needs to improve their quality of life. On the other hand, based on the characteristics of residents’ behavior patterns, preventive measures can be taken in advance against potential emergencies, so that problems are solved at an early stage before they seriously affect residents’ daily life. Such measures can improve the government’s social credibility and residents’ sense of security and happiness.

The analysis of resident check-in behavior can be roughly divided into two steps: data collection and data mining. With appropriate data mining methods, the behavior characteristics of residents can be discovered from large volumes of check-in data. However, existing data mining of resident check-in behavior has the following problems. In terms of data sets, existing data sets do not cover residents’ complete behavior over a day, so they cannot capture residents’ behavior patterns completely. In terms of mining methods, statistical methods often fail to capture the behavior patterns hidden in large-scale check-in data because they rely only on simple mathematical processing. K-means clustering used alone cannot cluster residents’ check-in behavior well: it easily falls into a local optimum and is sensitive to the initial cluster centers. In addition, most research focuses on hotspot identification, resident classification and behavior prediction, and there is little work on mining residents’ behavior patterns. Existing work also tends to focus on the overall level or on partial hot spots, ignoring the individual level.

In view of these problems, this thesis analyzes the requirements for data mining of residents’ check-in behavior. While improving data completeness, it extracts and analyzes residents’ check-in behavior comprehensively at the overall, partial and individual levels by using statistics, an improved clustering algorithm and new pattern mining algorithms. The specific research content and contributions are as follows:

1. A complete framework for extracting and analyzing resident check-in behavior is proposed, covering the full process of data collection, data preprocessing, data mining, and result analysis and interpretation. The framework uses statistics, clustering, pattern mining and other techniques to extract and analyze residents’ behavior patterns and temporal characteristics at the overall, partial and individual levels. A case study based on a resident check-in data set for large venues in the United States shows that the proposed framework can effectively capture the characteristics of residents’ behavior patterns and provide decision support for city managers. The data set was collected via Foursquare in conjunction with the Twitter API, and is realistic enough to capture actual information about each check-in location throughout the day.

2. An improved K-means clustering method, KDE-Mean-Shift, is proposed to address the poor clustering performance of K-means on resident check-in behavior data. Improving the clustering results requires optimizing both the selection of the initial cluster centers and the determination of the number of clusters. The KDE-Mean-Shift algorithm uses kernel density estimation (KDE) to obtain information about the data distribution and finds multiple locations where the data is densest as the input to the Mean-Shift algorithm. Mean-Shift is then run to determine the initial cluster center locations and the final number of clusters. Finally, this information is passed to the K-means algorithm, which produces the final clustering result. Experimental results show that the proposed method can effectively improve the clustering results on resident check-in data.

3. The CM-SPAM sequential pattern mining algorithm and the PFPM periodic pattern mining algorithm are used to extract sequential and periodic behavior patterns from the check-in data of US residents. Both algorithms take the temporal characteristics of residents’ behavior into account and can effectively capture the sequential and periodic characteristics of behavior patterns at the overall and individual levels. Experiments on check-in data of US residents collected during the 2020 epidemic period show that, compared with traditional pattern mining algorithms, these methods run faster and produce more interpretable results.
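
The three clustering stages described in point 2 (KDE to locate dense regions, Mean-Shift to fix the initial centers and the number of clusters, K-means for the final result) can be illustrated with a minimal pure-Python sketch. It is not the thesis implementation. The Gaussian kernel, the single bandwidth h, the rule for choosing dense seed points, the rule for merging modes, and all function names are assumptions made for illustration, and the data is synthetic 2-D points, not check-in records.

```python
import math
import random


def kde(points, x, h):
    """Gaussian kernel density at x, up to a constant factor (assumed kernel)."""
    return sum(math.exp(-math.dist(p, x) ** 2 / (2 * h * h)) for p in points) / len(points)


def density_peaks(points, h):
    """Assumed seed rule: keep each point that is the densest within radius h."""
    dens = [kde(points, p, h) for p in points]
    peaks = []
    for i, p in enumerate(points):
        near = [j for j, q in enumerate(points) if math.dist(p, q) <= h]
        if all(dens[i] >= dens[j] for j in near):
            peaks.append(p)
    return peaks


def mean_shift(points, seeds, h, n_iter=50, tol=1e-4):
    """Move each seed to the kernel-weighted mean until it stops moving,
    then keep only modes that are more than h apart (assumed merge rule)."""
    modes = []
    for x in seeds:
        for _ in range(n_iter):
            w = [math.exp(-math.dist(p, x) ** 2 / (2 * h * h)) for p in points]
            total = sum(w)
            new_x = tuple(sum(wi * p[d] for wi, p in zip(w, points)) / total
                          for d in range(len(x)))
            moved = math.dist(new_x, x)
            x = new_x
            if moved < tol:
                break
        if all(math.dist(x, m) > h for m in modes):
            modes.append(x)
    return modes


def k_means(points, centers, n_iter=100):
    """Standard Lloyd iterations starting from the given centers."""
    centers = list(centers)
    for _ in range(n_iter):
        labels = [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
                  for p in points]
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def kde_mean_shift_kmeans(points, h):
    """KDE seeds -> Mean-Shift modes -> K-means with k = number of modes."""
    seeds = density_peaks(points, h)
    modes = mean_shift(points, seeds, h)
    return k_means(points, modes)


def make_blobs(true_centers, n_per_blob, sigma, seed=0):
    """Synthetic 2-D test data: one Gaussian blob per center."""
    rng = random.Random(seed)
    return [(rng.gauss(cx, sigma), rng.gauss(cy, sigma))
            for cx, cy in true_centers for _ in range(n_per_blob)]


if __name__ == "__main__":
    pts = make_blobs([(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)], 40, 0.5)
    centers, labels = kde_mean_shift_kmeans(pts, h=1.0)
    print(len(centers), "clusters found")
```

In this sketch the number of clusters is never supplied by the caller: it is the number of modes that survive Mean-Shift. The bandwidth h still has to be chosen, and this sketch does not address how to select it for real check-in data.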

Keywords/Search Tags:

Data mining, Check-in behavior analysis, Sequential pattern mining, Periodic pattern mining, Clustering

Related items

1	Research On The Sequence Mining Algorithm And Its Application In User Behavior Analysis
2	Constraint-based Sequential Pattern Mining And Its Applications
3	Research Of Misuse IDS Based On Sequential Pattern Mining And Key Technologies
4	Research On Sequential Pattern Mining Algorithm Based On Constraints
5	Nonoverlapping Closed Sequential Pattern Mining
6	Research And Application Of Mining Access Sequential Pattern In Weblog
7	Research On Mining Sequential Patterns With Periodic Wildcard Gaps
8	Mining Sequential Patterns With Periodic General Gap Constraints
9	Keyword Extraction Based On Sequential Pattern Mining
10	Research On Sequential Pattern Mining And Web Usage Mining