Studies On Semantic Typicality Query And Diverse Recommendation Approach For Spatio-Textual Data

Posted on: 2021-01-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Y Zhang
GTID: 1488306602982469
Subject: Mine spatial information engineering
Abstract/Summary:
With the widespread use of the mobile internet and the rapid development of GPS, large numbers of spatial web objects (spatial objects for short, e.g., points of interest and check-ins), which contain both geographic and textual information, have gradually formed large volumes of spatio-textual data. As a result, two highly correlated technologies over spatio-textual data, spatial keyword query and point-of-interest (POI) recommendation, have become hot research topics in the field of location-based services. This dissertation studies three open problems in these fields: semantic approximate spatial keyword query, typicality analysis of query results, and diverse POI recommendation. The innovative contributions are summarized as follows.

(i) Most existing spatial keyword query processing models support only location proximity and text similarity matching. In terms of text processing, however, spatial objects whose text is semantically similar to the query but differs in form cannot be retrieved and returned to users. Furthermore, current spatio-textual index structures cannot process numerical attributes. To solve these problems, a spatial keyword semantic query approach is proposed. First, a query keyword semantic expansion model based on a CGAN (Conditional Generative Adversarial Network) is designed to generate a set of keywords that are semantically related to the original query keywords. Then, a hybrid index structure, the Attribute Inverted-file R-Tree (AIR-Tree), is proposed; it supports both location and semantic matching and uses the Skyline method to process numerical attributes. Algorithms for the insert, delete, and query operations of the AIR-Tree are also presented. Finally, the AIR-Tree is used for query matching to return the top-k spatial objects that are most relevant to the expanded query conditions, ranked by a synthetic scoring function. Experimental results showed that the semantically relevant keywords generated by the CGAN-based query expansion model are more reasonable and that the model is especially helpful for expanding rare query keywords. The experiments also demonstrated that the AIR-Tree can effectively process numerical attributes and achieves higher query precision, lower index construction cost, and faster execution.

(ii) The top-k result objects obtained with a scoring function based on location proximity and text similarity are often similar to each other, whereas users hope that the system can pick a small number of typical results that help them understand the representative features of the whole result set. To address typicality analysis and typical object selection for spatial keyword query results, a typicality evaluation and top-k approximate selection approach is proposed. First, the approach calculates the synthetic distances between all spatial objects over the dimensions of geographic location, textual semantics, and numeric attributes. To measure the semantic relevancy between the descriptive texts (resp. user comment texts) associated with spatial objects, a semantic similarity evaluation method based on keyword coupling relationships (resp. on the combination of word embedding and a convolutional neural network) is proposed. Second, based on the synthetic distances between spatial objects, a method based on Gaussian kernel probability density estimation is proposed for evaluating the typicality of spatial objects. To facilitate query result analysis and top-k typical object selection, two approximate top-k typical object selection algorithms are presented, one based on a tournament strategy and one based on local neighborhoods. An upper bound on the error rate of the local neighborhood-based approximation algorithm is also proved. The experimental results demonstrated that the text semantic relevancy measures for spatial objects are accurate and reasonable, and that the local neighborhood-based top-k selection algorithm achieves both a low error rate and high execution efficiency.

(iii) After obtaining query results, a user also hopes that the system can recommend other types of POIs from the query result region. To solve this problem, a diverse and personalized recommendation approach that jointly considers the geographical and social relationships between spatial objects is proposed. First, the geographic distances and social correlations between spatial objects are merged to construct a geographic-social relationship model, which is used to evaluate the geographic-social correlations between spatial objects. Then, a spectral clustering-based method is proposed to partition the spatial objects into several clusters. Lastly, the POIs relevant to the user's preferences are picked from each cluster with a probabilistic factor model-based algorithm, and these POIs are ranked by how well they satisfy the user's preferences to form the diversified and personalized recommendation list. Experimental results demonstrated that the clusters obtained by considering the geographic-social correlations between spatial objects are more reasonable and that the recommended POI list achieves high diversity while retaining a certain level of accuracy, which broadens and deepens the user's awareness of the POIs and their geographic-social relationships.

The approaches and corresponding technologies proposed in this dissertation can be applied in real application fields such as spatio-textual data query and recommendation, typicality analysis, location-based services, invisible community discovery, spatio-temporal data mining, urban computing, and marketing, where they can play an important role in improving the service quality of existing systems. The dissertation includes 50 figures, 41 tables, and 138 references.
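
The typicality evaluation in contribution (ii) can be illustrated with a minimal sketch. The abstract does not state the exact formula, so the sketch assumes the common kernel density form, in which the typicality of an object is the average Gaussian kernel value of its synthetic distances to all objects in the result set. The function names, the bandwidth parameter `h`, and the use of exact (rather than tournament-based or local neighborhood-based approximate) top-k selection are assumptions made for illustration, not the dissertation's implementation.

```python
import math


def gaussian_kernel(d, h):
    """Gaussian kernel value for a distance d with bandwidth h (h is an assumed parameter)."""
    return math.exp(-(d * d) / (2.0 * h * h)) / (h * math.sqrt(2.0 * math.pi))


def typicality(dist, h=1.0):
    """Typicality of each object: mean kernel value over its distances to all objects.

    dist is a symmetric n x n matrix of precomputed synthetic distances
    (how the geographic, semantic, and numeric parts are combined is not shown here).
    """
    n = len(dist)
    return [
        sum(gaussian_kernel(dist[i][j], h) for j in range(n)) / n
        for i in range(n)
    ]


def top_k_typical(dist, k, h=1.0):
    """Exact top-k selection: indices of the k objects with the highest typicality."""
    scores = typicality(dist, h)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Toy example: four objects on a line, with absolute differences as distances.
points = [0.0, 1.0, 1.2, 5.0]
dist = [[abs(a - b) for b in points] for a in points]
print(top_k_typical(dist, 2))
```

In this toy example the two objects in the dense middle of the set receive the highest typicality, while the isolated object at 5.0 receives the lowest. The exact selection shown here evaluates all pairwise distances; the approximate algorithms named in the abstract aim to avoid that cost.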
Keywords/Search Tags: spatio-textual data, CGAN model, semantic approximate query, AIR-Tree index, typicality analysis, diverse POI recommendation