| With the rapid development of location technologies such as GPS, mobile devices, and remote sensing, location-based social networks (LBSNs) are becoming increasingly popular and generating large amounts of check-in data. Mining valuable knowledge from this spatio-temporal data is crucial for real-world applications. Existing approaches typically generate dense representations of points of interest (POIs) by modeling these check-ins before proceeding to downstream data mining tasks. However, the coordinates of these representations carry no interpretable meaning, and the interpretability of POI representations has so far received little scholarly attention. This lack of interpretability negatively affects the performance of downstream data mining tasks. To address these issues, this thesis proposes three models that improve the performance of downstream data mining tasks by using semantic categories to make POI representations interpretable. The main contributions of this thesis are as follows: 1. A category-aware Check-in Embedding Model named "CEM" is proposed to generate POI and category representations (a category may be a restaurant, a mall, etc.). The CEM model captures the sequential patterns and the semantic category information of check-in records, generating both POI and category embedding vectors. 2. This thesis then proposes an eXplainable Embedding Model named "XEM" that makes the POI representation interpretable by using semantic categories to explain each of its dimensions. Specifically, we use these categories as semantic anchors and calculate the similarity between the POI embedding and these anchors based on the embedding vectors learned by CEM. We then use these similarity scores as the values of the POI representation, where each dimension corresponds to a semantic anchor (i.e., a semantic category) and can be interpreted as a coherent and easily understood topic. 3. To address the high dimensionality and 
redundancy caused by the large number of categories in XEM, an improved model called "XEM-C" is proposed. It groups the categories into clusters and uses these clusters as semantic anchors. Categories within the same cluster are semantically similar to each other, while categories in different clusters differ significantly in meaning. XEM-C treats each dimension of the POI representation as a cluster of similar categories and uses the similarity score between the POI and that cluster as the value of the dimension. Each dimension of an XEM-C representation can therefore be interpreted through a set of semantically similar categories. Extensive experiments, comprising qualitative tasks and quantitative tasks (evaluation of POI similarity and of POI semantic annotation), are conducted on two real-world check-in datasets. The experimental results show that adding interpretability to the POI representation can improve the performance of downstream tasks. |
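
The anchor-based construction described for XEM and XEM-C can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes cosine similarity as the similarity measure, assumes that a cluster anchor is the centroid of its member category embeddings, and takes the grouping of categories as given rather than running a clustering algorithm. The function names, the toy two-dimensional vectors, and the cluster labels are all hypothetical and stand in for embeddings that CEM would learn.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def xem_representation(poi_vec, category_vecs):
    """XEM-style representation: one dimension per category anchor."""
    return {name: cosine(poi_vec, vec) for name, vec in category_vecs.items()}


def centroid(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def xem_c_representation(poi_vec, category_vecs, clusters):
    """XEM-C-style representation: one dimension per cluster of categories.

    Assumption: the anchor of a cluster is the centroid of its members.
    """
    rep = {}
    for label, members in clusters.items():
        anchor = centroid([category_vecs[m] for m in members])
        rep[label] = cosine(poi_vec, anchor)
    return rep


# Made-up 2-d embeddings standing in for vectors learned by CEM.
category_vecs = {
    "restaurant": [1.0, 0.0],
    "cafe": [0.8, 0.6],
    "mall": [0.0, 1.0],
}
poi_vec = [1.0, 0.0]  # hypothetical POI embedding

# Hypothetical grouping; the clustering step itself is not shown here.
clusters = {"food": ["restaurant", "cafe"], "shopping": ["mall"]}

xem = xem_representation(poi_vec, category_vecs)      # 3 dimensions, one per category
xem_c = xem_c_representation(poi_vec, category_vecs, clusters)  # 2 dimensions, one per cluster

print(xem)
print(xem_c)
```

In this toy setup the XEM-style vector has as many dimensions as there are categories, while the XEM-C-style vector has only as many dimensions as there are clusters. This is the dimensionality reduction that motivates XEM-C, and each value remains readable as "how close is this POI to this category or group of categories".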