
The Research Of Learned Hash Index Models For Recommender Systems

Posted on: 2019-05-28    Degree: Master    Type: Thesis
Country: China    Candidate: W K Xiang
GTID: 2428330548975562    Subject: System analysis and foundation
Abstract/Summary:
Quickly locating the required data is a major challenge when processing large-scale data. Index models based on traditional data structures need a large amount of space to build an index, and their indexing efficiency drops sharply when the data distribution is non-uniform. In addition, the many forward and inverted indexes in a recommender system require a large number of large-scale hash tables. As the amount of data grows, hash collisions multiply and indexing efficiency decreases. Although many researchers have proposed improved algorithms for index structures, the gains have been limited, mainly because traditional data structures cannot adapt to the distribution of the data. This thesis therefore studies learned index structures, which build indexes by learning data distributions.

Learned index structures were proposed by Tim Kraska et al. at the end of 2017 and immediately attracted widespread discussion and attention. They treat index construction as a regression or classification problem. Learning the data distribution with machine learning algorithms can save a great deal of storage space and offers new ideas for indexing in databases and other systems.

First, this thesis describes the shortcomings of traditional index structures and the technical difficulties of building index structures in recommender systems, and summarizes the ideas and features of existing learned index structures. Second, a multi-layer model is established, and two algorithms that use neural networks as the hash model are designed: one splits the data set and the other maps the data to the hash table. Next, building on these supervised neural network models and the characteristics of inverted indexes, an unsupervised learned hash model is designed and implemented to find a mapping that distributes the data evenly. On this basis, to meet the requirements of the inverted index in recommender systems, a recurrent neural network structure is added, and a complete character-based hash index structure, CB-LHI, is designed and implemented. CB-LHI adds an LSTM layer to each supervised and unsupervised learning model in the multi-layer model, extracts different features from different sub-data sets, and separates similar data.

Finally, experimental data sets following various distributions are constructed, and each model is compared with traditional hash functions. The experimental results show that the CB-LHI model outperforms the traditional hash functions in both collision rate and space utilization. The thesis thereby explores the feasibility of building learned index models for such systems.
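The core idea of a learned hash function can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the thesis's neural network models or CB-LHI: it replaces the learned model with a piecewise-linear approximation of the key distribution's CDF and maps a key to slot floor(M * F(key)), the scaling idea described in the learned-index literature. The function names `fit_cdf_model`, `learned_hash`, and `collision_rate`, and the toy key set, are hypothetical.

```python
import bisect


def fit_cdf_model(sample_keys, num_knots):
    """Pick num_knots evenly spaced quantiles of the keys as CDF knots.

    Stand-in for a trained model (assumption): the knots are the only
    parameters that need to be stored.
    """
    if num_knots < 2:
        raise ValueError("num_knots must be at least 2")
    s = sorted(sample_keys)
    return [s[i * (len(s) - 1) // (num_knots - 1)] for i in range(num_knots)]


def learned_hash(key, knots, table_size):
    """Map key to a slot via floor(table_size * F(key)).

    F is the piecewise-linear CDF defined by the knots.
    """
    if key <= knots[0]:
        return 0
    if key >= knots[-1]:
        return table_size - 1
    # Find the segment with knots[i] <= key < knots[i + 1].
    i = bisect.bisect_right(knots, key) - 1
    lo, hi = knots[i], knots[i + 1]
    frac = (key - lo) / (hi - lo)
    cdf = (i + frac) / (len(knots) - 1)
    return min(table_size - 1, int(cdf * table_size))


def collision_rate(keys, hash_fn):
    """Fraction of keys that land in a slot that is already occupied."""
    occupied = set()
    collisions = 0
    for k in keys:
        slot = hash_fn(k)
        if slot in occupied:
            collisions += 1
        else:
            occupied.add(slot)
    return collisions / len(keys)


if __name__ == "__main__":
    # Toy, strongly non-uniform key set (hypothetical data).
    keys = [1, 2, 4, 8, 16, 32, 64, 128]
    size = len(keys)
    knots = fit_cdf_model(keys, num_knots=size)

    # Fixed modulo hash: every key from 8 upward lands in slot 0.
    print("modulo :", collision_rate(keys, lambda k: k % size))
    # CDF-based hash: each key lands in its own slot on this data.
    print("learned:", collision_rate(keys, lambda k: learned_hash(k, knots, size)))
```

On this toy data the modulo hash has a collision rate of 0.5 and the CDF-based hash has none, but only because the sketch stores one knot per key. Using fewer knots reduces the space needed for the model at the cost of more collisions, which is the trade-off a learned hash model has to balance.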
Keywords/Search Tags: Index Structure, Hash Function, Machine Learning, Recurrent Neural Network