
Research On Ecommerce Document Retrieval Technology Based On BERT Model

Posted on: 2024-04-17
Degree: Master
Type: Thesis
Country: China
Candidate: T C Tang
Full Text: PDF
GTID: 2568307157488004
Subject: Applied statistics
Abstract/Summary:
More and more consumers search online for the products and services they need, which has driven the rapid development of the ecommerce industry. Consumers search for products through ecommerce platforms and expect the results that best match their needs. In recent years, the theory and application of deep learning have proven themselves over time. Natural language processing is an important field of deep learning, covering web information retrieval, ecommerce retrieval, intelligent question answering, and other directions. In ecommerce retrieval, text matching techniques from natural language processing can semantically compare users' queries with commodity documents, improving the accuracy and personalization of retrieval. This thesis studies a document retrieval model for the ecommerce domain based on pre-trained models. The main work is as follows:

(1) A Semantic Recall Model Based on SimCSE

A retrieval model based on keyword matching can only return results when the query and the document contain the same keywords, and it cannot take the semantic relationship between the two into account. A semantic recall model based on deep learning, by contrast, can mine the correlation between the query and the document and improve the relevance and coverage of the recall results. This thesis adopts an incremental pre-training strategy to perform unsupervised task-adaptive training on the BERT model, and then fine-tunes it with supervised contrastive learning on the ecommerce dataset. Finally, queries and documents are mapped into a shared semantic space, and fast semantic recall is achieved with a vector retrieval tool. The recall performance of Elasticsearch, Word2Vec, Sentence-BERT, and the proposed model is compared. The results on the test set indicate that the proposed model achieves the best results.

(2) A Document Ranking Model Based on BERT

The purpose of the recall stage is to find as many documents related to the user's query as possible. However, the encoder used for recall lacks fine-grained semantic interaction between the query and the document when it computes their matching degree. It is therefore necessary to rank the recalled documents so that those related to the query are placed as high as possible. At this stage, a cross-encoder is usually used to score the relevance between the query and each document. In this thesis, the vector retrieval tool Faiss is used to quickly recall the 50 most similar documents for each query in the training set, and some documents in the corpus are selected as negative samples. The model is trained with the Adaptive Margin Ranking Loss. The results show that the proposed retrieval method, which combines semantic recall with ranking, performs well.
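
The two stages described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the thesis's implementation: the function names are hypothetical, the temperature value is an assumed default, a brute-force cosine search stands in for the Faiss index, and a fixed-margin ranking loss stands in for the Adaptive Margin Ranking Loss, whose exact form the abstract does not give.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def in_batch_contrastive_loss(q: np.ndarray, d: np.ndarray,
                              temperature: float = 0.05) -> float:
    """Contrastive loss with in-batch negatives.

    Row i of q (a query embedding) is paired with row i of d (its positive
    document); every other row of d serves as a negative for that query.
    The temperature is an assumed value, not one taken from the thesis.
    """
    sims = l2_normalize(q) @ l2_normalize(d).T / temperature
    sims = sims - sims.max(axis=1, keepdims=True)  # numerical stability
    log_probs = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))


def recall_top_k(query_vec: np.ndarray, doc_vecs: np.ndarray, k: int) -> np.ndarray:
    """Brute-force stand-in for a vector index: indices of the k nearest documents."""
    q = l2_normalize(query_vec[None, :])[0]
    scores = l2_normalize(doc_vecs) @ q
    return np.argsort(-scores)[:k]


def margin_ranking_loss(pos_scores: np.ndarray, neg_scores: np.ndarray,
                        margin: float = 1.0) -> float:
    """Fixed-margin pairwise loss: penalize pairs where the positive document
    does not outscore the negative one by at least `margin`."""
    return float(np.mean(np.maximum(0.0, margin - (pos_scores - neg_scores))))


if __name__ == "__main__":
    # Toy 2-dimensional embeddings; real embeddings would come from the encoder.
    docs = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])
    query = np.array([1.0, 0.0])
    print("recalled document indices:", recall_top_k(query, docs, k=2))
    print("ranking loss:", margin_ranking_loss(np.array([2.0, 0.5]),
                                               np.array([0.0, 0.4])))
```

In a setup like the one the abstract describes, the recalled candidates (top 50 rather than the top 2 used here) would be re-scored by a cross-encoder, and the ranking loss would be computed on those cross-encoder scores for positive and negative documents.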
Keywords/Search Tags:BERT model, text matching, contrastive learning, semantic recall, document ranking