Design And Implementation Of Text Retrieval System Based On Deep Learning

Posted on:2020-02-09

Degree:Master

Type:Thesis

Country:China

Candidate:Z L Tang

Full Text:PDF

GTID:2428330575957050

Subject:Computer technology

Abstract/Summary:

With the growth of Internet data, text retrieval systems have been deployed in a wide range of products. The same growth in data has also driven rapid progress in neural networks and deep learning. However, existing text retrieval systems seldom use deep learning. This thesis therefore designs and implements a text retrieval system in which users submit a query and receive the texts that most closely match their intent. The work focuses on text retrieval technology and deep learning algorithms, and the resulting system runs on a distributed architecture. Three pieces of work have been completed.

First, a distributed crawler based on a Master/Slave architecture crawls the data, and the crawled data is cleaned. Samples are constructed from the crawled data and combined with the TREC data set.

Second, to improve deep text matching, two text matching models are introduced: the Siamese semantic network model, which is based on single semantic feature extraction, and the MatchPyramid model, which is based on direct semantic modeling. A new joint model is then proposed that combines the features extracted by the Siamese semantic network model with the features extracted by the MatchPyramid model. Experiments using MAP (mean average precision) as the evaluation index show that the joint model scores more than 8% higher than traditional retrieval methods and 3% higher than existing deep learning algorithms.

Third, a text retrieval system based on a distributed architecture has been designed and implemented. To speed up retrieval, the system uses the Hadoop and Spark streaming computing frameworks, which are among the most commonly used distributed systems in industry. The system implements the following major modules:

1. Offline data processing module: data cleaning and distributed index construction.
2. Offline model training module: to speed up online retrieval, the system trains models offline and loads them online.
3. Cache module: to improve retrieval speed, high-frequency historical queries are cached together with their search results.
4. Search term processing module: text error correction and feature extraction for search terms.
5. Search result display: a web server built on Flask displays the final search results.
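
The abstract reports its results in terms of MAP. As an illustration only, the following minimal Python sketch shows how mean average precision is commonly computed from ranked result lists. It is not the thesis's evaluation code: the function names, the document and query identifiers, and the toy data are all hypothetical.

```python
from typing import Dict, List, Set


def average_precision(ranked_ids: List[str], relevant_ids: Set[str]) -> float:
    """Average precision of one ranked result list against its relevant set."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            # Precision at this rank, counted only where a relevant item appears.
            precision_sum += hits / rank
    # Relevant items that were never retrieved contribute zero.
    return precision_sum / len(relevant_ids)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of the per-query average precision over all judged queries."""
    if not qrels:
        return 0.0
    total = sum(
        average_precision(runs.get(query_id, []), relevant)
        for query_id, relevant in qrels.items()
    )
    return total / len(qrels)


# Hypothetical toy data: two queries with ranked results and relevance judgments.
example_runs = {"q1": ["d1", "d2", "d3"], "q2": ["d4", "d5", "d6"]}
example_qrels = {"q1": {"d1", "d3"}, "q2": {"d5"}}
example_map = mean_average_precision(example_runs, example_qrels)
```

For the toy data, query q1 has an average precision of (1/1 + 2/3) / 2 and query q2 has 1/2, so the mean over both queries is 2/3.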

Keywords/Search Tags:

text retrieval, deep learning, semantic similarity matching, distributed system, retrieval system

Related items

1	Image-Text Retrieval Based On Hierarchical Interaction Network
2	Text Retrieval Based On Real-time Twitter Streaming
3	Research Of Key Technologies On Rural Medical Text Retrieval Based On Distributed Environment
4	Research On CT Image Retrieval Method Of Pulmonary Nodule Based On Deep Learning
5	Text Semantic Matching Method With Hybrid Strategies
6	A Short Texts Matching Method Using Multi-level Features
7	Research And Improvement Of Deep Relevance Matching Model Based On Information Retrieval
8	Research And Application Of Full Text Retrieval Based On Hadoop
9	Research On The Distributed Indexing Platform And Information Filter In Distributed Full-text Retrieval System
10	Semantic Based Retrieval For Heterogeneous 3D CAD Models