
Research and Design of a New Media Manuscript Retrieval System Based on Solr

Posted on: 2021-06-02 | Degree: Master | Type: Thesis
Country: China | Candidate: X F Hu | Full Text: PDF
GTID: 2518306308475994 | Subject: Electronics and Communications Engineering
Abstract/Summary:
With the rise of the Internet and the mobile Internet in recent years, new media has flourished, and the number of articles and manuscripts published in new media has grown explosively. Faced with this massive and varied body of new media data, self-media users need a way to find the manuscript information they require quickly and accurately. To address this need, this thesis proposes a new media manuscript retrieval system with a B/S (browser/server) architecture. The system is built on the Spring + Spring MVC + Hibernate architecture, combines the Solr search engine service with Baidu speech recognition tools, and is implemented in Java.

This thesis focuses on the key technologies and strategies used in the system architecture design and develops a Solr-based new media manuscript retrieval system consisting mainly of four parts: preprocessing, building the Solr system, user query, and the database. With the open-source search engine Solr as the core of the system, the thesis studies how indexing, the core technology of a search engine, is implemented. To ensure the efficiency and quality of word segmentation, it studies word segmentation algorithms and compares the performance of several Chinese word segmenters. So that Solr can build an index from text, it also studies methods for converting non-text files into text.

The main innovations of this thesis are as follows. First, because a traditional relational database cannot handle large volumes of real-time requests and is not effective for Chinese full-text search, an enterprise search engine is needed to solve the search problem. As an enterprise-level search engine, Solr provides a powerful full-text search function that meets the needs of enterprise search. Developers build on Solr's mature search engine framework, so they do not need to deal with the low-level implementation of the search engine and can concentrate on the upper-level search business logic. This greatly reduces development risk and cost and shortens the development cycle. Second, because new media manuscripts contain a large number of multimedia files such as pictures, audio, and video, the system combines Solr's fast indexing with the mature Baidu speech recognition service to form a new media manuscript retrieval system for enterprise platforms. Unlike traditional retrieval, which queries a single format such as text, this system uses Baidu's speech recognition to convert audio and video files into text that can then be queried. Adding the information from audio and video files improves query accuracy, diversifies the resources that can be queried, and enriches the system's retrieval functions.
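The index that the abstract calls the core technology of a search engine is, in general terms, an inverted index: a map from each term to the documents that contain it. The Java sketch below illustrates that general idea only. It is not the thesis's code and not Solr's implementation; the class name `InvertedIndexSketch`, its methods, and the sample documents are hypothetical, and the tokenizer simply splits on non-alphanumeric characters, whereas Chinese text would need a real word segmenter at that step.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical toy class: a minimal in-memory inverted index.
public class InvertedIndexSketch {
    // term -> sorted ids of the documents that contain the term
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    // Naive tokenizer (an assumption for illustration): lower-case the text
    // and split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Indexing: record the document id under every term of its text.
    public void add(int docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // AND query: ids of the documents that contain every query term.
    public List<Integer> search(String query) {
        List<String> terms = tokenize(query);
        if (terms.isEmpty()) {
            return new ArrayList<>();
        }
        SortedSet<Integer> result = null;
        for (String term : terms) {
            SortedSet<Integer> docs =
                postings.getOrDefault(term, Collections.emptySortedSet());
            if (result == null) {
                result = new TreeSet<>(docs);   // copy, so postings stay intact
            } else {
                result.retainAll(docs);         // intersect with this term's docs
            }
        }
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "New media manuscript about Solr indexing");
        index.add(2, "Transcript text recognized from an audio manuscript");
        index.add(3, "Solr full-text search over manuscript text");
        System.out.println(index.search("manuscript solr")); // prints [1, 3]
        System.out.println(index.search("audio"));           // prints [2]
    }
}
```

In this sketch, text recognized from an audio file is indexed the same way as any other text, which is why converting non-text files to text first makes them searchable. A production engine such as Solr adds analysis chains, relevance ranking, and on-disk index structures that this toy version leaves out.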
Keywords/Search Tags: new media, Solr, index, full-text search