Design And Implementation Of The LSA Data Processing Subsystem For Baidu Video

Posted on: 2015-06-29
Degree: Master
Type: Thesis
Country: China
Candidate: T Jiang
Full Text: PDF
GTID: 2298330434950325
Subject: Software engineering
Abstract/Summary:
The author participated in the development of the Baidu Video Semantic Search System. The system provides semantic search services for Baidu's video search: it tries to recognize what the user actually means, rather than matching only the literal query, by drawing on additional semantic information. The author's main work is to design and implement the data subsystem for this system. The subsystem has three main functions: preprocessing data for later semantic search, controlling data quality, and collecting statistics on semantic search. The corresponding modules are the search data preprocessing module (Alamake), the shield data preprocessing module (Pc_filter), the data monitoring module (Data_monitor), and the web interface.

The work can be summarized as follows:
(1) Requirements analysis and system design. The author first derives the requirements from the shortcomings of the old system and then proposes a design for the subsystem. The system adopts a B/S (browser/server) architecture so that users can operate it easily through a web interface. It uses an inverted index to build its indexes and MySQL plus Redis to store data.
(2) Implementation. The author implements the web interface in HTML/JavaScript and the main modules in PHP, with MySQL plus Redis for data storage.
(3) Testing. The author runs functional and performance tests to verify that every function meets its requirements and that performance is acceptable.
(4) Online tracking. The author monitors the deployed system to confirm that it meets actual requirements in production.

The system is stable and the results are satisfactory. The various dictionaries and indexes make semantic search more readily available. Storing the data in MySQL and Redis makes the system more efficient. Backup management and data monitoring ensure data quality. In short, the data subsystem gives users a better experience and meets the requirements.
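
To illustrate the inverted index mentioned in the abstract, here is a minimal sketch in Python. It is not the thesis's implementation, which is written in PHP and backed by MySQL and Redis. The function names, the whitespace tokenizer, the in-memory dictionary, and the sample video titles are all assumptions made for illustration.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a simplifying assumption)."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical video titles keyed by document ID.
docs = {
    1: "funny cat video",
    2: "cat documentary",
    3: "funny dog video",
}
index = build_inverted_index(docs)
```

Calling `search(index, "funny video")` intersects the posting sets for "funny" and "video", so only titles containing both terms are returned. In a deployment such as the one described, the posting sets would presumably live in an external store rather than in process memory.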
Keywords/Search Tags:Search Engine, Semantic Search, Inverted Index, Redis