Design And Implementation Of Vertical Search Engine Based On ElasticSearch For MOOC

Posted on: 2020-08-26
Degree: Master
Type: Thesis
Country: China
Candidate: C D Duan
GTID: 2428330578957206
Subject: Software engineering
Abstract/Summary:
With the continuing integration of education and information technology, the Massive Open Online Course (MOOC) has become an important way to learn. The number of MOOC platforms keeps growing, and the number of courses offered has exploded. As a result, learners spend more time and energy across various online platforms selecting courses that meet their needs. Although general search engines such as Google, Baidu, and Bing can return MOOC courses, their results are not accurate, and users must filter the information they need out of cluttered results, which is inefficient. A system dedicated to learners' searches for MOOC information is therefore increasingly important.

This thesis designs and implements a vertical search engine for the MOOC field that meets learners' search requirements in this field and improves learning efficiency. The whole system is developed in Python: the Scrapy framework collects and extracts data from multiple MOOC platforms, the MongoDB non-relational database stores the data, ElasticSearch indexes the data and provides a distributed search server, and Django implements the user-facing search website. Based on a study of search engine technology and an analysis of user behavior, the system is divided into a MOOC crawler module, an information indexing module, and a user search module. The author independently designed and developed every module of the system. The details of each module are as follows:

(1) Crawler module: crawls multiple MOOC platforms, downloads the page at each specified URL, and extracts the course name, course link, introduction, teacher, school, and other information from the page. The acquired data is then processed, non-compliant MOOC course data is filtered out, and the processed data is stored in MongoDB and in the ElasticSearch index.
(2) Information indexing module: creates mappings for the complete MOOC course data, performs Chinese word segmentation, and builds an inverted index. ElasticSearch fuzzy matching implements search suggestions, and multi-field search lets users retrieve course information.
(3) User search module: uses the Django framework to build a dynamic website with a user-friendly interface and interaction logic that makes it convenient to search MOOC data. It provides recent search history, search time, the number of courses found, paged browsing of results, and so on. The search page returns accurate MOOC course information, and clicking a course name jumps to the corresponding page to study the course.

Functional and performance tests show that the system meets learners' search requirements in the MOOC course field. Its simple, accurate search results make it easy to view course introductions and improve the efficiency of course search, giving the system high practical value.
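As a rough illustration of the data filtering and search steps described in the abstract, the sketch below checks whether a crawled course record is complete and builds ElasticSearch request bodies for a paginated multi-field search and for fuzzy search suggestions. It is not the thesis's actual code: the field names (`name`, `link`, `introduction`, `teacher`, `school`, `suggest`), the boost on the course name, the suggester name `course_suggest`, and the choice of the completion suggester are all assumptions made for illustration. The functions only construct plain dictionaries in the ElasticSearch query DSL; sending them to a cluster is left out.

```python
from typing import Any

# Hypothetical field names, based on the items the abstract says are extracted.
REQUIRED_FIELDS = ("name", "link", "introduction", "teacher", "school")


def is_compliant(course: dict[str, Any]) -> bool:
    """Return True if every required field is present and non-blank."""
    return all(str(course.get(field) or "").strip() for field in REQUIRED_FIELDS)


def build_search_body(keywords: str, page: int = 1, size: int = 10) -> dict[str, Any]:
    """Build a request body for a paginated multi-field search."""
    return {
        "query": {
            "multi_match": {
                "query": keywords,
                # Boosting the course name (^3) is an illustrative choice.
                "fields": ["name^3", "introduction", "teacher", "school"],
            }
        },
        # ElasticSearch paginates with a zero-based offset and a page size.
        "from": (page - 1) * size,
        "size": size,
    }


def build_suggest_body(prefix: str) -> dict[str, Any]:
    """Build a completion-suggester body that tolerates one typo."""
    return {
        "suggest": {
            "course_suggest": {
                "prefix": prefix,
                # Assumes the mapping defines a `completion` field named "suggest".
                "completion": {"field": "suggest", "fuzzy": {"fuzziness": 1}},
            }
        }
    }
```

In a Scrapy project, a check like `is_compliant` would typically live in an item pipeline so that incomplete records are dropped before they reach MongoDB or the index, and the Django views would pass the two bodies to the search client.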
Keywords/Search Tags: Vertical Search, ElasticSearch, Scrapy, MOOC, Django