
The Design And Implementation Of Massive Text Information Search System Based On Distributed Framework

Posted on: 2015-05-16
Degree: Master
Type: Thesis
Country: China
Candidate: L H Wang
GTID: 2298330452961132
Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet, people pay more and more attention to web applications. How to organize and process massive volumes of web text messages is a critical factor in data mining, search engines, telecommunications services, network security, network monitoring, and network information collection. Processing massive web text messages has the following characteristics: it requires full-text retrieval, and the data is generated rapidly, at high density, on a large scale, and continuously. As a result, how to manage, store, and retrieve web text messages efficiently is an important research topic. However, little research and few solutions address this area.

The material of this paper is drawn mainly from an enterprise project at Harbin Heng Sheng Communication Technology Co. Ltd. The system developed in the project is applied in the field of public Internet security. It is also used to retrieve the data and text messages that users are interested in.

The content of the paper is based on the analysis of massive web text messages. The following technologies are involved: the Oracle 10g parallel database, partitioned tables, ROWID queries, the ElasticSearch distributed architecture, and a multi-thread scheduling algorithm. The data is handled in the following steps: design a database model for the massive web text messages, load the data into the database, and manage the data by creating text indexes. The method presented in the paper thus makes efficient data retrieval possible and gives users a better retrieval experience.

During the research, the author studied the general problems of storing and accessing massive text messages. Based on these studies, the author lists the business application scenarios, from which the system's original requirements are derived.

Finally, the author describes the design and implementation of the system in more detail, covering the following aspects: the software development life cycle, system design and implementation, and system testing. In the course of the research, the paper analyzes and summarizes the use-case model of the system's functional requirements. The functional blocks and the system architecture are designed on the basis of this model. Text message storage, index creation, process engine creation, and the HTTP index service framework are the core functional blocks of the system, and the author introduces them further using class diagrams, sequence diagrams, and flow charts.
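The indexing and retrieval steps summarized above can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a small in-memory inverted index in place of ElasticSearch, and a standard-library thread pool in place of the multi-thread scheduling algorithm. All names (`InvertedIndex`, `build_index`, `tokenize`) and the sample messages are hypothetical.

```python
import re
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from threading import Lock

TOKEN_RE = re.compile(r"\w+")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return TOKEN_RE.findall(text.lower())


class InvertedIndex:
    """Hypothetical in-memory stand-in for a full-text index."""

    def __init__(self):
        self._postings = defaultdict(set)  # term -> set of message ids
        self._lock = Lock()                # guards concurrent writers

    def add(self, doc_id, text):
        """Index one message under every distinct term it contains."""
        terms = set(tokenize(text))
        with self._lock:
            for term in terms:
                self._postings[term].add(doc_id)

    def search(self, query):
        """Return ids of messages that contain every term of the query."""
        terms = tokenize(query)
        if not terms:
            return set()
        with self._lock:
            result = set(self._postings.get(terms[0], set()))
            for term in terms[1:]:
                result &= self._postings.get(term, set())
        return result


def build_index(messages, workers=4):
    """Index a {id: text} mapping using a pool of worker threads."""
    index = InvertedIndex()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all tasks and re-raises any worker exception.
        list(pool.map(lambda item: index.add(*item), messages.items()))
    return index


# Sample data, invented for illustration only.
messages = {
    1: "Distributed full-text retrieval",
    2: "Parallel database partitioning",
    3: "Full-text index on a distributed cluster",
}
index = build_index(messages)
print(index.search("distributed text"))
```

In a real deployment the index would live in a distributed search cluster and the raw messages in partitioned database tables; the sketch only shows how concurrent index creation and term-based lookup fit together.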
Keywords/Search Tags: Distributed architecture, Parallel database, Partitioning table, Text indexing, Full-text retrieval