
Research On Information Retrieval And Public Opinion Detection Algorithm Based On Hadoop

Posted on: 2016-04-28
Degree: Master
Type: Thesis
Country: China
Candidate: P Li
Full Text: PDF
GTID: 2208330470968017
Subject: Communication and Information System
Abstract/Summary:
Rapid advances in network hardware have made it practical to access network services, data storage, and online data processing over the Internet, and the technology in this area has become increasingly mature. In the current era of exploding Internet data, extracting valuable information from such large volumes of data is a very active research direction and an important foundation for realizing the value of network resources. Cloud computing has come to prominence in this context. It provides a platform for most researchers and, through shared online storage, online access, and online services, lets users avoid the costs of the traditional operating model while improving the efficiency and quality of data access. Hadoop is a distributed computing platform that emerged on top of cloud computing infrastructure; it consists of a file system (HDFS) and a programming model (Map/Reduce). It gives users a well-encapsulated framework, so they only need to know its external interfaces. Distributed systems also have a very wide range of applications. This thesis focuses on public opinion research: it combines data from Hadoop-based web crawling with complex network and time synchronization theory and presents an improved public opinion detection algorithm.

Set against the background of big data, the thesis first analyzes the research background, clarifies the significance of the research, and introduces the current state of the field. On this basis, it describes in detail data storage and analysis, HDFS, Map/Reduce, and how web crawlers and distributed web crawlers work. It then designs a Hadoop-based data crawling solution for the specific needs of the analysis and elaborates on its structure and layout. It also proposes an improved detection algorithm based on complex networks and time synchronization and describes the improvement scheme in detail. Finally, the algorithm is implemented and analyzed in simulation.
Keywords/Search Tags:Hadoop, Web crawling, Public opinion detection, Complex network, Time synchronization