Design And Implementation Of A Load Balancing System With The Function Of Web Data Integration

Posted on: 2015-01-01
Degree: Master
Type: Thesis
Country: China
Candidate: S M Li
Full Text: PDF
GTID: 2298330467463296
Subject: Computer technology
Abstract/Summary:
With the rapid development and popularity of the Internet, the volume of web data is growing very quickly. For example, Twitter, a typical social network company, sees more than one billion messages posted and parses more than two TB of data. In e-commerce, Alipay completes more than 188 million transactions in one day. The search engine Google parses 20 PB of data every day. Gilder's law indicates that the volume of Web data will grow explosively. Meanwhile, the form of network traffic is becoming more and more complex. With the popularity of CDNs, a network session is no longer a single HTTP connection, which poses a challenge to traffic filtering systems. Today, Akamai (a CDN service provider) carries forty percent of the world's network traffic, and YouTube accounts for thirty percent of network traffic in North America.

To parse data at this scale, deep packet inspection (DPI) systems adopt distributed and multiprocessor parallel architectures. As a result, a load-balancing system is needed to distribute traffic to the DPI system's backend machines. This thesis focuses on the functions of such a load-balancing system, including Web data integration, traffic distribution, and traffic scheduling.

Because of the popularity of CDNs, a session usually contains more than one TCP connection. A traditional load-balancing system may therefore distribute the connections of one session to different backend machines, so that the session cannot be inspected as a whole. This thesis proposes a Web data integration mechanism to solve this problem. In addition, under this mechanism duplicate data is distributed to a single backend machine and analyzed only once, which saves computational resources. The IP-CLUSTER is the basic unit of dispatching.
Keywords/Search Tags: load balancing, web data integrating, traffic shunt, traffic scheduling
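
The abstract does not define the IP-CLUSTER or give the dispatching algorithm, so the following is only a minimal sketch of the general idea, not the thesis's implementation. It assumes that an IP-CLUSTER is a named group of server addresses belonging to one service (for example, the nodes of one CDN), and that the dispatch key is the pair (client IP, cluster). The class name `ClusterDispatcher`, the cluster table, and the backend names are all hypothetical.

```python
import hashlib
import ipaddress
from typing import Dict, List


class ClusterDispatcher:
    """Illustrative dispatcher: maps (client IP, server cluster) to a backend.

    Assumption (not from the thesis): a cluster is a named set of CIDR
    ranges, and all server IPs inside those ranges count as one service.
    """

    def __init__(self, backends: List[str], clusters: Dict[str, List[str]]):
        if not backends:
            raise ValueError("at least one backend is required")
        self.backends = list(backends)
        # Flatten the cluster table into (network, cluster name) pairs.
        self._networks = [
            (ipaddress.ip_network(cidr), name)
            for name, cidrs in clusters.items()
            for cidr in cidrs
        ]

    def cluster_of(self, server_ip: str) -> str:
        """Return the cluster name for a server IP, or the IP itself
        when it belongs to no known cluster."""
        addr = ipaddress.ip_address(server_ip)
        for network, name in self._networks:
            if addr in network:
                return name
        return server_ip

    def pick_backend(self, client_ip: str, server_ip: str) -> str:
        """Choose a backend from the client IP and the server's cluster.

        Ports are deliberately left out of the key, so parallel TCP
        connections of one session hash to the same backend.
        """
        key = f"{client_ip}|{self.cluster_of(server_ip)}".encode()
        digest = hashlib.sha256(key).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.backends)
        return self.backends[index]


if __name__ == "__main__":
    dispatcher = ClusterDispatcher(
        backends=["dpi-0", "dpi-1", "dpi-2", "dpi-3"],
        clusters={"cdn-a": ["203.0.113.0/24"]},
    )
    # Two connections from one client to different nodes of the same cluster.
    first = dispatcher.pick_backend("10.0.0.5", "203.0.113.10")
    second = dispatcher.pick_backend("10.0.0.5", "203.0.113.77")
    print(first == second)
```

A conventional balancer that hashes the full 5-tuple would likely separate these two connections, because the server addresses differ. Keying on the cluster alone, without the client IP, would concentrate all traffic for one service on one backend, which favors analyzing duplicate data once at the cost of balance; the abstract does not say which trade-off the thesis makes.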