
Research And Implementation Of A Mail Identification Technology For The Backbone Network

Posted on: 2013-12-26
Degree: Master
Type: Thesis
Country: China
Candidate: K Tian
Full Text: PDF
GTID: 2298330422974322
Subject: Computer Science and Technology
Abstract/Summary:
Mail identification, also called e-mail protocol traffic analysis, is the basis for improving the quality of network service, enhancing network control and ensuring the security of e-mail communications. Mail identification on massive data links currently faces two challenges, very high bandwidth and a flood of spam, and the traditional port identification and keyword identification methods cannot meet application requirements. In recent years, a number of scholars and researchers have approached mail identification from the perspectives of statistics and modeling, as well as deep packet processing, yet existing mail identification devices of various kinds still fall short overall in accuracy, scalability, robustness and performance.
This thesis targets mail identification in backbone network traffic. Building on a study of various mail identification methods, and taking into account the high bandwidth, complex data types and massive number of parallel flows that characterize core network traffic, it proposes a novel FPGA-based mail identification implementation in which hardware performs the main functions and software performs the secondary functions. The main work includes:
(1) Proposing an e-mail identification scheme for backbone links. The hardware portion of the scheme includes a five-tuple extraction module, a regular expression module, a sampling module, a TCAM module and a packet forwarding module; the software portion consists of multi-core processors and mail depth servers. The five-tuple extraction module takes the five-tuple from the IP packet header and passes it to the TCAM module for secondary identification. The regular expression module identifies the corresponding mail protocols and configures the five-tuple rules into TCAM entries. The TCAM module performs secondary identification on the incoming five-tuples and receives mail server address rules from the regular expression module and the multi-core module to update its rule library. The mail depth server analyzes and recovers the identified mails according to the corresponding commands.
(2) To address the missed identifications and false identifications that the port identification and keyword identification methods produce, this thesis has the FPGA and the TCAM cooperate to perform secondary identification on the five-tuples of mail protocols wherever possible. The TCAM module performs a first table lookup on the incoming five-tuple; a second lookup is then performed on the mail protocol type returned by the first lookup together with the five-tuple, and the mail protocol type is finally determined from the two lookup results. Secondary identification in the TCAM module avoids the scattered-port and port-range problems of port identification methods and improves the accuracy of matching, so as to satisfy the link-rate processing demands of backbone networks.
(3) Because the performance of general-purpose processors cannot keep pace with the rapid growth of network bandwidth, and in order to overcome the intrinsic limitations of software and satisfy the link-rate processing demands of backbone networks as far as possible, the mail identification system combines packet sampling, fast TCAM matching, accurate multi-core processor identification and rapid regular expression identification to achieve fast and timely mail protocol identification.
(4) The mail identification system successfully identified the POP3, SMTP and IMAP protocols in the large traffic flow of a data link under the test environment. In addition, for the same amount of traffic, the total time from identification to recovery of the mail protocol was half that of the traditional protocol stack recovery method, which demonstrates the improvement in identification rate achieved by the system. The POP3 identification test showed a system throughput of 7.6G. The experimental results show that the FPGA-based mail identification system proposed in this thesis can identify mail protocols accurately and rapidly in large backbone networks.
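
The cooperation between regular expression identification and TCAM matching described in the abstract can be illustrated with a minimal software sketch. This is not the thesis's FPGA design: the signature patterns (based on the standard SMTP, POP3 and IMAP server greetings), the `TernaryTable` class and the `identify` function are hypothetical names introduced here, and the sketch performs a single wildcard lookup rather than the two-pass lookup the thesis describes.

```python
import re
from typing import NamedTuple, Optional


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: int  # IP protocol number, 6 = TCP


# Assumed signatures: the standard greetings a mail server sends first.
SIGNATURES = {
    "SMTP": re.compile(rb"^220[ -].*SMTP", re.IGNORECASE),
    "POP3": re.compile(rb"^\+OK\b"),
    "IMAP": re.compile(rb"^\* OK\b"),
}


class TernaryTable:
    """Software stand-in for a TCAM: a None field acts as a wildcard."""

    def __init__(self) -> None:
        self.entries = []  # (pattern, protocol) pairs; first match wins

    def install(self, pattern: tuple, protocol: str) -> None:
        if (pattern, protocol) not in self.entries:
            self.entries.append((pattern, protocol))

    def lookup(self, key: FiveTuple) -> Optional[str]:
        for pattern, protocol in self.entries:
            if all(p is None or p == k for p, k in zip(pattern, key)):
                return protocol
        return None


def identify(table: TernaryTable, key: FiveTuple, payload: bytes) -> Optional[str]:
    """Return the mail protocol of a packet, or None if it is not recognized."""
    # Fast path: the five-tuple matches a rule that was learned earlier.
    hit = table.lookup(key)
    if hit is not None:
        return hit
    # Slow path: match the payload against the greeting signatures.
    for protocol, signature in SIGNATURES.items():
        if signature.match(payload):
            # The greeting comes from the server, so the source side of this
            # packet is the mail server. Install one rule per direction and
            # wildcard the client address and port.
            table.install((key.src_ip, None, key.src_port, None, key.proto), protocol)
            table.install((None, key.src_ip, None, key.src_port, key.proto), protocol)
            return protocol
    return None
```

In this sketch, once a server greeting has been matched, later packets to or from the same server address and port are classified by table lookup alone, regardless of their payload or of the port number the server happens to use. That is one way to read the abstract's claim that TCAM rules learned from regular expression matches avoid the limits of fixed-port identification.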
Keywords/Search Tags: Mail Identification, FPGA, TCAM, Regular Expressions, Packet Sampling