
Research And Implementation Of VoIP Audio Traceback Method Based On Big Data Technology

Posted on: 2019-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y P Wang
Full Text: PDF
GTID: 2348330542998889
Subject: Computer technology
Abstract/Summary:
In recent years, VoIP has developed rapidly thanks to its low construction cost, powerful functions, ease of use, and substantially reduced communication costs without sacrificing call quality. Research on VoIP traceback methods, however, lags behind, which has made VoIP a new medium for criminal activity such as financial fraud. The advent of the big data era offers a new approach to traceability research, turning VoIP traceback into a data-driven problem. Phonetic features are more difficult for scammers to tamper with, and when enough data is available, big data analysis allows researchers to identify the true source of a VoIP call to a certain extent.

To address this problem, this thesis proposes a scheme for implementing a VoIP audio traceback system based on big data. The proposed method uses SVM, random forest, and neural network models to extract and filter audio features that characterize network paths and are related to geographical locations. These features are then combined with the proposed implicit features based on an LSTM neural network. The method uses the resulting audio feature vector to train a VoIP tracing classifier that identifies the infrastructure and network paths, and in this way predicts the provenance of new calls. The method consists of five steps: network monitoring, packet analysis, feature engineering optimization, training of the VoIP traceability model, and result analysis. Among them, the key technology is the construction of the audio feature vector.

Considering the sample data type, sample size, feature dimension, and linear separability, this thesis chooses SVM and random forest models to train on and predict VoIP call provenance. For feature selection, this thesis mines and optimizes the feature vectors. It extracts the basic audio features and the second-order dynamic segmentation features to improve the audio features selected based on the random forest model. Furthermore, this thesis proposes implicit features based on the learned output of LSTM cell units, which introduces time-series relationships. The method removes the fully connected layer of the model and takes the learned output of the LSTM cells as the implicit features. Combining these with the features of highest contribution, the thesis obtains the final VoIP tracing audio feature vector. By training nonlinear SVM and random forest classifiers, the method reaches an F1 value of 91.9% on the VoIP traceback problem.

In addition, this thesis designs two experimental scenarios, a single-person scenario and a multi-person scenario, to collect VoIP call data and simulate real-world VoIP calls. Applying the VoIP traceability method designed in this thesis to these two scenarios, the experiments obtain VoIP traceability prediction accuracies of 93.8% and 84.8%.
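The feature-vector construction and the F1 metric described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the importance scores, the feature values, and the route labels "A" and "B" are all hypothetical. It assumes that per-feature importance scores (for example, from a random forest) and an implicit feature vector (for example, an LSTM's learned output) are already available, and it computes F1 for a single class, since the abstract does not say how F1 was averaged.

```python
from typing import List, Sequence


def top_k_indices(importances: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k largest importance scores, in index order."""
    ranked = sorted(range(len(importances)),
                    key=lambda i: importances[i], reverse=True)
    return sorted(ranked[:k])


def build_feature_vector(explicit: Sequence[float],
                         importances: Sequence[float],
                         implicit: Sequence[float],
                         k: int) -> List[float]:
    """Keep the k most important explicit features, then append implicit ones."""
    keep = top_k_indices(importances, k)
    return [explicit[i] for i in keep] + list(implicit)


def f1_score(y_true: Sequence[str], y_pred: Sequence[str], positive: str) -> float:
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero/undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical values, for illustration only.
explicit = [0.2, 1.5, 0.7, 3.1]         # explicit audio features of one call
importances = [0.05, 0.40, 0.10, 0.45]  # assumed per-feature importance scores
implicit = [0.9, -0.3]                  # assumed LSTM-derived implicit features

vector = build_feature_vector(explicit, importances, implicit, k=2)

y_true = ["A", "A", "B", "B", "A"]      # hypothetical true call routes
y_pred = ["A", "B", "B", "B", "A"]      # hypothetical classifier predictions
score = f1_score(y_true, y_pred, positive="A")
```

In a real pipeline the combined vectors would be passed to the SVM and random forest classifiers named in the abstract. The sketch only shows how the explicit and implicit parts might be joined and how an F1 value is derived from predictions.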
Keywords/Search Tags: VoIP audio traceback, Audio feature vector, Traceability classifier, LSTM neural network