
Research On Recognition Method For Imbalanced Multivariate Mouse Trajectory With Variable Length

Posted on: 2021-03-03
Degree: Master
Type: Thesis
Country: China
Candidate: L L Kang
GTID: 2428330614958332
Subject: Electronic and communication engineering
Abstract/Summary:
With the development of behavioral verification code technology, mouse trajectory recognition, typified by the drag-the-slider check, is widely used in human-machine verification products because it requires little data transmission and is difficult to crack by brute force. However, attackers can bypass detection by mass-producing human-like trajectories with black-market tools, and they keep upgrading their forged data to evade each improved detection technique. A mouse trajectory recognition model built with machine learning algorithms can therefore improve the detection rate of various machine behaviors in human-machine verification products.

A mouse trajectory is a set of track points in three dimensions (horizontal position x, vertical position y, and time t), obtained by sampling while the slider is dragged. Unlike a traditional time series, it is multivariate and of variable length, and the data are imbalanced with few labeled samples. Because of these characteristics, traditional time series classification methods cannot be applied directly to mouse trajectory recognition, and current mouse trajectory recognition methods have not addressed them systematically. This thesis therefore studies mouse trajectory recognition in depth and proposes a recognition method that combines a feature group hierarchy with semi-supervised learning. The main research contents are as follows:

1. To handle the multivariate and variable-length nature of mouse trajectories, a feature-based method is adopted to construct basic features and auxiliary features from different views. Specifically, features extracted from the t-x dimension serve as basic features that describe the differences between human and machine trajectories, and features extracted from the t-y dimension serve as auxiliary features that support the judgment and increase the confidence of trajectory recognition. In addition, this thesis proposes an improved wrapper feature selection algorithm based on random forest to reduce the feature dimension. First, the random forest variable importance measure is improved to solve the problem that the original measure gives too much weight to the majority class when the data are imbalanced. Then, the features are ranked in descending order of importance score, and the irrelevant features at the tail are removed. Finally, sequential backward selection is carried out: each feature is deleted in turn, traversing the ranking from back to front, and the resulting feature subset is evaluated with the wrapper evaluation method to decide whether to restore the deleted feature. Experimental results show that this method outperforms the traditional feature selection algorithm and effectively removes redundant features.

2. The recognition performance for mouse trajectories is still unsatisfactory because of data imbalance, few labeled samples, and related issues. To solve these problems, this thesis proposes a novel mouse trajectory recognition algorithm that combines a feature group hierarchy with semi-supervised learning. Specifically, at the feature level, a feature group hierarchical strategy adds the basic feature group and the auxiliary feature group to the model hierarchically. At the data level, a semi-supervised method is used to expand the data set, and the data imbalance is mitigated by random under-sampling. Finally, the two are combined to improve the recognition performance. The experimental results show that the precision of the method can reach 96.26%, the recall can reach 91.63%, and the F-measure can reach 94.35%, which demonstrates its effectiveness for mouse trajectory recognition.
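The rank, trim, and backward-selection steps described in item 1 can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the improved random forest importance measure is not reproduced, so importance scores are passed in as a plain dictionary, and the `evaluate` callback, the `tail_threshold` parameter, and the feature names are all hypothetical stand-ins for the wrapper evaluation and the tail cut-off.

```python
from typing import Callable, Dict, List, Sequence


def select_features(
    importance: Dict[str, float],
    evaluate: Callable[[Sequence[str]], float],
    tail_threshold: float = 0.0,
) -> List[str]:
    """Rank by importance, drop the tail, then run sequential backward selection."""
    # Step 1: rank features in descending order of importance score.
    ranked = sorted(importance, key=lambda name: importance[name], reverse=True)
    # Step 2: remove the irrelevant tail (the cut-off rule is an assumption).
    kept = [name for name in ranked if importance[name] > tail_threshold]
    if not kept:
        return []
    # Step 3: traverse from back (least important) to front, deleting one
    # feature at a time and keeping the deletion only if the score holds up.
    best = evaluate(kept)
    for feature in reversed(list(kept)):
        if len(kept) == 1:
            break
        candidate = [name for name in kept if name != feature]
        score = evaluate(candidate)
        if score >= best:
            kept, best = candidate, score  # deletion stands
        # otherwise the deleted feature is restored (kept is left unchanged)
    return kept


# Toy wrapper evaluation: two features carry signal, every feature costs a little.
# In practice this would be a cross-validated classifier score.
signal = {"vx_mean": 0.5, "t_total": 0.3}


def toy_evaluate(subset: Sequence[str]) -> float:
    return sum(signal.get(name, 0.0) for name in subset) - 0.01 * len(subset)


toy_importance = {
    "vx_mean": 0.4, "t_total": 0.3, "vx_std": 0.2, "x_range": 0.1, "noise": 0.0,
}
selected = select_features(toy_importance, toy_evaluate)
print(selected)  # ['vx_mean', 't_total']
```

In the toy run, `noise` is removed with the tail, `x_range` and `vx_std` are deleted because the score does not drop without them, and the two informative features are restored after their trial deletion.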
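The abstract does not define its F-measure. As a quick consistency check on the reported figures, the sketch below evaluates the general F-beta formula at the stated precision and recall. The balanced F1 score does not reproduce the reported 94.35%, while a precision-weighted variant with beta squared equal to 2/3 (equivalently 5PR / (2P + 3R)) does. That weighting is an inference from the numbers only, not something the abstract states.

```python
def f_beta(precision: float, recall: float, beta_sq: float = 1.0) -> float:
    """General F-measure: (1 + b^2) * P * R / (b^2 * P + R)."""
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


p, r = 0.9626, 0.9163  # precision and recall reported in the abstract

f1 = f_beta(p, r)                  # balanced F1
f_weighted = f_beta(p, r, 2 / 3)   # assumed weighting, equals 5PR / (2P + 3R)

print(round(100 * f1, 2))          # 93.89
print(round(100 * f_weighted, 2))  # 94.35
```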
Keywords/Search Tags: behavioral verification code, mouse trajectory recognition, imbalanced data, semi-supervised learning, random forest