
Robust Machine Learning In Large-Scale Networks

Posted on: 2022-03-22
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J R Xu
Full Text: PDF
GTID: 1520306830961559
Subject: Control Science and Engineering
Abstract/Summary:
Networks (i.e., graphs) are ubiquitous in the real world. The emergence of social platforms such as Twitter and Weibo has drawn public attention to networks. Twitter, Weibo, and similar platforms form social networks among millions of users; thousands of web pages on the Internet form a network of hyperlinks; and the spread of COVID-19 has drawn attention to protein-protein interaction networks. These networks are abstractions of the real world that describe the connections among different entities in a complex environment. Studying network data can help us understand complex real-life phenomena and can provide evidence to support decisions in human life. However, networks can be a double-edged sword. For example, although social networks offer a low-cost means of communication, they also facilitate the spread of rumours. Likewise, a slight anomaly in a network can easily propagate to neighboring nodes via relational information. Anomalies in networks can be attributed to two sources: real-world networks tend to be noisy, and networks are subject to malicious attacks. Both have raised serious concerns about the use of graph models. The study of model robustness in networks therefore plays a supportive role in security-critical domains, where it can both empower economic development and help maintain social stability.

Despite the recent advances achieved by machine learning models on networks, the robustness of these graph models has been largely overlooked. Because the nodes in a network depend on one another, graph models can easily be fooled by small perturbations of the network. In addition, the explosive growth of real-world data presents another challenge: efficient computation on large-scale networks. In this thesis, we study robust machine learning in large-scale networks. Our major contributions are summarized as follows.

First, to deal with noise in a network, we propose two network enhancement algorithms: a network denoising agent via reinforcement learning (NetRL) and an enhanced network model (E-Net). These two algorithms reconstruct a reliable denoised network from a noisy one. Since noise labels are unavailable, NetRL uses the downstream task to guide the network reconstruction process, while E-Net adopts a self-supervised learning method to learn from the network itself. Moreover, by proposing an RWR subgraph extraction mechanism, our model can be scaled up to large networks.

Second, to bridge the gap between theoretical graph adversarial attacks and real-world scenarios, we propose a novel and more realistic setting: the strict black-box graph attack (STACK). The success of STACK breaks the illusion that perfect protection of the victim model can block all kinds of attacks: even when the attacker is totally unaware of the underlying model and the downstream task, it is still able to fool graph models. To design such an attack strategy, we first propose a generic graph filter that unifies different families of graph-based models. We then introduce improved eigen-solution approximation theories to launch an effective attack. Experiments demonstrate that, even with no exposure to the model, STACK reduces Macro-F1 by 6.4% in node classification.

Third, to address the adversarial vulnerability described above, we propose an adversarially robust graph model, so that perturbations of the input network can be identified and blocked before the model is applied to different downstream tasks. Specifically, considering the complexity and interdependency of network structure, we first introduce the graph representation vulnerability (GRV), an information-theoretic robustness measure built upon the joint input space of network structure and node features. We also establish a provable connection between GRV and the robustness of models on the node classification task. We further provide a computationally efficient adversarial learning strategy that still largely enhances model robustness against adversarial attacks. Experimental results reveal that, under adversarial attacks, our model beats the best baseline by an average of +1.8%, +1.8%, and +45.8% on the node classification, link prediction, and community detection tasks, respectively.

Fourth, we target a real-world application scenario, online games, and propose a general anomaly detection framework termed NGUARD. Specifically, by introducing the sequential information of user behaviors, NGUARD can accurately identify 3.9 million anomalous users from 436 billion game players. NGUARD has been successfully deployed in 6 online games at Netease, where it helps the company save 80% of processing time and 50% of losses every month.

In general, noise and malicious attacks are potential threats to the security of graph models. This thesis reveals the empirical laws of noise and malicious attacks in networks, and provides robust machine learning methods for large-scale networks. To handle noisy networks, we propose two network enhancement algorithms. Since adversarial attacks can be regarded as the worst-case noise in a network, we further propose a novel strict black-box graph attack and validate that a blindfolded attacker can still launch effective attacks. To block these harmful graph attacks, we propose an adversarially robust graph model. These three studies focus on developing general approaches to enhance the robustness of graph models. In real scenarios, however, it is more practical to identify exactly what the anomaly is, and the challenge then becomes how to design a domain-specific solution. Thus, taking online games as an example, we propose a game-specific anomaly detection framework and successfully deploy it in Netease Games. In summary, this thesis offers researchers novel views of robust machine learning in networks, and offers industry practitioners new perspectives for handling anomalies in real-world scenarios.
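
The abstract mentions an RWR subgraph extraction mechanism for scaling to large networks but gives no details. As a rough illustration only, the following sketch assumes RWR stands for random walk with restart and extracts a small induced subgraph around a seed node by simulating such a walk and keeping the most frequently visited nodes. The function name `rwr_subgraph`, its parameters, and the visit-count ranking are assumptions made for this example, not the thesis's actual procedure.

```python
import random
from collections import Counter


def rwr_subgraph(adj, seed, size, restart_prob=0.3, num_steps=2000, rng=None):
    """Illustrative random-walk-with-restart subgraph extraction.

    adj: dict mapping each node to a list of its neighbors.
    seed: node around which the subgraph is extracted.
    size: maximum number of nodes to keep (including the seed).
    Returns the induced subgraph as an adjacency dict.
    All names and defaults here are hypothetical choices for this sketch.
    """
    rng = rng or random.Random(0)
    visits = Counter({seed: 1})
    current = seed
    for _ in range(num_steps):
        neighbors = adj.get(current, [])
        # Jump back to the seed with probability restart_prob,
        # or whenever the walk reaches a node with no neighbors.
        if not neighbors or rng.random() < restart_prob:
            current = seed
        else:
            current = rng.choice(neighbors)
        visits[current] += 1

    # Keep the seed plus the most frequently visited other nodes.
    ranked = [node for node, _ in visits.most_common() if node != seed]
    nodes = {seed, *ranked[: max(size - 1, 0)]}

    # Induced subgraph: keep only edges whose endpoints were both selected.
    return {u: [v for v in adj.get(u, []) if v in nodes] for u in nodes}


if __name__ == "__main__":
    toy = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2], 10: [11], 11: [10]}
    print(rwr_subgraph(toy, seed=0, size=3))
```

Because the walk keeps returning to the seed, frequently visited nodes tend to be close to it, so a downstream model could be run on this small neighborhood instead of the full network. The restart probability and step count trade locality against coverage and would need tuning in practice.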
Keywords/Search Tags:graph mining, network enhancement, adversarial attack, model robustness, anomaly detection