
Research On Detection Of Third-party Libraries In Android Applications

Posted on: 2019-04-16 | Degree: Master | Type: Thesis
Country: China | Candidate: Y H Zhang | Full Text: PDF
GTID: 2428330572951520 | Subject: Engineering
Abstract/Summary:
As the Android ecosystem thrives, code is widely reused in Android apps in the form of third-party libraries. Recent research reveals that third-party libraries may introduce permission abuse, privacy risks, and other security threats. Nevertheless, current approaches based on white-lists or clustering fall far short of the required accuracy and efficiency, for three reasons: (1) there is a large variety of third-party libraries, and it is impossible to build a comprehensive white-list; (2) the package hierarchy of a third-party library inside an Android app is complicated, and the boundary of the library cannot be identified directly; (3) common code obfuscation and optimization techniques not only produce a large number of false positives in clustering results, but also defeat white-list-based approaches.

To address these three problems in research on Android third-party libraries, we propose a new approach and implement a detection tool. Researchers can use this tool to quickly and accurately detect the third-party libraries incorporated in an Android app. Specifically, we make the following contributions.

We propose a novel third-party library identification technique. First, we improve the boundary identification method based on the inter-app dependency graph. The dependencies include function calls, field references, interface implementations, and Dalvik annotations, of which the last two are proposed here for the first time. We discard the package hierarchy so that multiple third-party libraries sharing the same prefix are not identified as a single library. Next, we recognize the boundary of each third-party library instance in the dependency graph by computing weakly connected components. A feature is calculated for each instance, and third-party library instances are then identified accurately by clustering. Lastly, we propose three refinement steps to eliminate false positives from the initial result.

We propose a novel third-party library detection algorithm based on an index database and a greedy strategy. First, an index database is built for all identified third-party library instances, so that candidates can be found quickly through the index and pair-wise comparison of library candidates is avoided. The feature matching algorithm, based on a greedy strategy and similarity, ensures that the most likely candidate third-party library instance is matched first, which supports robustness and accuracy.

We implement and release a new tool named Lib Hawkeye. We systematically evaluate it in terms of the reliability of third-party library boundary identification, the effectiveness of refinement, the accuracy of third-party library detection, and processing efficiency. An experiment on 1,000 apps shows that, compared to existing tools, Lib Hawkeye precisely identifies at least 26.5% more libraries. We also run it on 3,987,206 Android apps, where it identifies a total of 1,414,173 instances of 22,492 different third-party libraries. On a random sample of the clustering results, accuracy reaches 93.25%, and the average F1 score for third-party library detection on 1,000 samples downloaded from four different application markets reaches 93.5%, which outperforms the current most advanced library detection tool by 40%. At the same time, Lib Hawkeye can handle conventional obfuscation techniques, and its performance overhead is reasonable and acceptable.
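The boundary identification step described in the abstract, splitting a dependency graph into weakly connected components, can be illustrated with a minimal sketch. This is not the thesis's implementation: the class names, the edge list, and the function `weakly_connected_components` are hypothetical, and the sketch assumes that edges between app code and library code have already been left out, since otherwise everything the app calls would fall into one component.

```python
from collections import defaultdict


def weakly_connected_components(nodes, edges):
    """Group nodes into weakly connected components.

    Edge direction is ignored: two classes land in the same component
    if any chain of dependencies links them, in either direction.
    """
    parent = {n: n for n in nodes}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for src, dst in edges:
        parent[find(src)] = find(dst)

    groups = defaultdict(set)
    for n in list(parent):
        groups[find(n)].add(n)
    # Sort for a deterministic result.
    return sorted(sorted(g) for g in groups.values())


# Hypothetical classes and dependency edges (calls, field references,
# interface implementations, annotations all collapse to plain edges here).
classes = [
    "com.a.Ad", "com.a.AdView", "com.a.net.Http",
    "com.b.Json", "com.b.Parser",
    "app.Main",
]
deps = [
    ("com.a.Ad", "com.a.AdView"),
    ("com.a.AdView", "com.a.net.Http"),
    ("com.b.Parser", "com.b.Json"),
]

components = weakly_connected_components(classes, deps)
for comp in components:
    print(comp)
```

Each printed group is one candidate library instance. Note that the grouping uses only the dependency edges, not the package names, which is in the spirit of the abstract's point about discarding the package hierarchy.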
Keywords/Search Tags: Android, Third-party library, Boundary identification, Refinement, Third-party library detection