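LibPass: Third-party Library Detection Method Based on Package Structure and Signature
Authors: Xu Jian, Yuan Qianting
Author biographies: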

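Xu Jian (1979-), male, Ph.D., professor, doctoral supervisor, CCF professional member. His main research interests include software analysis, intelligent operations and maintenance, and data mining.
Yuan Qianting (1994-), female, master's degree. Her main research interests include software analysis and data mining.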

Corresponding author:

Xu Jian, E-mail: dolphin.xu@njust.edu.cn

CLC number:

TP311

Funding:

National Natural Science Foundation of China (61872186, 61802205)


    Abstract:

    Third-party library (TPL) detection is an upstream task in Android application security analysis, and its accuracy significantly affects downstream tasks such as malware detection, repackaged application detection, and privacy leakage detection. To improve detection accuracy and efficiency, this study proposes LibPass, a TPL detection method based on package structure and signatures that builds on the idea of similarity comparison. LibPass combines three components in a pipeline: primary module identification, TPL candidate identification, and fine-grained detection. Primary module identification separates the binary code of the main program from that of the imported TPLs, which improves detection efficiency. On this basis, a two-stage detection method consisting of TPL candidate identification and fine-grained detection is proposed. TPL candidate identification exploits the stability of package structure features under obfuscation to improve detection accuracy on obfuscated applications, and it identifies candidate TPLs by quickly comparing package structure signatures, which significantly reduces the number of pairwise comparisons and thus improves detection efficiency. Fine-grained detection then applies a finer-grained but more costly similarity analysis to the selected candidates to accurately identify each TPL and its version. To validate the performance and efficiency of the method, three benchmark datasets are built, each evaluating a different detection capability, and experiments are conducted on them. The results are analyzed in depth in terms of detection performance, detection efficiency, and obfuscation resistance, and they show that LibPass achieves high detection accuracy and efficiency and can cope with a variety of common obfuscation operations.
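
The pipeline described in the abstract (filter out the main module, shortlist candidates with a cheap structural signature, then confirm with a costlier similarity check) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `Package` model, the prefix-based main-module filter, the layout hash used as a "package structure signature", the Jaccard similarity used for the fine-grained stage, and the 0.8 threshold are all hypothetical stand-ins for the actual LibPass components.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """Hypothetical model of a package tree taken from an app or a library."""
    name: str            # root package name; may be renamed by an obfuscator
    layout: tuple        # (relative depth, class count) for each sub-package
    methods: frozenset   # name-independent method descriptors


def structure_signature(pkg: Package) -> str:
    """Hash the layout only, so identifier renaming leaves the signature unchanged."""
    canonical = ",".join(f"{depth}:{count}" for depth, count in sorted(pkg.layout))
    return hashlib.sha256(canonical.encode()).hexdigest()


def jaccard(a: frozenset, b: frozenset) -> float:
    """Set similarity used here as a stand-in for the fine-grained comparison."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def detect(app_pkgs, main_prefix, library_db, threshold=0.8):
    # Step 1 (assumed heuristic): drop packages that belong to the main program.
    third_party = [p for p in app_pkgs if not p.name.startswith(main_prefix)]

    # Index known library versions by structural signature for fast lookup.
    index = {}
    for lib_id, lib_pkg in library_db.items():
        index.setdefault(structure_signature(lib_pkg), []).append(lib_id)

    results = {}
    for pkg in third_party:
        best = None
        # Step 2: candidate identification via signature lookup (no pairwise scan).
        for lib_id in index.get(structure_signature(pkg), []):
            # Step 3: costlier comparison, run only against the shortlisted candidates.
            score = jaccard(pkg.methods, library_db[lib_id].methods)
            if score >= threshold and (best is None or score > best[1]):
                best = (lib_id, score)
        if best is not None:
            results[pkg.name] = best
    return results


# Two versions of a made-up library share a layout but differ in one method.
library_db = {
    ("examplelib", "1.0"): Package(
        "com.example.lib", ((0, 3), (1, 2)),
        frozenset({"(Ljava/lang/String;)V", "()I", "(I)Z"})),
    ("examplelib", "1.1"): Package(
        "com.example.lib", ((0, 3), (1, 2)),
        frozenset({"(Ljava/lang/String;)V", "()I", "(J)Z"})),
}
app_pkgs = [
    Package("com.myapp", ((0, 5),), frozenset({"()V"})),  # main module
    Package("a.b", ((0, 3), (1, 2)),                       # renamed copy of 1.0
            frozenset({"(Ljava/lang/String;)V", "()I", "(I)Z"})),
]
result = detect(app_pkgs, "com.myapp", library_db)
print(result)
```

In this toy setup both library versions pass the signature lookup because their layouts match, and only the finer comparison separates version 1.0 from 1.1. The point of the split is that the expensive comparison runs against a handful of candidates rather than the whole library database.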

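Cite this article

Xu J, Yuan QT. LibPass: Third-party library detection method based on package structure and signature. Ruan Jian Xue Bao/Journal of Software, 2024, 35(6): 2880-2902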

History
  • Received: 2021-02-25
  • Last revised: 2021-06-09
  • Published online: 2023-07-26
  • Publication date: 2024-06-06