Detecting Incompatible Third-party Library APIs in Python Based on Static Analysis
Author:
Affiliation:

CLC Number: TP311

    Abstract:

    Python's rich development ecosystem provides a large number of third-party libraries, which significantly improve development efficiency and software quality. Third-party library developers encapsulate underlying code so that upper-layer application developers can accomplish tasks quickly by calling the relevant APIs. However, the APIs of third-party libraries are not stable. Libraries are updated continuously for bug fixes, refactoring, and feature additions, and some APIs change incompatibly after an update, causing abnormal termination or inconsistent results in upper-layer applications. API compatibility of Python third-party libraries has therefore become a problem that needs to be solved. Existing studies have examined API compatibility issues in Python third-party libraries, but they have not fully classified the causes of these issues and therefore cannot report fine-grained causes. This study conducts an empirical study of the symptoms and causes of API compatibility issues in Python third-party libraries and proposes a targeted static detection method. First, it gathers 108 pairs of incompatible API versions by combining version update logs and regression tests across 6 version pairs of the flask and pandas libraries. Second, it analyzes the collected data empirically and summarizes the symptoms and causes of the compatibility issues. Finally, it proposes a static analysis-based detection method for incompatible Python APIs that reports the syntactic-level causes of each incompatibility. The method is evaluated experimentally on 12 version pairs of 4 popular Python third-party libraries. The results show that it performs well in terms of effectiveness, generalization, time performance, memory performance, and usefulness.
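
    To make the idea of syntactic-level incompatibility detection concrete, the following is a minimal sketch that compares the top-level function signatures of two versions of a module using Python's standard `ast` module. It is an illustration only and is not the method proposed in the paper. The function names `collect_signatures` and `diff_signatures`, the sample functions `load` and `save`, and the three checks performed (removed function, removed parameter, newly required parameter) are all assumptions made for this example.

```python
import ast


def collect_signatures(source: str) -> dict[str, tuple[list[str], list[str]]]:
    """Map each top-level function name to (all positional params, required params)."""
    signatures = {}
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            positional = [a.arg for a in node.args.posonlyargs + node.args.args]
            # Defaults always belong to the last len(defaults) positional parameters.
            n_required = len(positional) - len(node.args.defaults)
            signatures[node.name] = (positional, positional[:n_required])
    return signatures


def diff_signatures(old_source: str, new_source: str) -> list[str]:
    """Report signature changes that can break callers written against the old version."""
    old = collect_signatures(old_source)
    new = collect_signatures(new_source)
    issues = []
    for name in sorted(old):
        old_params, _ = old[name]
        if name not in new:
            issues.append(f"{name}: removed")
            continue
        new_params, new_required = new[name]
        for param in old_params:
            if param not in new_params:
                issues.append(f"{name}: parameter '{param}' removed")
        for param in new_required:
            if param not in old_params:
                issues.append(f"{name}: new required parameter '{param}'")
    return issues


# Hypothetical old and new versions of a library module (not real flask or pandas code).
OLD_VERSION = "def load(path, sep=','):\n    pass\n\ndef save(df, path):\n    pass\n"
NEW_VERSION = "def load(path, delimiter=','):\n    pass\n"

for issue in diff_signatures(OLD_VERSION, NEW_VERSION):
    print(issue)
```

    A check of this kind catches only signature-level changes. Behavioral incompatibilities, where the signature is unchanged but results differ, need deeper analysis of the function bodies.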

Citation

Shen K, Huang KF, Chen BH, Peng X. Detecting incompatible third-party library APIs in Python based on static analysis. Journal of Software, 2025, 36(4): 1435-1460

History
  • Received: December 18, 2023
  • Revised: March 20, 2024
  • Online: July 03, 2024