操作系统内核并发错误检测研究进展
作者:
作者简介:

石剑君(1991-),女,博士,CCF学生会员,主要研究领域为计算机体系结构,程序分析与优化.
石峰(1961-),男,博士,教授,博士生导师,主要研究领域为计算机体系结构.
计卫星(1980-),男,博士,副教授,主要研究领域为计算机体系结构,并行和高性能计算,程序分析与优化.

通讯作者:

计卫星,E-mail:jwx@bit.edu.cn

基金项目:

2018年工业互联网创新发展工程-工业微服务与工业APP部署应用工业互联网平台测试床


Recent Progress of Concurrency Bug Detection in Operating System Kernels
Author:
Fund Project:

2018 Industrial Internet Innovation and Development Project-The Project of Industrial Internet Platform Test Bed for Industrial Microservice and Application Deployment

  • 摘要
  • | |
  • 访问统计
  • |
  • 参考文献 [92]
  • |
  • 相似文献
  • |
  • 引证文献
  • | |
  • 文章评论
    摘要:

    并发错误是程序设计语言和软件工程领域的研究热点之一.近年来,针对应用程序并发错误检测的研究已取得了很大进展.但是由于操作系统内核的并发和同步机制复杂、代码规模庞大,与应用程序级并发错误检测相比,操作系统内核的并发错误检测研究仍面临巨大的挑战.对此,国内外学者提出了各种用于操作系统内核并发错误检测的方法.首先介绍了并发错误的基本类型、检测方法和评价指标,讨论了现有的并发错误检测方法和工具的局限性;然后,从形式化验证、静态分析、动态分析和静态动态相结合4个方面,对现有的操作系统内核并发错误检测的研究工作进行了分类阐述,并作了系统总结和对比分析;最后,探讨了操作系统内核并发错误检测研究面临的挑战,并对该领域未来的研究趋势进行了展望.

    Abstract:

    Concurrency bug detection is a hot research topic in the area of programming language and software engineering. In recent years, researchers have made great progress in concurrency bug detection of applications. However, as operating system (OS) kernels always have high concurrency, complex synchronization mechanisms, and large scale of source codes, researches on concurrency bug detection of OS kernels are more challenging than applications. To address this issue, researchers have proposed various approaches to detect concurrency bugs in OS kernels. This study first introduces the basic types, detection techniques, and evaluation indicators of concurrency bug detection, and the limitations of existing concurrency bug detection tools in OS kernels are discussed. Then, researches on concurrency bug detection in OS kernels are described from four aspects:Formal verification, static analysis, dynamic analysis, and combination of both static and dynamic analysis. Some typical approaches are comprehensively compared. Finally, the challenges of concurrency bug detection in OS kernels are discussed, and the future research trends in this field are prospected.

    参考文献
    [1] Leveson NG, Turner CS. An investigation of the therac-25 accidents. Computer, 1993,26(7):18-41.
    [2] Zhang W, Lim J, Olichandran R, Scherpelz J, Jin G, Lu S, Reps T. ConSeq:Detecting concurrency bugs through sequential errors. ACM SIGARCH Computer Architecture News, 2011,39(1):251-264.
    [3] Nasdaq's facebook Glitch came from race conditions. 2020. https://www.computerworld.com/article/2504676/nasdaq-s-facebook-glitch-came-from--race-conditions-.html
    [4] Yin Z, Yuan D, Zhou Y, Pasupathy S, Bairavasundaram L. How do fixes become bugs? In:Proc. of the Joint European Software Engineering Conf. and Symp. on the Foundations of Software Engineering (ESEC/FSE). 2011. 26-36.[doi:10.1145/2025113. 2025121]
    [5] Naik M, Aiken A, Whaley J. Effective static race detection for Java. In:Proc. of the 27th ACM SIGPLAN Conf. on Programming Language Design and Implementation (PLDI). 2006. 308-319.[doi:10.1145/1133981.1134018]
    [6] Cai Y, Zhang J, Cao L, Liu J. A deployable sampling strategy for data race detection. In:Proc. of the 24th ACM SIGSOFT Symp. on Foundations of Software Engineering (FSE). Seattle:ACM Press, 2016. 810-821.[doi:10.1145/2950290.2950310]
    [7] Fonseca P, Li C, Rodrigues R. Finding complex concurrency bugs in large multi-threaded applications. In:Proc. of the 6th Conf. on Computer Systems (EuroSys). Salzburg:ACM Press, 2011. 215-228.[doi:10.1145/1966445.1966465]
    [8] Weeratunge D, Zhang X, Sumner WN, Jagannathan S. Analyzing concurrency bugs using dual slicing. In:Proc. of the 19th Symp. on Software Testing and Analysis (ISSTA). Trento:ACM Press, 2010. 253-264.[doi:10.1145/1831708.1831740]
    [9] Zhang T, Jung C, Lee D. ProRace:Practical data race detection for production use. ACM SIGPLAN Notices, 2017,45(2):149-162.[doi:10.1145/3093336.3037708]
    [10] Liu H, Li G, Lukman JF, et al. DCatch:Automatically detecting distributed concurrency bugs in cloud systems. ACM SIGPLAN Notices, 2017,45(1):677-691.[doi:10.1145/3093337.3037735]
    [11] Bond MD, Coons KE, McKinley KS. PACER:Proportional detection of data races. ACM SIGPLAN Notices, 2010,45(6):255-268.[doi:10.1145/1809028.1806626]
    [12] Lu S, Park S, Zhou YY. Detecting concurrency bugs from the perspectives of synchronization intentions. IEEE Trans.on Parallel and Distributed Systems, 2012,23(6):1060-1072.[doi:10.1109/TPDS.2011.254]
    [13] Lu S, Park S, Zhou Y. Finding atomicity-violation bugs through unserializable interleaving testing. IEEE Trans. on Software Engineering, 2012,38(4):844-860.[doi:10.1109/TSE.2011.35]
    [14] Flanagan C, Freund SN. FastTrack:Efficient and precise dynamic race detection. ACM SIGPLAN Notices, 2009,44(6):121-133.[doi:10.1145/1543135.1542490]
    [15] Marino D, Musuvathi M, Narayanasamy S. LiteRace:Effective sampling for lightweight data-race detection. ACM SIGPLAN Notices, 2009,44(6):134-143.[doi:10.1145/1542476.1542491]
    [16] Singhal M. Deadlock detection in distributed systems. Computer, 1989,22(11):37-48.[doi:10.1109/10.1109/2.43525]
    [17] Serebryany K, Iskhodzhanov T. ThreadSanitizer:Data race detection in practice. In:Proc. of the Workshop on Binary Instrumentation and Applications (WBIA). New York:ACM Press, 2009. 62-71.[doi:10.1145/1791194.1791203]
    [18] Lucia B, Devietti J, Strauss K, Ceze L. Atom-aid:Detecting and surviving atomicity violations. In:Proc. of the 2008 Symp. on Computer Architecture (ISCA). Beijing:IEEE, 2008. 277-288.[doi:10.1109/ISCA.2008.4]
    [19] Lu S, Park S, Seo E, et al. Learning from mistakes-A comprehensive study on real world concurrency bug characteristics. In:Proc. of the 13th Conf. on Architectural Support for Programming Languages and Operating Systems. 2008. 329-339.[doi:10.1145/1346281.1346323].
    [20] Su X, Yu Z, Wang T, Ma P. A survey on exposing, detecting and avoiding concurrency bugs. Chinese Journal of Computers, 2015(11):2215-2233(in Chinese with English abstract).[doi:10.11897/SP.J.1016.2015.02215]
    [21] Atomic context and kernel API design. 2020. https://lwn.net/Articles/274695/
    [22] Fonseca P, Cheng Li, Singhal V, Rodrigues R. A study of the internal and external effects of concurrency bugs. In:Proc. of the 2010 IEEE/IFIP Conf. on Dependable Systems & Networks (DSN). Chicago:IEEE, 2010. 221-230.[doi:10.11897/10.1109/DSN. 2010.5544315]
    [23] Abbaspour Asadollah S, Sundmark D, Eldh S, Hansson H. Concurrency bugs in open source software:A case study. Journal of Internet Services and Applications, 2017,8(1):1-15.
    [24] Shi JJ, Ji WX, Wang YZ, et al. Linux kernel data races in recent 5 years. Chinese Journal of Electronics, 2018,27(3):556-560.[doi:10.1049/cje.2018.03.015]
    [25] Bai JJ, Lawall J, Chen QL, Hu SM. DCUAF:Effective static analysis of concurrency use-after-free bugs in linux device drivers. In:Proc. of the 2019 USENIX Annual Technical Conference (ATC). 2019. 255-268.
    [26] Williams A, Thies W, Ernst MD. Static deadlock detection for Java libraries. In:Proc. of the European Conf. on Object-oriented Programming (ECOOP). Berlin, Heidelberg:Springer-Verlag, 2005. 602-629.[doi:10.1145/1735970.1736040]
    [27] Burckhardt S, Kothari P, Musuvathi M, Nagarakatte S. A randomized scheduler with probabilistic guarantees of finding bugs. ACM SIGARCH Computer Architecture News, 2010,38(1):167-178.[doi:10.1145/1735970.1736040]
    [28] Cai Y, Chan WK. MagicFuzzer:Scalable deadlock detection for large-scale applications. In:Proc. of the 34th Conf. on Software Engineering (ICSE). 2012. 606-616.[doi:10.1109/ICSE.2012.6227156]
    [29] Lamport L. Time, clocks, and the ordering of events in a distributed system. Communications of the ACM, 1978,21(7):558-565.[doi:10.1145/3335772.3335934]
    [30] Sheng T, Vachharajani N, Eranian S, Hundt R, Chen W, Zheng W. RACEZ:A lightweight and non-invasive race detection tool for production applications. In:Proc. of the 33rd Conf. on Software Engineering (ICSE). Waikiki:ACM Press, 2011. 401-410.[doi:10. 1145/1985793.1985848]
    [31] Savage S. Eraser:A dynamic data race detector for multithreaded programs. ACM Trans. on Computer Systems, 1997,15(4):391-411.[doi:10.1145/265924.265927]
    [32] O'Callahan R, Choi JD. Hybrid dynamic data race detection. In:Proc. of the 9th ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming (PPoPP). New York:Association for Computing Machinery, 2003. 167-178.[doi:10.1145/781498. 781528]
    [33] Voung JW, Jhala R, Lerner S. RELAY:Static race detection on millions of lines of code. In:Proc. of the 6th Joint European Software Engineering Conf. and Symp. on the Foundations of Software Engineering (ESEC/FSE). Dubrovnik:ACM Press, 2007. 205-214.[doi:10.1145/1287624.1287654]
    [34] Pratikakis P, Foster JS, Hicks M. Locksmith:Context-sensitive correlation analysis for race detection. ACM SIGPLAN Notices, 2006,41(6):320-331.[doi:10.1145/1133255.1134019]
    [35] Deligiannis P, Donaldson AF, Rakamaric Z. Fast and precise symbolic analysis of concurrency bugs in device drivers (t). In:Proc. of the 30th IEEE/ACM Conf. on Automated Software Engineering (ASE). Lincoln:IEEE, 2015. 166-177.[doi:10.1109/ASE.2015. 30]
    [36] Park S, Lu S, Zhou Y. CTrigger:Exposing atomicity violation bugs from their hiding places. In:Proc. of the 14th Conf. on Architectural Support for Programming Languages and Operating Systems. 2009. 25-36.[doi:10.1145/1508244.1508249]
    [37] Sasturkar A, Agarwal R, Wang L, Stoller SD. Automated type-based analysis of data races and atomicity. In:Proc. of the 10th ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming (PPoPP). Chicago:ACM Press, 2005. 83-94.[doi:10.1145/1065944.1065956]
    [38] Lockdep. 2020. https://www.kernel.org/doc/Documentation/locking/lockdep-design.txt
    [39] KASAN. 2020. https://www.kernel.org/doc/html/latest/dev-tools/kasan.html
    [40] Syzkaller. 2020. https://github.com/google/syzkaller
    [41] KTSAN. 2020. https://github.com/google/ktsan
    [42] Luk CK, Cohn R, Muth R, Patil H, Klauser A, Lowney G, Wallace S, Reddi VJ, Hazelwood K. Pin:Building customized program analysis tools with dynamic instrumentation. ACM SIGPLAN Notices, 2005,40(6):190-200.[doi:10.1145/1064978.1065034]
    [43] Nethercote N, Seward J. Valgrind:A framework for heavyweight dynamic binary instrumentation. ACM SIGPLAN Notices, 2007, 42(6):89-100.[doi:10.1145/1273442.1250746]
    [44] KCSAN. 2020. https://www.kernel.org/doc/html/latest/dev-tools/kcsan.html
    [45] Jeong DR, Kim K, Shivakumar B, Lee B, Shin I. Razzer:Finding kernel race bugs through fuzzing. In:Proc. of the 2019 IEEE Symp. on Security and Privacy (SP). San Francisco:IEEE, 2019. 754-768.[doi:10.1109/SP.2019.00017]
    [46] Wang J, Zhan N, Feng X, Liu Z. Overview of formal methods. Ruan Jian Xue Bao/Journal of Software, 2019,30(1):33-61(in Chinese with English abstract). http://www.jos.org.cn/9825-1000/5652.htm[doi:10.13328/j.cnki.jos.005652]
    [47] Owicki S, Gries D. An axiomatic proof technique for parallel programs i. Acta Informatica, 1976,6(4):319-340.
    [48] Yang J, Hawblitzel C. Safe to the last instruction:Automated verification of a type-safe operating system. In:Proc. of the 31st ACM SIGPLAN Conf. on Programming Language Design and Implementation (PLDI). New York:Association for Computing Machinery, 2010. 99-110.[doi:10.1145/1806596.1806610]
    [49] Boogie:An intermediate verification language. 2020. https://www.microsoft.com/en-us/research/project/boogie-an-intermediate-verification-language/
    [50] Nelson L, Sigurbjarnarson H, Zhang K, Johnson D, Bornholt J, Torlak E, Wang X. Hyperkernel:Push-button verification of an os kernel. In:Proc. of the 26th Symp. on Operating Systems Principles (SOSP). Shanghai:ACM, 2017. 252-269.
    [51] Sigurbjarnarson H, Bornholt J, Torlak E, Wang X. Push-button verification of file systems via crash refinement. In:Proc. of the 12th USENIX Conf. on Operating Systems Design and Implementation. 2016. 1-16.[doi:10.1145/3132747.3132748]
    [52] Xu F, Fu M, Feng X, Zhang X, Zhang H, Li Z. A practical verification framework for preemptive OS kernels. In:Proc. of the Conf. on Computer Aided Verification. Cham:Springer-Verlag, 2016. 59-79.
    [53] Klein G, Andronick J, Elphinstone K, Murray T, Sewell T, Kolanski R, Heiser G. Comprehensive formal verification of an OS microkernel. ACM Trans. on Computer Systems (TOCS), 2014,32(1):1-70.[doi:10.1145/2560537]
    [54] Merz S. Model checking:A tutorial overview. In:Summer School on Modeling and Verification of Parallel Processes. Berlin, Heidelberg:Springer-Verlag, 2001. 3-38.
    [55] Visser W, Havelund K, Brat G, Park S. Model checking programs. In:Proc. of the Conf. of Automated Software Engineering (ASE). 2003,10(2):203-232.
    [56] Kim M, Hong S, Hong C, Kim T. Model-based kernel testing for concurrency bugs through counter example replay. Electronic Notes in Theoretical Computer Science, 2009,253(2):21-36.[doi:10.1016/j.entcs.2009.09.049]
    [57] Holzmann GJ, Joshi R. Model-driven software verification. ACM Computing Surveys (CSUR), 2009,41(4):1-54.[doi:10.1145/1592434.1592438]
    [58] Holzmann GJ. The model checker SPIN. IEEE Trans. on Software Engineering, 1997,23(5):279-295.[doi:10.1109/32.588521]
    [59] Flanagan C, Ave L, Freund SN. Type-based race detection for Java. In:Proc. of the ACM SIGPLAN 2000 Conf. on Programming Language Design and Implementation (PLDI). 2000. 219-232.[doi:10.1145/349299.349328]
    [60] Sasturkar A, Agarwal R, Wang LQ, Stoller SD. Automated type-based analysis of data races and atomicity. In:Proc. of the 10th ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming. 2005. 83-94.[doi:10.1145/1065944.1065956]
    [61] Vojdani V, Apinis K, Rõtov V, et al. Static race detection for device drivers:The Goblint approach. In:Proc. of the 31st IEEE/ACM Int'l Conf. on Automated Software Engineering (ASE). Singapore:ACM Press, 2016. 391-402.
    [62] Kildall GA. A unified approach to global program optimization. In:Proc. of the 1st Annual ACM SIGACT-SIGPLAN Symp. on Principles of Programming Languages (POPL). Boston:ACM Press, 1973. 194-206.[doi:10.1145/512927.512945]
    [63] Engler D, Ashcraft K. RacerX:Effective, static detection of race conditions and deadlocks. ACM SIGOPS Operating Systems Review, 2003,37(5):237-252.[doi:10.1145/1165389.945468]
    [64] Bai JJ, Wang YP, Lawall J, Hu SM. DSAC:Effective static analysis of sleep-in-atomic-context bugs in kernel modules. In:Proc. of the 2018 USENIX Annual Technical Conf. (USENIX ATC). 2018. 587-600.
    [65] King JC. Symbolic execution and program testing. Communications of the ACM, 1976,19(7):385-394.[doi:10.1145/360248. 360252]
    [66] Tan L, Zhou Y, Padioleau Y. aComment:Mining annotations from comments and code to detect interrupt related concurrency bugs. In:Proc. of the 33rd Int'l Conf. on Software Engineering (ICSE). IEEE, 2011. 11-20.[doi:10.1145/1985793.1985796]
    [67] Breuer PT, Valls MG. Static deadlock detection in the Linux kernel. In:Proc. of the Int'l Conf. on Reliable Software Technologies. Berlin, Heidelberg:Springer-Verlag, 2004. 52-64.
    [68] Hong S, Kim M. Effective pattern-driven concurrency bug detection for operating systems. Journal of Systems and Software, 2013, 86(2):377-388.[doi:10.1016/j.jss.2012.08.063]
    [69] Andrianov P, Mutilin V, Khoroshilov A. Predicate abstraction based configurable method for data race detection in Linux kernel. In:Proc. of the Int'l Conf. on Tools and Methods for Program Analysis. Cham:Springer-Verlag, 2017. 11-23.
    [70] Pozniansky E, Schuster A. Efficient on-the-fly data race detection in multithreaded C++ programs. In:Proc. of the 9th ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming (PPoPP). 2003. 179-190.[doi:10.1145/781498.781529]
    [71] Rajagopalan AK, Huang J. RDIT:Race detection from incomplete traces. In:Proc. of the 10th Joint European Software Engineering Conf. and Symp. on the Foundations of Software Engineering (ESEC/FSE). Bergamo:ACM Press, 2015. 914-917.[doi:10.1145/2786805.2803209]
    [72] Seyster J, Radhakrishnan P, Katoch S, Duggal A, Stoller SD, Zadok E. Redflag:A framework for analysis of kernel-level concurrency. In:Proc. of the Int'l Conf. on Algorithms and Architectures for Parallel Processing. Berlin, Heidelberg:Springer-Verlag, 2011. 66-79.
    [73] What is RCU? 2020. https://www.kernel.org/doc/Documentation/RCU/whatisRCU.txt
    [74] Chen QL, Bai JJ, Jiang ZM, Lawall J, Hu SM. Detecting data races caused by inconsistent lock protection in device drivers. In:Proc. of the 26th IEEE Int'l Conf. on Software Analysis, Evolution and Reengineering (SANER). Hangzhou:IEEE, 2019. 366-376.[doi:10.1109/SANER.2019.8668017]
    [75] The LLVM compiler infrastructure project. 2020. https://llvm.org/
    [76] Kernel-strider. 2020. https://github.com/euspectre/kernel-strider
    [77] Erickson J, Musuvathi M, Burckhardt S, Olynyk K. Effective data-race detection for the kernel. In:Proc. of the 9th USENIX Symp. on Operating Systems Design and Implementation (OSDI). 2010.
    [78] Jiang Y, Yang Y, Xiao T, Sheng T, Chen W. DRDDR:A lightweight method to detect data races in Linux kernel. The Journal of Supercomputing, 2016,72(4):1645-1659.
    [79] Blum B. Landslide:Systematic dynamic race detection in kernel space. 2012. https://www.pdl.cmu.edu/PDL-FTP/associated/CMU-CS-12-118.pdf
    [80] Magnusson PS, Christensson M, Eskilson J, Forsgren D, Hallberg G, Hogberg J, Larsson F, Moestedt A, Werner B. Simics:A full system simulation platform. Computer, 2002,35(2):50-58.[doi:10.1109/2.982916]
    [81] Fonseca P, Rodrigues R, Brandenburg BB. SKI:Exposing kernel concurrency bugs through systematic schedule exploration. In:Proc. of the 11th USENIX Symp. on Operating Systems Design and Implementation (OSDI). 2014. 415-431.
    [82] QEMU. 2020. https://www.qemu.org/
    [83] Zhou C, Wang M, Liang J, Jiang Y. Zeror:Speed up fuzzing with coverage-sensitive tracing and scheduling. In:Proc. of the Conf. of Automated Software Engineering (ASE). 2020.
    [84] Chen H, Guo S, Xue Y, Sui Y, Zhang C, Li Y, Wang H, Liu Y. MUZZ:Thread-aware grey-box fuzzing for effective bug hunting in multithreaded programs. In:Proc. of the 29th USENIX Security Symp. (USENIX Security). 2020. 2325-2342.
    [85] Vinesh N, Sethumadhavan M. ConFuzz-A concurrency fuzzer. In:Proc. of the 1st Int'l Conf. on Sustainable Technologies for Computational Intelligence. Singapore:Springer-Verlag, 2020. 667-691.
    [86] Xu M, Kashyap S, Zhao H, Kim T. Krace:Data race fuzzing for kernel file systems. In:Proc. of the 2020 IEEE Symp. on Security and Privacy (SP). San Francisco:IEEE, 2020. 1643-1660.
    [87] Gu R, Shao Z, Chen H, Wu X, Kim J, Sjöberg V, Costanzo D. CertiKOS:An extensible architecture for building certified concurrent OS kernels. In:Proc. of the USENIX Symp. on Operating Systems Design and Implementation. Savannah, 2016. 653-669.
    [88] Lu S, Tucek J, Qin F, Zhou Y. AVIO:Detecting atomicity violations via access interleaving invariants. ACM SIGOPS Operating Systems Review, 2006,40(5):37-48.[doi:10.1145/1168917.1168864]
    [89] Lochmann A, Schirmeier H, Borghorst H, Spinczyk O. LockDoc:Trace-based analysis of locking in the Linux kernel. In:Proc. of the 14th EuroSys Conf. 2019(EuroSys). Dresden:ACM Press, 2019. 1-15.[doi:10.1145/3302424.3303948]
    附中文参考文献:
    [20] 苏小红,禹振,王甜甜,马培军.并发缺陷暴露、检测与规避研究综述.计算机学报,2015,395(11):2215-2233.[doi:10.11897/SP.J. 1016.2015.02215]
    [46] 王戟,詹乃军,冯新宇,刘志明.形式化方法概貌.软件学报,2019,30(1):33-61. http://www.jos.org.cn/1000-9825/5652.htm[doi:10.13328/j.cnki.jos.005652]
    相似文献
    引证文献
    网友评论
    网友评论
    分享到微博
    发 布
引用本文

石剑君,计卫星,石峰.操作系统内核并发错误检测研究进展.软件学报,2021,32(7):2016-2038

复制
分享
文章指标
  • 点击次数:2893
  • 下载次数: 8028
  • HTML阅读次数: 7442
  • 引用次数: 0
历史
  • 收稿日期:2020-09-14
  • 最后修改日期:2020-10-26
  • 在线发布日期: 2021-01-22
  • 出版日期: 2021-07-06
文章二维码
您是第19893964位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京市海淀区中关村南四街4号,邮政编码:100190
电话:010-62562563 传真:010-62562533 Email:jos@iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号