本文由“软件学科发展回顾特刊”特约编辑梅宏教授、金芝教授、郝丹副教授推荐
在信息化时代,人们对软件的质量要求越来越高.程序分析是保障软件质量的重要手段之一,日益受到学术界和产业界的重视.介绍了若干基本程序分析技术(抽象解释、数据流分析、基于摘要的分析、符号执行、动态分析、基于机器学习的程序分析等),特别是最近10余年的研究进展.进而介绍了针对不同类型软件(移动应用、并发软件、分布式系统、二进制代码等)的分析方法.最后展望了程序分析未来的研究方向和所面临的挑战.
In the information age, people are increasingly demanding high quality of software systems. Program analysis is one of the important approaches to guarantee the quality of software, and has been receiving attentions from academia and industry. This article mainly focuses on the research progress in program analysis in the last decade. First, the article introduces the basic program analysis techniques, including abstract interpretation, data flow analysis, summary-based analysis, symbolic execution, dynamic analysis, machine learning-based program analysis, etc. Then, it summarizes program analysis approaches for different types of software systems, including mobile applications, concurrent software, distributed systems, binary code, etc. Finally, the article discusses potential research directions and challenges of program analysis in the future.
随着信息化的不断发展, 软件对人们生活的影响越来越大, 在国民经济和国防建设中具有越来越重要的地位.如何提高软件质量, 保证其行为的可信性, 是学术界和工业界共同关注的重要问题.要解决该问题, 应加强软件开发过程管理, 在全生命周期采取各种方法和技术提升软件质量.由于软件系统的复杂性, 在编码实现完成之后, 甚至在软件产品发布、被广泛使用之后, 往往还有各种各样的缺陷和漏洞.各种软件测试和分析技术, 是发现这些缺陷、漏洞的有效手段.
不同于很多黑盒测试方法, 软件分析技术可深入软件系统内部, 细致地考察其结构及各个组成部分, 进而发现其各种特性, 如性能、正确性、安全性等.软件分析已逐渐发展成为程序语言和软件工程领域的一个重要研究方向, 并已影响到信息安全等相关领域.进入21世纪以来, 该方向进展显著, 研究成果不断涌现.软件分析不仅可用来发现软件中的缺陷、漏洞, 还可用于软件理解、修复以及测试用例生成等方面[
软件包含程序和文档.由于篇幅所限, 本文主要介绍程序分析方面的研究, 特别是最近10余年该方向的一些重要工作.本文将介绍程序分析的基本概念(分析对象、难度、评价等)、主要的基础性分析技术、针对不同类型分析对象的分析方法等.最后, 简要提及一些挑战性问题以及新兴的研究方向.
本节介绍程序分析的基本概念、程序及其性质多样性等, 进而解释程序分析的难度及其评价.
程序分析指的是:对计算机程序进行自动化的处理, 以确认或发现其特性, 比如性能、正确性、安全性等[
传统上, 程序分析包括各种静态分析技术(类型检查、数据流分析、指向分析等)与动态分析技术:所谓的静态分析, 是指对程序代码进行自动化的扫描、分析, 而不必运行程序; 与静态分析相对应的是动态分析技术, 其利用程序运行过程中的动态信息, 分析其行为和特性.
与程序分析密切相关的两类方法是形式验证及测试:前者试图通过形式化方法, 严格证明程序具有某种性质, 目前, 其自动化程度尚有不足, 难于实用; 测试方法多种多样, 在实际工程中广泛使用, 这些方法也是以发现程序中的缺陷为目的, 它们一般都需要人们提供输入数据, 以便运行程序, 观察其输出结果.
程序分析内涵比较丰富, 具有多样性.
1) 被分析对象的多样性
程序分析的对象既可以是源代码, 也可以是二进制可执行代码.对于源程序, 根据其实现语言不同, 可能需要不同的分析技术.比如, C程序和Java程序的分析, 其技术可能就不一样:后者需要处理一些面向对象的结构.即使是同一种高级语言编写的程序, 也可能具有不同的特性.比如, 程序中是否有比较多的指针?是否使用并发控制结构、通信机制?是否有大量的数值计算.从规模看, 被分析的程序可以是几百行的代码, 也可以是几十万行乃至于上千万行代码.
2) 程序性质的多样性
对不同类型的程序, 在不同的应用环境下, 人们关注的性质也不一样.以C程序为例, 很多C程序会用到指针, 用于动态分配、释放内存.对这样的程序, 人们会关注空指针、内存泄漏等问题.但对某些嵌入式系统来说, 程序中很少动态分配内存空间, 却可能要处理中断.另外一些C程序可能有大量的数值(浮点数)计算, 程序员可能更关注其中是否存在溢出问题.
由于程序语言的复杂性、程序性质的多样性, 自动化的程序分析往往具有相当大的难度.根据Rice定理:对于程序行为的任何非平凡属性, 都不存在可以检查该属性的通用算法[
鉴于程序分析的理论难度以及被分析程序的复杂度, 我们往往不要求对程序进行完全精确的分析.这就可能带来误报(false positive)、漏报(false negative)等问题:前者指的是报警信息指出的缺陷实际上不存在、不可能发生; 后者指的是程序中存在的某个缺陷, 没有被程序分析工具所发现.
对程序分析技术或工具进行评价, 不仅要看其分析对象的规模、复杂度, 分析过程的效率, 还要看其对用户的要求, 发现缺陷的严重程度, 以及误报率、漏报率等.
本节主要介绍一些近期有重要进展的程序分析方法和技术, 包括抽象解释、数据流分析、基于摘要的过程间分析、符号执行、动态分析、基于机器学习的程序分析等.这6项方法和技术的关系如
主要的程序分析技术
Main program analysis techniques
程序分析总体可以分为静态分析和动态分析, 涉及的基础理论包括抽象解释、约束求解、自动推理等.而静态分析的关键技术包括数据流分析、过程间分析、符号执行等.最后, 近期机器学习技术被用于提升各种不同的程序分析技术.
除了这6种基本程序分析技术之外, 还有一些其他广泛使用的程序分析技术也在近期有显著进展, 如污点分析、模型检验、编程规则检查[
抽象解释是一种对程序语义进行可靠抽象(或近似)的通用理论[
抽象域是抽象解释框架下的核心要素, 一般是面向某类特定性质设计的.到目前为止, 已出现了数十种面向不同性质的抽象域.其中, 具有代表性的抽象域包括区间抽象域、八边形抽象域、多面体抽象域等数值抽象域.此外, 还出现了若干开源的抽象域库, 如APRON[
在提高分析精度方面, 基于加宽的不动点迭代过程所导致的分析精度损失问题和抽象域本身表达能力的局限性是当前面临的主要问题.
● 在缓解加宽所导致的精度损失方面, 近年来的研究进展大致可以分为两种思路:(1)使用基于策略迭代[
● 在弥补抽象域本身表达能力的局限性方面, 最近的研究进展可分为3类:(1)将符号化方法与抽象方法结合起来, 利用SMT求解器[
在提高可扩展性方面, 如何有效降低分析过程中抽象状态表示与计算的时空开销是目前考虑的主要问题.在这方面, 最近的研究进展包括:
● 利用变量访问的局部性原理, 降低当前抽象环境中所涉及的变量维数, 并根据数据流依赖的稀疏性, 降低抽象状态的存储开销和传播开销.基于该思想, 最近, Oh等人提出了一种通用的全局稀疏分析框架, 在不损失分析精度的前提下能够显著降低时空开销, 并在静态分析工具Sparrow上进行了应用, 取得了显著的可扩展性提升效果[
● 利用矩阵分解等在线分解优化策略来对抽象域操作的算法进行优化.基于该思想, 最近, Singh等人[
最近, Singh等人[
此外, 将抽象解释应用到特定类型程序或特定性质的分析验证方面的研究也取得了不少进展, 主要的关注点包括复杂数据结构自动分析的支持、不同谱系目标程序的支持、活性性质分析的支持.在复杂数据结构的自动分析方面, 最近的研究重点关注针对数组内容的精确分析[
未来, 抽象解释技术将进一步在新的架构、语言、应用等实际需求驱动下不断发展.值得关注的方向包括对弱内存模型的分析验证[
数据流分析通过分析程序状态信息在控制流图中的传播来计算每个静态程序点(语句)在运行时可能出现的状态.经典的数据流分析理论[
数据流分析为抽象解释的一个特例, 其计算的状态信息(抽象域)局限于有限高度的格〈
1) IFDS/IDE数据流分析框架
IFDS分析框架由Reps等人[
Reps等人[
2) 基于值流图的稀疏数据流分析方法
传统的数据流分析在程序控制流图上将所需计算的状态信息在每个程序点传播得到最终分析结果.此过程通常存在较多冗余操作, 对效率, 特别是过程间数据流分析效率会有很大影响.为了进一步提高数据流分析的效率, 近年来, 研究者们提出了多种稀疏的分析方法, 从而无需计算状态信息在每个程序点的传播即可得到与数据流分析相同的结果.该类分析技术[
摘要(summary)是可复用的程序模块分析结果, 能够简要地刻画模块的外部行为.创建摘要和利用摘要开展分析的过程称为摘要分析.在编写程序的过程中, 一个程序模块往往需要调用其他程序模块.对主调(caller)模块的分析必然涉及到对被调模块(callee)的分析.与复用被调模块代码可提高软件生产效率类似, 摘要分析期望复用被调模块的分析结果也能提高分析主调模块的效率.通过基于摘要的程序分析技术, 我们将被复用模块的分析结果被构造成为摘要, 并在分析主调模块时对其实例化, 以加快分析速度.
传统摘要分析的研究主要关注模块化(modular)分析, 即:在分析过程中将软件划分为多个模块, 对各个模块分别分析和创建摘要, 然后合并摘要获得整体的分析结果.这样, 在分析过程中创建摘要的技术也称为在线摘要技术.摘要分析在最近10年的主要进展是离线摘要技术, 即:在程序分析之前对常用代码库生成摘要, 从而加快使用这些代码库的客户端的分析速度.离线摘要根据其自动化程度的不同, 可以分成人工编写摘要和自动生成摘要, 分别在接下来的两小节中加以介绍.
1) 人工编写摘要
对于程序中一些难以分析的代码, 例如第三方的库和系统代码等, 可采用人工编写摘要的方式近似代码的行为.这种技术原先应用于程序验证中, 比如ESC/Java通过用户提供的前置条件和后置条件来对程序进行模块化验证.FlowDroid[
2) 自动生成摘要
在程序分析过程前创建摘要, 通常是对程序常用的模块(如Java代码库)进行摘要分析, 随后加速客户代码的分析.诸多的未知信息[
● 基于简化的分析:这个方案的思路是针对环形依赖导致的大模块上的分析通过标准的编译技术(如partial evaluation)进行化简, 提高分析速度.每个模块上的分析程序可以被当成是一个函数, 然后, 把这些函数看成是标准程序之后就可以应用普通的程序分析进行化简了;
● 最坏情况分析:每个模块分别分析, 对于该模块所依赖的其他模块作最坏情况假设.这样的分析会导致不精确, 并且在多数情况下严重不精确;
● 符号化关系型分析:把未知的信息做成符号化信息, 然后把程序分析转变成符号化分析.所有与符号化相关的部分不进行计算.
Smaragdakis等人[
a) 对于连续赋值, 比如
b) 去掉与客户端无关的赋值语句.
这两个优化能省掉4%~69%的时间.该方法仅支持上下文不敏感且流不敏感的分析.
Rountev等人[
Tang等人[
由于模块含有大量的未知量, 建立考虑所有情况的摘要在许多分析任务中非常困难.所以, 一些工作通过动态程序分析或训练的方式建立了部分摘要.Palepu等人[
上述工作都是针对整个代码库进行摘要分析.当精度要求较高时, 分析完整的代码库比较困难.部分工作只对单个函数进行分析.Yorsh等人[
Dillig等人[
符号执行[
在约束条件可被判定的情况下, 符号执行提供了一种系统遍历程序路径空间的手段.符号执行中的程序路径精确刻画了程序路径上的信息, 可基于路径信息开展多种软件验证确认阶段的活动, 包括自动测试、缺陷查找以及部分的程序验证等[
最初, 符号执行主要用于程序自动测试, 但受制于当时的计算能力和求解技术与工具的能力, 符号执行技术并没有得到实际的应用.近年来, 随着计算能力的提高、SAT/SMT技术和工具的蓬勃发展[
不同于传统的符号执行, 动态符号执行[
目前, 符号执行技术仍然面临提高可扩展性(scalability)与可行性(feasibility)这两方面的挑战:可扩展性挑战是指如何在有限资源(比如时间、内存)的前提下提高符号执行的效率, 更快地达成分析目标; 可行性挑战是指如何支持不同类型分析目标的符号执行, 权衡分析的可靠性与精确性.已有的研究工作基本都是围绕这两个方面展开.
在可扩展性方面, 路径空间爆炸和约束求解是主要的两大难题.在缓解路径空间爆炸方面, 目前已有的工作基本分为两个思路:(1)在具体目标下提供高效的搜索策略, 使符号执行分析更快地达到目标, 包括提高程序的覆盖率[
在可行性方面, 环境建模和多形态分析目标的支持是目前的主要问题.在环境建模方面, 已有的工作基本都是在分析的精确性、可靠性、建模工作量以及可扩展性之间进行权衡和折中, 包括手工建模[
同时, 面向新的分析需求, 也有一些新的符号执行技术出现, 其中比较有代表性的是概率符号执行技术[
随着符号执行技术近些年的发展, 符号执行技术在工业界也得到了实际的采纳和应用, 其中有代表性的工作有:微软公司把自己开发的二进制动态符号执行工具SAGE用于Win 7的测试, 发现了文件模糊测试中1/3的缺陷[
未来, 符号执行技术将进一步在软件工程、安全、系统、网络等相关领域的实际需求驱动下不断发展.面向大规模软件的高效符号执行方法、技术和工具将是下一步研究所面临的挑战和重点.同时, 符号执行搜索策略的更加智能化[
动态分析是指通过在指定测试用例下运行给定的程序, 并分析程序运行过程或结果, 用于缺陷检测等.与静态分析相比, 动态分析能够更好地处理编程语言中的动态属性, 例如指针、动态绑定、面向对象语言中的多态与继承、线程交替等.动态分析在一定程度上弥补了静态分析的不足之处.
基本的程序动态分析可以简单地分为在线(online)动态分析与离线(offline)动态分析[
程序动态分析, 从报告缺陷的准确性出发, 也可以分为:(1)缺陷动态检测; (2)缺陷动态预测.一般所指的缺陷动态查找方法是缺陷动态检测, 是指在程序的运行过程中某个缺陷已经发生了之后的检测, 即, 缺陷已经反映在程序行为中或者运行结果中.其关注点是:如何设计检测算法等, 保证将所有已经发生的缺陷检测出来.然而, 程序中的不同缺陷不会都在有限的若干测试用例中被触发.因此, 为了提高缺陷动态检测的有效性, 需要从程序若干次运行中预测出该程序潜在的某些行为, 并判断这些潜在行为是否会触发缺陷.这种方法称为缺陷的动态预测.例如:某个程序中存在一个缓冲区溢出缺陷, 该缺陷只有在输入input大于128时发生.那么, 如果我们可以从某个input不大于128的输入下预测对应行为是一个潜在缺陷, 其可能会在input大于128时发生.进一步地, 我们可以构造一个满足预测条件的输入, 再次运行程序去触发(验证)缺陷.缺陷的动态预测在并发程序的动态分析中经常用到, 主要原因之一是:并发程序的运行除了与输入有关外, 还与程序中线程之间的调度有关.程序缺陷的动态预测还会涉及到一些其他技术, 例如, 测试用例生成、约束求解、缺陷重现、线程调度等, 其分析过程与缺陷动态检测相比更为复杂.
经典的程序分析技术提供相对精确的分析结果, 但同时也带来了包括路径组合爆炸、误报率较高等一系列在实践中不可避免却难以应对的问题.随着近年来通用计算设备能力的提高, 海量的程序执行数据被存储和管理; 研究者采用机器学习、统计分析等系列技术提升现有的程序分析能力.
现有的程序分析技术依赖于一定的启发式策略, 如符号执行技术中的深度(或广度)优先搜索.在动态分析技术中, 为了达到分析精度, 这些策略可能带来较高的计算成本.基于现有的标记数据(即已知分析结果的程序)建立学习模型, 机器学习技术可以学习新的可自适应的策略, 减少启发式策略带来的成本消耗.Li等人[
在静态分析技术中, 分析工具在获得较高的分析精度的同时, 往往带来过高的误报率.研究者建立了机器学习模型, 以剔除潜在的误报并减少静态分析的成本.Heo等人[
本节介绍程序分析技术在一些重点领域软件的应用, 包括移动应用软件、并发软件、分布式系统、二进制代码等方面的重要应用.
近10年来, 以智能手机为代表的智能终端以惊人的速度得以普及.目前, 以Android和苹果iOS系统为代表的智能手机数量和使用频度已远超个人计算机, 成为最流行的个人电子设备.通过移动支付、购物和社交等各式各样的移动应用, 智能手机已经深度渗透进了人们的日常活动中.相应地, 这些应用的运行状态和广大用户的日常生活和工作有着直接的关系, 也在很大程度上影响到了整个互联网的运转.为此, 研究人员通过对移动应用软件的程序分析, 来检测其是否具有所期望的以安全性为核心的各种特性.
1) 污点分析技术
由于智能手机的使用特点, 移动应用的安全性分析常常可以归结为应用代码上跟踪敏感数据流的动态/静态污点分析(taint analysis)问题.
● 动态污点分析:最具代表性的Android应用动态污点分析技术是TaintDroid[
● 静态污点分析:FlowDroid[
2) 面向移动应用特性的程序分析技术
与桌面应用的分析相比, 移动应用的分析还会涉及到一些与移动平台特性密切相关的分析任务, 如组件间通信(inter-component communication)的分析[
3) 移动应用分析辅助技术
为了达到较好的分析效果, 仅仅关注于应用软件分析技术本身是不够的, 研究人员还发展出一些分析辅助技术来进一步提升分析效果.为了在动态分析工作中获得较高的覆盖率, 学术界研究出一些系统化的应用运行驱动技术, 如AndroidRipper[
自从计算机进入并发处理时代, 系统效率得到显著提升.然而, 并发程序(如多线程程序)的使用, 导致并发问题(或者并发缺陷)的存在且难以解决.其难点是:并发程序在运行时, 多个任务之间的交替运行空间巨大, 而搜索该空间是NP难的.2007年图灵奖、2013年图灵奖、2016年哥德尔奖等都授予了为解决并发问题而做出突出贡献的学者.他们提出的方法, 例如模型检验[
由于多线程程序运行时各个线程之间的交替执行(interleaving)[
并发缺陷的检测包括静态分析[
1) 全面调度技术
动态分析依赖于程序的具体运行来检测并发缺陷, 因此其无法检测那些没有运行或者被隐藏的并发缺陷.引入随机性则可以增加动态分析的检测能力.ConTest[
PCT[
2) 限定调度
完全探索一个并发程序所有可能的运行空间是NP难的.近年来, 研究人员提出了限定调度技术来缓解这一难题.CHESS[
3) 启发式调度
并发缺陷主要包括死锁(deadlock)、数据竞争(data race)和原子性违反(atomicity violation)[
针对数据竞争检测, FastTrack[
原子性违反检测的难点, 首先是如何确定程序中的内存访问应该是原子执行的.一旦原子性区域被识别, 其违反的检测就会比较容易.AVIO[
死锁不同于数据竞争和原子性违反, 其涉及到线程之间的同步.同样, 死锁一旦发生, 则其很容易被检测到.但是如何从一个没有死锁发生的程序中预测潜在死锁, 同样是并发程序分析中的一个难题.其中的主要难题是如何高效地检测死锁.iGoodLock[
4) 消除误报与漏报
并发缺陷的检测通常会有误报[
另一方面, 动态分析必然产生漏报, 其中一个原因就是缺乏并发测试用例.ConTeGe[
随着越来越多的数据与计算从本地向云端迁移, 大规模分布式系统逐步得到广泛使用, 比如分布式存储系统、分布式计算框架、同步服务、集群管理服务等.典型的大规模分布式系统包括HDFS、Hbase、Hadoop、Spark、ZooKeeper、Mesos、YARN等.与传统的单机系统相比, 大规模分布式系统具有较好的可扩展性和容错能力, 获得同样计算能力的成本较低.因此, 分布式系统在应对复杂业务场景以及单机无法完成的大规模计算任务时具有较大的优势, 已成为支撑大规模网络应用不可或缺的一部分.大规模分布式系统已在大型互联网公司, 如阿里巴巴、谷歌、百度等得到广泛应用.
大规模分布式系统必须管理大量分布式软件组件、硬件及其配置, 使得该类系统异常复杂.因而, 大规模分布式系统不可避免地会发生故障, 并影响到大量终端用户, 降低了它的可靠性和可用性[
1) 分布式系统中缺陷的实证研究
近年来, 为了提高复杂分布式系统的可靠性, 研究人员对分布式系统中的各种缺陷进行实证研究以加深对这些缺陷的理解.CBS[
2) 基于动态/静态分析的分布式系统缺陷检测
由于分布式系统的复杂性, 很少有动态/静态分析工具直接检测分布式系统特有的缺陷, 比如由于消息引起的并发缺陷、节点失效引起的缺陷、超时相关缺陷等.得益于近年来对分布式系统中缺陷的深入分析, 开发人员构建了一系列新的开发工具.DCatch[
3) 分布式系统验证与模型检验
大规模分布式系统由数量众多的计算节点组成, 运行不同的复杂协议.建立可验证的分布式系统是避免错误的一个重要手段.Verdi[
模型检验是分析现存分布式系统可靠性的重要手段.MODIST[
4) 基于失效注入的分布式系统分析技术
在分布式系统中, 计算节点可能由于硬件、断电、操作系统错误等发生失效.分布式系统需要从节点失效中正确恢复, 以保证系统的正确性.但是由于节点失效时机、失效恢复过程难以测试, 研究人员开发了一系列工具分析分布式系统的失效恢复行为.SETSUD[
二进制分析是经久不衰的研究话题.尽管越来越多的程序是用解释型语言(Python、JavaScript等)编写, 但是它们仍然需要二进制程序来解释执行, 或者通过即时编译(JIT)技术转换为二进制代码执行.另一方面, 传统的操作系统等对性能要求较高的应用仍然是以C/C++等编译型语言编写.此外, 物联网中大量设备的计算资源有限, 运行的都是C等语言编写的二进制程序.再者, 从安全的角度来讲, 二进制程序中可能引入源码中不存在的安全问题, 例如Thompson提出的编译器后门问题[
二进制分析的首要任务是反汇编, 即识别二进制程序中的代码和数据, 解析函数间调用关系, 及函数内的控制流图.最直接的反汇编方法是线性扫描[
二进制分析面临的另一个难题是高级语义恢复.恢复二进制程序语义可以用于多种安全应用, 例如漏洞挖掘和安全防护.与源码程序不同, 二进制程序中大量信息缺失, 例如函数和变量名、函数类型、数据结构定义、虚函数调用与类信息等.Chua等人[
二进制程序分析通常需要对二进制程序进行代码插装或改写.主流有3类二进制插装方案:在原始二进制程序中直接静态修改、将二进制程序提升到中间表示中进行修改或者在代码执行过程中动态修改.第1类方案, 静态二进制插装, 面临的最大挑战是反汇编的准确率, 基于不准确的反汇编结果进行代码插装可能导致程序执行异常.Smithson等人[
二进制分析是许多安全分析的基础, 具体的安全应用场景包括漏洞挖掘、恶意代码识别、安全防护等.在漏洞挖掘方面, 二进制分析方案通常通过匹配漏洞模式的方式来挖掘漏洞.Machiry等人[
程序分析是一项重要的基础性技术, 它不仅能够直接用于发现各类软件中的缺陷、改进软件质量, 还可以在软件测试、调试、维护以及缺陷预测等方面发挥作用, 包括测试数据生成、软件维护与程序理解、程序修复等.例如, 软件测试的一个重要问题是:如何自动构造比较合适的测试数据, 从而达到一定的测试覆盖率.Xu和Zhang[
随着新型软件形式的出现, 程序分析技术也面临一些新的问题.下面针对新兴的智能合约及机器学习软件上的程序分析进行讨论.
最近数年, 源自比特币(bitcoin)的区块链(blockchain)技术已成为各行各业关注的焦点.区块链可被认为是一种分布式、去中心化的计算与存储架构, 除了支持数字资产外, 还能被用于商品溯源、信用管理乃至游戏等各种各样的去中心化应用(decentralized application, 简称DApp).为了方便DApp的开发, 一些基于区块链技术的区块链平台应运而生, 其中最为引人注目的是以太坊(Ethereum)[
智能合约由Solidity语言开发, 被编译为字节码后在以太坊虚拟机EVM上运行.由于智能合约可能承载着数额巨大的数字资产, 自然也成为了黑客的攻击目标.针对智能合约的攻击已经造成了惊人的财产损失[
近年来, 机器学习及其他智能技术在行业软件中得到了广泛的应用, 尤其是深度学习技术与工具在语音识别、图像检索、自动驾驶等领域取得了较大突破.然而, 由于广泛存在的概率模型、多层传播的复杂网络结构、黑盒形式的用户接口等特性, 深度学习工具的质量难以度量.现有的软件分析技术难以直接应用.研究者提出了一系列程序分析方案用以发现机器学习系统的潜在风险和程序错误, 提升机器学习系统的质量[
在形式化验证方面, Gehr等人[
程序分析是剖析复杂软件系统、提高软件质量的重要手段, 是软件工程、程序设计语言、操作系统、信息安全等领域日益受到关注的研究方向.经过多年的发展, 程序分析已在多个方面取得了长足的进步.研究人员将程序语言理论与编译、人工智能、数据处理等技术相结合, 针对不同的软件形态发展出众多程序分析技术, 同时开发了高效率的自动化分析工具, 其中有些工具帮助人们在开源软件中找到了很多缺陷、漏洞, 有些工具在大公司里得到应用, 并发挥了重要作用.由于篇幅所限, 本文只是简述了该方向近期的一部分重要成果.
当前, 程序分析技术还面临一系列挑战, 比如准确性、可扩展性等.另外, 新型的软件形态, 如智能合约、深度学习等, 也需要全新的程序分析技术.随着软件应用范围的发展以及对软件可靠性要求的提高, 程序分析在软件开发、维护过程中起的作用将越来越大.
在本文写作过程中, 日本九州大学的赵建军和国防科技大学的王戟提供了很多建设性意见, 中国科学院软件研究所的苏静和刘晴帮助整理了部分资料, 在此一并感谢.
Mei H, Wang QX, Zhang L, Wang J. Software analysis:A road map. Chinese Journal of Computers, 2009, 32(9):1697-1710(in Chinese with English abstract).
梅宏, 王千祥, 张路, 王戟.软件分析技术进展.计算机学报, 2009, 32(9):1697-1710.
Nielson F, Nielson HR, Hankin C. Principles of Program Analysis. Springer-Verlag, 1999.
Rice HG. Classes of recursively enumerable sets and their decision problems. Trans. of the American Mathematical Society, 1952, 74(2):358-366.
Zhang J, Wang X. A constraint solver and its application to path feasibility analysis. Int'l Journal of Software Engineering and Knowledge Engineering, 2001, 11(2):139-156.
Cousot P, Giacobazzi R, Ranzato F. Program analysis is harder than verification:A computability perspective. In:Proc. of the CAV. 2018. 75-95.
Li Z, Lu S, Myagmar S, Zhou Y. CP-Miner:Finding copy-paste and related bugs in large-scale software code. IEEE Trans. on Software Engineering, 2006, 32(3):176-192.
Cousot P, Cousot R. Abstract interpretation: A unified lattice mode for static analysis of programs by construction or approximation of fixpoints. In: Proc. of the POPL. 1977. 238-252.
Cousot P, Cousot R. Systematic design of program analysis frameworks. In: Proc. of the POPL. 1979. 269-282.
Jeannet B, Min A. Apron:A library of numerical abstract domains for static analysis. In:Proc. of the CAV. 2009. 661-667.
Singh G, Schel MP, Vechev MT. A practical construction for decomposing numerical abstract domains. In: Proc. of the POPL. 2018. 1-28.
Bagnara R, Hill PM, Zaffanella E. The parma polyhedra library:Toward a complete set of numerical abstractions for the analysis and verification of hardware and software systems. Science of Computer Programming, 2008, 72:3-21.
https://www.mathworks.com/products/polyspace-code-prover.html ]]>
https://www.absint.com/astree/index.htm ]]>
Kirchner F, Kosmatov N, Prevosto V, Signoles J, Yakobowski B. Frama-C:A software analysis perspective. Formal Aspects of Computing, 2015, 27(3):573-609.
Logozzo F. Practical verification for the working programmer with code contracts and abstract interpretation. In: Proc. of the VMCAI. 2011. 19-22.
http://pop-art.inrialpes.fr/people/bjeannet/bjeannet-forge/interproc/ ]]>
Blanchet B, Cousot P, Cousot R, Feret J, Mauborgne L, Mine A, Monniaux D, Rival X. A static analyzer for large safety-critical software. In: Proc. of the PLDI. 2003. 196-207.
Miné A, Delmas D. Towards an industrial use of sound static analysis for the verification of concurrent embedded avionics software. In: Proc. of the EMSOFT. 2015. 65-74.
Roux P, Garoche PL. Practical policy iterations-a practical use of policy iterations for static analysis:The quadratic case. Formal Methods in System Design, 2015, 46(2):163-196.
Jeannet B, Schrammel P, Sankaranarayanan S. Abstract acceleration of general linear loops. In: Proc. of the POPL. 2013. 529-540.
Amato G, Scozzari F, Seidl H, Apinis K, Vojdani V. Efficiently intertwining widening and narrowing. Science of Computer Programming, 2016, 120:1-24.
Li Y, Albarghouthi A, Kincaid Z, Gurfinkel A, Chechik M. Symbolic optimization with SMT solvers. In: Proc. of the POPL. 2014. 607-618.
Jiang J, Chen L, Wu X, Wang J. Block-Wise abstract interpretation by combining abstract domains with SMT. In: Proc. of the VMCAI. 2017. 310-329.
Cousot P. Abstracting induction by extrapolation and interpolation. In: Proc. of the VMCAI. 2015. 19-42.
Chen L, Liu J, Miné A, Kapur D, Wang J. An abstract domain to infer octagonal constraints with absolute value. In: Proc. of the SAS. 2014. 101-117.
Chen J, Cousot P. A binary decision tree abstract domain functor. In: Proc. of the SAS. 2015. 36-53.
Kincaid Z, Breck J, Reps T. Compositional recurrence analysis revisited. In: Proc. of the PLDI. 2017. 248-262.
Kincaid Z, Cyphert J, Breck J, Reps T. Non-Linear reasoning for invariant synthesis. In: Proc. of the POPL. 2017. 1-33.
Allamigeon X, Gaubert S, Goubault E, Putot S, Stott N. A fast method to compute disjunctive quadratic invariants of numerical programs. Trans. on Embedded Computing Systems, 2017, 16(5):1-19.
Oh H, Heo K, Lee W, Lee W, Park D, Kang J, Yi K. Global sparse analysis framework. Trans. on Programming Languages and Systems, 2014, 36(3):1-44.
Oh H, Lee W, Heo K, Yang H, Yi K. Selective x-sensitive analysis guided by impact pre-analysis. Trans. on Programming Languages and Systems, 2015, 38(2):1-45.
Singh G, Püschel M, Vechev M. Making numerical program analysis fast. In: Proc. of the PLDI. 2015. 303-313.
Singh G, Püschel M, Vechev M. Fast polyhedra abstract domain. In: Proc. of the POPL. 2017. 46-59.
Singh G, Püschel M, Vechev M. A practical construction for decomposing numerical abstract domains. In: Proc. of the POPL. 2018. 46-59.
Singh G, Püschel M, Vechev M. Fast numerical program analysis with reinforcement learning. In: Proc. of the CAV. 2018. 211-229.
Becchi A, Zaffanella E. A direct encoding for NNC polyhedra. In: Proc. of the CAV. 2018. 230-248.
Liu J, Rival X. Abstraction of arrays based on non contiguous partitions. In: Proc. of the VMCAI. 2015. 282-299.
Liu J, Rival X, Chen L. Automatic verification of embedded manipulating dynamic structures stored in system code contiguous regions. In: Proc. of the EMSOFT. 2018.
Chen L, Miné A, Wang J, Cousot P. An abstract domain to discover interval linear equalities. In: Proc. of the VMCAI. 2010. 112-128.
Fu Z. Modularly combining numeric abstract domains with points-to analysis, and a scalable static numeric analyzer for Java. In: Proc. of the VMCAI. 2014. 282-301.
Illous H, Lemerre M, Rival X. A relational shape abstract domain. In: Proc. of the NFM. 2013. 212-229.
Wu X, Chen L, Miné A, Dong W, Wang J. Static analysis of runtime errors in interrupt-driven programs via sequentialization. ACM Trans. on Embedded Computing Systems, 2016, 15(4):1-26.
Sung C, Kusano M, Wang C. Modular verification of interrupt-driven software. In: Proc. of the ASE. 2017. 206-216.
Chakarov A, Sankaranarayanan S. Expectation invariants for probabilistic program loops as fixed points. In: Proc. of the SAS. 2014. 85-100.
Wang D, Hoffmann J, Reps T. PMAF: An algebraic framework for static analysis of probabilistic programs. In: Proc. of the PLDI. 2018. 513-528.
Ouadjaout A, Miné A, Lasla N, Badache N. Static analysis by abstract interpretation of functional properties of device drivers in TinyOS. Journal of Systems and Software, 2016, 120:114-132.
Cox A, Chang BE, Rival X. Desynchronized multi-state abstractions for open programs in dynamic languages. In: Proc. of the ESOP. 2015. 483-509.
Lim J, Reps T. TSL:A system for generating abstract interpreters and its application to machine-code analysis. Trans. on Programming Languages and Systems, 2013, 35(1):1-59.
Tripp O, Pistoia M, Cousot P, Cousot R, Guarnieri S. Andromeda: Accurate and scalable security analysis of Web applications. In: Proc. of the FASE. 2013. 210-225.
Urban C, Miné A. Proving guarantee and recurrence temporal properties by abstract interpretation. In: Proc. of the VMCAI. 2015. 190-208.
Urban C. FuncTion: An abstract domain functor for termination. In: Proc. of the TACAS. 2015. 464-466.
Gonnord L, Monniaux D, Radanne G. Synthesis of ranking functions using extremal counterexamples. In: Proc. of the PLDI. 2015. 608-618.
Dan A, Meshman Y, Vechev M, Yahav E. Effective abstractions for verification under relaxed memory models. In: Proc. of the VMCAI. 2017. 62-76.
Alglave J, Cousot P. Ogre and Pythia: An invariance proof method for weak consistency models. In: Proc. of the PLDI. 2017. 3-18.
Gehr T, Mirman M, Drachsler-Cohen D, Tsankov P, Chaudhuri S, Vechev M. AI2: Safety and robustness certification of neural networks with abstract interpretation. In: Proc. of the IEEE S&P. 2018. 3-18.
Wang S, Pei K, Whitehouse J, Yang J, Jana S. Formal security analysis of neural networks using symbolic intervals. Technical Report, 1804.10829, 2018.
Urban C, Müller P. An abstract interpretation framework for input data usage. In: Proc. of the ESOP. 2018. 683-710.
Fromherz A, Ouadjaout A, Miné A. Static value analysis of python programs by abstract interpretation. In: Proc. of the NFM. 2018. 185-202.
Chae K, Oh H, Heo K, Yang H. Automatically generating features for learning program analysis heuristics for C-like languages. In: Proc. of the OOPSLA. 2017. 1-25.
Seladji Y. Finding relevant templates via the principal component analysis. In: Proc. of the VMCAI. 2017. 483-499.
Cousot P, Cousot R. Abstract interpretation: Past, present and future. In: Proc. of the CSL-LICS. 2014. 1-10.
Aho AV, Sethi R, Ullman JD, et al. Compilers:Principles, Techniques, and Tools. Pearson Education Singapore, 1986.
Reps T, Horwitz S, Sagiv M. Precise interprocedural dataflow analysis via graph reachability. In: Proc. of the POPL. 1995. 49-61.
Bodden E. Inter-Procedural data-flow analysis with IFDS/IDE and Soot. In: Proc. of the SOAP. 2012. 3-8.
Dolby J. Watson libraries for analysis (WALA). 2018. https: //github.com/wala/WALA
Arzt S, Rasthofer S, Fritz C, Bodden E, Bartel A, Klein J, Traon YL, Octeau D, McDaniel P. FlowDroid: Precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for Android apps. In: Proc. of the PLDI. 2014. 259-269.
Reps T. Program analysis via graph reachability. Information and Software Technology, 1998, 40(11-12):701-726.
Sridharan M, Bodik R. Refinement-Based context-sensitive points-to analysis for Java. In: Proc. of the PLDI. 2006. 387-400.
Sui Y, Li Y, Xue J. Query-Directed adaptive heap cloning for optimizing compilers. In: Proc. of the CGO. 2013. 1-11.
Pratikakis P, Foster JS, Hicks M. LOCKSMITH: Context-Sensitive correlation analysis for race detection. In: Proc. of the PLDI. 2006. 320-331.
Zhou Q, Li L, Wang L, Xue J, Feng X. May-Happen-in-Parallel analysis with static vector clocks. In: Proc. of the CGO. 2018. 228-240.
Li L, Cifuentes C, Keynes N. Boosting the performance of flow-sensitive points-to analysis using value flow. In: Proc. of the FSE. 2011. 343-353.
Sui Y, Xue J. On-Demand strong update analysis via value-flow refinement. In: Proc. of the FSE. 2016. 460-473.
Yu H, Xue J, Huo W, Feng X, Zhang Z. Level by level: Making flow-and context-sensitive pointer analysis scalable for millions of lines of code. In: Proc. of the CGO. 2010. 218-229.
Hardekopf B, Lin C. Flow-Sensitive pointer analysis for millions of lines of code. In: Proc. of the CGO. 2011. 289-298.
Li L, Cifuentes C, Keynes N. Precise and scalable context-sensitive pointer analysis via value flow graph. In: Proc. of the ISMM. 2013. 85-96.
Ali K, Lhoták O. Application-Only call graph construction. In: Proc. of the ECOOP. 2012. 688-712.
Ali K, Lhoták O. Averroes: Whole-Program analysis without the whole program. In: Proc. of the ECOOP. 2013. 378-400.
Dillig I, Dillig T, Aiken A. Reasoning about the unknown in static analysis. Communications of the ACM, 2010, 53(8):115-123.
Cousot P, Cousot R. Modular static program analysis. In: Proc. of the CC. 2002. 159-179.
Smaragdakis Y, Balatsouras G, Kastrinis G. Set-Based pre-processing for points-to analysis. In: Proc. of the OOPSLA. 2013. 253-270.
Rountev A, Ryder BG. Points-To and side-effect analyses for programs built with precompiled libraries. In: Proc. of the CC. 2001. 20-36.
Rountev A, Kagan S, Marlowe T. Interprocedural dataflow analysis in the presence of large libraries. In: Proc. of the CC. 2006. 2-16.
Rountev A, Sharp M, Xu G. IDE dataflow analysis in the presence of large object-oriented libraries. In: Proc. of the CC. 2008. 53-68.
Sharir M, Pnueli A. Two approaches to interprocedural data flow analysis. Computer Science Department, Courant Institute of Mathematical Sciences, New York University, 1978. http://www.rw.cdl.uni-saarland.de/teaching/spa10/slides/kboesche.pdf
Arzt S, Bodden E. StubDroid: Automatic inference of precise data-flow summaries for the Android framework. In: Proc. of the ICSE. 2016. 725-735.
Tang H, Wang X, Zhang L, Xie B, Zhang L, Mei H. Summary-Based context-sensitive data-dependence analysis in presence of callbacks. In: Proc. of the POPL. 2015. 83-95.
Tang H, Wang D, Xiong Y, Zhang L, Wang X, Zhang L. Conditional DYCK-CFL reachability analysis for complete and efficient library summarization. In: Proc. of the ESOP. 2017. 880-908.
Palepu VK, Xu G, Jones JA. Improving efficiency of dynamic analysis with dynamic dependence summaries. In: Proc. of the ASE. 2013. 59-69.
Kulkarni S, Mangal R, Zhang X, Naik M. Accelerating program analyses by cross-program training. In: Proc. of the OOPSLA. 2016. 359-377.
Yorsh G, Yahav E, Chandra S. Generating precise and concise procedure summaries. In: Proc. of the POPL. 2008. 221-234.
Strom RE, Yemini S. Typestate:A programming language concept for enchancing software reliability. IEEE Trans. on Software Engineering, 1986, 12(1):157-171.
Dillig I, Dillig T, Aiken A, Sagiv M. Precise and compact modular procedure summaries for heap manipulating programs. In: Proc. of the PLDI. 2011. 567-577.
Boyer RS, Elspas B, Levitt KN. SELECT-A formal system for testing and debugging programs by symbolic execution. ACM SIGPLAN Notices, 1975, 10(6):234-245.
Clarke LA. A program testing system. In: Proc. of the Annual Conf. 1976. 488-491.
King JC. Symbolic execution and program testing. Communications of the ACM, 1976, 19(7):385-394.
Zhang J. Sharp static analysis of programs. Chinese Journal of Computers, 2008, 31(9):1549-1553(in Chinese with English abstract).
张健.精确的程序静态分析.计算机学报, 2008, 31(9):1549-1553.
Zhang J. Constraint solving and symbolic execution. In: Proc. of the VSTTE. 2008. 539-544.
De Moura L, Bjørner N. Z3: An efficient SMT solver. In: Proc. of the TACAS. 2008. 337-340.
Ganesh V, Dill DL. A decision procedure for bit-vectors and arrays. In: Proc. of the CAV. 2007. 519-531.
Dutertre B, de Moura L. A fast linear-arithmetic solver for DPLL(T). In: Proc. of the CAV. 2016. 81-94.
Godefroid P, Klarlund N, Sen K. DART: Directed automated random testing. In: Proc. of the PLDI. 2005. 213-223.
Sen K, Marinov D, Agha G. CUTE: A concolic unit testing engine for C. In: Proc. of the FSE. 2005. 263-272.
Godefroid P, Levin MY, Molnar D. Automated whitebox fuzz testing. In: Proc. of the NDSS. 2008. 151-166.
Cadar C, Dunbar D, Engler D. KLEE: Unassisted and automatic generation of high-coverage tests for complex systems programs. In: Proc. of the OSDI. 2008. 209-224.
Pǎsǎreanu CS, Mehlitz PC, Bushnell DH, Gundy-Burlet K, Lowry M, Person S, Pape M. Combining unit-level symbolic execution and system-level concrete execution for testing NASA software. In: Proc. of the ISSTA. 2008. 15-26.
Tillmann N, de Halleux J. Pex-White box test generation for.NET. In: Proc. of the TAP. 2008. 134-153.
Chipounov V, Kuznetsov V, Candea G. S2E: A platform for in-vivo multi-path analysis of software systems. In: Proc. of the ASPLOS. 2011. 265-278.
Xie T, Tillmann N, de Halleux J, Schulte W. Fitness-Guided path exploration in dynamic symbolic execution. In: Proc. of the DSN. 2009. 359-368.
Li Y, Su Z, Wang L, Li X. Steering symbolic execution to less traveled paths. In: Proc. of the OOPSLA. 2013. 19-32.
Seo H, Kim S. How we get there: A context-guided search strategy in concolic testing. In: Proc. of the FSE. 2014. 413-424.
Ma KK, Khoo YP, Foster JS, Hicks M. Directed symbolic execution. In: Proc. of the SAS. 2011. 95-111.
Zhang Y, Chen Z, Wang J, Dong W, Liu Z. Regular property guided dynamic symbolic execution. In: Proc. of the ICSE. 2015. 643-653.
Person S, Yang G, Rungta N, Khurshid S. Directed incremental symbolic execution. In: Proc. of the PLDI. 2011. 504-515.
Dan Marinescu P, Cadar C. Make test-zesti: A symbolic execution solution for improving regression testing. In: Proc. of the ICSE. 2012. 716-726.
Godefroid P, Kiezun A, Levin MY. Grammar-Based whitebox fuzzing. In: Proc. of the PLDI. 2008. 206-215.
Siddiqui JH, Khurshid S. Scaling symbolic execution using ranged analysis. In: Proc. of the OOPSLA. 2012. 523-536.
Slabý J, Strejček J, Trtík M. Checking properties described by state machines: On synergy of instrumentation, slicing, and symbolic execution. In: Proc. of the FMICS. 2012. 207-221.
Cui H, Hu G, Wu J, Yang J. Verifying systems rules using rule-directed symbolic execution. In: Proc. of the ASPLOS. 2013. 329-342.
Trabish D, Mattavelli A, Rinetzky N, Cadar C. Chopped symbolic execution. In: Proc. of the ICSE. 2018. 350-360.
Yu H, Chen Z, Wang J, Su Z, Dong W. Symbolic verification of regular properties. In: Proc. of the ICSE. 2018. 871-881.
Boonstoppel P, Cadar C, Engler D. RWset: Attacking path explosion in constraint-based test generation. In: Proc. of the TACAS. 2008. 351-366.
Jaffar J, Murali V, Navas JA. Boosting concolic testing via interpolation. In: Proc. of the FSE. 2013. 48-58.
Godefroid P. Compositional dynamic test generation. In: Proc. of the POPL. 2007. 47-54.
Saxena P, Poosankam P, McCamant S, Song D. Loop-Extended symbolic execution on binary programs. In: Proc. of the ISSTA. 2009. 225-236.
Godefroid P, Luchaup D. Automatic partial loop summarization in dynamic test generation. In: Proc. of the ISSTA. 2011. 23-33.
Strejček J, Trtík M. Abstracting path conditions. In: Proc. of the ISSTA. 2012. 155-165.
Qiu R, Yang G, Pâsâreanu CS, Khurshid S. Compositional symbolic execution with memoized replay. In: Proc. of the ICSE. 2015. 632-642.
Yi Q, Yang Z, Guo S, Wang C, Liu J, Zhao C. Eliminating path redundancy via postconditioned symbolic execution. IEEE Trans. on Software Engineering, 2018, 44(1):25-43.
Sen K. Scalable automated methods for dynamic program analysis[Ph.D. Thesis]. University of Illinois at Urbana-Champaign, 2006.
Wang C, Yang Z, Kahlon V, Gupta A. Peephole partial order reduction. In: Proc. of the TACAS. 2008. 382-396.
Kuznetsov V, Kinder J, Bucur S, Candea G. Efficient state merging in symbolic execution. ACM SIGPLAN Notices, 2012, 47(6):193-204.
Avgerinos T, Rebert A, Sang KC, Brumley D. Enhancing symbolic execution with veritesting. In: Proc. of the ICSE. 2014. 1083-1094.
Qi D, Nguyen HDT, Roychoudhury A. Path exploration based on symbolic output. ACM Trans. on Software Engineering and Methodology, 2013, 22(4):32.
Guo S, Kusano M, Wang C, Yang Z, Gupta A. Assertion guided symbolic execution of multithreaded programs. In: Proc. of the FSE. 2015. 854-865.
Wang H, Liu T, Guan X, Shen C, Zheng Q, Yang Z. Dependence guided symbolic execution. IEEE Trans. on Software Engineering, 2017, 43(3):252-271.
Visser W, Geldenhuys J, Dwyer MB. Green: Reducing, reusing and recycling constraints in program analysis. In: Proc. of the FSE. 2012. 58.
Aquino A, Bianchi FA, Chen M, Denaro G, Pezzè M. Reusing constraint proofs in program analysis. In: Proc. of the ISSTA. 2015. 305-315.
Jia X, Ghezzi C, Ying S. Enhancing reuse of constraint solutions to improve symbolic execution. In: Proc. of the ISSTA. 2015. 177-187.
Zhang Y, Chen Z, Wang J. Speculative symbolic execution. In: Proc. of the ISSRE. 2012. 101-110.
Barr ET, Vo T, Le V, Su Z. Automatic detection of floating-point exceptions. In: Proc. of the POPL. 2013. 549-560.
Lakhotia K, Tillmann N, Harman M, de Halleux J. FloPSy-Search-Based floating point constraint solving for symbolic execution. In: Proc. of the ICTSS. 2010. 142-157.
Romano A. Practical floating-point tests with integer code. In: Proc. of the VMCAI. 2014. 337-356.
Li X, Liang Y, Qian H, Hu YQ, Bu L, Yu Y, Chen X, Li X. Symbolic execution of complex program driven by machine learning based constraint solving. In: Proc. of the ASE. 2016. 554-559.
Fu Z, Su Z. XSat: A fast floating-point satisfiability solver. In: Proc. of the CAV. 2016. 187-209.
Bjørner N, Tillmann N, Voronkov A. Path feasibility analysis for string-manipulating programs. In: Proc. of the TACAS. 2009. 307-321.
Khurshid S, Pasareanu CS, Visser W. Generalized symbolic execution for model checking and testing. In: Proc. of the TACAS. 2003. 553-568.
Ramos DA, Engler D. Under-Constrained symbolic execution: Correctness checking for real code. In: Proc. of the USENIX Security. 2015. 49-64.
Rosner N, Geldenhuys J, Aguirre NM, Visser W, Frias MF. BLISS:Improved symbolic execution by bounded lazy initialization with SAT support. IEEE Trans. on Software Engineering, 2015, 41(7):639-660.
Jeon J, Qiu X, Fetter-Degges J, Foster JS, Solar-Lezama A. Synthesizing framework models for symbolic execution. In: Proc. of the ICSE. 2016. 156-167.
Cha SK, Avgerinos T, Rebert A, Brumley D. Unleashing mayhem on binary code. In: Proc. of the IEEE S&P. 2012. 380-394.
Shoshitaishvili Y, Wang R, Salls C, Stephens N, Polino M, Dutcher A, Grosen J, Feng S, Hauser C, Kruegel C, Vigna G. SOK: (state of) the art of war: Offensive techniques in binary analysis. In: Proc. of the IEEE S&P. 2016. 138-157.
Banabic R, Candea G, Guerraoui R. Finding trojan message vulnerabilities in distributed systems. In: Proc. of the ASPLOS. 2014. 113-126.
Bucur S, Kinder J, Candea G. Prototyping symbolic execution engines for interpreted languages. In: Proc. of the ASPLOS. 2014. 239-253.
Emmi M, Majumdar R, Sen K. Dynamic test input generation for database applications. In: Proc. of the ISSTA. 2007. 151-162.
Sasnauskas R, Landsiedel O, Alizai MH, Weise C, Kowalewski S, Wehrle K. KleeNet: Discovering insidious interaction bugs in wireless sensor networks before deployment. In: Proc. of the IPSN. 2010. 186-196.
Fu X, Chen Z, Yu H, Huang C, Dong W, Wang J. Poster: Symbolic execution of MPI programs. In: Proc. of the ICSE. 2015. 809-810.
Yu H. Combining symbolic execution and model checking to verify MPI programs. In: Proc. of the ICSE. 2018. 527-529.
Davidson D, Moench B, Jha S, Ristenpart T. FIE on firmware: Finding vulnerabilities in embedded systems using symbolic execution. In: Proc. of the USENIX Security. 2013. 463-478.
Guo S, Wu M, Wang C. Symbolic execution of programmable logic controller code. In: Proc. of the FSE. 2017. 326-336.
Liu S, Zhang J. Program analysis: From qualitative analysis to quantitative analysis. In: Proc. of the ICSE. 2011. 956-959.
Geldenhuys J, Dwyer MB, Visser W. Probabilistic symbolic execution. In: Proc. of the ISSTA. 2012. 166-176.
Filieri A, Pasareanu CS, Visser W. Reliability analysis in symbolic pathFinder. In: Proc. of the ICSE. 2013. 622-631.
Chen B, Liu Y, Le W. Generating performance distributions via probabilistic symbolic execution. In: Proc. of the ICSE. 2016. 49-60.
Filieri A, Pasareanu CS, Yang G. Quantification of software changes through probabilistic symbolic execution. In: Proc. of the ASE. 2015. 703-708.
Godefroid P, Levin MY, Molnar D. SAGE:Whitebox fuzzing for security testing. Communications of the ACM, 2012, 10(3):40-44.
Miscrosfot. Visual studio 2015 RTM. 2018. https: //www.visualstudio.com/news/vs2015-vs#Testing
Xu Z, Zhang J, Wang J. Canalyze: A static bug-finding tool for C programs. In: Proc. of the ISSTA. 2014. 425-428.
Cha S, Hong S, Lee J, Oh H. Automatically generating search heuristics for concolic testing. In: Proc. of the ICSE. 2018. 1244-1254.
Su T, Fu Z, Pu G, He J, Su Z. Combining symbolic execution and model checking for data flow testing. In: Proc. of the ICSE. 2015. 654-665.
Christakis M, Müller P, Wüstholz V. Guiding dynamic symbolic execution toward unverified program executions. In: Proc. of the ICSE. 2016. 144-155.
Wang X, Sun J, Chen Z, Zhang P, Wang J, Lin Y. Towards optimal concolic testing. In: Proc. of the ICSE. 2018. 291-302.
Stephens N, Grosen J, Salls C, Dutcher A, Wang R, Corbetta J, Shoshitaishvili Y, Kruegel C, Vigna G. Driller: Augmenting fuzzing through selective symbolic execution. In: Proc. of the NDSS. 2016. 21-24.
Gosain A, Sharma G. A survey of dynamic program analysis techniques and tools. In: Proc. of the FICTA. 2014. 113-122.
Kong P, Li Y, Chen X, Sun J, Sun M, Wang J. Towards concolic testing for hybrid systems. In: Proc. of the FM. 2016. 460-478.
Chen Y, Poskitt CM, Sun J. Learning from mutants: Using code mutation to learn and monitor invariants of a cyber-physical system. In: Proc. of the SP. 2018. 648-660.
Xiong Y, Wang B, Fu G, Zang L. Learning to synthesize. In: Proc. of the GI. 2018. 37-44.
Heo K, Oh H, Yi K. Machine-Learning-Guided selectively unsound static analysis. In: Proc. of the ICSE. 2017. 519-529.
Oh H, Yang H, Yi K. Learning a strategy for adapting a program analysis via Bayesian optimisation. In: Proc. of the OOPSLA. 2015. 572-588.
Heo K, Oh H, Yang H, Yi K. Adapting static analysis via learning with bayesian optimization. ACM Trans. on Programming Languages and Systems, 2018.
Jeong S, Jeon M, Cha S, Oh H. Data-Driven context-sensitivity for points-to analysis. In: Proc. of the OOPSLA. 2017. 1-28.
Enck W, Gilbert P, Chun BG, Cox LP, Jung J, McDaniel P, Sheth AN. TaintDroid: An information-flow tracking system for realtime privacy monitoring on smartphones. In: Proc. of the USENIX Security. 2014. 393-407.
Hornyack P, Han S, Jung J, Schechter S, Wetherall D. These aren't the Droids you're looking for: Retrofitting Android to protect data from imperious applications. In: Proc. of the CCS. 2011. 639-652.
http://code.google.com/p/droidbox/ ]]>
Yan LK, Yin H. DroidScope: Seamlessly reconstructing the OS and Dalvik semantic views for dynamic Android malware analysis. In: Proc. of the USENIX Security. 2014. 569-584.
Bellard F. QEMU, a fast and portable dynamic translator. In: Proc. of the ATEC. 2005. 41-46.
http://bitblaze.cs.berkeley.edu/temu.html ]]>
Gibler C, Crussell J, Erickson J, Chen H. AndroidLeaks: Automatically detecting potential privacy leaks in Android applications on a large scale. In: Proc. of the TRUST. 2012. 291-307.
Feng Y, Anand S, Dillig I, Aiken A. Apposcopy: Semantics-Based detection of Android malware through static analysis. In: Proc. of the FSE. 2014. 576-587.
Gordon MI, Kim D, Perkins J, Gilham L, Nguyen N, Rinard M. Information-Flow analysis of Android applications in DroidSafe. In: Proc. of the NDSS. 2015. 8-11.
Jin X, Hu X, Ying K, Du W, Yin H, Peri GN. Code injection attacks on HTML5-based mobile apps: Characterization, detection and mitigation. In: Proc. of the CCS. 2014. 66-77.
Egele M, Kruegel C, Kirda E, Vigna G. PiOS: Detecting privacy leaks in iOS applications. In: Proc. of the NDSS. 2011. 11.
Octeau D, McDaniel P, Jha S, Bartel A, Bodden E, Klein J, Traon YL. Effective inter-component communication mapping in Android with Epicc: An essential step towards holistic security analysis. In: Proc. of the USENIX Security. 2013. 543-558.
Wu T, Liu J, Xu Z, Guo C, Zhang Y, Yan J, Zhang J. Light-Weight, inter-procedural and callback-aware resource leak detection for Android apps. IEEE Trans. on Software Engineering, 2016, 42(11):1054-1076.
Chan PPF, Hui LCK, Yiu SM. DroidChecker: Analyzing Android applications for capability leak categories. In: Proc. of the WISEC. 2012. 125-136.
Felt AP, Wang HJ, Moshchuk A, Hanna S, Chin E. Permission re-delegation: Attacks and defenses. In: Proc. of the USENIX Security. 2011. 22-22.
Amalfitano D, Fasolino AR, Tramontana P, De Carmine S, Memon AM. Using GUI ripping for automated testing of Android applications. In: Proc. of the ASE. 2012. 258-261.
Machiry A, Tahiliani R, Naik M. Dynodroid: An input generation system for Android apps. In: Proc. of the FSE. 2013. 224-234.
Hu G, Yuan X, Tang Y, Yang J. Efficiently, effectively detecting mobile app bugs with AppDoctor. In: Proc. of the EuroSys. 2014. 1-15.
Mahmood R, Mirzaei N, Malek S. EvoDroid: Segmented evolutionary testing of Android apps. In: Proc. of the FSE. 2014. 599-609.
Mirzaei N, Garcia J, Bagheri H, Sadeghi A, Malek S. Reducing combinatorics in GUI testing of Android applications. In: Proc. of the ICSE. 2016. 559-570.
You W, Liang B, Shi W, Zhu S, Wang P, Xie S, Zhang X. Reference hijacking: Patching, protecting and analyzing on unmodified and non-rooted Android devices. In: Proc. of the ICSE. 2016. 959-970.
Au KWY, Zhou YF, Huang Z, Lie D. PScout: Analyzing the Android permission specification. In: Proc. of the CCS. 2012. 217-228.
Cao Y, Fratantonio Y, Bianchi A, Egele M, Kruegel C, Vigna G, Chen Y. EdgeMiner: Automatically detecting implicit control flow transitions through the Android framework. In: Proc. of the NDSS. 2015. 8-11.
Rasthofer S, Arzt S, Bodden E. A machine-learning approach for classifying and categorizing Android sources and sinks. In: Proc. of the NDSS. 2014. 23-26.
Zhang Y, Luo X, Yin H. DexHunter: Toward extracting hidden code from packed Android applications. In: Proc. of the ESORICS. 2015. 293-311.
Xue L, Luo X, Yu L, Wang S, Wu D. Adaptive unpacking of Android apps. In: Proc. of the ICSE. 2017. 358-369.
Clarke EM, Emerson EA, Sistla AP. Automatic verification of finite-state concurrent systems using temporal logic specifications. Trans. on Programming Languages and Systems, 1986, 8(2):244-263.
Bron A, Farchi E, Magid Y, Nir Y, Ur S. Applications of synchronization coverage. In: Proc. of the PPoPP. 2005. 206-212.
Křena B, Letko Z, Vojnar T. Coverage metrics for saturation-based and search-based testing of concurrent software. In: Proc. of the RV. 2011. 177-192.
Sorrentino F, Farzan A, Madhusudan P. PENELOPE: Weaving threads to expose atomicity violations. In: Proc. of the FSE. 2010. 37-46.
Wang C, Said M, Gupta A. Coverage guided systematic concurrency testing. In: Proc. of the ICSE. 2011. 221-230.
Musuvathi M, Qadeer S, Ball T, Basler G, Nainar PA, Neamtiu I. Finding and reproducing heisenbugs in concurrent programs. In: Proc. of the OSDI. 2008. 267-280.
Kahlon V, Sinha N, Kruus E, Zhang Y. Static data race detection for concurrent programs with asynchronous calls. In: Proc. of the FSE. 2009. 13-22.
Naik M, Aiken A, Whaley J. Effective static race detection for Java. In: Proc. of the PLDI. 2006. 308-319.
Voung JW, Ranjit J, Lerner S. RELAY: Static race detection on millions of lines of code. In: Proc. of the FSE. 2007. 205-214.
Cai Y, Wu S, Chan WK. ConLock: A constraint-based approach to dynamic checking on deadlocks in multithreaded programs. In: Proc. of the ICSE. 2014. 491-502.
Pozniansky E, Schuster A. Efficient on-the-fly data race detection in multihreaded C++ programs. In: Proc. of the PPoPP. 2003. 179-190.
Savage S, Burrows M, Nelson G, Sobalvarro P, Anderson T. Eraser:A dynamic data race detector for multithreaded programs. ACM Trans. on Computer Systems, 1997, 15(4):391-411.
Yu Y, Rodeheffer T, Chen W. RaceTrack: Efficient detection of data race conditions via adaptive tracking. In: Proc. of the SOSP. 2005. 221-234.
Xie X, Xue J. Acculock: Accurate and efficient detection of data races. In: Proc. of the CGO. 2011. 201-212.
Kasikci B, Zamfir C, Candea G. RaceMob: Crowdsourced data race detection. In: Proc. of the SOSP. 2013. 406-422.
Williams A, Thies W, Ernst M. Static deadlock detection for Java libraries. In: Proc. of the ECOOP. 2005. 602-629.
Clarke EM, Emerson EA. Design and synthesis of synchronization skeletons using branching time temporal logic. In: Proc. of the Logic of Programs. 1981. 52-71.
Bach M, Charney M, Cohn R, Demikhovsky E, Devor T, Hazelwood K, Jaleel A, Luk C, Lyons G, Patil H, Tal A. Analyzing parallel programs with pin. IEEE Computer, 2010, 43(3):34-41.
Eslamimehr M, Palsberg J. Race directed scheduling of concurrent programs. In: Proc. of the PPoPP. 2014. 301-314.
Edelstein O, Farchi E, Nir Y, Ratsaby G, Ur S. Multithreaded Java program test generation. IBM Systems Journal, 2002, 41(1):111-125.
Burckhardt S, Kothari P, Musuvathi M, Nagarakatte S. A randomized scheduler with probabilistic guarantees of finding bugs. In: Proc. of the ASPLOS. 2010. 167-178.
Cai Y, Yang Z. Radius aware probabilistic testing of deadlocks with guarantees. In: Proc. of the ASE. 2016. 356-367.
Fonseca P, Rodrigues R, B.Brandenburg B. SKI: Exposing kernel concurrency bugs through systematic schedule exploration. In: Proc. of the OSDI. 2014. 415-431.
Yu J, Narayanasamy S, Pereira C, Pokam G. Maple: A coverage-driven testing tool for multithreaded programs. In: Proc. of the OOPSLA. 2012.
Abdelrasoul M. Promoting secondary orders of event pairs in randomized scheduling using a randomized stride. In: Proc. of the ASE. 2017. 741-752.
Lu S, Park S, Seo E, Zhou Y. Learning from mistakes-A comprehensive study on real world concurrency bug characteristics. In: Proc. of the ASPLOS. 2008. 329-339.
Flanagan C, Freund SN. FastTrack:Efficient and precise dynamic race detection. Communications of the ACM, 2010, 53(11):93-101.
Cai Y, Cao L. Effective and precise dynamic detection of hidden races for Java programs. In: Proc. of the FSE. 2015. 450-461.
Smaragdakis Y, Evans JM, Sadowski C, Yi J, Flanagan C. Sound predictive race detection in polynomial time. In: Proc. of the POPL. 2012. 387-400.
Marino D, Musuvathi M, Narayanasamy S. LiteRace: Effective sampling for lightweight data-race detection. In: Proc. of the PLDI. 2009. 134-143.
Bond MD, Coons KE, Mckinley KS. PACER: Proportional detection of data races. In: Proc. of the PLDI. 2010. 255-268.
Zhai K, Xu B, Chan WK, Tse TH. CARISMA: A context-sensitive approach to race-condition sample-instance selection for multithreaded applications. In: Proc. of the ISSTA. 2012. 221-231.
Erickson J, Musuvathi M, Burckhardt S, Olynyk K. Effective data-race detection for the kernel. In: Proc. of the OSDI. 2010. 1-16.
Sen K. Race directed random testing of concurrent programs. In: Proc. of the PLDI. 2008. 11-21.
Joshi P, Park CS, Sen K, Naik M. A randomized dynamic program analysis technique for detecting real deadlocks. In: Proc. of the PLDI. 2009. 110-120.
Huang J, Meredith PO, Rosu G. Maximal sound predictive race detection with control flow abstraction. In: Proc. of the PLDI. 2013. 337-348.
Cai Y, Zhang J, Cao L, Liu J. A deployable sampling strategy for data race detection. In: Proc. of the PLDI. 2016. 810-821.
Guo Y, Cai Y, Yang Z. AtexRace: Across thread and execution sampling for in-house race detection. In: Proc. of the FSE. 2017. 315-325.
Lu S, Tucek J, Qin F, Zhou Y. AVIO: Detecting atomicity violations via access-interleaving invariants. In: Proc. of the ASPLOS. 2006. 37-48.
Lu S, Park S, Hu C, Ma X, Jiang W, Li Z, Popa RA, Zhou Y. MUVI: Automatically inferring multi-variable access correlations and detecting related semantic and concurrency bugs. In: Proc. of the SOSP. 2007. 103-116.
Park CS, Sen K. Randomized active atomicity violation detection in concurrent programs. In: Proc. of the FSE. 2008. 135-145.
Muzahid A, Otsuki N, Torrellas J. AtomTracker: A comprehensive approach to atomic region inference and violation detection. In: Proc. of the MICRO. 2010. 287-297.
Cai Y, Chan WK. MagicFuzzer: Scalable deadlock detection for large-scale applications. In: Proc. of the ICSE. 2012. 606-616.
Cai Y, Chan WK. Magiclock:Scalable detection of potential deadlocks in large-scale multithreaded programs. IEEE Trans. on Software Engineering, 2014, 40(3):266-281.
Cai Y, Lu Q. Dynamic testing for deadlocks via constraints. IEEE Trans. on Software Engineering, 2016, 42(9):825-842.
Cai Y, Jia C, Wu S, Zhai K, Chan WK. ASN:A dynamic barrier-based approach to confirmation of deadlocks from warnings for large-scale multithreaded programs. IEEE Trans. on Parallel and Distributed Systems, 2015, 26(1):13-23.
Pradel M, Gross TR. Fully automatic and precise detection of thread safety violations. In: Proc. of the PLDI. 2012. 521-530.
Choudhary A, Lu S, Pradel M. Efficient detection of thread safety violations via coverage-guided generation of concurrent tests. In: Proc. of the ICSE. 2017. 266-277.
http://www.crn.com/slide-shows/cloud/300075204/the-10-biggest-cloud-outages-of-2014.html ]]>
Gunawi HS, Hao M, Leesatapornwongsa T, Patana-anake T, Do T, Adityatama J, Eliazar KJ, Laksono A, Lukman JF, Martin V, Satria AD. What bugs live in the cloud? A study of 3000+ issues in cloud systems. In: Proc. of the SOCC. 2014. 1-14.
Leesatapornwongsa T, Lukman JF, Lu S, Gunawi HS. TaxDC: A taxonomy of non-deterministic concurrency bugs in datacenter distributed systems. In: Proc. of the ASPLOS. 2016. 517-530.
Dai T, He J, Gu X, Lu S. Understanding real-world timeout problems in cloud server systems. In: Proc. of the IC2E. 2018. 1-11.
Gao Y, Dou W, Qin F, Gao C, Wang D, Wei J, Huang R, Zhou L, Wu Y. An empirical study on crash recovery bugs in large-scale distributed systems. In: Proc. of the FSE. 2018. 539-550.
Guo Z, McDirmid S, Yang M, Zhuang L, Zhang P, Luo Y, Bergan T, Bodik P, Musuvathi M, Zhang Z, Zhou L. Failure recovery: When the cure is worse than the disease. In: Proc. of the HotOS. 2013. 1-6.
Yuan D, Luo Y, Zhuang X, Rodrigues GR, Zhao X, Zhang Y, Jain PU, Stumm M. Simple testing can prevent most critical failures: An analysis of production failures in distributed data-intensive systems. In: Proc. of the OSDI. 2014. 249-265.
Wang G, Zhang L, Xu W. What can we learn from four years of data center hardware failures? In: Proc. of the DSN. 2017. 25-36.
Liu H, Li G, Lukman JF, Li J, Lu S, Gunawi HS, Tian C. DCatch: Automatically detecting distributed concurrency bugs in cloud systems cloud systems. In: Proc. of the ASPLOS. 2017. 677-691.
Liu H, Wang X, Li G, Lu S, Ye F, Tian C. FCatch: Automatically detecting time-of-fault bugs in cloud systems. In: Proc. of the ASPLOS. 2018. 1-11.
Liu X, Guo Z, Wang X, Chen F, Lian X, Tang J, Wu M, Kaashoek MF, Zhang Z. D3S: Debugging deployed distributed systems. In: Proc. of the NSDI. 2008. 423-437.
Xu W, Huang L, Fox A, Patterson D, Jordan MI. Detecting large-scale system problems by mining console logs. In: Proc. of the SOSP. 2009. 117-132.
Grant S, Cech H, Beschastnikh I. Inferring and asserting distributed system invariants. In: Proc. of the ICSE. 2018. 1149-1159.
Scott C, Panda A, Brajkovic V, Necula G, Krishnamurthy A, Shenker S. Minimizing faulty executions of distributed systems. In: Proc. of the NSDI. 2016. 291-309.
Flanagan C, Godefroid P. Dynamic partial-order reduction for model checking software. In: Proc. of the POPL. 2005. 110-121.
Zeller A, Hildebrandt R. Simplifying and isolating failure-inducing input. IEEE Trans. on Software Engineering, 2002, 28(2):183-200.
Wilcox JR, Woos D, Panchekha P, Tatlock Z, Wang X, Ernst MD, Anderson T. Verdi: A framework for implementing and formally verifying distributed systems. In: Proc. of the PLDI. 2015. 357-368.
Hawblitzel C, Howell J, Kapritsos M, Lorch JR, Parno B, Roberts ML, Setty S, Zill B. IronFleet: Proving practical distributed systems correct. In: Proc. of the SOSP. 2015. 1-17.
Lesani M, Bell CJ, Chlipala A. Chapar: Certified causally consistent distributed key-value stores. In: Proc. of the POPL. 2016. 357-370.
Fonseca P, Zhang K, Wang X, Krishnamurthy A. An empirical study on the correctness of formally verified distributed systems. In: Proc. of the EuroSys. 2017. 328-343.
Yang J, Chen T, Wu M, Xu Z, Liu X, Lin H, Yang M, Long F, Zhang L, Zhou L. MODIST: Transparent model checking of unmodified distributed systems. In: Proc. of the NSDI. 2009. 213-228.
Guo H, Wu M, Zhou L, Hu G, Yang J, Zhang L. Practical software model checking via dynamic interface reduction. In: Proc. of the SOSP. 2011. 265-278.
Leesatapornwongsa T, Hao M, Joshi P, Lukman JF, Gunawi HS. SAMC: Semantic-Aware model checking for fast discovery of deep bugs in cloud systems. In: Proc. of the OSDI. 2014. 399-414.
Sasnauskas R, Dustmann OS, Kaminski BL, Wehrle K, Weise C, Kowalewski S. Scalable symbolic execution of distributed systems. In: Proc. of the ICDCS. 2011. 333-342.
Joahi P, Ganai M, Balakrishnan G, Gupta A, Papakonstantinou N. SETSUDO: Perturbation-Based testing framework for scalable distributed systems pallavi. In: Proc. of the TRIOS. 2013. 1-14.
Joshi P, Gunawi HS, Sen K. PREFAIL: A programmable tool for multiple-failure injection. In: Proc. of the OOPSLA. 2011. 171-188.
Gunawi HS, Do T, Joshi P, Alvaro P, Hellerstein JM, Arpaci-Dusseau AC, Arpaci-Dusseau RH, Sen K, Borthakur D. FATE and DESTINI: A framework for cloud recovery testing. In: Proc. of the NSDI. 2011. 1-18.
Alagappan R, Ganesan A, Patel Y, Pillai TS, Arpaci-Dusseau AC, Arpaci-Dusseau RH. Correlated crash vulnerabilities. In: Proc. of the OSDI. 2016. 151-167.
Ganesan A, Alagappan R, Arpaci-Dusseau AC, Arpaci-Dusseau RH. Redundancy does not imply fault tolerance: Analysis of distributed storage reactions to file-system faults. In: Proc. of the FAST. 2017. 149-166.
Alvaro P, Rosen J, Hellerstein JM. Lineage-Driven fault injection. In: Proc. of the SIGMOD. 2015. 331-346.
Thompson K. Reflections on trusting trust. Communications of the ACM, 1984, 27(8):761-763.
Xiao C. Novel malware xcodeghost modifies xcode, infects apple iOS apps and hits app store. Technical Report, 2018. https://researchcenter.paloaltonetworks.com/2015/09/novel-malware-xcodeghost-modifies-xcode-infects-apple-ios-apps-and-hits-app-store/
De Sutter B, de Bus B, de Bosschere K, Keyngnaert P, Demoen B. On the static analysis of indirect control transfers in binaries. In: Proc. of the PDPTA. 2000.
Kinder J. Static analysis of x86 executables (statische analyse von programmen in x86-Maschinensprache)[Ph.D. Thesis]. Fachbereich Informatik, Technische Universitat Darmstadt, 2010.
Cifuentes C, van Emmerik M. Recovery of jump table case statements from binary code. Science of Computer Programming, 2001, 40(2-3):171-188.
Kinder J, Veith H. Jakstab: A static analysis platform for binaries. In: Proc. of the CAV. 2008. 423-427.
Xu L, Sun F, Su Z. Constructing precise control flow graphs from binaries. Technical Report, 2009.
Shin ECR, Song D, Moazzezi R. Recognizing functions in binaries with neural networks. In: Proc. of the USENIX Security. 2015. 611-626.
Andriesse D, Chen X, Van der Veen V, Slowinska A, Bos H. An in-depth analysis of disassembly on full-scale x86/x64 binaries. In: Proc. of the USENIX Security. 2016. 583-600.
Chua ZL, Shen S, Saxena P, Liang Z. Neural nets can learn function type signatures from binaries. In: Proc. of the USENIX Security. 2017. 99-116.
Tröger J, Cifuentes C. Analysis of virtual method invocation for binary translation. In: Proc. of the WCRE. 2002. 65-74.
Balakrishnan G, Reps T. Analyzing memory accesses in x86 executables. In: Proc. of the CC. 2004. 5-23.
Jin W, Cohen C, Gennari J, Hines C, Chaki S, Gurfinkel A, Havrilla J, Narasimhan P. Recovering C++ objects from binaries using inter-procedural data-flow analysis. In: Proc. of the POPL. 2014. 1-11.
Lin Z, Zhang X, Xu D. Automatic reverse engineering of data structures from binary execution. In: Proc. of the NDSS. 2010.
Zeng J, Lin Z. Towards automatic inference of kernel object semantics from binary code. In: Proc. of the RAID. 2015. 538-561.
Smithson M, El Wazeer K, Anand K, Kotha A, Barua R. Static binary rewriting without supplemental information: Overcoming the tradeoff between coverage and correctness. In: Proc. of the WCRE. 2013. 52-61.
Wang S, Wang P, Wu D. UROBOROS: Instrumenting stripped binaries with static reassembling. In: Proc. of the SANER. 2016. 236-247.
Wang S, Wang P, Wu D. Reassembleable disassembling. In: Proc. of the USENIX Security. 2015. 627-642.
Wang R, Shoshitaishvili Y, Bianchi A, Machiry A, Grosen J, Grosen P, Kruegel C, Vigna G. RAMBLR: Making reassembly great again. In: Proc. of the NDSS. 2017.
Song D, Brumley D, Yin H, Caballero J, Jager I, Kang MG, Liang Z, Newsome J, Poosankam P, Saxena P. BitBlaze: A new approach to computer security via binary analysis. In: Proc. of the ICISS. 2008. 1-25.
Brumley D, Jager I, Avgerinos T, Schwartz EJ. BAP: A binary analysis platform. In: Proc. of the CAV. 2011. 463-469.
Bruening D. Efficient, transparent, and comprehensive runtime code manipulation[Ph.D. Thesis]. Massachusetts Institute of Technology, 2004.
Bernat AR, Miller BP. Anywhere, any-time binary instrumentation. In: Proc. of the PASTE. 2011. 9-16.
Nethercote N, Seward J. Valgrind: A framework for heavyweight dynamic binary instrumentation. In: Proc. of the PLDI. 2007. 89-100.
Luk CK, Cohn R, Muth R, Patil H, Klauser A, Lowney G, Wallace S, Reddi VJ, Hazelwood K. Pin: Building customized program analysis tools with dynamic instrumentation. In: Proc. of the PLDI. 2005. 190-200.
Machiry A, Spensky C, Corina J, Stephens N, Kruegel C, Vigna G, Barbara S. DR.CHECKER: A soundy analysis for Linux kernel drivers. In: Proc. of the USENIX Security. 2017. 1007-1024.
Wang P, Krinke J, Lu K, Li G, Dodier-Lazaro S. How double-fetch situations turn into double-fetch vulnerabilities: A study of double fetches in the Linux kernel. In: Proc. of the USENIX Security. 2017. 1-16.
Dewey D, Giffin J. Static detection of C++ vtable escape vulnerabilities in binary code. In: Proc. of the NDSS. 2012.
Kruegel C, Robertson W, Vigna G. Detecting kernel-level rootkits through binary analysis. In: Proc. of the ACSAC. 2004. 91-100.
Bergeron J, Debbabi M, Erhioui MM, Ktari B. Static analysis of binary code to isolate malicious behaviors. In: Proc. of the WETICE. 1999. 184-189.
Bergeron J, Debbabi M, Desharnais J, M.Erhioui M, Lavoie Y, Tawbi N. Static detection of malicious code in executable programs. In: Proc. of the Symp. on Requirements Engineering for Information Security. 2001.
O'Sullivan P, Anand K, Kotha A, Smithson M, Barua R, Keromytis AD. Retrofitting security in COTS software with binary rewriting. In: Proc. of the SEC. 2011. 154-172.
Wartell R, Mohan V, Hamlen KW, Lin Z. Binary stirring: Self-Randomizing instruction addresses of legacy x86 binary code. In: Proc. of the Computer and Communications Security. New York: ACM Press, 2012. 157-168.
Wartell R, Mohan V, Hamlen KW, Lin Z. Securing untrusted code via compiler-agnostic binary rewriting. In: Proc. of the ACSAC. ACM Press, 2012. 299-308.
Zhang M, Sekar R. Contol flow integrity for COTS binaries. In: Proc. of the USENIX Security. 2013. 337-352.
Batyuk L, Herpich M, Camtepe SA, Raddatz K, Schmidt AD, Albayrak S. Using static analysis for automatic assessment and mitigation of unwanted and malicious activities within Android applications. In: Proc. of the MALWARE. 2011. 66-72.
Zhang C, Song C, Chen KZ, Chen Z, Song D. VTint: Protecting virtual function tables' integrity. In: Proc. of the NDSS. 2015.
Xu Z, Zhang J. A test data generation tool for unit testing of C programs. In: Proc. of the QSIC. 2006. 107-116.
Wu R, Wen M, Cheung SC, Zhang H. ChangeLocator:Locate crash-inducing changes based on crash reports. Empirical Software Engineering, 2017, 23(5):2866-2900.
Gao Q, Xiong Y, Mi Y, Zhang L, Yang W, Zhou Z, Xie B, Mei H. Safe memory-leak fixing for C programs. In: Proc. of the ICSE. 2015. 459-470.
Xuan J, Martinez M, De Marco F, Clement M, Marcote SL, Durieux T, Le Berre D, Monperrus M. NOPOL:Automatic repair of conditional statement bugs in Java programs. IEEE Trans. on Software Engineering, 2017, 43(1):34-55.
Briand LC, Daly JW, Wust J. A unified framework for cohesion measurement in object-oriented systems. Empirical Software Engineering, 1998, 3(1):65-117.
Briand LC, Daly JW, Wust JK. A unified framework for coupling measurement in object-oriented systems. IEEE Trans. on Software Engineering, 1999, 25(1):91-121.
https://www.ethereum.org/]]>
https://www.coindesk.com/dao-attacked-code-issue-leads-60-million-ether-theft/]]>
https://www.cnbc.com/2017/11/08/accidental-bug-may-have-frozen-280-worth-of-ether-on-parity-wallet.html ]]>
Luu L, Chu DH, Olickel H, Saxena P, Hobor A. Making smart contracts smarter. In: Proc. of the CCS. 2016. 254-269.
Nikolic I, Kolluri A, Sergey I, Saxena P, Hobor A. Finding the greedy, prodigal, and suicidal contracts at scale. In: Proc. of the CoRR. 2018. https: //arxiv.org/pdf/1802.06038.pdf
Hirai Y. Formal verification of deed contract in ethereum name service. Technical Report, 2016. 1-81.
Bhargavan K, Delignat-Lavaud A, Fournet C, Gollamudi A, Gonthier G, Kobeissi N, Kulatova N, Rastogi A, Sibut-Pinote T, Swamy N, Zanella-Béguelin S. Formal verification of smart contracts. In: Proc. of the PLAS. 2016. 91-96.
Bigi G, Bracciali A, Meacci G, Tuosto E. Validation of decentralised smart contracts through game theory and formal methods. In: Proc. of the Programming Languages with Applications to Biology and Security. 2015. 142-161.
Kalra S, Goel S, Dhawan M, Sharma S. ZEUS: Analyzing safety of smart contracts. In: Proc. of the NDSS. 2018.
Tsankov P, Dan A, Drachsler-Cohen D, Gervais A, Bunzli F, Vechev M. Securify: Practical security analysis of smart contracts. In: Proc. of the CoRR. 2018. https: //arxiv.org/pdf/1806.01143.pdf
Gurfinkel A, Kahsai T, Komuravelli A, Navas JA. The SeaHorn verification framework. In: Proc. of the CAV. 2004. 343-361.
Masuda S, Ono K, Yasue T, Hosokawa N. A survey of software quality for machine learning applications. In: Proc. of the ICSTW. 2018. 279-284.
Qin Y, Wang H, Xu C, Ma X, Lu J. SynEva: Evaluating ML programs by mirror program synthesis. In: Proc. of the QRS. 2018. 171-182.
Sun Y, Wu M, Ruan W, Huang X, Kwiatkowska M, Kroening D. Concolic testing for deep neural networks. In: Proc. of the ASE. 2018. 109-119.
Ruan W, Huang X, Kwiatkowska M. Reachability analysis of deep neural networks with provable guarantees. In: Proc. of the IJCAI. 2018. 2651-2659.
Huang X, Kwiatkowska M, Wang S, Wu M. Safety verification of deep neural networks. In: Proc. of the CAV. 2017. 3-29.
Ma S, Aafer Y, Xu Z, Lee WC, Zhai J, Liu Y, Zhang X. LAMP: Data provenance for graph based machine learning algorithms through derivative computation. In: Proc. of the FSE. 2017. 786-797.
Gulzar MA, Interlandi M, Yoo S, Tetali SD, Condie T, Millstein T, Kim M. BigDebug: Debugging primitives for interactive big data processing in spark. In: Proc. of the ICSE. 2016. 784-795.