Abstract: With the prosperity of open-source software, almost all software companies use reusable open-source components as basic building blocks for their software products, thus forming the software supply chain. The software supply chain improves development efficiency and reduces labor costs for software companies. However, it may also introduce new security problems. In particular, if one software component has high-risk vulnerabilities, the software supply chain inevitably spreads these vulnerabilities to all components that depend on it, thus amplifying the vulnerabilities' impact. For example, through the software supply chain, the Log4j2 vulnerability caused a catastrophic security issue for the whole Java ecosystem.
Unfortunately, current research on the Java software supply chain mainly focuses on a single component or a group of components and lacks impact studies at the ecosystem scale. Therefore, in this paper, we present the essential software supply chain analysis techniques to study the impact of components and vulnerabilities on the Java ecosystem. More specifically, we first give a formal definition of component dependencies in the software supply chain. Next, we propose new techniques and build an analysis tool to analyze all component dependencies in the Java ecosystem, including over 8.8 million component versions and 65 million dependencies. Finally, we use Log4j2, a logging library affected by the vulnerability, as an example to evaluate its impact on the whole Java ecosystem. The results show that the vulnerability affects 15.12% of the components in the ecosystem (71,082) and 16.87% of the component versions (1,488,971), and the vulnerability-fix rate is only 29.13%.
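To make the notion of ecosystem-scale impact concrete, the following is a minimal sketch of how a vulnerability's reach could be computed over a dependency graph. It is not the analysis tool described in this paper. It assumes that dependencies are given as directed edges between component versions and that impact propagates transitively to every dependent. The function name `affected_versions` and the toy graph are hypothetical and serve only as an illustration.

```python
from collections import defaultdict, deque


def affected_versions(dependencies, vulnerable):
    """Return all component versions that transitively depend on a vulnerable one.

    dependencies: iterable of (dependent, dependency) pairs (assumed edge format).
    vulnerable:   set of component versions known to contain the vulnerability.
    """
    # Reverse the edges: map each component version to its direct dependents.
    dependents = defaultdict(set)
    for dependent, dependency in dependencies:
        dependents[dependency].add(dependent)

    # Breadth-first search upward from the vulnerable versions.
    affected = set()
    queue = deque(vulnerable)
    while queue:
        current = queue.popleft()
        for parent in dependents[current]:
            if parent not in affected and parent not in vulnerable:
                affected.add(parent)
                queue.append(parent)
    return affected


# Hypothetical toy graph; the coordinates are illustrative only.
deps = [
    ("app:1.0", "web-framework:2.1"),
    ("web-framework:2.1", "log4j-core:2.14.1"),
    ("cli-tool:0.3", "log4j-core:2.17.1"),
    ("batch-job:1.2", "app:1.0"),
]
vulnerable = {"log4j-core:2.14.1"}
affected = affected_versions(deps, vulnerable)
print(sorted(affected))
```

In this toy graph, `cli-tool:0.3` is not reported because it depends only on a version outside the vulnerable set. Dividing the size of the affected set by the total number of component versions would give a share comparable in spirit to the percentages reported above, under the stated assumptions.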