Abstract:Due to the extensive use of code reuse and third-party SDKs, open source components are ubiquitous in IoT device firmware. The security vulnerabilities which cause threat to the firmware usually exist in some specific versions of the components. The version information identification of open source binary components in the IoT firmware is of great significance for the safety assessment and emergency response of IoT devices. The existing version string based version extraction method is not applicable to the cases with missing version strings. This paper designs and implements a version extraction method (termed as Protues) for open source components that does not depend on version strings. The core idea of this method is to construct a version difference chain by using the differences between the open source components' neighboring versions of the source code to convert the version identification problem into a query on the version difference chain. Furthermore, in order to improve the recognition accuracy, the conditional judgment expressions are used in this paper to represent the nodes on the version difference chain. To verify the practicability of this method, version identification experiments are performed on a total number of 428 binary files from 4 kinds of open source components Samba, Msmtp, Nginx and Libgcrupt. The experimental results show that the number of versions that can be accurately identified by this method reaches 418, and the recognition accuracy rate is 98%.