Abstract:With the widespread adoption and rapid advancement of open-source software, the maintenance of open-source software projects has become a critical phase within the software development cycle. As a globally representative developer community, GitHub hosts numerous software project repositories with similar functionalities within the same domain, creating challenges for users when selecting the appropriate project repository for use or further development. Therefore, accurate identification of project repository maintenance status holds substantial practical value. However, the GitHub platform does not provide direct metrics for assessing the maintenance status of repositories. This study proposes an automatic identification method for project repository maintenance status based on machine learning. A classification model, GitMT, has been developed and implemented to achieve this objective. By effectively integrating dynamic time series features and descriptive features, the proposed model enables accurate identification of “active” and “unmaintained” repository status. Through a series of experiments conducted on large-scale real-world data, an AUC value of 0.964 is achieved in maintenance status identification tasks. In addition, this study constructs an open-source dataset centered on the maintenance status of software project repositories—GitMT Dataset: https://doi.org/10.7910/DVN/OJ2NI3.