Abstract: Comment generation for software code has been an important research task in software engineering in the past few years. Several research efforts have achieved impressive results on open-source datasets that contain copious <code snippet, comment> pairs. In enterprise practice, however, the code to be commented usually belongs to a software project library, so it must first be decided on which code lines comment generation can achieve better performance; moreover, the code snippets to be commented vary in length and granularity. A code comment generation method is therefore required that integrates commenting decisions with comment generation and is resistant to noise. To this end, this study proposes CoComment, a software project-oriented code comment generation approach. The approach automatically extracts domain-specific basic concepts from software project documents and then uses code parsing and text matching to propagate and expand these concepts. On this basis, it makes automatic code commenting decisions by locating the code lines or segments related to these concepts, and it generates highly readable natural language comments by fusing the concepts and their contexts with templates. Comparative experiments are conducted on three enterprise software projects containing more than 46,000 manually annotated code comments. The experimental results demonstrate that the proposed approach can effectively make code commenting decisions and generate more helpful code comments than existing methods, providing an integrated solution to code commenting decisions and comment generation for software projects.
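The three stages named above (matching concepts against code, propagating them, then deciding where to comment and filling a template) can be illustrated with a toy sketch. This is not CoComment's implementation: the seed concepts, the regular expressions, the `annotate` helper, and the "Relates to" template are all hypothetical, and propagation is reduced to simple assignments on single lines.

```python
import re

# Hypothetical seed concepts, standing in for terms extracted from project documents.
SEED_CONCEPTS = {
    "invoice": "customer invoice",
    "tax": "value-added tax",
}

IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Simple single-target assignment such as `x = ...` (not `x == ...`).
ASSIGN = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=[^=]")


def split_identifier(name):
    """Split snake_case and camelCase identifiers into lowercase words."""
    words = []
    for part in name.split("_"):
        words.extend(re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", part))
    return [w.lower() for w in words]


def annotate(source, concepts=SEED_CONCEPTS):
    """Insert a template comment above every line linked to a concept."""
    tagged = {}  # identifier -> set of concept terms linked to it
    out = []
    for line in source.splitlines():
        line_terms = set()
        for ident in IDENT.findall(line):
            # Text matching: words of the identifier that are concept terms.
            terms = {w for w in split_identifier(ident) if w in concepts}
            # Concepts already propagated to this identifier on earlier lines.
            terms |= tagged.get(ident, set())
            if terms:
                tagged.setdefault(ident, set()).update(terms)
                line_terms |= terms
        # Propagation: an assignment target inherits the concepts on its line.
        match = ASSIGN.match(line)
        if match and line_terms:
            tagged.setdefault(match.group(1), set()).update(line_terms)
        # Commenting decision and template-based generation.
        if line_terms:
            indent = line[: len(line) - len(line.lstrip())]
            described = ", ".join(concepts[t] for t in sorted(line_terms))
            out.append(f"{indent}# Relates to: {described}")
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    demo = "total = invoice_amount * tax_rate\ncount = 0\nprint(total)"
    print(annotate(demo))
```

In the demo, `total` matches no concept by name but inherits both concepts through the assignment, so the later `print(total)` line is also selected for commenting, while `count = 0` is left alone.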