Abstract: Most existing code smell detection approaches rely on code structure information and heuristic rules while paying little attention to the semantic information embedded at different levels of code, and their detection accuracy remains limited. To address this problem, this study proposes DeepSmell, a novel approach based on a pre-trained model and multi-level metrics. First, a static analysis tool is used to extract code smell instances and multi-level code metrics from the source program and to label these instances. Second, the code-level information related to code smells is parsed from the source code via the abstract syntax tree, and the resulting textual information is combined with the code metrics to build the data set. Finally, the textual information is converted into word vectors using the pre-trained BERT model; a GRU-LSTM model captures the latent semantic relationships among identifiers, and a CNN combined with an attention mechanism performs code smell detection. The experiments evaluate four kinds of code smells, namely feature envy, long method, data class, and god class, on 24 open-source programs including JUnit, Xalan, and SPECjbb2005. The results show that DeepSmell improves average recall and F1 by 9.3% and 10.44%, respectively, compared with existing detection methods, while maintaining a high level of precision.
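To make the described pipeline more concrete, the following is a minimal sketch (not the authors' implementation) of the kind of architecture the abstract outlines: BERT-style token embeddings fed through stacked GRU and LSTM layers, followed by a 1-D CNN with a simple attention mechanism, with multi-level code metrics concatenated before the final classifier. All class names, layer sizes, and hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SmellClassifier(nn.Module):
    """Hypothetical BERT-embedding -> GRU -> LSTM -> CNN + attention classifier."""
    def __init__(self, embed_dim=768, hidden=128, n_metrics=20, n_classes=2):
        super().__init__()
        # Recurrent stage: GRU followed by LSTM to model sequential
        # relationships among identifier tokens.
        self.gru = nn.GRU(embed_dim, hidden, batch_first=True, bidirectional=True)
        self.lstm = nn.LSTM(2 * hidden, hidden, batch_first=True, bidirectional=True)
        # Convolutional stage over the recurrent outputs.
        self.conv = nn.Conv1d(2 * hidden, hidden, kernel_size=3, padding=1)
        # Additive attention that pools the sequence into a single vector.
        self.attn = nn.Linear(hidden, 1)
        # Classifier over the pooled vector plus the hand-crafted code metrics.
        self.fc = nn.Linear(hidden + n_metrics, n_classes)

    def forward(self, bert_embeddings, metrics):
        # bert_embeddings: (batch, seq_len, embed_dim), e.g. from a frozen BERT encoder
        # metrics:         (batch, n_metrics) multi-level code metric features
        h, _ = self.gru(bert_embeddings)
        h, _ = self.lstm(h)
        h = torch.relu(self.conv(h.transpose(1, 2))).transpose(1, 2)  # (B, T, hidden)
        weights = torch.softmax(self.attn(h), dim=1)                  # (B, T, 1)
        pooled = (weights * h).sum(dim=1)                             # (B, hidden)
        return self.fc(torch.cat([pooled, metrics], dim=1))

# Usage with random tensors standing in for real embeddings and metrics:
# logits = SmellClassifier()(torch.randn(4, 64, 768), torch.randn(4, 20))
```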