Deep Learning Test Optimization Method Using Multi-objective Optimization
Author:
Affiliation:

Clc Number:

TP311

Fund Project:

  • Article
  • |
  • Figures
  • |
  • Metrics
  • |
  • Reference
  • |
  • Related
  • |
  • Cited by
  • |
  • Materials
  • |
  • Comments
    Abstract:

    With the rapid development of deep learning technology, the research on its quality assurance is raising more attention. Meanwhile, it is no longer difficult to collect test data owing to the mature sensor technology, but it costs a lot to label the collected data. In order to reduce the cost of labeling, the existing work attempts to select a test subset from the original test set. They only ensure that the overall accuracy (the accuracy of the target deep learning model on all test inputs of the test set) of the test subset is similar to that of the original test set. However, existing work only focuses on estimating overall accuracy, ignoring other properties of the original test set. For example, it can not fully cover all kinds of test input in the original test set. This study proposes a method based on multi-objective optimization called DMOS (deep multi-objective selection). It firstly analyzes the data distribution of the original test set based on HDBSCAN (hierarchical density-based spatial clustering of applications with noise) clustering method. Then, it designs the optimization objective based on the characteristics of the clustering results and then carries out multi-objective optimization to find out the appropriate selection solution. A large number of experiments are carried out on 8 pairs of classic deep learning test sets and models. The results show that the best test subset selected by DMOS method (corresponding to the Pareto optimal solution with the best performance) can not only cover more test input categories in the original test set, but also estimate the accuracy of each test input category extremely close to the original test set. Meanwhile, it can also ensure that the overall accuracy and test adequacy are close to the original test set: The average error of the overall accuracy estimation is only 1.081%, which is 0.845% less than the PACE (practical accuracy estimation), with the improvement of 43.87%. The average error of the accuracy estimation of each category of test input is only 5.547%, which is 2.926% less than PACE, with the improvement of 34.53%. The average estimation error of the five test adequacy measures is only 8.739%, which is 7.328% lower than PACE, with the increase improvement of 45.61%.

    Reference
    Related
    Cited by
Get Citation

沐燕舟,王赞,陈翔,陈俊洁,赵静珂,王建敏.采用多目标优化的深度学习测试优化方法.软件学报,2022,33(7):2499-2524

Copy
Share
Article Metrics
  • Abstract:
  • PDF:
  • HTML:
  • Cited by:
History
  • Received:September 05,2021
  • Revised:October 14,2021
  • Adopted:
  • Online: January 28,2022
  • Published: July 06,2022
You are the firstVisitors
Copyright: Institute of Software, Chinese Academy of Sciences Beijing ICP No. 05046678-4
Address:4# South Fourth Street, Zhong Guan Cun, Beijing 100190,Postal Code:100190
Phone:010-62562563 Fax:010-62562533 Email:jos@iscas.ac.cn
Technical Support:Beijing Qinyun Technology Development Co., Ltd.

Beijing Public Network Security No. 11040202500063