Abstract: Testing in a continuous integration environment is characterized by constantly changing test sets, limited test time, and the need for fast feedback, so traditional test optimization methods are not well suited to it. Reinforcement learning is an important branch of machine learning whose essence is solving sequential decision problems, and it can therefore be applied to test optimization in continuous integration. However, in existing reinforcement learning based methods, the reward function is computed only from the execution information of the test case in the current integration cycle. This study investigates two aspects: reward function design and reward strategy. In the reward function design, the complete historical execution information of a test case replaces the current execution information, and both the total number of historical failures and the historical failure distribution of the test case are taken into account. For the reward strategy, two strategies are proposed: an overall reward for all test cases in the current execution sequence, and a partial reward only for failed test cases. Experiments are conducted on three industrial-level programs. The results show that: (1) compared with existing methods, the reinforcement learning methods proposed in this study, based on reward functions with complete historical information, can greatly improve the fault detection ability of test sequences in continuous integration; (2) the historical failure distribution of a test case helps to identify potentially failing test cases and is the more important factor in the design of the reward function; (3) the two reward strategies, i.e. overall reward and partial reward, are influenced by various characteristics of the system under test, so the reward strategy needs to be selected according to the actual situation; and (4) history-based reward functions take longer to compute, but test efficiency is not affected.
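To make the two ideas in the abstract concrete, the sketch below illustrates (a) a reward computed from a test case's complete verdict history, combining the total failure count with a recency-weighted failure distribution term, and (b) the overall versus partial reward strategies. The function names, the decay weighting, and the data layout are illustrative assumptions for exposition, not the paper's exact formulation.

```python
def historical_reward(history, decay=0.9):
    """Reward a test case from its full verdict history.

    history: list of booleans, ordered oldest to newest (True = failed).
    Combines the total failure rate with a recency-weighted distribution
    term, so recent failures contribute more. The combination and the
    decay factor are illustrative choices, not the paper's definition.
    """
    if not history:
        return 0.0
    n = len(history)
    total_failure_rate = sum(history) / n
    # Recency-weighted failure distribution: newest verdict gets weight 1.
    weighted = sum(decay ** (n - 1 - i)
                   for i, failed in enumerate(history) if failed)
    return total_failure_rate + weighted


def assign_rewards(sequence, histories, strategy="overall"):
    """Apply one of the two reward strategies to an executed sequence.

    'overall' rewards every test case in the sequence; 'partial' rewards
    only the cases that failed in the most recent (current) cycle.
    """
    rewards = {}
    for tc in sequence:
        failed_now = histories[tc][-1]
        if strategy == "overall" or failed_now:
            rewards[tc] = historical_reward(histories[tc])
        else:
            rewards[tc] = 0.0  # partial strategy: passing cases get nothing
    return rewards
```

For example, with histories `{"a": [True, False], "b": [False, True]}`, the overall strategy rewards both cases, while the partial strategy gives `a` (which passed in the current cycle) a reward of zero; in both strategies `b` scores higher because its failure is more recent.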