求解二维装箱问题的强化学习启发式算法

doi:10.13328/j.cnki.jos.006161

微信服务号

微信订阅号

2025年4月26日 19:28 星期六

首页 > 过刊浏览>2021年第32卷第12期 >3684-3697. DOI:10.13328/j.cnki.jos.006161

PDF HTML阅读 XML下载导出引用引用提醒

求解二维装箱问题的强化学习启发式算法
DOI:
                        10.13328/j.cnki.jos.006161
                    
CSTR:
                        
                    
作者:
                        阳名钢阳名钢
厦门大学 信息学院, 福建 厦门 361005
在期刊界中查找
在百度中查找
在本站中查找
陈梦烦陈梦烦
厦门大学 信息学院, 福建 厦门 361005
在期刊界中查找
在百度中查找
在本站中查找
杨双远杨双远
厦门大学 信息学院, 福建 厦门 361005
在期刊界中查找
在百度中查找
在本站中查找
张德富张德富
厦门大学 信息学院, 福建 厦门 361005
在期刊界中查找
在百度中查找
在本站中查找

                    
作者单位:
作者简介:阳名钢(1998-),男,硕士生,主要研究领域为大数据,人工智能.
陈梦烦(1994-),女,硕士,主要研究领域为大数据,人工智能.
杨双远(1976-),男,博士,副教授,主要研究领域为人工智能,图像处理.
张德富(1971-),男,博士,教授,博士生导师,CCF高级会员,主要研究领域为大数据,人工智能.
通讯作者:张德富,E-mail:dfzhang@xmu.edu.cn
中图分类号:TP301
基金项目:国家自然科学基金（61672439）

Reinforcement Learning Heuristic Algorithm for Solving the Two-dimensional Strip Packing Problem

Author:

YANG Ming-Gang
YANG Ming-Gang
School of Informatics, Xiamen University, Xiamen 361005, China
在期刊界中查找
在百度中查找
在本站中查找
CHEN Meng-Fan
CHEN Meng-Fan
School of Informatics, Xiamen University, Xiamen 361005, China
在期刊界中查找
在百度中查找
在本站中查找
YANG Shuang-Yuan
YANG Shuang-Yuan
School of Informatics, Xiamen University, Xiamen 361005, China
在期刊界中查找
在百度中查找
在本站中查找
ZHANG De-Fu
ZHANG De-Fu
School of Informatics, Xiamen University, Xiamen 361005, China
在期刊界中查找
在百度中查找
在本站中查找

Affiliation:

Fund Project:

National Natural Science Foundation of China (61672439)

摘要

图/表

访问统计

参考文献 [43]

相似文献 [20]

引证文献

资源附件

文章评论

摘要:

二维带形装箱问题是一个经典的NP-hard的组合优化问题，该问题在实际的生活和工业生产中有着广泛的应用.研究该问题，对企业节约成本、节约资源以及提高生产效率有着重要的意义.提出了一个强化学习求解算法.新颖地使用强化学习为启发式算法提供一个初始的装箱序列，有效地改善启发式冷启动的问题.该强化学习模型能进行自我驱动学习，仅使用启发式计算的解决方案的目标值作为奖励信号来优化网络，使网络能学习到更好的装箱序列.使用简化版的指针网络来解码输出装箱序列，该模型由嵌入层、解码器和注意力机制组成.使用Actor-Critic算法对模型进行训练，提高了模型的效率.在714个标准问题实例和随机生成的400个问题实例上测试提出的算法，实验结果显示：提出的算法能有效地改善启发式冷启动的问题，性能超过当前最优秀的启发式求解算法.

关键词:二维装箱问题;强化学习;指针网络;启发式算法;分层搜索

Abstract:

The two-dimensional strip packing problem is a classic NP-hard combinatorial optimization problem, which has been widely used in daily life and industrial production. This study proposes a reinforcement learning heuristic algorithm for it. The reinforcement learning is used to provide an initial boxing sequence for the heuristic algorithm to effectively improve the heuristic cold start problem. The reinforcement learning model can perform self-driven learning, using only the value of the heuristically calculated solution as a reward signal to optimize the network, so that the network can learn a better packing sequence. A simplified version of the pointer network is used to decode the output boxing sequence. The model consists of an embedding layer, a decoder, and an attention mechanism. Actor-critic algorithm is used to train the model, which improves the efficiency of the model. The reinforcement learning heuristic algorithm is tested on 714 standard problem instances and 400 generated problem instances. Experimental results show that the proposed algorithm can effectively improve the heuristic cold start problem and outperform the state-of-the-art heuristics with much higher solution quality.

Key words:two-dimensional strip packing problem;reinforcement learning;pointer network;heuristics;hierarchical search

参考文献

[1] Lodi A, Silvano M, Michele M. Two-dimensional packing problems:A survey. European Journal of Operational Research, 2002, 141(2):241-252.

[2] Hifi M. Exact algorithms for the guillotine strip cutting/packing problem. Computers & Operations Research, 1998,25(11):925-940.

[3] Martello S, Michele M, Daniele V. An exact approach to the strip-packing problem. INFORMS Journal on Computing, 2003,15(3):310-319.

[4] Kenmochi M, Takashi I, Koji N, et al. Exact algorithms for the two-dimensional strip packing problem with and without rotations. European Journal of Operational Research, 2009,198(1):73-83.

[5] Côté JF, Mauro DM, Manuel I. Combinatorial Benders' cuts for the strip packing problem. Operations Research, 2014,62(3):643-661.

[6] Baker BS, Jr-Edward-GC, Ronald-LR. Orthogonal packings in two dimensions. SIAM Journal on Computing, 1980,9(4):846-855.

[7] Chazelle B. The bottom-left bin-packing heuristic:An efficient implementation. IEEE Trans. on Computers, 1983,(8):697-707.

[8] Huang WQ, Liu JF. A deterministic heuristic algorithm based on euclidian distance for solving the rectangles packing. Chinese Journal of Computers, 2006,29(5):734-739(in Chinese with English abstract).

[9] Burke EK, Graham K, Glenn W. A new placement heuristic for the orthogonal stock-cutting problem. Operations Research, 2004,52(4):655-671.

[10] Leung SCH, Zhang DF, Zhou CL, et al. A hybrid simulated annealing metaheuristic algorithm for the two-dimensional knapsack packing problem. Computers & Operations Research, 2012,39(1):64-73.

[11] Jakobs S. On genetic algorithms for the packing of polygons. European Journal of Operational Research, 1996,88(1):165-181.

[12] Hopper E, Turton BCH. An empirical investigation of meta-heuristic and heuristic algorithms for a 2D packing problem. European Journal of Operational Research, 2001,128(1):34-57.

[13] Bortfeldt A. A genetic algorithm for the two-dimensional strip packing problem with rectangular pieces. European Journal of Operational Research, 2006,172(3):814-837.

[14] Liu DQ, Teng HF. An improved BL-algorithm for genetic algorithm of the orthogonal packing of rectangles. European Journal of Operational Research, 1999,112(2):413-420.

[15] He K, Ji PL, Li CM. Heuristics for solving the 2D rectangle packing area minimization problem basing on a dynamic reduction method. Ruan Jian Xue Bao/Journal of Software, 2013,24(9):2078-2088(in Chinese with English abstract). http://www.jos.org.cn/1000-9825/4404.htm[doi:10.3724/SP.J.1001.2013.04404]

[16] Wei LJ, Oon WC, Zhu WB, et al. A skyline heuristic for the 2D rectangular packing and strip packing problems. European Journal of Operational Research, 2011,215(2):337-346.

[17] Alvarez-Valdés R, Francisco P, José-Manuel T. Reactive GRASP for the strip-packing problem. Computers & Operations Research, 2008,35(4):1065-1083.

[18] Zhang DF, Wei LJ, Leung SCH, et al. A binary search heuristic algorithm based on randomized local search for the rectangular strip-packing problem. INFORMS Journal on Computing, 2013,25(2):332-345.

[19] Leung SCH, Zhang DF, Sim KM. A two-stage intelligent search algorithm for the two-dimensional strip packing problem. European Journal of Operational Research, 2011,215(1):57-69.

[20] Yang SY, Han SH, Ye WG. A simple randomized algorithm for two-dimensional strip packing. Computers & Operations Research, 2013,40(1):1-8.

[21] Chen Z, Chen JL. An effective corner increment-based algorithm for the two-dimensional strip packing problem. IEEE Access, 2018,6:72906-72924.

[22] Chen MF, Li K, Zhang DF, et al. Hierarchical search-embedded hybrid heuristic algorithm for two-dimensional strip packing problem. IEEE Access, 2019,7,179086-179103.

[23] Hopfield JJ, David-W T. "Neural" computation of decisions in optimization problems. Biological Cybernetics, 1985,52(3):141-152.

[24] Vinyals O, Meire F, Navdeep J. Pointer networks. In:Advances in Neural Information Processing Systems (NIPS). 2015.2692-2700.

[25] Mnih V, Nicolas H, Alex G. Recurrent models of visual attention. arXiv:1406.62472014.2204-2212.

[26] Bello I, Hieu P, Quoc-V L, et al. Neural combinatorial optimization with reinforcement learning. arXiv preprint arXiv:1611.09940, 2016.

[27] Nazari M, Afshin O, Lawrence S, et al. Reinforcement learning for solving the vehicle routing problem. In:Proc. of the Advances in Neural Information Processing Systems (NIPS), Vol. 31.2018.9839-9849.

[28] Kool W, Herke VH, Max W. Attention, learn to solve routing problems! arXiv preprint arXiv:1803.08475, 2018.

[29] Hu HY, Zhang XD, Yan XW, et al. Solving a new 3D bin packing problem with deep reinforcement learning method. arXiv preprint arXiv:1708.05930, 2017.

[30] Bortfeldt A, Hermann G. New large benchmark instances for the two-dimensional strip packing problem with rectangular pieces. In:Proc. of the 39th Annual Hawaii Int'l Conf. on System Sciences (HICSS 2006). IEEE, 2006.30b.

[31] Wang PY, Christine LV. Data set generation for rectangular placement problems. European Journal of Operational Research, 2001,134(2):378-391.

[32] Belov G, Guntram S, Mukhacheva EA, et al. One-dimensional heuristics adapted for two-dimensional rectangular strip packing. Journal of the Operational Research Society, 2008,59:823-832.

[33] Hopper E. Two-dimensional packing utilising evolutionary algorithms and other meta-heuristic methods. Cardiff:University of Wales, 2000.

[34] Berkey JO, Wang PY. Two-dimensional finite bin-packing algorithms. Journal of the Operational Research Society, 1987,38(5):423-429.

[35] Martello S, Daniele V. Exact solution of the two-dimensional finite bin packing problem. Management Science, 1998,44(3):388-399.

[36] Mumford-Valenzuela CL, Janis V, Wang PY. Heuristics for Large Strip Packing Problems with Guillotine Patterns:An Empirical Study. Springer, 2004.501-522.

[37] Leung SCH, Zhang DF. A fast layer-based heuristic for non-guillotine strip packing. Expert Systems with Applications, 2011, 38(10):13032-13042.

[38] Glorot A, Bengio Y. Understanding the difficulty of training deep feedforward neural networks. In:Proc. of the 13th Int'l Conf. on Artificial Intelligence and Statistics, PMLR, 2010.249-256.

[39] Chen MF. Research on hybrid heuristic algorithm for two-dimensional strip packing problem[MS. Thesis]. Xiamen:Xiamen University, 2020(in Chinese with English abstract).

附中文参考文献:

[8] 黄文奇,刘景发.基于欧氏距离的矩形Packing问题的确定性启发式求解算法.计算机学报,2006,29(5):734-739.

[15] 何琨,姬朋立,李初民.求解二维矩形Packing面积最小化问题的动态归约算法.软件学报,2013,24(9):2078-2088. http://www.jos.org.cn/1000-9825/4404.htm[doi:10.3724/SP.J.1001.2013.04404]

[39] 陈梦烦.二维条形装箱问题的混合启发式算法研究[硕士学位论文].厦门:厦门大学,2020.

引用本文

阳名钢,陈梦烦,杨双远,张德富.求解二维装箱问题的强化学习启发式算法.软件学报,2021,32(12):3684-3697

复制

文章指标

点击次数:2155
下载次数: 5663
HTML阅读次数: 2353
引用次数: 0

历史

收稿日期:2020-07-14
最后修改日期:2020-08-20
录用日期:
在线发布日期: 2021-05-21
出版日期: 2021-12-06

微信服务号

微信订阅号

引用本文

分享

文章指标

历史

文章二维码

微信服务号

微信订阅号

引用本文

分享

微信扫一扫：分享

文章指标

历史

文章二维码