Survey on Task Scheduling of Deep Learning Training Based on Performance Modeling
Author:
CLC number:

TP18

Funding:

Major Innovation Project of Shandong Province (2021CXGC010101); National Natural Science Foundation of China (62302489)



Abstract:

In recent years, deep learning research has found widespread application worldwide. To improve the training efficiency of large-scale deep learning models, the industry typically builds GPU clusters and equips them with efficient task schedulers. However, deep learning training jobs exhibit complex performance characteristics such as performance heterogeneity and placement-topology sensitivity, and performance-oblivious scheduling is prone to low resource utilization and poor training efficiency. In response to this challenge, a large number of schedulers for deep learning training jobs based on performance modeling have recently emerged. By constructing accurate performance models, these schedulers gain insight into the intricate performance characteristics of jobs and, on this basis, design better scheduling algorithms that yield more efficient scheduling plans. This survey first categorizes and reviews the performance modeling methods used by existing schedulers from the perspective of modeling design. It then systematically analyzes existing task scheduling work according to how the schedulers exploit performance models to optimize scheduling. Finally, it discusses future research directions for performance modeling and scheduling.
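The core idea sketched in the abstract — profile a job, fit a performance model, then schedule against the model's predictions — can be illustrated with a minimal example. The sketch below is not taken from any of the surveyed schedulers; the model form T(n) = a + b/n + c·n (fixed overhead, parallelizable compute, communication cost growing with worker count) and all profiled numbers are illustrative assumptions.

```python
# A toy performance-model-based allocator: profile iteration time at a few GPU
# counts, fit T(n) = a + b/n + c*n by least squares, then pick the GPU count
# with the best predicted throughput (1 / T(n)).

import numpy as np

def fit_iteration_time(gpu_counts, times):
    """Least-squares fit of T(n) = a + b/n + c*n from profiled samples."""
    n = np.asarray(gpu_counts, dtype=float)
    X = np.column_stack([np.ones_like(n), 1.0 / n, n])
    coef, *_ = np.linalg.lstsq(X, np.asarray(times, dtype=float), rcond=None)
    return coef  # (a, b, c)

def predict_time(coef, n):
    a, b, c = coef
    return a + b / n + c * n

def best_allocation(coef, max_gpus):
    """GPU count maximizing predicted throughput under the fitted model."""
    return max(range(1, max_gpus + 1), key=lambda n: 1.0 / predict_time(coef, n))

# Synthetic profiling samples (seconds per iteration at n GPUs), for illustration.
samples = {1: 10.2, 2: 5.65, 4: 3.675, 8: 3.2875}
coef = fit_iteration_time(list(samples), list(samples.values()))
print(best_allocation(coef, 16))  # → 7: communication cost caps useful parallelism
```

With these numbers the fitted communication term c·n eventually outweighs the b/n compute savings, so the model stops the allocation at 7 GPUs even though 16 are available — the kind of placement- and scale-aware decision a performance-oblivious scheduler cannot make.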

Cite this article:

Yang ZC, Wu H, Wu YW, Zhang WB. Survey on task scheduling of deep learning training based on performance modeling. Ruan Jian Xue Bao/Journal of Software, 2025, 36(4): 1570–1589 (in Chinese with English abstract).

History
  • Received: 2023-09-25
  • Revised: 2023-11-06
  • Published online: 2024-06-20