Brain-inspired Ultra-large-scale Deep Neural Network System

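About the authors:

Lv Jiancheng (吕建成, 1973-), male, Ph.D., professor, doctoral supervisor, CCF senior member. Main research areas: fundamental theory of neural networks, natural language processing, smart healthcare, smart culture and tourism, industrial intelligence;
Han Junwei (韩军伟, 1977-), male, Ph.D., professor, doctoral supervisor, CCF distinguished member. Main research areas: artificial intelligence, pattern recognition, brain-inspired computing, remote sensing image interpretation;
Ye Qing (叶庆, 1989-), male, Ph.D. candidate. Main research areas: distributed training of neural networks, federated learning;
Wu Feng (吴枫, 1969-), male, Ph.D., professor, doctoral supervisor, CCF distinguished member. Main research areas: video coding and communication, multimedia content analysis, computer vision;
Tian Yuxin (田煜鑫, 1998-), male, Ph.D. candidate. Main research areas: deep learning and its applications.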

Corresponding author:

Lv Jiancheng, E-mail: lvjiancheng@scu.edu.cn

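Funding:

National Key Research and Development Program of China (2017YFB1002201); National Science Fund for Distinguished Young Scholars (61625204); National Natural Science Foundation of China (61836006)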


Brain-inspired Large-scale Deep Neural Network System
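    Abstract (translated from the Chinese version):

    Large-scale neural networks exhibit strong end-to-end representation ability and can approximate nonlinear functions to arbitrary accuracy. They perform well in many fields and have become an important research direction. For example, the natural language processing (NLP) model GPT has, after several years of development, reached 175 billion network parameters and achieves state-of-the-art performance on multiple NLP benchmarks. However, under the existing ways of organizing neural networks, current large-scale networks can hardly reach the connection scale of the biological neural network of the human brain. Existing large-scale neural networks also perform poorly in multi-channel collaborative processing, knowledge storage and transfer, and continual learning. This paper proposes building a large-scale neural network model inspired by the functional mechanisms of the human brain. Taking the division of brain regions and their functional mechanisms as inspiration, the model integrates a large amount of existing data and pre-trained models, is built modularly by analogy with the functional partition of the brain, and uses learning algorithms derived from brain functional mechanisms. Given a scene input and a target, it automatically builds a neural network pathway and designs a neural network model to produce the output. The model focuses on constructing the relation between the input space and the output space and improves its relational mapping ability through continual learning. The goal is to give the model multi-channel collaborative processing ability, to realize knowledge storage and continual learning, and to move toward artificial general intelligence. The whole model, all data, and the brain-inspired functional regions are managed by a database system, which also integrates distributed neural network training algorithms to support efficient training of ultra-large-scale neural networks. The paper presents one line of thinking toward artificial general intelligence and validates the feasibility of the model on tasks in several different modalities.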

    Abstract:

    Large-scale deep neural networks (DNNs) exhibit powerful end-to-end representation ability and can approximate nonlinear functions to arbitrary accuracy. They show excellent performance in several fields and have become an important development direction. For example, the natural language processing (NLP) model GPT, after years of development, now has 175 billion network parameters and achieves state-of-the-art performance on several NLP benchmarks. However, under the existing ways of organizing deep neural networks, current large-scale networks can hardly reach the connection scale of the biological neural network of the human brain. At the same time, existing large-scale DNNs do not perform well in multi-channel collaborative processing, knowledge storage, and reasoning. This study proposes a brain-inspired large-scale DNN model. Inspired by the division of brain regions and their functional mechanisms, the model is built modularly according to the functional partition of the brain, integrates a large amount of existing data and pre-trained models, and uses learning algorithms derived from the functional mechanisms of the brain. Given a scene input and a target, the model automatically builds a neural network pathway and designs a DNN to produce the output. The model should not only learn the correlation between input and output but also have multi-channel collaborative processing capability to improve the quality of that correlation, thereby realizing knowledge storage and reasoning ability, which can be regarded as a step toward artificial general intelligence. The whole model, all data sets, and the brain-inspired functional areas are managed by a database system, which is equipped with distributed training algorithms to support efficient training of the large-scale DNN on computing clusters. This study thus proposes one idea for moving toward artificial general intelligence, and the large-scale model is validated on tasks in several different modalities.
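
    The pathway idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the names `RegionModule`, `RegionRegistry`, `build_pathway`, and `run_pathway` are hypothetical, the "modules" are plain Python functions standing in for pre-trained networks, and a breadth-first search over representation types is assumed here as a simple stand-in for the learned pathway construction the paper refers to.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass(frozen=True)
class RegionModule:
    """Hypothetical 'functional region': maps one representation type to another."""
    name: str
    source: str                    # representation type the module consumes
    target: str                    # representation type the module produces
    fn: Callable[[Any], Any]       # stand-in for a pre-trained network


class RegionRegistry:
    """Toy registry of modules (the abstract mentions a database system for this role)."""

    def __init__(self) -> None:
        self._modules: List[RegionModule] = []

    def register(self, module: RegionModule) -> None:
        self._modules.append(module)

    def build_pathway(self, source: str, target: str) -> List[RegionModule]:
        """Breadth-first search for the shortest chain of modules from source to target."""
        queue = deque([(source, [])])
        visited = {source}
        while queue:
            current, path = queue.popleft()
            if current == target:
                return path
            for module in self._modules:
                if module.source == current and module.target not in visited:
                    visited.add(module.target)
                    queue.append((module.target, path + [module]))
        raise LookupError(f"no pathway from {source!r} to {target!r}")


def run_pathway(pathway: List[RegionModule], x: Any) -> Any:
    """Apply the modules of a pathway in order."""
    for module in pathway:
        x = module.fn(x)
    return x


# Illustrative modules only; real regions would wrap pre-trained models.
registry = RegionRegistry()
registry.register(RegionModule("text_encoder", "text", "embedding",
                               lambda s: [len(w) for w in s.split()]))
registry.register(RegionModule("image_encoder", "image", "embedding",
                               lambda rows: [sum(r) for r in rows]))
registry.register(RegionModule("classifier", "embedding", "label",
                               lambda v: "long" if sum(v) > 10 else "short"))

text_pathway = registry.build_pathway("text", "label")
image_pathway = registry.build_pathway("image", "label")
text_result = run_pathway(text_pathway, "brain inspired network")
image_result = run_pathway(image_pathway, [[1, 2], [3, 4]])
```

    In this sketch both modalities share the `classifier` module, which loosely mirrors the multi-channel reuse of functional regions; how the actual system selects and trains modules is not specified in the abstract.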

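Cite this article

Lv Jiancheng, Ye Qing, Tian Yuxin, Han Junwei, Wu Feng. Brain-inspired Large-scale Deep Neural Network System. Journal of Software (软件学报), 2022, 33(4): 1412-1429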

History
  • Received: 2021-05-20
  • Last revised: 2021-07-16
  • Published online: 2021-10-26
  • Publication date: 2022-04-06