Shared-bike Demand Prediction Model Based on Data Field Clustering
Author biographies:

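Qiao Shaojie (1981-), male, Ph.D., professor, CCF distinguished member. Research interests: machine learning, urban computing, deep learning.
Huang Faliang (1975-), male, Ph.D., associate professor. Research interests: data mining.
Han Nan (1984-), female, Ph.D., associate professor. Research interests: machine learning.
Yuan Chang'an (1964-), male, Ph.D., professor, doctoral supervisor, CCF professional member. Research interests: databases.
Yue Kun (1979-), male, Ph.D., professor, doctoral supervisor, CCF senior member. Research interests: artificial intelligence.
Ding Peng (1993-), male, master's student. Research interests: machine learning.
Yi Yugen (1986-), male, Ph.D., associate professor, CCF professional member. Research interests: machine learning, deep learning.
Gutierrez LA (1980-), male, Ph.D., researcher. Research interests: machine learning.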

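Corresponding author:

Han Nan, E-mail: hannan@cuit.edu.cn

Funding:

National Natural Science Foundation of China (61772091, 61802035, 61962006, 62072311, U1802271, U2001212); Sichuan Science and Technology Program (2021JDJQ0021, 2020YFG0153, 20YYJC2785, 2019YFS0067, 2020YJ0481, 2020YFS0466, 2020YJ0430, 2020YDR0164); CCF-Huawei Database System Innovation Research Plan (CCF-HuaweiDBIR2020004A); Guangxi Natural Science Foundation (2018GXNSFDA138005)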


Shared-bike Demand Prediction Model Based on Station Clustering
    Abstract:

    Bike-sharing systems are increasingly popular and have accumulated massive volumes of trip trajectory data. In such systems, users borrow and return bikes at random, and their behavior is affected by dynamic factors such as weather and time of day. As a result, bikes become unevenly distributed across stations, which degrades the user experience and causes heavy economic losses for operators. This paper proposes a novel shared-bike demand prediction algorithm based on station clustering. A bike transfer network is constructed to compute the activeness of each station. Taking both the geographical location of stations and their bike transfer patterns into account, and following the idea of data field clustering, stations that are close to one another and have similar usage patterns are grouped into the same cluster; a method for determining the optimal number of cluster centers is also given. The influence of time and weather on station demand is analyzed, and the Pearson correlation coefficient is used to select the most strongly correlated weather features from real weather data. These features are combined with the historical bike demand within each cluster and converted into a three-dimensional vector. A multi-feature long short-term memory (LSTM) neural network is trained on this vector to predict the bike demand in each cluster at 30-minute intervals. Comparisons with traditional machine learning algorithms and state-of-the-art methods show that the proposed model significantly improves prediction performance.
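The feature-selection and input-shaping steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`pearson`, `select_weather_features`, `make_windows`), the toy weather columns, the `top_k` cutoff, and the window length of 6 steps are all assumptions made for illustration. The sketch ranks weather features by absolute Pearson correlation with cluster demand, then stacks the selected features with historical demand into a three-dimensional array of shape (samples, time steps, features), which is the layout an LSTM typically consumes.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    return 0.0 if denom == 0 else float((xc * yc).sum() / denom)

def select_weather_features(weather, demand, top_k=2):
    """Rank weather columns by |r| against demand and keep the top_k names."""
    scores = {name: abs(pearson(col, demand)) for name, col in weather.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

def make_windows(features, target, steps):
    """Slice a (T, F) matrix into a (T - steps, steps, F) tensor.

    y[i] is the target value at the slot right after window i.
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    X = np.stack([features[i:i + steps] for i in range(len(features) - steps)])
    y = target[steps:]
    return X, y

# Toy data (hypothetical): one day of 30-minute slots for a single cluster.
rng = np.random.default_rng(0)
T = 48
temperature = rng.uniform(10, 30, T)
humidity = rng.uniform(30, 90, T)
wind = rng.uniform(0, 10, T)
demand = 2.0 * temperature - 0.1 * humidity + rng.normal(0, 1, T)
weather = {"temperature": temperature, "humidity": humidity, "wind": wind}

chosen = select_weather_features(weather, demand, top_k=2)
# Historical demand is the first feature, followed by the selected weather features.
feature_matrix = np.column_stack([demand] + [weather[name] for name in chosen])
X, y = make_windows(feature_matrix, demand, steps=6)
print(chosen, X.shape, y.shape)
```

With 48 slots and a window of 6 steps, `X` has shape (42, 6, 3) and `y` has shape (42,). The resulting tensor could be passed to any LSTM implementation that accepts batch-first input; the choice of framework and network size is left open here because the abstract does not specify them.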

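Cite this article

Qiao Shaojie, Han Nan, Yue Kun, Yi Yugen, Huang Faliang, Yuan Chang'an, Ding Peng, Louis Alberto Gutierrez. Shared-bike demand prediction model based on data field clustering. Ruan Jian Xue Bao/Journal of Software, 2022, 33(4): 1451-1476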

History
  • Received: 2021-01-17
  • Last revised: 2021-07-16
  • Published online: 2021-10-26
  • Published: 2022-04-06