数据流上的预测聚集查询处理算法
DOI:
作者:
作者单位:

作者简介:

通讯作者:

中图分类号:

基金项目:

Supported by the National Natural Science Foundation of China under Grant No.60473075 (国家自然科学基金); the Key Project of the Natural Science Foundation of Heilongjiang Province under Grant No.ZJG03-05 (黑龙江省自然科学基金重点项目)


Processing Algorithms for Predictive Aggregate Queries over Data Streams
Author:
Affiliation:

Fund Project:

  • 摘要
  • |
  • 图/表
  • |
  • 访问统计
  • |
  • 参考文献
  • |
  • 相似文献
  • |
  • 引证文献
  • |
  • 资源附件
  • |
  • 文章评论
    摘要:

    实时数据流未来趋势的预测具有重要的实际应用意义.例如,在环境监测传感器网络中,通过对感知数据流进行预测聚集查询,观察者可以预测网络覆盖的区域在未来一段时间内的平均温度和湿度,以确定是否会发生异常事件.目前的研究工作多数集中在数据流上当前数据的查询,数据流上预测查询的研究工作还很少.采用多元线性回归方法,给出了数据流上的聚集值预测模型,提出了一种数据流预测聚集查询处理方法.当预测失败的次数大于预先给定的阈值时,给出了一种预测模型自动调整策略,以降低预测误差.还提出了滑动窗口的更新周期、数据流的流速对预测精度影响的数学模型.理论分析与实验结果表明,提出的预测聚集查询处理算法具有较高的性能,并且能够返回满足用户精度要求的预测查询结果.在实验中,采用TPC-H国际标准测试数据和TAO(tropical atmosphere ocean)测量的海洋表面空气温度数据来构造数据流.

    Abstract:

    It is very important in a lot of applications to forecast future trend of data streams. For example, using predictive queries to a sensor network for monitoring environment, observers can forecast future average temperature and humidity in the area covered by the network to determine abnormal events. Recent works on query processing over data streams mainly focused on approximate queries over newly arriving data. To the best of the knowledge, there is nothing to date in the literature on predictive query processing over data streams. Adopting multivariable linear regression, a predictive mathematical model for forecasting the aggregate value over data streams is first proposed. Then, based on the model, a predictive aggregate query processing method over data streams is proposed in the paper. When the frequency of forecast failing is greater than a predefined threshold, an adaptive strategy for the predictive mathematical model is proposed. A mathematical model that characterizes the affects of the updating cycle of sliding window and data stream rate on predictive accuracy is also presented.Analytical and experimental results show that the proposed method is very effective, and the proposed algorithms have higher performance and provide better prediction of aggregate values over data streams to users. In experiments the TPC-H data and ocean air temperature data measured by TAO (tropical atmosphere ocean) are used to construct data streams.

    参考文献
    相似文献
    引证文献
引用本文

李建中,郭龙江,张冬冬,王伟平.数据流上的预测聚集查询处理算法.软件学报,2005,16(7):1252-1261

复制
分享
文章指标
  • 点击次数:
  • 下载次数:
  • HTML阅读次数:
  • 引用次数:
历史
  • 收稿日期:2004-05-17
  • 最后修改日期:2005-02-03
  • 录用日期:
  • 在线发布日期:
  • 出版日期:
您是第位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京市海淀区中关村南四街4号,邮政编码:100190
电话:010-62562563 传真:010-62562533 Email:jos@iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号