时序数据曲线排齐的相关性分析方法
作者:
作者单位:

作者简介:

通讯作者:

中图分类号:

基金项目:

国家自然科学基金(61273291, 71031006); 山西省回国留学人员科研基金(2012-008); 中国民航信息技术科研基地开放基金(CAAC-ITRB-201305)


Correlation Analysis in Curve Registration of Time Series
Author:
Affiliation:

Fund Project:

  • 摘要
  • |
  • 图/表
  • |
  • 访问统计
  • |
  • 参考文献
  • |
  • 相似文献
  • |
  • 引证文献
  • |
  • 资源附件
  • |
  • 文章评论
    摘要:

    时序数据是数据挖掘的一类重要对象.在做时序数据分析时,若不考虑数据的时差,则会造成相关性的误判.所以,时序数据存在相关性和时差相互制约的问题.通过对时序数据的相关性和协同性进行研究,给出了双序列的相关性判定方法和曲线排齐方法.首先,从时间弯曲的角度分析了两类相关性错误产生的原因及其特点;然后,根据相关系数的渐近分布得到相关系数在一定显著性水平上的界,将两者综合得到基于时移序列相关系数特征的相关性判定方法;最后,提出一种基于相关系数最大化的曲线排齐模型,其适用范围比AISE准则更广.模型采用光滑广义期望最大化(S-GEM)算法求解时间弯曲函数.在构造数据和真实数据上的数值实验结果表明:该相关性判别方法在伪回归识别中,比常规的3种相关系数以及Granger因果检验更有效;提出的S-GEM算法在大多数情况下明显优于连续单调排齐法(CMRM)、自模型排齐法(SMR)和极大似然排齐法(MLR).该文考虑的是双序列的线性相关问题和函数型曲线排齐方法,这些结果可为回归分析的相关性判定和时间对齐提供理论基础,并为多序列相关性分析和曲线排齐提供参考方向.

    Abstract:

    Time series data is an important object of data mining. In analysis of time series, misjudgment of correlation will occur if time lags are not considered. Therefore, there exists mutual restraint between correlation and time lags in time series. Based on the exploration of correlation and simultaneousness of time series, the correlation identification and curve registration methods for double sequences are given in this paper. Concretely, the study investigates the reasons and characteristics of two types of errors in correlation analysis in the view of time warping, and then deduces the correlation coefficient’s bounds in a certain significance level by its asymptotic distribution. Further, a correlation identification method based on time-lag series is proposed. Finally, the curve registration model of maximizing the correlation coefficient is presented with a broader application than AISE. Smoothing-generalized expectation maximization (S-GEM) algorithm is used to solve the time warping function of the new model. The experimental results on simulated and real data demonstrate that the proposed correlation identification approach is more effective than 3 correlation coefficients and Granger causality test in recognition of spurious regression. The registration method provided is obviously performed better than the classical continuous monotone registration method (CMRM), Self-modeling registration (SMR) and maximum likelihood registration (MLR) in most situations. Linear correlation of double series and functional curve registration are considered here, and the results can provide the theoretical basis for correlation identification and time alignment in regression and reference direction for correlation analysis and curves registration of multiple series.

    参考文献
    相似文献
    引证文献
引用本文

姜高霞,王文剑.时序数据曲线排齐的相关性分析方法.软件学报,2014,25(9):2002-2017

复制
分享
文章指标
  • 点击次数:
  • 下载次数:
  • HTML阅读次数:
  • 引用次数:
历史
  • 收稿日期:2014-01-20
  • 最后修改日期:2014-04-22
  • 录用日期:
  • 在线发布日期: 2014-09-09
  • 出版日期:
文章二维码
您是第位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京市海淀区中关村南四街4号,邮政编码:100190
电话:010-62562563 传真:010-62562533 Email:jos@iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号