[Keywords]
[Abstract]
In contrast to traditional online learning, which studies a fixed feature space, feature evolvable learning usually assumes that features do not vanish or appear arbitrarily; instead, old features vanish and new features emerge as the hardware devices that collect the data are replaced. However, existing feature evolvable learning methods use only the first-order information of data streams and ignore second-order information, which can capture correlations between features and significantly improve classification performance. A confidence-weighted learning for feature evolution (CWFE) algorithm is proposed to address this problem. First, second-order confidence-weighted learning is introduced to update the prediction model on the data stream. Next, to make full use of the learned model, a linear mapping is learned during the overlap period to recover the old features. Then, the existing model is updated with the recovered old features while a new prediction model is learned from the new features. Furthermore, two ensemble methods are used to combine these two models. Finally, empirical studies show that the proposed algorithm outperforms state-of-the-art feature evolvable learning algorithms.
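The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the CWFE algorithm itself. The second-order update is the AROW rule, a close relative of confidence-weighted learning, used here as a stand-in. The ensemble is a simple exponentially weighted combination, which may differ from the two ensemble methods of the paper. All names (`SecondOrderLearner`, `fit_linear_mapping`, `combine`, `run_demo`), the constants `r` and `eta`, and the synthetic data stream are assumptions made for illustration.

```python
import numpy as np


class SecondOrderLearner:
    """AROW-style learner (stand-in for confidence-weighted learning)."""

    def __init__(self, dim, r=1.0):
        self.w = np.zeros(dim)      # mean of the weight distribution
        self.sigma = np.eye(dim)    # covariance: confidence and feature correlations
        self.r = r                  # regularization constant (assumed value)

    def score(self, x):
        return float(self.w @ x)

    def update(self, x, y):
        """Update on one example with label y in {-1, +1}; return the hinge loss."""
        loss = max(0.0, 1.0 - y * self.score(x))
        if loss > 0.0:
            sx = self.sigma @ x
            beta = 1.0 / (float(x @ sx) + self.r)
            self.w += loss * beta * y * sx          # move the mean along sigma @ x
            self.sigma -= beta * np.outer(sx, sx)   # shrink variance along x
        return loss


def fit_linear_mapping(new_feats, old_feats):
    """Least-squares M such that new_feats @ M approximates old_feats."""
    M, *_ = np.linalg.lstsq(new_feats, old_feats, rcond=None)
    return M


def combine(scores, cum_losses, eta=0.5):
    """Weight each model by exp(-eta * cumulative loss) and average the scores."""
    losses = np.asarray(cum_losses, dtype=float)
    weights = np.exp(-eta * (losses - losses.min()))  # shift for numerical stability
    weights /= weights.sum()
    return float(weights @ np.asarray(scores, dtype=float))


def run_demo(seed=0, n_old=300, n_overlap=50, n_new=300):
    rng = np.random.default_rng(seed)
    d_new, d_old = 4, 5
    u = rng.normal(size=d_new)                        # hidden labeling direction
    Q, _ = np.linalg.qr(rng.normal(size=(d_old, d_new)))
    P = Q.T                                           # synthetic assumption: old = new @ P

    def sample(n):
        x_new = rng.normal(size=(n, d_new))
        y = np.where(x_new @ u >= 0, 1.0, -1.0)
        return x_new @ P, x_new, y

    # Old period: only the old features are observed.
    old_model = SecondOrderLearner(d_old)
    x_old, _, y = sample(n_old)
    for xo, yt in zip(x_old, y):
        old_model.update(xo, yt)

    # Overlap period: both feature sets are observed, so the mapping can be fit.
    x_old, x_new, y = sample(n_overlap)
    for xo, yt in zip(x_old, y):
        old_model.update(xo, yt)
    M = fit_linear_mapping(x_new, x_old)
    map_err = float(np.abs(x_new @ M - x_old).max())  # in-sample recovery error

    # New period: only the new features are observed.
    new_model = SecondOrderLearner(d_new)
    cum_losses = np.zeros(2)
    mistakes = 0
    _, x_new, y = sample(n_new)
    for xn, yt in zip(x_new, y):
        x_rec = xn @ M                                # recovered old features
        scores = [old_model.score(x_rec), new_model.score(xn)]
        if yt * combine(scores, cum_losses) <= 0:
            mistakes += 1
        cum_losses += [old_model.update(x_rec, yt), new_model.update(xn, yt)]
    return map_err, mistakes / n_new
```

Calling `run_demo()` returns the in-sample recovery error of the mapping and the error rate of the combined predictor over the new period. In this synthetic stream the old features are an exact linear function of the new ones, so the recovery is nearly perfect; on real data the mapping would only be approximate.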
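[CLC number]
[Funding]
National Key Research and Development Program of China (2018AAA0100905); Central Government Guided Local Science and Technology Development Fund (2021Szvup056); Key Research and Development Program of Jiangsu Province (Industry Foresight and Key Core Technologies) (BE2021028); Science and Technology Project of State Grid Corporation of China (SGJS0000DKJS2000952); Longyan Science and Technology Plan Project (2019LYF13002)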