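[Keywords, translated from Chinese]
[Abstract, translated from Chinese]
An incremental extremely random forest classifier (IERF) is proposed for online learning from data streams, especially small-sample data streams. In the IERF algorithm, each newly arrived sample is stored at its corresponding leaf node, and the Gini index is used to decide whether the current leaf node should be split and expanded. Given a limited number of samples, or even only a few, the IERF algorithm can complete the incremental construction of the classifier quickly and efficiently. Experiments on UCI datasets show that the proposed IERF algorithm has, relative to the offline batch-learning extremely random forest (extremely random forest, ERF for short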
[Keywords]
[Abstract]
This paper proposes an incremental extremely random forest (IERF) algorithm for online classification of streaming data, especially small-sample data streams. Newly arrived examples are stored at the leaf nodes and, together with the Gini index, determine when a leaf node is split, so the trees can be expanded quickly and efficiently from only a few examples. In experiments on UCI datasets, the online IERF algorithm performs comparably to, or even better than, the offline extremely random forest (ERF) method. On moderately sized training datasets, IERF outperforms the decision tree reconstruction algorithm and other incremental learning algorithms. Finally, IERF is applied to online video object tracking, including multi-object tracking, and results on challenging video sequences demonstrate its effectiveness and robustness.
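The leaf-level mechanism described in the abstract (store each arriving example at its leaf, then use the Gini index to decide whether to split) can be illustrated with a minimal single-tree sketch. This is not the paper's implementation: the class `Node`, the parameters `min_samples` and `gini_threshold`, and the exact split rule (split once a leaf holds enough examples and its Gini impurity exceeds a threshold, using a random feature and a random cut point) are assumptions made for illustration only.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum_k p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


class Node:
    """One tree node; while it is a leaf it stores the examples that reached it."""

    def __init__(self, min_samples=4, gini_threshold=0.2, rng=None):
        self.min_samples = min_samples        # hypothetical parameter
        self.gini_threshold = gini_threshold  # hypothetical parameter
        self.rng = rng if rng is not None else random.Random(0)
        self.samples = []    # (features, label) pairs stored at this leaf
        self.feature = None  # split feature index; None while a leaf
        self.cut = None      # split threshold on that feature
        self.left = None
        self.right = None

    def is_leaf(self):
        return self.feature is None

    def _route(self, x):
        """Follow the split tests down to the leaf responsible for x."""
        node = self
        while not node.is_leaf():
            node = node.left if x[node.feature] < node.cut else node.right
        return node

    def update(self, x, y):
        """Store one new example at its leaf, then check whether to split."""
        leaf = self._route(x)
        leaf.samples.append((x, y))
        leaf._maybe_split()

    def _maybe_split(self):
        labels = [y for _, y in self.samples]
        if len(labels) < self.min_samples or gini(labels) <= self.gini_threshold:
            return  # too few examples, or the leaf is already pure enough
        # Extremely randomized choice: random feature, random cut point.
        f = self.rng.randrange(len(self.samples[0][0]))
        values = [x[f] for x, _ in self.samples]
        lo, hi = min(values), max(values)
        if lo == hi:
            return  # feature is constant here; retry on a later example
        cut = self.rng.uniform(lo, hi)
        left = [(x, y) for x, y in self.samples if x[f] < cut]
        right = [(x, y) for x, y in self.samples if x[f] >= cut]
        if not left or not right:
            return
        self.feature, self.cut = f, cut
        self.left = Node(self.min_samples, self.gini_threshold, self.rng)
        self.right = Node(self.min_samples, self.gini_threshold, self.rng)
        self.left.samples, self.right.samples = left, right
        self.samples = []  # stored examples now live in the children

    def predict(self, x):
        """Majority label among the examples stored at the reached leaf."""
        labels = [y for _, y in self._route(x).samples]
        return Counter(labels).most_common(1)[0][0] if labels else None
```

A stream is consumed by calling `update(x, y)` once per arriving example; a forest in this sketch would simply be a list of such trees, each with its own random generator, voting at prediction time.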
[CLC number]
[Funding]
National Natural Science Foundation of China (90707003, 60970094)