[关键词]
[摘要]
在多标记学习(MLL)问题中,每个示例都与一组标记相关联.为了实现对未见示例的高效预测,挖掘和利用标记之间的关系是至关重要的.大多数已有的研究都将关系简化为标记之间的相关性,而相关性又通常基于标记的共现性.揭示了因果关系对于描述一个标记在学习过程中如何帮助另一个标记更为重要.基于这一观察,提出了两种策略来从标记因果有向无环图(DAG)中生成标记的因果顺序,同时使得生成的因果顺序都遵循因标记应该在果标记之前的准则.第1种策略的主要思想是对随机顺序进行排序,使其满足DAG中的因果关系.而第2种策略的主要思想是根据DAG的结构,将标记放入许多不相交的拓扑层次中,再通过它们的拓扑结构进行排序.进一步,通过将因果顺序纳入到分类器链(CC)模型中,提出了一种有效的MLL方法,从而从更加本质的角度来利用标记关系.在多个数据集上的实验结果验证了该方法确实能够挖掘出有效的标记因果顺序,并帮助提升学习性能.
[Key word]
[Abstract]
In multi-label learning (MLL) problems, each example is associated with a set of labels. To train a predictor that performs well on unseen examples, exploiting the relations between labels is crucial. Most existing studies simplify these relations to correlations among labels, typically based on label co-occurrence. This study shows that causal relations are more essential for describing how one label can help another during the learning process. Based on this observation, two strategies are proposed to generate causal orders of labels from the label causal directed acyclic graph (DAG), under the constraint that a cause label must precede its effect label. The first strategy sorts a random order so that it satisfies the cause-effect relations in the DAG. The second strategy places labels into multiple disjoint topological levels based on the structure of the DAG and then sorts the labels according to this topological structure. Furthermore, by incorporating the causal orders into the classifier chain (CC) model, an effective MLL approach is proposed that exploits label relations from a more essential perspective. Experimental results on multiple datasets validate that the extracted causal order of labels indeed provides helpful information and improves learning performance.
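The two ordering strategies summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `(cause, effect)` edge-list representation, and the tie-breaking rules are assumptions made for illustration. Strategy 1 is read here as "draw a random permutation, then repair it with a topological sort that prefers the random ranking", and strategy 2 as "assign each label the length of its longest cause chain, then sort by that level".

```python
import heapq
import random


def _build_graph(n_labels, edges):
    """Adjacency lists and in-degrees for a DAG given as (cause, effect) pairs."""
    children = {v: [] for v in range(n_labels)}
    indegree = [0] * n_labels
    for cause, effect in edges:
        children[cause].append(effect)
        indegree[effect] += 1
    return children, indegree


def repair_random_order(n_labels, edges, seed=0):
    """Strategy 1 (assumed reading): repair a random order so causes come first."""
    rng = random.Random(seed)
    perm = list(range(n_labels))
    rng.shuffle(perm)
    rank = {label: pos for pos, label in enumerate(perm)}

    children, indegree = _build_graph(n_labels, edges)
    # Kahn's algorithm; among ready labels, pick the one earliest in the random order.
    ready = [(rank[v], v) for v in range(n_labels) if indegree[v] == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        _, v = heapq.heappop(ready)
        order.append(v)
        for child in children[v]:
            indegree[child] -= 1
            if indegree[child] == 0:
                heapq.heappush(ready, (rank[child], child))
    if len(order) != n_labels:
        raise ValueError("edges contain a cycle, so no causal order exists")
    return order


def topological_levels(n_labels, edges):
    """Strategy 2, step 1 (assumed reading): level = longest cause chain into a label."""
    children, indegree = _build_graph(n_labels, edges)
    level = [0] * n_labels
    frontier = [v for v in range(n_labels) if indegree[v] == 0]
    visited = 0
    while frontier:
        v = frontier.pop()
        visited += 1
        for child in children[v]:
            level[child] = max(level[child], level[v] + 1)
            indegree[child] -= 1
            if indegree[child] == 0:
                frontier.append(child)
    if visited != n_labels:
        raise ValueError("edges contain a cycle, so no causal order exists")
    return level


def level_order(n_labels, edges):
    """Strategy 2, step 2: sort labels by level, breaking ties by label index."""
    level = topological_levels(n_labels, edges)
    return sorted(range(n_labels), key=lambda v: (level[v], v))


def respects_dag(order, edges):
    """True if every cause label appears before its effect label."""
    pos = {v: i for i, v in enumerate(order)}
    return all(pos[cause] < pos[effect] for cause, effect in edges)
```

Either order could then be handed to a classifier chain, for example through the `order` parameter of scikit-learn's `ClassifierChain`. That pairing is a suggestion for experimentation and is not taken from the source.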
[CLC number]
[Funding]
National Natural Science Foundation of China (61673201, 61921006)