Abstract: In the era of big data, data sets have grown dramatically in the number of samples, features, and classes, and the classes are usually organized in a hierarchical structure. Selecting features for such hierarchical data is therefore of great significance, and several feature selection algorithms for this setting have been proposed in recent years. However, the existing algorithms do not take full advantage of the information in the class hierarchy, and they ignore the common and specific features of different class nodes. This study proposes a label-correlation-based feature selection algorithm for hierarchical classification with common and specific features. The algorithm uses recursive regularization to select the specific features for each internal node of the hierarchy, exploits the hierarchical structure to analyze label correlation, and then applies a regularization penalty to select the common features of each subtree. The proposed model can handle not only tree-structured hierarchies but also, directly, the more complex hierarchies structured as directed acyclic graphs (DAGs). Experimental results on six hierarchical tree data sets and four hierarchical DAG data sets demonstrate the effectiveness of the proposed algorithm.
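To make the idea of node-specific feature selection concrete, the following is a minimal sketch and not the algorithm proposed in this study. It assumes a hypothetical toy hierarchy (`CHILDREN`), synthetic data, and a swapped-in technique: a least-squares loss with an l2,1-norm penalty solved by iteratively reweighted least squares, which the abstract does not specify. All function names and parameters (`l21_feature_scores`, `node_feature_scores`, `lam`) are illustrative assumptions. The sketch covers only the per-node (specific feature) step and omits label correlation and the common features of each subtree.

```python
import numpy as np

# Hypothetical toy hierarchy: internal node -> children (leaves have no entry).
CHILDREN = {"root": ["A", "B"], "A": ["a1", "a2"], "B": ["b1", "b2"]}


def leaves_under(node):
    """Return the set of leaf classes reachable from `node`."""
    if node not in CHILDREN:
        return {node}
    return set().union(*(leaves_under(c) for c in CHILDREN[node]))


def l21_feature_scores(X, Y, lam=1.0, n_iter=30, eps=1e-8):
    """Assumed stand-in objective: min_W ||XW - Y||_F^2 + lam * ||W||_{2,1}.

    Solved by iteratively reweighted least squares; returns the row norms
    of W, one score per feature (larger means more useful at this node).
    """
    Xc = X - X.mean(axis=0)  # centering plays the role of an intercept
    Yc = Y - Y.mean(axis=0)
    D = np.eye(X.shape[1])
    for _ in range(n_iter):
        W = np.linalg.solve(Xc.T @ Xc + lam * D, Xc.T @ Yc)
        row_norms = np.sqrt((W ** 2).sum(axis=1))
        D = np.diag(1.0 / (2.0 * row_norms + eps))  # reweighting step
    return row_norms


def node_feature_scores(X, labels, node, lam=1.0):
    """Score features for separating the children of one internal node."""
    kid_leaves = [leaves_under(c) for c in CHILDREN[node]]
    idx = [i for i, y in enumerate(labels) if any(y in s for s in kid_leaves)]
    # One-hot target: which child subtree each sample falls into.
    Y = np.array([[float(labels[i] in s) for s in kid_leaves] for i in idx])
    return l21_feature_scores(X[idx], Y, lam)


def make_toy_data(n=200, d=6, seed=0):
    """Synthetic data: feature 0 splits A/B, 1 splits a1/a2, 2 splits b1/b2."""
    rng = np.random.default_rng(seed)
    names = ["a1", "a2", "b1", "b2"]
    labels = [names[i % 4] for i in range(n)]
    X = 0.5 * rng.standard_normal((n, d))
    for i, y in enumerate(labels):
        X[i, 0] += 3.0 if y in ("b1", "b2") else -3.0
        X[i, 1] += {"a1": 3.0, "a2": -3.0}.get(y, 0.0)
        X[i, 2] += {"b1": 3.0, "b2": -3.0}.get(y, 0.0)
    return X, labels


X, labels = make_toy_data()
scores = {node: node_feature_scores(X, labels, node) for node in CHILDREN}
# Index of the highest-scoring (most node-specific) feature per internal node.
specific = {node: int(np.argmax(s)) for node, s in scores.items()}
print(specific)
```

In this toy setting each internal node favors a different feature, which is the intuition behind selecting specific features per node. Because `leaves_under` only follows child links, the same traversal also works when a node has more than one parent, as in a DAG.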