Abstract: With the development of technologies such as big data, computing power, and the Internet, artificial intelligence techniques represented by machine learning and deep learning have achieved tremendous success. In particular, the emergence of various large-scale models has greatly accelerated the application of artificial intelligence across many fields. However, the success of these techniques relies heavily on massive training data and abundant computing resources, which significantly limits their application in data- or resource-scarce domains. Therefore, learning from limited samples, known as few-shot learning, has become a crucial research problem in the new wave of industrial transformation led by artificial intelligence. The most commonly used approach to few-shot learning is based on meta-learning. Such methods train on a series of related training tasks to acquire meta-knowledge for solving similar tasks, which enables fast learning on new testing tasks. Although these methods have achieved promising results on few-shot classification tasks, they assume that the training and testing tasks come from the same distribution. This implies that a sufficient number of training tasks is required for the model to generalize the learned meta-knowledge to continuously changing testing tasks. However, in some real-world scenarios where data are truly limited, ensuring an adequate number of training tasks is challenging. To address this issue, this study proposes a robust few-shot classification method based on diverse and authentic task generation (DATG). The method generates additional training tasks by applying Mixup to a small number of existing tasks, thereby aiding model learning. By constraining the diversity and authenticity of the generated tasks, the method effectively improves the generalization ability of few-shot classification methods.
Specifically, the base classes in the training set are first clustered into different clusters, and tasks are then selected from different clusters for Mixup to increase task diversity. Furthermore, performing Mixup on tasks from different clusters helps alleviate the learning of pseudo-discriminative features that are highly correlated with the categories. To ensure that the generated tasks do not deviate too far from the real distribution and mislead the model's learning, the maximum mean discrepancy (MMD) between the generated tasks and real tasks is minimized, thus ensuring the authenticity of the generated tasks. Finally, a theoretical analysis explains why the inter-cluster task Mixup strategy improves the model's generalization performance. Experimental results on multiple datasets further demonstrate the effectiveness of the proposed method.
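The pipeline described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the prototype features, cluster count, task sizes, kernel bandwidth, and the feature-level Mixup are all assumptions for the sketch, and a simple k-means stands in for whatever clustering the method actually uses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 20 base classes, each summarized by a 64-d feature prototype.
prototypes = rng.normal(size=(20, 64))

def kmeans(x, k, iters=20):
    """Minimal k-means (stand-in for any clustering of base-class prototypes)."""
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    lab = np.zeros(len(x), dtype=int)
    for _ in range(iters):
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        lab = dist.argmin(1)
        for j in range(k):
            if (lab == j).any():
                centers[j] = x[lab == j].mean(0)
    return lab

labels = kmeans(prototypes, k=4)

def sample_task(cluster_id, n_way=5, n_shot=10):
    """Draw a task whose classes all come from one cluster (synthetic samples)."""
    classes = np.flatnonzero(labels == cluster_id)
    if len(classes) == 0:                       # guard: fall back to all classes
        classes = np.arange(len(prototypes))
    chosen = rng.choice(classes, size=min(n_way, len(classes)), replace=False)
    return np.vstack([prototypes[c] + 0.1 * rng.normal(size=(n_shot, 64))
                      for c in chosen])

def mixup_tasks(task_a, task_b, alpha=0.5):
    """Inter-cluster task Mixup: convex combination of two tasks' features."""
    lam = rng.beta(alpha, alpha)
    n = min(len(task_a), len(task_b))
    return lam * task_a[:n] + (1.0 - lam) * task_b[:n]

def mmd_rbf(x, y, gamma=0.1):
    """Biased squared-MMD estimate with an RBF kernel (authenticity constraint)."""
    def gram(a, b):
        d = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d)
    return gram(x, x).mean() + gram(y, y).mean() - 2.0 * gram(x, y).mean()

# Pick the two most populated clusters so both yield non-trivial tasks.
c0, c1 = np.argsort(np.bincount(labels, minlength=4))[-2:]
task_a, task_b = sample_task(c0), sample_task(c1)
mixed = mixup_tasks(task_a, task_b)     # generated training task
penalty = mmd_rbf(mixed, task_a)        # authenticity term to minimize
```

In training, the MMD penalty would be added to the meta-learning loss so that generated tasks increase diversity without drifting away from the real task distribution; the biased MMD estimator used here is always non-negative, making it a well-behaved regularizer.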