Abstract: How to efficiently deploy machine learning models on mobile devices has drawn considerable attention in both academia and industry, and model training is a critical part of this problem. However, with growing public attention to data privacy and recently adopted laws and regulations, it has become harder for developers to collect training data from users, and thus harder to train high-quality models. Researchers have been exploring approaches to training neural networks on decentralized data. This work summarizes those efforts and points out their limitations. To this end, it presents a novel paradigm for training neural networks on mobile devices, which places all training computations associated with private data on local devices and requires no data to be uploaded in any form. This training paradigm is named autonomous learning. To deal with the two main challenges of autonomous learning, i.e., limited data volume and insufficient computing power on mobile devices, the first autonomous learning system, AutLearn, is designed and implemented. It combines a cloud-client cooperation methodology, in which the cloud pre-trains on public data and the client performs transfer learning on private data, with data augmentation techniques to ensure model convergence on mobile devices. Furthermore, by using a series of optimization techniques such as model compression, a neural network compiler, and runtime cache reuse, AutLearn can significantly reduce the on-client training cost. Two classical scenarios of autonomous learning are implemented on AutLearn and evaluated in a set of experiments. The results show that AutLearn can train neural networks with accuracy comparable to, or even higher than, that of traditional centralized or federated training, while preserving privacy. AutLearn can also significantly reduce the computational and energy cost of neural network training on mobile devices.