Abstract: Deep learning allows computational models composed of multiple processing layers to learn representations of data with multiple levels of abstraction. Such models have dramatically improved the state of the art in speech recognition, visual object recognition, natural language processing, and many other domains. However, because of their many layers and large numbers of parameters, deep networks often suffer from vanishing gradients, convergence to poor local optima, overfitting, and related problems. Drawing on ensemble learning methods, this study proposes a novel deep sharing ensemble network. By jointly training an independent output layer attached to each hidden layer and injecting gradients at every depth, the network mitigates the vanishing-gradient problem, and by ensembling these multiple outputs it achieves better generalization performance.
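The idea described above can be sketched as follows. This is a minimal, hypothetical NumPy illustration (not the paper's actual implementation): each hidden layer of a shared MLP backbone feeds its own independent output head, and the final prediction averages all head outputs. During training, each head would receive its own loss, so gradients are injected directly at every depth, shortening the backpropagation path to early layers. All layer sizes and names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def softmax(z):
    # Numerically stable softmax over the last axis
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Illustrative sizes: 8 input features, 16 hidden units, 3 classes, 3 layers
in_dim, hidden, n_classes, n_layers = 8, 16, 3, 3

# Shared backbone weights (one matrix per hidden layer)
Ws = [rng.normal(0.0, 0.1, (in_dim if i == 0 else hidden, hidden))
      for i in range(n_layers)]
# One independent output head per hidden layer
heads = [rng.normal(0.0, 0.1, (hidden, n_classes)) for _ in range(n_layers)]

def forward(x):
    """Return per-head class probabilities and their ensemble average."""
    h = x
    per_head = []
    for W, H in zip(Ws, heads):
        h = relu(h @ W)               # shared hidden representation
        per_head.append(softmax(h @ H))  # this layer's own output head
    # Ensemble step: average the predictions of all heads
    return per_head, np.mean(per_head, axis=0)

x = rng.normal(size=(4, in_dim))      # a batch of 4 illustrative inputs
per_head, ensemble = forward(x)
```

In training, one would sum (or weight) a loss term per head, so every hidden layer receives a gradient signal directly from its own output, rather than only through the layers above it.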