Abstract: The main advantage of distant supervision for relation extraction is that labeled data are generated automatically by aligning knowledge bases with natural language text. This simple alignment mechanism frees people from heavy labeling work, but it inevitably produces incorrectly labeled data, which hinders the construction of high-quality relation extraction models. To handle noisy labels in distant supervision relation extraction, this study assumes that the final label of a sentence is based on noisy observations generated by some unknown factors. Based on this assumption, a new relation extraction model is constructed, consisting of an encoder layer, an attention layer based on the noise distribution, a real label output layer, and a noisy observation layer. In the training phase, the transition probabilities from real labels to noisy labels are learned from the automatically labeled data; in the testing phase, the real label is obtained from the real label output layer. The study combines the noisy observation model with a deep neural network. Within the deep neural network framework, it focuses on an attention mechanism based on the noise distribution and on denoising unbalanced samples, aiming to further improve the performance of distant supervision relation extraction based on noisy observation. To examine its performance, the proposed method is applied to a public dataset, and the distant supervision relation extraction model is evaluated under different distribution families. The experimental results show that the proposed method achieves higher precision and recall than existing methods.
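The relationship between the real label output layer and the noisy observation layer can be illustrated with a minimal sketch. It assumes the noisy observation layer is a row-stochastic transition matrix applied to the real-label distribution, which is one common way to model label noise; the abstract does not state the exact parameterization, and the encoder and attention layers are omitted. All names (`noisy_observation`, `noisy_nll`, `transition_logits`) and the toy inputs are hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def noisy_observation(real_logits, transition_logits):
    """Map real-label probabilities to noisy-label probabilities.

    real_logits:       (batch, K) scores from the real label output layer.
    transition_logits: (K, K) parameters; after a row-wise softmax, row i
                       approximates p(noisy = j | real = i).
                       (Assumed parameterization, not taken from the paper.)
    """
    p_real = softmax(real_logits)              # (batch, K)
    transition = softmax(transition_logits)    # each row sums to 1
    p_noisy = p_real @ transition              # marginalize over the real label
    return p_real, p_noisy

def noisy_nll(p_noisy, noisy_labels):
    """Negative log-likelihood of the automatically generated (noisy) labels."""
    picked = p_noisy[np.arange(len(noisy_labels)), noisy_labels]
    return float(-np.log(picked + 1e-12).mean())

# Toy example with 4 relation classes and 3 sentences (random stand-in scores).
rng = np.random.default_rng(0)
num_relations = 4
real_logits = rng.normal(size=(3, num_relations))
# Start near the identity: most labels are assumed correct before training.
transition_logits = 5.0 * np.eye(num_relations)

p_real, p_noisy = noisy_observation(real_logits, transition_logits)
loss = noisy_nll(p_noisy, np.array([0, 2, 1]))  # training: fit the noisy labels
predicted = p_real.argmax(axis=1)               # testing: read the real label layer
```

In this sketch the loss is computed only on the noisy output, so the transition matrix can absorb systematic labeling errors during training, while prediction bypasses it and reads the real-label distribution directly.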