Abstract: Automatic semantic annotation, which assigns semantic labels to images automatically, has received much research interest. Although it has been studied for years, image annotation is still far from practical. The effectiveness of traditional image annotation techniques relies heavily on the availability of a sufficiently large set of correct, complete and balanced labeled samples, which typically come from users through an interactive manual process. However, in real-world environments, image labels are often incomplete, noisy and imbalanced. This paper investigates the usefulness of weakly labeled information and proposes an image annotation method for weakly labeled datasets. First, missing labels are filled in automatically by a transductive method that incorporates label correlation and semantic sparsity, along with the consistency between visual and semantic similarity. Then an approximately semantically balanced neighborhood is constructed. A large margin nearest neighbor distance metric learning method extended to multiple labels is proposed, so that the neighbors retrieved under this metric lie in the same semantic subspace. A locally semantically consistent neighborhood is then obtained by local nonnegative sparse coding. Meanwhile, an iterative denoising method for label inference is proposed that simultaneously handles label noise and annotates images under the guidance of semantic nearest neighbors and contextual information. Experimental results demonstrate the effectiveness of the proposed method.
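The neighborhood step can be illustrated with a minimal sketch. The abstract does not give the objective function, so the code below assumes a generic nonnegative sparse coding formulation: a query image is reconstructed from its k nearest training images using nonnegative, L1-penalized weights, and the weights are then used to vote on labels. The function names, the parameters `lam` and `k`, and the plain Euclidean distance (standing in for the learned metric) are all illustrative assumptions, not the authors' method.

```python
import numpy as np

def nonneg_sparse_code(x, D, lam=0.1, n_iter=500):
    """Approximately minimise 0.5*||x - D w||^2 + lam*sum(w) subject to w >= 0.

    Uses projected gradient descent; this solver is an assumption,
    chosen only because it is short and self-contained.
    """
    # Step size 1/L, where L bounds the Lipschitz constant of the gradient.
    L = np.linalg.norm(D, 2) ** 2 + 1e-12
    w = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ w - x) + lam
        # Gradient step followed by projection onto the nonnegative orthant.
        w = np.maximum(0.0, w - grad / L)
    return w

def annotate(x, X_train, Y_train, k=5, lam=0.1):
    """Score labels for query x from its k nearest training samples.

    X_train: (n, d) feature matrix; Y_train: (n, c) binary label matrix.
    Euclidean distance is a placeholder for a learned metric.
    """
    dists = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(dists)[:k]
    D = X_train[idx].T               # columns are neighbour features
    w = nonneg_sparse_code(x, D, lam)
    total = w.sum()
    if total > 0:
        w = w / total                # normalise weights to sum to one
    return w @ Y_train[idx], w       # per-label scores and neighbour weights
```

Calling `annotate(x, X_train, Y_train, k=3)` returns one score per label together with the neighbor weights; neighbors that do not help reconstruct the query tend to receive weight zero, which is the sense in which the retained neighborhood is sparse and local.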