Abstract: A key issue in semantic-based image retrieval is how to bridge the semantic gap between the low-level features of an image and its high-level semantics, which can be expressed effectively in free text. This paper studies the cross-modal relationship between text and images by modeling the semantic correlation between them. Based on this model, an approach to image retrieval is proposed in which images are retrieved according to the meaning of the query text rather than its keywords. First, an algorithm for solving sparse canonical correlation analysis (CCA) is designed. Then a semantic space is learned from a text corpus by latent semantic analysis, and images are represented as bags of visual words. From these two representations, a semantic correlation space is constructed that makes the mapping between the visual words of an image and its high-level semantics explicit. CCA is solved in a sparse framework to make the result more interpretable and stable. Experimental results demonstrate that sparse CCA outperforms CCA in this context and substantiate the feasibility of the proposed approach to image retrieval.
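To make the idea of a sparse correlation between visual words and text semantics concrete, the following is a minimal sketch of one generic way to obtain a single pair of sparse canonical directions. It is not the algorithm designed in the paper, which the abstract does not specify. It assumes that within-set covariances can be treated as identity and alternates soft-thresholded power iterations on the cross-covariance matrix. The function names (`soft_threshold`, `unit`, `sparse_cca_rank1`) and the parameters `lam_u` and `lam_v` are hypothetical and chosen for illustration only.

```python
import numpy as np


def soft_threshold(a, lam):
    """Shrink each entry of a toward zero by lam; entries below lam become 0."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)


def unit(v):
    """Scale v to unit Euclidean norm; an all-zero vector is returned unchanged."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v


def sparse_cca_rank1(X, Y, lam_u=0.3, lam_v=0.3, n_iter=50):
    """Return one pair of sparse weight vectors (u, v).

    X: (n, p) image features, e.g. bag-of-visual-words histograms.
    Y: (n, q) text features, e.g. coordinates in an LSA semantic space.
    lam_u, lam_v: relative thresholds in [0, 1); larger values give sparser
    vectors. These are illustrative knobs, not parameters from the paper.
    Assumption: within-set covariances are approximated by the identity.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    C = X.T @ Y  # (p, q) cross-covariance, up to a 1/n factor

    # Initialise from the leading singular vector pair of C.
    U, _, Vt = np.linalg.svd(C, full_matrices=False)
    u, v = U[:, 0], Vt[0]

    for _ in range(n_iter):
        a = C @ v
        u = unit(soft_threshold(a, lam_u * np.abs(a).max()))
        b = C.T @ u
        v = unit(soft_threshold(b, lam_v * np.abs(b).max()))
    return u, v
```

In this sketch the nonzero entries of `u` pick out the visual words that co-vary with the text dimensions selected by `v`, which is the sense in which a sparse solution is easier to interpret than a dense one. Additional direction pairs would require a deflation step that is omitted here.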