Abstract: Information retrieval-based bug localization is an active research topic in software fault localization. Given a bug report, it first analyzes the contents of the bug report and of the program modules, then calculates the similarity between the report and each module, and finally recommends the most similar program modules to developers. This paper presents a systematic survey of recent research by domestic and international researchers. First, a research framework is proposed, and three key factors that may influence the performance of bug localization methods are identified: data sources, retrieval model, and application scenario. Next, existing work on each of these three factors is discussed in turn. Then, the performance evaluation measures and datasets commonly used in information retrieval-based bug localization are summarized. Finally, conclusions are drawn and directions for future work in this research area are discussed.
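To make the three-step workflow above concrete (analyze contents, calculate similarity, recommend the most similar modules), the following is a minimal sketch, not the method of any surveyed paper. It assumes a TF-IDF vector space model with cosine similarity as the retrieval model, which is only one of many possible choices. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_modules`), the module names, and the sample texts are all hypothetical and chosen purely for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def tfidf_vectors(docs):
    """Build one TF-IDF vector (dict: term -> weight) per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumption of this sketch) so no weight is zero or negative.
    idf = {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}
    return [{term: tf * idf[term] for term, tf in Counter(doc).items()} for doc in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_modules(bug_report, modules):
    """Return (module name, score) pairs, most similar to the bug report first."""
    names = list(modules)
    # Step 1: analyze the contents of the bug report and the program modules.
    docs = [tokenize(bug_report)] + [tokenize(modules[name]) for name in names]
    vectors = tfidf_vectors(docs)
    query = vectors[0]
    # Step 2: calculate the similarity between the report and each module.
    scored = [(name, cosine(query, vec)) for name, vec in zip(names, vectors[1:])]
    # Step 3: recommend the most similar modules first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical module contents, reduced to a few identifier-like words.
    modules = {
        "LoginController.java": "class LoginController validate password user session login",
        "ReportExporter.java": "class ReportExporter export pdf table report",
        "CacheManager.java": "class CacheManager evict cache entry timeout",
    }
    bug = "Login fails with null pointer when password is empty"
    for name, score in rank_modules(bug, modules):
        print(f"{score:.3f}  {name}")
```

In this toy example the bug report shares the terms "login" and "password" only with the first module, so that module is ranked first. Real systems typically add preprocessing such as identifier splitting, stop-word removal, and stemming, and may draw on further data sources beyond source text.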