Abstract:Bug localization is one of the most active domains in software engineering. Most of the bugs are submitted to bug tracker systems, e.g., Bugzilla and Jira. Because of the large number of the submitted bug reports, it is difficult for developers to resolve these defects in time. Therefore, an automatic tool to help developers to identify bug related files is needed. Many bug localization technologies have been proposed by researchers. Taking advantages of the text nature of bug report, information retrieval technologies are adopted to solve bug localization problems. Due to the low computing cost and the applicability to various programming languages, information retrieval-based bug localization (IRBL) technologies become hot spots in bug localization and acquire a series of achievements. However, challenges still exist in data preprocessing, similarity calculation, and engineering application. Therefore, current IRBL technologies are summarized. The contributions of this study are: (1) the data preprocess methods and general information retrieval algorithms are summarized; (2) the feature categories are concluded and classified; (3) the performance measures are concluded; (4) the current problems in IRBL technologies are highlighted; and (5) the trends of IRBL technologies are outlooked.