Abstract: Developers typically choose an open source license to set the conditions under which their software may be used, in order to protect intellectual property rights and sustain the software's long-term development. However, because the open source community offers a wide variety of licenses, developers often find it difficult to understand the differences among them. Existing license selection tools require developers to understand the license terms and to identify their own business needs, which makes the right choice even harder. Although open source licenses have been studied extensively, there has been no systematic analysis of the actual difficulties developers face when choosing a license, so these difficulties are not clearly understood. This study therefore investigates the difficulties open source developers face in choosing licenses, analyzes the components of open source licenses and the factors that influence license selection, and provides a reference for developers choosing a license. The study surveys, by questionnaire, a random sample of 200 developers who participated in open source projects on GitHub. A thematic synthesis of the 53 responses shows that developers' difficulties in selecting a license stem mainly from the complexity of license terms and from uncertainty about which considerations to take into account. By analyzing the ten open source licenses most widely used across 3 346 168 repositories on GitHub, the study establishes a framework of open source licenses comprising 10 dimensions. Drawing on the Theory of Planned Behavior, it proposes nine factors that affect license selection, grouped under three aspects: behavioral attitude, subjective norm, and perceived behavioral control. The relevance of these factors is verified through the developer survey.
Furthermore, the relationship between project characteristics and license selection is verified by fitting an ordinal regression model. The results can deepen developers’ understanding of the contents of open source licenses, support developers in selecting licenses appropriate to their own needs, and serve as a reference for building license selection tools based on developers’ needs.