Abstract: In software development, a code file is often developed and maintained by more than one developer, and each developer contributes a different amount of code to the file, which forms a unique contribution composition. Whether the contribution composition of a code file is reasonable directly affects task allocation, which in turn affects software quality and development efficiency. For different types of code files, how to measure and characterize their contribution composition is therefore an urgent problem. Owing to the maturity of the tools that support collaborative development, developers' activities can be recorded effectively, and the large amount of data they generate lays the foundation for data-driven intelligent software development. First, based on code ownership, this paper establishes a set of metrics that describe the contribution composition of code files along three dimensions: concentration, complexity, and stability. Second, taking Nova (one of OpenStack's core projects) as a case study, the metrics are applied to its version control data to measure and summarize the contribution composition of 12 common file types, resulting in 3 contribution composition patterns. Finally, the validity of the metrics and the rationality of the contribution composition patterns are verified through a combination of mail-in and in-person interviews, and some instructive suggestions for the software development process are presented.