Abstract: The crowd-based software production model in the global open source software ecosystem is rapidly becoming a new paradigm for improving software productivity, and it has a major impact on many stages of software development and application. Crowd-based software production generates large amounts of software data, continuously expands the scope of collaboration, and greatly simplifies project management. These features of globalization pose many challenges for crowd-based software production in software reuse, collaborative development, and knowledge management, and these challenges urgently require new theories and supporting tools. This paper first classifies the distribution, basic process, and data forms of crowd-based software production activities. It then analyzes studies that apply data mining techniques to software communities from three core aspects: software reuse, collaborative development, and knowledge management. Finally, the paper summarizes the open problems and future trends of research in this field.