Topic Analysis on Chinese Programming Question and Answer Websites
Fund Project: National Key Research and Development Program of China (2018YFB1004202); National Natural Science Foundation of China (61672078)

    Abstract:

    A programming question and answer website is an online platform where software developers exchange technical knowledge by posting and answering questions. With the development of the Internet and the growth in the number of software developers, these websites have accumulated an extensive body of discussion on software engineering. Analyzing this content can help developers follow technology trends, and it can help website administrators improve the forum for a better user experience. In recent years, researchers have applied topic analysis to English question and answer websites, yet there are few similar studies of Chinese programming question and answer websites. This study applies latent Dirichlet allocation (LDA) to automatically cluster the main topics of 92,383 questions on OSCHINA. Several analyses are then applied to these topics, including trend analysis, difficulty analysis, and keyword analysis. The main findings are as follows. (1) The topics drawn from user discussions fall into 6 categories: front-end development, back-end development, databases, operating systems, general techniques, and others. Of these categories, front-end development contains the most question posts. (2) Trend analysis shows that, within back-end development, developers are paying more attention to newer and more advanced topics (distributed systems, system design, and Web interfaces) than to basic topics (project deployment, server configuration). (3) Data presentation is the most difficult topic: it has the highest ratio of questions that are never answered, while its popularity is above average. (4) Trends of specific techniques within a single topic are also analyzed. For instance, in the technique learning topic, Java is clearly more popular than Python.
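The difficulty and trend measures mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact metric definitions, so the code below makes assumptions: each question already carries a topic label (for example, the dominant topic assigned by an LDA model), popularity is the topic's share of all questions, difficulty is the share of a topic's questions with zero answers, and the trend is the topic's share of questions per year. The records and the function names (`topic_stats`, `hardest_popular_topic`, `yearly_share`) are hypothetical and are not data or code from the study.

```python
from collections import defaultdict

# Hypothetical records: (topic label, year posted, number of answers).
# The values are made up for illustration; they are not data from the study.
QUESTIONS = [
    ("data presentation", 2016, 0),
    ("data presentation", 2017, 0),
    ("data presentation", 2017, 2),
    ("data presentation", 2018, 0),
    ("server configuration", 2016, 1),
    ("server configuration", 2016, 3),
    ("server configuration", 2017, 0),
    ("distributed systems", 2017, 1),
    ("distributed systems", 2018, 2),
    ("distributed systems", 2018, 0),
]


def topic_stats(questions):
    """Return {topic: (popularity, unanswered_ratio)} (assumed definitions)."""
    total = defaultdict(int)
    unanswered = defaultdict(int)
    for topic, _year, answers in questions:
        total[topic] += 1
        if answers == 0:
            unanswered[topic] += 1
    n = len(questions)
    return {t: (total[t] / n, unanswered[t] / total[t]) for t in total}


def hardest_popular_topic(questions):
    """Topic with the highest unanswered ratio among topics whose
    popularity is above the mean; None if no topic qualifies."""
    stats = topic_stats(questions)
    mean_pop = sum(p for p, _ in stats.values()) / len(stats)
    candidates = {t: r for t, (p, r) in stats.items() if p > mean_pop}
    return max(candidates, key=candidates.get, default=None)


def yearly_share(questions, topic):
    """Share of each year's questions that belong to `topic`."""
    per_year = defaultdict(int)
    in_topic = defaultdict(int)
    for t, year, _answers in questions:
        per_year[year] += 1
        if t == topic:
            in_topic[year] += 1
    return {y: in_topic[y] / per_year[y] for y in sorted(per_year)}


if __name__ == "__main__":
    print(topic_stats(QUESTIONS))
    print(hardest_popular_topic(QUESTIONS))
    print(yearly_share(QUESTIONS, "distributed systems"))
```

Restricting the search to topics with above-average popularity mirrors the abstract's wording for finding (3); a real analysis would likely also need to handle questions assigned to several topics, which this sketch ignores.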

Get Citation

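Jiang Jing, Lü Jiangfeng, Zhang Li. Topic Analysis on Chinese Programming Question and Answer Websites. Journal of Software (软件学报), 2020, 31(4): 1143-1161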

History
  • Received: July 21, 2019
  • Revised: October 9, 2019
  • Online: April 16, 2020
  • Published: April 6, 2020