Semantic-aware and Fine-grained App Review Bug Mining Approach

doi:10.13328/j.cnki.jos.006697


Ruan Jian Xue Bao/Journal of Software, 2023, 34(4): 1613-1629


Authors:
WANG Ya-Wen
Laboratory for Internet Software Technologies, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China
WANG Jun-Jie
Laboratory for Internet Software Technologies, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China
SHI Lin
Laboratory for Internet Software Technologies, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China
WANG Qing
Laboratory for Internet Software Technologies, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; State Key Laboratory of Computer Sciences (Institute of Software, Chinese Academy of Sciences), Beijing 100190, China

About the authors: WANG Ya-Wen (1993-), male, Ph.D. candidate; his research interests include intelligent requirements engineering, software engineering, and natural language processing. WANG Jun-Jie (1987-), female, Ph.D., associate research professor; her research interests include intelligent software engineering, software engineering big data, empirical software engineering, software quality, and crowdsourced software testing. SHI Lin (1985-), female, Ph.D., associate research professor, CCF senior member; her research interests include intelligent requirements engineering, software engineering, empirical software engineering, software evolution, and software quality. WANG Qing (1964-), female, Ph.D., research professor, doctoral supervisor, CCF senior member; her research interests include process-centered software quality management, modeling, knowledge management, and collaborative software work technologies.
CLC number: TP311
Funding: National Key Research and Development Program of China (2018YFB1403400); National Natural Science Foundation of China (62072442)


Abstract:

App reviews submitted by mobile users give developers a communication channel for gauging user satisfaction. Users often describe buggy features (i.e., user actions) and abnormal App behaviors (i.e., abnormal behaviors) with key phrases such as "send a video" and "crash", which may be buried among other trivial information (e.g., complaints) in the review text. A fine-grained view of this information helps developers understand feature requests and bug reports from users and thereby improve App quality. Existing pattern-based approaches to target phrase extraction can only summarize the high-level topics/aspects of reviews, and they perform poorly because they lack sufficient semantic understanding of reviews. This study proposes a semantic-aware and fine-grained App review bug mining approach (Arab) that extracts user actions and abnormal behaviors and mines the correlations between them. A novel neural network model is designed to extract fine-grained target phrases; it combines textual descriptions with review attributes to better represent the semantics of reviews. Arab also clusters the extracted phrases according to their semantic relations and visualizes the correlations between user actions and abnormal behaviors. An evaluation on 3,426 reviews from six Apps confirms the effectiveness of Arab in phrase extraction. A case study applying Arab to 301,415 reviews of 15 popular Apps further explores its potential applications and examines its usefulness on large-scale data.

Key words: App review; information extraction; deep learning
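To make the idea of pairing user actions with abnormal behaviors concrete, the following is a minimal sketch and not the Arab implementation. It assumes that phrase extraction is framed as token-level tagging with a BIO scheme and two hypothetical labels, `UA` (user action) and `AB` (abnormal behavior); the abstract does not specify the tagging scheme, and the tags below are hand-written stand-ins for model output. The neural model and the semantic clustering step are not reproduced; the sketch only decodes tagged phrases and counts how often each user action appears together with each abnormal behavior in the same review.

```python
from collections import Counter
from itertools import product


def decode_phrases(tokens, tags):
    """Collect (label, phrase) spans from BIO tags (assumed labels: UA, AB)."""
    phrases, current, label = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new phrase begins; flush the one in progress, if any.
            if current:
                phrases.append((label, " ".join(current)))
            current, label = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == label:
            # Continue the phrase in progress.
            current.append(token)
        else:
            # "O" tag or an inconsistent "I-" tag ends the phrase in progress.
            if current:
                phrases.append((label, " ".join(current)))
            current, label = [], None
    if current:
        phrases.append((label, " ".join(current)))
    return phrases


def count_correlations(tagged_reviews):
    """Count co-occurrences of user actions and abnormal behaviors per review."""
    counts = Counter()
    for tokens, tags in tagged_reviews:
        phrases = decode_phrases(tokens, tags)
        actions = {p for lab, p in phrases if lab == "UA"}
        behaviors = {p for lab, p in phrases if lab == "AB"}
        counts.update(product(actions, behaviors))
    return counts


# Hypothetical reviews with hand-written tags standing in for model predictions.
reviews = [
    ("the app will crash when i send a video".split(),
     ["O", "O", "O", "B-AB", "O", "O", "B-UA", "I-UA", "I-UA"]),
    ("it keeps freezing every time i send a video".split(),
     ["O", "O", "B-AB", "O", "O", "O", "B-UA", "I-UA", "I-UA"]),
]
correlations = count_correlations(reviews)
```

A count table of this kind is one simple input for a correlation visualization; in practice the phrases would first be grouped by semantic similarity, as the abstract describes, so that variants of the same action or behavior are counted together.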


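Cite this article

WANG Ya-Wen, WANG Jun-Jie, SHI Lin, WANG Qing. Semantic-aware and Fine-grained App Review Bug Mining Approach. Ruan Jian Xue Bao/Journal of Software, 2023, 34(4): 1613-1629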


History

Received: 2022-01-19
Revised: 2022-03-04
Published online: 2022-07-22
Published: 2023-04-06
