TWE-NMF Topic Model-based Approach for Mashup Service Clustering

doi:10.13328/j.cnki.jos.006508


Journal of Software, 2023, Vol. 34, No. 6, pp. 2727-2748

Authors:
LU Jia-Wei, School of Mechanical and Electrical Engineering, China Jiliang University, Hangzhou 310018, China
ZHAO Wei, College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China
ZHANG Yuan-Ming, College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China
LIANG Qian-Hui, School of Computer Science and Engineering, Nanyang Technological University, Singapore 637457, Singapore
XIAO Gang, School of Mechanical and Electrical Engineering, China Jiliang University, Hangzhou 310018, China

Author biographies: LU Jia-Wei (1981-), male, associate professor, CCF professional member; research interests: service computing, software architecture, big data visualization. ZHAO Wei (1996-), male, M.S.; research interests: service computing, data mining. ZHANG Yuan-Ming (1977-), male, Ph.D., associate professor, CCF professional member; research interests: service computing, big data analysis, parallel computing. LIANG Qian-Hui (1977-), female, Ph.D., lecturer; research interests: data science, artificial intelligence. XIAO Gang (1965-), male, Ph.D., professor, doctoral supervisor, CCF senior member; research interests: intelligent manufacturing, cloud manufacturing.
Corresponding author: XIAO Gang, xg@zjut.edu.cn
CLC number: TP311
Fund Project: National Natural Science Foundation of China (61976193); Zhejiang Provincial Natural Science Foundation (LY19F020034); Key Research and Development Program of Zhejiang Province (2021C03136)


Abstract:

With the development of the Internet and service-oriented technology, Mashup services, a new type of Web application, have become popular on the Internet and are growing rapidly. Finding high-quality services among the large number of Mashup services has therefore become a widely studied problem. Finding services with similar functions and clustering them can effectively improve the accuracy and efficiency of service discovery. Current mainstream methods mine the functional information hidden in Mashup services and then apply a specific clustering algorithm such as K-means. However, Mashup service documents are usually short texts, which traditional mining algorithms such as LDA cannot handle effectively, so the clustering results are unsatisfactory. To address this problem, this study proposes TWE-NMF (non-negative matrix factorization combining tags and word embedding), a model based on non-negative matrix factorization, to perform topic modeling on Mashup services. The method first normalizes the Mashup services, then uses a Dirichlet process multinomial mixture model based on improved Gibbs sampling to automatically estimate the number of topics. Next, it combines word embeddings and service tag information with non-negative matrix factorization to compute the topic features of the Mashup services, and clusters the services with a spectral clustering algorithm. Finally, the performance of the method is comprehensively evaluated. The experimental results show that, compared with existing service clustering methods, the proposed method achieves significant improvements in precision, recall, F-measure, purity, and entropy.
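The factorization step at the core of the method can be illustrated with a minimal sketch. Note the assumptions: this is plain NMF with the standard multiplicative update rules, not the TWE-NMF model itself. It leaves out the tag and word-embedding terms, fixes the number of topics by hand instead of estimating it with a Dirichlet process mixture model, and assigns each document to its dominant topic rather than running spectral clustering. The function names and the toy term-document matrix `V` are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Plain NMF (V ~ W H) via multiplicative updates; W and H stay non-negative."""
    rng = random.Random(seed)
    n_terms, n_docs = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n_terms)]
    h = [[rng.random() + 0.1 for _ in range(n_docs)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(n_docs)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n_terms)]
    return w, h


# Toy term-document matrix (rows: terms, columns: service documents).
# Hypothetical data: documents 0-1 use "map"/"route", documents 2-3 use "photo"/"video".
V = [
    [2, 1, 0, 0],  # map
    [1, 2, 0, 0],  # route
    [0, 0, 2, 1],  # photo
    [0, 0, 1, 2],  # video
]
W, H = nmf(V, k=2)

# Simplified stand-in for the clustering step: pick each document's strongest topic.
doc_topics = [max(range(2), key=lambda a: H[a][j]) for j in range(4)]
print(doc_topics)
```

In this sketch the columns of `H` play the role of per-service topic features. In the method described above, such features would be passed to a spectral clustering algorithm rather than thresholded directly.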

Key words: Mashup service; non-negative matrix factorization (NMF); topic model; word embedding; service clustering
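Two of the evaluation indicators named in the abstract, purity and entropy, can be sketched as follows. This page does not give the paper's exact formulas, so the code assumes the common definitions: purity is the fraction of items that belong to the majority class of their cluster, and entropy is the size-weighted average of each cluster's Shannon entropy (base 2, not normalized). The function name and the sample labels are hypothetical.

```python
import math
from collections import Counter, defaultdict


def purity_and_entropy(true_labels, cluster_ids):
    """Return (purity, entropy) of a clustering against reference labels."""
    members = defaultdict(list)
    for label, cid in zip(true_labels, cluster_ids):
        members[cid].append(label)

    n = len(true_labels)
    majority_total = 0
    entropy = 0.0
    for labels in members.values():
        counts = Counter(labels)
        # Purity counts the items that match the cluster's most frequent class.
        majority_total += max(counts.values())
        # Shannon entropy (base 2) of the class mix inside this cluster.
        cluster_entropy = -sum(
            (c / len(labels)) * math.log2(c / len(labels)) for c in counts.values()
        )
        # Weight each cluster by its share of all items.
        entropy += (len(labels) / n) * cluster_entropy
    return majority_total / n, entropy


# Hypothetical example: six services, two reference categories, two clusters.
categories = ["Mapping", "Mapping", "Photos", "Photos", "Photos", "Mapping"]
clusters = [0, 0, 1, 1, 1, 1]
purity, entropy = purity_and_entropy(categories, clusters)
print(purity, entropy)  # purity = 5/6, entropy is about 0.541
```

Higher purity and lower entropy both indicate clusters that align more closely with the reference categories.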

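Cite this article:

LU Jia-Wei, ZHAO Wei, ZHANG Yuan-Ming, LIANG Qian-Hui, XIAO Gang. TWE-NMF Topic Model-based Approach for Mashup Service Clustering. Journal of Software (软件学报), 2023, 34(6): 2727-2748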

History

Received: 2020-11-02
Last revised: 2021-01-29
Published online: 2022-12-08
Published: 2023-06-06
