基于自适应剪枝的满足本地差分隐私的真值发现算法
CSTR:
作者:
作者单位:

作者简介:

通讯作者:

中图分类号:

TP309

基金项目:

安徽理工大学高层次引进人才科研启动基金(2023yjrc92); 国家自然科学基金面上项目(62372051); 国家自然科学基金青年项目(62202164); 安徽省科技重大专项(18030901025); 高校学科(专业)拔尖人才学术资助项目(gxbjZD2021050)


Locally Differentially Private Truth Discovery Algorithm via Adaptive Pruning
Author:
Affiliation:

Fund Project:

  • 摘要
  • |
  • 图/表
  • |
  • 访问统计
  • |
  • 参考文献
  • |
  • 相似文献
  • |
  • 引证文献
  • |
  • 资源附件
  • |
  • 文章评论
    摘要:

    为了对移动群智感知中工人上传的不同质量的感知数据做必要的聚合处理, 真值发现技术应运而生, 其是为后续应用提供精确数据支持的基础. 为了应对可能的隐私泄露问题, 现有研究往往结合本地差分隐私技术来进行保护, 然而这些研究往往忽略了感知数据中的异常值对本地差分隐私下真值发现精度的影响. 这些异常值往往具有极大的取值范围, 导致注入数据中的噪音量较大. 而且在现实世界中, 工人出于对隐私泄露的担心, 移动群智感知服务器无法在无隐私保护的情况下预先处理数据. 为解决以上问题, 提出基于自适应剪枝的满足本地差分隐私的真值发现算法NATURE. 该算法的核心思想是考虑数据中蕴含的噪音类型来自适应剪枝掉不需要的工人的所有值或者某些任务值. 在NATURE中, 为便于剪枝, 在形式化约束优化问题的基础上, 设计基于优化问题的噪音感知的权重和重要性估计方法; 为进行剪枝, 在证明最优剪枝问题是NP-hard的基础上, 设计具有多项式时间复杂度的效用感知的自适应剪枝方法. 进一步从理论上分析NATURE的隐私、效用和复杂度. 在两个真实数据集和一个合成数据集上的实验结果表明, 相较于对比算法, NATURE在求得噪音“真值”的精度上至少提高20%.

    Abstract:

    To conduct necessary aggregation on varying-quality sensed data uploaded by workers in mobile crowdsensing, truth discovery technology has emerged as the cornerstone for providing precise data support for subsequent applications. Existing studies tend to adopt local differential privacy for protection against potential privacy breaches, but often ignore the influence of outliers in the sensed data on the truth discovery accuracy under local differential privacy. These outliers often have a large range of values, resulting in a large amount of noise in the injected data. Additionally, due to workers’ concerns about privacy breaches, mobile crowdsensing servers cannot preprocess data without privacy protection. To this end, this study proposes NATURE, which meets local differential privacy based on adaptive pruning. The core idea of the algorithm is to consider the noise types in the data to adaptively prune all unnecessary workers’ values or certain task values. In NATURE, the noise-aware weight and importance estimation (NWIE) method based on a formalized constraint optimization problem is designed to facilitate data pruning. Based on proving the optimal pruning problem is NP-hard, this study designs the utility-aware adaptive pruning (UAP) method with polynomial time complexity to conduct pruning. Furthermore, a theoretical analysis of NATURE’s privacy, utility, and complexity is carried out. Experimental results on two real-world datasets and one synthetic dataset demonstrate that NATURE achieves an accuracy improvement of at least 20% in obtaining “truth” compared to its comparative algorithms.

    参考文献
    相似文献
    引证文献
引用本文

张朋飞,朱伊波,程祥,张治坤,刘西蒙,孙笠,方贤进,张吉.基于自适应剪枝的满足本地差分隐私的真值发现算法.软件学报,,():1-24

复制
分享
文章指标
  • 点击次数:
  • 下载次数:
  • HTML阅读次数:
  • 引用次数:
历史
  • 收稿日期:2023-11-29
  • 最后修改日期:2024-04-11
  • 录用日期:
  • 在线发布日期: 2025-01-16
  • 出版日期:
文章二维码
您是第位访问者
版权所有:中国科学院软件研究所 京ICP备05046678号-3
地址:北京市海淀区中关村南四街4号,邮政编码:100190
电话:010-62562563 传真:010-62562533 Email:jos@iscas.ac.cn
技术支持:北京勤云科技发展有限公司

京公网安备 11040202500063号