Abstract:Privacy-Preserving data publishing has attracted considerable research interest over the past few years. The principle difference of clustering and obfuscating burdens the trade-off between clustering utility maintaining and privacy protection. Most of existing methods such as adopting strategies of distance-preservation, or distribution-preservation, cannot accommodate both clustering utility and privacy security of the data. As a trade-off, a neighborhood-preservation based perturbing algorithm VecREP (vector equivalent replacing based perturbing method) is proposed, which realizes good clustering utility by maintaining the nearest neighborhood for each data point. The definition of a safe neighborhood is introduced to stabilize the composition of the nearest neighborhood. The equivalent replacing arc is generated to realize distribution stability of nearest neighborhood leveraging vector offset and composition. For each data point, VecREP randomly chooses a point on its equivalent replacing arc inside corresponding safe neighborhood to make substitution. The algorithm is compared with existing methods such as RBT, TDR, Camp-crest and NeNDS. Experimental results demonstrate that VecREP competes in performance with RBT on maintaining clustering quality and, outperforms the other. It can avoid a reversible attack effectively and compared to the existing solution, ARMM has a shorter handover delay and a smaller location update and delivery cost.