Abstract: To prevent the disclosure of sensitive information and protect users’ privacy, generalization and suppression techniques are often used to anonymize the quasi-identifiers of data before it is shared. Unlike static datasets, data streams are inherently unbounded and highly dynamic, so anonymizing them poses more complicated problems, and the methods designed for static datasets cannot be applied to data streams directly. Based on an analysis of published anonymization methods for data streams, this paper proposes an anonymization approach for data streams. The approach scans the data only once, and it recognizes and reuses clusters that already satisfy the anonymization requirements in order to speed up the anonymization process. Experimental results on a real dataset show that the proposed method reduces the information loss caused by generalization and suppression while satisfying the anonymization requirements, and that it has low time and space complexity.
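To make the idea of single-pass anonymization with cluster reuse concrete, the following is a minimal illustrative sketch and not the algorithm proposed in this paper. It assumes numeric quasi-identifiers, a k-anonymity requirement, and generalization of each cluster to per-attribute ranges; the names `anonymize_stream`, `covers`, and `generalize` are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[int, ...]            # numeric quasi-identifier values (assumption)
Box = Tuple[Tuple[int, int], ...]   # one (low, high) range per attribute


def covers(box: Box, rec: Record) -> bool:
    """True if every attribute of rec lies inside the box's range."""
    return all(lo <= v <= hi for (lo, hi), v in zip(box, rec))


def generalize(records: List[Record]) -> Box:
    """Generalize a cluster to the smallest range covering each attribute."""
    return tuple((min(col), max(col)) for col in zip(*records))


def anonymize_stream(stream: Iterable[Record], k: int) -> Iterator[Box]:
    """Single pass over the stream; yields one generalized box per published record."""
    reusable: List[Box] = []     # boxes of clusters already published with >= k records
    pending: List[Record] = []   # records waiting for enough companions
    for rec in stream:
        # Reuse: a record covered by an already published cluster is
        # released at once with that cluster's generalization.
        box = next((b for b in reusable if covers(b, rec)), None)
        if box is not None:
            yield box
            continue
        pending.append(rec)
        if len(pending) >= k:
            # Enough records to satisfy k-anonymity: publish a new cluster.
            box = generalize(pending)
            reusable.append(box)
            for _ in pending:
                yield box
            pending = []
    # Records still pending when the stream ends are suppressed in this sketch.
```

With `k=2` and the stream `[(25, 100), (30, 120), (27, 110), (40, 300)]`, the first two records form a cluster, the third is published by reusing that cluster's ranges, and the fourth is suppressed. A practical scheme would also bound the delay of pending records and the number of retained clusters, which this sketch leaves out.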