TP311
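National Natural Science Foundation of China (62202070, 62322601, 62172066, 62076191); China Postdoctoral Science Foundation General Program (2022M720567); Fundamental Research Funds for the Central Universities (2024IAIS-QN017); Shandong Provincial Major Basic Research Project (ZR2024ZD03); Independent Research Project of the State Key Laboratory of Mechanical Transmission for Advanced Equipment (SKLMT-ZZKT-2024R07)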
Advances in Internet of Things (IoT) technology generate massive volumes of floating-point time series data, which poses great challenges for data storage and transmission. Floating-point time series compression has therefore become crucial. Depending on whether the original data can be exactly recovered, it is classified into lossy and lossless compression. Lossy methods achieve better compression ratios by discarding part of the information and suit applications with lower precision requirements. Lossless methods reduce data size while retaining all information, which is essential for applications that must preserve data integrity and accuracy. In addition, streaming compression algorithms have emerged to meet the real-time monitoring needs of edge devices. Existing surveys on time series compression suffer from incomplete coverage, unclear organization, a single classification criterion, and the omission of recent representative algorithms. This survey divides the time series compression algorithms proposed over the years into lossy and lossless categories, and further distinguishes them by algorithmic framework, including data representation-based, prediction-based, machine learning-based, and transform-based approaches. It also summarizes the characteristics of streaming and batch compression. It then analyzes the design ideas of the various compression algorithms in depth and presents diagrams of their development. Next, it compares the strengths and weaknesses of the algorithm categories through experiments. Finally, it summarizes common application scenarios and discusses directions for future research.
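Zhu Minghui, Li Zheng, Li Ruiyuan, Chen Chao, Zheng Yu. Survey on floating-point time series data compression. Journal of Software (软件学报),,():1-31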