SPBSpark: 高查询效能的分布式轨迹索引方法

doi:10.13328/j.cnki.jos.007554

微信小程序

微信服务号

微信订阅号

首页 > 过刊浏览>年第卷第期 >1-22. DOI:10.13328/j.cnki.jos.007554

PDF HTML阅读 XML下载导出引用引用提醒

SPBSpark: 高查询效能的分布式轨迹索引方法
DOI:
                        10.13328/j.cnki.jos.007554
                    
CSTR:
                        
                    
作者:
                        
                        
                    
作者单位:
作者简介:
通讯作者:
中图分类号:TP311
基金项目:国家重点研发计划(2023YFC3341204); 国家自然科学基金(62377014, 62407016)

SPBSpark: Distributed Trajectory Indexing Method with High Query Efficiency

Author:

Affiliation:

Fund Project:

摘要

图/表

访问统计

参考文献

相似文献

引证文献

资源附件

文章评论

摘要:

GPS (global positioning system, 全球定位系统)移动设备与5G (5th generation mobile communication technology, 第5代移动通信技术)互联网技术的普及催生了轨迹数据的飞速增长. 如何对海量轨迹数据进行高效地存储、管理和分析成为当前环境下的研究热点问题. 传统的单节点式轨迹索引受限于内存容量、磁盘I/O速度等问题已经无法胜任海量轨迹数据的管理. Spark作为一种基于内存计算的分布式框架, 在处理海量数据时具备天然的优势. 因此, 提出了基于Spark平台的分布式轨迹数据索引以及相关的查询技术方案. 为了提升分布式集群中单个节点的数据存储能力和轨迹查询效率, 首先提出了一种轨迹编码技术(Z-order trajectory encoding, ZTE), 该技术对轨迹MBR (minimum bounding rectangle, 最小外接矩形)所覆盖的最小相邻子空间进行编码, 可以表达不同粒度的轨迹以及轨迹的运动方向, 用于判断轨迹与查询空间的关系. 基于这一技术, 将轨迹的ZTE编码进一步组织成偏序结构, 设计了基于子空间偏序分支的SPB分支(subspace partial-order branch, SPB)并结合哈希映射表IDMap构建局部索引. 索引能够避免类R树索引中最小限定矩形堆叠形成死空间导致的效率低下问题, 实现快速剪枝. 为了支持海量轨迹数据的高效检索, 基于SPB分支的局部索引设计了分布式的轨迹索引SPBSpark. SPBSpark主要包括数据分区、局部索引和全局索引这3个部分. 该索引能有效支持时空范围查询、k近邻查询、移动对象轨迹查询这3种查询. 最后, 选取了同样基于Spark框架的分布式轨迹索引TrajSpark和LocationSpark作为实验对照对象. 通过仿真实验对比分析, SPBSpark索引的空间利用率在LocationSpark上改善了约15%. 在查询性能上, 相较于TrajSpark和LocationSpark, SPBSpark拥有2–3倍的性能提升.

Abstract:

The popularization of GPS mobile devices and 5G Internet technology has led to the rapid growth of trajectory data. How to efficiently store, manage, and analyze massive trajectory data has become a hot research issue in the current environment. The traditional single-node trajectory index is limited by memory capacity, disk I/O speed, and other factors, and is no longer capable of managing large-scale trajectory data. Spark, as a distributed framework based on in-memory computing, has natural advantages in processing massive data. Therefore, this study proposes a distributed trajectory data indexing and query scheme based on the Spark platform. To improve the data storage capacity of a single node in a distributed cluster and the efficiency of trajectory queries, a trajectory encoding technique, Z-order trajectory encoding (ZTE), is proposed. This technique encodes the minimum adjacent subspaces covered by the trajectory minimum bounding rectangle (MBR), which can represent trajectories of different granularities and their movement directions, and is used to determine the relationship between a trajectory and the query space. Based on this technique, this study further organizes the ZTE codes of trajectories into a partial-order structure and designs a subspace partial-order branch (SPB). Combined with the hash mapping table IDMap, a local index is constructed. This index avoids the inefficiency caused by the dead space formed by the overlapping of minimum bounding rectangles in R-tree-like indexes and enables fast pruning. To support efficient retrieval of massive trajectory data, the study designs a distributed trajectory index named SPBSpark based on the SPB-branch local index. SPBSpark mainly consists of three components: data partition, local index, and global index. The proposed index effectively supports three types of queries: spatiotemporal range query, k-nearest neighbor query, and moving object trajectory query. Finally, the study selects the distributed trajectory indexes TrajSpark and LocationSpark, which are also based on the Spark framework, as comparison systems. Through comparative simulation experiments, the spatial utilization of the SPBSpark index is improved by about 15% compared with LocationSpark. In terms of query performance, SPBSpark achieves a 2–3 times performance improvement compared with TrajSpark and LocationSpark.

参考文献

相似文献

引证文献

引用本文

汤娜,蔡锶淇,李晶晶,罗凯原,汤庸. SPBSpark: 高查询效能的分布式轨迹索引方法.软件学报,,():1-22

复制

文章指标

点击次数:
下载次数:
HTML阅读次数:
引用次数:

历史

收稿日期:2025-05-28
最后修改日期:2025-06-24
录用日期:
在线发布日期: 2026-01-07
出版日期:

微信小程序

微信服务号

微信订阅号

引用本文

相关视频

分享

文章指标

历史

文章二维码