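ZHUANG Yi, ZHUANG Yue-Ting, WU Fei. Answering k-NN query of Chinese calligraphic characters based on data grid. Journal of Software (软件学报), 2006, 17(11): 2289-2301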
Answering k-NN Query of Chinese Calligraphic Characters Based on Data Grid
Received: 2006-06-10 Revised: 2006-08-25
Keywords: Chinese calligraphic character; k-nearest neighbor query; cluster hypersphere; data grid
Funding: Supported by the National Natural Science Foundation of China under Grant No. 60533090; the National Science Fund of China for Distinguished Young Scholars under Grant No. 60525108; and the China-US Million Book Digital Library Project.
Abstract:
This paper proposes a k-nearest neighbor (k-NN) query method for Chinese calligraphic character databases in a data grid environment. When a user at the query node submits a query character and a value of k, the data nodes first filter the characters with a small initial query radius, using an index based on the hybrid distance metric (HDM). The surviving candidate characters are then sent to the executing nodes in packages, where the distance (refinement) computation is carried out in parallel to produce the answer set, which is finally returned to the query node. If fewer than k characters are returned, the query radius is enlarged and the procedure repeats until the k nearest neighbor characters are obtained. Theoretical analysis and experiments show that the method performs well in reducing network communication cost, increasing I/O and CPU parallelism, and lowering response time.
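The query loop described in the abstract (filter with a small radius, ship candidates in packages, refine in parallel, enlarge the radius if fewer than k answers come back) can be illustrated with a minimal single-machine sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: feature vectors are plain tuples, Euclidean distance stands in for the exact character distance, a distance over the first two coordinates stands in for the HDM-based filter, threads stand in for executing nodes, and the names `knn_query`, `growth`, and `package_size` are hypothetical.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from itertools import repeat


def full_distance(a, b):
    """Expensive refinement distance (assumed here: Euclidean)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cheap_lower_bound(a, b, dims=2):
    """Cheap filter distance over the first `dims` coordinates.

    It never exceeds full_distance, so filtering with it cannot
    discard a true answer. It is only a stand-in for the HDM filter.
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a[:dims], b[:dims])))


def filter_on_data_node(node, query, radius):
    """Data node step: keep characters that may lie within `radius`."""
    return [(cid, vec) for cid, vec in node
            if cheap_lower_bound(query, vec) <= radius]


def refine_package(package, query, radius):
    """Executing node step: exact distances for one package."""
    refined = []
    for cid, vec in package:
        d = full_distance(query, vec)
        if d <= radius:
            refined.append((d, cid))
    return refined


def knn_query(data_nodes, query, k, radius=0.5, growth=2.0,
              package_size=4, workers=4):
    """Return the k nearest (distance, id) pairs, closest first."""
    total = sum(len(node) for node in data_nodes)
    k = min(k, total)  # cannot return more characters than exist
    if k <= 0:
        return []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            # 1. Filter on every data node with the current radius.
            candidates = [c for node in data_nodes
                          for c in filter_on_data_node(node, query, radius)]
            # 2. Group candidates into packages for transfer.
            packages = [candidates[i:i + package_size]
                        for i in range(0, len(candidates), package_size)]
            # 3. Refine the packages in parallel.
            answers = []
            for part in pool.map(refine_package, packages,
                                 repeat(query), repeat(radius)):
                answers.extend(part)
            # 4. Enough answers: return the k closest; else enlarge radius.
            if len(answers) >= k:
                return sorted(answers)[:k]
            radius *= growth
```

Because every character within the current radius survives both steps, the k closest of at least k answers are the true k nearest neighbors; the multiplicative radius growth is just one simple way to guarantee that the loop terminates.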