广东工业大学学报 ›› 2014, Vol. 31 ›› Issue (3): 119-123.doi: 10.3969/j.issn.1007-7162.2014.03.021

• 综合研究 • 上一篇    下一篇

基于改进Kd-Tree构建算法的k近邻查询

陈晓康,刘竹松   

  1. 广东工业大学 计算机学院,广东 广州 510006
  • 收稿日期:2014-04-16 出版日期:2014-09-30 发布日期:2014-09-30
  • 作者简介:陈晓康(1990-),女,硕士研究生,主要研究方向为云计算、大数据、物联网和移动互联网.
  • 基金资助:

    国家自然科学基金资助项目(61272013);广东省现代信息服务业发展专项资金资助项目(GDEID2011IS038);广东省教育部产学研结合项目(2011B090400615,2012B091000073,2011B090400480)

K Nearest Neighbor Query Based on Improved KdTree Construction Algorithm

Chen Xiao-kang,Liu Zhu-song   

  1. School of Computer,Guangdong University of Technology, Guangzhou 510006,China
  • Received:2014-04-16 Online:2014-09-30 Published:2014-09-30

摘要: k近邻查询算法是查询大规模空间数据的常用算法之一,使用Kd-Tree先构建大规模空间数据的索引,然后对搜索空间进行层次划分,再进行k近邻查询,能保证搜索的效率.但是,传统的Kd-Tree构建有两个缺点:使用测试数据点进行k近邻查询每次都需要回溯到根节点,影响了查询的效率;Kd-Tree使用split域对空间进行层次划分,空间划分为立方体(二维数据表现为矩形),多边形空间在相交判断时会出现没必要进行数据距离比较的多余空间,这样会影响查询的效率.针对这两个缺点,本文提出了相应的改进算法——RB算法.实验结果证明,该算法比传统的KD算法拥有更高的查询效率.本文的主要贡献有两点:(1)构建一种快速创建Kd-Tree索引来支持KNN算法进行大规模数据的分类查询操作.(2)改进传统的Kd-Tree索引构建方法,提出新的改进算法RB算法,提高KNN算法查询的效率.

关键词: k近邻查询, Kd树, 空间数据, 多边形空间, 层次划分

Abstract: K nearest neighbor query algorithm is one of the commonly used algorithms in massive spatial data query. First, it construct an index of largescale spatial data by Kd-Tree, and hierarchical division of the search space. Then, it used the k nearest neighbor query to ensure the efficiency of the search. However, the traditional Kd-Tree construction has two drawbacks: the use of test data points are required for each k nearest neighbor query back to the root, thus affecting the efficiency of the query; Kd-Tree uses the splitlevel domain for the space division of space into cubes (twodimensional data are rectangular), extra space appears in polygonal space at the intersection of judgment, making the comparison of data unnecessary,   thus affecting the efficiency of the query. Regarding these two shortcomings, it proposed the corresponding improved algorithmRB algorithm. Experimental results show that the algorithm has a higher query efficiency than the traditional KD algorithm.There are two main contribution from this paper: (1) This paper construct a quickly create Kd-Tree indexes to support queries KNN akgorithm to classify largescale data. (2) RB algorithm is proposed to improve the traditional Kd-Tree index construction method,and improving query efficiency for KNN algorithm.

Key words: k nearest neighbor query, Kd-Tree, spatial data, polygon space, hierarchical division

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!