EFFICIENT INDEX FOR A VERY LARGE DATASETS WITH HIGHER DIMENSION
D. KANNAN and N. MANGALAM. IRJCS: International Research Journal of Computer Science, Volume IV (Issue XII):
01-06 (December 2017)
The main aim of this paper is to develop a new dynamic indexing structure that supports very large datasets with high dimensionality. The new structure, called the NewTree, is tree-based and designed for efficient access, and it is adaptable to a wide range of applications. It answers nearest-neighbor queries without linearly scanning the very large dataset. The NewTree reduces the adverse effect of the curse of dimensionality, that is, the rapid degradation of most existing indexing techniques as dimensionality increases. A major difficulty in this setting is retrieving subsets of the data from a very large storage system. The NewTree handles the insertion of new data efficiently, and the shape of the structure does not change when new data are added. The performance of the new structure is evaluated against the SR-tree, an existing indexing structure. The results show that the new structure outperforms the SR-tree in both time complexity and memory complexity.
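For context, the linear scan that the abstract says the index avoids can be sketched as follows. This is only the brute-force baseline, not the NewTree itself, whose construction the abstract does not describe. The function name `linear_scan_knn`, the sample points, and the choice of Euclidean distance are assumptions made for illustration.

```python
import heapq
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def linear_scan_knn(data: List[Point], query: Point, k: int) -> List[Tuple[float, int]]:
    """Return the k nearest points as (distance, index) pairs, closest first.

    Hypothetical baseline: every point is compared with the query, so one
    query costs O(n * d) distance work for n points in d dimensions.
    """
    # Euclidean distance is an assumption; the abstract names no metric.
    distances = ((math.dist(point, query), index) for index, point in enumerate(data))
    return heapq.nsmallest(k, distances)


# Illustrative data only, not taken from the paper.
points = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (0.2, 0.1, 0.0), (5.0, 5.0, 5.0)]
nearest = linear_scan_knn(points, (0.0, 0.0, 0.1), k=2)
print([index for _, index in nearest])  # prints [0, 2]
```

Because the scan touches all n points on every query, its cost grows with both dataset size and dimensionality, which is the cost a tree-based index aims to reduce by pruning regions that cannot contain a nearest neighbor.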