Popularity-Based Ranking for Fast Approximate kNN Search
Volume 28, Issue 1 (2017), pp. 1–21
Pub. online: 1 January 2017
Type: Research Article
Open Access
Received
1 October 2016
1 October 2016
Accepted
1 February 2017
1 February 2017
Published
1 January 2017
1 January 2017
Abstract
Similarity searching has become widely available in many on-line archives of multimedia data. Users accessing such systems look for data items similar to their specific query object and typically refine results by re-running the search with a query from the results. We study this issue and propose a mechanism of approximate kNN query evaluation that incorporates statistics of accessing index data partitions. Apart from the distance between database objects, it also considers the prior query answers to prioritize index partitions containing frequently retrieved data, so evaluating repetitive similar queries more efficiently. We verify this concept in a number of experiments.
References
Antol, M., Dohnal, V. (2016). Optimizing query performance with inverted cache in metric spaces. In: Proceedings of ADBIS 2016: 20th East-European Conference on Advances in Databases and Information Systems, pp. 1–14. Accepted for publication. For the reviewers, it is available at https://www.fi.muni.cz/ xdohnal/adbis2016.pdf.
Brisaboa, N.R., Cerdeira-Pena, A., Gil-Costa, V., Marin, M., Pedreira, O. (2015). Efficient similarity search by combining indexing and caching strategies. In: Proceedings of SOFSEM 2015: Theory and Practice of Computer Science: 41st International Conference on Current Trends in Theory and Practice of Computer Science. Springer, Berlin, pp. 486–497.
Budikova, P., Batko, M., Zezula, P. (2011). Evaluation platform for content-based image retrieval systems. In: Gradmann, S., Borri, F., Meghini, C., Schuldt, H. (Eds.), Proceedings of International Conference on Theory and Practice of Digital Libraries, Research and Advanced Technology for Digital Libraries (TPDL). Springer, Berlin, pp. 130–142.
Ciaccia, P., Patella, M., Zezula, P. (1997). M-tree: an efficient access method for similarity search in metric spaces. In: Jarke, M., Carey, M.J., Dittrich, K.R., Lochovsky, F.H., Loucopoulos, P., Jeusfeld, M.A. (Eds.), Proceedings of the 23rd International Conference on Very Large Data Bases (VLDB 1997), Athens, Greece, August 25–29, 1997. Morgan Kaufmann, pp. 426–435.
Falchi, F., Lucchese, C., Orlando, S., Perego, R., Rabitti, F. (2009). Caching content-based queries for robust and efficient image retrieval. In: Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology, EDBT ’09. ACM, New York, pp. 780–790.
Novak, D., Zezula, P. (2014). Rank aggregation of candidate sets for efficient similarity search. In: Decker, H., Lhotská, L., Link, S., Spies, M., Wagner, R.R. (Eds.), Proceedings of 25th International Conference on Database and Expert Systems Applications (DEXA). Springer International Publishing, Cham, pp. 42–58.
Oliveira, P.H., Traina, C., Kaster, D.S. (2015). Improving the pruning ability of dynamic metric access methods with local additional pivots and anticipation of information. In: Proceedings of 19th East European Conference on Advances in Databases and Information Systems (ADBIS), Poitiers, France, September 8–11, 2015. Springer International Publishing, Cham, pp. 18–31.
Pramanik, S., Li, J., Ruan, J., Bhattacharjee, S.K. (1999). Efficient search scheme for very large image databases. In: Beretta, G.B., Schettini, R. (Eds.), Proceedings of the International Society for Optical Engineering (SPIE) on Internet Imaging, San Jose, California, USA, January 26, 2000, Vol. 3964. The International Society for Optical Engineering, pp. 79–90.
Skopal, T., Hoksza, D. (2007). Improving the performance of m-tree family by nearest-neighbor graphs. In: Ioannidis, Y., Novikov, B., Rachev, B. (Eds.), Proceedings of 11th East European Conference on Advances in Databases and Information Systems (ADBIS), Varna, Bulgaria, September 29–October 3, 2007. Springer, Berlin, pp. 172–188.
Skopal, T., Pokorný, J., Snášel, V. (2005). Nearest neighbours search using the PM-Tree. In Lecture Notes in Computer Science: Vol. 3453. Proceedings of the 10th International Conference on Database Systems for Advanced Applications (DASFAA), Beijing, China, April 17–20, 2005. Springer, pp. 803–815.
Biographies
Antol Matej
M. Antol is a PhD student at the Faculty of Informatics, Masaryk University in Brno, Czech Republic. His area of interest is data analytics, namely optimization of hierarchical structures and their performance in metric spaces.
Dohnal Vlastislav
V. Dohnal is an associate professor at the Faculty of Informatics, Masaryk University. He has been interested in database systems and information retrieval. In particular, indexing principles and mechanisms for unstructured data modelled as a metric space.