Buch, Englisch, 210 Seiten
Buch, Englisch, 210 Seiten
Reihe: Dissertations in Database and Information Systems
ISBN: 978-1-58603-177-0
Verlag: IOS Press
With the example of an image database and the notion of relevance feedback, the dissertation addresses the important and challenging problem of identifying the most similar objects in a database given a set of reference objects and and a set of features. Similarity search is typically implemented as "Nearest Neighbor Search" (NN- Search) in high-dimensional vector spaces. Although a large number of existing work have provided solutions for the NN-Search problem, most of them have not taken the specific problems of high dimensionality explicitly into account. This work carefully investigates the so-called "Curse of Dimensionality", and presents a novel, flat organization for NN-Search optimized for high-dimensional spaces: the so-called "Vector Approximation File" (VA-File). The superiority of the VA-File over conventional indexing structures is shown theoretically and with extensive experiments. Further refinements of the VA-File discussed in the current work include approximate search and parallel search in a cluster of workstations. From a practical perspective, the dissertation provides an indexing technique that allows for interactive-time
similarity search even in huge databases with gigabytes of vector data. As such, it is the most favorable indexing technique for future multimedia retrieval systems.