E-book, English, Volume 14, 198 pages
Series: The Information Retrieval Series
ISBN: 978-1-4020-8035-7
Publisher: Springer US
Format: PDF
Copy protection: 1 - PDF Watermark
Machine Learning and Statistical Modeling Approaches to Image Retrieval
describes several approaches to integrating machine learning and statistical modeling into an image retrieval and indexing system, approaches that have demonstrated promising results. The topics of this book reflect the authors' experience with image indexing and retrieval based on machine learning and statistical modeling. The book also contains detailed references for further reading and research in this field.
Target audience
Research
Authors/Editors
Further Information & Material
Image Retrieval and Linguistic Indexing.- Machine Learning and Statistical Modeling.- A Robust Region-Based Similarity Measure.- Cluster-Based Retrieval by Unsupervised Learning.- Categorization by Learning and Reasoning with Regions.- Automatic Linguistic Indexing of Pictures.- Modeling Ancient Paintings.- Conclusions and Future Work.
2.1 Similarity Comparison (p.16-17)
Similarity comparison is a key issue in CBIR [Santini and Jain, 1999]. In general, the comparison is performed over imagery features. According to the scope of representation, features fall roughly into two categories: global features and local features. The former category includes texture histogram, color histogram, color layout of the whole image, and features selected from multidimensional discriminant analysis of a collection of images [Faloutsos et al., 1994; Gupta and Jain, 1997; Pentland et al., 1996; Smith and Chang, 1996; Swets and Weng, 1996]. In the latter category are color, texture, and shape features for subimages [Picard and Minka, 1995], segmented regions [Carson et al., 2002; Chen and Wang, 2002; Ma and Manjunath, 1997; Wang et al., 2001b], and interest points [Schmid and Mohr, 1997].
As a relatively mature method, histogram matching has been applied to many general-purpose image retrieval systems such as IBM QBIC [Faloutsos et al., 1994], MIT Photobook [Pentland et al., 1996], Virage System [Gupta and Jain, 1997], and Columbia VisualSEEK and WebSEEK [Smith and Chang, 1996]. The Mahalanobis distance [Hafner et al., 1995] and intersection distance [Swain and Ballard, 1991] are commonly used to compute the difference between two histograms with the same number of bins. When the numbers of bins differ, e.g., when a sparse representation is used, the Earth Mover’s Distance (EMD) [Rubner et al., 1997] applies. The EMD is computed by solving a linear programming problem. A major drawback of the global histogram search lies in its sensitivity to intensity variations, color distortions, and cropping.
Many approaches have been proposed to tackle this problem:
* The PicToSeek [Gevers and Smeulders, 2000] system uses color models invariant to object geometry, object pose, and illumination.
* The VisualSEEK and Virage systems attempt to reduce the influence of intensity variations and color distortions by employing spatial relationships and color layout in addition to the elementary color, texture, and shape features.
* The same idea of color layout indexing is extended in a later system, Stanford WBIIS [Wang et al., 1998], which, instead of averaging, characterizes the color variations over the spatial extent of an image by Daubechies’ wavelet coefficients and their variances.
* Schmid and Mohr [Schmid and Mohr, 1997] proposed a method of indexing images based on local features of automatically detected interest points of images.
* Minka and Picard [Minka and Picard, 1997] described a learning algorithm for selecting and grouping features. The user guides the learning process by providing positive and negative examples.
* The approach presented in [Swets and Weng, 1996] uses what is called the Most Discriminating Features for image retrieval. These features are extracted from a set of training images by optimal linear projection.
* The Virage system allows users to adjust the weights of implemented features according to their own perceptions. The PicHunter system [Cox et al., 2000] and the UIUC MARS system [Mehrotra et al., 1997] adapt themselves to different applications and different users based on user feedback.
* To approximate the human perception of the shapes of the objects in the images, Del Bimbo and Pala [Bimbo and Pala, 1997] introduced a measure of shape similarity using elastic matching.
* In [Mojsilovic et al., 2000], matching and retrieval are performed along what are referred to as perceptual dimensions, which are obtained from subjective experiments and multidimensional scaling based on a model of human perception of color patterns.
* In [Berretti et al., 2000], two distinct similarity measures are proposed for content-based image retrieval by shape similarity: one concerned with fitting human perception, the other with the efficiency of data organization and indexing.
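
To make the histogram comparisons discussed earlier in this section concrete, the following is a minimal Python sketch, not code from the book. It shows an intersection distance for two histograms with the same number of bins, and an Earth Mover's Distance for the restricted case of one-dimensional signatures with ground distance |x - y|. The function names, the normalization of each histogram to unit mass, and the one-dimensional restriction are all assumptions made for illustration. In that restricted case a greedy sweep over sorted positions gives the optimal flow, so no linear programming solver is needed; the general EMD described above does require one.

```python
def intersection_distance(h1, h2):
    """Intersection distance between two histograms with the same bins.

    Assumption: both histograms are normalized to unit mass, so the result
    lies in [0, 1] (0 for identical distributions, 1 for no overlap).
    """
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    s1, s2 = sum(h1), sum(h2)
    if s1 <= 0 or s2 <= 0:
        raise ValueError("histograms must have positive total mass")
    overlap = sum(min(a / s1, b / s2) for a, b in zip(h1, h2))
    return 1.0 - overlap


def emd_1d(sig1, sig2):
    """EMD between two 1-D signatures given as lists of (position, weight).

    Assumptions: ground distance |x - y|, each signature normalized to unit
    mass. The two signatures may have different numbers of bins.
    """
    def normalize(sig):
        total = sum(w for _, w in sig)
        if total <= 0:
            raise ValueError("signature must have positive total mass")
        # Drop empty bins and sort by position.
        return sorted((x, w / total) for x, w in sig if w > 0)

    a, b = normalize(sig1), normalize(sig2)
    i = j = 0
    wa, wb = a[0][1], b[0][1]   # mass still to move from a[i] / into b[j]
    cost = 0.0
    eps = 1e-12                 # tolerance for floating-point residue
    while i < len(a) and j < len(b):
        flow = min(wa, wb)
        cost += flow * abs(a[i][0] - b[j][0])
        wa -= flow
        wb -= flow
        if wa <= eps:           # bin a[i] is emptied, move to the next one
            i += 1
            wa = a[i][1] if i < len(a) else 0.0
        if wb <= eps:           # bin b[j] is filled, move to the next one
            j += 1
            wb = b[j][1] if j < len(b) else 0.0
    return cost


if __name__ == "__main__":
    print(intersection_distance([2, 2], [4, 0]))   # 0.5
    print(emd_1d([(0, 1), (2, 1)], [(1, 1)]))      # 1.0
```

The second call compares a two-bin signature with a one-bin signature, which is the situation the intersection distance cannot handle.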