TY - GEN
T1 - Using heterogeneous annotation and visual information for the benchmarking of image retrieval systems
AU - Müller, Henning
AU - Clough, Paul
AU - Hersh, William
AU - Deselaers, Thomas
AU - Lehmann, Thomas M.
AU - Janvier, Bruno
AU - Geissbuhler, Antoine
N1 - Copyright:
Copyright 2008 Elsevier B.V., All rights reserved.
PY - 2006
Y1 - 2006
N2 - Many image retrieval systems, and the evaluation methodologies of these systems, make use of either visual or textual information only. Only a few combine textual and visual features for retrieval and evaluation. If text is used, it often relies on having a standardised and complete annotation schema for the entire collection. This, in combination with high-level semantic queries, makes visual/textual combinations almost useless, as the information need can often be satisfied using textual features alone. In reality, many collections do have some form of annotation, but it is often heterogeneous and incomplete. Web-based image repositories such as Flickr even allow collective, as well as multilingual, annotation of multimedia objects. This article describes an image retrieval evaluation campaign called ImageCLEF. Unlike previous evaluations, we offer a range of realistic tasks and image collections in which combining text and visual features is likely to obtain the best results. In particular, we offer a medical retrieval task that models exactly this situation of heterogeneous annotation by combining four collections with annotations of varying quality, structure, extent and language. Two collections have an annotation per case rather than per image, which is normal in the medical domain, making it difficult to relate parts of the accompanying text to the corresponding images. This is also typical of image retrieval from the web, in which adjacent text does not always describe an image. The ImageCLEF benchmark shows the need for realistic and standardised datasets, search tasks and ground truths for visual information retrieval evaluation.
AB - Many image retrieval systems, and the evaluation methodologies of these systems, make use of either visual or textual information only. Only a few combine textual and visual features for retrieval and evaluation. If text is used, it often relies on having a standardised and complete annotation schema for the entire collection. This, in combination with high-level semantic queries, makes visual/textual combinations almost useless, as the information need can often be satisfied using textual features alone. In reality, many collections do have some form of annotation, but it is often heterogeneous and incomplete. Web-based image repositories such as Flickr even allow collective, as well as multilingual, annotation of multimedia objects. This article describes an image retrieval evaluation campaign called ImageCLEF. Unlike previous evaluations, we offer a range of realistic tasks and image collections in which combining text and visual features is likely to obtain the best results. In particular, we offer a medical retrieval task that models exactly this situation of heterogeneous annotation by combining four collections with annotations of varying quality, structure, extent and language. Two collections have an annotation per case rather than per image, which is normal in the medical domain, making it difficult to relate parts of the accompanying text to the corresponding images. This is also typical of image retrieval from the web, in which adjacent text does not always describe an image. The ImageCLEF benchmark shows the need for realistic and standardised datasets, search tasks and ground truths for visual information retrieval evaluation.
UR - http://www.scopus.com/inward/record.url?scp=33645685975&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33645685975&partnerID=8YFLogxK
U2 - 10.1117/12.660259
DO - 10.1117/12.660259
M3 - Conference contribution
AN - SCOPUS:33645685975
SN - 0819461016
SN - 9780819461018
T3 - Proceedings of SPIE - The International Society for Optical Engineering
BT - Internet Imaging VII - Proceedings of SPIE-IS and T Electronic Imaging
T2 - Internet Imaging VII
Y2 - 18 January 2006 through 19 January 2006
ER -