Using heterogeneous annotation and visual information for the benchmarking of image retrieval systems

Henning Müller, Paul Clough, William (Bill) Hersh, Thomas Deselaers, Thomas M. Lehmann, Bruno Janvier, Antoine Geissbuhler

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Many image retrieval systems, and the evaluation methodologies of these systems, make use of either visual or textual information only. Only a few combine textual and visual features for retrieval and evaluation. If text is used, it often relies upon having a standardised and complete annotation schema for the entire collection. This, in combination with high-level semantic queries, makes visual/textual combinations almost useless, as the information need can often be solved using just textual features. In reality, many collections do have some form of annotation, but this is often heterogeneous and incomplete. Web-based image repositories such as Flickr even allow collective as well as multilingual annotation of multimedia objects. This article describes an image retrieval evaluation campaign called ImageCLEF. Unlike previous evaluations, we offer a range of realistic tasks and image collections in which combining text and visual features is likely to obtain the best results. In particular, we offer a medical retrieval task which models exactly the situation of heterogeneous annotation by combining four collections with annotations of varying quality, structure, extent and language. Two collections have an annotation per case and not per image, which is normal in the medical domain, making it difficult to relate parts of the accompanying text to corresponding images. This is also typical of image retrieval from the web, in which adjacent text does not always describe an image. The ImageCLEF benchmark shows the need for realistic and standardised datasets, search tasks and ground truths for visual information retrieval evaluation.
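The text/visual combination the abstract argues for is commonly realised as late fusion: each modality ranks documents independently, scores are normalised to a common range, and a weighted sum produces the final ranking. The sketch below is illustrative only; the function names, min-max normalisation and weighting scheme are assumptions for this example, not details taken from the paper.

```python
def minmax_normalize(scores):
    """Rescale a {doc_id: score} map to [0, 1] so modalities are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0  # avoid division by zero when all scores are equal
    return {doc: (s - lo) / span for doc, s in scores.items()}

def late_fusion(text_scores, visual_scores, w_text=0.6):
    """Weighted sum of normalised per-modality scores; docs missing from a
    modality contribute 0 for that modality. Returns (doc, score) pairs,
    best first."""
    t = minmax_normalize(text_scores)
    v = minmax_normalize(visual_scores)
    docs = set(t) | set(v)
    fused = {d: w_text * t.get(d, 0.0) + (1 - w_text) * v.get(d, 0.0)
             for d in docs}
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)
```

With heterogeneous, incomplete annotation of the kind the paper describes, many documents simply have no textual score; the `get(d, 0.0)` fallback makes the fusion degrade gracefully to visual-only evidence for those documents.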

Original language: English (US)
Title of host publication: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 6061
ISBN: 0819461016 (ISBN-13: 9780819461018)
DOI: 10.1117/12.660259
State: Published - 2006
Event: Internet Imaging VII - San Jose, CA, United States
Duration: Jan 18–19, 2006



ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Condensed Matter Physics

Cite this

Müller, H., Clough, P., Hersh, W. B., Deselaers, T., Lehmann, T. M., Janvier, B., & Geissbuhler, A. (2006). Using heterogeneous annotation and visual information for the benchmarking of image retrieval systems. In Proceedings of SPIE - The International Society for Optical Engineering (Vol. 6061). [606105] https://doi.org/10.1117/12.660259

