Enhancing access to the bibliome

The TREC 2004 Genomics Track

William (Bill) Hersh, Ravi Teja Bhupatiraju, Laura Ross, Phoebe Roberts, Aaron Cohen, Dale Kraemer

Research output: Contribution to journal (Article)

23 Citations (Scopus)

Abstract

Background: The goal of the TREC Genomics Track is to improve information retrieval in the area of genomics by creating test collections that will allow researchers to improve and better understand failures of their systems. The 2004 track included an ad hoc retrieval task, simulating use of a search engine to obtain documents about biomedical topics. This paper describes the Genomics Track of the Text Retrieval Conference (TREC) 2004, a forum for evaluation of IR research systems, where retrieval in the genomics domain has recently begun to be assessed.

Results: A total of 27 research groups submitted 47 different runs. The most effective runs, as measured by the primary evaluation measure of mean average precision (MAP), used a combination of domain-specific and general techniques. The best MAP obtained by any run was 0.4075. Techniques that expanded queries with gene name lists as well as words from related articles had the best efficacy. However, many runs performed more poorly than a simple baseline run, indicating that careful selection of system features is essential.

Conclusion: Various approaches to ad hoc retrieval provide a diversity of efficacy. The TREC Genomics Track and its test collection resources provide tools that allow improvement in information retrieval systems.
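The abstract names mean average precision (MAP) as the primary evaluation measure. As a rough illustration of how that measure is commonly computed, the following Python sketch averages per-topic average precision over a set of topics. It is not the track's evaluation code: the function names, the toy document identifiers, and the choice to score a topic with no relevant documents as 0.0 are all assumptions made for the example, and each ranked list is assumed to contain no duplicate documents.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision for one topic: the mean of the precision values
    at each rank where a relevant document appears, divided by the total
    number of relevant documents for the topic."""
    if not relevant:
        return 0.0  # assumption: a topic without relevant documents scores 0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(run: Dict[str, List[str]],
                           qrels: Dict[str, Set[str]]) -> float:
    """Mean of per-topic average precision over all topics in qrels.
    A topic missing from the run contributes an average precision of 0."""
    if not qrels:
        return 0.0
    total = sum(average_precision(run.get(topic, []), relevant)
                for topic, relevant in qrels.items())
    return total / len(qrels)


# Toy data (hypothetical topic and document identifiers).
example_run = {"t1": ["d1", "d2", "d3"], "t2": ["d4", "d5"]}
example_qrels = {"t1": {"d1", "d3"}, "t2": {"d5"}}
example_map = mean_average_precision(example_run, example_qrels)
```

In the toy data, topic t1 retrieves its two relevant documents at ranks 1 and 3, giving an average precision of (1/1 + 2/3) / 2, and topic t2 retrieves its single relevant document at rank 2, giving 1/2; the MAP is the mean of those two values.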

Original language: English (US)
Article number: 3
Journal: Journal of Biomedical Discovery and Collaboration
Volume: 1
Issue number: 1
DOI: 10.1186/1747-5333-1-3
State: Published - Mar 13 2006

Fingerprint

Genomics
Search Engine
Information retrieval systems
Information Storage and Retrieval
Search engines
Information retrieval
Information Systems
Names
Genes
Research Personnel
Research

ASJC Scopus subject areas

  • Health Informatics
  • Computer Science Applications
  • History and Philosophy of Science

Cite this

Enhancing access to the bibliome: The TREC 2004 Genomics Track. / Hersh, William (Bill); Bhupatiraju, Ravi Teja; Ross, Laura; Roberts, Phoebe; Cohen, Aaron; Kraemer, Dale.

In: Journal of Biomedical Discovery and Collaboration, Vol. 1, No. 1, 3, 13.03.2006.

@article{917af693bdfb4ef098c1adbd1419daad,
title = "Enhancing access to the bibliome: The TREC 2004 Genomics Track",
abstract = "Background: The goal of the TREC Genomics Track is to improve information retrieval in the area of genomics by creating test collections that will allow researchers to improve and better understand failures of their systems. The 2004 track included an ad hoc retrieval task, simulating use of a search engine to obtain documents about biomedical topics. This paper describes the Genomics Track of the Text Retrieval Conference (TREC) 2004, a forum for evaluation of IR research systems, where retrieval in the genomics domain has recently begun to be assessed. Results: A total of 27 research groups submitted 47 different runs. The most effective runs, as measured by the primary evaluation measure of mean average precision (MAP), used a combination of domain-specific and general techniques. The best MAP obtained by any run was 0.4075. Techniques that expanded queries with gene name lists as well as words from related articles had the best efficacy. However, many runs performed more poorly than a simple baseline run, indicating that careful selection of system features is essential. Conclusion: Various approaches to ad hoc retrieval provide a diversity of efficacy. The TREC Genomics Track and its test collection resources provide tools that allow improvement in information retrieval systems.",
author = "Hersh, {William (Bill)} and Bhupatiraju, {Ravi Teja} and Laura Ross and Phoebe Roberts and Aaron Cohen and Dale Kraemer",
year = "2006",
month = "3",
day = "13",
doi = "10.1186/1747-5333-1-3",
language = "English (US)",
volume = "1",
journal = "Journal of Biomedical Discovery and Collaboration",
issn = "1747-5333",
publisher = "BioMed Central",
number = "1",
}

TY - JOUR

T1 - Enhancing access to the bibliome

T2 - The TREC 2004 Genomics Track

AU - Hersh, William (Bill)

AU - Bhupatiraju, Ravi Teja

AU - Ross, Laura

AU - Roberts, Phoebe

AU - Cohen, Aaron

AU - Kraemer, Dale

PY - 2006/3/13

Y1 - 2006/3/13

AB - Background: The goal of the TREC Genomics Track is to improve information retrieval in the area of genomics by creating test collections that will allow researchers to improve and better understand failures of their systems. The 2004 track included an ad hoc retrieval task, simulating use of a search engine to obtain documents about biomedical topics. This paper describes the Genomics Track of the Text Retrieval Conference (TREC) 2004, a forum for evaluation of IR research systems, where retrieval in the genomics domain has recently begun to be assessed. Results: A total of 27 research groups submitted 47 different runs. The most effective runs, as measured by the primary evaluation measure of mean average precision (MAP), used a combination of domain-specific and general techniques. The best MAP obtained by any run was 0.4075. Techniques that expanded queries with gene name lists as well as words from related articles had the best efficacy. However, many runs performed more poorly than a simple baseline run, indicating that careful selection of system features is essential. Conclusion: Various approaches to ad hoc retrieval provide a diversity of efficacy. The TREC Genomics Track and its test collection resources provide tools that allow improvement in information retrieval systems.

UR - http://www.scopus.com/inward/record.url?scp=33747884460&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33747884460&partnerID=8YFLogxK

U2 - 10.1186/1747-5333-1-3

DO - 10.1186/1747-5333-1-3

M3 - Article

VL - 1

JO - Journal of Biomedical Discovery and Collaboration

JF - Journal of Biomedical Discovery and Collaboration

SN - 1747-5333

IS - 1

M1 - 3

ER -