Lucene, MetaMap, and language modeling: OHSU at CLEF eHealth 2013

Steven Bedrick, Golnar Sheikshabbafghi

Research output: Contribution to journal, Conference article

Abstract

The Oregon Health & Science University team's participation in task #3 ("addressing patients' medical questions") of this year's eHealth CLEF campaign included submissions from two different retrieval systems. The first was a traditional, Lucene-based system modified from one used in previous years' TREC-med campaigns; the second was a novel system that used statistical language modeling techniques to perform text retrieval. Since 2013 was the first year of our participation in this campaign, our focus was on familiarizing ourselves with working on a corpus of web text, as well as putting together a proof-of-concept implementation of a language-model retrieval system. We submitted three runs in total: one from the novel system and two from our Lucene-based system, one of which used the National Library of Medicine's MetaMap tool to perform query expansion. In general, our runs did not perform particularly well, although there were several topics for which our language model-based retrieval system produced the best P@10. Future work will focus on pre-indexing text normalization as well as a more sophisticated approach to query parsing.

Original language: English (US)
Journal: CEUR Workshop Proceedings
Volume: 1179
State: Published - Jan 1 2013
Event: 2013 Cross Language Evaluation Forum Conference, CLEF 2013 - Valencia, Spain
Duration: Sep 23 2013 to Sep 26 2013

Keywords

  • Language model
  • Lucene
  • MetaMap
  • Skip-grams

ASJC Scopus subject areas

  • Computer Science (all)
