Algorithms for modeling distributions over large alphabets

Alon Orlitsky, Sajama, Narayana Santhanam, Krishnamurthy Viswanathan, Junan Zhang

Research output: Contribution to journal (conference article)

Abstract

We consider the problem of modeling a distribution whose alphabet size is large relative to the amount of observed data. It is well known that conventional maximum-likelihood estimates do not perform well in that regime. Instead, we find the distribution maximizing the probability of the data's pattern. We derive an efficient algorithm for approximating this distribution. Simulations show that the computed distribution models the data well and yields general-purpose estimators that evaluate various data attributes as accurately as specific estimators designed especially for these tasks.
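To illustrate what "the probability of the data's pattern" means, the following is a minimal sketch, not the efficient algorithm the abstract refers to. It assumes the standard definition of a pattern (each symbol replaced by the order of its first appearance) and i.i.d. sampling, and it computes the pattern probability by brute force over all assignments of pattern indices to alphabet symbols, which is only feasible for tiny alphabets. The function names `pattern` and `pattern_probability` are hypothetical and chosen for this example.

```python
from itertools import permutations
from math import prod


def pattern(seq):
    """Replace each symbol by the order of its first appearance (1-based)."""
    first_seen = {}
    return [first_seen.setdefault(s, len(first_seen) + 1) for s in seq]


def pattern_probability(pat, p):
    """Probability that an i.i.d. sample from p has pattern `pat`.

    Brute force: sum over every injective assignment of the pattern's
    indices to symbols of the alphabet. Exponential cost; illustration only.
    """
    m = max(pat, default=0)                      # number of distinct indices
    counts = [pat.count(j) for j in range(1, m + 1)]
    return sum(
        prod(p[s] ** c for s, c in zip(assignment, counts))
        for assignment in permutations(range(len(p)), m)
    )


if __name__ == "__main__":
    pat = pattern("aab")                         # [1, 1, 2]
    empirical = [2 / 3, 1 / 3]                   # conventional ML estimate
    uniform = [1 / 2, 1 / 2]
    print(pat)
    print(pattern_probability(pat, empirical))   # about 0.2222 (= 2/9)
    print(pattern_probability(pat, uniform))     # 0.25
```

For the sample `aab`, the uniform distribution on two symbols assigns the pattern a higher probability (1/4) than the empirical distribution does (2/9), so the distribution that maximizes pattern probability can differ from the conventional maximum-likelihood estimate.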

Original language: English (US)
Number of pages: 1
Journal: IEEE International Symposium on Information Theory - Proceedings
State: Published - Oct 20 2004
Event: Proceedings - 2004 IEEE International Symposium on Information Theory - Chicago, IL, United States
Duration: Jun 27 2004 to Jul 2 2004

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Information Systems
  • Modeling and Simulation
  • Applied Mathematics
