Algorithms for modeling distributions over large alphabets

Alon Orlitsky, Sajama, Narayana Santhanam, Krishnamurthy Viswanathan, Junan Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

9 Citations (Scopus)

Abstract

We consider the problem of modeling a distribution whose alphabet size is large relative to the amount of observed data. It is well known that conventional maximum-likelihood estimates do not perform well in that regime. Instead, we find the distribution maximizing the probability of the data's pattern. We derive an efficient algorithm for approximating this distribution. Simulations show that the computed distribution models the data well and yields general estimators that evaluate various data attributes as well as specific estimators designed especially for these tasks.
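To make the idea of a pattern concrete, here is a minimal Python sketch. It is not the efficient algorithm the abstract refers to, only a brute-force illustration for tiny inputs. It assumes the usual definition of a pattern, in which each symbol is replaced by the order of its first appearance, and it assumes i.i.d. sampling. The function names `pattern` and `pattern_probability` and the example data are hypothetical, chosen only for this illustration.

```python
from itertools import product


def pattern(seq):
    """Replace each symbol by the order in which it first appeared."""
    first_seen = {}
    out = []
    for symbol in seq:
        if symbol not in first_seen:
            first_seen[symbol] = len(first_seen) + 1
        out.append(first_seen[symbol])
    return tuple(out)


def pattern_probability(target, probs):
    """Probability that an i.i.d. sample from `probs` has pattern `target`.

    Brute force: enumerate every sequence over the alphabet and add up the
    probabilities of those whose pattern matches. Exponential in the sample
    length, so only usable for very small examples.
    """
    total = 0.0
    for seq in product(range(len(probs)), repeat=len(target)):
        if pattern(seq) == target:
            p = 1.0
            for symbol in seq:
                p *= probs[symbol]
            total += p
    return total


data = "abac"
pat = pattern(data)  # (1, 2, 1, 3)

# Conventional maximum likelihood: the empirical frequencies of a, b, c.
empirical = [0.5, 0.25, 0.25]

# A candidate with more symbols than were observed (uniform over 5 symbols).
uniform5 = [0.2] * 5

print(pat)
print(pattern_probability(pat, empirical))
print(pattern_probability(pat, uniform5))
```

In this small case the uniform distribution over five symbols assigns the pattern a higher probability than the empirical distribution does, which illustrates why maximizing pattern probability can favor alphabets larger than the set of observed symbols.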

Original language: English (US)
Title of host publication: IEEE International Symposium on Information Theory - Proceedings
Pages: 306
Number of pages: 1
State: Published - 2004
Externally published: Yes
Event: Proceedings - 2004 IEEE International Symposium on Information Theory - Chicago, IL, United States
Duration: Jun 27 2004 to Jul 2 2004

Other

Other: Proceedings - 2004 IEEE International Symposium on Information Theory
Country: United States
City: Chicago, IL
Period: 6/27/04 to 7/2/04

Fingerprint

Maximum likelihood

ASJC Scopus subject areas

  • Electrical and Electronic Engineering

Cite this

Orlitsky, A., Sajama, Santhanam, N., Viswanathan, K., & Zhang, J. (2004). Algorithms for modeling distributions over large alphabets. In IEEE International Symposium on Information Theory - Proceedings (p. 306).

Algorithms for modeling distributions over large alphabets. / Orlitsky, Alon; Sajama; Santhanam, Narayana; Viswanathan, Krishnamurthy; Zhang, Junan.

IEEE International Symposium on Information Theory - Proceedings. 2004. p. 306.

Orlitsky, A, Sajama, Santhanam, N, Viswanathan, K & Zhang, J 2004, Algorithms for modeling distributions over large alphabets. in IEEE International Symposium on Information Theory - Proceedings. p. 306, Proceedings - 2004 IEEE International Symposium on Information Theory, Chicago, IL, United States, 6/27/04.
Orlitsky A, Sajama, Santhanam N, Viswanathan K, Zhang J. Algorithms for modeling distributions over large alphabets. In IEEE International Symposium on Information Theory - Proceedings. 2004. p. 306
Orlitsky, Alon ; Sajama ; Santhanam, Narayana ; Viswanathan, Krishnamurthy ; Zhang, Junan. / Algorithms for modeling distributions over large alphabets. IEEE International Symposium on Information Theory - Proceedings. 2004. p. 306
@inproceedings{d16c23e339254db188e2f5d956f6f6e1,
title = "Algorithms for modeling distributions over large alphabets",
abstract = "We consider the problem of modeling a distribution whose alphabet size is large relative to the amount of observed data. It is well known that conventional maximum-likelihood estimates do not perform well in that regime. Instead, we find the distribution maximizing the probability of the data's pattern. We derive an efficient algorithm for approximating this distribution. Simulations show that the computed distribution models the data well and yields general estimators that evaluate various data attributes as well as specific estimators designed especially for these tasks.",
author = "Alon Orlitsky and Sajama and Narayana Santhanam and Krishnamurthy Viswanathan and Junan Zhang",
year = "2004",
language = "English (US)",
pages = "306",
booktitle = "IEEE International Symposium on Information Theory - Proceedings",

}

TY - GEN

T1 - Algorithms for modeling distributions over large alphabets

AU - Orlitsky, Alon

AU - Sajama

AU - Santhanam, Narayana

AU - Viswanathan, Krishnamurthy

AU - Zhang, Junan

PY - 2004

Y1 - 2004

AB - We consider the problem of modeling a distribution whose alphabet size is large relative to the amount of observed data. It is well known that conventional maximum-likelihood estimates do not perform well in that regime. Instead, we find the distribution maximizing the probability of the data's pattern. We derive an efficient algorithm for approximating this distribution. Simulations show that the computed distribution models the data well and yields general estimators that evaluate various data attributes as well as specific estimators designed especially for these tasks.

UR - http://www.scopus.com/inward/record.url?scp=5044241234&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=5044241234&partnerID=8YFLogxK

M3 - Conference contribution

SP - 306

BT - IEEE International Symposium on Information Theory - Proceedings

ER -