TY - GEN
T1 - Using reinforcement learning for dialogue management policies
T2 - 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
AU - Heeman, Peter A.
AU - Fryer, Jordan
AU - Lunsford, Rebecca
AU - Rueckert, Andrew
AU - Selfridge, Ethan O.
N1 - Copyright 2013 Elsevier B.V., All rights reserved.
PY - 2012
Y1 - 2012
AB - Reinforcement learning is becoming a popular tool for building dialogue managers. This paper addresses two issues in using RL. First, we propose two methods for finding MDP violations. Both methods make use of computing Q scores when testing the policy. Second, we investigate how convergence happens. To do this, we use a dialogue task in which the only source of variability is the dialogue policy itself. This allows us to study how and when convergence occurs as training progresses. The work in this paper will help dialogue designers build effective policies and understand how much training is necessary.
UR - http://www.scopus.com/inward/record.url?scp=84878400692&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84878400692&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84878400692
SN - 9781622767595
T3 - 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
SP - 746
EP - 749
BT - 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Y2 - 9 September 2012 through 13 September 2012
ER -