Perceptual experiments for diagnostic testing of text-to-speech systems

Research output: Contribution to journal › Article

32 Citations (Scopus)

Abstract

This paper describes perceptual methods for diagnosing problems in text-to-speech systems. Special attention is paid to two issues. The first is coverage of the domain of a text-to-speech system. Since this domain involves an enormous range of contexts, it is critical for diagnostics, and also for overall evaluation, that test materials cover this range to the fullest extent possible. Automatic text generation algorithms that make extensive use of "greedy" algorithms are described that serve this purpose. Second, speech generated by text-to-speech systems tends to have a great variety of problems. A battery of experimental paradigms is discussed that addresses different facets of speech quality and intelligibility. Included are: (a) a "word pointing" method for detection of problematic concatenative units; (b) a "minimal pairs intelligibility test", an expanded diagnostic rhyme test; (c) an automatically scored orthographic name transcription task; (d) a mean opinion score paradigm with problem categorization; and (e) a paired comparison paradigm with strength-of-choice rating. The methods are applied in a series of experiments on high-end text-to-speech systems.
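The abstract does not spell out how the "greedy" text generation works, so the following is only a minimal sketch of the textbook greedy set-cover heuristic applied to test-material selection. It assumes each candidate sentence has already been mapped to the set of units or contexts it contains; the function name `greedy_cover`, the sentence ids, and the toy unit labels are all hypothetical and not taken from the paper.

```python
from typing import Dict, List, Set


def greedy_cover(candidates: Dict[str, Set[str]]) -> List[str]:
    """Select sentences until every unit found in any candidate is covered.

    `candidates` maps a sentence id to the set of units (for example,
    concatenative units or phonetic contexts) that the sentence contains.
    This is the generic greedy set-cover heuristic, used here as an
    assumption; it is not necessarily the procedure used in the paper.
    """
    uncovered: Set[str] = set().union(*candidates.values())
    chosen: List[str] = []
    while uncovered:
        # Pick the sentence covering the most still-uncovered units.
        # Iterating over sorted ids makes tie-breaking deterministic.
        best = max(sorted(candidates), key=lambda s: len(candidates[s] & uncovered))
        chosen.append(best)
        uncovered -= candidates[best]
    return chosen


# Hypothetical toy data: sentence ids mapped to the units they contain.
toy = {
    "s1": {"a-b", "b-c", "c-d"},
    "s2": {"a-b", "d-e"},
    "s3": {"d-e", "e-f"},
    "s4": {"b-c"},
}
selection = greedy_cover(toy)
print(selection)
```

On the toy data the first pick is the sentence with three new units, and the second pick closes the remaining gap, so two of the four sentences cover all five units. Greedy selection of this kind is not guaranteed to give the smallest possible test set, but it is cheap to compute and usually close.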

Original language: English (US)
Pages (from-to): 49-100
Number of pages: 52
Journal: Computer Speech and Language
Volume: 7
Issue number: 1
DOIs: 10.1006/csla.1993.1004
State: Published - Jan 1993
Externally published: Yes

Fingerprint

Text-to-speech
Diagnostics
Testing
Paradigm
Experiment
Experiments
Paired Comparisons
Diagnostic Tests
Speech Intelligibility
Categorization
Matched-Pair Analysis
Greedy Algorithm
Facet
Battery
Range of data
Transcription
test evaluation

ASJC Scopus subject areas

  • Linguistics and Language
  • Experimental and Cognitive Psychology
  • Electrical and Electronic Engineering
  • Signal Processing

Cite this

Perceptual experiments for diagnostic testing of text-to-speech systems. / Van Santen, Jan.

In: Computer Speech and Language, Vol. 7, No. 1, 01.1993, p. 49-100.


@article{7043e631793047cba208de694dfbaed0,
title = "Perceptual experiments for diagnostic testing of text-to-speech systems",
abstract = "This paper describes perceptual methods for diagnosing problems in text-to-speech systems. Special attention is paid to two issues. The first is coverage of the domain of a text-to-speech system. Since this domain involves an enormous range of contexts, it is critical for diagnostics, and also for overall evaluation, that test materials cover this range to the fullest extent possible. Automatic text generation algorithms that make extensive use of {"}greedy{"} algorithms are described that serve this purpose. Second, speech generated by text-to-speech systems tends to have a great variety of problems. A battery of experimental paradigms is discussed that addresses different facets of speech quality and intelligibility. Included are: (a) a {"}word pointing{"} method for detection of problematic concatenative units; (b) a {"}minimal pairs intelligibility test{"}, an expanded diagnostic rhyme test; (c) an automatically scored orthographic name transcription task; (d) a mean opinion score paradigm with problem categorization; and (e) a paired comparison paradigm with strength-of-choice rating. The methods are applied in a series of experiments on high-end text-to-speech systems.",
author = "{Van Santen}, Jan",
year = "1993",
month = "1",
doi = "10.1006/csla.1993.1004",
language = "English (US)",
volume = "7",
pages = "49--100",
journal = "Computer Speech and Language",
issn = "0885-2308",
publisher = "Academic Press Inc.",
number = "1",

}

TY - JOUR

T1 - Perceptual experiments for diagnostic testing of text-to-speech systems

AU - Van Santen, Jan

PY - 1993/1

Y1 - 1993/1

N2 - This paper describes perceptual methods for diagnosing problems in text-to-speech systems. Special attention is paid to two issues. The first is coverage of the domain of a text-to-speech system. Since this domain involves an enormous range of contexts, it is critical for diagnostics, and also for overall evaluation, that test materials cover this range to the fullest extent possible. Automatic text generation algorithms that make extensive use of "greedy" algorithms are described that serve this purpose. Second, speech generated by text-to-speech systems tends to have a great variety of problems. A battery of experimental paradigms is discussed that addresses different facets of speech quality and intelligibility. Included are: (a) a "word pointing" method for detection of problematic concatenative units; (b) a "minimal pairs intelligibility test", an expanded diagnostic rhyme test; (c) an automatically scored orthographic name transcription task; (d) a mean opinion score paradigm with problem categorization; and (e) a paired comparison paradigm with strength-of-choice rating. The methods are applied in a series of experiments on high-end text-to-speech systems.

AB - This paper describes perceptual methods for diagnosing problems in text-to-speech systems. Special attention is paid to two issues. The first is coverage of the domain of a text-to-speech system. Since this domain involves an enormous range of contexts, it is critical for diagnostics, and also for overall evaluation, that test materials cover this range to the fullest extent possible. Automatic text generation algorithms that make extensive use of "greedy" algorithms are described that serve this purpose. Second, speech generated by text-to-speech systems tends to have a great variety of problems. A battery of experimental paradigms is discussed that addresses different facets of speech quality and intelligibility. Included are: (a) a "word pointing" method for detection of problematic concatenative units; (b) a "minimal pairs intelligibility test", an expanded diagnostic rhyme test; (c) an automatically scored orthographic name transcription task; (d) a mean opinion score paradigm with problem categorization; and (e) a paired comparison paradigm with strength-of-choice rating. The methods are applied in a series of experiments on high-end text-to-speech systems.

UR - http://www.scopus.com/inward/record.url?scp=0027147339&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0027147339&partnerID=8YFLogxK

U2 - 10.1006/csla.1993.1004

DO - 10.1006/csla.1993.1004

M3 - Article

AN - SCOPUS:0027147339

VL - 7

SP - 49

EP - 100

JO - Computer Speech and Language

JF - Computer Speech and Language

SN - 0885-2308

IS - 1

ER -