Algorithmic classification of five characteristic types of paraphasias

Gerasimos Fergadiotis, Kyle Gorman, Steven Bedrick

Research output: Contribution to journal › Article

Abstract

Purpose: This study was intended to evaluate a series of algorithms developed to perform automatic classification of paraphasic errors (formal, semantic, mixed, neologistic, and unrelated errors).

Method: We analyzed 7,111 paraphasias from the Moss Aphasia Psycholinguistics Project Database (Mirman et al., 2010) and evaluated the classification accuracy of 3 automated tools. First, we used frequency norms from the SUBTLEXus database (Brysbaert & New, 2009) to differentiate nonword errors and real-word productions. Then we implemented a phonological-similarity algorithm to identify phonologically related real-word errors. Last, we assessed the performance of a semantic-similarity criterion that was based on word2vec (Mikolov, Yih, & Zweig, 2013).

Results: Overall, the algorithmic classification replicated human scoring for the major categories of paraphasias studied with high accuracy. The tool that was based on the SUBTLEXus frequency norms was more than 97% accurate in making lexicality judgments. The phonological-similarity criterion was approximately 91% accurate, and the overall classification accuracy of the semantic classifier ranged from 86% to 90%.

Conclusion: Overall, the results highlight the potential of tools from the field of natural language processing for the development of highly reliable, cost-effective diagnostic tools suitable for collecting high-quality measurement data for research and clinical purposes.
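The three-stage cascade described in the Method can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the toy lexicon and vectors, both thresholds, and the use of spelling-based edit distance in place of a phonemic comparison are all assumptions made for this example. It also treats every nonword as neologistic, which is a simplification.

```python
from math import sqrt

# Hypothetical stand-ins: a real system would use SUBTLEXus frequency norms
# for the lexicon and trained word2vec embeddings for the vectors.
TOY_LEXICON = {"cat", "dog", "hat", "rat", "table"}
TOY_VECTORS = {
    "cat": (1.0, 0.0),
    "dog": (0.9, 0.3),
    "rat": (0.8, 0.2),
    "hat": (0.0, 1.0),
    "table": (-0.5, 0.5),
}


def is_real_word(word, lexicon):
    """Stage 1: lexicality check by lookup in a word list."""
    return word.lower() in lexicon


def edit_distance(a, b):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def phonologically_related(target, response, max_ratio=0.5):
    """Stage 2 (assumed criterion): normalized edit distance over spelling.
    A phonemic transcription would be the more faithful input."""
    longest = max(len(target), len(response))
    return edit_distance(target, response) / longest <= max_ratio


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify(target, response, lexicon, vectors, sem_threshold=0.5):
    """Assign one of the five labels named in the abstract."""
    if not is_real_word(response, lexicon):
        return "neologistic"
    phon = phonologically_related(target, response)
    # Stage 3 (assumed criterion): cosine similarity above a fixed threshold.
    sem = (
        target in vectors
        and response in vectors
        and cosine(vectors[target], vectors[response]) >= sem_threshold
    )
    if phon and sem:
        return "mixed"
    if phon:
        return "formal"
    if sem:
        return "semantic"
    return "unrelated"


for response in ("dog", "hat", "rat", "table", "blick"):
    print(response, classify("cat", response, TOY_LEXICON, TOY_VECTORS))
```

With the toy data, the target "cat" paired with "dog", "hat", "rat", "table", and "blick" comes out as semantic, formal, mixed, unrelated, and neologistic respectively. The thresholds here are arbitrary and would need tuning against human-scored data.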

Original language: English (US)
Pages (from-to): S776-S787
Journal: American Journal of Speech-Language Pathology
Volume: 25
Issue number: 4
DOI: 10.1044/2016_AJSLP-15-0147
State: Published - 2016

ASJC Scopus subject areas

  • Otorhinolaryngology
  • Developmental and Educational Psychology
  • Linguistics and Language
  • Speech and Hearing

Cite this

Algorithmic classification of five characteristic types of paraphasias. / Fergadiotis, Gerasimos; Gorman, Kyle; Bedrick, Steven.

In: American Journal of Speech-Language Pathology, Vol. 25, No. 4, 2016, p. S776-S787.

@article{cebe9fd14f9948d8b20985f3b8e89e68,
title = "Algorithmic classification of five characteristic types of paraphasias",
abstract = "Purpose: This study was intended to evaluate a series of algorithms developed to perform automatic classification of paraphasic errors (formal, semantic, mixed, neologistic, and unrelated errors). Method: We analyzed 7,111 paraphasias from the Moss Aphasia Psycholinguistics Project Database (Mirman et al., 2010) and evaluated the classification accuracy of 3 automated tools. First, we used frequency norms from the SUBTLEXus database (Brysbaert \& New, 2009) to differentiate nonword errors and real-word productions. Then we implemented a phonological-similarity algorithm to identify phonologically related real-word errors. Last, we assessed the performance of a semantic-similarity criterion that was based on word2vec (Mikolov, Yih, \& Zweig, 2013). Results: Overall, the algorithmic classification replicated human scoring for the major categories of paraphasias studied with high accuracy. The tool that was based on the SUBTLEXus frequency norms was more than 97{\%} accurate in making lexicality judgments. The phonological-similarity criterion was approximately 91{\%} accurate, and the overall classification accuracy of the semantic classifier ranged from 86{\%} to 90{\%}. Conclusion: Overall, the results highlight the potential of tools from the field of natural language processing for the development of highly reliable, cost-effective diagnostic tools suitable for collecting high-quality measurement data for research and clinical purposes.",
author = "Gerasimos Fergadiotis and Kyle Gorman and Steven Bedrick",
year = "2016",
doi = "10.1044/2016_AJSLP-15-0147",
language = "English (US)",
volume = "25",
pages = "S776--S787",
journal = "American Journal of Speech-Language Pathology",
issn = "1058-0360",
publisher = "American Speech-Language-Hearing Association (ASHA)",
number = "4",

}

TY - JOUR

T1 - Algorithmic classification of five characteristic types of paraphasias

AU - Fergadiotis, Gerasimos

AU - Gorman, Kyle

AU - Bedrick, Steven

PY - 2016

Y1 - 2016

AB - Purpose: This study was intended to evaluate a series of algorithms developed to perform automatic classification of paraphasic errors (formal, semantic, mixed, neologistic, and unrelated errors). Method: We analyzed 7,111 paraphasias from the Moss Aphasia Psycholinguistics Project Database (Mirman et al., 2010) and evaluated the classification accuracy of 3 automated tools. First, we used frequency norms from the SUBTLEXus database (Brysbaert & New, 2009) to differentiate nonword errors and real-word productions. Then we implemented a phonological-similarity algorithm to identify phonologically related real-word errors. Last, we assessed the performance of a semantic-similarity criterion that was based on word2vec (Mikolov, Yih, & Zweig, 2013). Results: Overall, the algorithmic classification replicated human scoring for the major categories of paraphasias studied with high accuracy. The tool that was based on the SUBTLEXus frequency norms was more than 97% accurate in making lexicality judgments. The phonological-similarity criterion was approximately 91% accurate, and the overall classification accuracy of the semantic classifier ranged from 86% to 90%. Conclusion: Overall, the results highlight the potential of tools from the field of natural language processing for the development of highly reliable, cost-effective diagnostic tools suitable for collecting high-quality measurement data for research and clinical purposes.

UR - http://www.scopus.com/inward/record.url?scp=85007226796&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85007226796&partnerID=8YFLogxK

U2 - 10.1044/2016_AJSLP-15-0147

DO - 10.1044/2016_AJSLP-15-0147

M3 - Article

C2 - 27997952

AN - SCOPUS:85007226796

VL - 25

SP - S776

EP - S787

JO - American Journal of Speech-Language Pathology

JF - American Journal of Speech-Language Pathology

SN - 1058-0360

IS - 4

ER -