TY - JOUR
T1 - Algorithmic classification of five characteristic types of paraphasias
AU - Fergadiotis, Gerasimos
AU - Gorman, Kyle
AU - Bedrick, Steven
N1 - Publisher Copyright: © 2016 American Speech-Language-Hearing Association.
PY - 2016
Y1 - 2016
N2 - Purpose: This study was intended to evaluate a series of algorithms developed to perform automatic classification of paraphasic errors (formal, semantic, mixed, neologistic, and unrelated errors). Method: We analyzed 7,111 paraphasias from the Moss Aphasia Psycholinguistics Project Database (Mirman et al., 2010) and evaluated the classification accuracy of 3 automated tools. First, we used frequency norms from the SUBTLEXus database (Brysbaert & New, 2009) to differentiate nonword errors and real-word productions. Then we implemented a phonological-similarity algorithm to identify phonologically related real-word errors. Last, we assessed the performance of a semantic-similarity criterion that was based on word2vec (Mikolov, Yih, & Zweig, 2013). Results: Overall, the algorithmic classification replicated human scoring for the major categories of paraphasias studied with high accuracy. The tool that was based on the SUBTLEXus frequency norms was more than 97% accurate in making lexicality judgments. The phonological-similarity criterion was approximately 91% accurate, and the overall classification accuracy of the semantic classifier ranged from 86% to 90%. Conclusion: Overall, the results highlight the potential of tools from the field of natural language processing for the development of highly reliable, cost-effective diagnostic tools suitable for collecting high-quality measurement data for research and clinical purposes.
AB - Purpose: This study was intended to evaluate a series of algorithms developed to perform automatic classification of paraphasic errors (formal, semantic, mixed, neologistic, and unrelated errors). Method: We analyzed 7,111 paraphasias from the Moss Aphasia Psycholinguistics Project Database (Mirman et al., 2010) and evaluated the classification accuracy of 3 automated tools. First, we used frequency norms from the SUBTLEXus database (Brysbaert & New, 2009) to differentiate nonword errors and real-word productions. Then we implemented a phonological-similarity algorithm to identify phonologically related real-word errors. Last, we assessed the performance of a semantic-similarity criterion that was based on word2vec (Mikolov, Yih, & Zweig, 2013). Results: Overall, the algorithmic classification replicated human scoring for the major categories of paraphasias studied with high accuracy. The tool that was based on the SUBTLEXus frequency norms was more than 97% accurate in making lexicality judgments. The phonological-similarity criterion was approximately 91% accurate, and the overall classification accuracy of the semantic classifier ranged from 86% to 90%. Conclusion: Overall, the results highlight the potential of tools from the field of natural language processing for the development of highly reliable, cost-effective diagnostic tools suitable for collecting high-quality measurement data for research and clinical purposes.
UR - http://www.scopus.com/inward/record.url?scp=85007226796&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85007226796&partnerID=8YFLogxK
U2 - 10.1044/2016_AJSLP-15-0147
DO - 10.1044/2016_AJSLP-15-0147
M3 - Article
C2 - 27997952
AN - SCOPUS:85007226796
SN - 1058-0360
VL - 25
SP - S776-S787
JO - American Journal of Speech-Language Pathology
JF - American Journal of Speech-Language Pathology
IS - 4
ER -