Observer ratings of neighborhoods: Comparison of two methods

Elena Andresen, Theodore K. Malmstrom, Mario Schootman, Fredric D. Wolinsky, J. Philip Miller, Douglas K. Miller

Research output: Contribution to journal (Article)

6 Citations (Scopus)

Abstract

Background: Although neighborhood characteristics have important relationships with health outcomes, direct observation involves imperfect measurement. The African American Health (AAH) study included two observer neighborhood rating systems (the 5-item Krause scale and the 18-item AAH Neighborhood Assessment Scale [NAS]), initially fielded at two different waves. Good measurement characteristics were previously shown for both, but there was more rater variability than desired. In 2010 both measures were re-fielded together, with enhanced training and field methods implemented to decrease rater variability while maintaining psychometric properties.

Methods: AAH included a poor inner-city area and more heterogeneous suburban areas. Four interviewers rated 483 blocks, with 120 randomly selected blocks rated by two interviewers. We conducted confirmatory factor analysis of scales and tested the Krause (5-20 points), AAH 18-item NAS (0-28 points), and a previous 7-item and new 5-item versions of the NAS (0-17 points, 0-11 points). Retest reliability for items (kappa) and scales (intraclass correlation coefficient [ICC]) was calculated overall and among pre-specified subgroups. Linear regression assessed interviewer effects on total scale scores and assessed concurrent validity on lung and lower body functions. Mismeasurement effects on self-rated health were also assessed.

Results: Scale scores were better in the suburbs than in the inner city. ICC was poor for the Krause scale (ICC=0.19), but improved if the retests occurred within 10 days (ICC=0.49). The 7- and 5-item NAS scales had better ICCs (0.56 and 0.62, respectively), and these were higher (0.71 and 0.73) within 10 days. Rater variability for the Krause and 5- and 7-item NAS scales was 1-3 points (compared to the supervising rater). Concurrent validity was modest, with residents living in worse neighborhood conditions having worse function. Unadjusted estimates were biased towards the null compared with measurement-error-corrected estimates.

Conclusions: Enhanced field protocols and rater training did not improve measurement quality. Specifically, retest reliability and interviewer variability remained problematic. Measurement error partially reduced, but did not eliminate, concurrent validity, suggesting there are robust associations between neighborhood characteristics and health outcomes. We conclude that the 5-item AAH NAS has sufficient reliability and validity for further use. Additional research on the measurement properties of environmental rating methods is encouraged.
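
The abstract reports retest reliability of scale scores as an intraclass correlation coefficient (ICC) but does not say which ICC form was used. As a rough illustration only, the sketch below computes a one-way random-effects ICC, often written ICC(1,1), for blocks that were each rated the same number of times. The function name `icc_oneway` and the example scores are hypothetical and are not taken from the study.

```python
from statistics import mean


def icc_oneway(ratings):
    """One-way random-effects ICC, i.e. ICC(1,1).

    `ratings` is a list of rows, one row per rated block, each row holding
    the same number (k >= 2) of scale scores for that block.
    Assumption: this ICC form is chosen for illustration; the abstract
    does not state which form the authors used.
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two rated blocks")
    k = len(ratings[0])
    if k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("every block needs the same number (>= 2) of ratings")

    grand_mean = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]

    # Between-block mean square: spread of block means around the grand mean.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    # Within-block mean square: disagreement between ratings of the same block.
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))

    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (ms_between - ms_within) / denominator


if __name__ == "__main__":
    # Made-up scores for four blocks, each rated twice (not study data).
    example_scores = [[3, 4], [7, 6], [1, 1], [9, 8]]
    print(round(icc_oneway(example_scores), 3))
```

Values near 1 mean that most of the score variance comes from real differences between blocks, while values near 0 mean that repeated ratings of the same block disagree about as much as ratings of different blocks.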

Original language: English (US)
Article number: 1024
Journal: BMC Public Health
Volume: 13
Issue number: 1
DOI: https://doi.org/10.1186/1471-2458-13-1024
State: Published - 2013

Fingerprint

African Americans
Health
Interviews
Epidemiologic Effect Modifiers
Psychometrics
Reproducibility of Results
Statistical Factor Analysis
Linear Models
Observation
Lung
Research

ASJC Scopus subject areas

  • Public Health, Environmental and Occupational Health

Cite this

Andresen, E., Malmstrom, T. K., Schootman, M., Wolinsky, F. D., Miller, J. P., & Miller, D. K. (2013). Observer ratings of neighborhoods: Comparison of two methods. BMC Public Health, 13(1), Article 1024. https://doi.org/10.1186/1471-2458-13-1024

@article{15059c80a6384283a700b6a26f381628,
title = "Observer ratings of neighborhoods: Comparison of two methods",
author = "Elena Andresen and Malmstrom, {Theodore K.} and Mario Schootman and Wolinsky, {Fredric D.} and Miller, {J. Philip} and Miller, {Douglas K.}",
year = "2013",
doi = "10.1186/1471-2458-13-1024",
language = "English (US)",
volume = "13",
journal = "BMC Public Health",
issn = "1471-2458",
publisher = "BioMed Central",
number = "1",

}

TY - JOUR

T1 - Observer ratings of neighborhoods

T2 - Comparison of two methods

AU - Andresen, Elena

AU - Malmstrom, Theodore K.

AU - Schootman, Mario

AU - Wolinsky, Fredric D.

AU - Miller, J. Philip

AU - Miller, Douglas K.

PY - 2013

Y1 - 2013

UR - http://www.scopus.com/inward/record.url?scp=84886379079&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84886379079&partnerID=8YFLogxK

U2 - 10.1186/1471-2458-13-1024

DO - 10.1186/1471-2458-13-1024

M3 - Article

C2 - 24168373

AN - SCOPUS:84886379079

VL - 13

JO - BMC Public Health

JF - BMC Public Health

SN - 1471-2458

IS - 1

M1 - 1024

ER -