An application of data mining to identify potential risk factors for anophthalmia and microphthalmia

National Birth Defects Prevention Study

Research output: Contribution to journal › Article

Abstract

Background: We examined a large number of variables to generate new hypotheses regarding a wider range of risk factors for anophthalmia/microphthalmia using data mining. Methods: Data were from the National Birth Defects Prevention Study, a multicentre, case-control study from 10 centres in the United States. There were 134 “isolated” and 87 “nonisolated” (with other major birth defects) cases of anophthalmia/microphthalmia and 11 052 nonmalformed controls, with delivery dates from October 1997 to December 2011. Using random forest, a data mining procedure, we compared the two case types with controls for 201 variables. Variables that random forest ranked as important were included in a multivariable logistic regression model to estimate odds ratios and 95% confidence intervals. Results: Predictors for isolated cases included paternal race/ethnicity, maternal intake of certain nutrients and foods, and childhood health problems in relatives. Using regression, inverse associations were observed with greater maternal education and with increasing intake of folate and potatoes. Odds were slightly higher with greater paternal education, with increased intake of carbohydrates and beans, and if relatives had a childhood health problem. For nonisolated cases, predictors included paternal race/ethnicity, maternal intake of certain nutrients, and smoking in the home the month before conception. Odds were higher for Hispanic fathers and for smoking in the home and NSAID use in the month before conception. Conclusions: Results appear to support previously hypothesised risk factors (socio-economic status, NSAID use, and inadequate folate intake) and potentially provide new areas to explore, such as passive smoking pre-pregnancy and paternal education and ethnicity, for further understanding of anophthalmia/microphthalmia.
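The odds ratios and 95% confidence intervals mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' analysis: the study fitted a multivariable logistic regression, whereas the sketch below computes a simpler unadjusted odds ratio with a Wald interval from a single 2x2 table. The function name `odds_ratio_wald_ci` is hypothetical, and the exposure counts are invented for illustration; only the group totals (134 isolated cases, 11 052 controls) come from the abstract.

```python
import math


def odds_ratio_wald_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a 2x2 table.

    Uses the Wald interval on the log-odds scale; z=1.96 gives an
    approximate 95% confidence interval.
    """
    cells = (exposed_cases, unexposed_cases,
             exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cell counts must be positive")

    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)
    # Standard error of log(OR) is the square root of the summed reciprocals.
    se_log_or = math.sqrt(sum(1.0 / count for count in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical exposure split (NOT study data): 40 of 134 cases exposed,
# 2000 of 11 052 controls exposed.
or_, lo, hi = odds_ratio_wald_ci(40, 94, 2000, 9052)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

An interval that excludes 1 would suggest an association in such a table, although an unadjusted estimate like this one does not account for the other variables that a multivariable model includes.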

Original language: English (US)
Journal: Paediatric and Perinatal Epidemiology
DOI: 10.1111/ppe.12509
State: Accepted/In press - Jan 1 2018

Fingerprint

Anophthalmos
Microphthalmos
Data Mining
Mothers
Non-Steroidal Anti-Inflammatory Agents
Education
Folic Acid
Food
Logistic Models
Smoking
Tobacco Smoke Pollution
Health
Solanum tuberosum
Hispanic Americans
Fathers
Multicenter Studies
Case-Control Studies
Odds Ratio
Economics
Carbohydrates

Keywords

  • anophthalmia
  • birth defects
  • data mining
  • microphthalmia
  • random forest

ASJC Scopus subject areas

  • Epidemiology
  • Pediatrics, Perinatology, and Child Health

Cite this

An application of data mining to identify potential risk factors for anophthalmia and microphthalmia. / National Birth Defects Prevention Study.

In: Paediatric and Perinatal Epidemiology, 01.01.2018.

@article{87b297f9a8524129911ae12b53917651,
title = "An application of data mining to identify potential risk factors for anophthalmia and microphthalmia",
abstract = "Background: We examined a large number of variables to generate new hypotheses regarding a wider range of risk factors for anophthalmia/microphthalmia using data mining. Methods: Data were from the National Birth Defects Prevention Study, a multicentre, case-control study from 10 centres in the United States. There were 134 cases of “isolated” and 87 “nonisolated” (with other major birth defects) of anophthalmia/microphthalmia and 11 052 nonmalformed controls with delivery dates October 1997-December 2011. Using random forest, a data mining procedure, we compared the two case types with controls for 201 variables. Variables considered important ranked by random forest were included in a multivariable logistic regression model to estimate odds ratios and 95{\%} confidence intervals. Results: Predictors for isolated cases included paternal race/ethnicity, maternal intake of certain nutrients and foods, and childhood health problems in relatives. Using regression, inverse associations were observed with greater maternal education and with increasing intake of folate and potatoes. Odds were slightly higher with greater paternal education, for increased intake of carbohydrates and beans, and if relatives had a childhood health problem. For nonisolated cases, predictors included paternal race/ethnicity, maternal intake of certain nutrients, and smoking in the home the month before conception. Odds were higher for Hispanic fathers and smoking in the home and NSAID use the month before conception. Conclusions: Results appear to support previously hypothesised risk factors, socio-economic status, NSAID use, and inadequate folate intake, and potentially provide new areas such as passive smoking pre-pregnancy, and paternal education and ethnicity, to explore for further understanding of anophthalmia/microphthalmia.",
keywords = "anophthalmia, birth defects, data mining, microphthalmia, random forest",
author = "{National Birth Defects Prevention Study} and Weber, {Kari A.} and Wei Yang and Carmichael, {Suzan L.} and Lupo, {Philip J.} and Stephanie Dukhovny and Yazdy, {Mahsa M.} and Lin, {Angela E.} and {Van Bennekom}, {Carla M.} and Mitchell, {Allen A.} and Shaw, {Gary M.}",
year = "2018",
month = "1",
day = "1",
doi = "10.1111/ppe.12509",
language = "English (US)",
journal = "Paediatric and Perinatal Epidemiology",
issn = "0269-5022",
publisher = "Wiley-Blackwell",

}

TY - JOUR

T1 - An application of data mining to identify potential risk factors for anophthalmia and microphthalmia

AU - National Birth Defects Prevention Study

AU - Weber, Kari A.

AU - Yang, Wei

AU - Carmichael, Suzan L.

AU - Lupo, Philip J.

AU - Dukhovny, Stephanie

AU - Yazdy, Mahsa M.

AU - Lin, Angela E.

AU - Van Bennekom, Carla M.

AU - Mitchell, Allen A.

AU - Shaw, Gary M.

PY - 2018/1/1

Y1 - 2018/1/1

AB - Background: We examined a large number of variables to generate new hypotheses regarding a wider range of risk factors for anophthalmia/microphthalmia using data mining. Methods: Data were from the National Birth Defects Prevention Study, a multicentre, case-control study from 10 centres in the United States. There were 134 cases of “isolated” and 87 “nonisolated” (with other major birth defects) of anophthalmia/microphthalmia and 11 052 nonmalformed controls with delivery dates October 1997-December 2011. Using random forest, a data mining procedure, we compared the two case types with controls for 201 variables. Variables considered important ranked by random forest were included in a multivariable logistic regression model to estimate odds ratios and 95% confidence intervals. Results: Predictors for isolated cases included paternal race/ethnicity, maternal intake of certain nutrients and foods, and childhood health problems in relatives. Using regression, inverse associations were observed with greater maternal education and with increasing intake of folate and potatoes. Odds were slightly higher with greater paternal education, for increased intake of carbohydrates and beans, and if relatives had a childhood health problem. For nonisolated cases, predictors included paternal race/ethnicity, maternal intake of certain nutrients, and smoking in the home the month before conception. Odds were higher for Hispanic fathers and smoking in the home and NSAID use the month before conception. Conclusions: Results appear to support previously hypothesised risk factors, socio-economic status, NSAID use, and inadequate folate intake, and potentially provide new areas such as passive smoking pre-pregnancy, and paternal education and ethnicity, to explore for further understanding of anophthalmia/microphthalmia.

KW - anophthalmia

KW - birth defects

KW - data mining

KW - microphthalmia

KW - random forest

UR - http://www.scopus.com/inward/record.url?scp=85054605862&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85054605862&partnerID=8YFLogxK

U2 - 10.1111/ppe.12509

DO - 10.1111/ppe.12509

M3 - Article

JO - Paediatric and Perinatal Epidemiology

JF - Paediatric and Perinatal Epidemiology

SN - 0269-5022

ER -