Improving Data-Driven Methods to Identify and Categorize Transgender Individuals by Gender in Insurance Claims Data

Jaclyn M.W. Hughto; Landon Hughes; Kim Yee; Jae Downing; Jacqueline Ellison; Ash Alpert; Guneet Jasuja; Theresa I. Shireman

doi:10.1089/lgbt.2021.0433

School of Public Health

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Purpose: Prior algorithms enabled the identification and gender categorization of transgender people in insurance claims databases in which sex and gender are not simultaneously captured. However, these methods have been unable to categorize the gender of a large proportion of their samples. We improve upon these methods to identify the gender of a larger proportion of transgender people in insurance claims data.

Methods: Using 2001-2019 Optum's Clinformatics® Data Mart insurance claims data, we adapted prior algorithms by combining diagnosis, procedure, and pharmacy claims to (1) identify a transgender sample; and (2) stratify the sample by gender category (trans feminine and nonbinary [TFN], trans masculine and nonbinary [TMN], unclassified). We used logistic regression to estimate the burden of 13 chronic health conditions, controlling for gender category, age, race/ethnicity, enrollment length, and census region.

Results: We identified 38,598 unique transgender people, comprising 50% (n = 19,252) TMN, 26% (n = 10,040) TFN, and 24% (n = 9306) unclassified individuals. In adjusted models, relative to TMN people, TFN people had significantly higher odds of most chronic health conditions, including HIV, atherosclerotic cardiovascular disorder, myocardial infarction, alcohol use disorder, and drug use disorder. Notably, TMN individuals had significantly higher odds of post-traumatic stress disorder and depression than TFN individuals.

Conclusion: By combining complex administrative claims-based algorithms, we identified the largest U.S.-based sample of transgender individuals and inferred the gender of >75% of the sample. Adjusted models extend prior research documenting key health disparities by gender category. These methods may enable researchers to explore rare and sex-specific conditions in hard-to-reach transgender populations.
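The percentages in the Results can be checked against the reported counts. The short Python sketch below is only an arithmetic check, not part of the authors' method. It uses the three group counts and the total given in the abstract; the variable names are illustrative.

```python
# Counts as reported in the abstract's Results section.
counts = {"TMN": 19_252, "TFN": 10_040, "unclassified": 9_306}
reported_total = 38_598

# The three groups should add up to the reported sample size.
total = sum(counts.values())

# Share of each group, rounded to whole percentages as in the abstract.
shares = {group: round(100 * n / total) for group, n in counts.items()}

# Share of the sample with an inferred gender category (TMN or TFN).
classified_share = 100 * (counts["TMN"] + counts["TFN"]) / total

print(total == reported_total)
print(shares)
print(f"{classified_share:.1f}% with an inferred gender category")
```

The counts sum to the reported total, the rounded shares match the stated 50%, 26%, and 24%, and the classified share is consistent with the ">75%" figure in the Conclusion.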

Original language	English (US)
Pages (from-to)	254-263
Number of pages	10
Journal	LGBT Health
Volume	9
Issue number	4
DOIs	https://doi.org/10.1089/lgbt.2021.0433
State	Published - Jun 1 2022

Keywords

health comorbidities
insurance
methods
transgender

ASJC Scopus subject areas

Dermatology
Obstetrics and Gynecology
Public Health, Environmental and Occupational Health
Psychiatry and Mental health
Urology

Cite this

@article{3bd6b3a7911a4836862bd773f7227488,
  title = "Improving Data-Driven Methods to Identify and Categorize Transgender Individuals by Gender in Insurance Claims Data",
  abstract = "Purpose: Prior algorithms enabled the identification and gender categorization of transgender people in insurance claims databases in which sex and gender are not simultaneously captured. However, these methods have been unable to categorize the gender of a large proportion of their samples. We improve upon these methods to identify the gender of a larger proportion of transgender people in insurance claims data. Methods: Using 2001-2019 Optum's Clinformatics{\textregistered} Data Mart insurance claims data, we adapted prior algorithms by combining diagnosis, procedure, and pharmacy claims to (1) identify a transgender sample; and (2) stratify the sample by gender category (trans feminine and nonbinary [TFN], trans masculine and nonbinary [TMN], unclassified). We used logistic regression to estimate the burden of 13 chronic health conditions, controlling for gender category, age, race/ethnicity, enrollment length, and census region. Results: We identified 38,598 unique transgender people, comprising 50% (n = 19,252) TMN, 26% (n = 10,040) TFN, and 24% (n = 9306) unclassified individuals. In adjusted models, relative to TMN people, TFN people had significantly higher odds of most chronic health conditions, including HIV, atherosclerotic cardiovascular disorder, myocardial infarction, alcohol use disorder, and drug use disorder. Notably, TMN individuals had significantly higher odds of post-traumatic stress disorder and depression than TFN individuals. Conclusion: By combining complex administrative claims-based algorithms, we identified the largest U.S.-based sample of transgender individuals and inferred the gender of >75% of the sample. Adjusted models extend prior research documenting key health disparities by gender category. These methods may enable researchers to explore rare and sex-specific conditions in hard-to-reach transgender populations.",
  keywords = "health comorbidities, insurance, methods, transgender",
  author = "Hughto, {Jaclyn M.W.} and Landon Hughes and Kim Yee and Jae Downing and Jacqueline Ellison and Ash Alpert and Guneet Jasuja and Shireman, {Theresa I.}",
  year = "2022",
  month = jun,
  day = "1",
  doi = "10.1089/lgbt.2021.0433",
  language = "English (US)",
  volume = "9",
  pages = "254--263",
  journal = "LGBT Health",
  issn = "2325-8292",
  publisher = "Mary Ann Liebert Inc.",
  number = "4",
}

TY  - JOUR
T1  - Improving Data-Driven Methods to Identify and Categorize Transgender Individuals by Gender in Insurance Claims Data
AU  - Hughto, Jaclyn M.W.
AU  - Hughes, Landon
AU  - Yee, Kim
AU  - Downing, Jae
AU  - Ellison, Jacqueline
AU  - Alpert, Ash
AU  - Jasuja, Guneet
AU  - Shireman, Theresa I.
PY  - 2022/6/1
Y1  - 2022/6/1
AB  - Purpose: Prior algorithms enabled the identification and gender categorization of transgender people in insurance claims databases in which sex and gender are not simultaneously captured. However, these methods have been unable to categorize the gender of a large proportion of their samples. We improve upon these methods to identify the gender of a larger proportion of transgender people in insurance claims data. Methods: Using 2001-2019 Optum's Clinformatics® Data Mart insurance claims data, we adapted prior algorithms by combining diagnosis, procedure, and pharmacy claims to (1) identify a transgender sample; and (2) stratify the sample by gender category (trans feminine and nonbinary [TFN], trans masculine and nonbinary [TMN], unclassified). We used logistic regression to estimate the burden of 13 chronic health conditions, controlling for gender category, age, race/ethnicity, enrollment length, and census region. Results: We identified 38,598 unique transgender people, comprising 50% (n = 19,252) TMN, 26% (n = 10,040) TFN, and 24% (n = 9306) unclassified individuals. In adjusted models, relative to TMN people, TFN people had significantly higher odds of most chronic health conditions, including HIV, atherosclerotic cardiovascular disorder, myocardial infarction, alcohol use disorder, and drug use disorder. Notably, TMN individuals had significantly higher odds of post-traumatic stress disorder and depression than TFN individuals. Conclusion: By combining complex administrative claims-based algorithms, we identified the largest U.S.-based sample of transgender individuals and inferred the gender of >75% of the sample. Adjusted models extend prior research documenting key health disparities by gender category. These methods may enable researchers to explore rare and sex-specific conditions in hard-to-reach transgender populations.
KW  - health comorbidities
KW  - insurance
KW  - methods
KW  - transgender
UR  - http://www.scopus.com/inward/record.url?scp=85131226694&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85131226694&partnerID=8YFLogxK
U2  - 10.1089/lgbt.2021.0433
DO  - 10.1089/lgbt.2021.0433
M3  - Article
C2  - 35290746
AN  - SCOPUS:85131226694
SN  - 2325-8292
VL  - 9
SP  - 254
EP  - 263
JO  - LGBT Health
JF  - LGBT Health
IS  - 4
ER  - 
