TY - JOUR
T1 - Transcriptomic signatures across human tissues identify functional rare genetic variation
AU - The GTEx Consortium
AU - Aguet, François
AU - Barbeira, Alvaro N.
AU - Bonazzola, Rodrigo
AU - Brown, Andrew
AU - Castel, Stephane E.
AU - Jo, Brian
AU - Kasela, Silva
AU - Kim-Hellmuth, Sarah
AU - Liang, Yanyu
AU - Oliva, Meritxell
AU - Flynn, Elise D.
AU - Parsana, Princy
AU - Fresard, Laure
AU - Gamazon, Eric R.
AU - Hamel, Andrew R.
AU - He, Yuan
AU - Hormozdiari, Farhad
AU - Mohammadi, Pejman
AU - Muñoz-Aguirre, Manuel
AU - Park, Yo Son
AU - Saha, Ashis
AU - Segrè, Ayellet V.
AU - Strober, Benjamin J.
AU - Wen, Xiaoquan
AU - Wucher, Valentin
AU - Ardlie, Kristin G.
AU - Battle, Alexis
AU - Brown, Christopher D.
AU - Cox, Nancy
AU - Das, Sayantan
AU - Dermitzakis, Emmanouil T.
AU - Engelhardt, Barbara E.
AU - Garrido-Martín, Diego
AU - Gay, Nicole R.
AU - Getz, Gad A.
AU - Guigó, Roderic
AU - Handsaker, Robert E.
AU - Hoffman, Paul J.
AU - Im, Hae Kyung
AU - Kashin, Seva
AU - Kwong, Alan
AU - Lappalainen, Tuuli
AU - Li, Xiao
AU - MacArthur, Daniel G.
AU - Montgomery, Stephen B.
AU - Rouhana, John M.
AU - Stephens, Matthew
AU - Stranger, Barbara E.
AU - Todres, Ellen
AU - Conrad, Donald F.
N1 - Funding Information:
This work was supported by the Common Fund of the Office of the Director, U.S. National Institutes of Health, and by NCI, NHGRI, NHLBI, NIDA, NIMH, NIA, NIAID, and NINDS through NIH contracts HHSN261200800001E (Leidos Prime contract with NCI: A.M.S., D.E.T., N.V.R., J.A.M., L.S., M.E.B., L.Q., T.K., D.B., K.R., and A.U.), 10XS170 (NDRI: W.F.L., J.A.T., G.K., A.M., S.S., R.H., G.Wa., M.J., M.Wa., L.E.B., C.J., J.W., B.R., M.Hu., K.M., L.A.S., H.M.G., M.Mo., and L.K.B.), 10XS171 (Roswell Park Cancer Institute: B.A.F., M.T.M., E.K., B.M.G., K.D.R., and J.B.), 10X172 (Science Care Inc.), 12ST1039 (IDOX), 10ST1035 (Van Andel Institute: S.D.J., D.C.R., and D.R.V.), HHSN268201000029C (Broad Institute: F.A., G.G., K.G.A., A.V.S., X.Li., E.T., S.G., A.G., S.A., K.H.H., D.T.N., K.H., S.R.M., and J.L.N.), 5U41HG009494 (F.A., G.G., and K.G.A.), and through NIH grants R01 DA006227-17 (University of Miami Brain Bank: D.C.M. and D.A.D.), Supplement to University of Miami grant DA006227 (D.C.M. and D.A.D.), R01 MH090941 (University of Geneva), R01 MH090951 and R01 MH090937 (University of Chicago), R01 MH090936 (University of North Carolina–Chapel Hill), R01MH101814 (M.M.-A., V.W., S.B.M., R.G., E.T.D., D.G.-M., and A.V.), U01HG007593 (S.B.M.), R01MH101822 (C.D.B.), U01HG007598 (M.O. and B.E.S.), U01MH104393 (A.P.F.), extension H002371 to 5U41HG002371 (W.J.K.), as well as other funding sources: R01MH106842 (T.L., P.M., E.F., and P.J.H.), R01HL142028 (T.L., Si.Ka., and P.J.H.), R01GM122924 (T.L. and S.E.C.), R01MH107666 (H.K.I.), P30DK020595 (H.K.I.), UM1HG008901 (T.L.), R01GM124486 (T.L.), R01HG010067 (Y.Pa.), R01HG002585 (G.Wa. and M.St.), Gordon and Betty Moore Foundation GBMF 4559 (G.Wa. and M.St.), 1K99HG009916-01 (S.E.C.), R01HG006855 (Se.Ka. and R.E.H.), BIO2015-70777-P, Ministerio de Economia y Competitividad and FEDER funds (M.M.-A., V.W., R.G., and D.G.-M.), la Caixa Foundation ID 100010434 under agreement LCF/BQ/SO15/52260001 (D.G.-M.), NIH CTSA grant UL1TR002550-01 (P.M.), Marie-Skłodowska Curie fellowship H2020 Grant 706636 (S.K.-H.), R35HG010718 (E.R.G.), FPU15/03635, Ministerio de Educación, Cultura y Deporte (M.M.-A.), R01MH109905, 1R01HG010480 (A.Ba.), Searle Scholar Program (A.Ba.), R01HG008150 (S.B.M.), 5T32HG000044-22, NHGRI Institutional Training Grant in Genome Science (N.R.G.), EU IMI program (UE7-DIRECT-115317-1) (E.T.D. and A.V.), FNS funded project RNA1 (31003A_149984) (E.T.D. and A.V.), DK110919 (F.H.), F32HG009987 (F.H.), Massachusetts Lions Eye Research Fund Grant (A.R.H.), Wellcome grant WT108749/Z/15/Z (P.F.), and European Molecular Biology Laboratory (P.F. and D.Z.).
Publisher Copyright:
© 2020 American Association for the Advancement of Science. All rights reserved.
PY - 2020/9
Y1 - 2020/9
N2 - INTRODUCTION: The human genome contains tens of thousands of rare (minor allele frequency <1%) variants, some of which contribute to disease risk. Using 838 samples with whole-genome and multitissue transcriptome sequencing data in the Genotype-Tissue Expression (GTEx) project version 8, we assessed how rare genetic variants contribute to extreme patterns in gene expression (eOutliers), allelic expression (aseOutliers), and alternative splicing (sOutliers). We integrated these three signals across 49 tissues with genomic annotations to prioritize high-impact rare variants (RVs) that associate with human traits. RATIONALE: Outlier gene expression aids in identifying functional RVs. Transcriptome sequencing provides diverse measurements beyond gene expression, including allele-specific expression and alternative splicing, which can provide additional insight into RV functional effects. RESULTS: After identifying multitissue eOutliers, aseOutliers, and sOutliers, we found that outlier individuals of each type were significantly more likely to carry an RV near the corresponding gene. Among eOutliers, we observed strong enrichment of rare structural variants. sOutliers were particularly enriched for RVs that disrupted or created a splicing consensus sequence. aseOutliers provided the strongest enrichment signal when evaluated from just a single tissue. We developed Watershed, a probabilistic model for personal genome interpretation that improves over standard genomic annotation–based methods for scoring RVs by integrating these three transcriptomic signals from the same individual, and that replicates in an independent cohort. To assess whether outlier RVs identified in GTEx associate with traits, we evaluated these variants for association with diverse traits in the UK Biobank, the Million Veterans Program, and the Jackson Heart Study. We found that transcriptome-assisted prioritization identified RVs with larger trait effect sizes and was a better predictor of effect size than genomic annotation alone. CONCLUSION: With >800 genomes matched with transcriptomes across 49 tissues, we were able to study RVs that underlie extreme changes in the transcriptome. To capture the diversity of these extreme changes, we developed and integrated approaches to identify expression, allele-specific expression, and alternative splicing outliers, and characterized the RV landscape underlying each outlier signal. We demonstrate that personal genome interpretation and RV discovery are enhanced by using these signals. This approach provides a new means to integrate a richer set of functional RVs into models of genetic burden, improve disease gene identification, and enable the delivery of precision genomics.
UR - http://www.scopus.com/inward/record.url?scp=85096430793&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85096430793&partnerID=8YFLogxK
U2 - 10.1126/SCIENCE.AAZ5900
DO - 10.1126/SCIENCE.AAZ5900
M3 - Article
C2 - 32913073
AN - SCOPUS:85090818075
VL - 369
JO - Science
JF - Science
SN - 0036-8075
IS - 6509
M1 - 1334
ER -