TY - JOUR
T1 - Capturing cell type-specific chromatin compartment patterns by applying topic modeling to single-cell Hi-C data
AU - Kim, Hyeon Jin
AU - Yardımcı, Galip Gürkan
AU - Bonora, Giancarlo
AU - Ramani, Vijay
AU - Liu, Jie
AU - Qiu, Ruolan
AU - Lee, Choli
AU - Hesson, Jennifer
AU - Ware, Carol B.
AU - Shendure, Jay
AU - Duan, Zhijun
AU - Noble, William Stafford
N1 - Funding Information:
This work was supported by National Institutes of Health award U54 DK107979. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Publisher Copyright:
© 2020 Kim et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2020/9
Y1 - 2020/9
N2 - Single-cell Hi-C (scHi-C) interrogates genome-wide chromatin interaction in individual cells, allowing us to gain insights into 3D genome organization. However, the extremely sparse nature of scHi-C data poses a significant barrier to analysis, limiting our ability to tease out hidden biological information. In this work, we approach this problem by applying topic modeling to scHi-C data. Topic modeling is well-suited for discovering latent topics in a collection of discrete data. For our analysis, we generate nine different single-cell combinatorial indexed Hi-C (sci-Hi-C) libraries from five human cell lines (GM12878, H1Esc, HFF, IMR90, and HAP1), comprising over 19,000 cells. We demonstrate that topic modeling successfully captures cell type differences from sci-Hi-C data in the form of “chromatin topics.” We further show enrichment of particular compartment structures associated with locus pairs in these topics.
AB - Single-cell Hi-C (scHi-C) interrogates genome-wide chromatin interaction in individual cells, allowing us to gain insights into 3D genome organization. However, the extremely sparse nature of scHi-C data poses a significant barrier to analysis, limiting our ability to tease out hidden biological information. In this work, we approach this problem by applying topic modeling to scHi-C data. Topic modeling is well-suited for discovering latent topics in a collection of discrete data. For our analysis, we generate nine different single-cell combinatorial indexed Hi-C (sci-Hi-C) libraries from five human cell lines (GM12878, H1Esc, HFF, IMR90, and HAP1), comprising over 19,000 cells. We demonstrate that topic modeling successfully captures cell type differences from sci-Hi-C data in the form of “chromatin topics.” We further show enrichment of particular compartment structures associated with locus pairs in these topics.
UR - http://www.scopus.com/inward/record.url?scp=85092232232&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85092232232&partnerID=8YFLogxK
U2 - 10.1371/journal.pcbi.1008173
DO - 10.1371/journal.pcbi.1008173
M3 - Article
C2 - 32946435
AN - SCOPUS:85092232232
SN - 1553-734X
VL - 16
JO - PLoS Computational Biology
JF - PLoS Computational Biology
IS - 9
M1 - 1008173
ER -