A Monte Carlo evaluation of weighted community detection algorithms

Kathleen M. Gates, Teague Henry, Doug Steinley, Damien Fair

Research output: Contribution to journal › Article

12 Citations (Scopus)

Abstract

The past decade has been marked with a proliferation of community detection algorithms that aim to organize nodes (e.g., individuals, brain regions, variables) into modular structures that indicate subgroups, clusters, or communities. Motivated by the emergence of big data across many fields of inquiry, these methodological developments have primarily focused on the detection of communities of nodes from matrices that are very large. However, it remains unknown if the algorithms can reliably detect communities in smaller graph sizes (i.e., 1000 nodes and fewer) which are commonly used in brain research. More importantly, these algorithms have predominantly been tested only on binary or sparse count matrices and it remains unclear the degree to which the algorithms can recover community structure for different types of matrices, such as the often used cross-correlation matrices representing functional connectivity across predefined brain regions. Of the publicly available approaches for weighted graphs that can detect communities in graph sizes of at least 1000, prior research has demonstrated that Newman’s spectral approach (i.e., Leading Eigenvalue), Walktrap, Fast Modularity, the Louvain method (i.e., multilevel community method), Label Propagation, and Infomap all recover communities exceptionally well in certain circumstances. The purpose of the present Monte Carlo simulation study is to test these methods across a large number of conditions, including varied graph sizes and types of matrix (sparse count, correlation, and reflected Euclidean distance), to identify which algorithm is optimal for specific types of data matrices. The results indicate that when the data are in the form of sparse count networks (such as those seen in diffusion tensor imaging), Label Propagation and Walktrap surfaced as the most reliable methods for community detection. For dense, weighted networks such as correlation matrices capturing functional connectivity, Walktrap consistently outperformed the other approaches for recovering communities.

Original language: English (US)
Article number: 45
Journal: Frontiers in Neuroinformatics
Volume: 10
Issue number: NOV
DOI: 10.3389/fninf.2016.00045
State: Published - Nov 10 2016

Fingerprint

Brain
Labels
Diffusion Tensor Imaging
Research
Big data
Monte Carlo simulation

Keywords

  • Community detection
  • Functional connectivity
  • Functional MRI
  • Modules
  • Monte Carlo simulation

ASJC Scopus subject areas

  • Neuroscience (miscellaneous)
  • Biomedical Engineering
  • Computer Science Applications

Cite this

A Monte Carlo evaluation of weighted community detection algorithms. / Gates, Kathleen M.; Henry, Teague; Steinley, Doug; Fair, Damien.

In: Frontiers in Neuroinformatics, Vol. 10, No. NOV, 45, 10.11.2016.

Gates, Kathleen M. ; Henry, Teague ; Steinley, Doug ; Fair, Damien. / A Monte Carlo evaluation of weighted community detection algorithms. In: Frontiers in Neuroinformatics. 2016 ; Vol. 10, No. NOV.
@article{01148640e16c4b39821a016267b10b4b,
title = "A Monte Carlo evaluation of weighted community detection algorithms",
abstract = "The past decade has been marked with a proliferation of community detection algorithms that aim to organize nodes (e.g., individuals, brain regions, variables) into modular structures that indicate subgroups, clusters, or communities. Motivated by the emergence of big data across many fields of inquiry, these methodological developments have primarily focused on the detection of communities of nodes from matrices that are very large. However, it remains unknown if the algorithms can reliably detect communities in smaller graph sizes (i.e., 1000 nodes and fewer) which are commonly used in brain research. More importantly, these algorithms have predominantly been tested only on binary or sparse count matrices and it remains unclear the degree to which the algorithms can recover community structure for different types of matrices, such as the often used cross-correlation matrices representing functional connectivity across predefined brain regions. Of the publicly available approaches for weighted graphs that can detect communities in graph sizes of at least 1000, prior research has demonstrated that Newman’s spectral approach (i.e., Leading Eigenvalue), Walktrap, Fast Modularity, the Louvain method (i.e., multilevel community method), Label Propagation, and Infomap all recover communities exceptionally well in certain circumstances. The purpose of the present Monte Carlo simulation study is to test these methods across a large number of conditions, including varied graph sizes and types of matrix (sparse count, correlation, and reflected Euclidean distance), to identify which algorithm is optimal for specific types of data matrices. The results indicate that when the data are in the form of sparse count networks (such as those seen in diffusion tensor imaging), Label Propagation and Walktrap surfaced as the most reliable methods for community detection. 
For dense, weighted networks such as correlation matrices capturing functional connectivity, Walktrap consistently outperformed the other approaches for recovering communities.",
keywords = "Community detection, Functional connectivity, Functional MRI, Modules, Monte Carlo simulation",
author = "Gates, Kathleen M. and Henry, Teague and Steinley, Doug and Fair, Damien",
year = "2016",
month = "11",
day = "10",
doi = "10.3389/fninf.2016.00045",
language = "English (US)",
volume = "10",
journal = "Frontiers in Neuroinformatics",
issn = "1662-5196",
publisher = "Frontiers Research Foundation",
number = "NOV",

}

TY - JOUR

T1 - A Monte Carlo evaluation of weighted community detection algorithms

AU - Gates, Kathleen M.

AU - Henry, Teague

AU - Steinley, Doug

AU - Fair, Damien

PY - 2016/11/10

Y1 - 2016/11/10

N2 - The past decade has been marked with a proliferation of community detection algorithms that aim to organize nodes (e.g., individuals, brain regions, variables) into modular structures that indicate subgroups, clusters, or communities. Motivated by the emergence of big data across many fields of inquiry, these methodological developments have primarily focused on the detection of communities of nodes from matrices that are very large. However, it remains unknown if the algorithms can reliably detect communities in smaller graph sizes (i.e., 1000 nodes and fewer) which are commonly used in brain research. More importantly, these algorithms have predominantly been tested only on binary or sparse count matrices and it remains unclear the degree to which the algorithms can recover community structure for different types of matrices, such as the often used cross-correlation matrices representing functional connectivity across predefined brain regions. Of the publicly available approaches for weighted graphs that can detect communities in graph sizes of at least 1000, prior research has demonstrated that Newman’s spectral approach (i.e., Leading Eigenvalue), Walktrap, Fast Modularity, the Louvain method (i.e., multilevel community method), Label Propagation, and Infomap all recover communities exceptionally well in certain circumstances. The purpose of the present Monte Carlo simulation study is to test these methods across a large number of conditions, including varied graph sizes and types of matrix (sparse count, correlation, and reflected Euclidean distance), to identify which algorithm is optimal for specific types of data matrices. The results indicate that when the data are in the form of sparse count networks (such as those seen in diffusion tensor imaging), Label Propagation and Walktrap surfaced as the most reliable methods for community detection. 
For dense, weighted networks such as correlation matrices capturing functional connectivity, Walktrap consistently outperformed the other approaches for recovering communities.

KW - Community detection

KW - Functional connectivity

KW - Functional MRI

KW - Modules

KW - Monte Carlo simulation

UR - http://www.scopus.com/inward/record.url?scp=84995685606&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84995685606&partnerID=8YFLogxK

U2 - 10.3389/fninf.2016.00045

DO - 10.3389/fninf.2016.00045

M3 - Article

AN - SCOPUS:84995685606

VL - 10

JO - Frontiers in Neuroinformatics

JF - Frontiers in Neuroinformatics

SN - 1662-5196

IS - NOV

M1 - 45

ER -