Optimized preprocessing of ultra-performance liquid chromatography/mass spectrometry urinary metabolic profiles for improved information recovery

Kirill A. Veselkov, Lisa K. Vingara, Perrine Masson, Steven L. Robinette, Elizabeth Want, Jia V. Li, Richard H. Barton, Claire Boursier-Neyret, Bernard Walther, Timothy M. Ebbels, István Pelczer, Elaine Holmes, John C. Lindon, Jeremy K. Nicholson

Research output: Contribution to journal › Article › peer-review

211 Scopus citations


Ultra-performance liquid chromatography coupled to mass spectrometry (UPLC/MS) has been used increasingly for measuring changes of low molecular weight metabolites in biofluids/tissues in response to biological challenges such as drug toxicity and disease processes. Typically, samples show high variability in concentration, and the derived metabolic profiles have a heteroscedastic noise structure characterized by increasing variance as a function of increased signal intensity. These sources of experimental and instrumental noise substantially complicate information recovery when statistical tools are used. We apply and compare several preprocessing procedures and introduce a statistical error model to account for these bioanalytical complexities. In particular, total intensity, median fold change, locally weighted scatter plot smoothing, and quantile normalizations were compared for their ability to reduce extraneous variance induced by sample dilution. We demonstrate that the UPLC/MS peak intensities of urine samples should respond linearly to variable sample dilution across the intensity range. While all four studied normalization methods performed reasonably well in reducing dilution-induced variation of urine samples in the absence of biological variation, the median fold change normalization is least compromised by the biologically relevant changes in mixture components and is thus preferable. Additionally, the application of a subsequent log-based transformation was successful in stabilizing the variance with respect to peak intensity, confirming the predominant influence of multiplicative noise in peak intensities from UPLC/MS-derived metabolic profile data sets. We demonstrate that variance-stabilizing transformation and normalization are critical preprocessing steps that can greatly benefit metabolic information recovery from such data sets when widely applied chemometric methods are used.
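The two preprocessing steps the abstract singles out, median fold change normalization followed by a log-based transformation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of the per-feature median profile as the reference, the handling of zero intensities, and the `offset` added before taking the logarithm are all assumptions made for illustration.

```python
import numpy as np

def median_fold_change_normalize(X, reference=None):
    """Scale each sample (row) by the median of its fold changes
    relative to a reference profile.

    X is a samples-by-features array of non-negative peak intensities.
    The default reference (per-feature median across samples) is an
    assumption of this sketch, not something stated in the abstract.
    """
    X = np.asarray(X, dtype=float)
    if reference is None:
        reference = np.median(X, axis=0)
    reference = np.asarray(reference, dtype=float)

    # Use only features with a positive reference intensity.
    valid = reference > 0
    ratios = X[:, valid] / reference[valid]

    # Treat zero intensities as missing so they do not drag the median down.
    ratios = np.where(ratios > 0, ratios, np.nan)

    # One dilution factor per sample: the median fold change.
    factors = np.nanmedian(ratios, axis=1)
    return X / factors[:, None], factors

def log_transform(X, offset=1.0):
    """Log transformation to stabilize variance under multiplicative noise.

    The offset (assumed here) keeps zero intensities finite.
    """
    return np.log(np.asarray(X, dtype=float) + offset)

if __name__ == "__main__":
    # Synthetic example: one base profile measured at three dilutions.
    base = np.array([100.0, 200.0, 400.0, 800.0, 50.0])
    dilutions = np.array([1.0, 0.5, 2.0])
    X = np.outer(dilutions, base)

    # Simulate a biological change in a single feature of one sample.
    X[1, 3] *= 10.0

    normalized, factors = median_fold_change_normalize(X)
    print("estimated dilution factors:", factors)
    print("log-transformed profiles:\n", log_transform(normalized))
```

Because the scaling factor is a median over features, a large change in a single metabolite leaves the estimated dilution factor essentially unchanged, which is consistent with the abstract's observation that this normalization is least compromised by biologically relevant changes. A total intensity factor computed on the same data would be inflated by the altered peak.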

Original language: English (US)
Pages (from-to): 5864-5872
Number of pages: 9
Journal: Analytical Chemistry
Issue number: 15
State: Published - Aug 1 2011

ASJC Scopus subject areas

  • Analytical Chemistry


