Elsevier

NeuroImage

Volume 161, 1 November 2017, Pages 149-170

Harmonization of multi-site diffusion tensor imaging data

https://doi.org/10.1016/j.neuroimage.2017.08.047

Highlights

  • Significant site and scanner effects exist in DTI scalar maps.

  • Several multi-site harmonization methods are proposed.

  • ComBat performs the best at removing site effects in FA and MD.

  • Voxels associated with age in FA and MD are more replicable after ComBat.

  • ComBat is generalizable to other imaging modalities.

Abstract

Diffusion tensor imaging (DTI) is a well-established magnetic resonance imaging (MRI) technique used for studying microstructural changes in the white matter. As with many other imaging modalities, DTI images suffer from technical between-scanner variation that hinders comparisons of images across imaging sites, scanners and over time. Using fractional anisotropy (FA) and mean diffusivity (MD) maps of 205 healthy participants acquired on two different scanners, we show that the DTI measurements are highly site-specific, highlighting the need to correct for site effects before performing downstream statistical analyses. We first show evidence that combining DTI data from multiple sites, without harmonization, may be counter-productive and negatively impact inference. Then, we propose and compare several harmonization approaches for DTI data, and show that ComBat, a popular batch-effect correction tool used in genomics, performs best at modeling and removing the unwanted inter-site variability in FA and MD maps. Using age as a biological phenotype of interest, we show that ComBat both preserves biological variability and removes the unwanted variation introduced by site. Finally, we assess the different harmonization methods in the presence of different levels of confounding between site and age, and test their robustness in studies with small sample sizes.

Introduction

Diffusion tensor imaging (DTI) is a well-established magnetic resonance imaging (MRI) technique for studying the white matter (WM) organization and tissue characteristics of the brain. DTI has been used extensively to study both brain development and pathology; see Alexander et al. (2007) for a review of DTI and several of its applications. In studies assessing white matter tissue characteristics, two commonly reported scalar maps are the mean diffusivity (MD), which assesses the degree to which water diffuses at each location, and fractional anisotropy (FA), which measures the coherence of this diffusion in one particular direction. Together, MD and FA provide a complementary description of white matter microstructure.
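To make the two scalar maps concrete, the sketch below computes MD and FA for a single voxel from the three eigenvalues of the diffusion tensor, using the standard textbook definitions. The function name `md_fa` and the example eigenvalues are illustrative assumptions and are not taken from the paper.

```python
import math

def md_fa(l1, l2, l3):
    """Return (MD, FA) for one voxel from the three tensor eigenvalues.

    Hypothetical helper for illustration; eigenvalues share one unit
    (for example 1e-3 mm^2/s).
    """
    # MD is the average of the eigenvalues: how much water diffuses overall.
    md = (l1 + l2 + l3) / 3.0
    # FA compares the spread of the eigenvalues around MD with their magnitude.
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    fa = math.sqrt(1.5 * num / den) if den > 0 else 0.0
    return md, fa

# Isotropic diffusion: equal eigenvalues, so there is no preferred direction.
print(md_fa(1.0, 1.0, 1.0))
# Directional diffusion: one dominant eigenvalue gives a higher FA.
print(md_fa(1.7, 0.2, 0.2))
```

FA is bounded between 0 (isotropic diffusion) and 1 (diffusion along a single axis), while MD keeps the units of the eigenvalues, which is one reason the two maps can respond differently to scanner differences.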

With the increasing number of publicly available neuroimaging databases, a crucial goal is to combine large-scale imaging studies to increase the power of statistical analyses that test common biological hypotheses. For instance, for life-span studies, combining data across sites and age ranges is essential for obtaining the necessary number of participants of each age. The success of combining multi-site imaging data depends critically on the comparability of the images across sites. As with other imaging modalities, DTI images are subject to technical variability across scans, including heterogeneity in the imaging protocol, variations in the scanning parameters and differences in the scanner manufacturers (Zhu et al., 2009, Zhu et al., 2011). Among others, the reliability of FA and MD maps has been shown to be affected by angular and spatial resolution (Zhan et al., 2010, Alexander et al., 2001, Kim et al., 2006), the number of diffusion weighting directions (Giannelli et al., 2009), the number of gradient sampling orientations (Jones, 2004), the number of b-values (Correia et al., 2009), and the b-values themselves.

In the design of multi-site studies, defining a standardized DTI protocol is a first step towards reducing inter-scanner variability. However, even in the presence of a standardized protocol, differences between scanner manufacturers, field strength and other scanner characteristics will systematically affect the DTI images and induce inter-scanner variation. Image-based meta-analysis (IBMA) techniques, reviewed in Salimi-Khorshidi et al. (2009), are common methods for combining results from multi-site studies with the goal of testing a statistical hypothesis. IBMA methods circumvent the need to harmonize images across sites by performing site-specific statistical analyses and combining results afterwards. Fisher's p-value combining method and Stouffer's z-transformation test, applied to z- or t-maps, are two common IBMA techniques. Fixed-effect models based on (possibly) normalized images, and mixed-effect models to model the inter- and intra-site variability, are other common techniques for the analysis of multi-site data. Indeed, meta-analysis methods have shown great promise for studies with a large number of participants at each site. For instance, the ENIGMA-DTI working group has been successfully using and validating meta-analysis techniques on such multi-site DTI data (Jahanshad et al., 2013, Kochunov et al., 2014).
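As a rough illustration of the two combining rules named above, the following sketch applies Fisher's method and Stouffer's method to per-site results for a single voxel. The function names and the optional weights are assumptions made for this example; an IBMA pipeline would apply the same rule at every voxel of the site-specific maps.

```python
import math
from statistics import NormalDist

def fisher_combine(pvals):
    """Fisher's method: X = -2 * sum(log p) is chi-square with 2k df under H0.

    Assumes independent p-values in (0, 1]. Returns the combined p-value.
    """
    k = len(pvals)
    half = -sum(math.log(p) for p in pvals)  # X / 2
    # The chi-square survival function has a closed form for even df = 2k:
    # exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total

def stouffer_combine(zs, weights=None):
    """Stouffer's method: weighted sum of z-scores rescaled to unit variance.

    Returns the combined z and its one-sided (upper-tail) p-value.
    """
    if weights is None:
        weights = [1.0] * len(zs)
    z = sum(w * zi for w, zi in zip(weights, zs)) / math.sqrt(sum(w * w for w in weights))
    return z, 1.0 - NormalDist().cdf(z)

# Two sites reporting modest evidence at the same voxel.
print(fisher_combine([0.04, 0.08]))
print(stouffer_combine([1.75, 1.41]))
```

Both rules use only site-level statistics, which is why they avoid harmonizing the images themselves; weights proportional to the square root of each site's sample size are a common choice for Stouffer's method.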

Meta-analysis techniques have several limitations, however. First, study-specific samples might not be sufficient to estimate the true biological variability in the population (Mirzaalian et al., 2016). As described by De Wit et al. (2014), adjusting for variability at the participant level is problematic in meta-analyses, since only group-level demographic and clinical information is available. Another limitation is that, for a multi-site study, site-specific summary statistics will be affected by unbalanced data. For instance, the calculation of a variance using unbalanced datasets is highly affected by the ratio of cases to controls in the sample (Linn et al., 2016b). A further limitation is that, for imaging studies with small sample sizes, the parameters of the z-score transformations cannot be robustly estimated, yielding suboptimal statistical inferences.

Mega-analyses, in which the imaging data are combined before performing statistical inferences, have the potential to increase power compared to meta-analyses (De Wit et al., 2014). In addition, pooling imaging data across studies has the benefit of enriching the clinical picture of the sample by increasing the variability in symptom profiles (Turner, 2014) and demographic variables. This is particularly important for age-span studies. However, pooling data across studies may increase the heterogeneity of the imaging measurements by introducing undesirable variability caused by differences in scanner protocols. Harmonization of the pooled data is therefore necessary to ensure the success of mega-analyses. The DTI harmonization technique proposed in Mirzaalian et al. (2016) is a first step in that direction. The method is based on rotation invariant spherical harmonics (RISH) and combines the unprocessed DTI images across scanners. Unfortunately, a major drawback of the method is that it requires DTI data to have similar acquisition parameters across sites, an assumption that often does not hold in multi-site observational analyses.

In this work, we adapted and compared several statistical approaches for the harmonization of DTI studies that were previously developed for other data types: Functional normalization (Fortin et al., 2014), RAVEL (Fortin et al., 2016a), Surrogate variable analysis (SVA) (Leek and Storey, 2007) and ComBat (Johnson et al., 2007), a popular batch adjustment method developed for genomics data. We also include a simple method that globally rescales the data for each site using a z-score transformation map common to all features, which we refer to as “global scaling”. For the evaluation of the different harmonization techniques, we use DTI data acquired as part of two large imaging studies (Satterthwaite et al., 2014; Ghanbari et al., 2014) with images acquired on different scanners, using different imaging protocols. The participants are teenagers, and were matched across studies for age, gender, ethnicity, and handedness.
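The idea behind a global rescaling can be sketched as follows. This excerpt does not give the exact estimator used for “global scaling”, so the code assumes the simplest reading: one mean and one standard deviation per site, shared by all features, mapped onto the pooled mean and standard deviation. The function name and data layout are hypothetical.

```python
from statistics import mean, pstdev

def global_scaling(data_by_site):
    """Rescale each site with a single shift and scale shared by all features.

    data_by_site: {site label: [[feature values of one participant], ...]}
    Assumption for illustration: every site has non-constant data (sd > 0).
    """
    pooled = [v for subs in data_by_site.values() for s in subs for v in s]
    grand_mean, grand_sd = mean(pooled), pstdev(pooled)
    out = {}
    for site, subs in data_by_site.items():
        vals = [v for s in subs for v in s]
        m, sd = mean(vals), pstdev(vals)
        # z-score with the site's own parameters, then map to the pooled scale.
        out[site] = [[(v - m) / sd * grand_sd + grand_mean for v in s] for s in subs]
    return out

# Two participants per site, two features each; site B reads systematically higher.
toy = {"A": [[1.0, 2.0], [3.0, 4.0]], "B": [[11.0, 12.0], [13.0, 14.0]]}
print(global_scaling(toy))
```

Because the same two parameters are applied to every feature, a correction of this kind cannot follow site effects that vary from one brain region to another.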

We first analyze site-related differences in the FA, MD, radial diffusivity (RD) and axial diffusivity (AD) measurements, and show evidence of significant site effects that differ across the brain. This motivates the need for a harmonization technique that is sensitive to region-specific scanner effects. Then, we harmonize the data with each of the proposed harmonization methods, and evaluate their performance using a comprehensive evaluation framework. We show that ComBat is the most effective harmonization technique, as it removes unwanted variation induced by site while preserving between-subject biological variability. ComBat is a promising harmonization technique for other imaging modalities since it does not make assumptions about the origin of the site effects.
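The sketch below illustrates what a region-specific correction that preserves a biological covariate can look like for one feature (for example, FA at one voxel). It is a simplified location and scale adjustment in the spirit of ComBat and is not the ComBat algorithm itself: ComBat additionally shrinks the site parameters across features with empirical Bayes (Johnson et al., 2007), a step omitted here. The function name, the single age covariate and the toy data are assumptions for illustration.

```python
from statistics import mean, pstdev

def location_scale_adjust(y, age, site):
    """Harmonize one feature across sites while keeping its age trend.

    y, age, site: equal-length lists with one entry per participant.
    Assumes each site has at least two participants with differing ages
    and non-zero residual spread.
    """
    n = len(y)
    sites = sorted(set(site))
    idx = {s: [k for k in range(n) if site[k] == s] for s in sites}
    # Within-site slope for age, so that site shifts do not bias it.
    xbar = {s: mean(age[k] for k in idx[s]) for s in sites}
    ybar = {s: mean(y[k] for k in idx[s]) for s in sites}
    sxy = sum((age[k] - xbar[site[k]]) * (y[k] - ybar[site[k]]) for k in range(n))
    sxx = sum((age[k] - xbar[site[k]]) ** 2 for k in range(n))
    beta = sxy / sxx
    # Site intercepts, their sample-size-weighted average, and residuals.
    a = {s: ybar[s] - beta * xbar[s] for s in sites}
    alpha = sum(a[s] * len(idx[s]) for s in sites) / n
    resid = [y[k] - a[site[k]] - beta * age[k] for k in range(n)]
    # Site scale relative to the pooled residual standard deviation.
    sigma = pstdev(resid)
    delta = {s: pstdev([resid[k] for k in idx[s]]) / sigma for s in sites}
    # Drop the site shift, rescale the residuals, keep the age trend.
    return [alpha + beta * age[k] + resid[k] / delta[site[k]] for k in range(n)]

# Toy data: site B is shifted upward and twice as noisy as site A.
ages = [10, 12, 14, 16] * 2
labels = ["A"] * 4 + ["B"] * 4
noise = [0.01, -0.01, -0.01, 0.01]
fa = [0.4 + 0.01 * t + e for t, e in zip(ages[:4], noise)]
fa += [0.5 + 0.01 * t + 2 * e for t, e in zip(ages[4:], noise)]
print(location_scale_adjust(fa, ages, labels))
```

Running the adjustment separately for each feature lets the estimated shift and scale differ across the brain, and the age term is added back after the site terms are removed so that the biological trend is retained.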

Section snippets

Data

We consider two DTI studies from two different scanners. To investigate the effect of scanner variations on the DTI measurements, we matched the participants for age, gender, ethnicity and handedness, resulting in 105 participants retained in each study for further analysis. The characteristics of each dataset are described below.

Dataset 1 (Site 1): PNC dataset. We selected a subset of the Philadelphia Neurodevelopmental Cohort (PNC) (Satterthwaite et al., 2014), and included 105 healthy

Results

The results are organized as follows. We first show evidence of substantial site effects in the FA and MD maps in Section 3.1, and then show how the different harmonization methods perform at removing those site effects in Section 3.2. In Section 3.3, we discuss the biological variability at each site separately, before and after harmonization, and show how site effects affect the number of voxels associated with age. In Section 3.4, we present our experiments for simulating different levels

Discussion

In this work, we investigated the effects of combining DTI studies across sites and scanners on the statistical analyses. We used FA and MD maps from data acquired at two sites with different scanners. We first showed that combining the two studies without proper harmonization led to a decrease in power of detecting voxels associated with age. This confirmed that DTI measurements are highly affected by small changes in the scanner parameters, as those affect the underlying water diffusivity.

Software

All of the postprocessing analysis was performed in the R statistical software (v3.2.0). For SVA and ComBat, reference implementations from the sva package were used (v3.22.0). All figures were generated in R with customized and reproducible scripts, using several functions from the package fslr (Muschelli et al., 2015) (v2.12). We have adapted and implemented the ComBat methodology to imaging data, and the software is available in both R and Matlab on GitHub (//github.com/Jfortin1/ComBatHarmonization

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

JPF developed the methodology and analyzed the data. DP, BT and TW processed the data. ME, KR, DR, TS, RCG, REG and RTSc recruited the participants and acquired the data. JPF and RTSh wrote the manuscript. RTSh and RV supervised the work. All authors read and approved the final manuscript.

Funding

The research was supported in part by R01NS085211 and R21NS093349 from the National Institute of Neurological Disorders and Stroke, R01MH092862 and R01MH107703 from the National Institute of Mental Health and R01HD089390 from the National Institute of Child Health and Human Development. The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies.

References (58)

  • Andrew L. Alexander et al. Analysis of partial volume effects in diffusion-tensor MRI. Magn. Reson. Med. (2001)
  • Andrew L. Alexander et al. Diffusion tensor imaging of the brain. Neurotherapeutics (2007)
  • Manzar Ashtari et al. White matter development during late adolescence in healthy males: a cross-sectional diffusion tensor imaging study. NeuroImage (2007)
  • Naama Barnea-Goraly et al. White matter development during childhood and adolescence: a cross-sectional diffusion tensor imaging study. Cereb. Cortex (2005)
  • Sunita Bava et al. Longitudinal characterization of white matter maturation during adolescence. Brain Res. (2010)
  • J Martin Bland et al. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet (1986)
  • B.M. Bolstad et al. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics (2003)
  • W. Cleveland. Visualizing Data. (1993)
  • William S. Cleveland. Robust locally weighted regression and smoothing scatterplots. J. Am. Stat. Assoc. (1979)
  • William S. Cleveland. Lowess: a program for smoothing scatterplots by robust locally weighted regression. Am. Statistician (1981)
  • Marta Morgado Correia et al. Looking for the optimal DTI acquisition scheme given a maximum scan time: are more b-values a waste of time? Magn. Reson. Imaging (2009)
  • Stella J. De Wit et al. Multicenter voxel-based morphometry mega-analysis of structural brain scans in obsessive-compulsive disorder. Am. J. Psychiatry (2014)
  • Sandrine Dudoit et al. Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. Stat. Sin. (2002)
  • Jean-Philippe Fortin et al. Functional normalization of 450k methylation array data improves replication in large cancer studies. Genome Biol. (2014)
  • Jean-Philippe Fortin et al. Removing inter-subject technical variability in magnetic resonance imaging studies. NeuroImage (2016)
  • Jean-Philippe Fortin et al. Preprocessing, normalization and integration of the Illumina HumanMethylationEPIC array with minfi. Bioinformatics (2016)
  • J.A. Gagnon-Bartsch et al. Using control genes to correct for unwanted variation in microarray data. Biostatistics (2012)
  • Eleftherios Garyfallidis et al. Dipy, a library for the analysis of diffusion MRI data. Front. Neuroinformat. (2014)
  • Yasser Ghanbari et al. Identifying group discriminative and age regressive sub-networks from DTI-based connectivity via a unified framework of non-negative matrix factorization and graph embedding. Med. Image Anal. (2014)
  • Marco Giannelli et al. Dependence of brain DTI maps of fractional anisotropy and mean diffusivity on the number of diffusion weighting directions. J. Appl. Clin. Med. Phys. (2009)
  • Antonio Giorgio et al. Longitudinal changes in grey and white matter during adolescence. NeuroImage (2010)
  • Yuankai Huo et al. Mapping lifetime brain volumetry with covariate-adjusted restricted cubic spline regression from cross-sectional multi-site MRI. International Conference on Medical Image Computing and Computer-assisted Intervention (2016)
  • Rafael A. Irizarry et al. Multiple-laboratory comparison of microarray platforms. Nat. Methods (2005)
  • Neda Jahanshad et al. Multi-site genetic analysis of diffusion images and voxelwise heritability analysis: a pilot project of the ENIGMA–DTI working group. NeuroImage (2013)
  • Mark Jenkinson et al. A global optimisation method for robust affine registration of brain images. Med. Image Anal. (2001)
  • Mark Jenkinson et al. Improved optimization for the robust and accurate linear registration and motion correction of brain images. NeuroImage (2002)
  • W Evan Johnson et al. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics (2007)
  • Derek K. Jones. The effect of gradient sampling schemes on measures derived from diffusion tensor MRI: a Monte Carlo study. Magn. Reson. Med. (2004)
  • Mina Kim et al. Spatial resolution dependence of DTI tractography in human occipito-callosal region. NeuroImage (2006)
    1

    Equal contribution.
