Maftools: efficient and comprehensive analysis of somatic variants in cancer

  1. H. Phillip Koeffler (1,3,5)
  1. Cancer Science Institute of Singapore, National University of Singapore, 117599, Singapore;
  2. Epigenomics and Cancer Risk Factors, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany;
  3. Department of Medicine, Cedars-Sinai Medical Center, Los Angeles, California 90048, USA;
  4. German Centre for Cardiovascular Research (DZHK), Partner Site Heidelberg/Mannheim, 69120 Heidelberg, Germany;
  5. National University Cancer Institute, National University Hospital, 119074, Singapore
  • Corresponding authors: dchlin11{at}gmail.com, a.mayakonda{at}dkfz-heidelberg.de
  • Abstract

    Numerous large-scale genomic studies of matched tumor-normal samples have established the somatic landscapes of most cancer types. However, downstream analysis of somatic mutation data entails a number of computational and statistical approaches and requires the use of multiple independent software tools. Here, we describe an R Bioconductor package, Maftools, which offers a multitude of analysis and visualization modules that are commonly used in cancer genomic studies, including driver gene identification, pathway, signature, enrichment, and association analyses. Maftools requires only somatic variants in Mutation Annotation Format (MAF) and is independent of larger alignment files. By implementing well-established statistical and computational methods, Maftools facilitates data-driven research and comparative analysis to discover novel results from publicly available data sets. In the present study, using three well-annotated cohorts from The Cancer Genome Atlas (TCGA), we describe the application of Maftools to reproduce known results. More importantly, we show that Maftools can also be used to uncover novel findings through integrative analysis.

    Footnotes

    • Received May 6, 2018.
    • Accepted September 27, 2018.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
