Table 2

Online resources: tools for the bench and other useful websites

Resource	Description	URL
CIMAC/CIDC network	The Cancer Immune Monitoring and Analysis Centers (CIMAC) and the Cancer Immunologic Data Commons (CIDC) are NCI-funded academic centers for advanced clinical trial immune monitoring.	https://cimac-network.org/
PACT	The Partnership for Accelerating Cancer Therapies (PACT) is a public–private collaboration that extends the CIMAC/CIDC activities to include additional non-NCI clinical trials.	https://fnih.org/what-we-do/programs/partnership-for-accelerating-cancer-therapies
Links to FDA biomarker approval	The FDA’s Center for Drug Evaluation and Research works with stakeholders to identify and develop new biomarkers, review biomarkers for use in regulatory decision-making, and qualify biomarkers for specific contexts of use.	https://www.fda.gov/drugs/drug-development-tool-qualification-programs/cder-biomarker-qualification-program
Public databases	ImmPort is a data repository and sharing tool built by NIAID for immunology-related assay data of various types.	http://www.immport.org
	The Cancer Genome Atlas is a database of sequences from over 20,000 cancer and matched normal tissues.	https://portal.gdc.cancer.gov
Transcription factor binding site prediction software	Transcription factor (TF) binding site prediction is important for deciphering gene regulation at the transcriptional level. TF binding sites are typically identified either by matching to a consensus sequence or by scoring with position-specific scoring matrices (PSSMs). PSSMs can be obtained from resources including the commercial transcription factor database TRANSFAC and the open-access database JASPAR: Computer methods to locate signals in nucleic acid sequences.584 TRANSFAC and its module TRANSCompel: transcriptional gene regulation in eukaryotes.585 JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles.586 In 2005, Tompa et al 587 evaluated 13 algorithms designed to identify cis-regulatory sites using TF binding sites from TRANSFAC. Their results indicated that the Weeder algorithm performed best: Assessing computational tools for the discovery of transcription factor binding sites.587 A set of de novo motif discovery tools, namely rGADEM (R-based genetic algorithm-guided formation of spaced dyads coupled with an expectation-maximization (EM) algorithm for motif discovery), HOMER (hypergeometric optimization of motif enrichment), MEME-ChIP (multiple EM for motif elicitation-chromatin immunoprecipitation), and ChIPMunk (a modification of the classical EM approach), was also evaluated using ChIP-seq data from ENCODE. The study showed that rGADEM was the best-performing tool for creating PSSMs from high-throughput ChIP-seq data. FIMO (Find Individual Motif Occurrences) and MCAST (Motif Cluster Alignment and Search Tool) were the best-performing TF binding site prediction tools for scanning PSSMs against DNA: Evaluating tools for transcription factor binding site prediction.588
Tools for neoantigen prediction	Neoantigens are small peptides derived from mutated proteins in cancer cells that can be recognized as foreign by immune cells and trigger an immune response. Computational methods and tools face many challenges in identifying neoantigens and in predicting which may serve as optimal targets for immunotherapy development: Neoantigens in cancer immunotherapy.589 Computational genomics tools for dissecting tumour-immune cell interactions.590 Applications of immunogenomics to cancer.490 MHC binding is considered a necessary step for neoantigens to be recognized by T cell receptors. MHC binding prediction methods can be categorized as binding motif-based, position-specific score-based or matrix-based, and machine learning-based, such as artificial neural networks (ANN) or support vector machines. Because of the polymorphic nature of MHC class II molecules and variations in accepted peptide length, prediction results for MHC class II binding are less accurate than those for MHC class I. Many existing MHC binding peptide and T cell epitope databases could serve as a training data pool for developing prediction models. 
A good example is the Immune Epitope Database (IEDB), which provides a comprehensive resource for experimental data on antibody and T cell epitopes studied in multiple diseases: SYFPEITHI: database for MHC ligands and peptide motifs.591 Profile analysis: detection of distantly related proteins.592 Gapped sequence alignment using artificial neural networks: application to the MHC class I system.593 NetMHCpan-3.0; improved prediction of binding to MHC class I molecules integrating information from multiple receptor and peptide length datasets.594 NetMHCpan, a method for MHC class I binding prediction beyond humans.595 Application of support vector machines for T-cell epitopes prediction.596 SVMHC: a server for prediction of MHC-binding peptides.597 The immune epitope database and analysis resource: from vision to blueprint.598 The immune epitope database (IEDB) 3.0.599 IEDB: http://tools.iedb.org/main/datasets 600 Not all MHC binding peptides are immunogenic. Combination approaches have been developed that use additional information (eg, proteasome cleavage) to reduce the false positive rate. Because the stability of the peptide–MHC interaction has been shown experimentally to correlate more strongly with T cell immunogenicity than binding affinity does, netMHCstabpan (pan-specific prediction of peptide–MHC class I complex stability) uses a neural network approach based on a data set of stability values calculated for different peptide–MHC class I complexes, rather than their binding affinity values: Pan-specific prediction of peptide-MHC class I complex stability, a correlate of T cell immunogenicity.601 Many pipelines have been developed for neoantigen prediction from WES data via integration of multiple methods. For example, MuPeXI (mutant peptide extractor and informer) is a program that identifies tumor-specific peptides from sequencing data and assesses their potential to be neoantigens. 
The peptides are sorted according to a priority score that is intended to roughly predict immunogenicity. pVACSeq (personalized Variant Antigens by Cancer Sequencing) is a flexible, streamlined computational workflow that integrates tumor mutation and expression data: MuPeXI: prediction of neo-epitopes from tumor sequencing data.602 pVAC-Seq: A genome-guided in silico approach to identifying tumor neoantigens.603	IEDB: http://tools.iedb.org/main/datasets
CTRs	Clinical trial registries (CTRs) for the USA and Europe.	USA: https://www.clinicaltrials.gov Europe: https://www.clinicaltrialsregister.eu/

ANN, artificial neural networks; CIDC, Cancer Immunologic Data Commons; CIMAC, Cancer Immune Monitoring and Analysis Centers; CTR, clinical trial registry; EM, expectation maximization; FDA, Food and Drug Administration; FIMO, Find Individual Motif Occurrences; HOMER, hypergeometric optimization of motif enrichment; IEDB, Immune Epitope Database; MCAST, Motif Cluster Alignment and Search Tool; MHC, major histocompatibility complex; MuPeXI, mutant peptide extractor and informer; NCI, National Cancer Institute; NIAID, National Institute of Allergy and Infectious Diseases; PACT, Partnership for Accelerating Cancer Therapies; PSSM, position-specific scoring matrix; pVACSeq, personalized Variant Antigens by Cancer Sequencing; rGADEM, R-based genetic algorithm-guided formation of spaced dyads coupled with an EM algorithm for motif discovery; TF, transcription factor; WES, whole exome sequencing.
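
As an illustration of the PSSM-based scanning mentioned in the transcription factor row of Table 2, the following is a minimal Python sketch under stated assumptions. It is not the algorithm used by FIMO, MCAST, or any other tool named in the table. The count matrix is invented for illustration (real matrices would be obtained from a resource such as JASPAR or TRANSFAC), and the names `COUNTS`, `build_pssm`, and `scan`, the uniform background frequency, the pseudocount, and the score threshold are all hypothetical choices.

```python
import math

# Hypothetical count matrix for a 4-bp motif: one dict per motif position.
# The counts are invented for illustration and do not describe a real TF.
COUNTS = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 1, "C": 0, "G": 0, "T": 9},
]


def build_pssm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds scores.

    Assumes a uniform background frequency; the pseudocount avoids log(0).
    """
    pssm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pssm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pssm


def scan(sequence, pssm, threshold):
    """Slide the PSSM along one strand and return (start, window, score)
    for every window whose summed log-odds score reaches the threshold."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width].upper()
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(pssm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits


if __name__ == "__main__":
    matrix = build_pssm(COUNTS)
    for start, window, score in scan("TTACGTAA", matrix, threshold=4.0):
        print(f"position {start}: {window} (score {score:.2f})")
```

This sketch scans only the forward strand and uses a fixed score cutoff; published tools typically also scan the reverse complement and convert scores to p-values, which is omitted here for brevity.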