Table 2

Consensus recommendations for the standardization of analytical validation studies of targeted NGS panels that estimate TMB

  • Accuracy or agreement should be measured by comparing the TMB values generated by the assay requiring validation against reference TMB values generated either from:

    • A comparable companion diagnostic approved by a regulatory agency, such as the FDA, if available, OR

    • A WES assay with validated performance characteristics and using an accepted WES TMB calculation method, such as the common method reported in this study (see online supplementary table 1).

  • The minimum number of samples used for evaluation of accuracy should be at least 30. Samples should have TMB values that span the entire analytical range being investigated (0–40 mut/Mb is recommended currently).

  • TMB as a continuous score: This analysis will characterize the analytical performance of the assay over the analytical range of interest per the intended use.

    • Quantification of performance based on TMB as a continuous variable should include an appropriate regression analysis and a scatter plot showing the association between the panel and reference TMB values.

    • Additionally, quantification of performance should examine the pointwise prediction intervals for panel TMB values as obtained from the regression analysis, at predefined reference TMB values. A description of the absolute deviation from the mean should also be reported.
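
The continuous-score analysis above can be sketched as follows. This is a minimal illustration, not the method used in the study: it assumes an ordinary least squares fit of panel TMB on reference TMB, and it uses a normal quantile as a large-sample stand-in for the t quantile in the prediction interval. The function names `fit_line` and `prediction_interval` are hypothetical.

```python
import math
from statistics import NormalDist, mean


def fit_line(ref, panel):
    """Ordinary least squares fit of panel TMB on reference TMB (mut/Mb)."""
    n = len(ref)
    x_bar, y_bar = mean(ref), mean(panel)
    sxx = sum((x - x_bar) ** 2 for x in ref)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ref, panel))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual standard error with n - 2 degrees of freedom.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(ref, panel))
    s = math.sqrt(sse / (n - 2))
    return {"slope": slope, "intercept": intercept, "s": s,
            "n": n, "x_bar": x_bar, "sxx": sxx}


def prediction_interval(fit, x0, level=0.95):
    """Approximate pointwise prediction interval for panel TMB at reference x0.

    Assumption: the normal quantile replaces the t quantile, which is only
    reasonable for larger sample sizes.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    y_hat = fit["intercept"] + fit["slope"] * x0
    se = fit["s"] * math.sqrt(
        1 + 1 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"]
    )
    return y_hat - z * se, y_hat, y_hat + z * se
```

Calling `prediction_interval(fit, x0)` at each predefined reference TMB value gives the lower bound, the predicted panel TMB and the upper bound at that point.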

  • For TMB as a categorical call: accuracy should be analytically validated using one or more discrete TMB cut-off values within the analytical range of interest.

    • Quantification of performance of TMB as a categorical call should be based on 2×2 agreement tables, from which the positive per cent agreement, negative per cent agreement and overall per cent agreement are calculated.

    • For assays pursuing a companion diagnostic claim, accuracy should be examined using a predetermined discrete cut-off value investigated in a study using clinical samples covering the spectrum of conditions from a defined, or intent-to-treat (ITT) population. If the ITT population includes multiple cancer types, stratified analyses should be conducted.
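
The agreement statistics for the categorical call can be illustrated with a short sketch. It assumes that a sample is called TMB high when its value is at or above the cut-off (the table does not say whether the boundary is inclusive), and the function name `agreement` is hypothetical.

```python
def agreement(panel_tmb, ref_tmb, cutoff):
    """Build a 2x2 agreement table and derive PPA, NPA and OPA.

    Assumption: TMB high means value >= cutoff for both assays.
    """
    pairs = list(zip(panel_tmb, ref_tmb))
    both_high = sum(p >= cutoff and r >= cutoff for p, r in pairs)
    panel_only = sum(p >= cutoff and r < cutoff for p, r in pairs)
    ref_only = sum(p < cutoff and r >= cutoff for p, r in pairs)
    both_low = sum(p < cutoff and r < cutoff for p, r in pairs)

    ref_high = both_high + ref_only   # reference-positive samples
    ref_low = both_low + panel_only   # reference-negative samples
    total = len(pairs)
    return {
        "table": [[both_high, panel_only], [ref_only, both_low]],
        # Agreement is undefined (None) when a denominator is zero.
        "ppa": both_high / ref_high if ref_high else None,
        "npa": both_low / ref_low if ref_low else None,
        "opa": (both_high + both_low) / total if total else None,
    }
```

For a multi-cancer ITT population, the same function could be applied separately to each cancer type to obtain the stratified analyses.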

  • Alteration level agreement: as a supplemental analysis, the agreement between alteration level calls (including single nucleotide variants (SNVs) and short indels) between the platform and the reference should be provided for each alteration included in the TMB panel and restricted to the overlapping genomic regions between the two assays.

    • Report the concordance of variant calls between variants identified by WES and panel as a function of the panel variant allele frequency (VAF).
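
One way to tabulate concordance as a function of panel VAF is sketched below. It is illustrative only: the variant keys, the VAF bin edges and the function name `concordance_by_vaf` are assumptions, and the sketch reports a single direction (the fraction of panel calls also found by WES), restricted by the caller to the overlapping genomic regions.

```python
def concordance_by_vaf(panel_variants, wes_variants,
                       bin_edges=(0.0, 0.05, 0.10, 0.20, 1.0)):
    """Fraction of panel variant calls also called by WES, per panel VAF bin.

    panel_variants: dict mapping a variant key to its panel VAF (0 to 1).
    wes_variants: set of variant keys called by WES.
    Bin edges are hypothetical and should be chosen per assay.
    """
    rows = []
    for lo, hi in zip(bin_edges, bin_edges[1:]):
        is_last = hi == bin_edges[-1]
        # Bins are half-open [lo, hi); the last bin also includes its upper edge.
        keys = [v for v, vaf in panel_variants.items()
                if lo <= vaf < hi or (is_last and vaf == hi)]
        shared = sum(v in wes_variants for v in keys)
        rate = shared / len(keys) if keys else None
        rows.append((lo, hi, len(keys), rate))
    return rows
```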

  • Characterize the percentage of tests passing QC by reporting first pass acceptability rate and overall acceptability rate (after samples have been retested, if necessary).

  • Precision should be evaluated using several samples. For each sample, separate analyses should be performed as described in the TMB as a continuous score and TMB as a categorical call sections below.

  • Analytical validation of precision of TMB as both a continuous score and a categorical call will improve reliability of TMB as a biomarker.

  • Because TMB is a composite estimate composed of different variants, its precision should be evaluated as a composite score (mut/Mb).

  • TMB as a continuous score:

    • Precision studies of quantitative TMB estimates should evaluate the mean, SD and coefficient of variation of TMB values obtained from testing aliquots of the same sample under stipulated precision conditions (eg, replicates, runs, instruments, lots, operators) for a range of samples (5–6 samples with 20 TMB results distributed across the precision conditions each) with TMB values within the analytical range (0–40 mut/Mb is recommended currently), and include different levels of tumor content and VAF values.

      • Note: identification of the TMB range to be evaluated should be guided by the most recently published clinically relevant studies.

    • Precision of the TMB score should be estimated using a variance component analysis to estimate between-run, within-run, between-instrument, between-lot and between-operator SD for each sample.

    • Quantification of performance should include calculation of repeatability and within-lab SD for each sample corresponding to several discrete TMB values, given that a single average value of variation (eg, coefficient of variation pooled across several samples having different TMB levels) may not best reflect the changing variability across the TMB range.
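
A simplified sketch of the per-sample precision summary is given below. It assumes a balanced design with a single random factor (run) and estimates the variance components by the ANOVA method of moments. A full study would add instrument, lot and operator as further components. The function name `precision_summary` is hypothetical.

```python
import math
from statistics import mean, stdev


def precision_summary(runs):
    """Summarize precision for one sample.

    runs: list of runs, each a list of replicate TMB values (mut/Mb).
    Assumption: balanced design, run is the only between-condition factor.
    """
    k, n = len(runs), len(runs[0])
    if k < 2 or n < 2 or any(len(r) != n for r in runs):
        raise ValueError("need at least 2 runs with equal replicate counts >= 2")

    values = [v for r in runs for v in r]
    grand = mean(values)
    run_means = [mean(r) for r in runs]

    # Mean squares between and within runs.
    msb = n * sum((m - grand) ** 2 for m in run_means) / (k - 1)
    msw = sum((v - m) ** 2 for r, m in zip(runs, run_means) for v in r) / (k * (n - 1))

    var_between = max(0.0, (msb - msw) / n)  # negative estimates are set to 0
    sd_total = stdev(values)
    return {
        "mean": grand,
        "sd": sd_total,
        "cv_percent": 100 * sd_total / grand if grand else None,
        "sd_repeatability": math.sqrt(msw),
        "sd_between_run": math.sqrt(var_between),
        "sd_within_lab": math.sqrt(msw + var_between),
    }
```

Reporting these values per sample, rather than pooling them, keeps the change in variability across the TMB range visible.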

  • TMB as a categorical call: consistency of the categorical calls (eg, TMB high vs TMB low according to a discrete cut-off) should be evaluated based on both repeatability and within-laboratory precision for the TMB results according to one or more discrete cut-off values.

    • For repeatability, calculate the per cent of TMB high calls (if majority call is high) or per cent of TMB low calls (if majority call is low) between replicate samples tested under the same lab conditions.

    • For within-laboratory precision, calculate the per cent of TMB high calls (if majority call is high) or per cent of TMB low calls (if majority call is low) and the mean TMB score from replicate samples tested under varying within-lab conditions.

      • Note: the number of aliquots tested per sample should be sufficient to account for the various sources of assay variability, such as the ones described above (TMB as a continuous score). Moreover, the number of samples tested should be similar to that used for TMB as a continuous score.

    • Emphasis should be placed on evaluating samples with TMB values:

      • Significantly below cut-off (approximates limit of blank, expect TMB low almost 100% of time).

      • Near and below cut-off (expect TMB low 95% of time).

      • Near and above cut-off (expect TMB high 95% of time).

      • Significantly above cut-off (expect TMB high almost 100% of time).
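
The per-sample consistency of the categorical call can be computed as in the sketch below. It assumes TMB high means a value at or above the cut-off and, as an arbitrary convention, reports a tie as a majority call of high. The function name `call_consistency` is hypothetical.

```python
from statistics import mean


def call_consistency(tmb_values, cutoff):
    """Per cent of replicates agreeing with the majority call for one sample.

    Assumptions: high means value >= cutoff; ties are reported as "high".
    """
    n_high = sum(v >= cutoff for v in tmb_values)
    n_low = len(tmb_values) - n_high
    majority = "high" if n_high >= n_low else "low"
    pct_majority = 100 * max(n_high, n_low) / len(tmb_values)
    return majority, pct_majority, mean(tmb_values)
```

The same function serves both analyses: pass replicates from a single set of conditions for repeatability, or replicates pooled across runs, instruments, lots and operators for within-laboratory precision.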

  • Alteration level precision: the evaluation of precision for each individual alteration call used to estimate TMB is not necessary but may be performed as an exploratory analysis to provide insight into the mechanisms that contribute to the TMB score variability.

  • Per cent tumor content should be collected when evaluating precision and reported, if applicable.

  • Characterize the percentage of tests passing QC by reporting first pass acceptability rate and overall acceptability rate (after samples have been retested, if necessary).

  • The impact of tumor content of a sample on the TMB categorical call (high, low) should be evaluated using multiple samples, taking into consideration the precision of the TMB score as a function of decreasing tumor content.

    • Undiluted samples should have a range of expected TMB scores, a range of VAF values for somatic mutations and a ratio of SNVs and indels that are representative of clinical samples.

  • The evaluation of panel sensitivity to tumor content should be done using:

    • Samples: 6–10 undiluted samples where each sample is diluted to at least 5 levels of tumor content.

    • Each sample should have a dilution series ranging from well above to well below the expected sensitivity limits for tumor content.

      • Note: It is likely that matched normal will be required to generate each respective dilution value. Consideration should be given to technical and biological factors that may impact the choice of the normal sample and design of the dilution series.

    • At each dilution level, at least 10 replicate samples should be tested.

  • For each sample, the evaluation should include a calculation of the per cent of TMB high (above a predefined TMB threshold) across replicates at each dilution and a probit regression of per cent TMB high versus tumor content. From the regression, report the estimated tumor content where the probability of detecting TMB high is 95%.
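
The probit analysis described above can be sketched with the standard library alone. This is an illustrative implementation, not the one used in the study: it fits P(TMB high) = Φ(a + b × tumor content) by Fisher scoring on the counts of TMB high calls at each dilution level, and it assumes the data are not perfectly separated. The function names `fit_probit` and `tumor_content_at` are hypothetical.

```python
from statistics import NormalDist


def fit_probit(tumor_content, n_replicates, n_high, iterations=50):
    """Fit P(high) = Phi(a + b * x) by iteratively reweighted least squares."""
    nd = NormalDist()
    a, b = 0.0, 0.0
    for _ in range(iterations):
        sw = swx = swxx = swz = swxz = 0.0
        for x, n, k in zip(tumor_content, n_replicates, n_high):
            eta = a + b * x
            # Clamp to keep weights and working responses finite.
            p = min(max(nd.cdf(eta), 1e-9), 1 - 1e-9)
            dens = max(nd.pdf(eta), 1e-12)
            w = n * dens ** 2 / (p * (1 - p))  # Fisher scoring weight
            z = eta + (k / n - p) / dens       # working response
            sw += w
            swx += w * x
            swxx += w * x * x
            swz += w * z
            swxz += w * x * z
        # Weighted least squares of z on x.
        det = sw * swxx - swx ** 2
        b_new = (sw * swxz - swx * swz) / det
        a_new = (swz - b_new * swx) / sw
        converged = abs(a_new - a) + abs(b_new - b) < 1e-10
        a, b = a_new, b_new
        if converged:
            break
    return a, b


def tumor_content_at(a, b, probability=0.95):
    """Tumor content at which the fitted probability of TMB high is reached."""
    return (NormalDist().inv_cdf(probability) - a) / b
```

With at least 10 replicates at each of at least 5 dilution levels, the inputs are the tumor content of each level, the number of replicates tested and the number called TMB high.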

Limit of blank*
  • Non-tumor samples should be used to establish the limit of blank for TMB. These samples are expected to yield results close to, but not always equal to, 0 mut/Mb.

    • Considerations should be given to technical and biological factors, such as age of patient and distance from tumor lesion, among others.

Percentage of tests passing QC
  • Report percentage of tests passing TMB QC metrics in routine testing.

  • Example QC metrics for TMB might include: median exon coverage, coverage uniformity, contamination rate.

*The definitions of the terms used in this table are based on the Clinical and Laboratory Standards Institute Harmonized Terminology Database at

FDA, Food and Drug Administration; NGS, next-generation sequencing; QC, quality control; TMB, tumor mutational burden; WES, whole exome sequencing.