TMB

Tumor mutational burden (TMB) is the total number of somatic mutations present in the cancer genome.

The algorithm calculates TMB in the following steps.

Small variant calling

Refer to Small Variants for details on how small variants are called.

Eligible region detection

TMB is computed over protein-coding regions with sufficient coverage, excluding low-confidence regions (the blocklist regions). In the DRAGEN TSO 500 ctDNA analysis software, the total coding region with coverage ≥ 1000X is used.
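As a rough illustration of this step, the sketch below counts coding positions that reach the coverage threshold and do not fall inside a blocklist interval. The function name, the per-position depth dictionary, and the half-open interval format are assumptions made for this example; they are not the pipeline's actual data structures.

```python
MIN_COVERAGE = 1000  # coverage threshold stated above for TSO 500 ctDNA


def eligible_region_size_mbp(coding_depths, blocklist, min_coverage=MIN_COVERAGE):
    """Return the eligible region size in megabases.

    coding_depths: dict mapping (chrom, pos) -> depth for coding positions
                   (hypothetical input format).
    blocklist: list of (chrom, start, end) half-open intervals
               (hypothetical format).
    """
    def is_blocked(chrom, pos):
        return any(c == chrom and start <= pos < end for c, start, end in blocklist)

    eligible_bases = sum(
        1
        for (chrom, pos), depth in coding_depths.items()
        if depth >= min_coverage and not is_blocked(chrom, pos)
    )
    return eligible_bases / 1_000_000
```

The linear scan over the blocklist keeps the example short; a real implementation would more likely use sorted intervals or an interval tree.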

Germline variant identification

To exclude germline variants from the TMB calculation, the algorithm uses two methods to predict whether a variant is of germline origin.

1. Database filter

Variants with a population allele count ≥ 10 in either the 1000 Genomes or gnomAD database are marked as germline and assigned the tag Germline_DB in the “tmb.trace.tsv” and “hard-filtered.vcf” files.
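The database rule can be expressed as a simple threshold check. The sketch below is illustrative only: the function name and the dictionary of per-database counts are assumptions, and a missing count is treated as "not present in that database".

```python
GERMLINE_DB_MIN_ALLELE_COUNT = 10  # threshold stated above


def is_germline_by_database(allele_counts, threshold=GERMLINE_DB_MIN_ALLELE_COUNT):
    """Return True if any population database count reaches the threshold.

    allele_counts: dict of database name -> allele count, or None when the
                   variant is absent from that database (hypothetical format).
    """
    return any(
        count is not None and count >= threshold
        for count in allele_counts.values()
    )
```

For example, a variant with counts `{"1000 Genomes": 3, "gnomAD exome": 12}` would be marked Germline_DB under this rule, because one database reaches the threshold.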

2. Proxi filter

In the TSO 500 ctDNA pipeline, the proxi filter uses a probabilistic approach. For a target variant, it estimates the expected germline allele frequency from the surrounding germline variants. It then tests whether the allele frequency of the target variant is similar to the expected germline allele frequency. If it is, the tag Germline_Proxi is assigned to the target variant in the “tmb.trace.tsv” and “hard-filtered.vcf” files.

Note that the proxi filter does not work well for 100% pure cell lines or for mixed or contaminated samples, because these samples do not have clear germline variant allele frequency distributions.
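The statistical model behind the proxi filter is not described here, so the following is only a stand-in that shows the general idea. It assumes the expected germline allele frequency is the median VAF of nearby germline variants, and it assumes "similar" means the observed variant read count lies within a z-score cutoff under a normal approximation to the binomial. The function name, the cutoff, and the choice of test are all hypothetical.

```python
import math
from statistics import median


def looks_germline_by_proximity(alt_count, depth, neighbor_germline_vafs, z_cutoff=3.0):
    """Illustrative proximity check; not the pipeline's actual test.

    alt_count: reads supporting the target variant.
    depth: total reads at the target position.
    neighbor_germline_vafs: VAFs (fractions) of surrounding germline variants.
    """
    if depth <= 0 or not neighbor_germline_vafs:
        return False  # nothing to compare against
    # Expected germline VAF, clamped away from 0 and 1 so the variance is nonzero.
    expected = min(max(median(neighbor_germline_vafs), 1e-6), 1 - 1e-6)
    std = math.sqrt(depth * expected * (1 - expected))
    z = (alt_count - depth * expected) / std
    return abs(z) <= z_cutoff
```

With neighbors near 0.5, a target variant at 980 of 2000 reads is consistent with the expected frequency, while one at 40 of 2000 reads is not. When neighboring germline VAFs are scattered, as in contaminated samples, the median is a poor estimate, which is in line with the limitation noted above.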

Clonal hematopoiesis (CH) variant identification

Clonal hematopoiesis (CH) is characterized by the overrepresentation of blood cells derived from a single clone. CH is common and increases in prevalence with age. To determine TMB accurately, CH variants must be excluded.

The TSO 500 ctDNA pipeline uses two methods to tag variants as CH variants.

1. CH genes whitelist

Some of the genes most commonly mutated in clonal hematopoiesis (DNMT3A, TET2, PPM1D, and ASXL1) are included in the CH genes whitelist. If a variant is in one of these genes, the tag Somatic_Putative_CH is assigned to the variant in the “tmb.trace.tsv” and “hard-filtered.vcf” files.
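A minimal sketch of the whitelist lookup follows. It assumes the gene annotation arrives as a semicolon-delimited string and that a match on any listed gene is enough to tag the variant; both points are assumptions for illustration, as is the function name.

```python
CH_GENES = frozenset({"ASXL1", "DNMT3A", "PPM1D", "TET2"})  # genes listed above


def in_ch_gene_whitelist(gene_name_field):
    """Return True if any gene in a semicolon-delimited field is a CH gene."""
    genes = {g.strip() for g in gene_name_field.split(";") if g.strip()}
    return not genes.isdisjoint(CH_GENES)
```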

2. cfDNA fragment size analysis

CH-derived cfDNA fragments are generally longer than tumor-derived cfDNA fragments. This difference is used to identify CH variants based on the fragment size of the reads supporting variant calls. Non-germline variants supported by longer fragments are tagged as Somatic_Putative_CH in the “tmb.trace.tsv” and “hard-filtered.vcf” files.

Only variants with a sufficient number of supporting reads, that is, a variant allele count (VAC) > 50, are tested for a fragment size difference between the reads supporting the reference allele and the reads supporting the variant allele. Non-germline variants with a lower VAC, or without enough statistical power for the size difference test, remain tagged as Somatic in the “tmb.trace.tsv” and “hard-filtered.vcf” files.
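The statistical test and its decision threshold are not specified here, so the sketch below covers only the parts that are: the VAC > 50 gate and the quantity being compared. It assumes VAC equals the number of variant-supporting fragments and uses the difference in median fragment length as a simple summary; the function name and the input lists are hypothetical.

```python
from statistics import median

MIN_VAC_EXCLUSIVE = 50  # variants need VAC > 50 to be tested, as stated above


def fragment_size_shift(alt_fragment_sizes, ref_fragment_sizes):
    """Median fragment length of variant reads minus that of reference reads.

    Returns None when the variant is not eligible for the size test, in which
    case it would keep its Somatic tag.
    """
    if len(alt_fragment_sizes) <= MIN_VAC_EXCLUSIVE or not ref_fragment_sizes:
        return None
    return median(alt_fragment_sizes) - median(ref_fragment_sizes)
```

A negative shift means the variant-supporting fragments are shorter than the reference-supporting ones. How the pipeline turns such a difference into a Somatic_Putative_CH call is not covered by this example.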

Figure: Germline variant and clonal hematopoiesis (CH) variant identification in the TMB algorithm.

Tumor driver variant identification

Excluding tumor driver variants helps reduce bias in the bTMB calculation that could result from the targeted enrichment of the gene panel. Variants with a count ≥ 50 in the COSMIC database are treated as tumor driver variants and excluded from the calculation.

Nonsynonymous variant identification

Nonsynonymous variants are defined as described in the DRAGEN user guide. Only nonsynonymous variants are used to calculate Nonsynonymous TMB.

TMB calculation

The TMB is calculated using the following equations:

$$\text{TMB} = \frac{\text{Eligible Variants}}{\text{Effective Panel Size}}$$

$$\text{Nonsynonymous TMB} = \frac{\text{Filtered Nonsynonymous Variants}}{\text{Eligible Region Size (Mbp)}}$$
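Both equations are the same ratio of a variant count to a region size in megabases. The short function below works through that arithmetic; the function name and the sample numbers are made up for illustration.

```python
def tmb_per_mbp(variant_count, region_size_mbp):
    """Return mutations per megabase for a variant count and region size."""
    if region_size_mbp <= 0:
        raise ValueError("region size must be positive")
    return variant_count / region_size_mbp
```

For a hypothetical sample with 12 eligible variants, 6 of them nonsynonymous, over a 1.5 Mbp eligible region, `tmb_per_mbp(12, 1.5)` gives a TMB of 8.0 and `tmb_per_mbp(6, 1.5)` gives a Nonsynonymous TMB of 4.0.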

The eligible variants and effective panel size of the TMB calculation are summarized in the following table:

Calculation Value
Description

Eligible variants (numerator)

  • Variants in the coding region (RefSeq Cds)

  • Variant frequency ≥ 0.2%

  • Coverage ≥ 1000X

  • SNVs and Indels (MNVs excluded)

  • Nonsynonymous and synonymous variants. Only nonsynonymous variants are used for Nonsynonymous TMB.

  • Variants with count ≥ 50 in the COSMIC database are excluded

  • Mutations in ASXL1, DNMT3A, PPM1D, and TET2 are excluded

  • Fragment-size based potential clonal hematopoiesis (CH) variants are excluded

Effective panel size (denominator)

Total coding region with coverage ≥ 1000X.
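The numerator criteria in the table above can be combined into a single predicate, sketched below. The `Variant` record and its field names are hypothetical, VAF is assumed to be stored as a fraction (0.002 for 0.2%), and germline and fragment-size CH exclusions are assumed to be reflected in the status tag.

```python
from dataclasses import dataclass

CH_GENES = frozenset({"ASXL1", "DNMT3A", "PPM1D", "TET2"})


@dataclass
class Variant:  # hypothetical record, for illustration only
    is_coding: bool
    vaf: float             # fraction, e.g. 0.002 for 0.2%
    depth: int
    variant_type: str      # "SNV", "insertion", "deletion", or "MNV"
    max_cosmic_count: int
    gene_names: tuple
    status: str            # "Somatic", "Somatic_Putative_CH", "Germline_DB", ...


def included_in_tmb_numerator(v):
    """Apply the numerator criteria from the table to one variant."""
    return (
        v.is_coding
        and v.vaf >= 0.002                        # variant frequency >= 0.2%
        and v.depth >= 1000                       # coverage >= 1000X
        and v.variant_type != "MNV"               # SNVs and indels only
        and v.max_cosmic_count < 50               # drop likely tumor drivers
        and not CH_GENES.intersection(v.gene_names)
        and v.status == "Somatic"                 # drops germline and putative CH
    )
```

Counting the variants for which this predicate is true would give the TMB numerator; adding a nonsynonymous check on top would give the Nonsynonymous TMB numerator.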

TMB Output Files

The TMB algorithm outputs results in several files:

  1. Combined Variant Output File, {SampleID}_CombinedVariantOutput.tsv

  2. TMB Metrics CSV file, {Sample_ID}.tmb.metrics.csv

  3. TMB Trace TSV file, {Sample_ID}.tmb.trace.tsv

  4. TMB Max Somatic VAF file, {Sample_ID}.tmb.msaf.csv

1. Combined Variant Output File

File name: {SampleID}_CombinedVariantOutput.tsv

The TMB results are output in the [TMB] section and include:

  • The TMB value

  • Coding Region Size in Megabases (the denominator of the TMB formula)

  • Number of Passing Eligible Variants (the numerator of the TMB formula)

2. TMB Metrics CSV

File name: {Sample_ID}.tmb.metrics.csv

The TMB metrics file contains the TMB and Nonsynonymous TMB calculation results, and the values used to calculate them, for each DNA sample.

| Column | Description |
| --- | --- |
| Total Input Variant Count | Total number of variants considered by the algorithm |
| Total Input Variant Count in TMB region | Total number of variants considered by the algorithm in the TMB eligible region |
| Filtered Variant Count | Variants remaining after filtering; see the TMB algorithm page for details |
| Filtered Nonsyn Variant Count | Nonsynonymous variants remaining after filtering; see the TMB algorithm page for details |
| Eligible Region (MB) | The eligible region, in megabases, that meets the minimum coverage threshold |
| TMB | TMB value for the sample |
| Nonsyn TMB | Nonsynonymous TMB value for the sample |

3. TMB Trace File

The TMB trace file provides comprehensive information on how the TMB value is calculated for a given sample. All passing small variants from the small variant filtering step are included in this file. To view eligible variants for TMB calculation, set the filter for the column IncludedInTMBNumerator to TRUE.

| Column | Description |
| --- | --- |
| Chromosome | Chromosome |
| Position | Position of variant |
| RefCall | Reference base |
| AltCall | Alternate base |
| VAF | Variant allele frequency |
| Depth | Coverage of position |
| CytoBand | Cytoband of variant |
| GeneName | Name of gene if applicable. A semicolon-delimited list is used for multiple genes. |
| VariantType | Type of the variant: SNV, insertion, deletion, MNV |
| CosmicIDs | COSMIC IDs; multiple IDs are concatenated with “;” |
| MaxCosmicCount | Maximum COSMIC study count |
| ClinVarIDs | Reference ClinVar Variation IDs (RCV IDs) |
| ClinVarSignificance | Variant classification in the ClinVar database |
| AlleleCountsGnomadExome | Variant allele count in the gnomAD exome database |
| AlleleCountsGnomadGenome | Variant allele count in the gnomAD genome database |
| AlleleCounts1000Genomes | Variant allele count in the 1000 Genomes database |
| MaxDatabaseAlleleCounts | Maximum variant allele count across the three databases |
| GermlineFilterDatabase | TRUE if the variant was filtered by the database filter |
| GermlineFilterProxi | TRUE if the variant was filtered by the proxi filter |
| CodingVariant | TRUE if the variant is in the coding region |
| Nonsynonymous | TRUE if the variant has any transcript annotations with nonsynonymous consequences |
| IncludedinTMBNumerator | TRUE if the variant is used in the TMB calculation |
| Status | Germline_DB or Germline_Proxi if the variant was filtered by the database or the proxi filter, respectively. Somatic_Putative_CH if the variant was predicted to be associated with clonal hematopoiesis (CH). Somatic if the variant was not determined to be germline or CH. |
| ProteinChange | p.HGVS |
| CDSChange | c.HGVS |
| Exons | Exon where the variant is located |
| Consequence | Variant consequence |
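To list the variants that contribute to the TMB numerator outside a spreadsheet, the trace file can be filtered with a few lines of Python. This sketch assumes the file is tab-delimited with a header row and that the flag is written as TRUE or FALSE. The function name is hypothetical, and the column header is matched case-insensitively because its capitalization may differ between files.

```python
import csv
import io


def eligible_trace_rows(tsv_text, column="IncludedInTMBNumerator"):
    """Return the trace rows whose TMB numerator flag is TRUE.

    tsv_text: full contents of a tmb.trace.tsv file as a string.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    # Map lower-cased header names to the spelling used in the file.
    lookup = {name.lower(): name for name in (reader.fieldnames or [])}
    key = lookup[column.lower()]  # raises KeyError if the column is missing
    return [row for row in reader if row[key].strip().upper() == "TRUE"]
```

A typical call would read the file contents with `open(path, newline="").read()` and pass the resulting string to `eligible_trace_rows`; each returned row is a dictionary keyed by the column names in the table above.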

4. TMB Max Somatic VAF file

This file reports the variant with the maximum somatic VAF, using the same format as the TMB Trace File.
