Run on Multiple DRAGEN Servers

DRAGEN TruSight Oncology 500 Analysis Software can run in parallel on multiple DRAGEN servers to decrease overall processing time. This uses a three-stage process called scatter/gather, which consists of demultiplexing, analysis, and result gathering.

Demultiplex and Scatter

The first stage is demultiplexing. Demultiplexing runs once on the entire run folder, generates FASTQ files for each sample in the run, and separates the sample files into their respective folders. When demultiplexing completes, note the output directory that contains the sample directories holding the FASTQ files.

The process for scattering the analysis on multiple DRAGEN servers is as follows:

  1. Determine how many DRAGEN servers are available to run.

  2. Run demultiplexing using one of the following options:

    a. on a single DRAGEN server; or

    b. if the flow cell was loaded with individually addressable lanes, follow the tip below to avoid copying data across servers.

  3. Transfer the FASTQ folder output from the original DRAGEN server to the additional servers (not needed if option 2b was used).

    • Find the FASTQ folder at: Logs_Intermediates/FastqGeneration.
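The transfer in step 3 can be sketched as follows. This is a minimal illustration, not part of the analysis software: it assumes rsync over SSH is available between servers, that the same parent path already exists on each target, and the function name, host names, and folder name are hypothetical. The function only prints the commands so they can be reviewed before being run.

```shell
#!/usr/bin/env bash
# Sketch: print one rsync command per additional server for the FASTQ folder.
# print_fastq_transfer_cmds, the host names, and /staging/DemuxOut are
# hypothetical placeholders; substitute the real demultiplexing output folder.

print_fastq_transfer_cmds() {
  local demux_dir="$1"; shift
  # Strip any trailing slash, then append the documented FASTQ location.
  local src="${demux_dir%/}/Logs_Intermediates/FastqGeneration"
  local host
  for host in "$@"; do
    # -a preserves the per-sample directory layout; the trailing slashes copy
    # the folder contents into the same path on the target server.
    printf 'rsync -a %s/ %s:%s/\n' "$src" "$host" "$src"
  done
}

print_fastq_transfer_cmds /staging/DemuxOut dragen-node2 dragen-node3
```

Keeping the same path on every server means the --fastqFolder value can stay identical in each analysis command.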

Analysis

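Run the analysis software with the --fastqFolder option on the original DRAGEN server and on each additional DRAGEN server. Use one of the following options to control which samples/pairs each server processes: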

  • Option 1: Copy the original SampleSheet.csv to each server. Then pass the Bash script on each DRAGEN server the subset of samples/pairs that server should run (for example, with the --sampleOrPairIDs option).

  • Option 2: Copy the SampleSheet.csv to each DRAGEN server and modify it so that it contains only the samples/pairs to run on that server. The software verifies that all samples in the sample sheet are present in the FASTQ folders unless the --sampleOrPairIDs command-line option is included in the analysis launch. Failure to account for this check results in an error.
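For Option 1, the sample/pair IDs need to be divided among the servers. One way to do that is a round-robin split, sketched below. The helper split_ids is hypothetical and not part of the analysis software; the IDs are the example values used in the commands on this page, and the split itself is only one possible assignment.

```shell
#!/usr/bin/env bash
# Sketch: deal sample/pair IDs round-robin across N servers and print the
# --sampleOrPairIDs value each server would receive.
# split_ids is a hypothetical helper, not part of the analysis software.

split_ids() {
  local n="$1"; shift
  local -a groups=()
  local i=0 id slot
  for id in "$@"; do
    slot=$(( i % n ))
    # Append with a comma unless this is the first ID in the group.
    groups[slot]="${groups[slot]:+${groups[slot]},}${id}"
    i=$(( i + 1 ))
  done
  for (( slot = 0; slot < n; slot++ )); do
    # Skip servers that received no IDs (more servers than samples).
    if [ -n "${groups[slot]:-}" ]; then
      printf 'server %d: --sampleOrPairIDs %s\n' "$(( slot + 1 ))" "${groups[slot]}"
    fi
  done
}

split_ids 2 Pair_1 Pair_2 Pair_3
```

Any assignment works as long as every sample/pair is analyzed on exactly one server; balancing by expected data volume rather than by count may be preferable when samples differ greatly in size.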

Gather Results

Copy the results from demultiplexing and from each analysis run onto a single server, and then generate the final /Results directory, which contains the aggregated results. To do so, pass the --gather option followed by the output directories of the demultiplexing step and each individual analysis run.
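Because the gather step depends on every output directory being present on the gathering server, a quick pre-check can catch an incomplete copy. The sketch below is an illustration only: check_gather_inputs is a hypothetical helper, and the directory names are the example values from the Gather command on this page.

```shell
#!/usr/bin/env bash
# Sketch: confirm that every directory to be passed to --gather exists locally.
# check_gather_inputs is a hypothetical helper, not part of the analysis software.

check_gather_inputs() {
  local dir missing=0
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      echo "missing: $dir" >&2
      missing=1
    fi
  done
  # Returns 0 only when every directory was found.
  return "$missing"
}

# Example call with the directory names used in the Gather command;
# adjust these to the real copied output paths.
if check_gather_inputs /Demultiplex_Output /Node1_Output /Node2_Output; then
  echo "all gather inputs present"
else
  echo "copy the missing directories before running --gather"
fi
```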

Commands for Multi-node Analysis

Step
Command

Demultiplexing

DRAGEN_TSO500_2.6.0.sh --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources --hashtableFolder /staging/illumina/DRAGEN_TSO500/ref_hashtable --runFolder /staging/{RunFolderName} --analysisFolder /staging/{DemultiplexAnalysisFolderName} --demultiplexOnly --sampleSheet /staging/illumina/{SampleSheetName}

Analysis (one server)

DRAGEN_TSO500_2.6.0.sh --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources --hashtableFolder /staging/illumina/DRAGEN_TSO500/ref_hashtable --fastqFolder /staging/{DemultiplexAnalysisFolderName}/Logs_Intermediates/FastqGeneration/ --analysisFolder /staging/{Node1AnalysisFolderName} --sampleSheet /staging/illumina/{SampleSheetName} --sampleOrPairIDs Pair_1,Pair_2

Analysis (additional servers)

DRAGEN_TSO500_2.6.0.sh --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources --hashtableFolder /staging/illumina/DRAGEN_TSO500/ref_hashtable --fastqFolder /staging/{DemultiplexAnalysisFolderName}/Logs_Intermediates/FastqGeneration/ --analysisFolder /staging/{Node2AnalysisFolderName} --sampleSheet /staging/illumina/{SampleSheetName} --sampleOrPairIDs Pair_3

Gather

DRAGEN_TSO500_2.6.0.sh --analysisFolder /Gathered_Results --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources --runFolder /staging/{RunFolderName}/ --sampleSheet /staging/illumina/{SampleSheetName} --gather /Demultiplex_Output /Node1_Output /Node2_Output
