Run on Multiple DRAGEN Servers

DRAGEN TruSight Oncology 500 Analysis Software can run in parallel on multiple DRAGEN servers to decrease overall processing time. This uses a three-stage process called scatter/gather, which consists of demultiplexing, analysis, and result gathering.

Demultiplex and Scatter

The first stage is demultiplexing. Demultiplexing runs once on the entire run folder, generates FASTQ files for each sample in the run, and then separates sample files into respective folders. Once complete, note the output directory containing the sample directories holding the FASTQ files.

The process for scattering the analysis on multiple DRAGEN servers is as follows:

1. Determine how many DRAGEN servers are available to run.
2. Run demultiplexing:
   a. on a single DRAGEN server; or
   b. if the flow cell was loaded with individually addressable lanes, follow the tip below to avoid copying data across servers.

To run scatter for flow cells loaded with individually addressable lanes (i.e., using the NovaSeq 6000 XP workflow), create one modified sample sheet per server, each containing a subset of the lanes. For example, on an S2 flow cell, create two modified sample sheets, one containing the samples from lane 1 and the other containing the samples from lane 2. With this approach, only the sample sheet is modified and no FASTQ files are copied between servers. Use the commands that start from the run folder (--runFolder) without the --demultiplexOnly option. Because demultiplexing is then performed once per server, the entire run folder must be copied to each analysis server.
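The lane-based split described above could be sketched roughly as follows. This is an illustration only, not an Illumina-provided tool: the function name subset_sample_sheet_by_lane is hypothetical, and the sketch assumes a simplified sample sheet whose data section starts with a line beginning [Data], followed by a CSV header that contains a Lane column, with no embedded commas in any field. Check the section and column names against the actual sample sheet before relying on it.

```shell
#!/bin/sh
# Sketch: keep only the rows for one lane in the data section of a sample sheet.
# Assumptions (not from the original documentation): the data section is
# introduced by a line whose first field is "[Data]", the next line is a CSV
# header with a "Lane" column, and fields contain no embedded commas.
subset_sample_sheet_by_lane() {
  lane="$1"
  input="$2"
  awk -F',' -v lane="$lane" '
    # Any section marker: copy it and record whether we entered [Data].
    /^\[/ { in_data = ($1 == "[Data]"); header_seen = 0; print; next }
    # Lines outside the data section are copied unchanged.
    !in_data { print; next }
    # First line of the data section is the header: locate the Lane column.
    !header_seen {
      header_seen = 1
      for (i = 1; i <= NF; i++) if ($i == "Lane") lane_col = i
      print; next
    }
    # Data rows: keep only those whose Lane value matches.
    lane_col && $lane_col == lane { print }
  ' "$input"
}

# Example (hypothetical file names):
#   subset_sample_sheet_by_lane 1 SampleSheet.csv > SampleSheet_lane1.csv
#   subset_sample_sheet_by_lane 2 SampleSheet.csv > SampleSheet_lane2.csv
```

If the header has no Lane column, the sketch emits the header but no data rows, which makes the problem visible rather than silently keeping every sample.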

3. Transfer the FASTQ folder output from the original DRAGEN server to the additional servers (not needed if option 2b was used).
   - Find the FASTQ folder at: Logs_Intermediates/FastqGeneration.
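Before transferring or splitting work, it can help to confirm which per-sample directories demultiplexing produced. The following is a minimal sketch that lists the immediate subdirectories of the FASTQ folder. It assumes each subdirectory corresponds to one sample, as the demultiplexing description suggests; the function name list_sample_dirs is hypothetical and not part of the DRAGEN software.

```shell
#!/bin/sh
# Sketch: print the name of every immediate subdirectory of the FASTQ folder,
# one per line. Assumption: each subdirectory holds the FASTQ files of one sample.
list_sample_dirs() {
  fastq_root="${1%/}"            # drop a trailing slash, if any
  for dir in "$fastq_root"/*/; do
    [ -d "$dir" ] || continue    # unmatched glob stays literal; skip it
    basename "$dir"
  done
}

# Example (path pattern taken from the commands in this section):
#   list_sample_dirs /staging/{DemultiplexAnalysisFolderName}/Logs_Intermediates/FastqGeneration
```

Run this only after demultiplexing has finished, so that the listing reflects the complete output.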

Moving or modifying files during an analysis may cause the analysis to fail or provide incorrect results.

Analysis

Run analysis software using the --fastqFolder option on both the original and additional DRAGEN servers.

Option 1: Copy the original SampleSheet.csv to each server. Then pass the subset of samples/pairs to run on each DRAGEN server to the Bash script with the --sampleOrPairIDs option.
Option 2: Copy the SampleSheet.csv to each DRAGEN server and modify it to contain only the samples/pairs to run on that server. The software verifies that all samples in the sample sheet are contained within the FASTQ folders unless the --sampleOrPairIDs command-line option is present in the analysis launch. Failure to account for this check results in an error.

Gather Results

Copy the results from demultiplexing and each analysis run onto a single server, and then generate the final /Results directory, which contains the aggregated results. To do so, run the launch script with the --gather option followed by the output directories of the demultiplexing step and each individual analysis run.
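Because the number of analysis outputs depends on how many servers were used, the gather command line can be assembled programmatically. The sketch below only prints the command as a dry run and does not execute it. The script name and flags are taken from the commands listed in this section; the function name build_gather_cmd and all paths passed to it are placeholders, and paths containing spaces are not handled.

```shell
#!/bin/sh
# Sketch: print a gather command for one demultiplexing output directory and
# one or more analysis output directories. Nothing is executed.
build_gather_cmd() {
  if [ "$#" -lt 6 ]; then
    echo "usage: build_gather_cmd OUT RESOURCES RUNFOLDER SAMPLESHEET DEMUX_OUT NODE_OUT..." >&2
    return 1
  fi
  out_dir="$1"; resources="$2"; run_folder="$3"; sample_sheet="$4"; demux_out="$5"
  shift 5
  printf '%s' "DRAGEN_TSO500_2.6.0.sh --analysisFolder $out_dir --resourcesFolder $resources --runFolder $run_folder --sampleSheet $sample_sheet --gather $demux_out"
  printf ' %s' "$@"   # append every remaining analysis output directory
  printf '\n'
}

# Example (placeholder paths):
#   build_gather_cmd /Gathered_Results /staging/illumina/DRAGEN_TSO500/resources \
#     /staging/MyRun /staging/illumina/SampleSheet.csv \
#     /Demultiplex_Output /Node1_Output /Node2_Output
```

Printing the command first makes it easy to review the directory list before launching the gather step.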

Commands for Multi-node Analysis

| Step | Command |
| --- | --- |
| Demultiplexing | DRAGEN_TSO500_2.6.0.sh --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources --hashtableFolder /staging/illumina/DRAGEN_TSO500/ref_hashtable --runFolder /staging/{RunFolderName} --analysisFolder /staging/{DemultiplexAnalysisFolderName} --demultiplexOnly --sampleSheet /staging/illumina/{SampleSheetName} |
| Analysis (one server) | DRAGEN_TSO500_2.6.0.sh --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources --hashtableFolder /staging/illumina/DRAGEN_TSO500/ref_hashtable --fastqFolder /staging/{DemultiplexAnalysisFolderName}/Logs_Intermediates/FastqGeneration/ --analysisFolder /staging/{Node1AnalysisFolderName} --sampleSheet /staging/illumina/{SampleSheetName} --sampleOrPairIDs Pair_1,Pair_2 |
| Analysis (additional servers) | |
| Gather | DRAGEN_TSO500_2.6.0.sh --analysisFolder /Gathered_Results --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources --runFolder /staging/{RunFolderName}/ --sampleSheet /staging/illumina/{SampleSheetName} --gather /Demultiplex_Output /Node1_Output /Node2_Output |
