Launching Analysis

Analysis Launch on Standalone DRAGEN Server

Start the DRAGEN TruSight Oncology 500 Analysis Software with the DRAGEN_TSO500-2.6.0.sh Bash script. The script is installed in the /usr/local/bin directory. The Bash script is executed on the command line and runs the software with Docker (or Apptainer if specified).

For arguments, refer to Command-Line Options. You can start from BCL files or from the FASTQ folder produced by BCL Convert. The following requirements apply for both methods:

  • Path to the sequencing run or FASTQ folder. Copy the run or FASTQ folder to the DRAGEN server into the staging folder with the following recommended organization: /staging/runs/{RunID}. You can copy the run folder onto the DRAGEN server using Linux commands such as rsync. The sample sheet within the run folder is used unless otherwise specified through the command line.

  • Run folder must be intact. Refer to Starting from BCL Files for input requirements.

  • If the analysis output folder path is different from the default, provide the analysis output folder path. Refer to Command-Line Options.

Before running the analysis, confirm that the output directory for the software to write to is empty and does not include results of previous analyses.
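The empty-output-folder check can be scripted before launch. The following is a minimal sketch, not part of the DRAGEN TSO 500 software: the function name output_dir_is_clean is hypothetical, and the commented rsync line is only one possible way to stage a run.

```shell
#!/bin/sh
# Sketch: refuse to launch when the chosen analysis folder already holds files.
# output_dir_is_clean is a hypothetical helper, not a command shipped with the software.

# Succeeds (exit 0) when the directory does not exist yet or has no entries.
output_dir_is_clean() {
    dir=$1
    if [ ! -e "$dir" ]; then
        return 0
    fi
    if [ ! -d "$dir" ]; then
        return 1
    fi
    # ls -A lists every entry except . and .., including dotfiles.
    [ -z "$(ls -A "$dir")" ]
}

# Example use (commented out so the sketch has no side effects):
# rsync -a /path/to/{RunID}/ /staging/runs/{RunID}/
# output_dir_is_clean /staging/{AnalysisFolderName} || echo "Output folder is not empty" >&2
```

The check treats hidden files as content, so leftovers from a previous analysis are caught even when they do not show up in a plain ls.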

Storage Requirements

For optimal performance, run analysis on data stored locally on the DRAGEN server. Analysis of data stored on NAS can take longer and performance can be less reliable.

The DRAGEN server provides an NVMe SSD in the /staging directory to use as the software output directory. Network-attached storage is required for long-term storage.

When running the DRAGEN TruSight Oncology 500 Analysis Software, use the default settings or set the --analysisFolder command-line option to a directory in /staging to make sure the DRAGEN server processes read and write data on the NVMe SSD.

Before beginning analysis, develop a strategy to copy data from the DRAGEN server to network-attached storage. Delete output data on the DRAGEN server as soon as possible after it has been copied.

The following table lists the run and analysis output sizes for each sequencing system at a read length of 101 bp:

| Sequencing System | Run Folder Output (GB) | Analysis Output (GB) | Minimum Disk Space (GB) |
| --- | --- | --- | --- |
| NextSeq 500/550/550Dx (RUO) HO Flow Cell | 32-55 | 82-85 | 150 |
| NovaSeq 6000/6000Dx (RUO) SP Flow Cell | 85-100 | 250-374 | 300 |
| NovaSeq 6000/6000Dx (RUO) S1 Flow Cell | 164-200 | 360-665 | 800 |
| NovaSeq 6000/6000Dx (RUO) S2 Flow Cell | 290-460 | 890-1600 | 1500 |
| NovaSeq 6000/6000Dx (RUO) S4 Flow Cell | 800-1200 | 2700-4100 | 3000 |
| NovaSeq X 1.5B | 213 | 352 | 800 |
| NovaSeq X 10B | 1100 | 1800 | 3000 |
| NovaSeq X 25B | 1800 | 3300 | 4000 |
| NextSeq 1000/2000 | 41 | 107 | 150 |

When launching the analysis, the software checks that the minimum disk space required is available. If the minimum disk space is not available, the software shows an error message and prevents analysis from starting. If disk space is exhausted during a run, the run shows an error and stops analyzing.
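The software performs its own disk space check at launch, but the same condition can be tested beforehand. The sketch below is an independent pre-check under stated assumptions: the helper names available_gb and has_min_space are hypothetical, and 1 GB is taken as 1024 × 1024 KB, which may differ slightly from how the software counts.

```shell
#!/bin/sh
# Sketch: compare free space on a path with a minimum taken from the table above.
# available_gb and has_min_space are hypothetical helpers, not part of the software.

# Prints the free space, in whole GB, of the file system that holds the path.
available_gb() {
    # df -P prints one POSIX-format line per file system; -k reports 1024-byte blocks.
    # Field 4 of the data line is the available space.
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Succeeds when at least the required number of GB is free.
has_min_space() {
    path=$1
    required_gb=$2
    [ "$(available_gb "$path")" -ge "$required_gb" ]
}

# Example use for a NovaSeq 6000 S1 run (minimum 800 in the table):
# has_min_space /staging 800 || echo "Not enough free space in /staging" >&2
```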

Moving or modifying files during an analysis may cause the analysis to fail or provide incorrect results.

Command-Line Options

You can use the following command-line options with DRAGEN TruSight Oncology 500 Analysis Software.

To learn more about the input requirements, use the --help command-line option.

| Option | Required | Description |
| --- | --- | --- |
| --help | No | Displays a help screen with available command-line options. |
| --analysisFolder | No | Path to the local analysis folder. The default location is /staging/DRAGEN_TSO500_2.6.0_Analysis_{timestamp}. If not using the default location, provide the full path to the local analysis folder. The folder must have sufficient space and must be on an NVMe SSD, for example the /staging directory on the DRAGEN server. Refer to the table in Storage Requirements for minimum disk space requirements. |
| --resourcesFolder | No | Path to the resource folder location. The default location is /staging/illumina/DRAGEN_TSO500_2.6.0/resources. If not using the default location, enter the full path to the resource folder. |
| --runFolder | Yes | Required when --fastqFolder is not specified. Provide the full path to the local run folder. |
| --fastqFolder | Yes | Required when --runFolder is not specified. Provide the full path to the local FASTQ folder. Analysis starts at this location. |
| --user | No | Optional for Docker. Specify the user ID to be used within the Docker container. |
| --version | No | Displays the version of the software. |
| --sampleSheet | No | Provide the full path to the sample sheet, including the file name, if the sample sheet is not provided as SampleSheet.csv in the run folder. |
| --sampleOrPairIDs | No | Provide the comma-delimited sample or pair IDs, with no spaces, that should be processed on this node. For example, Pair_1,Pair_2,Sample_1. |
| --demultiplexOnly | No | Demultiplex to generate FASTQ files only, without additional analysis. |
| --gather | No | Follow this option with the directories whose results should be gathered into a single Results folder. |
| --hashtableFolder | No | Defaults to the DRAGEN hash table location created upon install. If not using the default location, enter the hash table location. |

Note:

  • Use full paths when specifying the file paths in the command line.

  • Avoid special characters such as &, *, #, and spaces.

  • When starting from BCL files, only the run folder needs to be specified. The immediate parent directory containing the BCL files does not need to be specified.
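The path rules in this note can be checked before a launch. The following sketch rejects relative paths and the four characters the note names explicitly. The note says "such as", so the list is not exhaustive, and the helper name is_safe_path is hypothetical.

```shell
#!/bin/sh
# Sketch: validate a path against the rules in the note above.
# is_safe_path is a hypothetical helper; extend the pattern list as needed.

# Succeeds when the path is absolute and contains none of: & * # space
is_safe_path() {
    case $1 in
        /*) ;;              # full (absolute) path, continue checking
        *) return 1 ;;      # relative path
    esac
    case $1 in
        *"&"*|*"*"*|*"#"*|*" "*) return 1 ;;
    esac
    return 0
}

# Example use:
# is_safe_path "/staging/runs/{RunID}" || echo "Unsupported path" >&2
```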

When running the analysis software over SSH, Illumina recommends using additional software, such as screen or tmux, to prevent unexpected termination of the analysis.

  1. Wait for any running DRAGEN TruSight Oncology 500 Analysis Software containers to complete before launching a new analysis. Run the following command to generate a list of running containers:

        docker ps

  2. Select from one of the following options:

  • Start from BCL files in the run folder with the sample sheet included in the run folder.

        DRAGEN_TSO500-2.6.0.sh \
          --runFolder /staging/{RunFolderName} \
          --analysisFolder /staging/{AnalysisFolderName}

  • Start from BCL files in the run folder with the sample sheet located in a folder other than the run folder.

        DRAGEN_TSO500-2.6.0.sh \
          --runFolder /staging/{RunFolderName} \
          --analysisFolder /staging/{AnalysisFolderName} \
          --sampleSheet /staging/{SampleSheetName}.csv

  • Start from BCL files in the run folder with a different sample sheet and demultiplexing only.

        DRAGEN_TSO500-2.6.0.sh \
          --runFolder /staging/{RunFolderName} \
          --analysisFolder /staging/{AnalysisFolderName} \
          --sampleSheet /staging/{SampleSheetName}.csv \
          --demultiplexOnly

  • Start from FASTQ with the sample sheet included in the FASTQ folder and with different resources and hash table folders.

        DRAGEN_TSO500-2.6.0.sh \
          --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources \
          --hashtableFolder /staging/illumina/DRAGEN_TSO500/ref_hashtable \
          --fastqFolder /staging/{FastqFolderName} \
          --analysisFolder /staging/{AnalysisFolderName}

  • Start from the FASTQ folder with the sample sheet included in the FASTQ folder and a subset of samples or pairs.

        DRAGEN_TSO500-2.6.0.sh \
          --fastqFolder /staging/{FastqFolderName} \
          --analysisFolder /staging/{AnalysisFolderName} \
          --sampleOrPairIDs "Pair_1,Pair_2"
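Any of these launch commands can be wrapped in a terminal multiplexer so that the analysis survives a dropped SSH connection. The sketch below uses tmux. The helper name launch_in_tmux, the session name, and the example paths are assumptions, and by default the function only prints the command it would run.

```shell
#!/bin/sh
# Sketch: start an analysis inside a detached tmux session.
# launch_in_tmux is a hypothetical helper; session name and paths are placeholders.

# Prints the tmux command line by default; set DRY_RUN=0 to execute it instead.
launch_in_tmux() {
    session=$1
    run_folder=$2
    analysis_folder=$3
    cmd="DRAGEN_TSO500-2.6.0.sh --runFolder $run_folder --analysisFolder $analysis_folder"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf 'tmux new-session -d -s %s "%s"\n' "$session" "$cmd"
    else
        # -d starts the session detached, -s names it.
        tmux new-session -d -s "$session" "$cmd"
    fi
}

launch_in_tmux tso500 /staging/runs/RunA /staging/RunA_Analysis
# Reattach later with: tmux attach -t tso500
```

Because the session is detached, the analysis continues after logout. Reattaching shows the console output produced so far.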

Starting from BCL Files

If starting from BCL (*.bcl) files, DRAGEN TruSight Oncology 500 Analysis Software requires the run folder to contain certain files and folders. These inputs are required for Docker.

The run folder contains data from the sequencing run. Make sure that the folder contains the following files and folders:

| Folder/File | Description |
| --- | --- |
| Config folder | Configuration files. |
| Data folder | *.bcl files. |
| Images folder | [Optional] Raw sequencing image files. |
| InterOp folder | InterOp metric files. |
| Logs folder | [Optional] Sequencing system log files. |
| RTALogs folder | Real-Time Analysis (RTA) log files. |
| RunInfo.xml file | Run information. |
| RunParameters.xml file | Run parameters. |
| SampleSheet.csv file | Sample information. If you want to use a sample sheet that is not in the run folder or a sample sheet named something other than SampleSheet.csv, provide the full path. |
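A quick check that a copied run folder is intact can catch incomplete transfers before launch. The sketch below tests only the non-optional items from the table. SampleSheet.csv is left out because it can be supplied separately with --sampleSheet. The helper name check_run_folder is hypothetical, and the on-disk spelling InterOp is an assumption to verify against your own run folders.

```shell
#!/bin/sh
# Sketch: report required run folder items that are missing.
# check_run_folder is a hypothetical helper, not part of the software.

# Prints each missing item to stderr; succeeds only when nothing is missing.
check_run_folder() {
    run=$1
    missing=0
    # Folder names follow the table above; InterOp casing is an assumption.
    for d in Config Data InterOp RTALogs; do
        [ -d "$run/$d" ] || { echo "missing folder: $d" >&2; missing=1; }
    done
    for f in RunInfo.xml RunParameters.xml; do
        [ -f "$run/$f" ] || { echo "missing file: $f" >&2; missing=1; }
    done
    return "$missing"
}

# Example use:
# check_run_folder /staging/runs/{RunID} || echo "Run folder is incomplete" >&2
```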

Starting from FASTQ Files

The following inputs are required for running the DRAGEN TruSight Oncology 500 Analysis Software using FASTQ (*.fastq) files. The requirements apply to Docker.

  • Full path to an existing FASTQ folder.

  • The FASTQ folder structure conforms to the folder structure in FASTQ File Organization.

  • The sample sheet is in the FASTQ folder path, or you can set the path to the sample sheet with the --sampleSheet override command line option.

Make sure there is sufficient disk space for the analysis to complete. Refer to the --help command line argument details for disk space requirements.

Use BCL Convert to produce FASTQ files for DRAGEN TruSight Oncology 500 Analysis Software. Using bcl2fastq does not produce the same results and is discouraged.

Make sure that BCL Convert is set to write UMI sequences to the read headers in the FASTQ files.

FASTQ File Organization

Store FASTQ files in individual subfolders that correspond to a specific Sample_ID. Keep file pairs together in the same folder. Alternatively, store all FASTQ files together in a single flat folder.

The DRAGEN TruSight Oncology 500 Analysis Software requires separate FASTQ files per sample. Do not merge FASTQ files.

The instrument generates two FASTQ files per flow cell lane (Read 1 and Read 2), so a sample sequenced across four lanes has eight FASTQ files. FASTQ files are named as in the following example:

Sample1_S1_L001_R1_001.fastq.gz

  • Sample1 represents the Sample ID.

  • The S in S1 means sample, and the 1 in S1 is based on the order of samples in the sample sheet, so S1 is the first sample.

  • L001 represents the flow cell lane number.

  • The R in R1 means Read, so R1 refers to Read 1.
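The naming convention can be split into its fields with plain shell parameter expansion. The sketch below parses from the right so that Sample IDs containing underscores are preserved. The helper name parse_fastq_name is hypothetical.

```shell
#!/bin/sh
# Sketch: split a FASTQ file name into Sample ID, sample number, lane, and read.
# parse_fastq_name is a hypothetical helper, not part of the software.

# Prints: <Sample ID> <sample number> <lane> <read>
parse_fastq_name() {
    base=${1##*/}              # drop any leading directory
    base=${base%.fastq.gz}     # drop the extension
    base=${base%_*}            # drop the trailing _001 segment
    read_id=${base##*_}        # R1 or R2
    base=${base%_*}
    lane=${base##*_}           # for example L001
    base=${base%_*}
    sample_num=${base##*_}     # for example S1
    base=${base%_*}            # what remains is the Sample ID
    printf '%s %s %s %s\n' "$base" "$sample_num" "$lane" "$read_id"
}

parse_fastq_name Sample1_S1_L001_R1_001.fastq.gz
# prints: Sample1 S1 L001 R1
```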

Run on Multiple DRAGEN Servers

DRAGEN TruSight Oncology 500 Analysis Software can run subsets of samples on different DRAGEN servers to decrease overall processing time. This is possible using a three-stage process called scatter/gather, which consists of demultiplexing, analysis, and result gathering.

The first stage is demultiplexing. Demultiplexing runs once on the entire run folder, generates FASTQ files for each sample in the run, and then separates sample files into respective folders. Once complete, note the output directory containing the sample directories holding the FASTQ files.

The process for scattering the analysis on multiple DRAGEN servers is as follows:

  1. Determine how many DRAGEN servers are available to run.

  2. Run demultiplexing on a single DRAGEN server.

Moving or modifying files during an analysis may cause the analysis to fail or provide incorrect results.

To analyze runs on multiple DRAGEN servers using the NovaSeq 6000 Xp workflow, modify the sample sheet to include a subset of the lanes. For example, on an S2 flow cell, create two modified sample sheets, one containing the samples from lane 1 and the other the samples from lane 2. With this strategy, only the sample sheet is modified and no FASTQ files are copied between servers. Use the start from run folder commands without the --demultiplexOnly option. The entire run folder must be copied to each analysis server because demultiplexing is performed once per server.

  3. Transfer the FASTQ folder output (Logs_Intermediates/FastqGeneration in the demultiplexing analysis folder) from the original DRAGEN server to the additional servers.

  4. Run the analysis software using the --fastqFolder option on both the original and additional DRAGEN servers.

    • Option 1: Copy the original SampleSheet.csv to each server. Then use the --sampleOrPairIDs option to provide the Bash script on each DRAGEN server with the subset of samples/pairs to run.

    • Option 2: Copy the SampleSheet.csv to each DRAGEN server and modify it to contain only the samples/pairs to run. The software verifies that all samples in the sample sheet are contained within the FASTQ folders unless the --sampleOrPairIDs command-line option is present in the analysis launch. Failure to account for these checks results in an error.

  5. Copy the results from demultiplexing and each analysis run onto a single server, and then generate the final /Results directory, which contains the aggregated results. Enter the --gather option followed by the output directories of the demultiplexing step and each individual analysis run.

Commands for Multinode Analysis

| Step | Command |
| --- | --- |
| Demultiplexing | DRAGEN_TSO500-2.6.0.sh --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources --hashtableFolder /staging/illumina/DRAGEN_TSO500/ref_hashtable --runFolder /staging/{RunFolderName} --analysisFolder /staging/{DemultiplexAnalysisFolderName} --demultiplexOnly --sampleSheet /staging/illumina/{SampleSheetName} |
| Analysis (one server) | DRAGEN_TSO500-2.6.0.sh --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources --hashtableFolder /staging/illumina/DRAGEN_TSO500/ref_hashtable --fastqFolder /staging/{DemultiplexAnalysisFolderName}/Logs_Intermediates/FastqGeneration/ --analysisFolder /staging/{Node1AnalysisFolderName} --sampleSheet /staging/illumina/{SampleSheetName} --sampleOrPairIDs Pair_1,Pair_2 |
| Analysis (additional servers) | DRAGEN_TSO500-2.6.0.sh --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources --hashtableFolder /staging/illumina/DRAGEN_TSO500/ref_hashtable --fastqFolder /staging/{DemultiplexAnalysisFolderName}/Logs_Intermediates/FastqGeneration/ --analysisFolder /staging/{Node2AnalysisFolderName} --sampleSheet /staging/illumina/{SampleSheetName} --sampleOrPairIDs Pair_3 |
| Gather | DRAGEN_TSO500-2.6.0.sh --analysisFolder /Gathered_Results --resourcesFolder /staging/illumina/DRAGEN_TSO500/resources --runFolder /staging/{RunFolderName}/ --sampleSheet /staging/illumina/{SampleSheetName} --gather /Demultiplex_Output /Node1_Output /Node2_Output |

Analysis Launch on ICA

Methods for Launching Analysis

Illumina Connected Analytics (ICA) supports the following methods for launching DRAGEN TruSight Oncology 500 Analysis Software.

  • Auto-launch—Stream run data directly from the instrument to ICA via a specially configured sample sheet and automatically begin DRAGEN TSO 500 analysis.

  • Manual launch—Initiate DRAGEN TSO 500 analysis on ICA using the run files and sample sheet files in the project.

For more information about using ICA or BaseSpace Sequence Hub, refer to the following support pages on the Illumina support site.

  • Illumina Connected Analytics support site page

  • BaseSpace Sequence Hub support site page

Auto-Launch of DRAGEN TSO 500 Analysis on ICA

Auto-launch Prerequisites and Workflow

The BaseSpace Sequence Hub setting for run monitoring and storage must be selected on the instrument to use DRAGEN TSO 500 analysis auto-launch. For information on preparing your instrument for DRAGEN TSO 500 auto-launch, refer to the documentation for your instrument.

  1. Use the BaseSpace Sequence Hub Run Planning tool or the sample sheet templates provided on the support page to create and export a sample sheet.

    1. If the BaseSpace Run Planning tool is not available in your region, use the sample sheet template.

  2. Import the sample sheet to the instrument and start the sequencing run. Refer to ICA Auto-launch Sample Sheet Requirements for sample sheet guidance.

    1. Data is uploaded to BaseSpace Sequence Hub and then pushed to ICA. You can monitor the run in BaseSpace Sequence Hub.

    2. Analysis auto-launches in ICA when sequencing and the upload are complete. You can monitor the status of the analysis in BaseSpace Sequence Hub or ICA.

    3. If necessary, you can requeue the analysis via BaseSpace Sequence Hub.

  3. View the analysis output results in either BaseSpace Sequence Hub or ICA.

To avoid invalid sample sheet configurations, Illumina recommends using the BaseSpace Run Planning tool to generate sample sheets. Using an invalid sample sheet can result in failed runs and analyses.

BaseSpace Sequence Hub Requirements for ICA Auto-Launch

The BaseSpace Run Planning tool is a multi-step workflow that generates a sample sheet for export, suitable for either manual launch or auto-launch. It requires the following additional settings:

  • Access to BaseSpace Sequence Hub.

  • ICA Run Storage is enabled under BaseSpace Sequence Hub settings.

Refer to the BaseSpace Sequence Hub support site page for information on setting up a BaseSpace Sequence Hub project.

Requeue Analysis

You can requeue analysis of a run via the run's Summary page in BaseSpace Sequence Hub.

Refer to the BaseSpace Sequence Hub support site page for more information on requeuing an analysis.

Minimum Storage Requirements on ICA

| Sequencing System | Minimum Disk Space (GB) |
| --- | --- |
| NextSeq 500/550/550Dx (RUO) HO Flow Cell | 350 |
| NovaSeq 6000/6000Dx (RUO) SP Flow Cell | 500 |
| NovaSeq 6000/6000Dx (RUO) S1 Flow Cell | 1100 |
| NovaSeq 6000/6000Dx (RUO) S2 Flow Cell | 2500 |
| NovaSeq 6000/6000Dx (RUO) S4 Flow Cell | 4300 |
| NovaSeq X 1.5B | 2000 |
| NovaSeq X 10B | 4300 |
| NovaSeq X 25B | 8400 |
| NextSeq 1000/2000 | 350 |

Refer to the Software Registration page for information on how to manage accounts and subscriptions.

Guided Examples

Please review these guided examples of using DRAGEN TSO 500 Analysis Software with auto-launch on ICA:

  • NovaSeq 6000Dx: TSO 500 Auto-launch Analysis in Cloud

  • NextSeq 500/550Dx: TSO 500 and Connected Insights Auto-launch Analysis in Cloud

Manual Launch of DRAGEN TSO 500 Analysis on ICA

How to Launch Analysis

  1. Create a Project: The Project can be specific to the DRAGEN TruSight Oncology 500 pipeline, or it can contain multiple Pipelines and/or Tools. For information on creating Projects, refer to the Projects section in Illumina Connected Analytics help.

ICA standard storage is used by default as soon as the Project is saved. To connect a different storage source, set it up before creating your Project. For details and options, refer to the Storage section in Illumina Connected Analytics help.

  2. Edit Project and Add Bundle: Edit the Project and add the bundle titled "DRAGEN TSO 500 v2.6.0 (XX)", where XX is a two-letter code designating the region from which you are launching the analysis. Adding the Bundle automatically adds the pipeline and associated resource files and datasets to the Project. For information on Bundles, refer to the Bundles section in Illumina Connected Analytics help.

After adding the Bundle to the Project, an example dataset becomes available in the Demo_Data folder for the Project.

  3. Upload the sequencing data: For information on viewing and uploading data, refer to the Data section in Illumina Connected Analytics help.

  4. Start Analysis: In the Project, navigate to Pipelines, select the TSO 500 v2.6.0 Pipeline, and then select "Start New Analysis". Set up the new analysis by configuring the parameters listed in Analysis Parameters on ICA. When the required fields are completed, start the analysis.

  5. Download Results: After analysis is complete, navigate to the results in the configured output location.

Please see the Illumina Support Shorts for guidance on how to set up and run DRAGEN TSO 500 RUO analysis on ICA.

Analysis Parameters on ICA

To launch an analysis via the ICA user interface, configure a DRAGEN TSO 500 pipeline analysis with the following parameters.

| Parameter Name | Description |
| --- | --- |
| User Reference | The analysis run name. |
| User Tags | Text labels to help index the analysis. |
| Notify me when task is completed | Option to receive an email notification when analysis is complete. |
| Output Folder | The path to the analysis output folder. The default path is the project output folder. |
| Entitlement Bundle | Automatically populated from the project details. |
| Sample Sheet | Select a sample sheet in CSV format for the analysis. Note: Sample Sheet selection is optional when starting from a run folder and required when submitting a FASTQ folder. |
| Input Folder | The run folder or FASTQ folder that contains files to analyze. |
| FASTQ List CSV | Do not use. This parameter applies only to auto-launch TSO 500 analysis from FASTQs after BCL auto-launch. |
| Starts from FASTQ | True for analysis performed on files in the FASTQ folder. False for analysis performed on files in the run folder. |
| Sample or Pair IDs | Optional subset of Sample IDs or Pair IDs to analyze. |
| Sample List | Do not use. This parameter applies only to auto-launch TSO 500 analysis from FASTQs after BCL auto-launch. |
| Storage Size | The storage size to allocate for the analysis. The default and recommended value is Large. |

For information about using pipelines, refer to the Illumina Connected Analytics support site page.