dalmolingroup/euryale pipeline parameters

A pipeline for metagenomic taxonomic classification and functional annotation. Based on MEDUSA.

Input/output options

Define where the pipeline should find input data and save output data.

Parameter	Description	Type	Required
`input`	Path to a comma-separated file describing the samples in the experiment. Create this design file before running the pipeline and use this parameter to specify its location. The file must have 3 columns and a header row.	`string`	True
`outdir`	The output directory where the results will be saved. Use absolute paths when the target is storage on cloud infrastructure.	`string`	True
`email`	Email address for completion summary. Set this parameter to your email address to receive a summary email with details of the run when the workflow exits. If it is set in your user config file (`~/.nextflow/config`), you do not need to specify it on the command line for every run.	`string`
`multiqc_title`	MultiQC report title. Printed as page header, used for filename if not otherwise specified.	`string`
`save_dbs`	Save DIAMOND db to results directory after construction	`boolean`
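
As an illustration of the design file expected by `input`, the sketch below writes a 3-column CSV with a header row. The column names (`sample`, `fastq_1`, `fastq_2`) follow a common nf-core convention and are an assumption, since this section only states the column count; the sample names and paths are hypothetical.

```python
import csv
import io

# Assumed column names: this section only says "3 columns and a header row".
COLUMNS = ["sample", "fastq_1", "fastq_2"]


def render_samplesheet(rows):
    """Return CSV text for a list of 3-field rows, preceded by a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    for row in rows:
        if len(row) != len(COLUMNS):
            raise ValueError(
                f"expected {len(COLUMNS)} fields, got {len(row)}: {row!r}"
            )
        writer.writerow(row)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical sample names and file paths.
    text = render_samplesheet([
        ("sample_a", "/data/sample_a_R1.fastq.gz", "/data/sample_a_R2.fastq.gz"),
        ("sample_b", "/data/sample_b_R1.fastq.gz", "/data/sample_b_R2.fastq.gz"),
    ])
    print(text, end="")
```

The resulting text can be saved to a file whose path is then passed as `input`.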

Skip Steps

Choose to skip pipeline steps

Parameter	Description	Type
`skip_classification`	Skip taxonomic classification	`boolean`
`skip_alignment`	Skip alignment	`boolean`
`skip_functional`	Skip functional annotation	`boolean`
`skip_host_removal`	Skip host removal	`boolean`
`skip_microview`	Skip MicroView report	`boolean`
`skip_preprocess`	Skip preprocessing steps	`boolean`
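
Pipeline parameters are passed to Nextflow as `--name value` arguments, and a boolean such as the skip flags can be enabled with a bare `--name`. The following is a minimal sketch of turning a parameter dictionary into such a command line. The helper names are hypothetical, and the `nextflow run dalmolingroup/euryale` invocation is assumed from the pipeline name rather than taken from this section. Parameters set to `False` or `None` are simply omitted, so this sketch cannot switch off a parameter whose default is true.

```python
import shlex


def to_cli_args(params):
    """Render a params dict as `--name value` arguments.

    True becomes a bare `--name`; False and None are omitted, which
    leaves the pipeline default in place for that parameter.
    """
    args = []
    for name, value in params.items():
        if value is None or value is False:
            continue
        if value is True:
            args.append(f"--{name}")
        else:
            args.extend([f"--{name}", str(value)])
    return args


def build_command(params):
    """Return a shell-quoted command string (assumed invocation form)."""
    base = ["nextflow", "run", "dalmolingroup/euryale"]
    return shlex.join(base + to_cli_args(params))


if __name__ == "__main__":
    # Hypothetical paths; only the parameter names come from the tables.
    print(build_command({
        "input": "samplesheet.csv",
        "outdir": "/abs/path/results",
        "skip_host_removal": True,
        "skip_microview": False,
    }))
```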

Decontamination

Parameter	Description	Type	Default	Required	Hidden
`host_fasta`	Host FASTA to use for decontamination	`string`
`bowtie2_db`	Pre-built bowtie2 index. Directory where index is located.	`string`

Alignment

Parameter	Description	Type	Default	Required	Hidden
`reference_fasta`	Path to FASTA genome file.	`string`
`diamond_db`	Path to pre-built DIAMOND db.	`string`

Taxonomy

Parameter	Description	Type	Default	Required
`kaiju_db`	Kaiju database	`string`		True
`kraken2_db`	Kraken2 database	`string`
`run_kaiju`	Run Kaiju classifier	`boolean`	True
`run_kraken2`	Run Kraken2 classifier	`boolean`

Functional

Parameter	Description	Type	Default
`id_mapping`	Path to ID mapping file to be used for the Functional annotation	`string`
`minimum_bitscore`	Minimum bitscore of a match to be used for annotation	`integer`	50
`minimum_pident`	Minimum identity of a match to be used for annotation	`integer`	80
`minimum_alen`	Minimum alignment length of a match to be used for annotation	`integer`	50
`maximum_evalue`	Maximum evalue of a match to be used for annotation	`number`	0.0001
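
To make the four thresholds concrete, here is a minimal sketch of how they could be combined to decide whether an alignment hit is kept for annotation. It assumes that all four conditions must hold at once and that the bounds are inclusive; the pipeline's actual filtering code is not shown in this section, and the `Hit` record is a hypothetical stand-in for one row of alignment output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one alignment match."""
    query: str
    subject: str
    pident: float    # percent identity
    alen: int        # alignment length
    evalue: float
    bitscore: float


@dataclass(frozen=True)
class Thresholds:
    """Defaults copied from the parameter table above."""
    minimum_bitscore: float = 50
    minimum_pident: float = 80
    minimum_alen: int = 50
    maximum_evalue: float = 0.0001


def passes(hit, t=Thresholds()):
    """Assumption: a hit is kept only if every threshold is satisfied."""
    return (
        hit.bitscore >= t.minimum_bitscore
        and hit.pident >= t.minimum_pident
        and hit.alen >= t.minimum_alen
        and hit.evalue <= t.maximum_evalue
    )


def filter_hits(hits, t=Thresholds()):
    """Return the hits that satisfy all thresholds, in input order."""
    return [hit for hit in hits if passes(hit, t)]
```

Raising `minimum_pident` or lowering `maximum_evalue` in this sketch makes the filter stricter, which mirrors how the corresponding pipeline parameters would be tuned.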

Assembly

Parameter	Description	Type	Default	Required	Hidden
`assembly_based`		`boolean`

Reference genome options

Reference genome related files and options required for the workflow.

Parameter	Description	Type	Default	Hidden
`genome`	Name of iGenomes reference. If using a reference genome configured in the pipeline using iGenomes, use this parameter to give the ID for the reference. This is then used to build the full paths for all required reference genome files, e.g. `--genome GRCh38`. See the nf-core website docs for more details.	`string`
`igenomes_base`	Directory / URL base for iGenomes references.	`string`	s3://ngi-igenomes/igenomes	True
`igenomes_ignore`	Do not load the iGenomes reference config. With this option, `igenomes.config` is not loaded when running the pipeline. You may choose this option if you observe clashes between custom parameters and those supplied in `igenomes.config`.	`boolean`		True
`fasta`		`string`

Download Entry

Parameter	Description	Type	Default
`download_functional`	Whether to download functional references	`boolean`	True
`download_kaiju`	Whether to download the Kaiju reference db	`boolean`	True
`download_kraken`	Whether to download the Kraken2 reference db	`boolean`
`download_host`	Whether to download the host reference genome	`boolean`
`functional_db`	Functional reference URL (download entry)	`string`	https://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz
`functional_dictionary`	Functional dictionary URL (download entry)	`string`	https://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/idmapping/idmapping.dat.gz
`kaiju_db_url`	Kaiju reference URL (download entry)	`string`	https://kaiju-idx.s3.eu-central-1.amazonaws.com/2023/kaiju_db_nr_2023-05-10.tgz
`kraken2_db_url`	Kraken2 reference URL (download entry)	`string`	https://genome-idx.s3.amazonaws.com/kraken/k2_standard_08gb_20240112.tar.gz
`host_url`	Host FASTA reference URL (download entry)	`string`	http://ftp.ensembl.org/pub/release-112/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
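
The boolean switches and the URL parameters in this table appear to pair up by name. The sketch below resolves which URLs would be fetched for a given set of switches. The pairing itself is an assumption inferred from the parameter names, as is treating `download_kraken` and `download_host` as false when the table lists no default.

```python
# Defaults copied from the table; missing boolean defaults are assumed False.
DEFAULTS = {
    "download_functional": True,
    "download_kaiju": True,
    "download_kraken": False,  # assumption: no default listed
    "download_host": False,    # assumption: no default listed
    "functional_db": "https://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz",
    "functional_dictionary": "https://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/idmapping/idmapping.dat.gz",
    "kaiju_db_url": "https://kaiju-idx.s3.eu-central-1.amazonaws.com/2023/kaiju_db_nr_2023-05-10.tgz",
    "kraken2_db_url": "https://genome-idx.s3.amazonaws.com/kraken/k2_standard_08gb_20240112.tar.gz",
    "host_url": "http://ftp.ensembl.org/pub/release-112/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz",
}

# Assumed pairing of each switch with the URL parameters it controls.
SWITCHES = {
    "download_functional": ["functional_db", "functional_dictionary"],
    "download_kaiju": ["kaiju_db_url"],
    "download_kraken": ["kraken2_db_url"],
    "download_host": ["host_url"],
}


def urls_to_download(overrides=None):
    """Return the URLs selected by the enabled switches, in table order."""
    params = {**DEFAULTS, **(overrides or {})}
    return [
        params[key]
        for switch, keys in SWITCHES.items()
        if params[switch]
        for key in keys
    ]


if __name__ == "__main__":
    for url in urls_to_download({"download_host": True}):
        print(url)
```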

Max job request options

Set the top limit for requested resources for any single job.

Parameter	Description	Type	Default	Hidden
`max_cpus`	Maximum number of CPUs that can be requested for any single job. Use to set an upper limit for the CPU requirement for each process. Should be an integer, e.g. `--max_cpus 1`	`integer`	16	True
`max_memory`	Maximum amount of memory that can be requested for any single job. Use to set an upper limit for the memory requirement for each process. Should be a string in the format integer-unit, e.g. `--max_memory '8.GB'`	`string`	128.GB	True
`max_time`	Maximum amount of time that can be requested for any single job. Use to set an upper limit for the time requirement for each process. Should be a string in the format integer-unit, e.g. `--max_time '2.h'`	`string`	240.h	True
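
The effect of these limits can be pictured as a clamp: a process request above the limit is reduced to the limit, and a request below it is left alone. The sketch below illustrates this for CPUs and memory using the integer-unit string format shown above. The function names are hypothetical, binary multiples (1 KB = 1024 B) are assumed, and this is not the pipeline's own implementation.

```python
import re

# Assumption: binary multiples, i.e. 1 KB = 1024 B.
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_memory(text):
    """Convert an integer-unit string such as '8.GB' into bytes."""
    match = re.fullmatch(
        r"(\d+)\s*\.?\s*([KMGT]?B)", text.strip(), flags=re.IGNORECASE
    )
    if match is None:
        raise ValueError(f"unrecognised memory string: {text!r}")
    number, unit = match.groups()
    return int(number) * UNITS[unit.upper()]


def cap_cpus(requested, max_cpus=16):
    """Return the CPU request, reduced to max_cpus if it exceeds it."""
    return min(requested, max_cpus)


def cap_memory(requested, max_memory="128.GB"):
    """Return the memory request, or max_memory if the request exceeds it."""
    if parse_memory(requested) > parse_memory(max_memory):
        return max_memory
    return requested


if __name__ == "__main__":
    print(cap_cpus(32))           # request above the default limit of 16
    print(cap_memory("200.GB"))   # request above the default limit of 128.GB
```

On a small machine, lowering `max_cpus` and `max_memory` keeps every job within what the host can actually provide.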

Generic options

Less common options for the pipeline, typically set in a config file.

Parameter	Description	Type	Default	Hidden
`help`	Display help text.	`boolean`		True
`version`	Display version and exit.	`boolean`		True
`publish_dir_mode`	Method used to save pipeline results to output directory. The Nextflow `publishDir` option specifies which intermediate files should be saved to the output directory. This option tells the pipeline what method should be used to move these files. See Nextflow docs for details.	`string`	copy	True
`email_on_fail`	Email address for completion summary, only when pipeline fails. The summary email is sent to this address only if the pipeline does not exit successfully.	`string`		True
`plaintext_email`	Send plain-text email instead of HTML.	`boolean`		True
`max_multiqc_email_size`	File size limit when attaching MultiQC reports to summary emails.	`string`	25.MB	True
`monochrome_logs`	Do not use coloured log outputs.	`boolean`		True
`hook_url`	Incoming hook URL for messaging service. Currently, MS Teams and Slack are supported.	`string`		True
`multiqc_config`	Custom config file to supply to MultiQC.	`string`		True
`multiqc_logo`	Custom logo file to supply to MultiQC. The file name must also be set in the MultiQC config file.	`string`		True
`multiqc_methods_description`	Custom MultiQC yaml file containing HTML including a methods description.	`string`
`tracedir`	Directory to keep pipeline Nextflow logs and reports.	`string`	${params.outdir}/pipeline_info	True
`validate_params`	Whether to validate parameters against the schema at runtime.	`boolean`	True	True
`show_hidden_params`	Show all params when using `--help`. By default, parameters set as hidden in the schema are not shown on the command line when a user runs with `--help`. Specifying this option will tell the pipeline to show all parameters.	`boolean`		True
`schema_ignore_params`		`string`	genomes	True