dalmolingroup/euryale pipeline parameters

A pipeline for metagenomic taxonomic classification and functional annotation. Based on MEDUSA.

Input/output options

Define where the pipeline should find input data and save output data.

Parameter Description Type Default Required Hidden
input Path to comma-separated file containing information about the samples in the experiment.
You will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row.
string True
outdir The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure. string True
email Email address for completion summary.
Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits. If set in your user config file (~/.nextflow/config) then you don't need to specify this on the command line for every run.
string
multiqc_title MultiQC report title. Printed as page header, used for filename if not otherwise specified. string
save_dbs Save DIAMOND db to results directory after construction boolean
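
The description of `input` above only states that the design file is comma-separated, has 3 columns, and starts with a header row. A minimal sketch of a pre-flight check for exactly those constraints follows; the function name and the column names in the example are hypothetical, since the source does not name the columns.

```python
import csv
import io


def check_design_file(text, expected_columns=3):
    """Return a list of problems found in a comma-separated design file."""
    # Blank lines are skipped; csv.reader yields an empty list for them.
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    problems = []
    if len(rows) < 2:
        problems.append("expected a header row plus at least one sample row")
    for number, row in enumerate(rows, start=1):
        if len(row) != expected_columns:
            problems.append(
                f"row {number}: {len(row)} columns, expected {expected_columns}"
            )
    return problems


# Hypothetical header and file names, for illustration only.
example = "sample,fastq_1,fastq_2\nS1,S1_R1.fastq.gz,S1_R2.fastq.gz\n"
print(check_design_file(example))  # prints []
```

The check deliberately ignores what the columns contain; consult the pipeline's usage documentation for the required header names.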

Skip Steps

Choose which pipeline steps to skip.

Parameter Description Type Default Required Hidden
skip_classification Skip taxonomic classification boolean
skip_alignment Skip alignment boolean
skip_functional Skip functional annotation boolean
skip_host_removal Skip host removal boolean
skip_microview Skip MicroView report boolean
skip_preprocess Skip preprocessing steps boolean

Decontamination

Parameter Description Type Default Required Hidden
host_fasta Host FASTA to use for decontamination string
bowtie2_db Directory containing a pre-built Bowtie2 index. string

Alignment

Parameter Description Type Default Required Hidden
reference_fasta Path to FASTA genome file. string
diamond_db Path to pre-built DIAMOND db. string

Taxonomy

Parameter Description Type Default Required Hidden
kaiju_db Kaiju database string True
kraken2_db Kraken2 database string
run_kaiju Run Kaiju classifier boolean True
run_kraken2 Run Kraken2 classifier boolean

Functional

Parameter Description Type Default Required Hidden
id_mapping Path to ID mapping file to be used for the Functional annotation string
minimum_bitscore Minimum bitscore of a match to be used for annotation integer 50
minimum_pident Minimum identity of a match to be used for annotation integer 80
minimum_alen Minimum alignment length of a match to be used for annotation integer 50
maximum_evalue Maximum evalue of a match to be used for annotation number 0.0001
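
To make the four thresholds above concrete, here is a minimal sketch of how a single alignment hit could be tested against them. The `Hit` class and `passes_thresholds` function are hypothetical, and treating the limits as inclusive is an assumption; the source does not say how the pipeline handles values exactly at a threshold.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One alignment match (hypothetical structure, for illustration)."""
    bitscore: float
    pident: float  # percent identity
    alen: int      # alignment length
    evalue: float


def passes_thresholds(
    hit,
    minimum_bitscore=50,    # default of minimum_bitscore
    minimum_pident=80,      # default of minimum_pident
    minimum_alen=50,        # default of minimum_alen
    maximum_evalue=0.0001,  # default of maximum_evalue
):
    """Return True if the hit satisfies all four limits (assumed inclusive)."""
    return (
        hit.bitscore >= minimum_bitscore
        and hit.pident >= minimum_pident
        and hit.alen >= minimum_alen
        and hit.evalue <= maximum_evalue
    )


hits = [
    Hit(bitscore=120.0, pident=95.2, alen=80, evalue=1e-30),
    Hit(bitscore=45.0, pident=99.0, alen=60, evalue=1e-10),  # bitscore too low
    Hit(bitscore=70.0, pident=75.0, alen=55, evalue=1e-8),   # identity too low
]
kept = [hit for hit in hits if passes_thresholds(hit)]
print(len(kept))  # prints 1
```

Raising `minimum_pident` or lowering `maximum_evalue` makes annotation stricter, at the cost of discarding more matches.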

Assembly

Parameter Description Type Default Required Hidden
assembly_based boolean

Reference genome options

Reference genome related files and options required for the workflow.

Parameter Description Type Default Required Hidden
genome Name of iGenomes reference.
If using a reference genome configured in the pipeline using iGenomes, use this parameter to give the ID for the reference. This is then used to build the full paths for all required reference genome files e.g. --genome GRCh38.

See the nf-core website docs for more details.
string
igenomes_base Directory / URL base for iGenomes references. string s3://ngi-igenomes/igenomes True
igenomes_ignore Do not load the iGenomes reference config.
Do not load igenomes.config when running the pipeline. You may choose this option if you observe clashes between custom parameters and those supplied in igenomes.config.
boolean True
fasta string

Download Entry

Parameter Description Type Default Required Hidden
download_functional Whether to download functional references boolean True
download_kaiju Whether to download the Kaiju reference db boolean True
download_kraken Whether to download the Kraken2 reference db boolean
download_host Whether to download the host reference genome boolean
functional_db Functional reference URL (download entry) string https://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz
functional_dictionary Functional dictionary URL (download entry) string https://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/idmapping/idmapping.dat.gz
kaiju_db_url Kaiju reference URL (download entry) string https://kaiju-idx.s3.eu-central-1.amazonaws.com/2023/kaiju_db_nr_2023-05-10.tgz
kraken2_db_url Kraken2 reference URL (download entry) string https://genome-idx.s3.amazonaws.com/kraken/k2_standard_08gb_20240112.tar.gz
host_url Host FASTA reference URL (download entry) string http://ftp.ensembl.org/pub/release-112/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz

Max job request options

Set the top limit for requested resources for any single job.

Parameter Description Type Default Required Hidden
max_cpus Maximum number of CPUs that can be requested for any single job.
Sets an upper limit on the CPUs each process may request. Should be an integer, e.g. --max_cpus 1
integer 16 True
max_memory Maximum amount of memory that can be requested for any single job.
Sets an upper limit on the memory each process may request. Should be a string in the format integer-unit, e.g. --max_memory '8.GB'
string 128.GB True
max_time Maximum amount of time that can be requested for any single job.
Sets an upper limit on the run time each process may request. Should be a string in the format integer-unit, e.g. --max_time '2.h'
string 240.h True
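
The caps above act as ceilings on what any single job may request. The sketch below illustrates that idea for the `integer.unit` memory strings described for `max_memory`; `parse_memory` and `cap_memory` are hypothetical helpers, the binary unit sizes are an assumption, and this is not the pipeline's own implementation.

```python
# Assumed unit sizes (binary multiples), for illustration only.
UNITS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_memory(text):
    """Convert a string such as '8.GB' to a number of bytes."""
    number, unit = text.split(".")
    return int(number) * UNITS[unit.upper()]


def cap_memory(requested, maximum="128.GB"):
    """Return the request unchanged unless it exceeds the maximum."""
    if parse_memory(requested) <= parse_memory(maximum):
        return requested
    return maximum


def cap_cpus(requested, maximum=16):
    """Clamp a CPU request to the configured maximum."""
    return min(requested, maximum)


print(cap_memory("8.GB"))    # prints 8.GB
print(cap_memory("256.GB"))  # prints 128.GB
print(cap_cpus(32))          # prints 16
```

On a machine with less memory than the default `128.GB`, lowering `max_memory` keeps jobs from asking for resources the system cannot provide.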

Generic options

Less common options for the pipeline, typically set in a config file.

Parameter Description Type Default Required Hidden
help Display help text. boolean True
version Display version and exit. boolean True
publish_dir_mode Method used to save pipeline results to output directory.
The Nextflow publishDir option specifies which intermediate files should be saved to the output directory. This option tells the pipeline what method should be used to move these files. See Nextflow docs for details.
string copy True
email_on_fail Email address for completion summary, only when pipeline fails.
An email address to send a summary email to when the pipeline is completed. It is sent ONLY if the pipeline does not exit successfully.
string True
plaintext_email Send plain-text email instead of HTML. boolean True
max_multiqc_email_size File size limit when attaching MultiQC reports to summary emails. string 25.MB True
monochrome_logs Do not use coloured log outputs. boolean True
hook_url Incoming hook URL for messaging service
Currently, MS Teams and Slack are supported.
string True
multiqc_config Custom config file to supply to MultiQC. string True
multiqc_logo Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file string True
multiqc_methods_description Custom MultiQC YAML file containing HTML including a methods description. string
tracedir Directory to keep pipeline Nextflow logs and reports. string ${params.outdir}/pipeline_info True
validate_params Whether to validate parameters against the schema at runtime. boolean True True
show_hidden_params Show all params when using --help
By default, parameters set as hidden in the schema are not shown on the command line when a user runs with --help. Specifying this option will tell the pipeline to show all parameters.
boolean True
schema_ignore_params string genomes True
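
Instead of passing each option on the command line, Nextflow can read parameters from a JSON or YAML file given with `-params-file`. The sketch below writes such a file for a handful of the parameters documented here. All paths and chosen values are hypothetical, the `write_params` helper is illustrative, and the set shown is not guaranteed to be complete for a real run.

```python
import json

# Hypothetical values; parameter names are taken from the tables above.
params = {
    "input": "/data/project/samplesheet.csv",
    "outdir": "/data/project/results",
    "kaiju_db": "/data/db/kaiju",
    "run_kraken2": False,
    "skip_host_removal": True,
    "minimum_pident": 90,
}


def write_params(params, path):
    """Write the parameters as JSON for use with Nextflow's -params-file."""
    with open(path, "w") as handle:
        json.dump(params, handle, indent=2)


write_params(params, "params.json")
```

The resulting file could then be supplied as `nextflow run dalmolingroup/euryale -params-file params.json`, together with whatever profile your environment requires.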