This guide is intended for GNU/Linux and macOS users only.
This quick guide explains how to submit genome assemblies to the European Nucleotide Archive (ENA) using the Webin command line submission interface (Webin CLI). For complete information, please refer to the ENA documentation.
- You must have a valid Webin account (login: Webin-xxxxxx).
- Each submission must be associated with a pre-registered study (PRJEBxxxxxx BioProject accession) and a sample (SAMEAxxxxxx BioSample accession).
- If you are submitting an annotated assembly, consider registering a locus tag prefix.
Then prepare the following files:
This is the main file that contains all the metadata about the genome assembly. The file is a tab-separated (TSV) text file with two columns. A template is available in this repository.
The following metadata fields are supported in the manifest file for genome context:
Mandatory metadata:
- STUDY: Study accession - mandatory, preregistered study (starts with PRJEBxxxxxx and is called the BioProject accession)
- SAMPLE: Sample accession - mandatory, preregistered sample (starts with SAMEAxxxxxx and is called the BioSample accession)
- ASSEMBLYNAME: Unique assembly name, user-provided - mandatory
- ASSEMBLY_TYPE: "clone" (cloned DNA fragments, not directly from a whole organism) or "isolate" (cultured organism derived from a single strain/colony) - mandatory
- COVERAGE: The estimated depth of sequencing coverage - mandatory
- PROGRAM: The assembly program - mandatory
- PLATFORM: The sequencing platform, or comma-separated list of platforms - mandatory
Optional metadata:
- MOLECULETYPE: "genomic DNA", "genomic RNA" or "viral cRNA" - optional
- MINGAPLENGTH: Minimum length of consecutive Ns to be considered a gap in scaffolds - optional
- DESCRIPTION: Free text description of the genome assembly - optional
- RUN_REF: Comma-separated list of run accession(s) - optional
- AGP: file that describes the assembly of scaffolds from contigs, or of chromosomes from scaffolds - optional
Assembly specifics metadata:
- FASTA: file containing the sequences in FASTA format - mandatory for unannotated assembly (see below)
- FLATFILE: file containing the sequences in EMBL flat file format - mandatory for annotated assembly (see below)
- CHROMOSOME_LIST: file containing the list of chromosomes - mandatory for fully assembled chromosomes (see below)
In principle, the FASTA and FLATFILE metadata are mutually exclusive.
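As an illustration of the two-column layout, the following sketch writes a minimal manifest for an unannotated assembly. Every value is a hypothetical placeholder (accessions, assembly name, coverage, program, platform) and must be replaced with your own; only the field names come from the lists above. `printf` is used because it turns `\t` into a real tab character, which is easy to lose when typing the file by hand.

```shell
# Hypothetical placeholder values throughout: replace them with your own.
# printf expands \t into a real tab, as required by the manifest format.
{
  printf 'STUDY\t%s\n'         'PRJEBxxxxxx'
  printf 'SAMPLE\t%s\n'        'SAMEAxxxxxx'
  printf 'ASSEMBLYNAME\t%s\n'  'my-assembly-v1'
  printf 'ASSEMBLY_TYPE\t%s\n' 'isolate'
  printf 'COVERAGE\t%s\n'      '100'
  printf 'PROGRAM\t%s\n'       'SPAdes'
  printf 'PLATFORM\t%s\n'      'ILLUMINA'
  printf 'FASTA\t%s\n'         'flat-file.fasta.gz'
} > manifest.tsv

# Sanity check: every line should have exactly two tab-separated columns.
awk -F'\t' 'NF != 2 { bad = 1; print "line " NR ": found " NF " columns" }
            END { exit bad + 0 }' manifest.tsv
```

The `awk` check exits with a non-zero status if any line does not have two columns, which usually means a tab was replaced by spaces.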
The FASTA flat file is mandatory for unannotated assemblies. It is obtained by concatenating the individual FASTA files and should be compressed with gzip.
To obtain the FASTA file use this command:
cat *.fasta | gzip -c > flat-file.fasta.gz
Then add this line to the manifest file:
FASTA<tab>flat-file.fasta.gz
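Before submitting, it can be worth checking that the compressed file is intact and contains the expected number of sequences. The sketch below is one possible check, not part of the ENA tooling. It builds a toy input in a temporary directory (the sequences and file names are made up) so that it runs anywhere; on real data, point the last three commands at your own `flat-file.fasta.gz`.

```shell
# Toy input in a temporary directory; sequences and file names are made up.
workdir=$(mktemp -d)
printf '>contig1\nACGTACGT\n' > "$workdir/a.fasta"
printf '>contig2\nGGCCTTAA\n' > "$workdir/b.fasta"

# Same concatenation as in the guide.
cat "$workdir"/*.fasta | gzip -c > "$workdir/flat-file.fasta.gz"

# gzip -t exits with a non-zero status if the archive is corrupt or truncated.
gzip -t "$workdir/flat-file.fasta.gz"

# Count header lines; this should match the number of sequences you expect.
n_records=$(gzip -dc "$workdir/flat-file.fasta.gz" | grep -c '^>')
echo "records: $n_records"

# List sequence IDs that occur more than once (empty output means none).
dup_ids=$(gzip -dc "$workdir/flat-file.fasta.gz" | grep '^>' | cut -d' ' -f1 | sort | uniq -d)
echo "duplicate IDs: ${dup_ids:-none}"
```

With the toy input, the count is 2 and no duplicate IDs are reported.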
The EMBL flat file is mandatory for annotated assemblies. It is obtained by concatenating the individual EMBL files and should be compressed with gzip.
To obtain the EMBL flat file use this command:
cat *.embl | gzip -c > flat-file.embl.gz
Then add this line to the manifest file:
FLATFILE<tab>flat-file.embl.gz
The chromosome file is required for fully assembled chromosomes.
The file contains the list of chromosomes (one per line) to be submitted in a tab-separated (TSV) format containing three or four columns. A template is available in this repository. This file must be compressed with gzip.
Columns:
- Sequence ID: unique sequence name, in FASTA file this is the sequence ID (e.g. ">ID"), in EMBL file this is the accession (e.g. "AC xxxx")
- Chromosome name: the name of the chromosome, e.g. "A"
- Chromosome topology and type: the topology (linear or circular) and the type (chromosome, plasmid, or other), e.g. "linear-chromosome" or "circular-chromosome"
- Chromosome location (optional fourth column): e.g. "Mitochondrion"
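The column layout above can be sketched as follows. The sequence IDs and chromosome names are hypothetical and must match the IDs in your own FASTA or EMBL file; the topology, type and location values reuse the examples given in the list.

```shell
# Hypothetical sequence IDs and chromosome names: adapt them to your assembly.
# Columns: sequence ID, chromosome name, topology-type, optional location.
{
  printf '%s\t%s\t%s\n'     'chr_A'  'A'  'linear-chromosome'
  printf '%s\t%s\t%s\n'     'chr_B'  'B'  'linear-chromosome'
  printf '%s\t%s\t%s\t%s\n' 'chr_MT' 'MT' 'circular-chromosome' 'Mitochondrion'
} > chromosome-list.tsv

# Every line should have three or four tab-separated columns.
awk -F'\t' 'NF < 3 || NF > 4 { bad = 1; print "line " NR ": found " NF " columns" }
            END { exit bad + 0 }' chromosome-list.tsv
```

The file is written uncompressed here, so it can still be inspected before the compression step.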
Compress the file with gzip:
gzip chromosome-list.tsv
Then add this line to the manifest file:
CHROMOSOME_LIST<tab>chromosome-list.tsv.gz
The latest version of the Webin command line submission interface (Webin-CLI) can be downloaded from GitHub.
When you are ready, submit your files using your login credentials (login:Webin-XXXXX and password:YYYYYYY).
A Java runtime environment must be installed on your system. Test the installation by running the following command:
java -version
Run the following command to validate your files (testing only, nothing is submitted):
java -jar webin-cli-x.y.z.jar -username Webin-XXXXX -password YYYYYYY -context genome -manifest manifest.tsv -validate
If all is ok, submit your assembly:
java -jar webin-cli-x.y.z.jar -username Webin-XXXXX -password YYYYYYY -context genome -manifest manifest.tsv -submit
The script annotated-sequences-submit.sh can be used to submit the assembly to the testing or submission server. First, complete the manifest file (TSV) and create the flat files with the required information (see above). A credential file containing your login (Webin-XXXXX) and password (YYYYYYY) separated by a space is required.
The script performs the following steps:
- Validates the submission files.
- Submits the assembly to the submission server.
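The two steps above could be sketched roughly as follows. This is not the content of annotated-sequences-submit.sh, only an illustration of the idea: it reads the login and password from a one-line credential file, then validates and submits only if validation succeeds. The names `WEBIN_JAR`, `DRY_RUN` and `run_webin` are hypothetical, and a throwaway credential file with placeholder values is created so that the sketch runs as-is. With `DRY_RUN="true"` the commands are printed (password masked) instead of executed.

```shell
# Throwaway credential file with placeholder values, for illustration only.
# Assumed layout: one line, "username password", separated by a space.
CREDENTIAL=$(mktemp)
printf 'Webin-XXXXX YYYYYYY\n' > "$CREDENTIAL"

MANIFEST="manifest.tsv"
WEBIN_JAR="webin-cli-x.y.z.jar"   # placeholder version, as in the guide
DRY_RUN="true"                    # hypothetical switch: print instead of run

# Split the single credential line into two variables.
read -r WEBIN_USER WEBIN_PASS < "$CREDENTIAL"
rm -f "$CREDENTIAL"

# Run (or print) Webin CLI with the given action flag (-validate or -submit).
run_webin() {
  action=$1
  if [ "$DRY_RUN" = "true" ]; then
    echo "java -jar $WEBIN_JAR -username $WEBIN_USER -password ***** -context genome -manifest $MANIFEST $action"
  else
    java -jar "$WEBIN_JAR" -username "$WEBIN_USER" -password "$WEBIN_PASS" \
      -context genome -manifest "$MANIFEST" "$action"
  fi
}

# Submit only if validation succeeded.
run_webin -validate && run_webin -submit
```

Chaining the two calls with `&&` means a failed validation stops the run before anything is submitted.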
Edit the parameters at the beginning of the script to set the SUBMISSION, CREDENTIAL, and MANIFEST variables:
# Submit or test?
# One of the following:
# "true": real data submission,
# "false": submit to testing server, validation only
SUBMISSION="false"
# CREDENTIAL FILE
# File containing the credentials.
# One line containing:
# username password, separated by a space
CREDENTIAL=".credential"
# MANIFEST FILE
# The manifest file that contains the metadata about the genome assembly.
# The file is a tab-separated (TSV) text file with two columns.
MANIFEST="manifest.tsv"
Then, run the script with the following command:
./annotated-sequences-submit.sh
You are welcome to report any issues with this document or the script.
If you use this document or script in your research, please cite:
BibTeX
@misc{bigey2026,
author = {Bigey, Frédéric},
title = {A Quick Guide to Submitting Genome Assemblies to ENA Using the Webin CLI},
year = {2026},
howpublished = {\url{https://github.com/bigey/assembled-genomes-ena-submit}},
note = {accessed 2026-02-11}
}
BibLaTeX
@software{bigey2026,
author = {Bigey, Frédéric},
title = {A Quick Guide to Submitting Genome Assemblies to ENA Using the Webin CLI},
year = {2026},
version = {v1.0.0},
url = {https://github.com/bigey/assembled-genomes-ena-submit},
note = {accessed 2026-02-11}
}