Assembly file

Description of the assembly file format used as input for blobtools

Why is it needed?

A genome assembly is a collection of strings representing DNA sequences (contigs, scaffolds, chromosomes, ...), each composed of the five letters A, G, C, T and N.

The assembly file contains the genome assembly for which BlobTools will collate data.

Format

FASTA

Comments

Caution: While parsing a FASTA file, blobtools splits each sequence ID at the first whitespace it encounters, as many other bioinformatics applications do. Only the text before that whitespace is used as the sequence ID.
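
To see what this splitting means for a given assembly, the sketch below extracts IDs from FASTA header lines the same way: it keeps only the text between ">" and the first whitespace. It is a minimal illustration, not blobtools' own code. The function name `sequence_ids` and the example headers are hypothetical, and treating any whitespace character (space or tab) as the split point is an assumption.

```python
def sequence_ids(fasta_lines):
    """Yield the ID of each FASTA header line.

    Hypothetical helper: the ID is taken to be the text after '>'
    up to the first whitespace (space or tab, by assumption).
    """
    for line in fasta_lines:
        if line.startswith(">"):
            # split(None, 1) splits once on the first run of whitespace
            fields = line[1:].split(None, 1)
            yield fields[0] if fields else ""


# Made-up headers for illustration only
example = [
    ">contig_1 length=12 cov=3.5",
    "ACGTNACGTNAC",
    ">contig_2\tsample=A",
    "GGCCTTAA",
]
ids = list(sequence_ids(example))
print(ids)
```

Under this rule, two headers that differ only after the first whitespace, such as ">a b" and ">a c", would yield the same ID, so it may be worth checking that the IDs in an assembly are unique before the whitespace.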
