Assembly file

Description of the assembly file format used as input for BlobTools

Why is it needed?

A genome assembly is a collection of strings representing DNA sequences (contigs, scaffolds, chromosomes, ...) composed of the five letters A, G, C, T and N.

  • The assembly file contains the genome assembly for which BlobTools will collate data.



Caution: When parsing a FASTA file, BlobTools splits each sequence ID at the first whitespace it encounters, as many other bioinformatics applications do. For example, the header ">contig_1 length=500" yields the sequence ID "contig_1".
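
The effect of this splitting rule can be previewed before running BlobTools. The following is a minimal Python sketch, not BlobTools' own code: the function names fasta_ids and duplicated_ids and the example records are hypothetical, and the sketch assumes that the ID is the text between ">" and the first whitespace. It lists the IDs that result from truncation and reports any that occur more than once, since headers that differ only after the first whitespace end up with the same ID.

```python
from collections import Counter


def fasta_ids(lines):
    """Yield the sequence ID of each FASTA header line.

    Assumption: the ID is the text after '>' up to the first whitespace.
    This approximates the behaviour described above; it is not taken
    from the BlobTools source.
    """
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split(None, 1)
            yield fields[0] if fields else ""


def duplicated_ids(lines):
    """Return the sorted IDs that appear more than once after truncation."""
    counts = Counter(fasta_ids(lines))
    return sorted(seq_id for seq_id, n in counts.items() if n > 1)


# Hypothetical records: the first two headers differ only after the space.
example = [
    ">contig_1 length=12",
    "ACGTNNACGTAC",
    ">contig_1 length=8",
    "GGCCTTAA",
    ">contig_2",
    "ACGT",
]

print(list(fasta_ids(example)))   # ['contig_1', 'contig_1', 'contig_2']
print(duplicated_ids(example))    # ['contig_1']
```

To check a real assembly, pass an open file handle instead of the list, for example `with open(path) as handle: duplicated_ids(handle)`, where `path` is the location of your own FASTA file.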