Coverage file

Why is it needed?

A coverage file contains information on the base/read coverage of each sequence in an assembly file. Assuming an unbiased sequencing process, the base/read coverage reflects the molarity of the DNA molecule (represented by the sequence in the assembly) that went into the sequencing reaction.

The coverage information parsed by BlobTools is:

  • Base coverage
      • is used for the log-scaled y-axis of a blobplot
      • is used for both the log-scaled x-axis and y-axis of a covplot
  • Total number of reads and number of mapped reads
      • are used for the first two bars of a readcovplot
  • Read coverage

The details on how BlobTools parses coverage information are described in coverage parsing.

Mapping files

Assembly files

  • Assembly files generated by certain genome assembly programs contain coverage information in the header lines.
  • BlobTools create is able to parse this information if the assembly type is specified using the -y argument.
  • However, this information only covers base coverage and therefore does not allow plotting readcovplots.
  • The output of the following assemblers is supported:
      • SPAdes
      • Velvet
      • Platanus
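
To illustrate what "coverage information in the header lines" means, here is a minimal Python sketch that extracts the coverage value from a SPAdes-style FASTA header. It assumes headers of the form NODE_<n>_length_<len>_cov_<cov>, as written by SPAdes. The function name spades_cov is hypothetical, and this is not the parser BlobTools itself uses.

```python
import re

# Assumed SPAdes header layout: NODE_<number>_length_<length>_cov_<coverage>
# Anything after the coverage value (e.g. "_ID_13") is ignored.
SPADES_HEADER = re.compile(r"^NODE_(\d+)_length_(\d+)_cov_([0-9.]+)")


def spades_cov(header):
    """Return the coverage stored in a SPAdes-style header, or None.

    `header` may include the leading '>' of a FASTA header line.
    """
    match = SPADES_HEADER.match(header.lstrip(">"))
    if match is None:
        # Header does not follow the assumed SPAdes naming scheme
        return None
    return float(match.group(3))


if __name__ == "__main__":
    print(spades_cov(">NODE_1_length_1000_cov_12.345"))
```

Headers that do not match the assumed pattern yield None, so the caller can decide whether to fall back to a mapping file or a COV file.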

COV files

  • COV files are a custom file format written by BlobTools map2cov and BlobTools create (after parsing a mapping file, create automatically writes a COV file for future use).
  • They contain all the information needed for BlobTools create.
  • Their use is encouraged, since parsing a COV file takes less time than parsing a mapping file.
  • An example of a COV file is shown below:
## blobtools v1.0
## Total Reads = 15313
## Mapped Reads = 15313
## Unmapped Reads = 0
## Source(s) : example/mapping_1.bam
# contig_id	read_cov	base_cov
contig_1    369       90.406
contig_2    844       168.409
contig_3    188       43.761
contig_4    2096      456.313
contig_5    456       163.557
contig_6    52        25.88
contig_7    1005      52.312
contig_8    1008      91.742
contig_9    554       74.757
contig_10   8741      310.634
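
The layout of the example above can be read with a few lines of Python. The sketch below is only an illustration of the format, not BlobTools' own parser: the function name parse_cov is hypothetical, and it assumes that "##" lines carry metadata as "key = value" or "key : value", that the single "#" line is the column header, and that data lines hold three whitespace-separated fields (contig_id, read_cov, base_cov).

```python
import io


def parse_cov(handle):
    """Parse a COV file into (metadata dict, coverage dict).

    The coverage dict maps contig_id -> (read_cov, base_cov).
    """
    meta = {}
    cov = {}
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith("##"):
            # Metadata, e.g. "## Total Reads = 15313" or "## Source(s) : file.bam"
            body = line[2:].strip()
            for sep in ("=", ":"):
                if sep in body:
                    key, value = body.split(sep, 1)
                    meta[key.strip()] = value.strip()
                    break
        elif line.startswith("#"):
            # Column header line: "# contig_id read_cov base_cov"
            continue
        else:
            contig_id, read_cov, base_cov = line.split()
            cov[contig_id] = (int(read_cov), float(base_cov))
    return meta, cov


# First lines of the example COV file shown above
SAMPLE = """## blobtools v1.0
## Total Reads = 15313
## Mapped Reads = 15313
## Unmapped Reads = 0
## Source(s) : example/mapping_1.bam
# contig_id\tread_cov\tbase_cov
contig_1    369       90.406
contig_2    844       168.409
"""

meta, cov = parse_cov(io.StringIO(SAMPLE))
```

Metadata values are kept as strings here; converting the read counts to integers is left to the caller, since the sketch does not assume which keys are numeric.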