Frequently Asked Questions


The taxonomies for the sequences are not what I expected. Why?

BlobTools takes the information from HITS files and sums up bitscores by taxonomic group at each taxonomic rank (see taxonomy and tax-rule). A common reason for not arriving at the expected taxonomy is that spurious hits outnumber the good hits, so that their summed score exceeds the score of the best hit:

  • Imagine a sequence similarity search yielded the following result:
    taxonomic group    score
    A                  300
    B                  110
    B                  108
    B                  101
  • Although the best hit is to A (score = 300), the taxonomy for that sequence would be set to B (score sum = 110 + 108 + 101 = 319). This issue can be solved by setting the parameter [--min_score FLOAT] (default is '0.0').
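The score summing in the example above can be reproduced with a few lines of Python. This is only a minimal sketch and not the BlobTools implementation: the function name assign_taxonomy, the "no-hit" placeholder label, and the choice to apply min_score to each individual hit before summing are assumptions made for illustration.

```python
from collections import defaultdict


def assign_taxonomy(hits, min_score=0.0):
    """Sum bitscores per taxonomic group and return (best_group, sums).

    hits: iterable of (taxonomic_group, bitscore) pairs.
    ASSUMPTION: min_score is applied to each individual hit before summing;
    BlobTools itself may apply the threshold differently.
    """
    sums = defaultdict(float)
    for group, score in hits:
        if score >= min_score:
            sums[group] += score
    if not sums:
        # Hypothetical placeholder label for sequences left without usable hits
        return "no-hit", {}
    # The group with the highest summed score wins
    best = max(sums, key=sums.get)
    return best, dict(sums)


# The four hits from the table above
hits = [("A", 300), ("B", 110), ("B", 108), ("B", 101)]

print(assign_taxonomy(hits))                   # B wins on summed score (319 vs 300)
print(assign_taxonomy(hits, min_score=200.0))  # only the hit to A passes the threshold
```

Under these assumptions, a threshold anywhere between 110 and 300 discards the three weaker hits to B and leaves A as the assigned group. A suitable value depends on the score distribution of your own hits.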

Distinguishing 'BLASTed and no match' and 'not blasted' in blob plots

Question by @pinin4fjords

For resource conservation we don't BLAST short contigs from our de novo assemblies. The consequence when using BAM files with reads mapped against a reference that does contain those contigs is Blob plots with grey clouds containing both contigs that weren't BLASTed and contigs that were BLASTed but produced no hits (and uninformative bars in the ReadCovPlot). [...] is there a better way?

Answer