Taxonomy
- Taxonomy of a sequence is calculated at each taxonomic rank by tax-rules
- The taxonomy of a sequence is represented by three values:
taxonomy
: the NCBI taxonomy name at that taxonomic rank (calculated by the taxrule)
score
: the score of the taxonomy (calculated by the taxrule)
c-index
: count of alternative candidates for taxonomy
- if the hits file(s) contained no hits for a given sequence or the score did not surpass [--min_score FLOAT], the taxonomy is set to "no-hit"
- if the taxonomy cannot be resolved (because of [--min_diff FLOAT] or because two taxonomic names score equally well), the taxonomy is set to "unresolved"
- if a taxID does not have a node at a given taxonomic rank, the taxonomy of the next highest annotated rank is used and the suffix -undef is appended, e.g.:
  - 9823 (Sus scrofa) does not possess a node at rank "order", so order is set to "Chordata-undef"
- The taxonomy of a sequence determines its colour in a blobplot/covplot
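
The rules above can be illustrated with a minimal Python sketch. This is not the tool's own implementation. The function names `name_at_rank` and `assign_taxonomy`, the `RANKS` list, the per-rank `{name: score}` input, and the values returned for score and c-index in the "no-hit" case are all assumptions made for illustration. The exact boundary comparisons for `min_score` and `min_diff` are likewise assumed.

```python
# Hypothetical rank order, lowest to highest (an assumption for this sketch).
RANKS = ["species", "genus", "family", "order", "phylum", "superkingdom"]


def name_at_rank(lineage, rank, ranks=RANKS):
    """Return the name at `rank`, or the next higher annotated name plus '-undef'.

    `lineage` maps rank -> name for the ranks at which the taxID has a node.
    """
    if rank in lineage:
        return lineage[rank]
    for higher in ranks[ranks.index(rank) + 1:]:
        if higher in lineage:
            return lineage[higher] + "-undef"
    return "undef"  # assumption: no annotated rank above the requested one


def assign_taxonomy(candidates, min_score=0.0, min_diff=0.0):
    """Pick (taxonomy, score, c_index) from a {name: score} dict at one rank."""
    if not candidates:
        return ("no-hit", 0.0, 0)  # no hits for this sequence
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    best_name, best_score = ranked[0]
    c_index = len(ranked) - 1  # assumption: number of other candidate names
    if best_score <= min_score:  # "did not surpass" read as <= (assumption)
        return ("no-hit", 0.0, 0)
    if len(ranked) > 1:
        diff = best_score - ranked[1][1]
        if diff == 0 or diff < min_diff:  # tie, or too close to call
            return ("unresolved", best_score, c_index)
    return (best_name, best_score, c_index)


# Sus scrofa (9823) with no node at rank "order", as in the example above.
sus_scrofa = {
    "species": "Sus scrofa",
    "genus": "Sus",
    "family": "Suidae",
    "phylum": "Chordata",
    "superkingdom": "Eukaryota",
}
print(name_at_rank(sus_scrofa, "order"))
print(assign_taxonomy({"Chordata": 200.0, "Nematoda": 200.0}))
```

The first call falls back from "order" to the next annotated rank in the assumed list, and the second shows two equally scoring names producing "unresolved".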