
Distance Matrices

ikb6 edited this page Jan 20, 2022 · 4 revisions


MicrobeTrace can take a distance matrix as input. To parse properly, the distance matrix must be in the following comma-separated format:

,Node1,Node2,Node3...
Node1,0,#,#,...
Node2,#,0,#,...
Node3,#,#,0,...
...

Here Node1, Node2, Node3, ... are the node identifiers, and each # is the distance between the node of that row and the node of that column.
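As an illustration of the layout above, here is a minimal Python sketch that writes a small matrix in this format. The function name `write_distance_matrix`, the `frozenset`-keyed dictionary of pairwise distances, and the sample values are all hypothetical choices made for this example; they are not part of MicrobeTrace. The sketch also assumes distances are symmetric and that each node's distance to itself is 0, as in the layout shown.

```python
import csv
import io


def write_distance_matrix(ids, dist, out):
    """Write a square distance matrix as CSV to the file-like object `out`.

    `ids` is an ordered list of node identifiers. `dist` maps a frozenset of
    two identifiers to their distance (hypothetical input shape, assuming
    symmetric distances).
    """
    writer = csv.writer(out, lineterminator="\n")
    # Header row: empty top-left cell, then the node identifiers.
    writer.writerow([""] + list(ids))
    for a in ids:
        # Rows use the same order as the columns, so the diagonal is the
        # self-distance, written here as 0.
        cells = [0 if a == b else dist[frozenset((a, b))] for b in ids]
        writer.writerow([a] + cells)


# Made-up example data.
ids = ["Node1", "Node2", "Node3"]
dist = {
    frozenset(("Node1", "Node2")): 0.01,
    frozenset(("Node1", "Node3")): 0.02,
    frozenset(("Node2", "Node3")): 0.03,
}

buf = io.StringIO()
write_distance_matrix(ids, dist, buf)
matrix_csv = buf.getvalue()
print(matrix_csv)
```

To produce a file instead of a string, pass an open file handle (for example one opened with `newline=""`) in place of the `io.StringIO` buffer.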

Some things to note:

  • The top-left cell is empty. It doesn't strictly need to be, but whatever its contents are will be ignored.
  • The distance matrix must be sorted so that the rows and columns list the nodes in the same order. The cell in the ith row and ith column is then the distance from a node to itself.
  • MicrobeTrace can export this type of distance matrix, if you'd like an example. A distance matrix is also quick to load, so if you've parsed a large FASTA file and want to get to analysis more quickly next time, exporting a distance matrix is one way to accomplish that.
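The notes above can be checked before loading a file. The following Python sketch reads a matrix, ignores the top-left cell, and verifies that the row labels match the column labels in the same order. The function name `read_distance_matrix` is hypothetical, and this is not MicrobeTrace's own parser. The check that diagonal cells equal 0 is an assumption based on the layout shown earlier, not a documented requirement.

```python
import csv
import io


def read_distance_matrix(text):
    """Parse a CSV distance matrix and return (ids, matrix).

    Raises ValueError if the matrix is not square or the row labels are not
    in the same order as the column labels.
    """
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if not rows:
        raise ValueError("empty input")
    # Skip the top-left cell: its contents are ignored.
    ids = rows[0][1:]
    labels = [row[0] for row in rows[1:]]
    if labels != ids:
        raise ValueError("row labels must match column labels, in the same order")
    matrix = []
    for i, row in enumerate(rows[1:]):
        values = [float(cell) for cell in row[1:]]
        if len(values) != len(ids):
            raise ValueError(f"row {ids[i]!r} has {len(values)} values, expected {len(ids)}")
        # Assumption: a node's distance to itself is 0.
        if values[i] != 0:
            raise ValueError(f"diagonal cell for {ids[i]!r} is not 0")
        matrix.append(values)
    return ids, matrix


# Made-up sample; the top-left cell holds text to show that it is ignored.
sample = "anything,A,B\nA,0,0.5\nB,0.5,0\n"
sample_ids, sample_matrix = read_distance_matrix(sample)
print(sample_ids, sample_matrix)
```

A file whose rows are in a different order than its columns (for example rows B, A under columns A, B) makes this sketch raise `ValueError`, which is the situation the ordering note warns about.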

Using distance matrices exported from Nextstrain (Augur)

Loading a distance matrix exported from Nextstrain (Augur) is the most efficient way of using large datasets, such as SARS-CoV-2 alignments processed through Nextstrain workflows.
