Read visualisation

Using the interactive dashboard

There are two ways to explore the data interactively. The first is the web app here (a preview is shown below; the app may take a moment to load).

The second is the notebook version, which lets you experiment with loading your own data. It is more configurable than the web app, but it does not currently include the option to filter the data directly by taxonomic class.


The simplest way to run the notebook is to open it in Colab (you will not need to install anything on your local machine). Click on Runtime -> Run all at the top of the page. If you are warned that the notebook was not authored by Google, choose Run anyway. Then scroll down to interact with the plot.

Filtering and querying data

For example, you may draw a rectangular selection and run blastn to find out what is in the cluster at the bottom with high coding density:
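For readers who want to reproduce this step outside the dashboard, here is a minimal sketch of what a rectangular selection followed by a blastn query could look like. It is not the dashboard's own code. The Read record, its x and y plot coordinates and the file names are hypothetical, and searching the nt database remotely is an assumption. The blastn options used (-query, -db, -remote, -outfmt, -out) are standard BLAST+ options.

```python
import io
from typing import Iterable, List, NamedTuple, TextIO


class Read(NamedTuple):
    """Hypothetical record: one plotted read with its 2D plot coordinates."""
    read_id: str
    x: float
    y: float
    sequence: str


def select_rectangle(reads: Iterable[Read], x_min: float, x_max: float,
                     y_min: float, y_max: float) -> List[Read]:
    """Keep the reads whose plot coordinates fall inside the rectangle (inclusive)."""
    return [r for r in reads
            if x_min <= r.x <= x_max and y_min <= r.y <= y_max]


def write_fasta(reads: Iterable[Read], handle: TextIO) -> None:
    """Write the selected reads as FASTA records, one sequence line per read."""
    for r in reads:
        handle.write(f">{r.read_id}\n{r.sequence}\n")


def blastn_command(query_path: str, out_path: str) -> List[str]:
    """Build (but do not run) a blastn call against the remote nt database.

    Assumption: remote search against nt with tabular output (-outfmt 6).
    """
    return ["blastn", "-query", query_path, "-db", "nt", "-remote",
            "-outfmt", "6", "-out", out_path]
```

The command list can be passed to subprocess.run once the selection has been written to the query file; this requires BLAST+ to be installed locally.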

You might also be interested in reads with k-mer coverage >= 1000. Notice the mitochondrion in the selection:

There is also some low-coverage bacterial contamination hiding in the plot. Try setting the k-mer coverage range to 0-15 and changing the background colour to make the points easier to see (the web app version also supports reversing the colour map):
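The two coverage filters used above (k-mer coverage >= 1000, and the range 0-15) can also be expressed in a few lines of code. This is only a sketch: it assumes the per-read k-mer coverage is already available as a mapping from read ID to value, and the function name and example values are hypothetical rather than taken from the notebook.

```python
from typing import Dict, List, Optional


def filter_by_coverage(coverage: Dict[str, float],
                       low: Optional[float] = None,
                       high: Optional[float] = None) -> List[str]:
    """Return the IDs of reads whose coverage lies in [low, high].

    A bound set to None is treated as unbounded on that side.
    """
    return [read_id for read_id, cov in coverage.items()
            if (low is None or cov >= low) and (high is None or cov <= high)]


# Hypothetical example values, for illustration only.
coverage = {"read_a": 12.0, "read_b": 250.0, "read_c": 1500.0}

high_coverage = filter_by_coverage(coverage, low=1000)       # e.g. organellar reads
low_coverage = filter_by_coverage(coverage, low=0, high=15)  # e.g. contamination
```

Both bounds are inclusive, matching the ">= 1000" wording above.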

Cross-referencing other sources of information

The section below the dashboard shows how read visualisation can be combined with existing labels for more efficient data exploration. In the example below, reads flagged as trypanosome by the marker scan are shown in red, and reads that map to the mitochondrial assembly are shown in gold. All other (unlabelled) points are shown in blue. Notice that the main “trypanosome” cluster corresponds to the coding-dense cluster mentioned above.
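The labelling scheme described here amounts to assigning one colour per read from two sets of read IDs. The sketch below illustrates the idea. The function and variable names are hypothetical, and the choice to let the mitochondrial label take precedence when a read carries both labels is an assumption, not something the notebook is known to do.

```python
from typing import Dict, Iterable, Set


def assign_colours(read_ids: Iterable[str],
                   trypanosome_ids: Set[str],
                   mito_ids: Set[str]) -> Dict[str, str]:
    """Map each read ID to a plot colour based on its labels."""
    colours = {}
    for read_id in read_ids:
        if read_id in mito_ids:
            # Assumption: mitochondrial mapping wins if both labels apply.
            colours[read_id] = "gold"
        elif read_id in trypanosome_ids:
            colours[read_id] = "red"
        else:
            # Unlabelled reads.
            colours[read_id] = "blue"
    return colours


# Hypothetical read IDs, for illustration only.
colours = assign_colours(
    ["read_1", "read_2", "read_3", "read_4"],
    trypanosome_ids={"read_1", "read_4"},
    mito_ids={"read_2", "read_4"},
)
```

The resulting mapping can be passed as a colour column to whichever plotting library is in use.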

Using your own data

To explore your own data with the notebook, you can upload your own files and adjust the configuration. More details are provided in the notebook. However, since FASTA files can be very large, it will probably be more convenient to download the notebook and run it locally with Jupyter (it will also work in Visual Studio Code).