borges

Of Exactitude in Science

...In that Empire, the craft of Cartography attained such Perfection that the Map of a Single province covered the space of an entire City, and the Map of the Empire itself an entire Province. In the course of Time, these Extensive maps were found somehow wanting, and so the College of Cartographers evolved a Map of the Empire that was of the same Scale as the Empire and that coincided with it point for point. Less attentive to the Study of Cartography, succeeding Generations came to judge a map of such Magnitude cumbersome, and, not without Irreverence, they abandoned it to the Rigours of sun and Rain. In the western Deserts, tattered Fragments of the Map are still to be found, Sheltering an occasional Beast or beggar; in the whole Nation, no other relic is left of the Discipline of Geography.

---From Travels of Praiseworthy Men (1658) by J. A. Suarez Miranda

borges is a small data visualization package that allows you to plot your single cell RNA-seq dataset (or any other dataset) as an antique or modern cartographic atlas. It takes 2D coordinates (from UMAP, t-SNE, PCA or anything else) and depicts group labels as continuous territories in an ocean, separated by rivers or seas.

This is all done through the use of oveRlay, ggplot2 and a few other libraries. borges is still very much under development so any feedback (especially bug reports) is more than welcome.

Wait, should I take this seriously?

If you are trying to represent high-dimensional data in 2D, not at all. All dimensionality reduction techniques distort distances when going from high to low dimensionality, and non-linear techniques such as t-SNE and UMAP are very sensitive to tunable parameters (perplexity, number of neighbors, spread, etc.) that do not depend on the input data. You can fiddle with these parameters and RNG seeds as much as you want until you get something nice to show your friends. You can read more about it here. The purpose of borges is to make your beautiful, useless plots even more beautiful and slightly more useless. If instead you are representing point clouds that have a rigorous justification for their embedding in a 2D space, then by all means use borges to make your beautiful, useful plots even more beautiful and slightly less useful.
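To get a feel for the distortion mentioned above, here is a minimal sketch in base R. It is not part of borges: it simulates points in 50 dimensions, projects them to 2D with PCA, and compares pairwise distances before and after. The simulated data, the seed and all variable names are illustrative assumptions.

```r
set.seed(1)

# 200 points in 50 dimensions, pure Gaussian noise (no low-dimensional structure)
high_dim <- matrix(rnorm(200 * 50), nrow = 200, ncol = 50)

# Project to 2D with PCA, keeping the first two principal components
low_dim <- prcomp(high_dim)$x[, 1:2]

# Pairwise Euclidean distances before and after the projection
d_high <- as.vector(dist(high_dim))
d_low <- as.vector(dist(low_dim))

# Rank correlation between the two sets of distances;
# a value of 1 would mean the ordering of distances is fully preserved
rho <- cor(d_high, d_low, method = "spearman")
print(rho)
```

PCA is a linear projection, so it can only shrink distances. Non-linear methods such as t-SNE and UMAP can also stretch them, and their output changes with the tunable parameters.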

Install

To install borges you will first need to install oveRlay.

Use remotes::install_github() or devtools::install_github() as follows:

remotes::install_github("gdagstn/oveRlay")
remotes::install_github("gdagstn/borges")

Usage

borges has only two functions:

prepAtlas() to prepare the atlas coordinates from a SingleCellExperiment object, a matrix or a data.frame, and plotAtlas() to display it as a ggplot2 plot.

For a practical demonstration, let's use the scRNAseq Bioconductor package to download a SingleCellExperiment object from Zeisel et al. 2018, "Molecular architecture of the mouse nervous system".

This is quite a large file which will take a while to download.

# BiocManager::install("scRNAseq")
library(borges)
# Download the dataset as a SingleCellExperiment object
zeisel = scRNAseq::ZeiselNervousData()

The Zeisel dataset has an "unnamed" reducedDim slot that contains a t-SNE embedding for cells of the nervous system. There are several labels in the colData slot, and we will choose one that offers a good balance between detail and readability.

zeiselatlas = prepAtlas(zeisel, 
                        dimred = "unnamed", 
                        res = 400, 
                        labels = "TaxonomyRank3", 
                        as_map = TRUE)

The atlas can be plotted:

plotAtlas(zeiselatlas)

Note that plotting can take a few seconds to a minute due to the high level of detail. For less detailed maps, you can set the res argument in prepAtlas() to a smaller value, e.g. 250 or 300, and set plot_cells = FALSE in plotAtlas().

The arguments of plotAtlas() allow you to control a few graphical elements:

  • plot_cells (logical) to plot cells (as small, semi-transparent dots)

  • add_contours (logical) to add 2D kernel density contour estimates, clipped to stay within land masses (mostly)

  • show_labels (logical) to show labels using geom_label_repel() from ggrepel

  • label_size (numeric) to override the default label size decided by the map theme

  • shade_borders (logical) to add an antique-style shading to the boundaries

  • shade_offset (numeric) for the size and direction of the border shading

  • shade_skip (numeric) for the spacing between shading lines

  • capitalize_labels (logical) to capitalize all labels
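As a rough sketch, several of these arguments can be combined in a single call. The argument names come from the list above, but the particular values are only illustrative, and the numeric arguments (label_size, shade_offset, shade_skip) are left at their defaults because their sensible ranges are not stated here. The call is guarded so that it only runs if borges is installed and the zeiselatlas object from the previous step exists.

```r
# Logical switches taken from the argument list above; values are illustrative
plot_args <- list(plot_cells = FALSE,        # skip individual cells for faster plotting
                  add_contours = TRUE,       # add 2D density contours
                  show_labels = TRUE,        # label each territory
                  shade_borders = TRUE,      # antique-style shading on boundaries
                  capitalize_labels = TRUE)  # capitalize all labels

# Only plot when borges and the atlas object are available
if (requireNamespace("borges", quietly = TRUE) && exists("zeiselatlas")) {
  p <- do.call(borges::plotAtlas, c(list(zeiselatlas), plot_args))
  print(p)
}
```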

Plotting generic 2D point clouds

borges can also be used on any generic 2D point cloud represented as a matrix or data.frame, as long as it has two columns (the first is taken as the x coordinates and the second as the y coordinates). When supplying a matrix or data.frame, the labels argument must be a character vector with one label for every point.

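# Simulate three Gaussian clusters of 500 points each (columns are x and y)
set.seed(42)
mats = rbind(matrix(rnorm(1000, 0, 1), ncol = 2),
             matrix(rnorm(1000, 4, 1), ncol = 2),
             matrix(rnorm(1000, -3, 1), ncol = 2))

# One label per point, in the same row order as mats
labels = c(rep("Cluster 1", 500),
           rep("Cluster 2", 500),
           rep("Cluster 3", 500))

atl = prepAtlas(mats, res = 100, labels = labels)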

plotAtlas(atl)
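The same input shape can be built from real data. The sketch below is an assumption-laden example rather than something tested with borges: it uses the built-in iris dataset, takes the first two principal components as coordinates, and uses the species as labels. Whether res = 100 gives a pleasing map for a dataset this small is not something this README confirms.

```r
# Two-column coordinates from a PCA of the four iris measurements
pca <- prcomp(iris[, 1:4], scale. = TRUE)
coords <- pca$x[, 1:2]

# One character label per point, in the same row order as coords
labels <- as.character(iris$Species)

# Sanity checks matching the input requirements described above
stopifnot(ncol(coords) == 2,
          length(labels) == nrow(coords),
          is.character(labels))

# Only attempt the atlas if borges is installed
if (requireNamespace("borges", quietly = TRUE)) {
  iris_atlas <- borges::prepAtlas(coords, res = 100, labels = labels)
  print(borges::plotAtlas(iris_atlas))
}
```

Note the as.character() call: iris$Species is a factor, and the labels argument is expected to be a character vector.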

Geographical projection

The as_map argument in plotAtlas() controls whether the atlas is plotted using a geographical projection, and the map_proj argument controls the type of projection. Any projection name accepted by ggplot2::coord_map(), given as a single character string, is acceptable.

For instance, we can plot the atlas using a "globular" projection:

plotAtlas(zeiselatlas, as_map = TRUE, map_proj = "globular")

Map themes

borges comes with a few different themes pre-packaged:

  • classic: the default theme

  • modern: a modern political atlas-like theme

  • renaissance: palette from 16th century maps

  • medieval: palette from 14th century maps

plotAtlas(zeiselatlas, map_theme = "renaissance")
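To compare the themes, one option is to loop over their names and collect the resulting plots. This sketch assumes that plotAtlas() returns a ggplot2 object that can be stored and printed later, and that the zeiselatlas object from the earlier example is available. The theme_plots name is hypothetical.

```r
# Theme names as listed above
themes <- c("classic", "modern", "renaissance", "medieval")

# Build one plot per theme when borges and the atlas object are available
if (requireNamespace("borges", quietly = TRUE) && exists("zeiselatlas")) {
  theme_plots <- lapply(themes, function(th) {
    borges::plotAtlas(zeiselatlas, map_theme = th)
  })
  names(theme_plots) <- themes
  print(theme_plots[["medieval"]])
}
```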

In the future, themes will support different fonts and additional aesthetic elements.