Better multiplots #72

Draft: wants to merge 15 commits into base: master


Kucharssim
Member

As I discussed with @vandenman, currently we have about a bazillion implementations of multi-plots that share similar structure, so let's add yet another implementation! Namely:

  1. Bivariate plot

Displays the joint distribution of two variables, with their marginal distributions along the axes.
This layout is implemented in jaspBivariateWithMargins().
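
To make the layout concrete, here is a minimal sketch of the same idea in plain ggplot2 and patchwork. It is not the code in this PR: `bivariateWithMarginsSketch()` is a hypothetical helper, and the 4:1 size ratio and bin count are arbitrary choices.

```r
library(ggplot2)
library(patchwork)

# Hypothetical helper for illustration only, not jaspBivariateWithMargins()
bivariateWithMarginsSketch <- function(x, y, xName = "x", yName = "y") {
  df <- data.frame(x = x, y = y)

  # Joint distribution in the main panel
  joint <- ggplot(df, aes(x = x, y = y)) +
    geom_point() +
    labs(x = xName, y = yName)

  # Marginal of x, placed above the joint panel
  top <- ggplot(df, aes(x = x)) +
    geom_histogram(bins = 10) +
    theme_void()

  # Marginal of y, placed to the right; mapping y gives horizontal bars
  right <- ggplot(df, aes(y = y)) +
    geom_histogram(bins = 10) +
    theme_void()

  # 2 x 2 grid filled by row: [top, empty; joint, right]
  top + plot_spacer() + joint + right +
    plot_layout(ncol = 2, widths = c(4, 1), heights = c(1, 4))
}

p <- bivariateWithMarginsSketch(mtcars$wt, mtcars$mpg, "wt", "mpg")
```

patchwork lines up the panel areas, but the sketch does not force the marginal and joint plots onto the same axis limits; a real implementation would have to take care of that.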

  2. Matrix plot

Displays two or more variables in one plot, with marginal distributions on the diagonal, a configurable plot in the upper triangle (top right), and optionally another plot in the lower triangle (bottom left).
This layout is implemented in jaspMatrixPlot(). The matrix plot also allows overriding the default plots with custom plotting functions, so that developers can, for example, display raincloud plots or traceplots on the diagonal.
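
The "pluggable panel function" idea can be sketched roughly as follows. This is an illustration under my own assumptions, not the jaspMatrixPlot() code: `matrixPlotSketch()`, `histogramPanel()` and `scatterPanel()` are hypothetical names, and passing `NULL` for a position simply leaves those cells empty.

```r
library(ggplot2)
library(patchwork)

# Hypothetical default panel functions
histogramPanel <- function(x, name) {
  ggplot(data.frame(x = x), aes(x = x)) +
    geom_histogram(bins = 10) +
    labs(x = name, y = NULL)
}
scatterPanel <- function(x, y) {
  ggplot(data.frame(x = x, y = y), aes(x = x, y = y)) +
    geom_point() +
    labs(x = NULL, y = NULL)
}

# Hypothetical sketch: one function per position, NULL means "draw nothing"
matrixPlotSketch <- function(data,
                             diagonalFun   = histogramPanel,
                             topRightFun   = scatterPanel,
                             bottomLeftFun = NULL) {
  k <- ncol(data)
  panels <- vector("list", k * k)
  for (i in seq_len(k)) {      # row
    for (j in seq_len(k)) {    # column
      idx <- (i - 1) * k + j
      panels[[idx]] <- if (i == j && !is.null(diagonalFun)) {
        diagonalFun(data[[i]], names(data)[i])
      } else if (i < j && !is.null(topRightFun)) {
        topRightFun(data[[j]], data[[i]])
      } else if (i > j && !is.null(bottomLeftFun)) {
        bottomLeftFun(data[[j]], data[[i]])
      } else {
        plot_spacer()
      }
    }
  }
  # Panels are placed row by row in a k x k grid
  wrap_plots(panels, ncol = k)
}

m <- matrixPlotSketch(mtcars[, c("mpg", "wt", "hp")])
```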


The idea is that these functions can be reused in all analyses that use these common layouts, and that they in turn reuse common functions for drawing the marginal (jaspHistogram()) and joint (a new jaspBivariate()) distributions. That keeps the implementation consistent between analyses, and also with the elementary plots such as standard histograms.

While we are doing this it would be good to settle on a reasonable default behavior so that we really push for consistency, but there are some exceptions needed in specific analyses so the new functions are quite flexible. See the expandable "Examples" section below for examples (and attached code and jasp file for comparison: examples.zip).

The main advantage of using patchwork for stitching the plots together is that it ensures that axes in the subplots are spatially aligned, so we avoid misaligned axes like this

[image: bad-plot]

It's also easier to "collect" legends from each subplot and display them outside of the plotting area.
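
As a small illustration of the legend point, using only ggplot2 and patchwork (the data and plots are placeholders, not taken from this PR):

```r
library(ggplot2)
library(patchwork)

p1 <- ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) +
  geom_point() +
  labs(colour = "cyl")
p2 <- ggplot(mtcars, aes(hp, mpg, colour = factor(cyl))) +
  geom_point() +
  labs(colour = "cyl")

# guides = "collect" merges identical legends and places the result
# outside the panels instead of next to each subplot
combined <- (p1 | p2) + plot_layout(guides = "collect")
```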

There is still work to be done. Some to-dos:

  • Need to polish some rough edges and error handling
  • Write nicer documentation with examples
  • I am not happy yet with how we hook up jaspHistogram(), perhaps I need to fiddle with it a little bit

Before I continue, it would be nice to get some feedback:

  • What should be the default standards? Most of the stuff is already JASP-style, but we do not really have guidance on how to draw, for example, confidence and prediction intervals, what histogram method should be the default, etc. Perhaps we can have some discussion about these things @vandenman, @EJWagenmakers, @AlexanderLyNL.
  • Does the current implementation make sense? Especially welcome feedback from @vandenman here.
  • Should we implement unit tests for these plotting functions here in jaspGraphs? That way it would be easier to maintain the functionality if we need to fix or change something. The unit tests in individual modules would then not have to focus on what the plots look like, but more on whether a plot is drawn when a specific option is selected, etc. What's your take, @vandenman?
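
To make the unit-test question more concrete, a structural test could look roughly like this. It is only a sketch: `histogramSketch()` is a hypothetical stand-in written for this example (not jaspHistogram()), and it assumes testthat is available. Visual snapshot tests, for example with vdiffr::expect_doppelganger(), would be a separate option.

```r
library(testthat)
library(ggplot2)

# Hypothetical stand-in for a jaspGraphs plotting function
histogramSketch <- function(x, showDensity = FALSE) {
  p <- ggplot(data.frame(x = x), aes(x = x)) +
    geom_histogram(aes(y = after_stat(density)), bins = nclass.Sturges(x))
  if (showDensity) p <- p + geom_density()
  p
}

test_that("showDensity = TRUE adds a density layer", {
  plain       <- histogramSketch(mtcars$mpg)
  withDensity <- histogramSketch(mtcars$mpg, showDensity = TRUE)

  expect_true(inherits(withDensity, "ggplot"))
  expect_equal(length(plain$layers), 1)
  expect_equal(length(withDensity$layers), 2)
  expect_true(inherits(withDensity$layers[[2]]$geom, "GeomDensity"))
})
```

Tests like this check that an option changes the plot structure, without depending on pixel-level rendering.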
Examples
# Recreate plots from JASP using the new functions ----
library(jaspGraphs)
# this is just so that the figures look nice on GitHub
knitr::opts_chunk$set(fig.width = 8, fig.height = 8, dpi = 250, warning = FALSE)

## Read the debug data set:
debugData <- read.csv(
  file = url("https://raw.githubusercontent.com/jasp-stats/jasp-desktop/stable/Resources/Data%20Sets/debug.csv"),
  stringsAsFactors = TRUE
)

Descriptives

Basic plots: Correlation plots

df <- debugData[, c("contNormal", "contGamma", "contcor1")]

Default

jaspMatrixPlot(df, binWidthType = "sturges", topRightPlotArgs = list(smooth = "lm"))

Manual bins

jaspMatrixPlot(df, binWidthType = "manual", numberOfBins = 5, topRightPlotArgs = list(smooth = "lm"))

Density on diagonals

jaspMatrixPlot(df, binWidthType = "scott", diagonalPlotArgs = list(density = TRUE))

Rug marks on diagonals

jaspMatrixPlot(df, binWidthType = "fd", diagonalPlotArgs = list(rugs = TRUE))
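
For reference, the "sturges", "scott" and "fd" values used above presumably correspond to the bin-count rules that ship with base R (that mapping is my assumption about binWidthType, not something verified here). A quick way to compare what those rules suggest:

```r
x <- mtcars$mpg  # any numeric vector

# Number of bins suggested by each rule (functions from grDevices)
nBins <- c(
  sturges = nclass.Sturges(x),
  scott   = nclass.scott(x),
  fd      = nclass.FD(x)
)

# Sturges' rule is ceiling(log2(n) + 1)
stopifnot(nBins[["sturges"]] == ceiling(log2(length(x)) + 1))

# hist() uses Sturges by default, but passes the suggestion through pretty(),
# so the number of bins actually drawn can differ from the suggestion
h <- hist(x, plot = FALSE)
```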

Customisable plots: Scatter plots

Default

jaspBivariateWithMargins(
  x             = debugData$contNormal,
  y             = debugData$contGamma,
  xName         = "contNormal",
  yName         = "contGamma",
  histogramArgs = list(density = TRUE, densityShade = TRUE, histogram = FALSE),
  smooth        = "loess",
  smoothCi      = TRUE
)

Grouped

jaspBivariateWithMargins(
  x         = debugData$contNormal,
  y         = debugData$contGamma,
  xName     = "contNormal",
  yName     = "contGamma",
  group     = debugData$facGender,
  groupName = "Gender",
  smooth    = "lm"
)

Correlations

Scatter plots: default

df <- debugData[, c("contNormal", "contGamma", "contcor1")]

jaspMatrixPlot(df, diagonalPlotFunction = NULL)

Scatter plots: additional stuff

plotCorrValues <- function(x, y, ...) {
  r <- cor(x, y)
  df <- data.frame(x = 0.5, y = 0.5, label = sprintf("r = %.3f", r))
  ggplot2::ggplot(df, ggplot2::aes(x = x, y = y, label = label)) +
    ggplot2::theme_void() +
    ggplot2::geom_text(size = 5) +
    ggplot2::coord_cartesian(xlim = 0:1, ylim = 0:1)
}

jaspMatrixPlot(
  data                   = df,
  diagonalPlotArgs       = list(density = TRUE),
  # as pointed out by EJ, correlations should show prediction ellipse, not prediction band
  topRightPlotArgs       = list(smooth = "lm", smoothCi = TRUE, predict = "ellipse", predictArgs = list(alpha = 0, color = "black", linetype = 2, size = 1)),
  bottomLeftPlotFunction = plotCorrValues
)
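
For comparison, a 95% ellipse can be drawn in plain ggplot2 with stat_ellipse(). This is only a sketch of the ellipse-versus-band idea, not what `predict = "ellipse"` does internally; `type = "norm"` assumes bivariate normality and ignores the uncertainty in the estimated mean and covariance.

```r
library(ggplot2)

ellipsePlot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # 95% ellipse under a bivariate normal assumption
  stat_ellipse(type = "norm", level = 0.95, linetype = 2, colour = "black") +
  # regression line with its confidence band
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE)
```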

Pairwise plots

jaspBivariateWithMargins(
  x                    = debugData$contNormal,
  y                    = debugData$contGamma,
  xName                = "contNormal",
  yName                = "contGamma",
  histogramArgs        = list(density = TRUE),
  margins              = c(0.4, 0.6),
  topRightPlotFunction = plotCorrValues,
  smooth               = "lm"
)

JAGS

df <- data.frame(
  mu    = rnorm(10000),
  sigma = rgamma(10000, 3, 3),
  chain = gl(4, 2500)
)

jaspMatrixPlot(
  data                   = df,
  bottomLeftPlotFunction = jaspBivariate,
  bottomLeftPlotArgs     = list(type = "contour"),
  topRightPlotArgs       = list(type = "hex", args = list(bins = 15))
)

Additional custom shenanigans

jaspMatrixPlot(
  data                   = df,
  binWidthType           = "manual",
  numberOfBins           = 30,
  diagonalPlotArgs       = list(density = TRUE, groupingVariable = df$chain, groupingVariableName = "Chain", histogramPosition = "identity"),
  bottomLeftPlotFunction = jaspBivariate,
  bottomLeftPlotArgs     = list(type = "density"),
  topRightPlotArgs       = list(type = "hex", args = list(bins = 15))
)

Created on 2022-10-20 with reprex v2.0.2

Kucharssim requested a review from vandenman on October 20, 2022, 12:31
Review threads: R/jaspBivariate.R (three, outdated), R/jaspMatrixPlot.R (one)
@vandenman
Contributor

Above are some minor comments. In general, I think this is a good idea.

Concerning the implementation, I think we'll only know for sure whether it is flexible enough once we start replacing some existing figures in JASP with this.

> Should we implement unit tests for these plotting functions here in jaspGraphs? That way it would be easier to maintain the functionality if we need to fix or change something.

Ideally, yes. There are also some unit tests for plot editing, although those need to be updated.

> What should be the default standards? Most of the stuff is already JASP-style, but we do not really have guidance on how to draw, for example, confidence and prediction intervals, what histogram method should be the default,

I would stick to whatever existing analyses do at the moment. For example, this
[image]

is what linear regression does: a shaded area for the CI and dashed lines for the prediction interval. For the default histogram method, I would use what R uses (Sturges).
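
A rough sketch of that convention in plain ggplot2, with a shaded ribbon for the confidence interval and dashed lines for the prediction interval. The data, the 95% level and the alpha value are placeholder choices, not JASP's actual settings.

```r
library(ggplot2)

fit  <- lm(mpg ~ wt, data = mtcars)
grid <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 100))

# Both intervals come from predict.lm(); only the interval type differs
confInt <- predict(fit, newdata = grid, interval = "confidence", level = 0.95)
predInt <- predict(fit, newdata = grid, interval = "prediction", level = 0.95)

bands <- data.frame(
  wt    = grid$wt,
  fit   = confInt[, "fit"],
  ciLwr = confInt[, "lwr"], ciUpr = confInt[, "upr"],
  piLwr = predInt[, "lwr"], piUpr = predInt[, "upr"]
)

intervalPlot <- ggplot(bands, aes(x = wt)) +
  geom_ribbon(aes(ymin = ciLwr, ymax = ciUpr), alpha = 0.3) +  # shaded CI
  geom_line(aes(y = piLwr), linetype = "dashed") +             # lower PI
  geom_line(aes(y = piUpr), linetype = "dashed") +             # upper PI
  geom_line(aes(y = fit)) +
  geom_point(data = mtcars, aes(x = wt, y = mpg))
```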


2 participants