Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Overlapping histograms #114

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

arandolan
Copy link
Contributor

Addresses [https://github.com/jasp-stats/INTERNAL-jasp/issues/1695] by providing a new checkBox entitled 'Overlay' under Distribution Plots in Descriptives, which overlays the histograms for all split factor groups within a variable, so that they are all displayed on the same plot instead of one plot for each group.

Comment on lines +1350 to +1351
data = data.frame(x = variable, g = overlayFactor),
mapping = ggplot2::aes(x = x, fill=factor(g), y = ..density..),#options$splitBy, y = ..density..),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps it makes sense to incorporate these changes here? If you prefer, I can also do that (either before or after this PR).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, I have a very slight preference for moving onto something else in the interest of getting broader exposure to different parts of JASP, but emphasis on the word 'slight' - I would also happily have a go at incorporating the changes in jaspGraphs. By the way I think it's worth mentioning that there were a couple of not-completely-resolved questions about whether this could be implemented more neatly, and what to do if there is no split factor. Are there differences between the required functionality or code standards in the jaspGraphs and jaspDescriptives modules?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a very slight preference for moving onto something else in the interest of getting broader exposure to different parts of JASP

Makes sense.

By the way I think it's worth mentioning that there were a couple of not-completely-resolved questions about whether this could be implemented more neatly, and what to do if there is no split factor.

This can definitely be done in one function (and should probably be done in one function).
Personally, I'd implement this somewhat like this:

.plotMarginal_Overlay <- function(column, variableName, overlayFactor = NULL, splitByName = NULL, ...) {

if (displayDensity) {
  if (is.null(overlayFactor)) {
    histogramData    <- data.frame(x = variable, g = overlayFactor)
    histogramMapping <- ggplot2::aes(x = x, fill=factor(g), y = ..density..)
  } else {
    histogramData    <- data.frame(x = variable)
    histogramMapping <- ggplot2::aes(x = x, y = ..density..)
  }

  p <- p +
    ggplot2::geom_histogram(
      data    = histogramData,
      mapping = histogramMapping,
      ...
    ) + ...
}

Are there differences between the required functionality or code standards in the jaspGraphs and jaspDescriptives modules?

Since jaspGraphs is reused by all of JASP, I tend to be a bit more strict in terms of functionality and documentation (e.g., enforcing roxygen2 for documentation). Also, all code there must pass the R CMD CHECK. but I can also do that if you prefer to work on other things!

@arandolan arandolan requested a review from Kucharssim March 3, 2022 09:11
Copy link
Member

@Kucharssim Kucharssim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks quite good, I made some comments to both the code and the functionality. I agree with @vandenman that implementing the stacked plot as part of the new jaspGraphs::jaspHistogram would be a good idea, but we can also do this later.

aPlot <- .descriptivesFrequencyPlots_SubFunc(dataset = dataset, variable = variable, width = options$plotWidth, height = options$plotHeight, displayDensity = options$distPlotDensity, rugs = options$distPlotRug, title = variable, binWidthType = options$binWidthType, numberOfBins = options$numberOfBins)
aPlot$dependOn(options = "splitby", optionContainsValue = list(variables = variable))

if (options$overlay == TRUE) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (options$overlay == TRUE) {
if (options[["overlay"]]) {
  1. overlay is binary already
  2. using [[ over $ avoids partial matching, so it's good to prefer that.

@@ -1239,6 +1289,111 @@ Descriptives <- function(jaspResults, dataset, options) {
return(p)
}



.plotMarginal_Overlay <- function(column, variableName, overlayFactor, splitByName,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.plotMarginal_Overlay <- function(column, variableName, overlayFactor, splitByName,
.plotMarginalOverlay <- function(column, variableName, overlayFactor, splitByName,

We use camelCase, so no underscores in the names, please. I know that jaspDescriptives does not follow this convention, but it's good to start doing it here.

Comment on lines +1358 to +1363
ggplot2::geom_line(
data = data.frame(x = dens$x, y = dens$y),
mapping = ggplot2::aes(x = x, y = y),
lwd = lwd,
col = "black"
) + ggplot2::scale_fill_manual(name = splitByName, values = rainbow(nlevels(overlayFactor)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be also good to split the density plot by the group

@@ -83,6 +83,7 @@ Form
enabled: plotVariables.checked || plotCorrelationMatrix.checked

indent: true
CheckBox { name: "overlay"; label: qsTr("Overlay") }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
CheckBox { name: "overlay"; label: qsTr("Overlay") }
CheckBox { name: "distPlotOverlay"; label: qsTr("Overlay split levels in one plot"); enabled: splitBy.count > 0 }
  1. change of the option name: It is good to make sure that the name of the option makes it clear which output it affects. In this case, the distribution plots.
  2. change the label: be more descriptive to the user
  3. enabled: enabling the option only if splitBy is not empty makes it clearer to the user what the button does.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The question in general is whether this option should be presented here as is, as it affects only the distribution plots but not the correlation plots. The alternative option would be to implement this split for the correlation plots as well (which would be ideal, I think). But this is already a good step forward, so I would not be that opposed to let it be like this. We can revisit the correlation plots in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, it would be good to add description of this button in the help file (you can find the file inside inst/help/Descriptives.md).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants