Skip to content

Commit

Permalink
QuartetConcordance() option to weight averages (#176)
Browse files Browse the repository at this point in the history
  • Loading branch information
ms609 authored Feb 6, 2025
1 parent 5fbef31 commit fd64943
Show file tree
Hide file tree
Showing 3 changed files with 69 additions and 21 deletions.
48 changes: 38 additions & 10 deletions R/Concordance.R
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,22 @@
#' with any tree that groups the first two leaves together to the exclusion
#' of the second.
#'
#' By default, the reported value weights each site by the number of quartets
#' it is decisive for. This value can be interpreted as the proportion of
#' all decisive quartets that are concordant with a split.
#' If `weight = FALSE`, the reported value is the mean of the concordance
#' value for each site.
#' Consider a split associated with two sites:
#' one that is concordant with 25% of 96 decisive quartets, and
#' a second that is concordant with 75% of 4 decisive quartets.
#' If `weight = TRUE`, the split concordance will be 24 + 3 / 96 + 4 = 27%.
#' If `weight = FALSE`, the split concordance will be mean(75%, 25%) = 50%.
#'
#' `QuartetConcordance()` is computed exactly, using all quartets, where as
#' other implementations (e.g. IQ-TREE) follow
#' \insertCite{@Minh2020;textual}{TreeSearch} in using a random subsample
#' of quartets for a faster, if potentially less accurate, computation.
#'
# `ClusteringConcordance()` and `PhylogeneticConcordance()` respectively report
# the proportion of clustering information and phylogenetic information
# \insertCite{@as defined in @Vinh2010, @SmithDist}{TreeDist} within a dataset
Expand All @@ -28,6 +44,8 @@
#'
#' @template treeParam
#' @template datasetParam
#' @param weight Logical specifying whether to weight sites according to the
#' number of quartets they are decisive for.
#'
#'
#'
Expand All @@ -44,14 +62,15 @@
#' spc <- SharedPhylogeneticConcordance(tree, dataset)
#' mcc <- MutualClusteringConcordance(tree, dataset)
#'
#' oPar <- par(mar = rep(0, 4), cex = 0.8)
#' oPar <- par(mar = rep(0, 4), cex = 0.8) # Set plotting parameters
#' plot(tree)
#' TreeTools::LabelSplits(tree, signif(qc, 3))
#' TreeTools::LabelSplits(tree, signif(cc, 3))
#' TreeTools::LabelSplits(tree, signif(pc, 3))
#' par(oPar)
#' TreeTools::LabelSplits(tree, signif(qc, 3), cex = 0.8)
#' plot(tree)
#' TreeTools::LabelSplits(tree, signif(cc, 3), cex = 0.8)
#' par(oPar) # Restore plotting parameters
#'
#' pairs(cbind(qc, cc, pc, spc, mcc))
#' # Display correlation between concordance factors
#' pairs(cbind(qc, cc, pc, spc, mcc), asp = 1)
#' @template MRS
#' @importFrom ape keep.tip
#' @importFrom cli cli_progress_bar cli_progress_update
Expand All @@ -60,7 +79,7 @@
#' @name SiteConcordance
#' @family split support functions
#' @export
QuartetConcordance <- function (tree, dataset = NULL) {
QuartetConcordance <- function (tree, dataset = NULL, weight = TRUE) {
if (is.null(dataset)) {
warning("Cannot calculate concordance without `dataset`.")
return(NULL)
Expand All @@ -75,7 +94,7 @@ QuartetConcordance <- function (tree, dataset = NULL) {
cli_progress_bar(name = "Quartet concordance", total = dim(logiSplits)[[2]])
setNames(apply(logiSplits, 2, function (split) {
cli_progress_update(1, .envir = parent.frame(2))
quarts <- rowSums(apply(characters, 2, function (char) {
quarts <- apply(characters, 2, function (char) {
tab <- table(split, char)
nCol <- dim(tab)[[2]]
if (nCol > 1L) {
Expand Down Expand Up @@ -106,12 +125,21 @@ QuartetConcordance <- function (tree, dataset = NULL) {
# Only quartets that include two T and two F can be decisive
# Quartets must also include two pairs of characters
decisive <- concordant + discordant

# Return the numerator and denominatory of equation 2 in
# Minh et al. 2020
c(concordant, decisive)
} else {
c(0L, 0L)
}
}))
ifelse(is.nan(quarts[[2]]), NA_real_, quarts[[1]] / quarts[[2]])
})
if (isTRUE(weight)) {
quartSums <- rowSums(quarts)
ifelse(is.nan(quartSums[[2]]), NA_real_, quartSums[[1]] / quartSums[[2]])
} else {
mean(ifelse(is.nan(quarts[2, ]), NA_real_, quarts[1, ] / quarts[2, ]),
na.rm = TRUE)
}
}), names(splits))
}

Expand Down
34 changes: 27 additions & 7 deletions man/SiteConcordance.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 4 additions & 4 deletions vignettes/getting-started.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -17,10 +17,10 @@ datasets.

### Simple installation

To use "TreeSearch" you will first need to [install R](https://CRAN.R-project.org/).
[RStudio](https://posit.co/)
([overview](https://dss.princeton.edu/training/RStudio101.pdf)) is a popular
front-end that makes several of R's features easier to use.
To use "TreeSearch" you will first need to [install R](
https://CRAN.R-project.org/).
[RStudio](https://posit.co/) is a popular front-end that makes several of R's
features easier to use.

The "TreeSearch" package can be installed as any other package.
To get the latest stable version from CRAN, type
Expand Down

0 comments on commit fd64943

Please sign in to comment.