Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hide the input parameters from a design object #392

Open
LittleBeannie opened this issue May 14, 2024 · 5 comments
Open

Hide the input parameters from a design object #392

LittleBeannie opened this issue May 14, 2024 · 5 comments
Assignees
Labels
question Further information is requested

Comments

@LittleBeannie
Copy link
Collaborator

LittleBeannie commented May 14, 2024

The current design object generated by gs_design_ahr, gs_power_ahr, etc., includes a lengthy list of input parameters that users may not find of interest. Is it possible to hide these input parameters?
image

@LittleBeannie LittleBeannie self-assigned this May 14, 2024
@LittleBeannie LittleBeannie added development question Further information is requested and removed development labels May 14, 2024
@nanxstats
Copy link
Collaborator

I guess you would just need a print() method implemented for these objects - which should have been the case from the beginning.

@LittleBeannie LittleBeannie removed their assignment May 16, 2024
@jdblischak
Copy link
Collaborator

We can use an S3 print method to suppress printing the list element input from design objects (class gs_design or fixed_design). From a quick check of a few functions, it appears that objects of these classes always have the element input. However, I did notice that gs_design_wlr() was the only function to return bounds (plural) instead of bound.

library("gsDesign2")

x <- gs_design_ahr()
class(x)
## [1] "non_binding" "ahr"         "gs_design"   "list"
names(x)
## [1] "input"       "enroll_rate" "fail_rate"   "bound"       "analysis"

x <- gs_power_ahr(lpar = list(sf = gsDesign::sfLDOF, total_spend = 0.1))
class(x)
## [1] "non_binding" "ahr"         "gs_design"   "list"
names(x)
## [1] "input"       "enroll_rate" "fail_rate"   "bound"       "analysis"

x <- fixed_design_ahr(
  alpha = .025, power = .9,
  enroll_rate = define_enroll_rate(duration = 18, rate = 1),
  fail_rate = define_fail_rate(
    duration = c(4, 100),
    fail_rate = log(2) / 12,
    hr = c(1, .6),
    dropout_rate = .001
  ),
  study_duration = 36
)
class(x)
## [1] "fixed_design" "list"
names(x)
## [1] "input"       "enroll_rate" "fail_rate"   "analysis"    "design"

x <- gs_design_npe()
class(x)
## [1] "tbl_df"     "tbl"        "data.frame"

x <- gs_design_wlr()
class(x)
## [1] "non_binding" "wlr"         "gs_design"   "list"
names(x)
## [1] "input"       "enroll_rate" "fail_rate"   "bounds"      "analysis"

@jdblischak
Copy link
Collaborator

However, while implementing the S3 method, I ran into a problem. The output from the summary() functions summary.gs_design() and summary.fixed_design() also have the class "gs_design" or "fixed_design", even though their outputs are very different. Thus we're trying to write a print method that works not only on the list output from the design functions but also the tables returned by their summary functions.

In general, why are we sharing so many classes between objects with very different structures? This already allows for some non-sensical possibilities. For example, if you run as_gt() on the output from gs_design_ahr(), it will try to run as_gt.gs_design() because it has that class. Of course it immediately fails since it is a list and not a tibble/df.

Below demonstrates how we are assigning the same classes to both the original list object as well as its summarized tibble object.

x_gs <- gs_design_ahr()
class(x_gs)
## [1] "non_binding" "ahr"         "gs_design"   "list" 
class(summary(x_gs))
## [1] "non_binding" "ahr"         "gs_design"   "grouped_df"  "tbl_df"      "tbl"        
## [7] "data.frame" 

x_fs <- fixed_design_ahr(
  alpha = .025, power = .9,
  enroll_rate = define_enroll_rate(duration = 18, rate = 1),
  fail_rate = define_fail_rate(
    duration = c(4, 100),
    fail_rate = log(2) / 12,
    hr = c(1, .6),
    dropout_rate = .001
  ),
  study_duration = 36
)
class(x_fs)
## [1] "fixed_design" "list"   
class(summary(x_fs))
## [1] "fixed_design" "ahr"          "tbl_df"       "tbl"          "data.frame" 

I think one issue we are running into is that S3 is single dispatch. In other words, it can only choose a method based on a single class. It would be nice if we could have a summary method for objects with class fixed_design and ahr, but that's not possible with S3. We'd have to migrate to S4 to enable multiple dispatch on multiple classes, but S4 is a big jump in complexity and strictness that I doubt we want to take on.

At minimum, I think we need to differentiate between objects that are essentially lists versus data frames, eg maybe gs_design_list versus gs_design_df.

@nanxstats
Copy link
Collaborator

@jdblischak We should stick to S3. The summary return objects should not have the same class as the design objects - I'm talking about the critical one for method dispatching. If you look at lm() and summary.lm():

fit <- lm(1 ~ 1)
class(fit)
#> [1] "lm"
class(summary(fit))
#> [1] "summary.lm"

So the solution is simple: revisit how those classes got assigned and correct them.

@jdblischak
Copy link
Collaborator

I think one issue we are running into is that S3 is single dispatch. In other words, it can only choose a method based on a single class. It would be nice if we could have a summary method for objects with class fixed_design and ahr, but that's not possible with S3. We'd have to migrate to S4 to enable multiple dispatch on multiple classes, but S4 is a big jump in complexity and strictness that I doubt we want to take on.

I realized this was incorrect, so I want to clarify. Single versus multiple dispatch refers how many objects can influence the method dispatch. For example, in the hypothetical function call f(a, b), if f() is an S3 generic, then it is only influenced by the class of a. But if it were an S4 generic, then it could be influenced by the class of both a and b.

With that out of the way, back to the strategy of assigning multiple classes to a single object.

class(gsDesign2::gs_design_ahr())
## [1] "non_binding" "ahr"         "gs_design"   "list"

Only a single classed method will be dispatched, so why assign multiple classes? Because that way you can fall back to use methods for later classes. For example, you can define summary.gs_design() to create a custom summary (as is done in {gsDesign2}), but still take advantage of all the existing S3 methods defined for the class list:

methods(class = "list")
##  [1] all.equal     as.data.frame as_tibble     cbind2        coerce        kronecker    
##  [7] Ops           rbind2        relist        type.convert  within       
## see '?methods' for accessing help and source code

Looking at the package NAMESPACE, we only have S3 methods assigned to two classes: fixed_design and gs_design

gsDesign2/NAMESPACE

Lines 3 to 10 in 48a165b

S3method(as_gt,fixed_design)
S3method(as_gt,gs_design)
S3method(as_rtf,fixed_design)
S3method(as_rtf,gs_design)
S3method(summary,fixed_design)
S3method(summary,gs_design)
S3method(to_integer,fixed_design)
S3method(to_integer,gs_design)

Thus the other classes like non_binding, ahr, and wlr are not being used for method dispatch. Instead we are recreating object oriented methods by checking the class of the object at the start of the function, eg in as_gt.fixed_design

gsDesign2/R/as_gt.R

Lines 82 to 87 in 48a165b

# get the design method
if ("ahr" %in% class(x)) {
design_mtd <- "ahr"
} else if ("fh" %in% class(x)) {
design_mtd <- "fh"
} else if ("mb" %in% class(x)) {

I'll continue to investigate how classses/methods are used in the package currently and then propose an overhaul. This should help address not only this Issue but also the refactoring of the S3 methods for summary() and as_gt() described in #282.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants