Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get_ function names #948

Closed
sbfnk opened this issue Oct 15, 2024 · 23 comments
Closed

get_ function names #948

sbfnk opened this issue Oct 15, 2024 · 23 comments

Comments

@sbfnk
Copy link
Contributor

sbfnk commented Oct 15, 2024

Sorry if I missed a previous discussion about this (which I suspect there has been) but I typically think of get_ functions to provide access to some properties of an object (as is the case for get_forecast_unit()) but not to perform operations / calculations. With this in mind I would expect e.g.

  • get_coverage() to be called coverage()
  • get_pit() to be called pit()
  • get_pairwise_comparison() to be called pairwise_comparison()
    Otherwise should score() not be called get_scores() etc.?
@nikosbosse
Copy link
Contributor

Hmm. The naming did make sense to me at the time in the context of this flowchart:

image

But I do see your point.

Not sure how many people are using the gh version of scoringutils, but I suspect by now it might be a few. So changing names might not be completely free (but easier now than later - and we could even think about deprecation even though we tried to avoid this by breaking everything once).

What do others think?

@seabbs @nickreich @elray1

@nikosbosse
Copy link
Contributor

What do people think about this? We should aim to resolve this as soon as possible, as we have to do a CRAN update until October 31.

@nikosbosse
Copy link
Contributor

In terms of names I tend to agree with @sbfnk I think

@elray1
Copy link
Collaborator

elray1 commented Oct 20, 2024

I don't think our group has anything in place that would be deeply affected by a name change here.

@nikosbosse
Copy link
Contributor

Thanks @elray1. Do you have a preference either way?

@elray1
Copy link
Collaborator

elray1 commented Oct 20, 2024

Seb's point seems reasonable to me

@seabbs
Copy link
Contributor

seabbs commented Oct 21, 2024

I don't have a strong view either way and happy to drop get_. Noting I think this would be effectively all uses aside from get_forecast_unit and get_forecast_type as I think everything else is doing some kind of compute and so following "but not to perform operations / calculations. With this in mind I would expect e.g." would be expected to not be referred to as a get.

Also noting that one of the benefits of get_ is that it avoids naming collisions as it acts as a package prefix here. I don't have good insights into how much of a problem that is here.

@nikosbosse
Copy link
Contributor

nikosbosse commented Oct 21, 2024

Renaming suggestions (i.e. visualising the proposal a bit)

  1. get_duplicate_forecasts() --> duplicate_forecasts()
  2. get_pit() --> pit_histogram() (this is the data.table version, pit_histogram_sample() is then the metric equivalent)
  3. get_coverage() --> coverage()
  4. get_correlations() --> correlations()
  5. get_pairwise_comparisons() --> pairwise_comparisons()
  6. get_forecast_counts() --> forecast_counts

get_forecast_unit() and get_forecast_type() I guess make sense as they are. I'm slightly on the fence with get_duplicate_forecasts() (function shows you duplicates that cause validation to fail).

@nikosbosse
Copy link
Contributor

Good point about the naming collisions

@nikosbosse
Copy link
Contributor

Also what I like about get_ is that you can start typing get_ and you'll see the suggestions in your IDE

@seabbs
Copy link
Contributor

seabbs commented Oct 21, 2024

which is really a benefit talk about for prefixes more generally (so one option is to pick a different prefix). I find that one a little weak though as what I usually do when exploring a package is do scoringutils:: and then look through the suggestions (as it works in the same way)

@sbfnk
Copy link
Contributor Author

sbfnk commented Oct 21, 2024

  1. get_duplicate_forecasts() --> duplicate_forecasts()

With this one I think we should probably call it find_duplicate_forecasts() or similar, as it's not a function that duplicates forecasts.

@seabbs
Copy link
Contributor

seabbs commented Oct 22, 2024

Also noting a point from #949 that without a prefix it may be hard to distinguish these doing functions from metrics (which also don't have a clear prefix).

@sbfnk
Copy link
Contributor Author

sbfnk commented Oct 22, 2024

One way to get around this would be to make the metrics function operating on matrices/vectors internal (see #949 (comment)).

@nikosbosse
Copy link
Contributor

Also noting a point from #949 that without a prefix it may be hard to distinguish these doing functions from metrics (which also don't have a clear prefix).

I understood Sam's comment as a broader comment on all functions. We have some mixing at the moment as well, not sure how much of an issue this is going to be in reality. But I appreciate it's definitely not zero.

@seabbs
Copy link
Contributor

seabbs commented Oct 23, 2024

Ideally we would resolve this asap as about to try and write a workshop for scoringutils and ideally it won't all get cycle broken.

I don't have a strong view and happy to go with whatever. I would note though that this is another instance of a late revamp of already discussed choices which does feel inefficient so reflecting on how we can do better in initial dev conversations seems like a good idea.

@nikosbosse
Copy link
Contributor

nikosbosse commented Oct 23, 2024

which does feel efficient

inefficient you mean?

@seabbs
Copy link
Contributor

seabbs commented Oct 23, 2024

ha yes

@nikosbosse
Copy link
Contributor

nikosbosse commented Oct 23, 2024

How strongly do you feel, @sbfnk ? If we're all basically indifferent then I'd lean towards just leaving everything as is. If you do feel strongly (and I'm all in favour of strong feelings related to scoringutils!), then I'm happy to make the change tonight

@sbfnk
Copy link
Contributor Author

sbfnk commented Oct 23, 2024

If I haven't managed to convinced people that this matters then clearly it's just my own pet grievance, and I will cope.

@sbfnk sbfnk closed this as not planned Won't fix, can't repro, duplicate, stale Oct 23, 2024
@nikosbosse
Copy link
Contributor

nikosbosse commented Oct 24, 2024

If I haven't managed to convinced people that this matters

I think it's more a general scoringutils-related decision paralysis :) I genuinely find it difficult at this point to put myself into an ordinary user's shoes.... (also let's not forget that I was the original mastermind behind naming choices like eval_forecasts() and range_wide_to_long()...)

But I'm genuinely happy to keep thinking and talking about this. Even at a later point we can always decide to just go through a proper deprecation cycle.

@seabbs
Copy link
Contributor

seabbs commented Oct 24, 2024

yes agree

@nikosbosse
Copy link
Contributor

yes agree

especially with the mastermind part, I suppose :-D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants