proposal for a new CRAN CTV on Archaeological Science has now been submitted #71

Open
benmarwick opened this issue Sep 3, 2024 · 13 comments

Comments

@benmarwick
Owner

You can see it here: cran-task-views/ctv#64. I read a bunch of recently accepted CTVs and edited the scope slightly to match what I saw in them.

@nfrerebeau
Contributor

Following on from the discussion of the proportion of GitHub projects relative to CRAN packages, here are two CRAN packages we missed:

@benmarwick
Owner Author

Thanks @nfrerebeau I've added those now.

Hello everyone @scpederzani @SCSchmidt @lolosp @lsteinmann @LiYingWang @bbartholdy @joeroe @samleggs22, here is an update on the submission of our proposal to the CRAN CTV maintainers.

In brief, they would like to see no more than 20% of the packages on non-CRAN repositories. Currently we have about 47%, so we need to

  • remove 34 GitHub packages from our list, or
  • get 28 of the packages that are currently only on GitHub onto CRAN, or
  • a combination of getting some of those 28 GitHub packages onto CRAN and removing others from our CTV. Perhaps we can remove only those that have not been updated in the last five years? I saw @nfrerebeau's list here showing that about 1/3 of the GitHub packages had recent activity.

Within our group of maintainers I counted about 11 GitHub packages that we maintain ourselves (@joeroe, me, @SCSchmidt and @lsteinmann). So one way to get started could be for us to get those 11 onto CRAN, remove a bunch of others, and perhaps ask the @ISAAKiel group if they might put some of theirs on CRAN to help us get close to 20%.

What do you think? Please let me know your thoughts!

Code for calculations
# get the text of the CTV
ctv_url <- "https://raw.githubusercontent.com/benmarwick/ctv-archaeology/master/Archaeology.md"

# import into R
ctv_url_tbl <- scan(ctv_url, what = character())

# convert to scalar
ctv_url_tbl_txt <- paste0(ctv_url_tbl, collapse = " ")

# count how many github pkgs
n_github_pkgs <- stringr::str_count(ctv_url_tbl_txt, "r github\\(")

# count how many cran pkgs
n_cran_pkgs <- stringr::str_count(ctv_url_tbl_txt, "r pkg\\(")

# what's the percentage of github pkgs currently?
n_github_pkgs / (n_github_pkgs + n_cran_pkgs)

# currently 47.5%
# target is <20% so how many github packages need to go to CRAN?

n_github_pkgs - (0.2 * (n_github_pkgs + n_cran_pkgs))

# ~ 28 github packages need to go to CRAN to get to 20% github pkgs

# how many github packages to remove to get to 20%

n_github_pkgs - (n_cran_pkgs * 0.25)

# ~ 34

# GitHub-only packages maintained within our group (11 in total):
# joeroe 6
# benmarwick 3
# SCSchmidt 1
# lsteinmann 1
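
The combined option (moving some packages to CRAN and removing others) can be sketched with a small helper. This is only an illustrative sketch: the function name `n_to_remove` is hypothetical, and the counts in the example call (48 GitHub, 53 CRAN) are assumed values chosen to roughly match the 47.5% reported above, not figures read from the CTV file.

```r
# Hypothetical helper (not part of the original calculation):
# how many GitHub-only packages must be dropped, after `n_moved` of them
# have been accepted on CRAN, so that the non-CRAN share is <= `target`.
n_to_remove <- function(n_github, n_cran, n_moved = 0, target = 0.2) {
  stopifnot(n_moved >= 0, n_moved <= n_github, target > 0, target < 1)
  github_left <- n_github - n_moved  # still GitHub-only
  cran_now <- n_cran + n_moved       # CRAN count grows by the moved packages
  # Solve (github_left - x) / (github_left - x + cran_now) = target for x
  x <- github_left - cran_now * target / (1 - target)
  # Small tolerance guards against floating-point noise at exact boundaries
  max(0, ceiling(x - 1e-9))
}

# Assumed counts for illustration only: 48 GitHub, 53 CRAN, 10 moved to CRAN
n_to_remove(n_github = 48, n_cran = 53, n_moved = 10)
```

With `n_moved = 0` the helper reduces to the "how many to remove" line above, rounded up to a whole package.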

@bbartholdy
Contributor

If we do filter out no updates in the last five years, I think this should only be for software packages and not data packages, since the latter don't really need updating to the same extent?

@lolosp
Contributor

lolosp commented Sep 14, 2024

What about a filtering criterion based on compatibility with current R and dependency releases, i.e. filter out packages that are no longer functional? I would be happy to put some time towards testing these. The cool thing about the CTV is finding useful packages that are not on CRAN, so it would be good to try and keep those if possible...
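
A minimal sketch of such a check, assuming that "functional" is approximated by "installs without error on the current R setup" (a weaker criterion than actually working). The helper name `check_installable` is hypothetical, the repository names in the usage line are placeholders, and the default installer assumes the {remotes} package is available.

```r
# Hypothetical helper: try to install each GitHub repo and record whether
# the installation completed without an error.
# `installer` defaults to remotes::install_github (assumes {remotes} is
# installed); any function taking a single "owner/repo" string can be swapped in.
check_installable <- function(repos, installer = remotes::install_github) {
  ok <- vapply(repos, function(repo) {
    tryCatch({
      installer(repo)
      TRUE
    }, error = function(e) FALSE)
  }, logical(1))
  data.frame(repo = repos, installable = unname(ok))
}

# Usage (placeholder repo names, not run here):
# check_installable(c("owner/pkg-a", "owner/pkg-b"))
```

A successful install does not prove the package still works, so running each package's examples or tests would be a natural next step.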

@bbartholdy
Contributor

We could also take a look at some of the data packages and see if we can get them on CRAN, since these should be relatively straightforward to submit and maintain?

@lsteinmann
Contributor

I could prepare my GitHub-clayrings-data-thing for CRAN, but to be honest, I feel it does not make so much sense, because it is so very very tiny and overly specific.

I would be happy to help a bit with anyone else getting something CRAN-ready?

@nfrerebeau
Contributor

nfrerebeau commented Sep 16, 2024

Submitting a bunch of packages to CRAN would be the ideal solution. Personally, I've never had any issue with the submission process and I find it pretty smooth (despite the fact that CRAN always asks for mandatory fixes during fieldwork). However, I may be subject to survivorship bias. My packages are relatively simple to maintain (e.g. no system library dependency) and there are examples of CRAN maintainers being quite harsh (no name needed here, but there is a phrase for that). So I'd understand if package maintainers prefer to build their own CRAN-like repository (with R-universe or drat). Asking maintainers to submit their packages to CRAN also implies a long-term commitment, as we don't want these packages to be archived within the next 6 months.

That being said, CRAN is the centerpiece of the quality of the R ecosystem, thanks to its stringent standards. It also makes things easier for beginners, as you only need to use install.packages() to get started.

We should not only consider submitting new packages to CRAN, but also pruning GitHub projects. As @lolosp suggests, we can filter out GitHub projects that are no longer functional. Maybe we can also filter out projects without a DOI (i.e. that are not properly archived) and emphasize peer review (e.g. rOpenSci packages).

I suspect that most GitHub projects are listed at https://open-archaeo.info. We can add this link in the CTV preamble to let interested people discover more R resources.

@joeroe
Contributor

joeroe commented Sep 18, 2024

It's a fair point about the proportion of CRAN packages. I know some have reasonable objections to CRAN and prefer e.g. r-universe or just distributing source code on GitHub, but I'd still say that CRAN is the de facto standard repository for R and therefore important from an accessibility and reproducibility point of view. Something we should be trying to encourage with this CTV, in other words.

I plan to submit all my packages to CRAN anyway, so I can make a push to do so with all of them listed here. But I have to warn that realistically I won't be able to do so before the end of October.

@nfrerebeau
Contributor

While trying to increase the number of packages on CRAN, we could reduce the number of GitHub packages to move the CTV submission forward (we can always add more packages later)?

I've tried to make a small selection here: nfrerebeau@36fc32d

I have kept projects that have been peer-reviewed or have been on CRAN and are currently archived. I've also left some projects that I felt were the least redundant with the CRAN packages, but this is highly opinionated.

Whatever the criteria, the cut is significant. GitHub packages still make up about 25% of the list, but maybe the CTV editors will be understanding 😅

What do you think?

@benmarwick
Owner Author

benmarwick commented Oct 2, 2024

Thanks everyone, looks like between us we can get a few more packages on to CRAN (thanks @joeroe and @lsteinmann!) and drop a few GitHub-only packages (thanks @nfrerebeau for making a start on this, great idea to mention https://open-archaeo.info in the CTV preamble).

Perhaps we should set a three-month deadline for some of us to work on getting some of our packages onto CRAN, and then update the proposal and see where we are in terms of the requirements of the CRAN CTV reviewers.

Here are some GitHub packages from our group that we might be able to get onto CRAN:

Joe, let me know if you know that some of these will never go to CRAN:

  • joeroe/c14
  • joeroe/stratigraphr
  • joeroe/fieldwalkr
  • joeroe/islay
  • joeroe/swapdata
  • joeroe/rintchron

These are mine that I'm confident I can get accepted to CRAN:

  • benmarwick/signatselect
  • benmarwick/roev
  • benmarwick/evoarchdata

Lisa, go for it!

  • lsteinmann/clayringsmiletus

And here are some packages from others whom I will contact to see if they are interested in submitting to CRAN as well:

  • bischrob/Rosegate-Projectile-Points-in-the-Fremont-Region
  • ercrema/cTransmission

All these are by Clemens, I think?

  • Johanna-Mestorf-Academy/sdsanalysis
  • nevrome/varnastats
  • nevrome/bleiglas
  • ropensci/c14bazAAR

@joeroe
Contributor

joeroe commented Oct 2, 2024

I think we should retain c14bazAAR and bleiglas, at least. They're both mature and published packages. As far as I understand @nevrome prefers to keep them off CRAN on principle, not because they're not ready. The review process for becoming an rOpenSci package is at least as rigorous as CRAN's.

@benmarwick I won't submit swapdata to CRAN. The rest I think are doable.

@nevrome

nevrome commented Oct 3, 2024

Cool that you're all so actively collaborating on this CTV! I would love to help with this issue, but I stopped doing CRAN submissions entirely. While I get the arguments voiced above by @nfrerebeau, I made this decision based on the following experiences and considerations:

  • Back then the CRAN testing was not sandboxed. Once another person's package changed the test environment and caused my check run to fail. Because of this technical limitation CRAN maintainers naturally had to be strict about changes to the environment. I once got banned from submission for a month for writing a file to the home directory in a test.
  • Getting banned for one month because of a simple mistake like this is unreasonable, in my opinion. I found this very stressful. And that was not the only time I felt treated rudely and unfairly by CRAN maintainers. Especially after having much nicer and more professional encounters in the Haskell ecosystem, I do not want to go back.
  • Most importantly, I think the packages I'm writing are only really relevant for a tiny number of people, and the small gain in convenience for these few users is not worth the overhead of a CRAN submission. CRAN expects very quick responses when your package no longer passes its checks 100% because of a minor change in a dependency. My packages are not important enough that I have to address minor issues immediately. I'd rather have a user open an issue to let me know when they really encounter a problem. My packages also don't have to run everywhere. All these hours I invested back then to get recexcavAAR working on SunOS and make the CRAN checks happy...

Sorry for the rant. I just wanted to voice this once to explain my decision. Feel free to include or remove my packages from the CTV. I will keep maintaining them on a per-demand basis.

@benmarwick
Owner Author

Thanks @nevrome, no worries at all, that's helpful. Sorry to hear of your negative experiences with the CRAN maintainers. I'll cross yours off our list of pkgs to consider getting to CRAN.


7 participants