Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

push vs PR commit #55

Closed
yarikoptic opened this issue May 24, 2021 · 22 comments · Fixed by #64
Closed

push vs PR commit #55

yarikoptic opened this issue May 24, 2021 · 22 comments · Fixed by #64
Assignees

Comments

@yarikoptic
Copy link
Member

I had mentioned that I had only github actions in one folder and apparently travis + appveyor gets sorted out into another:

(tinuous-dev) datalad@smaug:/mnt/datasets/datalad/ci/logs/2021/05$ ls 24/pr/5693/*
24/pr/5693/5a27faf:
appveyor-6046-failed  appveyor-6046-success  travis-11955-success

24/pr/5693/d524cd9:
 github-Benchmarks-3127-success   github-CrippledFS-1683-success   github-Docs-3127-success   github-Extensions-3127-success   github-Linters-103-success  'github-Shellcheck on scripts-1314-success'  'github-Test on macOS-1364-success'

because:

(git)lena:~datalad/datalad-master[nf-foreach]git
$> git show d524cd9    
commit d524cd9c8af5a57192999a34647bfb970bd0e80e (refs/pull/origin/5693/head)
Author: Kyle Meyer <[email protected]>
Date:   Mon May 24 09:24:43 2021 -0400

    TST: test_fileinfo: Be more specific about status() return value
    
    Re: https://github.com/datalad/datalad/pull/5693#pullrequestreview-666722514

....
$> git show 5a27faf    
commit 5a27fafa8616eb6ab62f6b2ff6b67c6ff39ffb71 (refs/pull/origin/5693/merge)
Merge: f63ed567d d524cd9c8
Author: Kyle Meyer <[email protected]>
Date:   Mon May 24 09:28:35 2021 -0400

    Merge d524cd9c8af5a57192999a34647bfb970bd0e80e into f63ed567dbc79cc8b22a51b7e00f9fb2419dcb66

so 5a27faf is the merge of that PR commit into corresponding branch.

For the purpose of tinuous I would have liked to have all of those under the same commit/directory for the PR or it would be hard to impossible to bring them together into coherent collection of the CI logs for a given state of the PR. I wonder if that is possible?! is the original commit of the PR branch returned by travis and appveyor? then may be we could have refcommit and refcommit_abbrev or alike.

@jwodder
Copy link
Member

jwodder commented May 24, 2021

@yarikoptic Appveyor's build objects have a pullRequestHeadCommitId field that contains the hash of the commit on the PR branch, but there does not appear to be an equivalent field for Travis.

@yarikoptic
Copy link
Member Author

sad, could you please file and issue with travis? (I am still hopeful to look forward to just have those resolved instead of us coming up with workarounds)

@jwodder
Copy link
Member

jwodder commented May 24, 2021

@yarikoptic Support ticket filed.

@yarikoptic
Copy link
Member Author

one possible workaround for this and #42 (commit time) could be: establish a local clone of the repository with the fetch of all PRs refs, as I mentioned in #42 (comment) . It would unlikely work for historical PRs.

@yarikoptic
Copy link
Member Author

I wonder if a lightweight solution could be to just get the corresponding PR pull/head refs, e.g. like

curl -H "accept: application/json" https://github.com/datalad/datalad/refs/head/101 

(but this doesn't work... just trying to mimic on what git is likely to be doing, may be agent needs to be overriden) then we could mate both commits (PR head and merge). Another strategy could be to take information about parent commit from a CI which supports it before considering a CI which doesn't

@yarikoptic
Copy link
Member Author

Please choose any path you like @jwodder to go forward on this one.

IMHO local cached clone might be best since would allow for other interrogations and might help reduce needed amount of API calls -- so we could interrogate clone first (e.g. for push-vs-pr commit , or commit date, or git describe etc) and then resort to API or other calls where possible, and then just say 'None' if no way to figure out.

for this specific issue I think we should use push commit as the commit (and None if not known) and then add prcommit with the same logic of possibly None if unknown or not relevant. Then I would try {commit_abbrev}-{prcommit_abbrev} in my config. I expect to be able to group all CIs by {commit_abbrev} while still seeing difference whenever it would be for different merges (prcommits).

@jwodder
Copy link
Member

jwodder commented May 27, 2021

@yarikoptic I don't think a local clone would work. Aren't the merge commits something created on the fly by Travis rather than something that actually exists in the repo? How did 5a27faf end up in the datalad repo?

Also, getting the PR head from the GitHub API wouldn't work, as it could have changed since the build was run.

@jwodder
Copy link
Member

jwodder commented May 27, 2021

@yarikoptic I think merge_commit would be a clearer name than prcommit (assuming that you want it to be the commit that the build is run on for merged builds).

@jwodder
Copy link
Member

jwodder commented May 27, 2021

@yarikoptic Do we want to add a field with the same meaning as {commit} currently has? That seems like it might be useful.

@yarikoptic
Copy link
Member Author

@yarikoptic I don't think a local clone would work. Aren't the merge commits something created on the fly by Travis rather than something that actually exists in the repo? How did 5a27faf end up in the datalad repo?

we fetch all those merges too, see #42 (comment) referenced above.

Also, getting the PR head from the GitHub API wouldn't work, as it could have changed since the build was run.

yes. For those - None since indeed nothing we can do about (wilder alternative would be to figure out original PR clone url and add as a remote, but I am ok to not pursue it that far). I expect it would be ok in > 90% of cases . Also... got me thinking -- there might be a way to request remote to fetch a specific object/commit -- please check if possible. Then it could be one extra way to get commit until it eventually GCed

@yarikoptic I think merge_commit would be a clearer name than prcommit (assuming that you want it to be the commit that the build is run on for merged builds).

by itself it is a bit misleading. pull_merge_commit would be descriptive but way too long... but ok -- let's go with merge_commit. I might now like it over prcommit ;)

@yarikoptic Do we want to add a field with the same meaning as {commit} currently has? That seems like it might be useful.

sorry, not following -- I think we are already talking about adding a new field {merge_commit} with the meaning similar (not exactly the same).

Re _abbrev -- let me file a dedicated issue. I think we could/should refactor that to avoid exponential composition here.

@jwodder
Copy link
Member

jwodder commented May 27, 2021

@yarikoptic

sorry, not following -- I think we are already talking about adding a new field {merge_commit} with the meaning similar (not exactly the same).

You said that "we should use push commit as the commit (and None if not known)", which I took to mean that, for Travis PR builds, {commit} would be None instead of the merge commit that the build was performed on. This is a change to the meaning of {commit}, and I can envision some users desiring a placeholder with the current semantics of "the actual commit that the CI ran on, regardless of how it was created."

@yarikoptic
Copy link
Member Author

fair enough... ok -- what about adding refcommit (not merge_commit) as I originally mentioned as the reference commit in the PR branch for which the run was intended for. I.e. for push it would be the same {commit}. For a merged PR run - it would be different (ideally not None as long as we can figure it out from CI information, a local git clone ({commit}^2 IIRC), or from github).

@jwodder
Copy link
Member

jwodder commented May 27, 2021

@yarikoptic So:

  • {commit} stays the same
  • For non-PR builds, {refcommit} is the same as {commit}
  • For PR builds, {refcommit} is the head of the PR branch, or None* if it can't be determined

Is that what you want?

* I'd actually prefer the unknown value to be "UNK", for consistency with the placeholder for PRs whose numbers we can't find.

@yarikoptic
Copy link
Member Author

well - you started to implement support as {merge_commit} / {commit} pair in #60 -- it is good for me too. I do not care that much about naming - care more about those two being available . You choose to your likeing either {merge_commit} / {commit} or {commit} / {refcommit}.

@yarikoptic
Copy link
Member Author

yarikoptic commented May 27, 2021

UNK is ok for me.

What I wonder now how to address the issue that I might want just {commit} for non-PR, and {commit}-{merge_commit} for PR ... uff (since how would we tell two runs apart for the same {commit} but two different merges?)

@jwodder
Copy link
Member

jwodder commented May 27, 2021

I do not care that much about naming - care more about those two being available

Which two being available? I count three different commit hashes we could be interested in:

@yarikoptic
Copy link
Member Author

clarification question:

I think that CI does not create merge commit and takes it from the PR's merge ref from github. Am I incorrect?

If I am correct, we have only 2 commits: head or merge ref, and then CI runs on one of them for the PR (thus some times we do see two separate runs). So there are no 3 commits, just 2 (PR head and merge).

@yarikoptic
Copy link
Member Author

moreover if PR "merge" could be fast forwarded, there might even be no separate merge commit even for the PR (didn't check)

@jwodder
Copy link
Member

jwodder commented May 27, 2021

@yarikoptic

I think that CI does not create merge commit and takes it from the PR's merge ref from github. Am I incorrect?

I don't know.

So there are no 3 commits, just 2 (PR head and merge).

Well, there are three definitions of "the commit associated with this build."

@yarikoptic
Copy link
Member Author

I think that CI does not create merge commit and takes it from the PR's merge ref from github. Am I incorrect?

I don't know.

please check. E.g. by using that

[alias]
    add-fetch-pulls = !git config --add remote.origin.fetch '+refs/pull/*:refs/pull/origin/*' && git fetch origin

I kept mentioning, and comparing those heads to the ones you see in CIs for a PR.

So there are no 3 commits, just 2 (PR head and merge).

Well, there are three definitions of "the commit associated with this build."

yeah - we (well, probably I) confused ourselves by choices.

But I think we are arriving to finale with the next option which I think would be the best ever:

  • {build_commit} -- is what currently {commit} -- the commit on which build happens (so could be {merge_commit})
  • {commit} -- what we had in mind as {refcommit} -- the commit which "triggered" the build (and what people would typically care about). So for non-PR both would be identical.

Then regardless of either a CI merges, or does some other dance to prepare build state commit -- {build_commit} would stay semantically "legit".

@jwodder
Copy link
Member

jwodder commented May 28, 2021

@yarikoptic It appears that the merge commits are created by GitHub. However, once a PR is updated, the old merge commit ceases to exist.

@yarikoptic
Copy link
Member Author

even though they might disappear (#55 (comment)), if either cron job is ran fast enough, or we figure out how to establish tinuous action within CI itself (and just wait for other CIs/workflows to finish up) -- we can have it a quite stable solution, and resort to UNK only if "all is lost"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants