Replies: 10 comments 15 replies
-
I believe "squash and merge" policy dominate in this repository because almost all the time a single PR contains commits that fixes the PR itself and merging with "rebase and merge" policy of those PRs would create a mess in git history. Since the situation you are mentioning statistically unusual I would suggest not to put extra work for all authors for squashing patches before merge. Nevertheless, I agree that having several function patches from a single PR squashed is not the best practice either.
|
Beta Was this translation helpful? Give feedback.
-
If this is an option, and we can communicate its purpose amongs reviewers, I'd be very happy with that as a solution. I'm assuming the author would be responsible for rebasing on top of the current merge base to ensure a fast forward merge? |
Beta Was this translation helpful? Give feedback.
-
@smaslov-intel mentioned in the past that rebases removes the ability to track how review comments are addressed.
|
Beta Was this translation helpful? Give feedback.
-
As discussed in #13073 I do not agree with changing this behavior. I will try not to regurgitate the points already made, most of which I agree with.
Ironically, this is what we prefer contributors do. It means not only a cleaner separation in changes for the sake of review, but also for the sake of the history, as is what you are requesting here. The drawback is indeed that it comes at a slightly higher overhead for the contributor, having to maintain multiple patches, potentially some blocked by others, but it makes life easier for reviewers for the following reasons:
So this part comes down to whether we want to make life "easier" for contributors (@bader made a great point about it actually raising the bar for contributors, so is it really easier for them?) or reviewers. In this case I would side with the reviewers. For what it is worth, I do not think the process is perfect, but I also think this is the lesser evil.
Which part are you referring to? To my knowledge, the description is made the commit message, so the contributor still has control, and responsibility, to keep the message up to date. The miss-contribution point is fair, but maybe it is something that can be fine-tuned somewhere.
That's a valid concern, but are we certain this is actually how upstream will feel when we do the separation as mentioned above? I imagine there are other things we need to do for upstreaming as well, w.r.t. how commits are done, so I'd rather not speculatively make large changes to the process like this without knowing for sure this will be a blocker. @elizabethandrews & @premanandrao - Feel free to comment here. |
Beta Was this translation helpful? Give feedback.
-
I don't want to re-iterate the points already discussed but I broadly agree with @ldrumm on this issue. I want to say that I feel that discussion has turned into one about "Squash & Merge" or "Rebase & Merge". I personally don't see why we can't have both options available to gatekeepers. I agree that most PRs are one commit followed by several fixups, which should definitely be squashed. This, in a sense, feels to me like the "default" merge strategy. There's no argument for keeping these as separate commits - that's not benefiting anybody. But I don't see why we can't allow the option of merging certain PRs as a series of smaller patch sets. I saw @bader say that having each commit build and pass tests is important and so "Rebase" is off the cards. If that's so then isn't that the end of this discussion? I don't want to go off into the weeds about this particular aspect without a concrete answer, as it feels like a separate discussion. Whatever the decision, for me it's ultimately about trusting contributors and letting them use all of the tools at their disposal. Each contributor has sole control of only the commits they push to a PR, and taking that power, expressiveness, and control away from them is a shame. We're allowing GitHub with its (in my opinion) very crude review/merging capabilities to dictate our repo's git history. If you'll allow me to take a more holistic view for a second: encouraging and helping contributors to maintain good commit histories is a valuable transferable skill they can take to other projects (possibly using something other than GitHub), and help improve those by encouraging good practices there. Speaking of best practices, we even link to the influential article https://cbea.ms/git-commit/ in our contributing guide, but seem to ignore most of it when it comes to actually merging the PR. What does it matter how contributors' git commits are formulated and the importance of a good git history as long as the PR description is in order? That's because the Squash & Merge commit message is ultimately taken from the PR description, not from the commits making up the PR. Through an opaque process - I may add - as it also reformats it to 80 columns and maybe performs other transformations I'm not aware of - who knows what it does with HTML. So the gatekeepers are responsible for editing the final commit message (e.g., taking out any GitHub web-related aspects) which again is taking final authorship and control away from the contributor. If we're relying on gatekeepers to check that the PR contents match the PR description at the point of merging, and to edit the commit message as necessary, we have to trust that gatekeepers would be able to make a sensible decision to merge a PR as a series of individual commits where appropriate through the "Rebase" option. I reckon 99% of the time they'd make the correct choice in this regard. |
Beta Was this translation helpful? Give feedback.
-
As I said before, that's not a useful requirement since our upstream isn't buildable on every commit. |
Beta Was this translation helpful? Give feedback.
-
An anecdote: (Quite unaccountably, there's a similar accessible door two floors up that works flawlessly in assistance or traditional barge-through-the-door mode and the assistance buttons are convenient on both sides) My point is that github's deficiency in tracking review comments has altered our behaviour. If you understand how to present clean history (by fixing up your commits as you work or receive review), but suddenly the tooling stops you from doing it you do one of two things:
The unopenable door, and github are (2) here. We shouldn't be accepting the bad door, or github's bugs but here we are: maintainers tell people off for github's intransigence in fixing their tools, and building maintenance tell me to press the button. Contributors a forced through a less-expressive and powerful funnel of the github UI, and I lose out on an opportunity for daily activity.
People aren't CPU cores, and cognitive overhead of stitching together and ordering separate PRs is greater than looking at one diff that's already properly ordered and happens to be also reviewable as separate commits.
Good! As far as I'm concerned, that's a maintainer's job: we should review code and metadata both - asking contributors to correct either if they're wrong. Let's not let github take that opportunity for quality away from us. |
Beta Was this translation helpful? Give feedback.
-
I doubt github will listen, since github basically hasn't improved since MVP in 2013, but if another voice - perhaps one backed by a large and respected multinational corporation were to push them - that might be enough. What I'd prefer is reviewers actually understood git history, and that git != github. Once they realise that, they'll start treating the git history with the attention it deserves.
Definitely a case-by-case basis, I think. An example that comes to mind is an extension I'm working on where it makes sense to implement the common code in one commit, then the cuda, and hip, and opencl, and l0 changes in others, yet land the implementation of the feature all at once. Then it becomes trivial to reason about individual regressions, every commit has relevant documentation, and everyone gets the same feature at once. It's no harder to review - in fact for people who know what they're doing - it becomes easier to review since they can more easily split the review into architectural chunks by looking at individual commit changesets
Who's lives are you trying to make easier and at what point in time? My proposal to make this an optional feature doesn't change the status-quo; we can still do the (in my opinion sloppy and selling-out-the-future for the present) form of squash-everything, but making extra tools available to gatekeepers so that a contributor can present good work when needed given should not be banned because some people don't know or care about such features. |
Beta Was this translation helpful? Give feedback.
-
FYI an example of misattribution caused by this policy: #13358 |
Beta Was this translation helpful? Give feedback.
-
Another one: #14332 NVIDIA's copyrights would get misattributed to Codeplay on squash. Who in a decision making position is going to take ownership of enabling the workarounds we discussed, please? @steffenlarsen ? |
Beta Was this translation helpful? Give feedback.
-
Git history and merge requests
I've been a bit disappointed to discover that we automatically squash merge
requests containing multiple commits. I've found this troubling and as I've got
a series of fixes that contain cherry-picks, NFC refactoring of troublesome
code, and then the fix to a real bug. They're conceptually and legally
separate entities, but part of the same workflow. I've resorted to created pull
requests for each commit to ensure the integrity of the history, but this is a
hack.
I find the current status-quo problematic for the following reasons
need a cleanup before a feature or bugfix. In many cases the commits
conceptually standalone. The squash everything practise breaks this separation
and will confuse future history spelunking efforts. It's possible to create
multiple merge requests per commit, but this increases traffic, CI overhead, and
requires much more effort from the author to corral the patch series to landing.
misattributing authorship
autogenerated which quite often involves github specific language, user
mentions, merge-request machinery etc, and often clobbers individual commit
messages which contain subtle and important standalone information
people from being able to craft their own commit messages jVgq
eventually want to merge up
I'm not sure whether this has been reasoned about in public before, but I can't
see specific mention of the aggressive squashing practise in the wiki nor the
source.
Relying on github to automatically "clean up" working history loses too much
information, makes it impossible for those who want to do the right thing to
present a clean history, and allows those who don't care to ignore the needs of
those who have to read the code later. I won't use this forum to proselitize on
why a well-curated git history is important, as there are plenty of convincing
text on the subject. However, I propose to require authors to squash
review-fixups themselves and present a curated patch or patchset before final PR
merge. This requires a little extra effort from authors and reviewers upfront
(to check on them) but I think the long-term maintainability and health of the
project goes up when history is properly curated.
patches to this proposal welcome ;)
Beta Was this translation helpful? Give feedback.
All reactions