
Reduce recommendation of 400 lines for a single review meeting (50 to upper limit of 200) #18

Open
NickleDave opened this issue Jun 25, 2021 · 5 comments

@NickleDave
Collaborator

Met with Pat Schloss yesterday re: the code clubs paper.
He took a look at the site in progress, and one thing he commented on was that 400 lines is a lot of code!
In their paper they recommend ~50 lines.

I see a similar suggestion in this Fernando Perez post on code review (from #17): roughly 50 lines, 200 max.
http://fperez.org/py4science/code_reviews.html

Along with this, there could be a recommendation that if you can't isolate around 50 lines of code, it might be an indicator that you need to do some refactoring?

@tlestang
Contributor

tlestang commented Jul 19, 2021

Interesting! I think the number 400 emerged from a few studies (e.g. the Microsoft one). But I would guess that these studies are aimed at developers who are going to review on their own for about an hour straight (e.g. reviewing a PR) without much interaction with the author. In this context, 400 loc seems reasonable.

But in the case where that hour is about conversation over a piece of code, 50 loc makes sense.

@bielsnohr bielsnohr changed the title change recommendation from 400 lines to 50 (200 max)? Reduce recommendation of 400 lines for a single review meeting Mar 10, 2022
@bielsnohr bielsnohr changed the title Reduce recommendation of 400 lines for a single review meeting Reduce recommendation of 400 lines for a single review meeting (50 to upper limit of 200) Mar 10, 2022
@bielsnohr bielsnohr self-assigned this Mar 10, 2022
@bielsnohr
Collaborator

After doing a bit more literature review on this topic, I can confirm that the oft-quoted "400 loc per review" does indeed come from industry settings. Some sources include:

  • SmartBear Best Practices for Code Review: a blog post that is informed by an analysis of code review in industry that they do annually. 400 loc is the recommended maximum beyond which the detection of defects starts to decline.
  • Google Engineering Practices Documentation: small CLs (changelists) are recommended, and on the question of what "small" is, they land at around 200 loc, but then also make the important point that size can also be impacted by how many files the changes are spread across!
  • Investigating the effectiveness of peer code review in distributed software development based on objective and subjective data: a slightly more quantitative study that does some scraping of repositories and their metadata. The projects are from Gerrit, but it's not clear what domain they come from (i.e. private industry or open source; unlikely to be scientific?). There is a clear conclusion that larger code reviews produce worse results in terms of defect detection and reviewer engagement. The "cut-off" seems to be at about 600 loc for their study, but again this is likely in a setting where the reviewers are familiar with the code base they are reviewing.
  • Contemporary Peer Review in Action: Lessons from Open Source Development: this article focuses on Open Source projects and how the changesets for those are quite small, with a median range of 11 to 32 lines. However, once again these are quite different reviews from the ones that we are trying to recommend. These projects are building massive software libraries and have many experts involved who are familiar with at least some parts of the code base.
  • I don't seem to have the Microsoft paper that Thibault is referring to above. Could someone post that here?

So, overall I think @NickleDave's suggestion of 50 to 200 lines is supported by the literature, and ultimately it will be through our tests of the material that we find out whether that is a good range, since there aren't really any directly analogous studies out there about this particular type of review.

@NickleDave
Collaborator Author

NickleDave commented Mar 12, 2022

I don't seem to have the Microsoft paper that Thibault is referring to above. Could someone post that here?

It's the second one on the refs page:
https://researchcodereviewcommunity.github.io/dev-review/refs-related/

MacLeod, Laura, et al. “Code reviewing in the trenches: Challenges and best practices.” IEEE Software 35.4 (2017): 34-42. https://ieeexplore.ieee.org/abstract/document/7950877/

@NickleDave
Collaborator Author

Thank you @bielsnohr for finding all these references.

You are very right; these mainly make a strong case that most work on code review focuses on large tech teams.

@NickleDave
Collaborator Author

from The Turing Way site:
https://the-turing-way.netlify.app/reproducible-research/reviewing/reviewing-recommend.html#review-code-in-small-chunks

Don’t review more than 400 lines of code (LOC) at a time, less than 200 LOC is better. Don’t review more than 500 LOC/hour.
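The thresholds being discussed (around 50 loc as a comfortable target, 200 as an upper limit, 400 as the industry maximum) could be expressed as a tiny check, e.g. something a code club could run against a chunk before scheduling a review. A minimal sketch, assuming these illustrative threshold values (the function and constant names here are hypothetical, not from any of the cited sources):

```python
# Illustrative thresholds from the discussion above:
SUGGESTED = 50   # comfortable amount for one discussion-style review meeting
HARD_MAX = 200   # suggested upper limit for a single review

def review_budget(num_lines: int) -> str:
    """Classify a chunk of code by how well it fits a single review meeting."""
    if num_lines <= SUGGESTED:
        return "good"
    if num_lines <= HARD_MAX:
        return "acceptable"
    return "too large: consider splitting or refactoring"

print(review_budget(40))    # good
print(review_budget(150))   # acceptable
print(review_budget(400))   # too large: consider splitting or refactoring
```

The exact cut-offs are exactly what this issue is debating, so they would need to be adjusted once the material has been tested.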
