We've found that the LLM is more likely to hallucinate or provide "noise" suggestions on small changes (such as one-line changes). We suspect this is because the bot has less context to work with, so it is more likely to "grasp at straws".
I suggest configuration option(s) to set the "minimum diff/context/PR" size.
It could be:
A global minimum on the number of tokens in the entire PR diff (not in the prompt). This would be more precise because it would capture the "size of a line", or
A global minimum on the number of lines that must change in the PR. A line count may be easier to configure and understand than a token count, but it won't take into account how big each line is.
When the bot skips processing it should provide some kind of feedback to that effect so the end user knows it ran (and didn't crash).
I could also see:
A minimum number of tokens before the PR agent will comment on a specific hunk and offer suggestions, or
A minimum number of lines before the PR agent will comment on a specific hunk and offer suggestions
The defaults would be set to 0 for backwards compatibility.
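To make the proposal concrete, here is a minimal sketch of what the line-based threshold could look like. The option name (minimum_changed_lines) and helper functions are hypothetical, not part of any existing pr-agent configuration; a default of 0 preserves current behavior, and the returned message covers the "tell the user it ran" requirement.

```python
# Hypothetical sketch of the proposed minimum-size check.
# "minimum_changed_lines" is an illustrative config option, not a real one.

def changed_line_count(diff: str) -> int:
    """Count added/removed lines in a unified diff, ignoring file headers."""
    return sum(
        1
        for line in diff.splitlines()
        if (line.startswith("+") or line.startswith("-"))
        and not line.startswith(("+++", "---"))
    )

def should_review(diff: str, minimum_changed_lines: int = 0) -> tuple[bool, str]:
    """Return (run_review, feedback). Default of 0 keeps backwards compatibility."""
    changed = changed_line_count(diff)
    if changed < minimum_changed_lines:
        # Skipping silently would look like a crash, so return user-facing feedback.
        return False, (
            f"Skipped review: PR changes {changed} line(s), "
            f"below the configured minimum of {minimum_changed_lines}."
        )
    return True, ""
```

A token-based variant would be the same shape, just counting tokens of the diff text instead of changed lines.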
The correct way to address this is via prompt adjustments (maybe dedicated ones for PRs with small changes), not by a threshold that ignores PRs.
The PR that broke the world was small.
PR Agent should review all code changes, no matter whether they are small or big.
I think the main cause of this problem (which I am not debating; I am also aware of it) is that when the model doesn't have suggestions to give, it falls back to the silliest option, which is to "hallucinate" about the PR content.
Also interesting is that it happens with both GPT-4 and Claude (I guess you saw it with GPT-4; I now have an example where I see it with Claude).