

Minimum context / size of changes #1144

Closed
MarkRx opened this issue Aug 14, 2024 · 2 comments

Comments

MarkRx (Contributor) commented Aug 14, 2024

We've found that the LLM is more likely to hallucinate or produce "noise" suggestions on small changes (such as one-line changes). We suspect this is because the bot has less context to work with, so it is more likely to "grasp at straws".

I suggest configuration option(s) to set the "minimum diff/context/PR" size.

It could be:

  • A global minimum on the number of tokens in the entire PR diff (not of the prompt). This would be more precise because it would capture the "size of a line", or
  • A global minimum on the number of lines that need to change in the PR. A line count may be easier to configure and understand than a token count, but it won't account for how big a line is

When the bot skips processing it should provide some kind of feedback to that effect so the end user knows it ran (and didn't crash).

I could also see:

  • A minimum number of tokens before the PR agent will comment on a specific hunk and offer suggestions, or
  • A minimum number of lines before the PR agent will comment on a specific hunk and offer suggestions

The defaults would be set to 0 for backwards compatibility.
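The proposed thresholds could be sketched roughly as follows. This is a minimal illustration, not pr-agent code: the option names `minimum_diff_tokens` and `minimum_diff_lines` and the `should_review` helper are hypothetical, and the "token" count is a crude whitespace split rather than a real tokenizer.

```python
def should_review(diff_text: str,
                  minimum_diff_tokens: int = 0,  # default 0 = backwards compatible
                  minimum_diff_lines: int = 0) -> bool:
    """Return True if the PR diff is large enough to be worth reviewing."""
    # Count only added/removed lines, skipping the +++/--- file headers.
    changed_lines = [
        line for line in diff_text.splitlines()
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]
    # Crude token estimate: whitespace-separated words in changed lines.
    token_count = sum(len(line.split()) for line in changed_lines)

    if len(changed_lines) < minimum_diff_lines:
        return False  # caller should post a "skipped: PR too small" note
    if token_count < minimum_diff_tokens:
        return False
    return True
```

With both defaults at 0 every PR passes, which preserves current behavior; the same check could be applied per hunk for the second pair of options.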

mrT23 (Collaborator) commented Aug 22, 2024

The correct way to address this is via prompt adjustments (maybe a dedicated prompt for PRs with small changes), not via a threshold that ignores PRs.

The PR that broke the world was small.
PR Agent should review all code changes, no matter if they are small or big.

I think the main cause of this problem (which I am not debating; I am also aware of it) is that when the model doesn't have suggestions to give, it falls back to the silliest option, which is to "hallucinate" the PR content.

Also interesting is that it happens with both GPT-4 and Claude (I guess you saw it with GPT-4; I now have an example where I see it with Claude).

@mrT23 mrT23 added the WIP label Aug 22, 2024
mrT23 (Collaborator) commented Aug 23, 2024

This change should prevent/improve those problems

#1170
