Merge remote-tracking branch 'origin/main'

Codium-ai · Jul 10, 2024 · ea9decc
2 parents daa68f3 + e824308
commit ea9decc

Showing 7 changed files with 85 additions and 69 deletions.
diff --git a/docs/docs/tools/improve.md b/docs/docs/tools/improve.md
@@ -47,17 +47,7 @@ num_code_suggestions_per_chunk = ...
 - The `pr_commands` lists commands that will be executed automatically when a PR is opened.
 - The `[pr_code_suggestions]` section contains the configurations for the `improve` tool you want to edit (if any)
 
-### Extended mode
-
-An extended mode, which does not involve PR Compression and provides more comprehensive suggestions, can be invoked by commenting on any PR by setting:
-```
-[pr_code_suggestions]
-auto_extended_mode=true
-```
-(This mode is true by default).
-
-Note that the extended mode divides the PR code changes into chunks, up to the token limits, where each chunk is handled separately (might use multiple calls to GPT-4 for large PRs).
-Hence, the total number of suggestions is proportional to the number of chunks, i.e., the size of the PR.
+## Usage Tips
 
 ### Self-review
 If you set in a configuration file:
@@ -82,11 +72,53 @@ approve_pr_on_self_review = true
 ```
 the tool can automatically approve the PR when the user checks the self-review checkbox.
 
-!!! tip "Demanding self-review from the PR author"
-  If you set the number of required reviewers for a PR to 2, this effectively means that the PR author must click the self-review checkbox before the PR can be merged (in addition to a human reviewer).
-
-  ![self_review_2](https://codium.ai/images/pr_agent/self_review_2.png){width=512}
+!!! tip "Tip - demanding self-review from the PR author"
+    If you set the number of required reviewers for a PR to 2, this effectively means that the PR author must click the self-review checkbox before the PR can be merged (in addition to a human reviewer).
+
+    ![self_review_2](https://codium.ai/images/pr_agent/self_review_2.png){width=512}
+
+### `Extra instructions` and `best practices`
+
+#### Extra instructions
+You can use the `extra_instructions` configuration option to give the AI model additional instructions for the `improve` tool.
+Be specific, clear, and concise in the instructions. With extra instructions, you are the prompter. Specify relevant aspects that you want the model to focus on.
+
+Examples for possible instructions:
+```
+[pr_code_suggestions]
+extra_instructions="""\
+(1) Answer in Japanese
+(2) Don't suggest adding try-except blocks
+(3) Ignore changes in toml files
+...
+"""
+```
+Use triple quotes to write multi-line instructions. Use bullet points or numbers to make the instructions more readable.
+
+#### Best practices 💎
+Another way to give the AI model additional guidance is to create a dedicated [**wiki page**](https://github.com/Codium-ai/pr-agent/wiki) called `best_practices.md`.
+This page can contain a list of best practices, coding standards, and guidelines that are specific to your repo/organization (up to 800 lines are allowed).
 
+The AI model will use this page as a reference, and in case the PR code violates any of the guidelines, it will suggest improvements accordingly.
+Examples for possible best practices:
+```
+## Here are the organization's best practices for writing code:
+- avoid nested loops
+- avoid typos
+- use meaningful variable names
+- follow the DRY principle
+- keep functions short and simple, typically within 10-30 lines of code.
+...
+```
+When a PR code violates any of the guidelines, the AI model will suggest improvements accordingly, with a dedicated label: `Organization best practice`.
+
+Example results:
+
+![best_practice](https://codium.ai/images/pr_agent/org_best_practice.png){width=512}
+
+Note that while the `extra instructions` are more related to the way the `improve` tool behaves, the `best_practices.md` file is a general guideline for the way code should be written in the repo.
+Using a combination of both can help the AI model to provide relevant and tailored suggestions.
 
 ## Configuration options
 
@@ -156,34 +188,6 @@ the tool can automatically approve the PR when the user checks the self-review c
   </tr>
 </table>
 
-## Usage Tips
-
-!!! tip "Extra instructions"
-
-    Extra instructions are very important for the `improve` tool, since they enable you to guide the model to suggestions that are more relevant to the specific needs of the project.
-
-    Be specific, clear, and concise in the instructions. With extra instructions, you are the prompter. Specify relevant aspects that you want the model to focus on.
-
-    Examples for extra instructions:
-    ```
-    [pr_code_suggestions] # /improve #
-    extra_instructions="""\
-    Emphasize the following aspects:
-    - Does the code logic cover relevant edge cases?
-    - Is the code logic clear and easy to understand?
-    - Is the code logic efficient?
-    ...
-    """
-    ```
-    Use triple quotes to write multi-line instructions. Use bullet points to make the instructions more readable.
-
-!!! tip "Review vs. Improve tools comparison"
-
-    - The [review](https://pr-agent-docs.codium.ai/tools/review/) tool includes a section called 'Possible issues', that also provide feedback on the PR Code.
-    In this section, the model is instructed to focus **only** on [major bugs and issues](https://github.com/Codium-ai/pr-agent/blob/main/pr_agent/settings/pr_reviewer_prompts.toml#L71).
-    - The `improve` tool, on the other hand, has a broader mandate, and in addition to bugs and issues, it can also give suggestions for improving code quality and making the code more efficient, readable, and maintainable (see [here](https://github.com/Codium-ai/pr-agent/blob/main/pr_agent/settings/pr_code_suggestions_prompts.toml#L34)).
-    - Hence, if you are interested only in feedback about clear bugs, the `review` tool might suffice. If you want a more detailed feedback, including broader suggestions for improving the PR code, also enable the `improve` tool to run on each PR.
-
 ## A note on code suggestions quality
 
 - While the current AI for code is getting better and better (GPT-4), it's not flawless. Not all the suggestions will be perfect, and a user should not accept all of them automatically. Critical reading and judgment are required.

diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml
@@ -131,7 +131,7 @@ markdown_extensions:
       emoji_generator: !!python/name:material.extensions.emoji.to_svg
   - toc:
       title: On this page
-      toc_depth: 2
+      toc_depth: 3
       permalink: true
 
 

diff --git a/pr_agent/algo/__init__.py b/pr_agent/algo/__init__.py
@@ -28,6 +28,8 @@
     'vertex_ai/claude-3-opus@20240229': 100000,
     'vertex_ai/claude-3-5-sonnet@20240620': 100000,
     'vertex_ai/gemini-1.5-pro': 1048576,
+    'vertex_ai/gemini-1.5-flash': 1048576,
+    'vertex_ai/gemma2': 8200,
     'codechat-bison': 6144,
     'codechat-bison-32k': 32000,
     'anthropic.claude-instant-v1': 100000,
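
The hunk above adds two entries to a table that maps model names to context sizes in tokens. As a rough illustration of how such a table might be consulted, here is a minimal sketch. The names `MAX_TOKENS`, `max_tokens_for` and `fits_in_context` are hypothetical and are not taken from this commit; only entries visible in the hunk are reproduced.

```python
from typing import Optional

# Hypothetical stand-in for the table in the hunk above; only entries
# that are visible in the diff are copied here.
MAX_TOKENS = {
    "vertex_ai/gemini-1.5-pro": 1048576,
    "vertex_ai/gemini-1.5-flash": 1048576,
    "vertex_ai/gemma2": 8200,
    "codechat-bison": 6144,
}


def max_tokens_for(model: str, fallback: Optional[int] = None) -> int:
    """Return the registered token limit for a model (hypothetical helper)."""
    if model in MAX_TOKENS:
        return MAX_TOKENS[model]
    if fallback is not None:
        return fallback
    raise KeyError(f"no token limit registered for model {model!r}")


def fits_in_context(model: str, prompt_tokens: int, reserved_output_tokens: int = 0) -> bool:
    """Check whether a prompt plus reserved output space fits the model's limit."""
    return prompt_tokens + reserved_output_tokens <= max_tokens_for(model)
```

Under these assumptions, a model missing from the table raises an error unless the caller supplies a fallback, which makes an unregistered model visible instead of silently using a wrong limit.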

diff --git a/pr_agent/settings/configuration.toml b/pr_agent/settings/configuration.toml
@@ -276,3 +276,7 @@ number_of_results = 5
 
 [lancedb]
 uri = "./lancedb"
+
+[best_practices]
+content = ""
+max_lines_allowed = 800
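
The new `[best_practices]` section sets `max_lines_allowed = 800`, which matches the "up to 800 lines" limit in the docs. The diff does not show how the limit is enforced. The sketch below assumes simple truncation, and the helper name `clip_best_practices` is hypothetical.

```python
def clip_best_practices(content: str, max_lines_allowed: int = 800) -> str:
    """Keep at most `max_lines_allowed` lines of the best-practices text.

    Assumption: content over the limit is truncated rather than rejected;
    the commit itself does not show which behavior is used.
    """
    lines = content.splitlines()
    if len(lines) <= max_lines_allowed:
        return content
    return "\n".join(lines[:max_lines_allowed])
```

The default of 800 mirrors the value in the configuration hunk; a caller could pass a different limit read from its own settings.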
diff --git a/pr_agent/settings/pr_code_suggestions_prompts.toml b/pr_agent/settings/pr_code_suggestions_prompts.toml
@@ -34,7 +34,7 @@ Suggestions should always focus on ways to improve the new code lines introduced
 
 
 Specific instructions for generating code suggestions:
-- Provide up to {{ num_code_suggestions }} code suggestions. The suggestions should be diverse and insightful.
+- Provide in total up to {{ num_code_suggestions }} code suggestions. The suggestions should be diverse and insightful.
 - The suggestions should focus on improving the new code introduced the PR, meaning lines from '__new hunk__' sections, starting with '+' (after the line numbers).
 - Prioritize suggestions that address possible issues, major problems, and bugs in the PR code.
 - Don't suggest to add docstring, type hints, or comments, or to remove unused imports.
@@ -149,7 +149,7 @@ Suggestions should always focus on ways to improve the new code lines introduced
 
 
 Specific instructions for generating code suggestions:
-- Provide up to {{ num_code_suggestions }} code suggestions. The suggestions should be diverse and insightful.
+- Provide in total up to {{ num_code_suggestions }} code suggestions. The suggestions should be diverse and insightful.
 - The suggestions should focus on improving the new code introduced the PR, meaning lines from '__new hunk__' sections, starting with '+' (after the line numbers).
 - Prioritize suggestions that address possible issues, major problems, and bugs in the PR code.
 - Don't suggest to add docstring, type hints, or comments, or to remove unused imports.
@@ -179,7 +179,8 @@ class CodeSuggestion(BaseModel):
     one_sentence_summary: str = Field(description="a short summary of the suggestion action, in a single sentence. Focus on the 'what'. Be general, and avoid method or variable names.")
     relevant_lines_start: int = Field(description="The relevant line number, from a '__new hunk__' section, where the suggestion starts (inclusive). Should be derived from the hunk line numbers, and correspond to the 'existing code' snippet above")
     relevant_lines_end: int = Field(description="The relevant line number, from a '__new hunk__' section, where the suggestion ends (inclusive). Should be derived from the hunk line numbers, and correspond to the 'existing code' snippet above")
-    label: str = Field(description="a single label for the suggestion, to help understand the suggestion type. For example: 'security', 'possible bug', 'possible issue', 'performance', 'enhancement', 'best practice', 'maintainability', etc. Other labels are also allowed")
+    label: str = Field(description="a single label for the suggestion, to help the user understand the suggestion type. For example: 'security', 'possible bug', 'possible issue', 'performance', 'enhancement', 'best practice', 'maintainability', etc. Other labels are also allowed")
+
 
 class PRCodeSuggestions(BaseModel):
     code_suggestions: List[CodeSuggestion]
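
The field descriptions above say that `relevant_lines_start` and `relevant_lines_end` are both inclusive. A small sketch of what that implies for a consumer of the model's output follows. It mirrors only the four fields visible in this hunk, uses a standard-library dataclass instead of Pydantic, and the class name `SuggestionSpan` and its method are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SuggestionSpan:
    """Hypothetical mirror of the four CodeSuggestion fields shown in the hunk."""

    one_sentence_summary: str
    relevant_lines_start: int
    relevant_lines_end: int
    label: str

    def line_numbers(self) -> range:
        """Return the covered '__new hunk__' line numbers, both ends inclusive."""
        if self.relevant_lines_start > self.relevant_lines_end:
            raise ValueError("relevant_lines_start must not exceed relevant_lines_end")
        # Add 1 to the stop value because range() excludes it.
        return range(self.relevant_lines_start, self.relevant_lines_end + 1)
```

For example, a suggestion with start 12 and end 14 covers three lines (12, 13, 14), and a single-line suggestion has equal start and end values.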

diff --git a/pr_agent/settings/pr_code_suggestions_reflect_prompts.toml b/pr_agent/settings/pr_code_suggestions_reflect_prompts.toml
@@ -6,10 +6,10 @@ Your goal is to inspect, review and score the suggestsions.
 Be aware - the suggestions may not always be correct or accurate, and you should evaluate them in relation to the actual PR code diff presented. Sometimes the suggestion may ignore parts of the actual code diff, and in that case, you should give it a score of 0.
 
 Specific instructions:
-- Carefully review both the suggestion content, and the related PR code diff. Mistakes in the suggestions can occur. Make sure the suggestions are correct, and properly derived from the PR code diff.
+- Carefully review both the suggestion content, and the related PR code diff. Mistakes in the suggestions can occur. Make sure the suggestions are logical and correct, and properly derived from the PR code diff.
 - In addition to the exact code lines mentioned in each suggestion, review the code around them, to ensure that the suggestions are contextually accurate.
-- Also check that the 'existing_code' and 'improved_code' fields correctly reflect the suggested changes.
-- Make sure the suggestions focus on new code introduced in the PR, and not on existing code that was not changed.
+- Check that the 'existing_code' field is valid. The 'existing_code' content should match, or be derived from, code lines from a 'new hunk' section in the PR code diff.
+- Check that the 'improved_code' section correctly reflects the suggestion content.
 - High scores (8 to 10) should be given to correct suggestions that address major bugs and issues, or security concerns. Lower scores (3 to 7) should be for correct suggestions addressing minor issues, code style, code readability, maintainability, etc. Don't give high scores to suggestions that are not crucial, and bring only small improvement or optimization.
 - Order the feedback the same way the suggestions are ordered in the input.
 
@@ -39,16 +39,17 @@ __old hunk__
 ...
 ======
 - In this format, we separated each hunk of code to '__new hunk__' and '__old hunk__' sections. The '__new hunk__' section contains the new code of the chunk, and the '__old hunk__' section contains the old code that was removed.
+- We added line numbers for the '__new hunk__' sections, to help you refer to the code lines in your suggestions. These line numbers are not part of the actual code, and are only used for reference.
 - Code lines are prefixed symbols ('+', '-', ' '). The '+' symbol indicates new code added in the PR, the '-' symbol indicates code removed in the PR, and the ' ' symbol indicates unchanged code.
-- We also added line numbers for the '__new hunk__' sections, to help you refer to the code lines in your suggestions. These line numbers are not part of the actual code, and are only used for reference.
+
 
 
 The output must be a YAML object equivalent to type $PRCodeSuggestionsFeedback, according to the following Pydantic definitions:
 =====
 class CodeSuggestionFeedback(BaseModel):
     suggestion_summary: str = Field(description="repeated from the input")
     relevant_file: str = Field(description="repeated from the input")
-    suggestion_score: int = Field(description="The actual output - the score of the suggestion, from 0 to 10. Give 0 if the suggestion is plain wrong. Otherwise, give a score from 1 to 10 (inclusive), where 1 is the lowest and 10 is the highest.")
+    suggestion_score: int = Field(description="The actual output - the score of the suggestion, from 0 to 10. Give 0 if the suggestion is wrong. Otherwise, give a score from 1 to 10 (inclusive), where 1 is the lowest and 10 is the highest.")
     why: str = Field(description="Short and concise explanation of why the suggestion received the score (one to two sentences).")
 
 class PRCodeSuggestionsFeedback(BaseModel):