Exercise Chat: Implement native function calling agent #154

Merged
merged 32 commits from feature/exercise-chat/agent into main on Nov 26, 2024

Conversation

Contributor

@kaancayli kaancayli commented Sep 5, 2024

This PR implements the exercise chat as a native tool-calling agent. For now, only OpenAI tool calling is supported.

More details will be added.
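For context, the shape of native tool calling that this agent builds on can be sketched with LangChain as follows. This is a minimal illustration, not the actual Pyris wiring; the tool, the model name, and the query are placeholders, and a configured OpenAI API key is assumed.

from langchain_core.tools import tool
from langchain_openai import ChatOpenAI


@tool
def get_submission_details(exercise_id: int) -> str:
    """Hypothetical tool: return details about the student's latest submission."""
    return f"No submission found for exercise {exercise_id}."


# The model decides when to call the tool; an agent loop then executes the call
# and feeds the result back to the model as a tool message.
llm = ChatOpenAI(model="gpt-4o-mini").bind_tools([get_submission_details])
response = llm.invoke("What did I submit for exercise 42?")
print(response.tool_calls)  # e.g. [{"name": "get_submission_details", "args": {"exercise_id": 42}, ...}]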

Summary by CodeRabbit

  • New Features

    • Enhanced message conversion functionality to support new message types and roles, including tool messages.
    • Introduced the ExerciseChatAgentPipeline for improved interactions regarding exercise-related queries.
    • Added flexibility in the chat pipeline to handle different variants and improve response handling.
    • Implemented new methods for binding tools within language models, enhancing interaction capabilities.
    • Updated prompts for improved interaction with students during exercise submissions and feedback.
    • Introduced PyrisEventDTO for flexible event handling.
    • Added a new IRIS_CHAT_EXERCISE_AGENT_MESSAGE enum member to represent pipeline states.
  • Bug Fixes

    • Improved robustness in message handling and state management to prevent invalid transitions.
  • Documentation

    • Updated prompts and guidelines for the AI tutor to enhance educational interactions.
  • Chores

    • Added new dependencies to the project for enhanced language processing capabilities.
    • Updated existing dependencies to newer versions for improved performance and security.

Contributor

coderabbitai bot commented Sep 5, 2024

Walkthrough

The changes involve updates to multiple files, enhancing message handling and pipeline functionality. Key modifications include the addition of new message types and subclasses, adjustments to method signatures for tool binding, and the introduction of new parameters in pipeline execution methods. The .gitignore file is also updated to ignore .DS_Store files. Overall, the changes aim to improve the flexibility and clarity of the message processing and interaction pipelines.

Changes

  • .gitignore: Added entry for ignoring .DS_Store files.
  • app/common/pyris_message.py: Updated the IrisMessageRole enum to include TOOL, modified PyrisMessage contents to allow a union type, and added the PyrisAIMessage and PyrisToolMessage subclasses.
  • app/llm/external/model.py: Changed the chat method return type from PyrisMessage to ChatCompletionMessage and added a bind_tools method.
  • app/llm/external/ollama.py: Added a bind_tools method to OllamaModel.
  • app/llm/external/openai_chat.py: Enhanced message conversion functions and added bind_tools methods to the chat model classes.
  • app/llm/langchain/iris_langchain_chat_model.py: Added a bind_tools method to allow dynamic tool binding.
  • app/llm/request_handler/basic_request_handler.py: Introduced a bind_tools method for binding tools to the language model.
  • app/llm/request_handler/capability_request_handler.py: Added a bind_tools method for tool integration.
  • app/llm/request_handler/request_handler_interface.py: Introduced an abstract bind_tools method to the request handler interface.
  • app/pipeline/chat/course_chat_pipeline.py: Enhanced CourseChatPipeline to handle new event types and updated the competency calculation logic.
  • app/pipeline/chat/exercise_chat_agent_pipeline.py: Introduced the ExerciseChatAgentPipeline class for interactive chat sessions focused on student exercises.
  • app/pipeline/chat/exercise_chat_pipeline.py: Updated ExerciseChatPipeline to accept a variant parameter for flexible behavior.
  • app/pipeline/chat/interaction_suggestion_pipeline.py: Simplified prompt handling logic and improved error handling for chat history.
  • app/pipeline/prompts/iris_course_chat_prompts.py: Updated prompts for clarity on response formatting and originality.
  • app/web/routers/pipelines.py: Updated pipeline execution methods to accept new parameters.
  • app/web/status/status_update.py: Replaced the serialization method and improved state management in the StatusCallback class.
  • app/common/__init__.py: Removed the imports of convert_iris_message_to_langchain_message and convert_langchain_message_to_iris_message.
  • app/common/message_converters.py: Updated conversion functions and added new functions for handling messages and tool calls.
  • app/domain/event/pyris_event_dto.py: Introduced the PyrisEventDTO class for generic event handling.
  • app/pipeline/chat/lecture_chat_pipeline.py: Updated the import path for convert_iris_message_to_langchain_message.
  • app/retrieval/lecture_retrieval.py: Updated the import path for convert_iris_message_to_langchain_message and modified retrieval methods.
  • requirements.txt: Added several new dependencies.

Possibly related PRs

  • Pipeline: Ingestion pipeline #96: The changes to the .gitignore file in the main PR are related to the modifications in the ingestion pipeline, which also includes updates to the .gitignore file to manage Docker-related artifacts.
  • Tutor Chat Pipeline with Lecture content.  #104: The Tutor Chat Pipeline with Lecture content PR may relate to the main PR as it involves enhancements to the chat functionality, which could include managing ignored files for better organization.
  • Track token usage of iris requests #165: The introduction of token tracking in the Iris requests may connect to the main PR's changes in the .gitignore, as both involve improvements in managing resources and tracking usage effectively.

@kaancayli kaancayli self-assigned this Sep 5, 2024
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 11

Outside diff range, codebase verification and nitpick comments (3)
app/domain/data/programming_exercise_dto.py (1)

34-38: recent_changes field lacks integration.

The recent_changes field is defined in programming_exercise_dto.py but is not currently integrated with version control or change tracking systems. Consider implementing functionality to utilize this field effectively for tracking Git diffs and changes.

  • Location: app/domain/data/programming_exercise_dto.py
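A minimal sketch of how such a diff could be produced, assuming the caller has a checked-out repository; in Pyris the value presumably arrives already populated in the DTO, so this is illustrative only:

import subprocess


def get_recent_changes(repo_path: str, base_ref: str = "HEAD~1") -> str:
    """Return the Git diff between base_ref and HEAD for a checked-out repository (sketch)."""
    result = subprocess.run(
        ["git", "-C", repo_path, "diff", base_ref, "HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout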
Analysis chain

Addition of recent_changes enhances transparency.

The new recent_changes field, which captures the Git diff of recent changes, is a valuable addition for tracking updates. Ensure that this field is properly integrated with version control and change tracking systems.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify integration of `recent_changes` with version control systems.

# Test: Search for handling of `recent_changes`. Expect: Proper integration with version control systems.
rg --type python -A 5 $'recent_changes'

Length of output: 69


Script:

#!/bin/bash
# Description: Verify integration of `recent_changes` with version control systems.

# Test: Search for handling of `recent_changes`. Expect: Proper integration with version control systems.
rg --type py -A 5 $'recent_changes'

Length of output: 410


Script:

#!/bin/bash
# Description: Search for references to `recent_changes` across the codebase to verify its integration.

# Test: Find all occurrences of `recent_changes` to check for integration with version control systems.
rg 'recent_changes'

Length of output: 106

app/domain/data/exercise_with_submissions_dto.py (1)

37-37: Field addition approved; consider reviewing the alias usage.

The addition of the url field with a default value of None is well-implemented and ensures backward compatibility. However, the use of an alias that directly matches the field name might be unnecessary unless there are specific serialization requirements that necessitate this approach.

app/common/message_converters.py (1)

Line range hint 28-47: Refactor suggestion for handling PyrisAIMessage and AIMessage conversions.

The function convert_iris_message_to_langchain_message has been updated to handle PyrisAIMessage specifically, which includes processing tool calls. This is a significant enhancement for supporting interactive tool functionalities. However, the current implementation could be improved for clarity and maintainability:

  • Extract Tool Call Conversion: The conversion logic for tool calls (lines 39-46) could be extracted into a separate function (see the sketch after this list). This would make the convert_iris_message_to_langchain_message function cleaner and more focused on its primary responsibility.
  • Error Handling: Consider adding more specific error messages or custom exceptions for better debugging and user feedback.
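A sketch of the suggested extraction; the tool-call field names (id, function.name, function.arguments) follow the ToolCallDTO shape shown later in this review, and the output format is LangChain's tool-call dict:

from typing import List


def convert_tool_calls_to_langchain(tool_calls) -> List[dict]:
    """Map Pyris tool-call DTOs to LangChain-style tool-call dicts (sketch)."""
    return [
        {
            "id": tool_call.id,
            "name": tool_call.function.name,
            "args": tool_call.function.arguments,
            "type": "tool_call",
        }
        for tool_call in tool_calls
    ]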

Resolved review threads: app/domain/pyris_message.py, app/domain/data/programming_exercise_dto.py, app/llm/external/openai_chat.py
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

Outside diff range, codebase verification and nitpick comments (9)
app/domain/chat/exercise_chat/exercise_chat_pipeline_execution_dto.py (1)

16-16: Well-implemented attribute addition.

The event_payload attribute is correctly implemented with an alias for serialization purposes. Consider adding a comment explaining the purpose of this attribute for future maintainability.
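A sketch of the attribute with the suggested comment; the alias value and the payload type are assumptions, and only the new field is shown:

from typing import Any, Optional
from pydantic import BaseModel, Field


class ExerciseChatPipelineExecutionDTO(BaseModel):
    # Optional payload of the Pyris event that triggered this execution;
    # the alias keeps the serialized wire format camelCase.
    event_payload: Optional[Any] = Field(default=None, alias="eventPayload")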

app/domain/pyris_message.py (2)

31-34: Well-designed subclass for AI messages.

PyrisAIMessage is appropriately designed for AI-generated messages. Consider adding documentation for the tool_calls attribute to clarify its usage.


37-40: Clear and functional design for tool messages.

PyrisToolMessage is well-designed for messages from tools. Adding documentation for the contents attribute could enhance clarity and maintainability.
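A standalone approximation of the message hierarchy described in these comments; field shapes not named in the review (the tool-call structure, defaults, content type) are assumptions rather than the actual Pyris definitions:

from enum import Enum
from typing import List
from pydantic import BaseModel


class IrisMessageRole(str, Enum):
    USER = "user"
    ASSISTANT = "assistant"
    SYSTEM = "system"
    TOOL = "tool"  # role added in this PR


class FunctionCall(BaseModel):
    name: str
    arguments: dict


class ToolCall(BaseModel):
    id: str
    type: str = "function"
    function: FunctionCall


class PyrisMessage(BaseModel):
    sender: IrisMessageRole


class PyrisAIMessage(PyrisMessage):
    sender: IrisMessageRole = IrisMessageRole.ASSISTANT
    # Tool calls requested by the model; empty for plain text answers.
    tool_calls: List[ToolCall] = []


class PyrisToolMessage(PyrisMessage):
    sender: IrisMessageRole = IrisMessageRole.TOOL
    # Results returned by executed tools, each linked to the originating tool call.
    contents: List[dict] = []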

app/domain/data/programming_exercise_dto.py (1)

33-38: Useful additions to the DTO.

The new fields max_points and recent_changes are well-implemented and enhance the functionality of the ProgrammingExerciseDTO. Consider adding validation for the max_points field to ensure it remains within a sensible range.
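The suggested range check is a one-line Pydantic constraint; the sketch below shows only the two new fields, and the aliases and optionality are assumptions:

from typing import Optional
from pydantic import BaseModel, Field


class ProgrammingExerciseDTO(BaseModel):
    # Reject negative point values at parse time.
    max_points: Optional[float] = Field(default=None, ge=0, alias="maxPoints")
    # Git diff of recent changes, as discussed above.
    recent_changes: Optional[str] = Field(default=None, alias="recentChanges")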

app/llm/request_handler/capability_request_handler.py (1)

74-81: Well-implemented method for tool binding.

The bind_tools method is correctly implemented and integrates well with the existing class structure. The use of type annotations is commendable as it enhances code readability and maintainability.

Consider adding error handling or logging within the bind_tools method to manage potential issues during the tool binding process, such as when the selected model does not support tool binding or when an invalid tool type is passed.

app/llm/external/model.py (1)

53-63: New abstract method bind_tools added.

The addition of the bind_tools method is a positive enhancement, increasing the extensibility of the LanguageModel class. It allows for more complex interactions with various tools.

Consider providing more detailed documentation or examples on how to implement this method in subclasses, especially given the variety of tool types that can be bound.
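The requested documentation might look like the following; the signature matches the bind_tools signature quoted elsewhere in this review, while the docstring wording is a suggestion:

from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, Sequence, Type, Union

from langchain_core.tools import BaseTool
from pydantic import BaseModel


class LanguageModel(ABC):
    @abstractmethod
    def bind_tools(
        self,
        tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]],
    ) -> "LanguageModel":
        """Bind a set of tools to this model so the provider can emit tool calls.

        Tools may be raw JSON-schema dicts, Pydantic models, plain callables, or
        LangChain BaseTool instances. Subclasses that do not support tool calling
        (for example, embedding models) should raise NotImplementedError.
        """
        raise NotImplementedError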

app/web/routers/pipelines.py (1)

28-37: Enhanced flexibility in run_exercise_chat_pipeline_worker.

The addition of the variant parameter and the use of ExerciseChatAgentPipeline enhance the flexibility and adaptability of the pipeline. This change allows for variant-specific processing, which could lead to improved performance in handling different scenarios.

Consider adding more detailed logging or error handling specific to the variant parameter to ensure that any issues related to variant-specific configurations are captured and addressed promptly.

app/llm/external/ollama.py (2)

Line range hint 83-135: Approve existing methods and suggest documentation for bind_tools.

The existing methods in the OllamaModel class are correctly implemented. However, the bind_tools method is marked as not implemented. It's important to document its intended future functionality to guide development and integration.

Would you like help drafting the documentation for the bind_tools method?


Potential Issue with convert_to_iris_message Function

The convert_to_iris_message function in app/llm/external/ollama.py correctly maps the role and content fields from a Message to a PyrisMessage. However, it uses datetime.now() for the send_at field, which may not preserve the original timestamp from the Message. Additionally, other fields in the Message are not handled, which could affect data integrity if those fields are important.

  • Consider preserving the original timestamp if it is significant.
  • Review if additional fields from the Message should be included in the PyrisMessage.
Analysis chain

Line range hint 43-66: Verify correct mapping and data integrity in convert_to_iris_message.

Ensure that the convert_to_iris_message function correctly maps all fields and maintains data integrity during the conversion.

Run the following script to verify the correct mapping and data integrity:

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the correct mapping and data integrity in `convert_to_iris_message`.

# Test: Search for the function usage. Expect: Correct mapping and data integrity.
rg --type python -A 5 $'convert_to_iris_message'

Length of output: 78


Script:

#!/bin/bash
# Description: Verify the correct mapping and data integrity in `convert_to_iris_message`.

# Test: Search for the function definition and usage. Expect: Correct mapping and data integrity.
rg --type py -A 5 $'def convert_to_iris_message'

Length of output: 827


Script:

#!/bin/bash
# Description: Extract the complete implementation of `convert_to_iris_message` from `app/llm/external/ollama.py`.

# Extract the function implementation to verify correct mapping and data integrity.
ast-grep --lang python --pattern $'def convert_to_iris_message($_) {\n  $$$\n}' app/llm/external/ollama.py

Length of output: 569


Script:

#!/bin/bash
# Description: Extract the complete implementation of `convert_to_iris_message` from `app/llm/external/ollama.py`.

# Extract the function implementation to verify correct mapping and data integrity.
rg --type py -A 20 $'def convert_to_iris_message' app/llm/external/ollama.py

Length of output: 545

Resolved review threads: app/domain/event/pyris_event_dto.py, app/domain/data/tool_message_content_dto.py, app/domain/data/tool_call_dto.py (two threads), app/llm/external/openai_chat.py, app/pipeline/chat/exercise_chat_pipeline.py
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 12

Outside diff range, codebase verification and nitpick comments (8)
app/domain/data/exercise_with_submissions_dto.py (1)

37-37: Field definition for url is correct but consider simplifying the alias.

The implementation of the url field in ExerciseWithSubmissionsDTO is correct. However, the alias "url" is redundant since it matches the field name. Consider removing the alias for simplicity unless there is a specific reason for its inclusion.

app/llm/llm_manager.py (1)

42-42: Good shift towards validation in load_llms method.

The change to use model_validate instead of parse_obj in the load_llms method is a positive step towards ensuring data integrity. Consider adding error handling or logging to monitor the impact of this change on the system.

app/llm/request_handler/capability_request_handler.py (1)

74-81: Lack of Validation for tools Parameter in bind_tools Method

The bind_tools method in capability_request_handler.py does not perform any validation or security checks on the tools parameter before binding them to the language model. This could lead to potential security vulnerabilities. It is recommended to implement validation checks to ensure that only valid and safe tools are bound to the model.

  • File: app/llm/request_handler/capability_request_handler.py
  • Lines: 74-81
Analysis chain

Review the bind_tools method implementation.

The implementation of bind_tools appears correct and aligns with the PR objectives. However, consider adding security checks or validations for the tools parameter to ensure that only valid tools are bound to the model.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify security checks for the `tools` parameter.

# Test: Search for security checks related to `tools`. Expect: Security checks or validations.
rg --type python -A 5 $'tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]]'

Length of output: 132


Script:

#!/bin/bash
# Description: Search for occurrences of `tools` and any associated validation or security checks in the codebase.

# Search for occurrences of `tools` in the codebase to identify any validation or security checks.
rg 'tools' -A 5

Length of output: 27543

app/llm/external/model.py (1)

47-47: Potential issues with return type change in chat method.

The return type of the chat method in app/llm/external/model.py has been changed to ChatCompletionMessage, but other implementations and usages still expect PyrisMessage. This discrepancy could lead to compatibility issues. Ensure that all usages of the chat method are updated to handle the new return type correctly.

  • Files to check:
    • app/pipeline/lecture_ingestion_pipeline.py
    • app/pipeline/competency_extraction_pipeline.py
    • app/llm/external/openai_chat.py
    • app/llm/external/ollama.py
    • Various request handler files
Analysis chain

Verify the impact of the return type change in the chat method.

The change in return type from PyrisMessage to ChatCompletionMessage could affect downstream processing. Verify that all existing usages of the chat method have been updated to handle the new return type.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the impact of the return type change on existing code.

# Test: Search for usages of the `chat` method. Expect: Updated handling for `ChatCompletionMessage`.
rg --type python -A 5 $'chat\('

Length of output: 63


Script:

#!/bin/bash
# Description: Verify the impact of the return type change on existing code.

# Test: Search for usages of the `chat` method. Expect: Updated handling for `ChatCompletionMessage`.
rg --type py -A 5 $'chat\('

Length of output: 5840

app/llm/external/ollama.py (1)

Line range hint 83-135: Review: New method bind_tools and class structure.

The class OllamaModel is well-structured and inherits appropriately from related classes. The new method bind_tools is designed to raise a NotImplementedError, indicating it's a placeholder for future functionality. This approach is acceptable as it clearly communicates the current limitations and future intentions.

However, consider adding more detailed documentation to the bind_tools method to explain its future purpose and expected integration.

Would you like help drafting the additional documentation for this method?

app/pipeline/chat/interaction_suggestion_pipeline.py (1)

Line range hint 115-145: Approved: Streamlined prompt handling and improved error handling.

The modifications to the __call__ method in the InteractionSuggestionPipeline class simplify the handling of chat history and prompts. The removal of specific prompts related to chat history reduces redundancy and potential confusion. The addition of a ValueError when no last message is provided enhances the robustness of the method by ensuring necessary data is present.

Consider adding unit tests to cover these changes, especially the new error handling paths, to ensure they work as expected under various scenarios.

Would you like assistance in creating these unit tests?

app/common/message_converters.py (1)

Line range hint 22-47: Approved: Expanded message conversion functionality.

The function convert_iris_message_to_langchain_message has been effectively expanded to handle new message types such as ToolMessage and PyrisToolMessage. The robust error handling for empty messages and type mismatches ensures reliable operation.

Consider adding detailed documentation to describe the handling of these new message types, which will aid in maintenance and future enhancements.

Would you like help with drafting this documentation?

app/pipeline/chat/course_chat_pipeline.py (1)

Line range hint 585-591: Suggest refining exception handling.

The broad exception handling in the __call__ method could be refined to catch specific exceptions. This would allow for more targeted error responses and recovery actions, improving the robustness and maintainability of the code.

Consider handling specific exceptions like ValueError or TypeError separately to provide more informative error messages and recovery options.
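A sketch of the narrower handling; the invoke callable and the error callback stand in for the pipeline's actual attributes:

import logging

logger = logging.getLogger(__name__)


def run_pipeline_step_safely(invoke, report_error):
    """Run a pipeline step with exception handling split by error class (sketch)."""
    try:
        return invoke()
    except (ValueError, TypeError) as e:
        # Input or type problems: report a specific, actionable message.
        logger.error("Invalid input to course chat pipeline: %s", e)
        report_error("The request data was invalid.")
    except Exception:
        # Anything else is unexpected: log the full traceback and fail generically.
        logger.exception("Unexpected error in course chat pipeline")
        report_error("An unexpected error occurred.")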

Resolved review threads: app/domain/pyris_message.py, app/llm/request_handler/basic_request_handler.py, app/common/message_converters.py (two threads), app/pipeline/chat/course_chat_pipeline.py (three threads), app/pipeline/chat/exercise_chat_agent_pipeline.py (two threads)
@kaancayli kaancayli force-pushed the feature/exercise-chat/agent branch from 05fcc49 to ec52f25 on November 12, 2024 at 22:46
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 10

🧹 Outside diff range and nitpick comments (8)
app/llm/llm_manager.py (2)

43-43: Consider version compatibility and error handling improvements

  1. The change to model_validate indicates a Pydantic v2 upgrade. Ensure the project's dependencies are updated accordingly.
  2. Consider adding more detailed error handling for YAML parsing and model validation failures.
-        self.entries = LlmList.model_validate({"llms": loaded_llms}).llms
+        try:
+            self.entries = LlmList.model_validate({"llms": loaded_llms}).llms
+        except Exception as e:
+            raise ValueError(f"Failed to validate LLM configuration: {str(e)}") from e

Line range hint 1-58: Consider documenting tool calling architecture

While these changes support polymorphic LLM handling, which is necessary for tool calling, the actual tool calling implementation isn't visible here. Consider:

  1. Adding documentation about how different LLM types implement tool calling
  2. Including examples in the LLM configuration for tool-enabled models
app/llm/request_handler/basic_request_handler.py (1)

21-23: Simplify initialization pattern.

The current initialization pattern creates a brief window where llm_manager is None. Consider passing the initialized manager directly to super():

def __init__(self, model_id: str):
-    super().__init__(model_id=model_id, llm_manager=None)
-    self.model_id = model_id
-    self.llm_manager = LlmManager()
+    llm_manager = LlmManager()
+    super().__init__(model_id=model_id, llm_manager=llm_manager)
+    self.model_id = model_id
app/llm/langchain/iris_langchain_chat_model.py (1)

44-69: Consider future extensibility for non-OpenAI tool calling.

While the current implementation focuses on OpenAI tool calling, the flexible design with BaseTool and the generic tool sequence type hints will make it easier to extend to other providers. Consider documenting the extension points and provider-specific requirements in the class docstring for future implementations.

app/llm/request_handler/capability_request_handler.py (2)

33-34: Review the llm_manager initialization pattern

The llm_manager is marked as optional with None default, but it's always initialized in __init__. This creates a misleading type hint. Consider either:

  1. Removing the None default if it's always required
  2. Making it truly optional if there are cases where it's not needed
-    llm_manager: LlmManager | None = None
+    llm_manager: LlmManager

82-112: Enhance tool validation in bind_tools

While the basic validation is good, consider adding:

  1. Type validation for individual tools in the sequence
  2. Validation of tool configurations (for Dict type)
  3. More specific error messages indicating which tool failed validation
     def bind_tools(
         self,
         tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]],
     ) -> LanguageModel:
         if not tools:
             raise ValueError("Tools sequence cannot be empty")
 
+        for i, tool in enumerate(tools):
+            if not isinstance(tool, (dict, type, Callable, BaseTool)):
+                raise TypeError(
+                    f"Tool at index {i} has unsupported type: {type(tool).__name__}"
+                )
+            if isinstance(tool, dict) and not all(k in tool for k in ['name', 'description']):
+                raise ValueError(
+                    f"Tool configuration at index {i} missing required fields"
+                )
+
         llm = self._select_model(ChatModel)
         if not hasattr(llm, "bind_tools"):
             raise TypeError(
app/pipeline/chat/course_chat_pipeline.py (2)

246-249: Documentation improvement: Add examples for metrics and JOL values.

The documentation would benefit from concrete examples of the metrics and JOL values to help developers understand the expected ranges and formats.

             The response may include metrics for each competency, such as progress and mastery (0%-100%).
             These are system-generated.
-            The judgment of learning (JOL) values indicate the self-reported confidence by the student (0-5, 5 star).
-            The object describing it also indicates the system-computed confidence at the time when the student
+            The judgment of learning (JOL) values indicate the self-reported confidence by the student (0-5, 5 star).
+            Example: {"value": 4, "timestamp": "2024-11-01T10:00:00Z", "system_confidence": 0.85}
+            The object also indicates the system-computed confidence (0-1) at the time when the student

276-281: Enhance lecture content retrieval documentation.

The documentation could be more specific about the RAG (Retrieval-Augmented Generation) process and its parameters.

-            This will run a RAG retrieval based on the chat history on the indexed lecture slides and return the
-            most relevant paragraphs.
+            This will run a RAG (Retrieval-Augmented Generation) retrieval on indexed lecture slides using:
+            - Recent chat history (last 5 messages)
+            - Student's latest query
+            - Course context (name and ID)
+            Returns the top 5 most relevant paragraphs with lecture metadata.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 29d9252 and ec52f25.

📒 Files selected for processing (11)
  • app/llm/external/__init__.py (2 hunks)
  • app/llm/external/model.py (2 hunks)
  • app/llm/external/ollama.py (3 hunks)
  • app/llm/external/openai_chat.py (3 hunks)
  • app/llm/langchain/iris_langchain_chat_model.py (2 hunks)
  • app/llm/llm_manager.py (3 hunks)
  • app/llm/request_handler/basic_request_handler.py (2 hunks)
  • app/llm/request_handler/capability_request_handler.py (3 hunks)
  • app/llm/request_handler/request_handler_interface.py (2 hunks)
  • app/pipeline/chat/course_chat_pipeline.py (7 hunks)
  • requirements.txt (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • app/llm/external/model.py
  • app/llm/external/ollama.py
  • app/llm/request_handler/request_handler_interface.py
  • requirements.txt
🧰 Additional context used
🪛 Ruff
app/llm/langchain/iris_langchain_chat_model.py

37-37: Do not perform function call CompletionArguments in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

🔇 Additional comments (12)
app/llm/external/__init__.py (2)

1-1: LGTM! Clean import of Union type.

The explicit import of Union from typing module follows Python type hinting best practices.


15-23: Consider documenting tool binding support requirements.

Given that this PR implements native function calling capabilities, it would be helpful to document which of these LLM implementations support tool binding. Not all models (especially embedding models) would typically support function calling features.

Let's verify which models implement tool binding:

✅ Verification successful

Document Tool Binding Support for LLM Models

Document that the following models support tool binding:

  • DirectOpenAICompletionModel
  • AzureOpenAICompletionModel
  • DirectOpenAIChatModel
  • AzureOpenAIChatModel
  • OllamaModel

The embedding models do not support tool binding.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check which model classes implement tool binding functionality

# Search for classes that implement bind_tools method
ast-grep --pattern 'class $CLASS_NAME($_) {
  $$$
  def bind_tools($$$) {
    $$$
  }
  $$$
}'

# Also search for abstract method declarations
ast-grep --pattern 'class $CLASS_NAME($_) {
  $$$
  @abstractmethod
  def bind_tools($$$):
    $$$
  $$$
}'

Length of output: 206


Script:

#!/bin/bash
# Description: Improved search for classes implementing or declaring bind_tools method

# Search for classes that implement bind_tools method with any parameters and decorators
ast-grep --pattern 'class $_($_) {
  $$$
  @?[$$_decorator]?
  def bind_tools($_args) {
    $$$
  }
  $$$
}'

# Search for bind_tools method declarations with decorators like @abstractmethod
ast-grep --pattern 'class $_($_) {
  $$$
  @abstractmethod
  def bind_tools($_args):
    $$$
  $$$
}'

Length of output: 214


Script:

#!/bin/bash
# Description: Search for all occurrences of bind_tools method definitions using ripgrep

# Search for method definitions named bind_tools with any decorators and parameters
rg -n --pretty 'def bind_tools\('

# Additionally, search for any class that inherits from a base class that might define bind_tools
rg -n --pretty 'class\s+\w+\s*\(.*\):' | rg -i 'bind_tools'

Length of output: 552

app/llm/llm_manager.py (1)

2-2: LGTM: Import changes align with modern type hinting practices

The switch from Field to Discriminator and addition of Annotated support better type discrimination for polymorphic models.

Also applies to: 4-4

app/llm/request_handler/basic_request_handler.py (2)

1-5: LGTM: Import statements are well-organized and necessary.

The new imports support the tool binding functionality and follow Python's import style guidelines.


17-18: ⚠️ Potential issue

Consider the implications of nullable llm_manager.

While making llm_manager nullable aligns with the constructor change, all methods (complete, chat, embed, bind_tools) assume it is non-null and use it directly without checks, which could raise an AttributeError at runtime.

Let's verify if there are any null checks in the codebase:

Consider either:

  1. Adding null checks in methods:
def bind_tools(self, tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]]) -> LanguageModel:
    if self.llm_manager is None:
        raise ValueError("LlmManager is not initialized")
    llm = self.llm_manager.get_llm_by_id(self.model_id)
    llm.bind_tools(tools)
    return llm
  2. Or ensuring llm_manager is always initialized:
llm_manager: LlmManager
app/llm/langchain/iris_langchain_chat_model.py (3)

2-15: LGTM: Import changes are well-organized and support the new functionality.

The new imports are properly organized and necessary for implementing the tool calling feature.


32-32: LGTM: Logger type annotation improves code clarity.

The addition of type annotation for the logger follows Python typing best practices.


44-69: LGTM: Well-implemented tool binding with proper validation and documentation.

The implementation is clean, type-safe, and includes proper validation and comprehensive documentation. The method chaining pattern is a nice touch for API usability.

app/llm/request_handler/capability_request_handler.py (1)

105-105: Document OpenAI-specific tool binding support

The PR indicates this is specifically for OpenAI tool binding, but the code assumes all ChatModel instances support it. Consider:

  1. Adding a comment indicating OpenAI-specific support
  2. Adding a more specific model check beyond just ChatModel
app/pipeline/chat/course_chat_pipeline.py (1)

256-264: 🛠️ Refactor suggestion

Improve mastery calculation configuration and consistency.

  1. The weight factor is hardcoded and lacks explanation
  2. The mastery calculation differs from the get_mastery function
-            weight = 2.0 / 3.0
+            # Weight factor: 2/3 for confidence and 1/3 for progress
+            CONFIDENCE_WEIGHT = 2.0 / 3.0
             return [
                 {
                     "info": competency_metrics.competency_information.get(comp, None),
                     "exercise_ids": competency_metrics.exercises.get(comp, []),
                     "progress": competency_metrics.progress.get(comp, 0),
-                    "mastery": (
-                        (1 - weight) * competency_metrics.progress.get(comp, 0)
-                        + weight * competency_metrics.confidence.get(comp, 0)
-                    ),
+                    "mastery": get_mastery(
+                        competency_metrics.progress.get(comp, 0),
+                        competency_metrics.confidence.get(comp, 0)
+                    ),

Likely invalid or redundant comment.

app/llm/external/openai_chat.py (2)

66-79: Validate the structure of tool calls for OpenAI API compatibility.

The create_openai_tool_calls function formats tool calls for the API:

{
    "id": tool.id,
    "type": tool.type,
    "function": {
        "name": tool.function.name,
        "arguments": json.dumps(tool.function.arguments),
    },
}

Confirm that this structure aligns with OpenAI's function calling API specifications. Any discrepancies might result in API errors.

Run the following script to verify the tool call structure:

Alternatively, consult OpenAI's official documentation to ensure compliance.
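For reference, the Chat Completions API expects an assistant message carrying the tool calls, followed by one tool message per call; the values below are illustrative:

assistant_turn = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_abc123",
            "type": "function",
            "function": {
                "name": "get_submission_details",
                "arguments": "{\"exercise_id\": 42}",  # arguments must be a JSON-encoded string
            },
        }
    ],
}

tool_turn = {
    "role": "tool",
    "tool_call_id": "call_abc123",  # must match the id of the call being answered
    "content": "No submission found for exercise 42.",
}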


181-190: Confirm handling of empty contents in PyrisAIMessage.

When message.tool_calls is present, the convert_to_iris_message function returns a PyrisAIMessage with empty contents:

contents=[TextMessageContentDTO(textContent="")],

Verify that downstream components can handle PyrisAIMessage instances with empty content strings without issues.

Run the following script to check for potential problems:

Ensure that all parts of the codebase gracefully handle empty content strings.

✅ Verification successful

Downstream components handle empty contents correctly in PyrisAIMessage.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Search for usages of PyrisAIMessage and check handling of contents.

# Find all usages of PyrisAIMessage where contents are accessed.
rg 'PyrisAIMessage' -A 5

# Look for any assumptions that contents are non-empty.
rg 'message.contents\[' -A 2

Length of output: 5066

Resolved review threads: app/llm/llm_manager.py, app/pipeline/chat/course_chat_pipeline.py (two threads), app/llm/external/openai_chat.py (four threads)
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (4)
app/pipeline/chat/exercise_chat_agent_pipeline.py (4)

62-64: Consider using template strings for better maintainability

The current string replacement approach for curly braces is error-prone and harder to maintain.

Consider using a template string or format string:

-    - **Exercise Title:** {exercise_title.replace("{", "{{").replace("}", "}}")}
-    - **Problem Statement:** {problem_statement.replace("{", "{{").replace("}", "}}")}
-    - **Programming Language:** {programming_language}
+    - **Exercise Title:** {{exercise_title}}
+    - **Problem Statement:** {{problem_statement}}
+    - **Programming Language:** {programming_language}

134-142: Document the rationale for temperature setting

The LLM temperature setting of 0.5 is a crucial parameter that affects response randomness, but its choice isn't documented.

Add a comment explaining why this specific temperature value was chosen:

-        completion_args = CompletionArguments(temperature=0.5, max_tokens=2000)
+        # Temperature of 0.5 provides a balance between creative and deterministic responses
+        # while maintaining consistency in exercise-related explanations
+        completion_args = CompletionArguments(temperature=0.5, max_tokens=2000)

387-394: Consider making result_limit configurable

The result limit of 5 is hardcoded in the lecture content retrieval, which reduces flexibility.

Consider making it configurable:

+    def __init__(self, ..., lecture_result_limit: int = 5):
+        self.lecture_result_limit = lecture_result_limit
+
     def lecture_content_retrieval(...):
         self.retrieved_paragraphs = self.retriever(
             chat_history=chat_history,
             student_query=query.contents[0].text_content,
-            result_limit=5,
+            result_limit=self.lecture_result_limit,
             course_name=dto.course.name,

556-560: Replace magic string with constant

The string "!ok!" is used as a magic string for response validation.

Define it as a class constant:

+    RESPONSE_APPROVAL_MARKER = "!ok!"
+
     def __call__(self, dto: ExerciseChatPipelineExecutionDTO):
         # ...
-        if "!ok!" in guide_response:
+        if self.RESPONSE_APPROVAL_MARKER in guide_response:
             print("Response is ok and not rewritten!!!")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between ec52f25 and 10f4755.

📒 Files selected for processing (3)
  • app/common/PipelineEnum.py (1 hunks)
  • app/pipeline/chat/exercise_chat_agent_pipeline.py (1 hunks)
  • app/pipeline/chat/exercise_chat_pipeline.py (6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • app/pipeline/chat/exercise_chat_pipeline.py
🔇 Additional comments (3)
app/common/PipelineEnum.py (2)

8-8: LGTM! The new enum member follows conventions.

The addition of IRIS_CHAT_EXERCISE_AGENT_MESSAGE follows the established naming patterns and is appropriately placed near related exercise chat enums.


8-8: Verify enum handling in switch/match statements.

Since this is a new enum value, ensure all switch/match statements handling PipelineEnum are updated to handle this new case where necessary.

app/pipeline/chat/exercise_chat_agent_pipeline.py (1)

606-623: LGTM! Well-implemented utility method

The should_allow_lecture_tool method is well-implemented with proper error handling, type hints, and documentation.


❌ Unable to deploy to test server ❌

Pyris Testserver is already in use by PR #142.

Contributor

@yassinsws yassinsws left a comment


The exercise chat pipeline works well with the lecture chat pipeline. Good job, Kaan!

Member

@alexjoham alexjoham left a comment


Code looks good to me! Also checked the token tracking, works as it should!

@kaancayli kaancayli merged commit d7f5bf0 into main Nov 26, 2024
12 checks passed
@kaancayli kaancayli deleted the feature/exercise-chat/agent branch November 26, 2024 23:50
@bassner bassner added this to the 1.1.0 milestone Feb 3, 2025