Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mail channel initial #1615

Closed
wants to merge 19 commits into from

Conversation

hasinaxp
Copy link
Contributor

@hasinaxp hasinaxp commented Dec 4, 2024

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced mail channel functionality in the CLI for processing emails.
    • Added endpoints for managing mail configurations (create, retrieve, update, delete).
    • Implemented mail scheduling capabilities, including a new event type for mail channel scheduling.
    • Enhanced the command line interface to support mail channel commands.
    • Added a new method for processing multiple messages through a bot.
  • Bug Fixes

    • Enhanced error handling for mail configuration operations.
  • Tests

    • Expanded test coverage for mail configurations and mail scheduling functionalities.
    • Added unit tests for the new mail channel functionality and mail processing features.
    • Introduced tests for email verification scenarios.
  • Documentation

    • Updated integration specifications to include required fields for the new mail channel.
    • Enhanced configuration documentation to reflect new mail-related settings.

Copy link

coderabbitai bot commented Dec 4, 2024

Walkthrough

The pull request introduces significant enhancements to the Kairon application, primarily focusing on mail channel functionalities. It adds a new mail_channel module to the command-line interface, new API endpoints for mail configuration management, and a MailChannelScheduleEvent class for scheduling mail events. Additionally, various supporting classes and methods for processing emails, managing configurations, and scheduling tasks are introduced. The changes also include updates to configuration files and dependencies, alongside a comprehensive suite of tests to ensure the functionality of the new features.

Changes

File Path Change Summary
kairon/__init__.py Added mail_channel module to imported subparsers and invoked mail_channel.add_subparser.
kairon/api/app/routers/bot/bot.py Added endpoints: GET /mail/config, POST /mail/config, PUT /mail/config, DELETE /mail/config/{intent}.
kairon/chat/agent_processor.py Introduced get_agent_without_cache method to retrieve agent without cache.
kairon/chat/utils.py Added process_messages_via_bot method for batch message processing.
kairon/cli/mail_channel.py New file with process_channel_mails and add_subparser functions for mail channel CLI commands.
kairon/events/definitions/factory.py Added new event type MailChannelScheduleEvent to the EventFactory.
kairon/events/definitions/mail_channel_schedule.py Introduced MailChannelScheduleEvent class with methods for validation, enqueueing, and execution.
kairon/events/server.py Added request_epoch endpoint and integrated MailScheduler.epoch() in application startup.
kairon/shared/channels/mail/constants.py Introduced MailConstants class for mail-related constants.
kairon/shared/channels/mail/data_objects.py Added MailClassificationConfig class for managing mail classification configurations.
kairon/shared/channels/mail/processor.py New MailProcessor class for email handling and processing.
kairon/shared/channels/mail/scheduler.py New MailScheduler class for scheduling email processing tasks.
kairon/shared/chat/processor.py Modified save_channel_config to invoke MailScheduler.request_epoch() for mail connector type.
kairon/shared/constants.py Updated EventClass and ChannelTypes enums to include new members for email handling.
kairon/shared/data/data_models.py Added MailConfigRequest class for mail configuration handling.
metadata/integrations.yml Introduced new "mail" integration channel with required and optional fields.
requirements/prod.txt Updated huggingface-hub version and added imap-tools dependency.
system.yaml Added mail_queue_name and mail_scheduler_collection parameters; updated evaluator URL.
tests/integration_test/event_service_test.py Added test_request_epoch for testing the new mail request epoch endpoint.
tests/integration_test/services_test.py Added tests for mail configuration API functionalities.
tests/unit_test/channels/mail_channel_test.py New tests for MailProcessor class functionalities.
tests/unit_test/channels/mail_scheduler_test.py Added tests for MailScheduler class methods.
tests/unit_test/chat/chat_test.py Added test for process_messages_via_bot method in ChatDataProcessor.
tests/unit_test/cli_test.py New tests for the mail channel CLI functionality.
tests/unit_test/data_processor/agent_processor_test.py Added test for get_agent_without_cache method.
tests/unit_test/events/definitions_test.py Added tests for MailChannelScheduleEvent class functionalities.
tests/unit_test/utility_test.py Enhanced email handling tests and updated get_channels method to include "mail".

Possibly related PRs

  • content importer fix #1527: Modifies the create_argument_parser function in kairon/__init__.py by adding a new subparser for the content_importer module, indicating a related enhancement to the CLI functionality.

Suggested reviewers

  • hiteshghuge

🐰 In the garden, where the flowers bloom,
New mail features chase away the gloom.
With channels and events, all in a row,
Kairon hops forward, ready to grow!
So let’s send those emails, one by one,
A joyful journey has just begun! 🌼✉️


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 19

🧹 Outside diff range and nitpick comments (33)
tests/integration_test/action_service_test.py (1)

6-6: Use uppercase for environment variable names

Following Python conventions, environment variable names should be in uppercase.

-os.environ["system_file"] = "./tests/testing_data/system.yaml"
+os.environ["SYSTEM_FILE"] = "./tests/testing_data/system.yaml"
🧰 Tools
🪛 Ruff (0.8.0)

6-6: Use capitalized environment variable SYSTEM_FILE instead of system_file

Replace system_file with SYSTEM_FILE

(SIM112)

kairon/shared/channels/mail/scheduler.py (2)

65-66: Remove unused parameter scheduler in process_mails method

The scheduler parameter in the process_mails method is not used within the function body and can be safely removed to clean up the code.

Apply this diff to remove the unused parameter:

-    def process_mails(bot, scheduler: BackgroundScheduler = None):
+    def process_mails(bot):

78-78: Replace print statement with proper logging

Using print statements is not recommended in production code. Replace print(vals) with logger.debug(vals) or an appropriate logging level.

Apply this diff to replace the print statement:

-            print(vals)
+            logger.debug(vals)
kairon/shared/channels/mail/data_objects.py (2)

61-61: Use exception chaining to preserve traceback

When raising a new exception inside an except block, use raise AppException(str(e)) from e to preserve the original traceback and provide better error context.

Apply this diff to each except block:

-                raise AppException(str(e))
+                raise AppException(str(e)) from e

Also applies to: 79-79, 94-94, 102-102, 109-109, 120-120

🧰 Tools
🪛 Ruff (0.8.0)

61-61: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


114-114: Simplify dictionary key iteration

Instead of iterating over kwargs.keys(), you can iterate directly over kwargs for better readability and efficiency.

Apply this diff to simplify the loop:

-            for key in kwargs.keys():
+            for key in kwargs:
🧰 Tools
🪛 Ruff (0.8.0)

114-114: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)

kairon/shared/channels/mail/processor.py (2)

77-78: Consider logging the exception instead of ignoring it

The exception e is assigned but not used. Logging the exception can help with debugging SMTP validation failures.

Apply this diff to log the exception:

            except Exception as e:
+                logger.error(f"SMTP validation failed: {str(e)}")
                return False
🧰 Tools
🪛 Ruff (0.8.0)

77-77: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


127-127: Use exception chaining to preserve traceback

When raising a new exception inside an except block, use raise AppException(str(e)) from e to retain the original traceback.

Apply this diff:

-            raise AppException(str(e))
+            raise AppException(str(e)) from e

Also applies to: 188-188

🧰 Tools
🪛 Ruff (0.8.0)

127-127: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

tests/unit_test/channels/mail_channel_test.py (1)

11-11: Use capitalized environment variable SYSTEM_FILE instead of system_file

Environment variable names should be uppercase. Replace system_file with SYSTEM_FILE to follow the convention.

Apply this diff:

-os.environ["system_file"] = "./tests/testing_data/system.yaml"
+os.environ["SYSTEM_FILE"] = "./tests/testing_data/system.yaml"
🧰 Tools
🪛 Ruff (0.8.0)

11-11: Use capitalized environment variable SYSTEM_FILE instead of system_file

Replace system_file with SYSTEM_FILE

(SIM112)

tests/integration_test/services_test.py (3)

4464-4475: Enhance test coverage for mail configuration retrieval.

The current test only covers the successful retrieval case. Consider adding test cases for:

  1. Empty configuration list
  2. Invalid bot ID
  3. Unauthorized access

Also, remove the debug print statement.

Apply this diff to remove the print statement:

     response = client.get(
         f"/api/bot/{pytest.bot}/mail/config",
         headers={"Authorization": pytest.token_type + " " + pytest.access_token},
     )
     actual = response.json()
-    print(actual)
     assert actual["success"]
     assert len(actual['data']) == 1
     assert actual['data'][0]['intent'] == 'greet'
     assert actual['error_code'] == 0

Would you like me to help generate the additional test cases?


4510-4528: Enhance test coverage for mail configuration deletion.

The current test only covers the successful deletion case. Consider adding test cases for:

  1. Deleting non-existent configuration
  2. Invalid intent parameter
  3. Unauthorized access

Also, remove the debug print statement.

Apply this diff to remove the print statement:

     response = client.get(
         f"/api/bot/{pytest.bot}/mail/config",
         headers={"Authorization": pytest.token_type + " " + pytest.access_token},
     )
     actual = response.json()
-    print(actual)
     assert actual["success"]
     assert len(actual['data']) == 0

Would you like me to help generate the additional test cases?


4409-4531: Improve overall test structure and organization.

Consider the following improvements to enhance test maintainability:

  1. Move common test data to fixtures
  2. Group related test cases using test classes
  3. Remove excessive blank lines between test functions
  4. Add docstrings to describe test scenarios

Example structure:

@pytest.fixture
def mail_config_data():
    return {
        "intent": "greet",
        "entities": ["name", "subject", "summary"],
        "classification_prompt": "any personal mail of greeting"
    }

class TestMailConfig:
    """Test cases for mail configuration management."""
    
    def test_add_mail_config(self, mail_config_data):
        """Test successful addition of mail configuration."""
        response = client.post(
            f"/api/bot/{pytest.bot}/mail/config",
            json=mail_config_data,
            headers={"Authorization": pytest.token_type + " " + pytest.access_token},
        )
        actual = response.json()
        assert actual["success"]
        assert actual['message'] == 'Config applied!'
        assert actual['error_code'] == 0
kairon/shared/channels/mail/constants.py (2)

21-22: Enhance system prompt for better intent classification

The current system prompt is quite generic. Consider providing more specific instructions for email intent classification and entity extraction.

-    DEFAULT_SYSTEM_PROMPT = 'Classify into one of the intents and extract entities as given in the context.' \
-                            'If the mail does not belong to any of the intents, classify intent as null.'
+    DEFAULT_SYSTEM_PROMPT = '''
+        Analyze the email content and:
+        1. Classify it into one of the provided intents based on the primary purpose
+        2. Extract relevant entities mentioned in the email
+        3. If no matching intent is found, classify as null
+        4. Ensure extracted entities align with the classified intent
+        '''

4-6: Consider security implications of default mail settings

The default SMTP/IMAP settings are specific to Gmail. Consider:

  1. Making these configurable via environment variables
  2. Adding support for other email providers
  3. Documenting SSL/TLS requirements
kairon/events/definitions/mail_channel_schedule.py (1)

15-21: Enhance constructor documentation

The constructor's docstring should document the parameters and their types.

     def __init__(self, bot: Text, user: Text, **kwargs):
         """
         Initialise event.
+
+        Args:
+            bot (Text): The bot identifier
+            user (Text): The user identifier
+            **kwargs: Additional keyword arguments
         """
         self.bot = bot
         self.user = user
kairon/__init__.py (1)

Line range hint 1-46: Update docstring to include mail channel usage examples

The docstring should be updated to include usage examples for the new mail channel functionality to maintain consistency with other commands.

kairon/events/server.py (2)

150-151: Remove unnecessary empty lines

Remove the extra empty lines before the new endpoint definition to maintain consistent spacing with other endpoints.


151-155: Optimize imports and maintain consistency

  1. Move the MailScheduler import to the top of the file with other imports
  2. The endpoint implementation follows FastAPI patterns correctly
 from kairon.shared.utils import Utility
 from contextlib import asynccontextmanager
 from kairon.shared.otel import instrument_fastapi
+from kairon.shared.channels.mail.scheduler import MailScheduler

 hsts = StrictTransportSecurity().include_subdomains().preload().max_age(31536000)
tests/unit_test/channels/mail_scheduler_test.py (3)

9-10: Use uppercase for environment variables

Environment variables should use uppercase naming convention.

-os.environ["system_file"] = "./tests/testing_data/system.yaml"
+os.environ["SYSTEM_FILE"] = "./tests/testing_data/system.yaml"
🧰 Tools
🪛 Ruff (0.8.0)

9-9: Use capitalized environment variable SYSTEM_FILE instead of system_file

Replace system_file with SYSTEM_FILE

(SIM112)


80-80: Remove unused variable

The mock_add_job variable is assigned but never used.

-            patch("apscheduler.schedulers.background.BackgroundScheduler.add_job", autospec=True) as mock_add_job:
+            patch("apscheduler.schedulers.background.BackgroundScheduler.add_job", autospec=True):
🧰 Tools
🪛 Ruff (0.8.0)

80-80: Local variable mock_add_job is assigned to but never used

Remove assignment to unused variable mock_add_job

(F841)


33-44: Add assertions for scheduler configuration

The test verifies that add_job is called but doesn't verify the job configuration.

Consider adding assertions for:

  • Job interval/frequency
  • Job function arguments
  • Job ID or name
kairon/shared/chat/processor.py (1)

43-43: Move import to top of file

Conditional imports can impact performance and make code harder to maintain.

Consider moving the import to the top of the file with other imports.

tests/testing_data/system.yaml (2)

124-124: Add queue size limit configuration

Consider adding a configuration parameter for the mail queue size limit to prevent unbounded growth.

mail_queue_name: ${EVENTS_MAIL_QUEUE_NAME:"mail_queue"}
mail_queue_size_limit: ${EVENTS_MAIL_QUEUE_SIZE_LIMIT:1000}

144-144: Add retention period for mail scheduler collection

Consider adding a configuration parameter for data retention in the mail scheduler collection.

mail_scheduler_collection: ${MAIL_SCHEDULER_COLLECTION:"mail_scheduler"}
mail_scheduler_retention_days: ${MAIL_SCHEDULER_RETENTION_DAYS:30}
tests/integration_test/event_service_test.py (1)

537-544: Add additional test cases for comprehensive coverage

While the happy path is tested, consider adding these test cases:

  1. Error handling when MailScheduler.epoch fails
  2. Validation of request parameters if any

Example additional test:

@patch('kairon.shared.channels.mail.scheduler.MailScheduler.epoch')
def test_request_epoch_failure(mock_epoch):
    mock_epoch.side_effect = Exception("Failed to process epoch")
    response = client.get('/api/mail/request_epoch')
    mock_epoch.assert_called_once()
    assert response.status_code == 422
    resp = response.json()
    assert resp['success'] is False
    assert "Failed to process epoch" in resp['message']
tests/unit_test/cli_test.py (1)

340-351: Consider consolidating the nested context managers.

The test effectively verifies error handling, but the nested context managers can be combined for better readability.

-        with patch('argparse.ArgumentParser.parse_args',
-                   return_value=argparse.Namespace(func=process_channel_mails,
-                                                   bot="test_bot",
-                                                    user="test_user", mails=data)):
-            with pytest.raises(ValueError):
-                cli()
+        with (
+            patch('argparse.ArgumentParser.parse_args',
+                  return_value=argparse.Namespace(func=process_channel_mails,
+                                                bot="test_bot",
+                                                user="test_user", mails=data)),
+            pytest.raises(ValueError)
+        ):
+            cli()
🧰 Tools
🪛 Ruff (0.8.0)

345-349: Use a single with statement with multiple contexts instead of nested with statements

(SIM117)

kairon/shared/data/data_models.py (1)

1346-1351: Add field validation and documentation to the MailConfigRequest class.

Consider enhancing the class with:

  1. Field validation for required fields
  2. Type hints for better IDE support
  3. Class and field documentation
 class MailConfigRequest(BaseModel):
-    intent: str
-    entities: list[str] = []
-    subjects: list[str] = []
-    classification_prompt: str
-    reply_template: str = None
+    """
+    Pydantic model for mail configuration requests.
+    """
+    intent: str = Field(..., description="Intent to be used for mail classification")
+    entities: list[str] = Field(default=[], description="List of entities to extract from mail")
+    subjects: list[str] = Field(default=[], description="List of mail subjects to match")
+    classification_prompt: str = Field(..., description="Prompt used for mail classification")
+    reply_template: Optional[str] = Field(default=None, description="Template for mail replies")
+
+    @validator("intent")
+    def validate_intent(cls, v):
+        if not v.strip():
+            raise ValueError("intent cannot be empty")
+        return v
+
+    @validator("classification_prompt")
+    def validate_prompt(cls, v):
+        if not v.strip():
+            raise ValueError("classification_prompt cannot be empty")
+        return v
tests/unit_test/chat/chat_test.py (1)

942-966: Add test coverage for error scenarios and metadata handling.

The test effectively verifies the happy path, but consider adding:

  1. Test cases for error scenarios (e.g., agent initialization failure)
  2. Assertions to verify metadata is correctly passed to the agent
  3. Test cases with different message types

Example test case for error scenario:

@pytest.mark.asyncio
@patch("kairon.chat.utils.AgentProcessor.get_agent_without_cache")
@patch("kairon.chat.utils.ChatUtils.get_metadata")
async def test_process_messages_via_bot_error(mock_get_metadata, mock_get_agent_without_cache):
    messages = ["/greet"]
    mock_get_metadata.return_value = {"key": "value"}
    mock_get_agent_without_cache.side_effect = Exception("Failed to initialize agent")
    
    with pytest.raises(Exception, match="Failed to initialize agent"):
        await ChatUtils.process_messages_via_bot(
            messages, 1, "test_bot", "test_user", False, {}
        )
tests/unit_test/events/definitions_test.py (5)

1204-1221: Clean up unused mock variables

The test contains unused mock variables that should be removed or utilized:

  • mp on line 1214
  • mock_login on line 1215
  • mock_logout on line 1216
-        with patch('kairon.shared.channels.mail.processor.MailProcessor.__init__', return_value=None) as mp:
-            with patch('kairon.shared.channels.mail.processor.MailProcessor.login_smtp', return_value=None) as mock_login:
-                with patch('kairon.shared.channels.mail.processor.MailProcessor.logout_smtp', return_value=None) as mock_logout:
+        with patch('kairon.shared.channels.mail.processor.MailProcessor.__init__', return_value=None), \
+             patch('kairon.shared.channels.mail.processor.MailProcessor.login_smtp', return_value=None), \
+             patch('kairon.shared.channels.mail.processor.MailProcessor.logout_smtp', return_value=None):
🧰 Tools
🪛 Ruff (0.8.0)

1214-1215: Use a single with statement with multiple contexts instead of nested with statements

(SIM117)


1214-1214: Local variable mp is assigned to but never used

Remove assignment to unused variable mp

(F841)


1215-1215: Local variable mock_login is assigned to but never used

Remove assignment to unused variable mock_login

(F841)


1216-1216: Local variable mock_logout is assigned to but never used

Remove assignment to unused variable mock_logout

(F841)


1238-1253: Improve test structure using pytest.raises

The test could be more idiomatic by using pytest.raises for exception handling instead of try/except with pytest.fail.

-        try:
-            event.enqueue()
-        except AppException as e:
-            pytest.fail(f"Unexpected exception: {e}")
+        event.enqueue()

1255-1270: Remove unused import

The 'patch' import from unittest.mock is not used in this test.

-        from unittest.mock import patch
🧰 Tools
🪛 Ruff (0.8.0)

1258-1258: unittest.mock.patch imported but unused

Remove unused import: unittest.mock.patch

(F401)


1271-1278: Improve test structure using pytest.raises

Similar to the earlier test, this could be more idiomatic by removing the try/except with pytest.fail.

-        try:
-            MailChannelScheduleEvent("", "").execute()
-        except AppException as e:
-            pytest.fail(f"Unexpected exception: {e}")
+        MailChannelScheduleEvent("", "").execute()

1279-1288: Combine nested with statements

Improve readability by combining the nested with statements.

-        with patch("kairon.shared.channels.mail.processor.MailProcessor.process_message_task",
-                   side_effect=Exception("Test")):
-            with pytest.raises(AppException, match="Test"):
-                MailChannelScheduleEvent("", "").execute(mails=["[email protected]"])
+        with patch("kairon.shared.channels.mail.processor.MailProcessor.process_message_task",
+                   side_effect=Exception("Test")), \
+             pytest.raises(AppException, match="Test"):
+            MailChannelScheduleEvent("", "").execute(mails=["[email protected]"])
🧰 Tools
🪛 Ruff (0.8.0)

1285-1287: Use a single with statement with multiple contexts instead of nested with statements

(SIM117)

kairon/api/app/routers/bot/bot.py (1)

1664-1676: Improve data formatting readability

The list comprehension could be replaced with a more readable approach using a separate function.

+    def format_mail_config(item):
+        data = item.to_mongo()
+        data.pop('_id', None)
+        data.pop('user', None)
+        return data

-    formatted_data = [
-        {key: value for key, value in item.to_mongo().items() if key not in {"_id", "user"}}
-        for item in data
-    ]
+    formatted_data = [format_mail_config(item) for item in data]
🧰 Tools
🪛 Ruff (0.8.0)

1666-1666: Do not perform function call Security in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between c3def34 and e28f9f0.

📒 Files selected for processing (29)
  • kairon/__init__.py (2 hunks)
  • kairon/api/app/routers/bot/bot.py (3 hunks)
  • kairon/chat/agent_processor.py (1 hunks)
  • kairon/chat/utils.py (1 hunks)
  • kairon/cli/mail_channel.py (1 hunks)
  • kairon/events/definitions/factory.py (2 hunks)
  • kairon/events/definitions/mail_channel_schedule.py (1 hunks)
  • kairon/events/server.py (2 hunks)
  • kairon/shared/channels/mail/constants.py (1 hunks)
  • kairon/shared/channels/mail/data_objects.py (1 hunks)
  • kairon/shared/channels/mail/processor.py (1 hunks)
  • kairon/shared/channels/mail/scheduler.py (1 hunks)
  • kairon/shared/chat/processor.py (1 hunks)
  • kairon/shared/constants.py (2 hunks)
  • kairon/shared/data/data_models.py (1 hunks)
  • metadata/integrations.yml (1 hunks)
  • requirements/prod.txt (1 hunks)
  • system.yaml (3 hunks)
  • tests/integration_test/action_service_test.py (2 hunks)
  • tests/integration_test/event_service_test.py (1 hunks)
  • tests/integration_test/services_test.py (3 hunks)
  • tests/testing_data/system.yaml (2 hunks)
  • tests/unit_test/channels/mail_channel_test.py (1 hunks)
  • tests/unit_test/channels/mail_scheduler_test.py (1 hunks)
  • tests/unit_test/chat/chat_test.py (2 hunks)
  • tests/unit_test/cli_test.py (2 hunks)
  • tests/unit_test/data_processor/agent_processor_test.py (1 hunks)
  • tests/unit_test/events/definitions_test.py (1 hunks)
  • tests/unit_test/utility_test.py (1 hunks)
🔥 Files not summarized due to errors (1)
  • tests/integration_test/action_service_test.py: Error: Server error: no LLM provider could handle the message
🧰 Additional context used
🪛 Ruff (0.8.0)
tests/integration_test/action_service_test.py

6-6: Use capitalized environment variable SYSTEM_FILE instead of system_file

Replace system_file with SYSTEM_FILE

(SIM112)

tests/unit_test/channels/mail_scheduler_test.py

9-9: Use capitalized environment variable SYSTEM_FILE instead of system_file

Replace system_file with SYSTEM_FILE

(SIM112)


80-80: Local variable mock_add_job is assigned to but never used

Remove assignment to unused variable mock_add_job

(F841)

tests/unit_test/cli_test.py

323-323: Use capitalized environment variable SYSTEM_FILE instead of system_file

Replace system_file with SYSTEM_FILE

(SIM112)


345-349: Use a single with statement with multiple contexts instead of nested with statements

(SIM117)

kairon/events/definitions/mail_channel_schedule.py

39-39: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


51-51: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

kairon/api/app/routers/bot/bot.py

1666-1666: Do not perform function call Security in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


1683-1683: Do not perform function call Security in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


1696-1696: Do not perform function call Security in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


1710-1710: Do not perform function call Security in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

kairon/shared/channels/mail/data_objects.py

61-61: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


79-79: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


94-94: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


102-102: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


109-109: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


114-114: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)


120-120: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

kairon/shared/channels/mail/processor.py

77-77: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


127-127: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


188-188: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

tests/unit_test/channels/mail_channel_test.py

11-11: Use capitalized environment variable SYSTEM_FILE instead of system_file

Replace system_file with SYSTEM_FILE

(SIM112)

tests/unit_test/events/definitions_test.py

1214-1215: Use a single with statement with multiple contexts instead of nested with statements

(SIM117)


1214-1214: Local variable mp is assigned to but never used

Remove assignment to unused variable mp

(F841)


1215-1215: Local variable mock_login is assigned to but never used

Remove assignment to unused variable mock_login

(F841)


1216-1216: Local variable mock_logout is assigned to but never used

Remove assignment to unused variable mock_logout

(F841)


1258-1258: unittest.mock.patch imported but unused

Remove unused import: unittest.mock.patch

(F401)


1285-1287: Use a single with statement with multiple contexts instead of nested with statements

(SIM117)

🔇 Additional comments (12)
tests/integration_test/action_service_test.py (1)

24-25: LGTM!

Empty lines between imports improve readability.

kairon/chat/agent_processor.py (1)

82-99: Verify the impact on memory usage

The method bypasses caching which could lead to memory issues if used frequently. Consider adding monitoring or rate limiting.

requirements/prod.txt (1)

70-71: Consider updating imap-tools to latest version.

The imap-tools package has a newer version (1.7.5) available with bug fixes. Consider updating unless there's a specific reason to use 1.7.4.

Let's verify dependency compatibility:

metadata/integrations.yml (1)

80-82: ⚠️ Potential issue

Ensure secure handling of email credentials.

The email_account and email_password fields contain sensitive information. Ensure these are:

  1. Stored securely (encrypted at rest)
  2. Transmitted securely (encrypted in transit)
  3. Not logged in plain text

Let's check for any potential security implementations:

✅ Verification successful

Based on the search results, I can see that the codebase has robust security implementations for handling sensitive data, including:

  1. Encryption using Fernet symmetric encryption for secrets and credentials
  2. Secure parameter handling with encryption flags
  3. Masking of sensitive values in logs
  4. Encrypted storage of credentials in key vaults

Let me verify the specific handling of email credentials with another search:


Email credentials are handled securely with proper encryption and security measures.

The codebase demonstrates robust security implementations for email credentials:

  1. Email credentials are stored in encrypted form using Fernet symmetric encryption (a secure implementation provided by the cryptography package)
  2. The codebase includes proper parameter validation and secure handling through EmailActionConfig
  3. Credentials are processed through secure channels:
    • SMTP connections use TLS/SSL (smtp.starttls())
    • IMAP connections are handled securely through the imap-tools library
  4. Password and sensitive data are:
    • Never logged in plain text
    • Encrypted at rest using Fernet encryption
    • Transmitted securely during SMTP/IMAP authentication
    • Handled through dedicated credential management classes

The implementation follows security best practices for handling email credentials, with proper encryption both at rest and in transit.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for encryption/security related code
rg -l "encrypt|decrypt|cipher|fernet" | xargs rg -A 5 "encrypt|decrypt|cipher|fernet"

# Check for credential handling in mail-related files
fd -e py | rg "mail|email" | xargs rg -A 5 "password|credential|secret"

Length of output: 120503


Script:

#!/bin/bash
# Check email credential handling in IMAP/SMTP implementations
rg -A 5 "email_account|email_password|imap_server" --type py

# Check email configuration handling
rg -A 5 "EmailActionConfig|SmtpConfig" --type py

Length of output: 42952

kairon/__init__.py (1)

47-48: LGTM: Mail channel CLI integration

The integration of mail_channel into the CLI subparsers follows the established pattern and maintains consistency with other modules.

Also applies to: 66-66

kairon/shared/constants.py (1)

83-83: LGTM: Enum additions for mail channel support

The new enum members email_channel_scheduler and MAIL follow consistent naming patterns and are appropriately placed within their respective enums.

Also applies to: 119-119

system.yaml (2)

130-130: LGTM! Mail configuration entries follow best practices

The new configuration entries for mail queue and scheduler collection:

  • Follow the existing pattern of using environment variables
  • Have sensible default values
  • Use consistent naming conventions

Also applies to: 145-145


158-158: Verify the impact of URL change

The evaluator URL has been changed from a specific IP (192.168.100.109) to localhost. Ensure this change doesn't break existing deployments.

✅ Verification successful

The previous search might have failed because the IP was too specific. Let's broaden our search to find any references to the evaluator service or its port, as well as checking configuration files that might contain related settings.


URL change to localhost is safe and follows configuration pattern

The URL change from a specific IP to localhost is safe because:

  1. The URL is configurable via the PYSCRIPT_EVALUATOR_ENDPOINT environment variable
  2. The localhost default is consistent with other test configurations (e.g., in tests/testing_data/system.yaml)
  3. The service is properly containerized (as seen in docker/docker-compose.yml mapping port 8087)
  4. No hardcoded IP references were found in the codebase

The change follows the best practice of using environment variables for configuration, allowing different deployment environments to override the URL as needed.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if the evaluator URL is referenced in other configuration files
rg -l "192.168.100.109:8087"

Length of output: 28


Script:

#!/bin/bash
# Let's search for any references to the evaluator endpoint or port 8087
rg -i "pyscript.*evaluator|:8087"

# Also check for any Docker or deployment configurations
fd -e yml -e yaml -e env | xargs cat | grep -i "pyscript\|8087"

Length of output: 3904

tests/unit_test/cli_test.py (2)

319-326: LGTM! Test class setup follows established patterns.

The test class setup follows the same pattern as other test classes in the file, with proper fixture setup and database connection initialization.

🧰 Tools
🪛 Ruff (0.8.0)

323-323: Use capitalized environment variable SYSTEM_FILE instead of system_file

Replace system_file with SYSTEM_FILE

(SIM112)


327-338: LGTM! Well-structured test for successful mail channel processing.

The test effectively verifies:

  • JSON list input is properly handled
  • Mock is called exactly once
  • Command-line interface integration works as expected
tests/unit_test/events/definitions_test.py (1)

1223-1235: LGTM!

The test case properly verifies the failure scenario for mail channel schedule event validation.

tests/unit_test/utility_test.py (1)

2851-2852: LGTM!

The addition of 'line' and 'mail' channels to the expected_channels list is appropriate and ensures that the test_get_channels test case remains accurate.

Comment on lines +82 to +99
@staticmethod
def get_agent_without_cache(bot: str, use_store: bool = True) -> Agent:
endpoint = AgentProcessor.mongo_processor.get_endpoints(
bot, raise_exception=False
)
action_endpoint = Utility.get_action_url(endpoint)
model_path = Utility.get_latest_model(bot)
domain = AgentProcessor.mongo_processor.load_domain(bot)
if use_store:
mongo_store = Utility.get_local_mongo_store(bot, domain)
lock_store_endpoint = Utility.get_lock_store(bot)
agent = KaironAgent.load(model_path, action_endpoint=action_endpoint, tracker_store=mongo_store,
lock_store=lock_store_endpoint)
else:
agent = KaironAgent.load(model_path, action_endpoint=action_endpoint)

agent.model_ver = model_path.split("/")[-1]
return agent
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider refactoring to reduce code duplication and add error handling

The implementation duplicates logic from the reload method and lacks proper error handling. Consider the following improvements:

  1. Add comprehensive error handling similar to the reload method
  2. Add docstring with parameter descriptions
  3. Consider extracting common logic into a private method shared with reload

Here's a suggested implementation:

    @staticmethod
    def get_agent_without_cache(bot: str, use_store: bool = True) -> Agent:
+        """
+        Load bot agent directly from model without caching
+
+        Args:
+            bot: bot identifier
+            use_store: whether to use MongoDB store and lock store
+
+        Returns:
+            Agent: loaded bot agent
+
+        Raises:
+            AppException: if bot has not been trained
+        """
+        try:
            endpoint = AgentProcessor.mongo_processor.get_endpoints(
                bot, raise_exception=False
            )
            action_endpoint = Utility.get_action_url(endpoint)
            model_path = Utility.get_latest_model(bot)
            domain = AgentProcessor.mongo_processor.load_domain(bot)
            if use_store:
                mongo_store = Utility.get_local_mongo_store(bot, domain)
                lock_store_endpoint = Utility.get_lock_store(bot)
                agent = KaironAgent.load(model_path, action_endpoint=action_endpoint, tracker_store=mongo_store,
                                         lock_store=lock_store_endpoint)
            else:
                agent = KaironAgent.load(model_path, action_endpoint=action_endpoint)

            agent.model_ver = model_path.split("/")[-1]
            return agent
+        except Exception as e:
+            logging.exception(e)
+            raise AppException("Bot has not been trained yet!")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@staticmethod
def get_agent_without_cache(bot: str, use_store: bool = True) -> Agent:
endpoint = AgentProcessor.mongo_processor.get_endpoints(
bot, raise_exception=False
)
action_endpoint = Utility.get_action_url(endpoint)
model_path = Utility.get_latest_model(bot)
domain = AgentProcessor.mongo_processor.load_domain(bot)
if use_store:
mongo_store = Utility.get_local_mongo_store(bot, domain)
lock_store_endpoint = Utility.get_lock_store(bot)
agent = KaironAgent.load(model_path, action_endpoint=action_endpoint, tracker_store=mongo_store,
lock_store=lock_store_endpoint)
else:
agent = KaironAgent.load(model_path, action_endpoint=action_endpoint)
agent.model_ver = model_path.split("/")[-1]
return agent
@staticmethod
def get_agent_without_cache(bot: str, use_store: bool = True) -> Agent:
"""
Load bot agent directly from model without caching
Args:
bot: bot identifier
use_store: whether to use MongoDB store and lock store
Returns:
Agent: loaded bot agent
Raises:
AppException: if bot has not been trained
"""
try:
endpoint = AgentProcessor.mongo_processor.get_endpoints(
bot, raise_exception=False
)
action_endpoint = Utility.get_action_url(endpoint)
model_path = Utility.get_latest_model(bot)
domain = AgentProcessor.mongo_processor.load_domain(bot)
if use_store:
mongo_store = Utility.get_local_mongo_store(bot, domain)
lock_store_endpoint = Utility.get_lock_store(bot)
agent = KaironAgent.load(model_path, action_endpoint=action_endpoint, tracker_store=mongo_store,
lock_store=lock_store_endpoint)
else:
agent = KaironAgent.load(model_path, action_endpoint=action_endpoint)
agent.model_ver = model_path.split("/")[-1]
return agent
except Exception as e:
logging.exception(e)
raise AppException("Bot has not been trained yet!")

Comment on lines +138 to +143
def test_get_agent_no_cache(self):
model = AgentProcessor.get_agent_without_cache(pytest.bot, False)
assert model
assert len(list(ModelProcessor.get_training_history(pytest.bot))) == 1
assert not Utility.check_empty_string(model.model_ver)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Enhance test coverage for get_agent_without_cache

The current test only covers the basic scenario. Consider adding the following test cases:

  1. Test with use_store=True to verify MongoDB store and lock store integration
  2. Test error handling with invalid bot ID
  3. Add assertions to verify the agent's properties (action_endpoint, tracker_store, lock_store)

Here's a suggested implementation:

    def test_get_agent_no_cache(self):
+        # Test without store
        model = AgentProcessor.get_agent_without_cache(pytest.bot, False)
        assert model
        assert len(list(ModelProcessor.get_training_history(pytest.bot))) == 1
        assert not Utility.check_empty_string(model.model_ver)
+        assert model.tracker_store is None
+        assert model.lock_store is None
+
+    def test_get_agent_no_cache_with_store(self):
+        # Test with store
+        model = AgentProcessor.get_agent_without_cache(pytest.bot, True)
+        assert model
+        assert len(list(ModelProcessor.get_training_history(pytest.bot))) == 1
+        assert not Utility.check_empty_string(model.model_ver)
+        assert model.tracker_store is not None
+        assert isinstance(model.lock_store, InMemoryLockStore)
+
+    def test_get_agent_no_cache_invalid_bot(self):
+        # Test error handling
+        with pytest.raises(AppException) as e:
+            AgentProcessor.get_agent_without_cache('invalid_bot')
+        assert str(e).__contains__("Bot has not been trained yet!")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def test_get_agent_no_cache(self):
model = AgentProcessor.get_agent_without_cache(pytest.bot, False)
assert model
assert len(list(ModelProcessor.get_training_history(pytest.bot))) == 1
assert not Utility.check_empty_string(model.model_ver)
def test_get_agent_no_cache(self):
# Test without store
model = AgentProcessor.get_agent_without_cache(pytest.bot, False)
assert model
assert len(list(ModelProcessor.get_training_history(pytest.bot))) == 1
assert not Utility.check_empty_string(model.model_ver)
assert model.tracker_store is None
assert model.lock_store is None
def test_get_agent_no_cache_with_store(self):
# Test with store
model = AgentProcessor.get_agent_without_cache(pytest.bot, True)
assert model
assert len(list(ModelProcessor.get_training_history(pytest.bot))) == 1
assert not Utility.check_empty_string(model.model_ver)
assert model.tracker_store is not None
assert isinstance(model.lock_store, InMemoryLockStore)
def test_get_agent_no_cache_invalid_bot(self):
# Test error handling
with pytest.raises(AppException) as e:
AgentProcessor.get_agent_without_cache('invalid_bot')
assert str(e).__contains__("Bot has not been trained yet!")

Comment on lines +4479 to +4507
def test_update_mail_config():
data = {
"intent": "greet",
"entities": ["name", "subject", "summery"],
"classification_prompt": "any personal email of greeting"
}

response = client.put(
f"/api/bot/{pytest.bot}/mail/config",
json=data,
headers={"Authorization": pytest.token_type + " " + pytest.access_token},
)
actual = response.json()
print(actual)
assert actual["success"]
assert actual['message'] == 'Config updated!'
assert actual['error_code'] == 0

response = client.get(
f"/api/bot/{pytest.bot}/mail/config",
headers={"Authorization": pytest.token_type + " " + pytest.access_token},
)
actual = response.json()
print(actual)
assert actual["success"]
assert len(actual['data']) == 1
assert actual['data'][0]['intent'] == 'greet'
assert actual['data'][0]['classification_prompt'] == 'any personal email of greeting'
assert actual['error_code'] == 0
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Fix typo and enhance test coverage for mail configuration updates.

Issues to address:

  1. Fix the "summery" typo in test data
  2. Remove debug print statements
  3. Add test cases for:
    • Updating non-existent configuration
    • Invalid classification prompt validation
    • Invalid entities

Apply this diff to fix the immediate issues:

     data = {
         "intent": "greet",
-        "entities": ["name", "subject", "summery"],
+        "entities": ["name", "subject", "summary"],
         "classification_prompt": "any personal email of greeting"
     }
 
     response = client.put(
         f"/api/bot/{pytest.bot}/mail/config",
         json=data,
         headers={"Authorization": pytest.token_type + " " + pytest.access_token},
     )
     actual = response.json()
-    print(actual)
     assert actual["success"]
     assert actual['message'] == 'Config updated!'
     assert actual['error_code'] == 0
 
     response = client.get(
         f"/api/bot/{pytest.bot}/mail/config",
         headers={"Authorization": pytest.token_type + " " + pytest.access_token},
     )
     actual = response.json()
-    print(actual)
     assert actual["success"]

Would you like me to help generate the additional test cases?

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def test_update_mail_config():
data = {
"intent": "greet",
"entities": ["name", "subject", "summery"],
"classification_prompt": "any personal email of greeting"
}
response = client.put(
f"/api/bot/{pytest.bot}/mail/config",
json=data,
headers={"Authorization": pytest.token_type + " " + pytest.access_token},
)
actual = response.json()
print(actual)
assert actual["success"]
assert actual['message'] == 'Config updated!'
assert actual['error_code'] == 0
response = client.get(
f"/api/bot/{pytest.bot}/mail/config",
headers={"Authorization": pytest.token_type + " " + pytest.access_token},
)
actual = response.json()
print(actual)
assert actual["success"]
assert len(actual['data']) == 1
assert actual['data'][0]['intent'] == 'greet'
assert actual['data'][0]['classification_prompt'] == 'any personal email of greeting'
assert actual['error_code'] == 0
data = {
"intent": "greet",
"entities": ["name", "subject", "summary"],
"classification_prompt": "any personal email of greeting"
}
response = client.put(
f"/api/bot/{pytest.bot}/mail/config",
json=data,
headers={"Authorization": pytest.token_type + " " + pytest.access_token},
)
actual = response.json()
assert actual["success"]
assert actual['message'] == 'Config updated!'
assert actual['error_code'] == 0
response = client.get(
f"/api/bot/{pytest.bot}/mail/config",
headers={"Authorization": pytest.token_type + " " + pytest.access_token},
)
actual = response.json()
assert actual["success"]

Comment on lines +4411 to +4461
def test_add_mail_config():
data = {
"intent": "greet",
"entities": ["name", "subject", "summery"],
"classification_prompt": "any personal mail of greeting"
}

response = client.post(
f"/api/bot/{pytest.bot}/mail/config",
json=data,
headers={"Authorization": pytest.token_type + " " + pytest.access_token},
)
actual = response.json()
print(actual)
assert actual["success"]
assert actual['message'] == 'Config applied!'
assert actual['error_code'] == 0

def test_add_mail_config_missing_field():
data = {
"entities": ["name", "subject", "summery"],
}

response = client.post(
f"/api/bot/{pytest.bot}/mail/config",
json=data,
headers={"Authorization": pytest.token_type + " " + pytest.access_token},
)
actual = response.json()
print(actual)
assert not actual["success"]
assert len(actual['message'])
assert actual['error_code'] == 422

def test_add_mail_config_same_intent():
data = {
"intent": "greet",
"entities": ["name", "subject", "summery"],
"classification_prompt": "any personal mail of greeting"
}

response = client.post(
f"/api/bot/{pytest.bot}/mail/config",
json=data,
headers={"Authorization": pytest.token_type + " " + pytest.access_token},
)
actual = response.json()
print(actual)
assert not actual["success"]
assert actual['message'] == 'Mail configuration already exists for intent [greet]'
assert actual['error_code'] == 422
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Fix typo and enhance test coverage for mail configuration addition.

Several improvements needed in the mail configuration test suite:

  1. Fix the typo in test data: "summery" should be "summary"
  2. Add test cases for invalid entities and classification prompt validation
  3. Remove debug print statements
  4. Consider moving test data to fixtures for better maintainability

Apply this diff to fix the typo and improve the test structure:

 def test_add_mail_config():
     data = {
         "intent": "greet",
-        "entities": ["name", "subject", "summery"],
+        "entities": ["name", "subject", "summary"],
         "classification_prompt": "any personal mail of greeting"
     }
 
     response = client.post(
         f"/api/bot/{pytest.bot}/mail/config",
         json=data,
         headers={"Authorization": pytest.token_type + " " + pytest.access_token},
     )
     actual = response.json()
-    print(actual)
     assert actual["success"]
     assert actual['message'] == 'Config applied!'
     assert actual['error_code'] == 0

Would you like me to help generate additional test cases for:

  1. Invalid entity validation
  2. Classification prompt format/length validation
  3. Test fixtures for common test data?

Committable suggestion skipped: line range outside the PR's diff.

"frequency_penalty": 0,
"logit_bias": {},
"max_tokens": 300,
"model": "gpt-4o-mini",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix typo in model name

The model name "gpt-4o-mini" appears to be incorrect. Common OpenAI model names are "gpt-4", "gpt-3.5-turbo", etc.

-        "model": "gpt-4o-mini",
+        "model": "gpt-4"  # or appropriate model name

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +42 to +44
if configuration['connector_type'] == ChannelTypes.MAIL.value:
from kairon.shared.channels.mail.scheduler import MailScheduler
MailScheduler.request_epoch()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add error handling for mail scheduler initialization

The mail scheduler request could fail and should be handled gracefully.

 if configuration['connector_type'] == ChannelTypes.MAIL.value:
     from kairon.shared.channels.mail.scheduler import MailScheduler
-    MailScheduler.request_epoch()
+    try:
+        MailScheduler.request_epoch()
+    except Exception as e:
+        logger.error(f"Failed to initialize mail scheduler: {str(e)}")
+        raise AppException(f"Failed to initialize mail scheduler: {str(e)}")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if configuration['connector_type'] == ChannelTypes.MAIL.value:
from kairon.shared.channels.mail.scheduler import MailScheduler
MailScheduler.request_epoch()
if configuration['connector_type'] == ChannelTypes.MAIL.value:
from kairon.shared.channels.mail.scheduler import MailScheduler
try:
MailScheduler.request_epoch()
except Exception as e:
logger.error(f"Failed to initialize mail scheduler: {str(e)}")
raise AppException(f"Failed to initialize mail scheduler: {str(e)}")

Comment on lines +46 to +74
@staticmethod
async def process_messages_via_bot(
messages: [str],
account: int,
bot: str,
user: str,
is_integration_user: bool = False,
metadata: Dict = None,
):
"""Process a list of messages through the bot.
Args:
messages: List of messages to process
account: Account ID
bot: Bot ID
user: User ID
is_integration_user: Flag indicating if user is integration user
metadata: Additional metadata

Returns:
List of responses from the bot
"""
responses = []
uncached_model = AgentProcessor.get_agent_without_cache(bot, False)
metadata = ChatUtils.get_metadata(account, bot, is_integration_user, metadata)
for message in messages:
msg = UserMessage(message, sender_id=user, metadata=metadata)
chat_response = await uncached_model.handle_message(msg)
responses.append(chat_response)
return responses
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add error handling and consider caching implications

While the implementation is good, consider these improvements:

  1. Add error handling for failed responses
  2. Consider the performance impact of using uncached model for each batch

Consider this implementation:

 @staticmethod
 async def process_messages_via_bot(
         messages: [str],
         account: int,
         bot: str,
         user: str,
         is_integration_user: bool = False,
         metadata: Dict = None,
 ):
     responses = []
-    uncached_model = AgentProcessor.get_agent_without_cache(bot, False)
+    model = AgentProcessor.get_agent(bot)  # Use cached model
     metadata = ChatUtils.get_metadata(account, bot, is_integration_user, metadata)
     for message in messages:
         try:
             msg = UserMessage(message, sender_id=user, metadata=metadata)
-            chat_response = await uncached_model.handle_message(msg)
+            chat_response = await model.handle_message(msg)
             responses.append(chat_response)
+        except Exception as e:
+            logger.error(f"Error processing message: {message}, error: {str(e)}")
+            responses.append({"error": str(e)})
     return responses

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +1680 to +1691
@router.post("/mail/config", response_model=Response)
async def set_mail_config(
request_data: MailConfigRequest,
current_user: User = Security(Authentication.get_current_user_and_bot, scopes=DESIGNER_ACCESS)
):
"""
Applies the mail config
"""
request_dict = request_data.dict()
MailClassificationConfig.create_doc(**request_dict, bot=current_user.get_bot(), user=current_user.get_user())
return {"message": "Config applied!"}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add error handling for duplicate configurations

The endpoint should handle potential duplicate configurations gracefully.

     request_dict = request_data.dict()
+    try:
         MailClassificationConfig.create_doc(**request_dict, bot=current_user.get_bot(), user=current_user.get_user())
-    return {"message": "Config applied!"}
+        return {"message": "Config applied!"}
+    except Exception as e:
+        raise AppException(f"Failed to create mail configuration: {str(e)}")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@router.post("/mail/config", response_model=Response)
async def set_mail_config(
request_data: MailConfigRequest,
current_user: User = Security(Authentication.get_current_user_and_bot, scopes=DESIGNER_ACCESS)
):
"""
Applies the mail config
"""
request_dict = request_data.dict()
MailClassificationConfig.create_doc(**request_dict, bot=current_user.get_bot(), user=current_user.get_user())
return {"message": "Config applied!"}
@router.post("/mail/config", response_model=Response)
async def set_mail_config(
request_data: MailConfigRequest,
current_user: User = Security(Authentication.get_current_user_and_bot, scopes=DESIGNER_ACCESS)
):
"""
Applies the mail config
"""
request_dict = request_data.dict()
try:
MailClassificationConfig.create_doc(**request_dict, bot=current_user.get_bot(), user=current_user.get_user())
return {"message": "Config applied!"}
except Exception as e:
raise AppException(f"Failed to create mail configuration: {str(e)}")
🧰 Tools
🪛 Ruff (0.8.0)

1683-1683: Do not perform function call Security in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

Comment on lines +1707 to +1716
@router.delete("/mail/config/{intent}", response_model=Response)
async def del_soft_mail_config(
intent: str,
current_user: User = Security(Authentication.get_current_user_and_bot, scopes=DESIGNER_ACCESS)
):
"""
delete the mail config
"""
MailClassificationConfig.soft_delete_doc(current_user.get_bot(), intent)
return {"message": "Config deleted!"}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add validation for existing configuration

The endpoint should verify that the configuration exists before deletion.

+    existing = MailClassificationConfig.objects(bot=current_user.get_bot(), intent=intent, status=True).first()
+    if not existing:
+        raise AppException(f"Mail configuration for intent '{intent}' not found")
     MailClassificationConfig.soft_delete_doc(current_user.get_bot(), intent)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@router.delete("/mail/config/{intent}", response_model=Response)
async def del_soft_mail_config(
intent: str,
current_user: User = Security(Authentication.get_current_user_and_bot, scopes=DESIGNER_ACCESS)
):
"""
delete the mail config
"""
MailClassificationConfig.soft_delete_doc(current_user.get_bot(), intent)
return {"message": "Config deleted!"}
@router.delete("/mail/config/{intent}", response_model=Response)
async def del_soft_mail_config(
intent: str,
current_user: User = Security(Authentication.get_current_user_and_bot, scopes=DESIGNER_ACCESS)
):
"""
delete the mail config
"""
existing = MailClassificationConfig.objects(bot=current_user.get_bot(), intent=intent, status=True).first()
if not existing:
raise AppException(f"Mail configuration for intent '{intent}' not found")
MailClassificationConfig.soft_delete_doc(current_user.get_bot(), intent)
return {"message": "Config deleted!"}
🧰 Tools
🪛 Ruff (0.8.0)

1710-1710: Do not perform function call Security in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

Comment on lines +1693 to +1704
@router.put("/mail/config", response_model=Response)
async def update_mail_config(
request_data: MailConfigRequest,
current_user: User = Security(Authentication.get_current_user_and_bot, scopes=DESIGNER_ACCESS)
):
"""
update the mail config
"""
request_dict = request_data.dict()
MailClassificationConfig.update_doc(**request_dict, bot=current_user.get_bot())
return {"message": "Config updated!"}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add validation for existing configuration

The endpoint should verify that the configuration exists before updating.

     request_dict = request_data.dict()
+    existing = MailClassificationConfig.objects(bot=current_user.get_bot(), intent=request_dict['intent'], status=True).first()
+    if not existing:
+        raise AppException(f"Mail configuration for intent '{request_dict['intent']}' not found")
     MailClassificationConfig.update_doc(**request_dict, bot=current_user.get_bot())
     return {"message": "Config updated!"}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@router.put("/mail/config", response_model=Response)
async def update_mail_config(
request_data: MailConfigRequest,
current_user: User = Security(Authentication.get_current_user_and_bot, scopes=DESIGNER_ACCESS)
):
"""
update the mail config
"""
request_dict = request_data.dict()
MailClassificationConfig.update_doc(**request_dict, bot=current_user.get_bot())
return {"message": "Config updated!"}
@router.put("/mail/config", response_model=Response)
async def update_mail_config(
request_data: MailConfigRequest,
current_user: User = Security(Authentication.get_current_user_and_bot, scopes=DESIGNER_ACCESS)
):
"""
update the mail config
"""
request_dict = request_data.dict()
existing = MailClassificationConfig.objects(bot=current_user.get_bot(), intent=request_dict['intent'], status=True).first()
if not existing:
raise AppException(f"Mail configuration for intent '{request_dict['intent']}' not found")
MailClassificationConfig.update_doc(**request_dict, bot=current_user.get_bot())
return {"message": "Config updated!"}
🧰 Tools
🪛 Ruff (0.8.0)

1696-1696: Do not perform function call Security in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Outside diff range and nitpick comments (2)
kairon/shared/channels/mail/processor.py (2)

23-37: Add type hints and input validation

The class initialization could be improved with:

  • Type hints for the bot parameter
  • Input validation for required configuration fields
  • Class-level docstring explaining the purpose and requirements
 class MailProcessor:
+    """
+    Handles email processing operations including IMAP/SMTP connections,
+    message classification, and email responses.
+    
+    Args:
+        bot (str): The bot identifier used to load configurations
+    
+    Raises:
+        AppException: If required configurations are missing
+    """
-    def __init__(self, bot):
+    def __init__(self, bot: str):
+        if not bot:
+            raise AppException("Bot identifier is required")
         self.config = ChatDataProcessor.get_channel_config(ChannelTypes.MAIL, bot, False)['config']
+        required_fields = ['email_account', 'email_password']
+        missing_fields = [field for field in required_fields if field not in self.config]
+        if missing_fields:
+            raise AppException(f"Missing required configuration fields: {', '.join(missing_fields)}")

258-271: Improve JSON extraction reliability

The current regex pattern might match invalid JSON structures. Consider using a more robust approach.

     @staticmethod
     def extract_jsons_from_text(text) -> list:
         """
         Extract json objects from text as a list
         """
-        json_pattern = re.compile(r'(\{.*?\}|\[.*?\])', re.DOTALL)
+        json_pattern = re.compile(r'(\{(?:[^{}]|(?R))*\}|\[(?:[^\[\]]|(?R))*\])', re.DOTALL)
         jsons = []
         for match in json_pattern.findall(text):
             try:
                 json_obj = json.loads(match)
+                # Validate expected structure
+                if isinstance(json_obj, (dict, list)):
+                    jsons.append(json_obj)
-                jsons.append(json_obj)
             except json.JSONDecodeError:
                 continue
         return jsons
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between e28f9f0 and 6a84142.

📒 Files selected for processing (1)
  • kairon/shared/channels/mail/processor.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.0)
kairon/shared/channels/mail/processor.py

77-77: Do not use bare except

(E722)


127-127: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


188-188: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

kairon/shared/channels/mail/processor.py Show resolved Hide resolved
kairon/shared/channels/mail/processor.py Outdated Show resolved Hide resolved
kairon/shared/channels/mail/processor.py Show resolved Hide resolved
Comment on lines +39 to +46
def login_imap(self):
if self.mailbox:
return
email_account = self.config['email_account']
email_password = self.config['email_password']
imap_server = self.config.get('imap_server', MailConstants.DEFAULT_IMAP_SERVER)
self.mailbox = MailBox(imap_server).login(email_account, email_password)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add timeout and error handling for IMAP connection

The IMAP connection could fail silently or hang indefinitely.

     def login_imap(self):
         if self.mailbox:
             return
         email_account = self.config['email_account']
         email_password = self.config['email_password']
         imap_server = self.config.get('imap_server', MailConstants.DEFAULT_IMAP_SERVER)
-        self.mailbox = MailBox(imap_server).login(email_account, email_password)
+        try:
+            self.mailbox = MailBox(imap_server, timeout=30).login(email_account, email_password)
+        except Exception as e:
+            logger.error(f"Failed to connect to IMAP server: {str(e)}")
+            raise AppException(f"IMAP connection failed: {str(e)}") from e
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def login_imap(self):
if self.mailbox:
return
email_account = self.config['email_account']
email_password = self.config['email_password']
imap_server = self.config.get('imap_server', MailConstants.DEFAULT_IMAP_SERVER)
self.mailbox = MailBox(imap_server).login(email_account, email_password)
def login_imap(self):
if self.mailbox:
return
email_account = self.config['email_account']
email_password = self.config['email_password']
imap_server = self.config.get('imap_server', MailConstants.DEFAULT_IMAP_SERVER)
try:
self.mailbox = MailBox(imap_server, timeout=30).login(email_account, email_password)
except Exception as e:
logger.error(f"Failed to connect to IMAP server: {str(e)}")
raise AppException(f"IMAP connection failed: {str(e)}") from e

Comment on lines +211 to +257
@staticmethod
def read_mails(bot: str) -> ([dict], str, int):
"""
Read mails from the mailbox
Parameters:
- bot: str - bot id
Returns:
- list of messages - each message is a dict with the following
- mail_id
- subject
- date
- body
- user
- time_shift

"""
mp = MailProcessor(bot)
time_shift = int(mp.config.get('interval', 20 * 60))
last_read_timestamp = datetime.now() - timedelta(seconds=time_shift)
messages = []
is_logged_in = False
try:
mp.login_imap()
is_logged_in = True
msgs = mp.mailbox.fetch(AND(seen=False, date_gte=last_read_timestamp.date()))
for msg in msgs:
subject = msg.subject
sender_id = msg.from_
date = msg.date
body = msg.text or msg.html or ""
logger.info(subject, sender_id, date)
message_entry = {
'mail_id': sender_id,
'subject': subject,
'date': str(date),
'body': body
}
messages.append(message_entry)
mp.logout_imap()
is_logged_in = False
return messages, mp.bot_settings.user, time_shift
except Exception as e:
logger.exception(e)
if is_logged_in:
mp.logout_imap()
return [], mp.bot_settings.user, time_shift

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add pagination to read_mails method

Reading all unread messages at once could cause memory issues with large mailboxes.

     @staticmethod
-    def read_mails(bot: str) -> ([dict], str, int):
+    def read_mails(bot: str, batch_size: int = 100) -> ([dict], str, int):
         """
         Read mails from the mailbox
         Parameters:
         - bot: str - bot id
+        - batch_size: int - maximum number of messages to fetch at once
         Returns:
         - list of messages - each message is a dict with the following
             - mail_id
             - subject
             - date
             - body
         - user
         - time_shift
         """
         mp = MailProcessor(bot)
         time_shift = int(mp.config.get('interval', 20 * 60))
         last_read_timestamp = datetime.now() - timedelta(seconds=time_shift)
         messages = []
         is_logged_in = False
         try:
             mp.login_imap()
             is_logged_in = True
             msgs = mp.mailbox.fetch(AND(seen=False, date_gte=last_read_timestamp.date()))
+            message_count = 0
             for msg in msgs:
+                if message_count >= batch_size:
+                    break
                 subject = msg.subject
                 sender_id = msg.from_
                 date = msg.date
                 body = msg.text or msg.html or ""
                 logger.info(subject, sender_id, date)
                 message_entry = {
                     'mail_id': sender_id,
                     'subject': subject,
                     'date': str(date),
                     'body': body
                 }
                 messages.append(message_entry)
+                message_count += 1
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
@staticmethod
def read_mails(bot: str) -> ([dict], str, int):
"""
Read mails from the mailbox
Parameters:
- bot: str - bot id
Returns:
- list of messages - each message is a dict with the following
- mail_id
- subject
- date
- body
- user
- time_shift
"""
mp = MailProcessor(bot)
time_shift = int(mp.config.get('interval', 20 * 60))
last_read_timestamp = datetime.now() - timedelta(seconds=time_shift)
messages = []
is_logged_in = False
try:
mp.login_imap()
is_logged_in = True
msgs = mp.mailbox.fetch(AND(seen=False, date_gte=last_read_timestamp.date()))
for msg in msgs:
subject = msg.subject
sender_id = msg.from_
date = msg.date
body = msg.text or msg.html or ""
logger.info(subject, sender_id, date)
message_entry = {
'mail_id': sender_id,
'subject': subject,
'date': str(date),
'body': body
}
messages.append(message_entry)
mp.logout_imap()
is_logged_in = False
return messages, mp.bot_settings.user, time_shift
except Exception as e:
logger.exception(e)
if is_logged_in:
mp.logout_imap()
return [], mp.bot_settings.user, time_shift
@staticmethod
def read_mails(bot: str, batch_size: int = 100) -> ([dict], str, int):
"""
Read mails from the mailbox
Parameters:
- bot: str - bot id
- batch_size: int - maximum number of messages to fetch at once
Returns:
- list of messages - each message is a dict with the following
- mail_id
- subject
- date
- body
- user
- time_shift
"""
mp = MailProcessor(bot)
time_shift = int(mp.config.get('interval', 20 * 60))
last_read_timestamp = datetime.now() - timedelta(seconds=time_shift)
messages = []
is_logged_in = False
try:
mp.login_imap()
is_logged_in = True
msgs = mp.mailbox.fetch(AND(seen=False, date_gte=last_read_timestamp.date()))
message_count = 0
for msg in msgs:
if message_count >= batch_size:
break
subject = msg.subject
sender_id = msg.from_
date = msg.date
body = msg.text or msg.html or ""
logger.info(subject, sender_id, date)
message_entry = {
'mail_id': sender_id,
'subject': subject,
'date': str(date),
'body': body
}
messages.append(message_entry)
message_count += 1
mp.logout_imap()
is_logged_in = False
return messages, mp.bot_settings.user, time_shift
except Exception as e:
logger.exception(e)
if is_logged_in:
mp.logout_imap()
return [], mp.bot_settings.user, time_shift

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (4)
kairon/shared/channels/mail/processor.py (4)

24-37: Add validation for required configuration parameters

The initialization should validate that all required configuration parameters are present before proceeding.

 def __init__(self, bot):
+    required_params = ['email_account', 'email_password']
     self.config = ChatDataProcessor.get_channel_config(ChannelTypes.MAIL, bot, False)['config']
+    missing_params = [param for param in required_params if param not in self.config]
+    if missing_params:
+        raise AppException(f"Missing required configuration parameters: {', '.join(missing_params)}")
     self.llm_type = self.config.get('llm_type', "openai")

191-201: Add validation for empty mail configurations

The method should handle the case when no mail configurations are present.

     def get_context_prompt(self) -> str:
         context_prompt = ""
+        if not self.mail_configs:
+            logger.warning("No mail configurations found")
+            return context_prompt
         for item in self.mail_configs:

203-210: Enhance docstring with parameter documentation

Add parameter documentation to improve code maintainability.

     @staticmethod
     def process_message_task(bot: str, message_batch: [dict]):
         """
         Process a batch of messages
         used for execution by executor
+        
+        Parameters:
+        - bot: str - The bot identifier
+        - message_batch: [dict] - List of messages to process
         """
         asyncio.run(MailProcessor.process_messages(bot, message_batch))

259-272: Add type hints to improve code maintainability

Add type hints for better code documentation and IDE support.

     @staticmethod
-    def extract_jsons_from_text(text) -> list:
+    def extract_jsons_from_text(text: str) -> list[dict]:
         """
         Extract json objects from text as a list
+        
+        Parameters:
+        - text: str - The text to extract JSON objects from
+        
+        Returns:
+        - list[dict] - List of extracted JSON objects
         """
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 6a84142 and 497bbdb.

📒 Files selected for processing (1)
  • kairon/shared/channels/mail/processor.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.0)
kairon/shared/channels/mail/processor.py

128-128: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


189-189: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🔇 Additional comments (5)
kairon/shared/channels/mail/processor.py (5)

39-46: Add timeout and error handling for IMAP connection

The IMAP connection could fail silently or hang indefinitely.


81-92: Improve error handling in send_mail method

The method swallows exceptions, which could hide important delivery failures.


93-110: Add input sanitization for template formatting

The template formatting is vulnerable to potential injection attacks through slot values.


212-258: Add pagination to read_mails method

Reading all unread messages at once could cause memory issues with large mailboxes.


71-80: ⚠️ Potential issue

Improve SMTP validation error handling and logging

The validation method needs proper error logging and cleanup.

 @staticmethod
 def validate_smpt_connection(bot):
+    mp = None
     try:
         mp = MailProcessor(bot)
         mp.login_smtp()
         mp.logout_smtp()
         return True
     except Exception as e:
-        logger.log(str(e))
+        logger.error(f"SMTP validation failed: {str(e)}")
         return False
+    finally:
+        if mp and mp.smtp:
+            mp.logout_smtp()

Likely invalid or redundant comment.

kairon/shared/channels/mail/processor.py Show resolved Hide resolved
kairon/shared/channels/mail/processor.py Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (2)
kairon/shared/channels/mail/processor.py (2)

260-272: Enhance JSON extraction robustness

The method could benefit from better error handling and validation.

 @staticmethod
 def extract_jsons_from_text(text) -> list:
     """
     Extract json objects from text as a list
     """
+    if not text or not isinstance(text, str):
+        return []
     json_pattern = re.compile(r'(\{.*?\}|\[.*?\])', re.DOTALL)
     jsons = []
     for match in json_pattern.findall(text):
         try:
             json_obj = json.loads(match)
             jsons.append(json_obj)
-        except json.JSONDecodeError:
+        except json.JSONDecodeError as e:
+            logger.debug(f"Failed to parse JSON: {str(e)}")
             continue
     return jsons

23-272: Consider implementing retry mechanisms

Critical operations like email sending and LLM classification could benefit from retry mechanisms to handle temporary failures.

Consider implementing retries for:

  1. SMTP/IMAP connections
  2. Email sending operations
  3. LLM classification calls

This could be implemented using a decorator pattern or a dedicated retry utility class.

🧰 Tools
🪛 Ruff (0.8.0)

128-128: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


189-189: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 497bbdb and c35b0a3.

📒 Files selected for processing (1)
  • kairon/shared/channels/mail/processor.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.0)
kairon/shared/channels/mail/processor.py

128-128: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


189-189: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🔇 Additional comments (5)
kairon/shared/channels/mail/processor.py (5)

39-46: Add timeout and error handling for IMAP connection

The IMAP connection could fail silently or hang indefinitely.


71-80: Improve error handling in SMTP validation

The bare except clause masks important errors that could help diagnose connection issues.


81-92: Improve error handling in send_mail method

The method swallows exceptions, which could hide important delivery failures.


93-110: Add input sanitization for template formatting

The template formatting is vulnerable to potential injection attacks through slot values.


132-190: Add batch size limit and improve error handling

The batch processing should have a size limit and better error handling.

🧰 Tools
🪛 Ruff (0.8.0)

189-189: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

Comment on lines +24 to +37
def __init__(self, bot):
self.config = ChatDataProcessor.get_channel_config(ChannelTypes.MAIL, bot, False)['config']
self.llm_type = self.config.get('llm_type', "openai")
self.hyperparameters = self.config.get('hyperparameters', MailConstants.DEFAULT_HYPERPARAMETERS)
self.bot = bot
bot_info = Bot.objects.get(id=bot)
self.account = bot_info.account
self.llm_processor = LLMProcessor(self.bot, self.llm_type)
self.mail_configs = list(MailClassificationConfig.objects(bot=self.bot))
self.mail_configs_dict = {item.intent: item for item in self.mail_configs}
self.bot_settings = BotSettings.objects(bot=self.bot).get()
self.mailbox = None
self.smtp = None

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add input validation in constructor

The constructor accepts critical configurations without validation. Consider adding checks for required fields and valid configurations.

 def __init__(self, bot):
+    if not bot:
+        raise ValueError("Bot ID is required")
     self.config = ChatDataProcessor.get_channel_config(ChannelTypes.MAIL, bot, False)['config']
+    required_fields = ['email_account', 'email_password']
+    missing_fields = [field for field in required_fields if field not in self.config]
+    if missing_fields:
+        raise ValueError(f"Missing required configuration fields: {', '.join(missing_fields)}")
     self.llm_type = self.config.get('llm_type', "openai")
     self.hyperparameters = self.config.get('hyperparameters', MailConstants.DEFAULT_HYPERPARAMETERS)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def __init__(self, bot):
self.config = ChatDataProcessor.get_channel_config(ChannelTypes.MAIL, bot, False)['config']
self.llm_type = self.config.get('llm_type', "openai")
self.hyperparameters = self.config.get('hyperparameters', MailConstants.DEFAULT_HYPERPARAMETERS)
self.bot = bot
bot_info = Bot.objects.get(id=bot)
self.account = bot_info.account
self.llm_processor = LLMProcessor(self.bot, self.llm_type)
self.mail_configs = list(MailClassificationConfig.objects(bot=self.bot))
self.mail_configs_dict = {item.intent: item for item in self.mail_configs}
self.bot_settings = BotSettings.objects(bot=self.bot).get()
self.mailbox = None
self.smtp = None
def __init__(self, bot):
if not bot:
raise ValueError("Bot ID is required")
self.config = ChatDataProcessor.get_channel_config(ChannelTypes.MAIL, bot, False)['config']
required_fields = ['email_account', 'email_password']
missing_fields = [field for field in required_fields if field not in self.config]
if missing_fields:
raise ValueError(f"Missing required configuration fields: {', '.join(missing_fields)}")
self.llm_type = self.config.get('llm_type', "openai")
self.hyperparameters = self.config.get('hyperparameters', MailConstants.DEFAULT_HYPERPARAMETERS)
self.bot = bot
bot_info = Bot.objects.get(id=bot)
self.account = bot_info.account
self.llm_processor = LLMProcessor(self.bot, self.llm_type)
self.mail_configs = list(MailClassificationConfig.objects(bot=self.bot))
self.mail_configs_dict = {item.intent: item for item in self.mail_configs}
self.bot_settings = BotSettings.objects(bot=self.bot).get()
self.mailbox = None
self.smtp = None

Comment on lines +111 to +129
async def classify_messages(self, messages: [dict]) -> [dict]:
if self.bot_settings.llm_settings['enable_faq']:
try:
system_prompt = self.config.get('system_prompt', MailConstants.DEFAULT_SYSTEM_PROMPT)
system_prompt += '\n return json format: [{"intent": "intent_name", "entities": {"entity_name": "value"}, "mail_id": "mail_id", "subject": "subject"}], if not classifiable set intent and not-found entity values as null'
context_prompt = self.get_context_prompt()
messages = json.dumps(messages)
info = await self.llm_processor.predict(messages,
self.bot_settings.user,
system_prompt=system_prompt,
context_prompt=context_prompt,
similarity_prompt=[],
hyperparameters=self.hyperparameters)
classifications = MailProcessor.extract_jsons_from_text(info["content"])[0]
return classifications
except Exception as e:
logger.error(str(e))
raise AppException(str(e))

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Enhance error handling and validation in classify_messages

The method needs better error context and input validation.

 async def classify_messages(self, messages: [dict]) -> [dict]:
+    if not messages:
+        return []
     if self.bot_settings.llm_settings['enable_faq']:
         try:
             system_prompt = self.config.get('system_prompt', MailConstants.DEFAULT_SYSTEM_PROMPT)
             system_prompt += '\n return json format: [{"intent": "intent_name", "entities": {"entity_name": "value"}, "mail_id": "mail_id", "subject": "subject"}], if not classifiable set intent and not-found entity values as null'
             context_prompt = self.get_context_prompt()
             messages = json.dumps(messages)
             info = await self.llm_processor.predict(messages,
                                                 self.bot_settings.user,
                                                 system_prompt=system_prompt,
                                                 context_prompt=context_prompt,
                                                 similarity_prompt=[],
                                                 hyperparameters=self.hyperparameters)
             classifications = MailProcessor.extract_jsons_from_text(info["content"])[0]
             return classifications
         except Exception as e:
             logger.error(str(e))
-            raise AppException(str(e))
+            raise AppException(f"Failed to classify messages: {str(e)}") from e
+    return []
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
async def classify_messages(self, messages: [dict]) -> [dict]:
if self.bot_settings.llm_settings['enable_faq']:
try:
system_prompt = self.config.get('system_prompt', MailConstants.DEFAULT_SYSTEM_PROMPT)
system_prompt += '\n return json format: [{"intent": "intent_name", "entities": {"entity_name": "value"}, "mail_id": "mail_id", "subject": "subject"}], if not classifiable set intent and not-found entity values as null'
context_prompt = self.get_context_prompt()
messages = json.dumps(messages)
info = await self.llm_processor.predict(messages,
self.bot_settings.user,
system_prompt=system_prompt,
context_prompt=context_prompt,
similarity_prompt=[],
hyperparameters=self.hyperparameters)
classifications = MailProcessor.extract_jsons_from_text(info["content"])[0]
return classifications
except Exception as e:
logger.error(str(e))
raise AppException(str(e))
async def classify_messages(self, messages: [dict]) -> [dict]:
if not messages:
return []
if self.bot_settings.llm_settings['enable_faq']:
try:
system_prompt = self.config.get('system_prompt', MailConstants.DEFAULT_SYSTEM_PROMPT)
system_prompt += '\n return json format: [{"intent": "intent_name", "entities": {"entity_name": "value"}, "mail_id": "mail_id", "subject": "subject"}], if not classifiable set intent and not-found entity values as null'
context_prompt = self.get_context_prompt()
messages = json.dumps(messages)
info = await self.llm_processor.predict(messages,
self.bot_settings.user,
system_prompt=system_prompt,
context_prompt=context_prompt,
similarity_prompt=[],
hyperparameters=self.hyperparameters)
classifications = MailProcessor.extract_jsons_from_text(info["content"])[0]
return classifications
except Exception as e:
logger.error(str(e))
raise AppException(f"Failed to classify messages: {str(e)}") from e
return []
🧰 Tools
🪛 Ruff (0.8.0)

128-128: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

@hasinaxp hasinaxp closed this Dec 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant