Callback Service Changes #1842
base: master
Conversation
…ipt-lambda to callback service and fixed test cases.
Walkthrough
The changes refactor multiple modules by transitioning from FastAPI to BlackSheep for asynchronous handling. The application's lifecycle now uses explicit startup and shutdown functions for MongoDB connections. Middleware for CORS, security headers, and error handling has been updated, and routing decorators have been adapted to BlackSheep's API. In addition, core utilities for executing Python scripts, managing scheduled jobs, email dispatch, and database interactions have been revised or newly introduced. Testing files have been updated to use asynchronous calls, and several new utility functions and constants have been added, resulting in a cleaner and more modular architecture.
Sequence Diagram(s)
sequenceDiagram
participant Client
participant App as BlackSheep App
participant DB as MongoDB
Client->>App: Initiate Startup
App->>DB: Call startup() to establish connection
Note over App: Middleware (CORS, Security Headers, Error Handling) active
Client->>App: Send HTTP Request
App->>App: Process request via router (@router.get)
App-->>Client: Return BSResponse
Client->>App: Initiate Shutdown
App->>DB: Call shutdown() to disconnect
sequenceDiagram
participant Client
participant Router as BlackSheep Router
participant Util as CallbackUtility
Client->>Router: Request to /callback endpoint
Router->>Router: Parse query parameters and read request body
Router->>Util: Invoke pyscript_handler
Util-->>Router: Return structured BSResponse
Router-->>Client: Respond with JSON and status code
Actionable comments posted: 10
🧹 Nitpick comments (25)
tests/unit_test/api/api_processor_test.py (1)
18-18: Unused import detected. The disconnect import from mongoengine is not being used in this file. Remove the unused import:
-from mongoengine import connect, disconnect
+from mongoengine import connect
🧰 Tools
🪛 Ruff (0.8.2)
18-18: mongoengine.disconnect imported but unused. Remove unused import: mongoengine.disconnect (F401)
kairon/shared/actions/utils.py (1)
681-690: Refactored script execution to simplify flow. The run_pyscript method has been simplified by removing the conditional Lambda execution path. Now it always uses direct HTTP requests to the pyscript evaluator service, which creates a more consistent execution flow and reduces code complexity. The refactoring is good, but consider adding error handling for network issues or timeouts when making the HTTP request, as the current implementation might not handle connection problems gracefully.
🧰 Tools
🪛 Ruff (0.8.2)
682-682: Local variable trigger_task is assigned to but never used. Remove assignment to unused variable trigger_task (F841)
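For illustration only, a minimal sketch of the suggested hardening, assuming the evaluator is reached with requests.post; the URL, payload shape, and helper name are placeholders rather than the repository's actual identifiers:

import requests
from loguru import logger

def call_pyscript_evaluator(url: str, payload: dict, timeout: float = 10.0) -> dict:
    # Surface timeouts and connection errors explicitly instead of failing opaquely.
    try:
        response = requests.post(url, json=payload, timeout=timeout)
        response.raise_for_status()
        return response.json()
    except requests.Timeout:
        logger.error(f"Pyscript evaluator timed out after {timeout}s: {url}")
        raise
    except requests.RequestException as exc:
        logger.error(f"Pyscript evaluator request failed: {exc}")
        raise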
kairon/async_callback/processor.py (1)
27-35: Replaced Lambda invocation with CallbackUtility. The implementation now uses CallbackUtility.pyscript_handler instead of triggering a Lambda function, creating a more consistent approach to script execution across the codebase. The error handling looks good, but consider adding more detailed logging around the response status code and body to help with debugging issues in production.
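For example, a hedged sketch of the extra logging; the handler arguments and the 'statusCode'/'body' fields are assumptions based on the Lambda-style response shape described elsewhere in this review:

response = CallbackUtility.pyscript_handler(event, context=None)
logger.info(f"pyscript_handler returned status {response.get('statusCode')}")
logger.debug(f"pyscript_handler body: {response.get('body')}")
if response.get('statusCode') != 200:
    logger.error(f"pyscript execution failed: {response.get('body')}")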
tests/unit_test/callback/pyscript_handler_test.py (3)
12-12: Capitalize the environment variable for consistency. The environment variable key system_file should preferably be uppercase (e.g., SYSTEM_FILE) to conform with common naming conventions for environment variables, as also hinted by the static analysis tool.
- os.environ["system_file"] = "./tests/testing_data/system.yaml"
+ os.environ["SYSTEM_FILE"] = "./tests/testing_data/system.yaml"
🧰 Tools
🪛 Ruff (0.8.2)
12-12: Use capitalized environment variable SYSTEM_FILE instead of system_file. Replace system_file with SYSTEM_FILE (SIM112)
19-65: Tests appear thorough for a basic pyscript flow. The test verifies the presence of a default bot response, ensuring correctness of the function's output. Consider removing or reducing
639-689: CRUD operation test: Good coverage but consider parameterization. You've provided a detailed test that checks adding data to the mock database. As multiple tests share similar setup steps, you might factor them out or parametrize to reduce repetition, as in the sketch below.
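A sketch of what that parameterization could look like — the fixture contents and collection name are illustrative, not taken from the test suite:

import pytest

@pytest.fixture
def crud_setup():
    # Shared setup used by every CRUD case; values here are placeholders.
    return {"collection_name": "test_collection", "payload": {"name": "demo", "value": 1}}

@pytest.mark.parametrize("operation", ["add", "get", "update", "delete"])
def test_crud_operations(crud_setup, operation):
    # Each parametrized case would exercise one CRUD path against the mock database.
    assert operation in {"add", "get", "update", "delete"}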
tests/unit_test/callback_test.py (1)
384-403
: Expand negative test paths fortest_run_pyscript
.This test effectively verifies the happy path when
pyscript_handler
returns a normal response. Consider adding a scenario that tests error handling (e.g., whenpyscript_handler
raises an exception) to ensure coverage of failure modes.kairon/async_callback/router/pyscript_callback.py (2)
24-24
: Use dictionary membership checks instead of calling.keys()
.Static analysis highlights a minor optimization. You can iterate over
request.query
directly without calling.keys()
, improving readability.- data['params'].update({key: request.query.get(key) for key in request.query.keys()}) + data['params'].update({key: request.query.get(key) for key in request.query})🧰 Tools
🪛 Ruff (0.8.2)
24-24: Use
key in dict
instead ofkey in dict.keys()
Remove
.keys()
(SIM118)
32-32
: Remove unused exception variable.
Exception as e
is assigned but not used. Either log the error details or removeas e
if it's not needed.- except Exception as e: + except Exception:🧰 Tools
🪛 Ruff (0.8.2)
32-32: Local variable
e
is assigned to but never usedRemove assignment to unused variable
e
(F841)
kairon/shared/concurrency/actors/pyscript_runner.py (1)
27-29
: Consider clarifying naming forsrtp_time
andsrtf_time
.
The function names appear potentially misspelled from the standardstrptime
andstrftime
. If intended, ensure documentation clarifies usage.tests/unit_test/actors/actors_test.py (1)
7-7
: Establishing MongoDB connection in tests.
Consider using a separate or ephemeral test database to avoid collisions or the need for cleanup.kairon/async_callback/main.py (4)
19-28
:startup
method usesawait asyncio.sleep(1)
.
Sleeping may not guarantee the DB readiness. Consider verifying database connection or removing the artificial delay.
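A possible sketch of checking readiness instead of sleeping — the 'ping' admin command is MongoDB's standard health check; the helper name and timeout are assumptions:

from pymongo import MongoClient

def verify_mongo_ready(url: str) -> None:
    # Raises immediately if the server is unreachable, so startup fails fast.
    client = MongoClient(url, serverSelectionTimeoutMS=5000)
    client.admin.command("ping")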
30-36
:shutdown
also usesawait asyncio.sleep(1)
.
Likewise, consider a more robust teardown or removing the delay if not strictly necessary.
71-77
: Using wildcard CORS in production can be risky.
Consider restricting origins to enhance security.
80-91
: Catching exceptions is good; consider limiting exposed error details.
Leaking raw exception messages might reveal sensitive info. A generic message for production would be safer.kairon/shared/concurrency/actors/utils.py (4)
22-25
: Ensure consistent naming for date/datetime parameters.
srtptime
accepts arguments nameddate_string
andformate
. It might be clearer to renameformate
tofmt
ordate_format
to ensure consistency throughout the codebase.
26-29
: Keep method signatures consistent.
srtftime
similarly usesdate_string
for a datetime object andformate
for the format. Consider aligning names withsrtptime
to improve consistency (e.g., rename parameters to something likedatetime_obj
andfmt
).
30-33
: Validate inputs for robustness.
Currently,url_parse_quote_plus
does not check ifstring
isNone
or empty. If it's expected to be a non-empty string, adding a quick check or documentation note could help prevent silent failures.
153-174
: Consider timeouts and retries in external HTTP calls.
api_call
executes a request to an external HTTP endpoint. For reliability, specify timeouts or implement retry logic if these requests are prone to transient failures. Doing so can prevent indefinite hangs or partial successes.kairon/async_callback/utils.py (5)
59-105
: Add resilience measures in scheduling logic.
add_schedule_job
inserts job states into MongoDB but does not handle potential insert failures (e.g., connectivity issues, unique key violations). Consider adding error handling or logging to manage such conditions gracefully.
68-68
: Use logger in place of
Calls tologger.info
orlogger.debug
is recommended for better observability.Also applies to: 139-139, 142-142
133-151
: Check job existence before deletion.
delete_schedule_job
calls an external endpoint to delete scheduled jobs. If the job does not exist in the database or is already removed, the code might still attempt to dispatch deletion. A pre-check (and a more graceful error if missing) could be beneficial.
199-229
: Evaluate security of script execution.
execute_script
uses RestrictedPython, which is safer but not bulletproof. Review the environment carefully for any possible unauthorized operations. Ensure that critical modules or builtins are not inadvertently exposed.
231-255
: Standardize error response codes.
pyscript_handler
sets statusCode 422 for exceptions. Other parts of the code raiseAppException
or use 400 for errors. Consider unifying error codes to maintain consistency across the application.tests/integration_test/callback_service_test.py (1)
328-344
: Consider consistent error handling in asynchronous flows.
Intest_async_callback
,test_pyscript_failure
, andtest_dispatch_message_failure
, the code verifies exceptions by calling different methods (AppException("Error")
, etc.). This is correct, but consider making all error codes uniform (e.g., always 400 or 422) for easier debugging.Also applies to: 347-364, 367-380
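One way to keep those codes uniform is to funnel every failure through a single helper — a sketch only, reusing the response shape shown elsewhere in this review:

from blacksheep import Response as BSResponse
from blacksheep.contents import JSONContent

def error_response(message: str, code: int = 422) -> BSResponse:
    # One place decides the failure contract, so tests can assert a single shape.
    return BSResponse(status=code, content=JSONContent({
        "message": message, "error_code": code, "data": None, "success": False
    }))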
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
kairon/async_callback/main.py
(2 hunks)kairon/async_callback/processor.py
(2 hunks)kairon/async_callback/router/pyscript_callback.py
(1 hunks)kairon/async_callback/utils.py
(1 hunks)kairon/evaluator/main.py
(2 hunks)kairon/shared/actions/utils.py
(1 hunks)kairon/shared/concurrency/actors/pyscript_runner.py
(3 hunks)kairon/shared/concurrency/actors/utils.py
(1 hunks)kairon/shared/data/constant.py
(1 hunks)requirements/prod.txt
(1 hunks)tests/integration_test/callback_service_test.py
(5 hunks)tests/integration_test/evaluator_service_test.py
(3 hunks)tests/unit_test/actors/actors_test.py
(5 hunks)tests/unit_test/api/api_processor_test.py
(3 hunks)tests/unit_test/callback/pyscript_handler_test.py
(1 hunks)tests/unit_test/callback_test.py
(1 hunks)tests/unit_test/evaluator/evaluator_test.py
(2 hunks)
✅ Files skipped from review due to trivial changes (1)
- kairon/shared/data/constant.py
🧰 Additional context used
🪛 Gitleaks (8.21.2)
tests/unit_test/actors/actors_test.py
110-110: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
154-154: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
🪛 Ruff (0.8.2)
tests/unit_test/callback/pyscript_handler_test.py
12-12: Use capitalized environment variable SYSTEM_FILE
instead of system_file
Replace system_file
with SYSTEM_FILE
(SIM112)
tests/unit_test/api/api_processor_test.py
18-18: mongoengine.disconnect
imported but unused
Remove unused import: mongoengine.disconnect
(F401)
kairon/shared/concurrency/actors/utils.py
129-129: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
kairon/async_callback/router/pyscript_callback.py
24-24: Use key in dict
instead of key in dict.keys()
Remove .keys()
(SIM118)
32-32: Local variable e
is assigned to but never used
Remove assignment to unused variable e
(F841)
kairon/async_callback/main.py
16-16: Redefinition of unused Response
from line 11
(F811)
tests/integration_test/callback_service_test.py
10-10: Redefinition of unused TestClient
from line 9
(F811)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Codacy Static Code Analysis
- GitHub Check: Analyze (python)
- GitHub Check: Python CI
🔇 Additional comments (47)
requirements/prod.txt (1)
72-73
: Dependencies added for the BlackSheep framework transition.The addition of
blacksheep==2.0.7
aligns with the codebase refactoring from FastAPI to BlackSheep for asynchronous handling. The re-inclusion ofnltk
is likely needed for natural language processing tasks in the callback utilities.tests/unit_test/api/api_processor_test.py (2)
62-66
: Good addition of test fixture for isolation.This fixture ensures test isolation by cleaning up the default account before running the test, preventing state from previous test runs from affecting the current test.
1148-1148
: Test updated to use the cleanup fixture.The test now uses the
delete_default_account
fixture to ensure a clean state before execution, which is a good testing practice.kairon/evaluator/main.py (3)
6-6
: Added imports for database connection lifecycle management.Appropriate imports added for handling MongoDB connections in the application lifecycle.
Also applies to: 14-14
48-55
: Improved application lifecycle with proper database connection management.The addition of an async context manager for handling MongoDB connections is an excellent improvement. This ensures that:
- Database connections are properly established at application startup
- Connections are cleanly closed during shutdown
- Resources are managed efficiently throughout the application lifecycle
This approach follows best practices for resource management in asynchronous applications.
57-57
: FastAPI application now uses proper lifecycle management.Modified the FastAPI initialization to use the lifespan function, ensuring proper database connection management.
tests/unit_test/evaluator/evaluator_test.py (3)
30-30
: Updated test to include predefined_objects parameterThis test now correctly passes the new required
predefined_objects
parameter, ensuring compatibility with the updated method signature.
53-60
: Changed unauthorized import test from 'requests' to 'numpy'The test case has been updated to check for unauthorized numpy imports instead of requests. This is a good test variation that ensures the security restrictions are properly enforced for different libraries.
Make sure this change aligns with your import security policy. If both libraries should be restricted, consider having separate tests for each to ensure comprehensive coverage.
67-68
: Added predefined_objects parameter to error testTest correctly updated to include the required predefined_objects parameter in the interpreter error test case.
tests/integration_test/evaluator_service_test.py (3)
291-292
: Added required predefined_objects to request bodyTest has been correctly updated to include the new required parameter in the request body.
334-338
: Updated test script to use numpy instead of requestsThe test script has been modified to attempt importing numpy rather than requests, which aligns with the changes in the unit tests.
Consider adding a comment explaining why this specific library was chosen for the unauthorized import test to improve test maintainability.
354-354
: Updated error message assertion to match new testError message assertion correctly updated to check for unauthorized numpy import rather than requests.
tests/unit_test/callback/pyscript_handler_test.py (3)
85-109
: Clear invalid import test coverage.Verifies that unauthorized libraries like
numpy
are disallowed. This looks robust and clearly tested. No issues found.
351-453
: Adequate schedule job creation test.Thoroughly checks the job scheduling logic, usage of triggers, and verifies the resulting response. Good coverage. You might consider adding negative test scenarios (e.g., invalid triggers) if needed.
1041-1082
: Solid negative test for missing bot ID.This test ensures that the system correctly responds with
Missing bot id
. The approach is consistent with other negative test cases.kairon/shared/concurrency/actors/pyscript_runner.py (5)
1-1
: Use ofpartial
is appropriate for binding.
This import looks good and facilitates partial function application in the code.
15-15
: ImportingPyscriptUtility
aligns with the new functionalities.
No issues spotted; usage is consistent.
16-16
: No changes to review on this blank line.
20-22
: Confirm security implications of newly allowed modules.
Allowing modules likegooglemaps
,decimal
, and_strptime
may introduce additional attack surface. Verify necessity and safety in restricted execution contexts.
40-40
: Import ofCallbackUtility
is consistent with partial usage.
No immediate concerns here.tests/unit_test/actors/actors_test.py (5)
53-54
: Predefined objects usage in the test setup looks good.
No concerns; test coverage is appropriate.
61-170
: Comprehensive test forapi_call
functionality.
Good coverage of request/response handling. Note that details about the key are addressed separately.🧰 Tools
🪛 Gitleaks (8.21.2)
110-110: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
154-154: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
172-248
: DB action data retrieval test is properly structured.
Implementation and coverage look solid.
251-260
: Test confirms unauthorized import scenario.
Valid check ensures restricted environment is enforced.
281-282
: Interpreter syntax error test is correct.
Raising anAppException
with an appropriate message is verified.kairon/async_callback/main.py (13)
1-1
: Asynchronous import is suitable for BlackSheep usage.
No concerns with addingasyncio
.
5-6
: ImportingAccountProcessor
for startup tasks.
Looks aligned with system initialization flow.
9-10
: Router import for BlackSheep is fine.
No issues.
11-12
: Switch to BlackSheepApplication
is correctly introduced.
Continues the new async pattern.
38-41
: Application and lifecycle events are properly set.
No issues here.
46-51
: Security policy definitions are consistent.
The approach to controllingframe_ancestors
,connect_src
, etc. looks properly structured.
92-121
: Configurations for HSTS, CSP, Cache, and Permissions are well-structured.
No immediate concerns.
124-130
: Building CSP header instructions is straightforward.
Implementation is clear.
133-137
: Cache-Control header generation is correct.
No issues noted.
140-146
: Security headers middleware is effectively applied.
Implementation is consistent with prior definitions.
148-149
: Appending middleware for security and exception handling is logical.
No concerns.
152-154
: Index endpoint updated to BlackSheep route.
Implementation is correct.
157-160
: Healthcheck endpoint is straightforward.
No issues noted.kairon/shared/concurrency/actors/utils.py (2)
101-135
: Supplement error-handling for payload extraction.
get_payload
could encounter cases wherepredefined_objects.get("latest_message", {})
is missing some keys or if the chosenquery_type
doesn’t match any recognized search. Either handle these edge cases with safe lookups or raise a descriptive exception.🧰 Tools
🪛 Ruff (0.8.2)
129-129: Within an
except
clause, raise exceptions withraise ... from err
orraise ... from None
to distinguish them from errors in exception handling(B904)
137-151
: Validate the consistency of database action payloads.
get_db_action_data
accessespayload[0]["query_type"]
without checking ifpayload
is empty or if the required fields exist. You may want to include additional checks (and raise informative exceptions) when the database configuration is incomplete or malformed.kairon/async_callback/utils.py (2)
153-180
: Confirm SMTP configurations securely.
Insend_email
, the SMTP password and user ID are retrieved from the stored config. Ensure that these are encrypted at rest, used only in memory, and not logged. Consider adopting secure handling for all credentials.
257-274
: Return type clarity.
fetch_collection_data
yields dicts with multiple fields but might cause confusion if consumed incorrectly. Add docstrings or usage examples to ensure that consumers understand they must iterate over the generator.tests/integration_test/callback_service_test.py (5)
210-225
: Use correct assertion method for mocks.
Many tests, such astest_get_callback
(lines 210-225) andtest_post_callback
(lines 229-248), useassert mock_dispatch_message.called_once_with(...)
. The usual built-in method is[mock].assert_called_once_with(...)
. Verify thatcalled_once_with
does what you expect or replace it with the standard assertion method.Also applies to: 229-248, 252-268, 272-285, 289-302, 306-319
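The distinction matters because called_once_with is not part of the Mock API: accessing it auto-creates a child mock whose return value is truthy, so the assert always passes. A small sketch:

from unittest.mock import MagicMock

mock_dispatch_message = MagicMock()
mock_dispatch_message("bot", "hello")

# Always passes: 'called_once_with' is auto-created and returns a truthy Mock.
assert mock_dispatch_message.called_once_with("bot", "wrong text")

# Fails when the recorded call does not match, which is the intended behaviour.
mock_dispatch_message.assert_called_once_with("bot", "hello")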
183-208
: Tests for app readiness.
test_index
andtest_healthcheck
confirm the basic endpoints work correctly. These look fine and are well-structured. No further concerns.
322-325
: Guard against missing request data.
test_invalid_request
raisesAppException
ifprocess_router_message
is called withrequest=None
. This ensures the system handles invalid input gracefully. Good job.
384-396
: Shortened URL callback tests show robust coverage.
The code checks GET, POST, and PUT callbacks for shortened tokens. This thorough approach to testing ensures each HTTP method is fully validated. Nicely done.Also applies to: 399-415, 419-437
441-461
: Comprehensive tests for standalone callbacks with dynamic states.
You have extensive coverage for different failure modes (wrong identifier, missing path, etc.) and for updating state across calls. This is a strong approach to ensure reliability in real-world scenarios.Also applies to: 465-484, 488-507, 511-528, 532-549, 553-591
logger.exception(e)
return BSResponse(
    status=500,
    content=JSONContent({
        "message": str(e),
        "error_code": 400,
        "data": None,
        "success": False
    })
)
🛠️ Refactor suggestion
Ensure status code and error code alignment.
The block sets status=500
while error_code=400
. Consider aligning these values to accurately reflect the error. A general exception returning a 500 status code typically pairs with a matching error code (e.g., error_code=500
).
- "error_code": 400,
+ "error_code": 500,
bot = predefined_objects.get("slot", {}).get("bot")

global_safe['add_data'] = partial(CallbackUtility.add_data, bot=bot)
global_safe['get_data'] = partial(CallbackUtility.get_data, bot=bot)
global_safe['delete_data'] = partial(CallbackUtility.delete_data, bot=bot)
global_safe['update_data'] = partial(CallbackUtility.update_data, bot=bot)
global_safe['delete_schedule_job'] = partial(CallbackUtility.delete_schedule_job, bot=bot)
global_safe['get_db_action_data'] = partial(PyscriptUtility.get_db_action_data, bot=bot,
                                            predefined_objects=predefined_objects)
global_safe['api_call'] = partial(PyscriptUtility.api_call, bot=bot,
                                  predefined_objects=predefined_objects)
🛠️ Refactor suggestion
Avoid storing partial-bound callables in a global dictionary.
This approach can cause unpredictable behavior if scripts run concurrently under different bot
values because global_safe
is a shared resource. Consider storing these callables in a local scope instead.
- global_safe['add_data'] = partial(CallbackUtility.add_data, bot=bot)
- # ... other partial assignments
+ local_functions = {
+ "add_data": partial(CallbackUtility.add_data, bot=bot),
+ "get_data": partial(CallbackUtility.get_data, bot=bot),
+ # etc.
+ }
+ local_vars.update(local_functions)
Committable suggestion skipped: line range outside the PR's diff.
@@ -19,6 +19,7 @@ class TestActors:
    def setup(self):
        os.environ["system_file"] = "./tests/testing_data/system.yaml"
        Utility.load_environment()
        connect(**Utility.mongoengine_connection(Utility.environment['database']["url"]))
🛠️ Refactor suggestion
Connection setup might benefit from isolation.
For safer and more repeatable tests, consider using a dedicated mock or temporary database.
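For instance, a sketch using mongomock so each run starts from an empty database — this assumes a mongoengine version that accepts the mongo_client_class argument:

import mongomock
from mongoengine import connect, disconnect

def setup(self):
    connect("kairon_test_db", mongo_client_class=mongomock.MongoClient)

def teardown(self):
    disconnect()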
assert result["resp"] == {
    'messaging_product': 'whatsapp',
    'contacts': [
        {
            'input': '919876543210',
            'wa_id': '919876543210'
        }
    ],
    'messages': [
        {
            'id': 'wamid.HBgMOTE5NTE1abcdcffdfdv1tsh9qlr9Oul5AKRkE0MzBCNTI3AA==',
            'message_status': 'accepted'
        }
    ]
}
assert result["headers"] == {'D360-API-KEY': 'abcdcffdfdv1tsh9qlr9Oul5AK'}
Potential leak of an API key in test.
Static analysis highlights a generic key 'abcdcffdfdv1tsh9qlr9Oul5AK'
. For security, store test keys in environment variables or reduce them to obvious non-sensitive stub examples.
-assert result["headers"] == {'D360-API-KEY': 'abcdcffdfdv1tsh9qlr9Oul5AK'}
+import os
+test_key = os.environ.get("TEST_D360_API_KEY", "fake_key")
+assert result["headers"] == {'D360-API-KEY': test_key}
🧰 Tools
🪛 Gitleaks (8.21.2)
154-154: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
@staticmethod
def get_embedding(texts: Union[Text, List[Text]], user: str, bot: str,
                  invocation: str) -> Union[List[float], List[List[float]]]:
    """
    Get embeddings for a batch of texts using LiteLLM.

    Args:
        texts (Union[Text, List[Text]]): Text or list of texts to generate embeddings for.
        user (str): User information for embedding metadata.
        bot (str): Bot identifier for embedding metadata.
        invocation (str): Invocation identifier for embedding metadata.

    Returns:
        Union[List[float], List[List[float]]]: A single embedding or a list of embeddings.
    """
    tokenizer = get_encoding("cl100k_base")
    embedding_ctx_length = 8191

    is_single_text = isinstance(texts, str)
    if is_single_text:
        texts = [texts]

    truncated_texts = []
    for text in texts:
        tokens = tokenizer.encode(text)[:embedding_ctx_length]
        truncated_texts.append(tokenizer.decode(tokens))

    llm_secret = Sysadmin.get_llm_secret(llm_type=DEFAULT_LLM, bot=bot)

    result = litellm.embedding(
        model="text-embedding-3-small",
        input=truncated_texts,
        metadata={'user': user, 'bot': bot, 'invocation': invocation},
        api_key=llm_secret.get("api_key"),
        num_retries=3
    )

    embeddings = [embedding["embedding"] for embedding in result["data"]]

    return embeddings[0] if is_single_text else embeddings
🛠️ Refactor suggestion
Handle potential exceptions from external service calls.
In get_embedding
, the call to litellm.embedding
might fail or time out, and the method does not currently handle any runtime errors (network failures, invalid API keys, etc.). Consider wrapping it in a try/except block to raise a meaningful error or return fallback data.
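For example, a hedged sketch of the wrapping; litellm surfaces provider-specific errors, so a broad catch is shown, and the AppException type mirrors the one referenced elsewhere in this review:

try:
    result = litellm.embedding(
        model="text-embedding-3-small",
        input=truncated_texts,
        metadata={'user': user, 'bot': bot, 'invocation': invocation},
        api_key=llm_secret.get("api_key"),
        num_retries=3
    )
except Exception as e:
    # Re-raise with context so callers know the embedding step itself failed.
    raise AppException(f"Failed to fetch embeddings: {e}") from e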
@staticmethod
def perform_operation(data: dict, user: str, **kwargs):
    request = {}
    vector_db_url = Utility.environment['vector']['db']
    url = urljoin(vector_db_url, f"/collections/{kwargs.pop('collection_name')}/points/scroll")
    if "embedding_search" in data:
        user_msg = data.get("embedding_search")
        if user_msg and isinstance(user_msg, str):
            vector = PyscriptUtility.get_embedding(user_msg, user, invocation='db_action_qdrant', **kwargs)
            request['query'] = vector

    if "payload_search" in data:
        payload = data.get("payload_search")
        if payload:
            request.update(**payload)

    if request:
        request.update(**{'with_payload': True})
        if 'limit' not in request:
            request['limit'] = 10
        response = requests.post(url, json=request)
        result = response.json()
    else:
        raise Exception('No Operation to perform')
    return result
🛠️ Refactor suggestion
Check response status from the vector database.
perform_operation
sends a POST request to the vector DB but does not check response.status_code
. If the DB returns an error status code, calling response.json()
might raise additional exceptions. Consider validating the status code and handling errors consistently (e.g., logging or raising a custom exception).
except json.JSONDecodeError as e:
    logger.debug(e)
    raise Exception(f"Error converting payload to JSON: {value}")
🛠️ Refactor suggestion
Leverage exception chaining for clarity.
When re-raising an exception inside the except json.JSONDecodeError
, use raise ... from e
to preserve the traceback and convey contextual details:
- raise Exception(f"Error converting payload to JSON: {value}")
+ raise Exception(f"Error converting payload to JSON: {value}") from e
🧰 Tools
🪛 Ruff (0.8.2)
129-129: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
@staticmethod
def perform_cleanup(local_vars: Dict):
    logger.info(f"local_vars: {local_vars}")
    filtered_locals = {}
    if local_vars:
        for key, value in local_vars.items():
            if not isinstance(value, Callable) and not isinstance(value, ModuleType):
                if isinstance(value, datetime):
                    value = value.strftime("%m/%d/%Y, %H:%M:%S")
                elif isinstance(value, date):
                    value = value.strftime("%Y-%m-%d")
                elif isinstance(value, Response):
                    value = value.text
                filtered_locals[key] = value
    logger.info(f"filtered_vars: {filtered_locals}")
    return filtered_locals
🛠️ Refactor suggestion
Clarify debugging approach.
perform_cleanup
logs local variables. While useful for debugging, it can expose sensitive data if not carefully filtered or in production logs. Ensure no PII or secrets are written to logs.
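One option, sketched with an illustrative deny-list (the key names are assumptions, not an exhaustive set):

SENSITIVE_KEYS = {"password", "api_key", "token", "secret"}

def redact(values: dict) -> dict:
    # Mask anything that looks like a credential before it reaches the logs.
    return {k: ("***" if k.lower() in SENSITIVE_KEYS else v) for k, v in values.items()}

logger.info(f"filtered_vars: {redact(filtered_locals)}")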
def get_data(collection_name: str, user: str, filter_data: dict, bot: Text = None):
    if not bot:
        raise Exception("Missing bot id")

    collection_name = collection_name.lower()

    query = {"bot": bot, "collection_name": collection_name}

    query.update({f"data__{key}": value for key, value in filter_data.items()})
    data = list(CallbackUtility.fetch_collection_data(query))
    return {"data": data}
🛠️ Refactor suggestion
Add data operation validations.
The get_data
, add_data
, update_data
, and delete_data
functions all depend on cognition_processor
calls. Provide robust checks for missing or invalid input (e.g., no bot
, invalid collection_id
, etc.) and log or raise errors to simplify diagnosing usage issues.
Also applies to: 289-298, 300-309, 311-320
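A sketch of the kind of guard clauses suggested — the exception messages mirror the existing "Missing bot id" check, and the helper name is illustrative:

def validate_data_request(bot: str, collection_name: str, data: dict = None):
    # Fail early with a clear message instead of surfacing a driver-level error later.
    if not bot:
        raise Exception("Missing bot id")
    if not collection_name or not collection_name.strip():
        raise Exception("Missing collection name")
    if data is not None and not isinstance(data, dict):
        raise Exception("Payload must be a dictionary")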
from blacksheep.contents import JSONContent

from kairon import Utility
from fastapi.testclient import TestClient
from blacksheep.testing import TestClient
🛠️ Refactor suggestion
Avoid conflicting import for TestClient
.
Both from fastapi.testclient import TestClient
and from blacksheep.testing import TestClient
are imported. This triggers redefinition warnings (F811). Remove the unused fastapi.testclient
import to prevent confusion.
- from fastapi.testclient import TestClient
+ # Removed duplicate import
🧰 Tools
🪛 Ruff (0.8.2)
10-10: Redefinition of unused TestClient
from line 9
(F811)
Actionable comments posted: 1
🧹 Nitpick comments (12)
tests/integration_test/action_service_test.py (4)
1366-1367
: Environment configuration setup looks good, but consider centralizing.The configuration looks good with proper timeout setting and URL for the pyscript evaluator. However, since I see this configuration duplicated in another test case below, consider moving it to a setup method or fixture to maintain DRY principles.
1442-1443
: Duplicated environment configuration.This is a duplicate of the configuration at lines 1366-1367. Consider extracting this to a shared helper method or fixture to avoid duplication.
1444-1447
: Grammar error in error message.There's a grammatical error in the error message: "Failed to evaluated the pyscript" should be "Failed to evaluate the pyscript".
- json={"message": "Failed to evaluated the pyscript", "success": False, "error_code": 422, "data": None} + json={"message": "Failed to evaluate the pyscript", "success": False, "error_code": 422, "data": None}
1488-1488
: Error assertion might be brittle.This assertion checks for the exact error message format which might make the test brittle if the error message format changes. Consider a more flexible assertion that checks for key components of the error (like error_code and success status) instead of the exact string format.
For example:
error_data = json.loads(log['exception'].replace("Pyscript evaluation failed: ", ""))
assert error_data['success'] == False
assert error_data['error_code'] == 422
tests/unit_test/callback/pyscript_handler_test.py (8)
12-12
: Follow the naming convention for environment variables.Environment variables should use uppercase letters to follow standard conventions.
-os.environ["system_file"] = "./tests/testing_data/system.yaml" +os.environ["SYSTEM_FILE"] = "./tests/testing_data/system.yaml"🧰 Tools
🪛 Ruff (0.8.2)
12-12: Use capitalized environment variable
SYSTEM_FILE
instead ofsystem_file
Replace
system_file
withSYSTEM_FILE
(SIM112)
447-448
: Simplified assertion using set comparison.The current assertion with DeepDiff ignoring order is overly complex for a simple set comparison.
-assert not DeepDiff(list(job_state['args'][2]['predefined_objects'].keys()), ['user', 'bot', 'event'], - ignore_order=True) +assert set(job_state['args'][2]['predefined_objects'].keys()) == {'user', 'bot', 'event'}
9-15
: Database connection management needs cleanup.The test directly connects to MongoDB without properly closing the connection when tests complete. This could lead to resource leaks.
Consider adding a fixture for database connection management:
import pytest from mongoengine import connect, disconnect @pytest.fixture(scope="module") def database(): # Set up connection connect(**Utility.mongoengine_connection(Utility.environment['database']["url"])) yield # Clean up connection disconnect()Then use this fixture in your tests that require database access.
🧰 Tools
🪛 Ruff (0.8.2)
12-12: Use capitalized environment variable
SYSTEM_FILE
instead ofsystem_file
Replace
system_file
withSYSTEM_FILE
(SIM112)
19-64
: Simplify test code by removing unused source code.The test contains multiple definitions of
source_code
with only the last one being used.def test_lambda_handler_with_simple_pyscript(): - source_code = ''' - from datetime import datetime, timedelta - - # Calculate the job trigger time 5 minutes from now - trigger_time1 = datetime.utcnow() + timedelta(minutes=30) - trigger_time2 = datetime.utcnow() + timedelta(minutes=2) - - id = generate_id() - - # Function to add the scheduled job (with the adjusted trigger time) - # add_schedule_job('mng2', trigger_time1, {'user_rajan': 'test rajan sep 12'}, 'UTC', id, kwargs={'task_type': 'Callback'}) - # add_schedule_job('del_job1', trigger_time2, {'event_id': id}, 'UTC', kwargs={'task_type': 'Callback'}) - - add_schedule_job('mng2', trigger_time1, {'user_rajan': 'test rajan sep 12'}, 'UTC', id) - add_schedule_job('del_job1', trigger_time2, {'event_id': id}, 'UTC') - ''' source_code = ''' bot_response = "This is testing pyscript" '''
397-409
: Remove redundant source code reassignment.Similar to the first test, this test contains multiple definitions of
source_code
with only the last one being used.- source_code = ''' - from datetime import datetime, timedelta - - # Calculate the job trigger time 5 minutes from now - trigger_time1 = datetime.utcnow() + timedelta(minutes=30) - trigger_time2 = datetime.utcnow() + timedelta(minutes=2) - - id = generate_id() - - # Function to add the scheduled job (with the adjusted trigger time) - # add_schedule_job('mng2', trigger_time1, {'user_rajan': 'test rajan sep 12'}, 'UTC', id, kwargs={'task_type': 'Callback'}) - # add_schedule_job('del_job1', trigger_time2, {'event_id': id}, 'UTC', kwargs={'task_type': 'Callback'}) - - add_schedule_job('mng2', trigger_time1, {'user_rajan': 'test rajan sep 12'}, 'UTC', id) - add_schedule_job('del_job1', trigger_time2, {'event_id': id}, 'UTC') - ''' source_code = ''' from datetime import datetime, timedelta
485-497
: Remove redundant source code reassignment.Redundant source code reassignment creates confusion and reduces test clarity.
- source_code = ''' - from datetime import datetime, timedelta - - # Calculate the job trigger time 5 minutes from now - trigger_time1 = datetime.utcnow() + timedelta(minutes=30) - trigger_time2 = datetime.utcnow() + timedelta(minutes=2) - - id = generate_id() - - # Function to add the scheduled job (with the adjusted trigger time) - # add_schedule_job('mng2', trigger_time1, {'user_rajan': 'test rajan sep 12'}, 'UTC', id, kwargs={'task_type': 'Callback'}) - # add_schedule_job('del_job1', trigger_time2, {'event_id': id}, 'UTC', kwargs={'task_type': 'Callback'}) - - add_schedule_job('mng2', trigger_time1, {'user_rajan': 'test rajan sep 12'}, 'UTC', id) - add_schedule_job('del_job1', trigger_time2, {'event_id': id}, 'UTC') - ''' source_code = '''
1-1133
: Enhance test maintainability with fixtures and parameterization.The test file contains many similar test functions that could benefit from pytest's fixture and parameterization features to reduce code duplication.
Consider:
- Creating fixtures for common test data and resources
- Using parameterized tests for similar scenarios
- Grouping related tests into classes
Example of parameterization for error test cases:
@pytest.mark.parametrize( "test_id, missing_field, source_code, expected_error", [ ("missing_bot", "bot", 'add_schedule_job(...)', 'Missing bot id'), ("missing_event_id", "event_id", 'delete_schedule_job(False)', 'Missing event id'), # Add other similar test cases ] ) def test_pyscript_handler_error_cases(test_id, missing_field, source_code, expected_error): # Setup test based on parameters # Run test # Assert expected error🧰 Tools
🪛 Ruff (0.8.2)
12-12: Use capitalized environment variable
SYSTEM_FILE
instead ofsystem_file
Replace
system_file
withSYSTEM_FILE
(SIM112)
🪛 Gitleaks (8.21.2)
363-363: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
269-270
: SMTP port and credentials should be consistent across tests.The test uses port 293 which is non-standard, and hardcoded test credentials.
Consider using constants or configuration values for test SMTP settings to ensure consistency across tests:
# At the top of the file TEST_SMTP_URL = "smtp.gmail.com" TEST_SMTP_PORT = 587 # Standard TLS port TEST_SMTP_PASSWORD = {"value": "test_password"} TEST_SMTP_USERID = {"value": "test_user"}Then use these constants in your test setup.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
tests/integration_test/action_service_test.py
(3 hunks)tests/unit_test/action/action_test.py
(1 hunks)tests/unit_test/callback/pyscript_handler_test.py
(1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
tests/unit_test/callback/pyscript_handler_test.py
12-12: Use capitalized environment variable SYSTEM_FILE
instead of system_file
Replace system_file
with SYSTEM_FILE
(SIM112)
🪛 Gitleaks (8.21.2)
tests/unit_test/callback/pyscript_handler_test.py
363-363: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Codacy Static Code Analysis
- GitHub Check: Python CI
- GitHub Check: Analyze (python)
🔇 Additional comments (3)
tests/integration_test/action_service_test.py (1)
1368-1373
: Successfully mocked the pyscript evaluation response.The mock response setup is well-structured with appropriate success response data including bot_response and slots. This properly tests the success path of your callback service.
tests/unit_test/action/action_test.py (1)
1444-1450
: The mock environment has been updated to use a direct HTTP endpoint instead of a lambda function.The test case has been modified to use a direct HTTP call to a local evaluator endpoint rather than triggering a lambda function. This change aligns with the broader architectural shift mentioned in the PR description of moving
pyscript-lambda
andpyscript-callback
code.The mock response provides a properly structured response object with both bot response and slot values, which will properly test the callback functionality.
tests/unit_test/callback/pyscript_handler_test.py (1)
269-269
:✅ Verification successful
Check SMTP port.
The SMTP port 293 seems unusual. Standard SMTP ports are typically 25 (unencrypted), 465 (SSL), or 587 (TLS).
🏁 Script executed:
#!/bin/bash # Check standard SMTP ports used elsewhere in the codebase rg -g "*.py" -A 2 -B 2 "smtp_port\s*=\s*\d+"Length of output: 14979
SMTP Port Configuration Verified
After reviewing the tests, the use of
smtp_port=293
is consistently found across multiple unit and integration tests (including intests/unit_test/callback/pyscript_handler_test.py
). Although port 293 is nonstandard for production environments, it is used intentionally in the test suite. No changes are needed unless this was not the intended test configuration.
execution_mode="sync",
expire_in=0,
shorten_token=True,
token_hash="0191703078f87a039906afc0a219dd5c",
💡 Verification agent
🧩 Analysis chain
Potential API key or secret exposed in test configuration.
The code includes what appears to be an API key or validation secret in the test. Consider using a mock value or placeholder for testing.
🏁 Script executed:
#!/bin/bash
# Check if this appears to be a real secret or just a test value
rg -g "*.py" "0191703078f779199d90c1a91fe9839f"
Length of output: 306
Action: Replace Hard-Coded Test Secret with a Dummy Value
The verification confirms that the string "0191703078f779199d90c1a91fe9839f"
is hard-coded in two test files:
- In
tests/integration_test/callback_service_test.py
where it’s passed as thevalidation_secret
viaencrypt_secret(...)
- In
tests/unit_test/callback/pyscript_handler_test.py
with a similar usage
Using such a fixed value—even if intended only for testing—can be misleading. To prevent any accidental misuse or confusion, please replace this hard-coded secret with a clearly labeled placeholder (e.g., "TEST_SECRET"
) and consider centralizing test configuration values.
🧰 Tools
🪛 Gitleaks (8.21.2)
363-363: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
moved pyscript-lambda and pyscript-callback code and added test cases.
Summary by CodeRabbit
New Features
Refactor
Tests
Chores