Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bot Specific Executor Logs #1572

Merged
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion kairon/actions/definitions/schedule.py
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@
from kairon.shared.actions.models import ActionType
from kairon.shared.actions.utils import ActionUtility
from kairon.shared.callback.data_objects import CallbackConfig
from kairon.shared.data.constant import TASK_TYPE


class ActionSchedule(ActionsBase):
Expand Down Expand Up @@ -136,12 +137,13 @@ async def add_schedule_job(self,
date_time: datetime,
data: Dict,
timezone: Text,
kwargs=None):
**kwargs):
func = obj_to_ref(ExecutorFactory.get_executor().execute_task)

_id = uuid7().hex
data['predefined_objects']['event'] = _id
args = (func, "scheduler_evaluator", data,)
kwargs.update({'task_type': TASK_TYPE.ACTION.value})
trigger = DateTrigger(run_date=date_time, timezone=timezone)

next_run_time = trigger.get_next_fire_time(None, datetime.now(astimezone(timezone) or get_localzone()))
Expand Down
3 changes: 2 additions & 1 deletion kairon/actions/definitions/web_search.py
Original file line number Diff line number Diff line change
Expand Up @@ -64,7 +64,8 @@ async def execute(self, dispatcher: CollectingDispatcher, tracker: Tracker, doma
if not ActionUtility.is_empty(latest_msg):
results = ActionUtility.perform_web_search(latest_msg,
topn=action_config.get("topn"),
website=action_config.get("website"))
website=action_config.get("website"),
bot=self.bot)
Comment on lines +67 to +68
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Some calls to perform_web_search are missing the bot parameter.

Please update all instances of perform_web_search to include the bot parameter to ensure consistency and proper functionality.

Affected locations:

  • tests/unit_test/action/action_test.py
  • tests/integration_test/action_service_test.py
🔗 Analysis chain

LGTM! Consider updating docstring and verify other method calls.

The addition of the bot parameter to the perform_web_search method call is correct and consistent with the changes mentioned in the summary. This change allows for better flexibility and potentially improved separation of concerns.

However, to ensure completeness:

  1. Please verify that all other calls to perform_web_search in the codebase have been updated to include this new parameter.
  2. Consider updating the docstring of the execute method to reflect this change, mentioning that it now passes the bot information to the web search function.

To verify other calls to perform_web_search, you can run the following script:

This will help ensure that all relevant calls have been updated consistently across the codebase.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find all calls to perform_web_search and verify they include the bot parameter

# Search for perform_web_search calls
rg "perform_web_search\(" --type python -g "!test_*" -A 3

Length of output: 90


Script:

# 
#!/bin/bash
# Description: Find all calls to perform_web_search and verify they include the bot parameter

# Search for perform_web_search calls without specifying the file type
rg "perform_web_search\s*\(" --glob "!test_*" -A 3

Length of output: 6976

if results:
bot_response = ActionUtility.format_search_result(results)
if not ActionUtility.is_empty(action_config.get('set_slot')):
Expand Down
22 changes: 22 additions & 0 deletions kairon/api/app/routers/bot/bot.py
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,7 @@
from kairon.shared.data.processor import MongoProcessor
from kairon.shared.data.training_data_generation_processor import TrainingDataGenerationProcessor
from kairon.shared.data.utils import DataUtility
from kairon.shared.events.processor import ExecutorProcessor
from kairon.shared.importer.data_objects import ValidationLogs
from kairon.shared.importer.processor import DataImporterLogProcessor
from kairon.shared.live_agent.live_agent import LiveAgentHandler
Expand Down Expand Up @@ -1705,6 +1706,27 @@ async def get_llm_logs(
return Response(data=data)


@router.get("/executor/logs", response_model=Response)
async def get_executor_logs(
start_idx: int = 0, page_size: int = 10,
event_class: str = None, task_type: str = None,
current_user: User = Security(Authentication.get_current_user_and_bot, scopes=TESTER_ACCESS)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Avoid function call in argument default

The static analysis tool Ruff has flagged a potential issue with performing a function call in the argument default for the current_user parameter. This is generally discouraged as it can lead to unexpected behavior with mutable default arguments.

To address this, you can modify the function signature as follows:

from fastapi import Depends

@router.get("/executor/logs", response_model=Response)
async def get_executor_logs(
    start_idx: int = 0,
    page_size: int = 10,
    event_class: str = None,
    task_type: str = None,
    current_user: User = Depends(Authentication.get_current_user_and_bot)
):
    # ... rest of the function

This change uses FastAPI's Depends function, which is the recommended way to handle dependency injection and security in FastAPI. It will still apply the TESTER_ACCESS scope check but avoids the issue with function calls in default arguments.

🧰 Tools
🪛 Ruff

1713-1713: Do not perform function call Security in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

):
"""
Get executor logs data based on filters provided.
"""
logs = list(ExecutorProcessor.get_executor_logs(current_user.get_bot(), start_idx, page_size,
event_class=event_class, task_type=task_type))
row_cnt = ExecutorProcessor.get_row_count(current_user.get_bot(),
event_class=event_class,
task_type=task_type)
data = {
"logs": logs,
"total": row_cnt
}
return Response(data=data)


@router.get("/metadata/llm", response_model=Response)
async def get_llm_metadata(
current_user: User = Security(Authentication.get_current_user_and_bot, scopes=TESTER_ACCESS)) -> Response:
Expand Down
4 changes: 2 additions & 2 deletions kairon/events/executors/base.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
from abc import abstractmethod

from typing import Any

from kairon.shared.constants import EventClass
from kairon.shared.data.constant import EVENT_STATUS, TASK_TYPE
Expand All @@ -13,7 +13,7 @@ class ExecutorBase:
def execute_task(self, event_class: EventClass, data: dict, **kwargs):
raise NotImplementedError("Provider not implemented")

def log_task(self, event_class: EventClass, task_type: TASK_TYPE, data: dict, status: EVENT_STATUS, **kwargs):
def log_task(self, event_class: EventClass, task_type: TASK_TYPE, data: Any, status: EVENT_STATUS, **kwargs):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Issue Detected: Type Inconsistency in log_task Method

The log_task method signature has been changed from dict to Any, but its implementation in CloudUtility.log_task still expects data as a dict. This inconsistency can lead to type safety issues and potential runtime errors.

Recommendations:

  1. Revert the data Parameter Type:

    • Change the data parameter type back to dict in the log_task method to maintain consistency with CloudUtility.log_task.
  2. Enhance Type Handling:

    • If flexibility is required, update CloudUtility.log_task to handle Any type for data with appropriate type checks and validations.
  3. Update Documentation and Type Annotations:

    • Ensure all type annotations across the codebase are consistent and accurately reflect the expected data types.
    • Add documentation to clarify the expected types for the data parameter.
  4. Run Comprehensive Tests:

    • Execute unit and integration tests to verify that type changes do not introduce unintended side effects.
🔗 Analysis chain

Approved: log_task method signature update, with considerations.

The change from dict to Any for the data parameter type allows for greater flexibility. However, consider the following points:

  1. Ensure that all subclasses and calling code are updated to handle this change appropriately.
  2. Consider adding type checking or documentation to clarify what types of data are actually supported.
  3. Verify that CloudUtility.log_task() can handle various data types safely.

To ensure this change doesn't introduce unexpected behavior, please run the following verification:


from bson import ObjectId
from kairon.shared.cloud.utils import CloudUtility

Expand Down
2 changes: 1 addition & 1 deletion kairon/shared/actions/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -490,7 +490,7 @@ def perform_web_search(search_term: str, **kwargs):
trigger_task = Utility.environment['web_search']['trigger_task']
search_engine_url = Utility.environment['web_search']['url']
website = kwargs.get('website') if kwargs.get('website') else ''
request_body = {"text": search_term, "site": website, "topn": kwargs.get("topn")}
request_body = {"text": search_term, "site": website, "topn": kwargs.get("topn"), "bot": kwargs.get("bot")}
results = []
try:
if trigger_task:
Expand Down
45 changes: 44 additions & 1 deletion kairon/shared/cloud/utils.py
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
from typing import Any

import ujson as json
import os
import time
Expand Down Expand Up @@ -57,7 +59,7 @@ def delete_file(bucket, file):
s3.delete_object(Bucket=bucket, Key=file)

@staticmethod
def trigger_lambda(event_class: EventClass, env_data: dict, task_type: TASK_TYPE = TASK_TYPE.ACTION.value,
def trigger_lambda(event_class: EventClass, env_data: Any, task_type: TASK_TYPE = TASK_TYPE.ACTION.value,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Changing env_data type to Any reduces type safety

Changing the type annotation of env_data from dict to Any in the trigger_lambda method reduces type safety and may make it harder to catch errors during static analysis. If env_data can be multiple specific types, consider using Union types to represent all expected types (e.g., Union[dict, list]). This approach maintains type safety while allowing flexibility.


⚠️ Potential issue

Ensure env_data is JSON serializable

Since env_data is being serialized with json.dumps(env_data), changing its type to Any might lead to runtime errors if env_data contains non-serializable objects. Please ensure that all possible types passed as env_data are JSON serializable.

from_executor: bool = False):
"""
Triggers lambda based on the event class.
Expand Down Expand Up @@ -103,6 +105,9 @@ def log_task(event_class: EventClass, task_type: TASK_TYPE, data: dict, status:
from kairon.shared.events.data_objects import ExecutorLogs

executor_log_id = kwargs.get("executor_log_id") if kwargs.get("executor_log_id") else ObjectId().__str__()
bot_id = CloudUtility.get_bot_id_from_env_data(event_class, data,
from_executor=kwargs.get("from_executor", False),
task_type=task_type)
Comment on lines +108 to +110
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Check for None before assigning bot_id to log.bot

In the log_task method, bot_id is obtained from get_bot_id_from_env_data, which may return None if a bot ID isn't found. Assigning None to log.bot might not be intended and could lead to issues when querying or displaying logs. Consider adding a check to handle cases where bot_id is None, possibly assigning a default value or logging a warning.


try:
log = ExecutorLogs.objects(executor_log_id=executor_log_id, task_type=task_type, event_class=event_class,
Expand All @@ -116,9 +121,47 @@ def log_task(event_class: EventClass, task_type: TASK_TYPE, data: dict, status:
for key, value in kwargs.items():
if not getattr(log, key, None) and Utility.is_picklable_for_mongo({key: value}):
setattr(log, key, value)
log.bot = bot_id
log.save()
return executor_log_id

@staticmethod
def get_bot_id_from_env_data(event_class: EventClass, data: Any, **kwargs):
bot = None
from_executor = kwargs.get("from_executor")

if isinstance(data, dict) and 'bot' in data:
bot = data['bot']

elif event_class == EventClass.web_search:
bot = data.get('bot')

elif event_class == EventClass.pyscript_evaluator:
predefined_objects = data.get('predefined_objects', {})

if 'slot' in predefined_objects and 'bot' in predefined_objects['slot']:
bot = predefined_objects['slot']['bot']

task_type = kwargs.get("task_type")
if task_type == "Callback" and 'bot' in predefined_objects:
bot = predefined_objects['bot']
Comment on lines +145 to +147
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Type mismatch when comparing task_type

In line 146, task_type is compared to the string "Callback":

if task_type == "Callback" and 'bot' in predefined_objects:

If task_type is an instance of TASK_TYPE (an enum), comparing it directly to a string may not work as intended. Instead, compare it to the enum value:

  • Use task_type == TASK_TYPE.CALLBACK
  • Or use task_type.value == "Callback"

This ensures that the comparison behaves correctly.


elif event_class == EventClass.scheduler_evaluator and isinstance(data, list):
for item in data:
if item.get('name') == 'PREDEFINED_OBJECTS':
predefined_objects = item.get('value', {})
if 'bot' in predefined_objects:
bot = predefined_objects['bot']
break

elif from_executor and isinstance(data, list):
for item in data:
if item.get('name') == 'BOT':
bot = item.get('value')
break

return bot
Comment on lines +128 to +163
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Refactor get_bot_id_from_env_data for clarity and maintainability

The get_bot_id_from_env_data method contains multiple conditional branches that handle different event_class cases and data structures. This can make the code hard to read and maintain. Consider refactoring the method:

  • Use a mapping of event classes to handler functions: Create separate functions to handle each event_class and store them in a dictionary. This approach simplifies the main method and separates concerns.

  • Add comments or docstrings: Provide explanations for each block to clarify the intent and any assumptions about the data structure.

  • Handle missing or unexpected data gracefully: Ensure that the method doesn't raise exceptions due to unexpected data types and that it logs warnings when necessary.


@staticmethod
def lambda_execution_failed(response):
return (response['StatusCode'] != 200 or
Expand Down
41 changes: 41 additions & 0 deletions kairon/shared/events/processor.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
from kairon.shared.events.data_objects import ExecutorLogs


class ExecutorProcessor:

@staticmethod
def get_executor_logs(bot: str, start_idx: int = 0, page_size: int = 10, **kwargs):
"""
Get all executor logs data .
@param bot: bot id.
@param start_idx: start index
@param page_size: page size
@return: list of logs.
"""
event_class = kwargs.get("event_class")
task_type = kwargs.get("task_type")
query = {"bot": bot}
if event_class:
query.update({"event_class": event_class})
if task_type:
query.update({"task_type": task_type})
for log in ExecutorLogs.objects(**query).order_by("-timestamp").skip(start_idx).limit(page_size):
executor_logs = log.to_mongo().to_dict()
executor_logs.pop('_id')
yield executor_logs

@staticmethod
def get_row_count(bot: str, **kwargs):
"""
Gets the count of rows in a ExecutorLogs for a particular bot.
:param bot: bot id
:return: Count of rows
"""
event_class = kwargs.get("event_class")
task_type = kwargs.get("task_type")
query = {"bot": bot}
if event_class:
query.update({"event_class": event_class})
if task_type:
query.update({"task_type": task_type})
return ExecutorLogs.objects(**query).count()
10 changes: 5 additions & 5 deletions tests/integration_test/action_service_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -8312,7 +8312,7 @@ def test_process_web_search_action():

def _perform_web_search(*args, **kwargs):
assert args == ('What is data?',)
assert kwargs == {'topn': 1, 'website': 'https://www.w3schools.com/'}
assert kwargs == {'topn': 1, 'website': 'https://www.w3schools.com/', 'bot': bot}
return [{
'title': 'Data Science Introduction - W3Schools',
'text': "Data Science is a combination of multiple disciplines that uses statistics, data analysis, and machine learning to analyze data and to extract knowledge and insights from it. What is Data Science? Data Science is about data gathering, analysis and decision-making.",
Expand Down Expand Up @@ -8437,7 +8437,7 @@ def test_process_web_search_action_with_search_engine_url():
status=200,
match=[
responses.matchers.json_params_matcher({
"text": 'What is data science?', "site": '', "topn": 1
"text": 'What is data science?', "site": '', "topn": 1, 'bot': bot
})],
)

Expand Down Expand Up @@ -8545,7 +8545,7 @@ def test_process_web_search_action_with_kairon_user_msg_entity():

def _perform_web_search(*args, **kwargs):
assert args == ('my public search text',)
assert kwargs == {'topn': 2, 'website': None}
assert kwargs == {'topn': 2, 'website': None, 'bot': bot}
return [
{'title': 'What is Data Science? | IBM',
'text': 'Data science combines math, statistics, programming, analytics, AI, and machine learning to uncover insights from data. Learn how data science works, what it entails, and how it differs from data science and BI.',
Expand Down Expand Up @@ -8665,7 +8665,7 @@ def test_process_web_search_action_without_kairon_user_msg_entity():

def _perform_web_search(*args, **kwargs):
assert args == ('/action_public_search',)
assert kwargs == {'topn': 2, 'website': None}
assert kwargs == {'topn': 2, 'website': None, 'bot': bot}
return [
{'title': 'What is Data Science? | IBM',
'text': 'Data science combines math, statistics, programming, analytics, AI, and machine learning to uncover insights from data. Learn how data science works, what it entails, and how it differs from data science and BI.',
Expand Down Expand Up @@ -8785,7 +8785,7 @@ def test_process_web_search_action_dispatch_false():

def _perform_web_search(*args, **kwargs):
assert args == ('What is Python?',)
assert kwargs == {'topn': 1, 'website': None}
assert kwargs == {'topn': 1, 'website': None, 'bot': bot}
return [
{'title': 'Python.org - What is Python? Executive Summary',
'text': 'Python is an interpreted, object-oriented, high-level programming language with dynamic semantics. Its high-level built in data structures, combined with dynamic typing and dynamic binding, make it very attractive for Rapid Application Development, as well as for use as a scripting or glue language to connect existing components together.',
Expand Down
Loading
Loading