Track token usage of iris requests #165

Merged on Oct 23, 2024 (20 commits)
Commits
fc44738
Add token usage monitoring in exercise chat and send to Artemis
alexjoham Oct 1, 2024
26e3873
Add Pipeline enum for better tracking
alexjoham Oct 11, 2024
aa50faf
Update tokens location, add token tracking to competency and chat pipe
alexjoham Oct 11, 2024
9905460
added first versions for tracking for smaller pipelines
alexjoham Oct 11, 2024
e241d45
Fix lint errors
alexjoham Oct 11, 2024
4502e30
Fix last lint error
alexjoham Oct 11, 2024
3b81a30
Fix lint errors
alexjoham Oct 11, 2024
74b1239
Merge remote-tracking branch 'origin/feature/track-usage-of-iris-requ…
alexjoham Oct 11, 2024
6bcb002
Merge branch 'main' into track-token-usage
alexjoham Oct 11, 2024
4324180
Add token cost tracking for input and output tokens
alexjoham Oct 12, 2024
c9e89be
Update token handling as proposed by CodeRabbit
alexjoham Oct 12, 2024
4c92900
Update PyrisMessage to use only TokenUsageDTO, add token count for error
alexjoham Oct 12, 2024
6bd4b33
Fix competency extraction did not save Enum
alexjoham Oct 12, 2024
c79837d
Merge branch 'main' into track-token-usage
alexjoham Oct 15, 2024
4d61c85
Update code after merge
alexjoham Oct 15, 2024
3253c46
Make -1 default value if no tokens have been received
alexjoham Oct 16, 2024
9fe9e0a
Update DTO for new Artemis table
alexjoham Oct 19, 2024
13c5db1
Change number of tokens if error to 0, as is standard by OpenAI & Ollama
alexjoham Oct 23, 2024
dd504fc
Fix token usage list append bug
bassner Oct 23, 2024
043264a
Fix formatting
bassner Oct 23, 2024
Merge branch 'main' into track-token-usage
alexjoham committed Oct 15, 2024
commit c79837d2ff64c9cb6cb0fdfd20652f5a507db5ea
18 changes: 14 additions & 4 deletions app/llm/external/openai_chat.py
@@ -127,10 +127,20 @@ def chat(
  temperature=arguments.temperature,
  max_tokens=arguments.max_tokens,
  )
- return convert_to_iris_message(
- response.choices[0].message, response.usage, response.model
- )
- except Exception as e:
+ choice = response.choices[0]
+ if choice.finish_reason == "content_filter":
+ # I figured that an openai error would be automatically raised if the content filter activated,
+ # but it seems that that is not the case.
+ # We don't want to retry because the same message will likely be rejected again.
+ # Raise an exception to trigger the global error handler and report a fatal error to the client.
+ raise ContentFilterFinishReasonError()
+ return convert_to_iris_message(choice.message)
+ except (
+ APIError,
+ APITimeoutError,
+ RateLimitError,
+ InternalServerError,
+ ):
  wait_time = initial_delay * (backoff_factor**attempt)
  logging.exception(f"OpenAI error on attempt {attempt + 1}:")
  logging.info(f"Retrying in {wait_time} seconds...")
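
The hunk above narrows the retry loop so that only transient API errors are retried, while a content-filter rejection is raised and left to propagate. The following is a minimal, self-contained sketch of that pattern, not the code from this PR: `TransientError`, `ContentFilterError`, `call_with_backoff`, the dict-shaped response, and the injectable `sleep` parameter are all hypothetical stand-ins so the example runs without the OpenAI client. The variable names `initial_delay`, `backoff_factor`, and `attempt` are taken from the diff.

```python
import logging
import time
from typing import Callable, Optional


class TransientError(Exception):
    """Hypothetical stand-in for retryable errors (timeouts, rate limits)."""


class ContentFilterError(Exception):
    """Hypothetical stand-in for a content-filter rejection; never retried."""


def call_with_backoff(
    request: Callable[[], dict],
    retries: int = 5,
    initial_delay: float = 1.0,
    backoff_factor: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `request`, retrying transient failures with exponential backoff."""
    last_error: Optional[Exception] = None
    for attempt in range(retries):
        try:
            response = request()
            if response.get("finish_reason") == "content_filter":
                # Raised inside the try block but not listed in the except
                # clause below, so it propagates to the caller immediately.
                raise ContentFilterError("response blocked by content filter")
            return response
        except TransientError as error:
            last_error = error
            # Delay grows geometrically: initial_delay, then * backoff_factor, ...
            wait_time = initial_delay * (backoff_factor**attempt)
            logging.exception(f"Transient error on attempt {attempt + 1}:")
            logging.info(f"Retrying in {wait_time} seconds...")
            sleep(wait_time)
    raise RuntimeError(f"Request failed after {retries} attempts") from last_error
```

Catching a specific tuple of exception types instead of a bare `except Exception` is what keeps the content-filter error out of the retry path: the same message would likely be rejected again, so retrying it only adds latency.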
2 changes: 1 addition & 1 deletion app/web/status/status_update.py
@@ -6,7 +6,7 @@
  from abc import ABC

  from app.common.token_usage_dto import TokenUsageDTO
- from ...domain.status.competency_extraction_status_update_dto import (
+ from app.domain.status.competency_extraction_status_update_dto import (
  CompetencyExtractionStatusUpdateDTO,
  )
  from app.domain.chat.course_chat.course_chat_status_update_dto import (
You are viewing a condensed version of this merge commit. You can view the full changes here.