Skip to content

fix: Continue summarizer on error and add more logging #2220

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

arslanashraf7
Copy link
Contributor

@arslanashraf7 arslanashraf7 commented Apr 30, 2025

What are the relevant tickets?

https://github.com/mitodl/hq/issues/7123

Description (What does it do?)

  • Adds continuation of the command if any content file summarization fails for some reason
  • Adds more logging for the command to print. Both success and error messages with details on which content file failed
  • Fixes an issue with the Keycloak container on an m4 machine ARM64 arch

Screenshots (if appropriate):

  • Desktop screenshots
  • Mobile width screenshots

How can this be tested?\

Testing summarize fix

  • Follow the same instructions mentioned in feat: content summaries and flashcards generation #2118 to be able to set up the summarization feature.
  • To reproduce the error on main branch, you can follow the steps in https://github.com/mitodl/hq/issues/7123 OR do the below
    • Have 4-5 content files set up with some content value
    • Manually reduce the value of [max_tokens](https://github.com/mitodl/mit-learn/blob/main/learning_resources/content_summarizer.py#L238) so that the flashcards response from the LLM would not be complete because of low max_tokens. The JSON response will cut off in the middle and fail the flashcards JSON format validation. See More details in the ticket.
    • When you reduce the max_tokens, make sure you do not set it so low that this fails on the prompt. It should be reasonable enough that the prompt gets through and but their response from LLM breaks.
  • When you have reproduced this issue on main, you will notice that all the files after the failing one will not be processed, and the command will stop.
  • Once you have reproduced the error on the main branch, now shift to this branch and re-run the summarize command again. This time, you will again hit the flashcards Format error because of a low max_tokens value, but the command will not halt, and it should process all the remaining files.
  • At the end, the command should log stats, i.e.
    • Failed file IDs and the error details
    • Succeccful content file Ids

Testing Keycloak fix on ARM64 machine

  • Your Keycloak container should run without any errors (Ideal, if you could check it on a container built fresh)

Additional Context

Copy link

OpenAPI Changes

Show/hide No detectable change.

@arslanashraf7 arslanashraf7 changed the title fix: Add logs and continue summarizer on error fix: Continue summarizer on error and add more logging Apr 30, 2025
@arslanashraf7 arslanashraf7 force-pushed the arslan/7123-summarizer-continue-on-error branch from ed0138d to 0caed68 Compare April 30, 2025 15:47
Copy link

OpenAPI Changes

Show/hide No detectable change.

@@ -101,6 +101,7 @@ services:
environment:
- KEYCLOAK_ADMIN=${KEYCLOAK_SVC_ADMIN:-admin}
- KEYCLOAK_ADMIN_PASSWORD=${KEYCLOAK_SVC_ADMIN_PASSWORD:-admin}
- "_JAVA_OPTIONS=${JAVA_OPTIONS:-}" # Load _JAVA_OPTIONS from env, fallback to empty string
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My Keycloak container was not starting up because of java errors so I had to do the same thing here that I did with opensearch container. #2069 & #2082

Copy link

github-actions bot commented May 2, 2025

OpenAPI Changes

Show/hide No detectable change.

@arslanashraf7 arslanashraf7 force-pushed the arslan/7123-summarizer-continue-on-error branch from 78e7e84 to 50d5f96 Compare May 2, 2025 10:22
Copy link

github-actions bot commented May 2, 2025

OpenAPI Changes

Show/hide No detectable change.

@arslanashraf7 arslanashraf7 marked this pull request as ready for review May 2, 2025 12:36
@arslanashraf7 arslanashraf7 added the Needs Review An open Pull Request that is ready for review label May 2, 2025
@arslanashraf7 arslanashraf7 force-pushed the arslan/7123-summarizer-continue-on-error branch from 50d5f96 to 3926dfe Compare May 2, 2025 12:53
Copy link

github-actions bot commented May 2, 2025

OpenAPI Changes

Show/hide No detectable change.

@shanbady shanbady self-requested a review May 2, 2025 13:23
Copy link
Contributor

@shanbady shanbady left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worked as expected after syncing to set up initial fixtures properly. As discussed it might be worth adding docs on how to set this up or adding some default fixtures for local envs in a separate pr.

@shanbady shanbady added Waiting on author and removed Needs Review An open Pull Request that is ready for review labels May 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants