GMS Stuck on Bootstrap WaitForSystemUpdateStep #12593

Open
guerremdq opened this issue Feb 11, 2025 · 2 comments

Comments

@guerremdq

Our GMS server gets stuck on WaitForSystemUpdateStep. The only way to continue is to re-apply the Helm chart so that the system-update job runs. This happens randomly every 2-3 weeks.

2025-02-11 12:40:06,241 [main] INFO  c.l.metadata.boot.BootstrapManager:25 - Starting Bootstrap Process...
2025-02-11 12:40:06,242 [main] INFO  c.l.metadata.boot.BootstrapManager:32 - Executing bootstrap step 1/1 with name WaitForSystemUpdateStep...
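
For anyone who wants to detect this state automatically, here is a minimal sketch that parses log lines in the format shown above and reports how long GMS has been sitting on WaitForSystemUpdateStep. It assumes the wait step is the last line in the captured log; the helper names (`parse_step`, `seconds_waiting`) are hypothetical and not part of DataHub.

```python
import re
from datetime import datetime
from typing import Iterable, Optional, Tuple

# Matches lines such as:
# 2025-02-11 12:40:06,242 [main] INFO  c.l.metadata.boot.BootstrapManager:32 - Executing bootstrap step 1/1 with name WaitForSystemUpdateStep...
STEP_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*"
    r"Executing bootstrap step \d+/\d+ with name (?P<name>\w+)\.\.\."
)


def parse_step(line: str) -> Optional[Tuple[datetime, str]]:
    """Return (timestamp, step name) for a bootstrap step line, else None."""
    match = STEP_LINE.match(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
    return timestamp, match.group("name")


def seconds_waiting(log_lines: Iterable[str], now: datetime) -> Optional[float]:
    """Seconds elapsed since the wait step started, if it is the last log line.

    Returns None when the final non-empty line is anything else, because
    that suggests the bootstrap has moved on (a heuristic, not a guarantee).
    """
    non_empty = [line for line in log_lines if line.strip()]
    if not non_empty:
        return None
    step = parse_step(non_empty[-1])
    if step is None or step[1] != "WaitForSystemUpdateStep":
        return None
    return (now - step[0]).total_seconds()
```

A check like this could feed an alert once the wait exceeds a threshold you choose, instead of discovering the stuck pod by hand.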

How can we avoid this?

@belint01

Same issue here. In my case, the GMS process was killed during ingestion.

@belint01

belint01 commented Feb 25, 2025

Getting stuck on WaitForSystemUpdateStep typically means that GMS is waiting for a message from the system-update job confirming that the system update completed successfully. This is a normal step in the DataHub upgrade process: it ensures that all necessary data migrations and updates have been applied before the GMS service starts.
Here are some potential causes and solutions:
1. System Update Job Not Completed: Ensure that the system-update job has run and completed successfully. This job is responsible for performing necessary data migrations and updates. You can check the logs of the system-update job to verify its completion.
2. Kafka Topic Retention: The GMS process relies on messages from the DataHubUpgradeHistory_v1 Kafka topic to determine if the system update has been completed. If the retention period for this topic is too short, the necessary messages might have been deleted. You can adjust the retention settings for this topic to ensure messages are retained long enough for the GMS to process them.
3. Version Mismatch: Ensure that the version of the datahub-upgrade job matches the version of the GMS. A mismatch can cause the GMS to wait indefinitely for a message that corresponds to its version.
4. Configuration Issues: Verify that the environment variables DATAHUB_GMS_HOST and DATAHUB_GMS_PORT are set correctly in your GMS deployment configuration. These should point to the correct host and port where the GMS service is running.
5. Disable System Update Wait: If you are certain that the system update is not necessary for your setup, you can bypass this check by setting the environment variable BOOTSTRAP_SYSTEM_UPDATE_WAIT_FOR_SYSTEM_UPDATE to false. However, this is not recommended unless you are sure that skipping the update will not cause issues.

In my case it was the Kafka topic retention issue.
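
To illustrate the retention point, here is a rough sketch of the arithmetic: if more time has passed since the last system-update run than the topic's `retention.ms` allows, the message GMS is waiting for may already be eligible for deletion when the pod restarts. The function names are hypothetical. The second helper only builds an argument list for Kafka's `kafka-configs` tool; the binary name (`kafka-configs.sh` here) and the broker address are assumptions that depend on your installation.

```python
from datetime import datetime, timedelta
from typing import List

TOPIC = "DataHubUpgradeHistory_v1"  # topic name as given in the comment above


def upgrade_message_expired(last_update: datetime, gms_start: datetime,
                            retention_ms: int) -> bool:
    """True if the upgrade message is older than the retention window.

    retention_ms follows Kafka's `retention.ms` convention, where -1 means
    no time-based deletion. Kafka deletes whole log segments, so an
    "expired" message may still linger for a while; treat this as a hint.
    """
    if retention_ms < 0:
        return False
    return gms_start - last_update > timedelta(milliseconds=retention_ms)


def set_retention_cmd(bootstrap_server: str, retention_ms: int) -> List[str]:
    """Build (but do not run) a kafka-configs command that changes retention.

    Assumes the tool is on PATH as `kafka-configs.sh`; some distributions
    ship it without the `.sh` suffix.
    """
    return [
        "kafka-configs.sh",
        "--bootstrap-server", bootstrap_server,
        "--entity-type", "topics",
        "--entity-name", TOPIC,
        "--alter",
        "--add-config", f"retention.ms={retention_ms}",
    ]
```

With a 7-day retention and a system-update job that last ran two weeks before a GMS restart, `upgrade_message_expired` returns True, which would be consistent with a restart hanging after a couple of weeks. Passing the list from `set_retention_cmd` to `subprocess.run` applies the change; check with your Kafka administrator before altering topic settings.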
