Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check for tserver failure in balance loop after some iterations #5070

Open
wants to merge 4 commits into
base: 2.1
Choose a base branch
from

Conversation

dlmarion
Copy link
Contributor

The StatusThread.balanceTablets method could loop indefinitely in some cases, like when a TabletServer dies but is the target of a current migration. The status of the TabletServers is checked before this method is invoked and never updated. Created a new property to indicate how many iterations should pass before the Tablet Server status is re-checked.

The StatusThread.balanceTablets method could loop indefinitely
in some cases, like when a TabletServer dies but is the target
of a current migration. The status of the TabletServers is
checked before this method is invoked and never updated. Created
a new property to indicate how many iterations should pass before
the Tablet Server status is re-checked.
@dlmarion dlmarion added this to the 2.1.4 milestone Nov 15, 2024
@dlmarion dlmarion self-assigned this Nov 15, 2024

SortedMap<TabletServerId,TServerStatus> statusForBalancerLevel =
tserverStatusForBalancerLevel;
if (attemptNum % checkInterval == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding this check, what about instead replacing the do/while loop with a break out of the for loop. I think the loop is intended to prevent balancing of user tablets when metadata is not balanced. This also seems to prevent other work that a thread that calls balancing is doing. This change redoing a subset of that work here. If the function returned instead of looping it would allow that work to be done.

Wondering if we could do something like the following at this line instead of the while loop.

if(migrationsOutForLevel > 0 && (dl == DataLevel.ROOT || dl == DataLevel.METADATA)){
   // do not continue balancing later levels until this level is balanced.  
   break;
}

Then maybe the outer code/thread will do its thing and eventually call balanceTablets again and maybe get further down the levels.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though we are passing the balancer a set of tablets for the root or metadata table, that doesn't prevent it from returning migrations for other tables. I think the HostRegexTableLoadBalancer will do this here. In this case, we could be operating on the root or metadata level with migrationsOutForLevel being greater than zero. If we exit and call this method again, then I think we end up at the same place.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the HostRegexTableLoadBalancer will do this here.

Ok and following that tablet id map it gets it from here which is coming from this balancerEnv that is only initialized once for the manager process. So when the manager is intending to only balance the root table its not restricting that table id map, so the balancer is completely unaware of this outer goal.

This is making me wonder if we should back out all of the changes for balancing by level in 2.1 for now and try to figure out how to accomplish that goal in a better way. As it stands now the manager code has a goal for only balancing the root or metadata table and the balancer plugin is completely unaware of that goal. Going down the path of making the balancer aware of those goals may introduce more bugs in 2.1. That is why I am wondering if we should back out the changes for now and try to figure out how to balance by level in follow on work

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be that the best way to balance by level in 2.1 is inside a balancer plugin. Not sure, but maybe doing this at the manager level requires changes to the SPI which would be best done in a 2.2, 3.1, or 4.0. Not sure though.

Modified code such that only migrations for the current data level
would be considered and to only recheck tablet server status for
the root and metadata levels. Balancing of tablets returned for
non-current data level will be processed in their data level.
return migrations.stream().filter(m -> {
boolean includeMigration = false;
if (m.getTablet() == null) {
log.error("Balancer gave back a null tablet {}", m);
} else if (DataLevel.of(m.getTablet().getTable()) != level) {
log.warn("Balancer wants to move a tablet outside of the current processing level");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this may be too much as we actually can expect the HostRegex balancer to create migrations outside of the processing level. Should this be debug or trace maybe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 967e0cd

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants