Check for tserver failure in balance loop after some iterations #5070
base: 2.1
Conversation
The StatusThread.balanceTablets method could loop indefinitely in some cases, like when a TabletServer dies but is the target of a current migration. The status of the TabletServers is checked before this method is invoked and never updated. Created a new property to indicate how many iterations should pass before the Tablet Server status is re-checked.
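The fix described above can be sketched in isolation. This is a simplified, self-contained illustration of the idea, not the actual Accumulo manager code: the names (`shouldRecheck`, `balance`, `liveServers`, `checkInterval`) are hypothetical stand-ins, and the loop body omits the real migration work.

```java
import java.util.Set;
import java.util.function.Supplier;

// Illustrative sketch (not Accumulo code) of the fix in this PR: re-check
// tserver status every `checkInterval` iterations of the balance loop, so a
// server that dies mid-loop is eventually noticed instead of the loop
// spinning forever on a migration targeting a dead server.
public class BalanceRecheckSketch {

  // True on the iterations where the cached status should be refreshed.
  static boolean shouldRecheck(int attemptNum, int checkInterval) {
    return attemptNum % checkInterval == 0;
  }

  // Loops until the (periodically refreshed) live-server set is empty or
  // maxAttempts is reached; returns the number of iterations performed.
  static int balance(Supplier<Set<String>> liveServers, int checkInterval, int maxAttempts) {
    Set<String> live = liveServers.get(); // snapshot taken before the loop
    int attemptNum = 0;
    while (attemptNum < maxAttempts && !live.isEmpty()) {
      attemptNum++;
      if (shouldRecheck(attemptNum, checkInterval)) {
        // Without this refresh, a server that died after the initial
        // snapshot would keep the loop running until maxAttempts.
        live = liveServers.get();
      }
    }
    return attemptNum;
  }

  public static void main(String[] args) {
    // The server set becomes empty after the first fetch; the loop exits at
    // the first re-check (iteration 3) instead of running all 100 attempts.
    java.util.Iterator<Set<String>> statuses = java.util.List.of(
        Set.of("ts1"), Set.<String>of(), Set.<String>of()).iterator();
    System.out.println(balance(statuses::next, 3, 100)); // prints 3
  }
}
```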
```java
SortedMap<TabletServerId,TServerStatus> statusForBalancerLevel =
    tserverStatusForBalancerLevel;
if (attemptNum % checkInterval == 0) {
```
Instead of adding this check, what about replacing the do/while loop with a break out of the for loop? I think the loop is intended to prevent balancing of user tablets when metadata is not balanced, but it also blocks the other work done by the thread that calls balancing, and this change redoes a subset of that work here. If the function returned instead of looping, it would allow that work to be done.
Wondering if we could do something like the following at this line instead of the while loop.
```java
if (migrationsOutForLevel > 0 && (dl == DataLevel.ROOT || dl == DataLevel.METADATA)) {
  // do not continue balancing later levels until this level is balanced.
  break;
}
```
Then maybe the outer code/thread will do its thing and eventually call balanceTablets again and maybe get further down the levels.
Even though we are passing the balancer a set of tablets for the root or metadata table, that doesn't prevent it from returning migrations for other tables. I think the HostRegexTableLoadBalancer will do this here. In this case, we could be operating on the root or metadata level with migrationsOutForLevel
being greater than zero. If we exit and call this method again, then I think we end up at the same place.
> I think the HostRegexTableLoadBalancer will do this here.
Ok, and following that tablet id map, it gets it from here, which comes from this balancerEnv that is only initialized once for the manager process. So when the manager intends to balance only the root table, it does not restrict that table id map, and the balancer is completely unaware of this outer goal.
This is making me wonder if we should back out all of the changes for balancing by level in 2.1 for now and try to figure out how to accomplish that goal in a better way. As it stands, the manager code has a goal of balancing only the root or metadata table, and the balancer plugin is completely unaware of that goal. Going down the path of making the balancer aware of those goals may introduce more bugs in 2.1. That is why I am wondering if we should back out the changes for now and figure out how to balance by level in follow-on work.
It may be that the best way to balance by level in 2.1 is inside a balancer plugin. Not sure, but maybe doing this at the manager level requires changes to the SPI which would be best done in a 2.2, 3.1, or 4.0. Not sure though.
Modified the code so that only migrations for the current data level are considered, and so that tablet server status is re-checked only for the root and metadata levels. Migrations returned for a non-current data level will be processed when that level is balanced.
```java
return migrations.stream().filter(m -> {
  boolean includeMigration = false;
  if (m.getTablet() == null) {
    log.error("Balancer gave back a null tablet {}", m);
  } else if (DataLevel.of(m.getTablet().getTable()) != level) {
    log.warn("Balancer wants to move a tablet outside of the current processing level");
```
I think this may be too much as we actually can expect the HostRegex balancer to create migrations outside of the processing level. Should this be debug or trace maybe?
Updated in 967e0cd
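The level filter discussed in this thread can be sketched in isolation. The types below (`DataLevel`, `Migration`) are simplified stand-ins for Accumulo's actual classes, and `filterToLevel` is a hypothetical name; the real code additionally logs null tablets and off-level migrations, as shown in the diff above.

```java
import java.util.List;
import java.util.stream.Collectors;

// Simplified sketch of filtering balancer output to the data level currently
// being processed. Not the actual Accumulo manager code; types are stand-ins.
public class LevelFilterSketch {

  enum DataLevel { ROOT, METADATA, USER }

  record Migration(String tabletId, DataLevel level) {}

  // Keep only migrations belonging to the level being balanced. Migrations
  // for other levels are dropped here and picked up when their level runs.
  static List<Migration> filterToLevel(List<Migration> migrations, DataLevel level) {
    return migrations.stream()
        .filter(m -> m.level() == level)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // A balancer like HostRegexTableLoadBalancer may return migrations for
    // tables outside the level under consideration; only USER ones survive.
    List<Migration> all = List.of(
        new Migration("r1", DataLevel.ROOT),
        new Migration("u1", DataLevel.USER),
        new Migration("u2", DataLevel.USER));
    System.out.println(filterToLevel(all, DataLevel.USER).size()); // prints 2
  }
}
```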