-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added a config to limit the code-gen and class loading #83
base: master
Are you sure you want to change the base?
Conversation
It sets the maximum number of fast SerDes classes generated and loaded. It helps to limit the metaspace and codecache usage brought by fast-avro runtime code-gen and class loading. The limit config can be set via FastSerdeCache constructors.
2510845
to
5a79fa9
Compare
Codecov Report
@@ Coverage Diff @@
## master #83 +/- ##
=========================================
Coverage 53.42% 53.42%
Complexity 275 275
=========================================
Files 39 39
Lines 1662 1662
Branches 206 206
=========================================
Hits 888 888
Misses 692 692
Partials 82 82 Continue to review full report at Codecov.
|
} else { | ||
LOGGER.warn("Loaded fast serdes classes number {}, with limit set to {}", loadClassNum, loadClassLimit); | ||
} | ||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it will be clearer to just throw the exception instead of returning null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will fix it.
* @param limit | ||
* custom number {@link #generatedFastSerDesLimit} | ||
*/ | ||
public FastSerdeCache(int limit) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to add other two constructors? Maybe this one is good enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No harm to add these two constructors :) For exmaple, I am using FastSerdeCache(Executor executorService, int limit)
in the test.
T result = null; | ||
if (this.generatedSerDesNum.get() < this.generatedFastSerDesLimit) { | ||
result = supplier.get(); | ||
} else if (this.generatedSerDesNum.get() == this.generatedFastSerDesLimit) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am seeing there are two counters, which are being maintained independently in FastSerdeCache and FastSerdeBase, and I think we should combine and just use one.
The idea in my mind:
- Maintaining the counter and limit in FastSerdeCache.
- Passing the limit enforcement function to FastSerdeBase via all the Generator classes.
- The limit enforcement function can be this way:
private int generatedSerDeNum;
private int generatedFastSerDeLimit;
Predicate<Boolean> limitPredicate = (whetherIncrementCounter) -> {
synchronized (this) {
if (generatedSerDeNum >= generatedFastSerDeLimit) {
return false;
}
if (whetherIncrementCounter) {
++generatedSerDeNum;
}
return true;
}
};
- In FastSerdeCache and all kinds of generators before the fast class generation, we should call
limitPredicate(false)
, and fail fast when this predicate returns false. - In FastSerdeBase, before loading the new generated class, we should call
limitPredicate(true)
, and fail fast if the predicate return false.
Essentially, the idea is to keep the counting logic in a single place to make it consistent.
Invoking this function in FastSerdeCache and various Generators is to try to avoid useless work.
Let me know if you want to have a sync up about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed explaination!
This approach indeed helps to reduce two counters to one. However, it also leads to redundant code-gens and compilations which are CPU intensive tasks. IIUC, there's no limit for the code-gen and compilation before we really load generatedFastSerDeLimit
number of fast classes. So the corner case is fast-avro may generate and compile a great number of fast classes, and then throw them away.
For the current implementation, we at most generate extra N - 1 fast classes, N is the threads number of FastSerdeCache Executor. So I think the current implementation is better. What do you think?
The config sets the maximum number of fast SerDes classes
generated and loaded. It helps to limit the metaspace and codecache
usage brought by fast-avro runtime code-gen and class loading.
The limit config can be set via FastSerdeCache constructors.