-
Notifications
You must be signed in to change notification settings - Fork 244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize Catalog-Endpoint Performance #4499
Comments
could you please provide a detailed benchmark trying to highlight after how many contract definitions the performances start to degrade and what's the time factor? EDIT: it was a typo, I was referring to contract definitions |
I 100% agree with what @ndr-brt said - provide speeds and feeds, otherwise it's all conjecture. Querying with a If anything, I would think about a local policy cache, i.e. a But again, we can't do anything without concrete numbers. |
Thank you for your prompt responses. The performance degradation is not tied to any contract negotiation process, but rather scales with the number of To illustrate this, consider the following example: resolving a policy by its ID from the database takes approximately 1ms. With 100,000 |
first, is that an actual measurement? if so, can you share the results? like I said, a local policy cache would be a good option, but unless you provide hard facts (not fictitious numbers) there is nothing we can do. edit: database queries do not scale linearly, and are influenced by many factors, so I think your calculation is a bit too simplistic. we should base our implementation on facts, not conjecture. |
it was a typo 🤭
as we already stated, please provide more details. (btw: if you have 100.000 definitions, is very likely that you have modeled them in the wrong way. read: the pattern 1 asset/1 policy/1 contract definition is definitely not correct and needs to be avoided) |
I just wanted to give an example If I understand correctly, you suggest that the response time for the catalog is acceptable when dealing with a reasonable number of I don't know what you think is a reasonable number, but even if there are just ~100 Wouldn't you agree that there is still potential for optimization here? |
What we are asking for is a benchmark. How do you know the response time will be reduced by ~100ms? It certainly may be the case, but on what basis is this asserted? What is the test case, how was the database and other infrastructure configured, and what were the actual numbers? Did you profile the behavior? What bottlenecks were observed? |
Guys, I believe we do not need a benchmark to understand that any N+1 select problem on a highly used and visited endpoint such as the catalog, that scales over any of the EDC entities, is an issue. I understand the challenges the EDC faces with its design decision to split the stores with accessing even its own DB. I believe we can still find a good solution by using batched DB queries. The following is a common pattern I have met in many JPA code bases, as JPA code bases also often face N+1 select issues: var policyIds = contractDefinitions.stream()
.map(it -> it.getPolicyId())
.collect(toSet());
var policiesById = policyService.findByIdIn(policyIds).stream()
.collect(toMap(it -> it.getId(), Function.identity()))
return contractDefinitions.stream()
.map(contractDefinition -> {
var policy = policiesById.get(contractDefinition.getPolicyId());
return buildContractOffer(contractDefinition, policy);
})
.toList(); Trying to change EDC usage also does not help here:
|
Three committers disagree with you and are in agreement that we do need a benchmark and hard facts.
one has been offered, in the form of a local policy cache. Fetching the policies in batch for all the contract defs (that are materialized in memory anyway) is just a way of populating the cache.
This is not JPA.
measurements? benchmarks? timing results? what does the deployment situation look like? is this geo-colocated? is database caching enabled? also, it is likely the connection establishment that takes time, and connections are pooled.
this has nothing to do with offers To be honest, this is starting to annoy me. We have asked for hard data, which you seem incapable of or unwilling to provide. We even offered a solution, that would alleviate this supposed problem, while keeping our entity data model and object model intact. It seems I have to be blunt: unless you clearly demonstrate a problem, supported by hard facts (benchmarks, profiling data, timings, etc.), we will not invest any more time in this. |
Other than agreeing with @paullatzelsperger , taking a look at the code, from a "simplicity" point of view, I see that the I like the ids idea brought by @richardtreier but, really, we need data to take a decision that will impact countless adopters |
that was exactly my point. |
Please note, our facts regarding this issue are not purely related to this issue, hence the hesitation.
We have a connector running in production that was running into timeouts of our reverse proxy and this is where our investigation began.
We wanted to only raise the N+1 select issue here, since we need time to fully test things in 0.7. From my perspective, the fact that the catalog query by the asset ID that excludes the asset N+1 select issue still takes 1-5s was a sign that this issue was worth writing. Regarding N+1 Select problems generally:
Having a hundred of something really happens too easily. And even if it happens, even if by accident, you'd really notice. Thus they are a nice low hanging fruit to take care of, with the rule of thumb, not to call the DB in any loop. Whether it is a |
@richardtreier thanks for your efforts, but it goes without saying that we will require those numbers based on The fix, if one will be created, will only land on the |
Awesome, thanks for incorparating those changes! |
Feature Request
The performance of the Catalog-Endpoint degrades as the number of
ContractDefinitions
increases. This appears to be due to an n+1 query issue within theContractDefinitionResolverImpl
. Currently, for eachContractDefinition
, the corresponding access policy is resolved by its ID, resulting in a separate database query for each policy.This issue seems to persists in the
main
branch. The relevant code can be found here:Connector/core/control-plane/control-plane-contract/src/main/java/org/eclipse/edc/connector/controlplane/contract/offer/ContractDefinitionResolverImpl.java
Lines 72 to 89 in dfa1be6
Which Areas Would Be Affected?
Why Is the Feature Desired?
As the number of
ContractDefinitions
grows, the Catalog-Endpoint experiences significant performance degradation due to the increasing number of database queries.Solution Proposal
Implement a database join to retrieve all relevant policies in a single query, reducing the number of queries executed.
The text was updated successfully, but these errors were encountered: