-
Notifications
You must be signed in to change notification settings - Fork 225
Add Request Timeouts #1431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add Request Timeouts #1431
Conversation
api/iceberg-service/build.gradle.kts
Outdated
@@ -45,6 +45,9 @@ dependencies { | |||
implementation("com.fasterxml.jackson.core:jackson-annotations") | |||
implementation("com.fasterxml.jackson.core:jackson-core") | |||
implementation("com.fasterxml.jackson.core:jackson-databind") | |||
|
|||
compileOnly(platform(libs.quarkus.bom)) | |||
compileOnly("io.quarkus:quarkus-smallrye-fault-tolerance") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The one doubt I have about this PR is whether we can safely add these dependencies here -- or what the alternative would be if that's not recommended.
cc @snazy / @dimas-b / @singhpk234
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah... Why do we need this dep? @RichardLiu2001 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify: Even if we go with server-side timeouts, I believe this control should be applied to implementations, not to API definitions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add more detail here @dimas-b? Within dropwizard, we used a filter for this because that seemed like the best narrow waist for a top-level timeout like this. How can we cover all of the APIs if we push this into implementations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this dep?
This dep is so that IcebergRestCatalogApi
can compile.
believe this control should be applied to implementations, not to API definitions
Why? Is it to avoid adding a Quarkus dependency in the API? If so, what is wrong with that?
We also have @Timed
at the API level. Why is that acceptable but @Timeout
isn't?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Timed
is non-intrusive, hence less of a concern, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Compile-only deps should not leak to POM, I guess.
In the end, since API is generated and implementations are meant to be used together with the generated API, I guess it does not make a lot of different from the maintenance POV where the annotation is places.
Sorry about the false alarm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to not import Quarkus but rather only the dependency that defines the annotation: org.eclipse.microprofile.fault-tolerance:microprofile-fault-tolerance-api
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, cf. #1431 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced the dependency to use just microprofile-fault-tolerance-api
rather than bringing in the Quarkus.
@@ -108,6 +110,7 @@ public class {{classname}} { | |||
@Produces({ {{#produces}}"{{{mediaType}}}"{{^-last}}, {{/-last}}{{/produces}} }){{/hasProduces}}{{#hasAuthMethods}} | |||
{{#authMethods}}{{#isOAuth}}@RolesAllowed("**"){{/isOAuth}}{{/authMethods}}{{/hasAuthMethods}} | |||
@Timed("{{metricsPrefix}}.{{baseName}}.{{nickname}}") | |||
@Timeout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is the timeout actually defined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this change is beneficial... Clients should control their own timeouts (in general).
If this annotation cancels the related HTTP request, will it actually cancel the server-side processing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Within Open Catalog, server-side timeouts have been very useful for protecting the service from very long-running requests. Clients can of course have their own timeouts, but I think server side timeouts are very beneficial.
Agreed that this only makes sense if the server-side processing actually gets cancelled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is the timeout actually defined?
Quarkus configs - I added a default value of 60s to the default application.properties
.
will it actually cancel the server-side processing?
The API method will throw a TimeoutException
which gets appropriately mapped in the IcebergExceptionMapper
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The API method will throw a TimeoutException
That is clear :) What happens to the server threads that are still executing the request that timed out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After reading the docs :) I guess the server thread will get a simple java interrupt, which does look useful. Sorry, I should have paid more attention to this PR initially 🤦
It LGTM (assuming longer default timeput), but let's collect some more reviews, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default is 1 second. To disable the Timeout, we can disable Quarkus fault tolerance which I did in application.properties
.
@@ -86,6 +86,9 @@ quarkus.otel.sdk.disabled=true | |||
|
|||
quarkus.test.integration-test-profile=it | |||
|
|||
quarkus.fault-tolerance.global.timeout.unit=seconds | |||
quarkus.fault-tolerance.global.timeout.value=60 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO, 60 sec default timeout is too low. I believe we should set the default to allow most calls. Administrators concerned with overload, will have to lower this config based on their specific requirements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, maybe the default can be something like 600s or even disabled (is that possible?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to no timeout by default (if possible), otherwise 600 sec looks reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disabled it via application.properties
.
api/iceberg-service/build.gradle.kts
Outdated
@@ -45,6 +45,9 @@ dependencies { | |||
implementation("com.fasterxml.jackson.core:jackson-annotations") | |||
implementation("com.fasterxml.jackson.core:jackson-core") | |||
implementation("com.fasterxml.jackson.core:jackson-databind") | |||
|
|||
compileOnly(platform(libs.quarkus.bom)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we do not need the Quarkus BOM here. Could we add a specific local version for org.eclipse.microprofile.fault-tolerance:microprofile-fault-tolerance-api
to the TOML file? Since the dep is compile-only, version mismatches should not be a problem. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this was my main concern initially, but I did not property research and articulate it 🤦
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a version for microprofile-fault-tolerance-api
to the toml file, which is the same version brought in by the current version of Quarkus.
@@ -73,6 +73,7 @@ javax-servlet-api = { module = "javax.servlet:javax.servlet-api", version = "4.0 | |||
junit-bom = { module = "org.junit:junit-bom", version = "5.12.2" } | |||
logback-classic = { module = "ch.qos.logback:logback-classic", version = "1.5.18" } | |||
micrometer-bom = { module = "io.micrometer:micrometer-bom", version = "1.14.6" } | |||
microprofile-fault-tolerance-api = { module = "org.eclipse.microprofile.fault-tolerance:microprofile-fault-tolerance-api", version = "4.1.1" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used the microprofile-fault-tolerance-api
version that is brought in by the current version of Quarkus.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we instead rely on the Quarkus BOM for this to prevent conflict? i.e. we would import it as:
dependencies {
implementation(enforcedPlatform(libs.quarkus.bom))
implementation("org.eclipse.microprofile.fault-tolerance:microprofile-fault-tolerance-api")
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We never use microprofile-fault-tolerance-api
directly as an implementation dependency. We just bring it in as a compile time dependency so that the files in the api / service modules can import the timeout classes and avoid bringing in all of the quarkus dependencies. The runtime dependency will be brought in the quarkus/service module via
implementation("io.quarkus:quarkus-smallrye-fault-tolerance")
which does use the Quarkus BOM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes sorry the implementation
part was not really my focus, I suppose it would look like:
compileOnly(enforcedPlatform(libs.quarkus.bom))
compileOnly("org.eclipse.microprofile.fault-tolerance:microprofile-fault-tolerance-api")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does bringing in the Quarkus BOM prevent conflict? As none of the places where I added the compile time dependency bring in Quarkus, what would it conflict with?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Compile-only deps have very little effect downstream, if I'm not mistaken. The only tricky situation I can imagine is when annotation packages or class names change... which is very unlikely for a mature library.
I do not think pulling Quarkus into API modules is justified for that.
During the server build, Quarkus will (should) bump all versions according to its platform deps anyway.
However, we should probably have a Quarkus integration test that overrides the timeout to a low value and validates that requests do get cancelled. This should validate proper annotation handling (in case of class name changes).
@@ -42,6 +42,8 @@ dependencies { | |||
compileOnly("io.micrometer:micrometer-core") | |||
|
|||
implementation(libs.slf4j.api) | |||
|
|||
compileOnly(libs.microprofile.fault.tolerance.api) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe move up for this to be together with other annotation deps?
@@ -73,6 +73,7 @@ javax-servlet-api = { module = "javax.servlet:javax.servlet-api", version = "4.0 | |||
junit-bom = { module = "org.junit:junit-bom", version = "5.12.2" } | |||
logback-classic = { module = "ch.qos.logback:logback-classic", version = "1.5.18" } | |||
micrometer-bom = { module = "io.micrometer:micrometer-bom", version = "1.14.6" } | |||
microprofile-fault-tolerance-api = { module = "org.eclipse.microprofile.fault-tolerance:microprofile-fault-tolerance-api", version = "4.1.1" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Compile-only deps have very little effect downstream, if I'm not mistaken. The only tricky situation I can imagine is when annotation packages or class names change... which is very unlikely for a mature library.
I do not think pulling Quarkus into API modules is justified for that.
During the server build, Quarkus will (should) bump all versions according to its platform deps anyway.
However, we should probably have a Quarkus integration test that overrides the timeout to a low value and validates that requests do get cancelled. This should validate proper annotation handling (in case of class name changes).
@@ -86,6 +86,8 @@ quarkus.otel.sdk.disabled=true | |||
|
|||
quarkus.test.integration-test-profile=it | |||
|
|||
quarkus.fault-tolerance.enabled=false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM, but I'd still prefer to keep actual timeout properties as an example too (perhaps commented out)
@@ -90,6 +90,8 @@ dependencies { | |||
implementation("com.azure:azure-storage-blob") | |||
implementation("com.azure:azure-storage-file-datalake") | |||
|
|||
compileOnly(libs.microprofile.fault.tolerance.api) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be an implementation
dep because the exception mapper references these classes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure
@@ -98,6 +100,8 @@ dependencies { | |||
|
|||
testImplementation(libs.logback.classic) | |||
|
|||
testImplementation(libs.microprofile.fault.tolerance.api) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is needed for the IcebergExceptionMapperTest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for this @RichardLiu2001
@Timeout
to request endpoints, configured with a default of 60 seconds.