-
Notifications
You must be signed in to change notification settings - Fork 13.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-37510] Avoid duplicate calculations for SlidingEventTimeWindows #26319
base: master
Are you sure you want to change the base?
Conversation
LGTM |
ping @aljoscha @StephanEwen cc @1996fanrui @davidradl |
@@ -71,7 +71,8 @@ public Collection<TimeWindow> assignWindows( | |||
if (timestamp > Long.MIN_VALUE) { | |||
List<TimeWindow> windows = new ArrayList<>((int) (size / slide)); | |||
long lastStart = TimeWindow.getWindowStartWithOffset(timestamp, offset, slide); | |||
for (long start = lastStart; start > timestamp - size; start -= slide) { | |||
long lower = timestamp - size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can we add final to this and long lastStart
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, thanks for the improvement!
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably I missed something
however can you please describe what is the issue here and how we could check that it is gone after this commit?
It's obviously. I have added the description. |
Thanks for updating the description I tried to see the difference with help of jmh @BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
@Fork(value = 2, jvmArgs = {"-Xms2G", "-Xmx2G"})
public class FlinkTest {
public static void main(String[] args) throws RunnerException {
Options opt = new OptionsBuilder()
.include(FlinkTest.class.getSimpleName())
.build();
new Runner(opt).run();
}
@State(Scope.Thread)
public static class MyState {
public long offset = 10000;
public long timestamp = 2000;
public long size = 1234;
public long slide = 12;
}
@Benchmark
@BenchmarkMode(Mode.Throughput)
public void current(Blackhole blackhole, MyState state) {
long offset = state.offset;
long timestamp = state.timestamp;
long size = state.size;
long slide = state.slide;
List<TimeWindow> windows = new ArrayList<>((int) (size / slide));
long lastStart = TimeWindow.getWindowStartWithOffset(timestamp, offset, slide);
for (long start = lastStart; start > timestamp - size; start -= slide) {
windows.add(new TimeWindow(start, start + size));
}
blackhole.consume(windows);
}
@Benchmark
@BenchmarkMode(Mode.Throughput)
public void improved(Blackhole blackhole, MyState state) {
long offset = state.offset;
long timestamp = state.timestamp;
long size = state.size;
long slide = state.slide;
List<TimeWindow> windows = new ArrayList<>((int) (size / slide));
final long lastStart = TimeWindow.getWindowStartWithOffset(timestamp, offset, slide);
final long lower = timestamp - size;
for (long start = lastStart; start > lower; start -= slide) {
windows.add(new TimeWindow(start, start + size));
}
blackhole.consume(windows);
}
} my results
for jdk21
so I don't see any significant difference (probably either this optimization has been already done by jdk itself or the rest of the code consumes much more time so the improvement brings so low benefit that it could be neglected) So the question is still same from my side: how can we check that this PR fixes something? Or did I miss anything? |
@snuyanzin The benchmark ignored some questions. Obviously, the overhead of minus is very small. Otherwise, we should doubt the availability of JAVA. Second, The JVM will optimize it after interpretation. |
On the other hand, avoid duplicate expressions is always a best practices know for anybody. |
ping @aljoscha @StephanEwen |
@snuyanzin Could you have time to review again ? This Improvement is just avoid the duplicate calculation which is not have a very strong improvement in performance. |
What is the purpose of the change
This PR aim to avoid duplicate calculations for
SlidingEventTimeWindows
.The original code put the expression
timestamp - size
into the loop, so it causes duplicate calculation.We should put them out of the loop and just calculate once.
Brief change log
Avoid duplicate calculations for
SlidingEventTimeWindows
Verifying this change
This change is already covered by existing tests, such as (SlidingEventTimeWindowsTest).
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: (no)Documentation