long wereNotFullyFulfilled = requests.keySet().stream()
        .filter(timestampName -> {
            int requestedTimestamps = requests.get(timestampName);
            long returnedTimestamps = responses.get(timestampName).getFreshTimestamps().getCount();
            if (returnedTimestamps >= requestedTimestamps) {
                return false;
            }
            telemetryTimestampShortFallConsumer.accept(requestedTimestamps - returnedTimestamps);
            return true;
        })
        .count();
This code path will be moved and factored as part of the previous PR reviews, so just review the logic here.
I'll move things later as appropriate
if (returnedTimestamps >= requestedTimestamps) {
    return false;
}
telemetryTimestampShortFallConsumer.accept(requestedTimestamps - returnedTimestamps);
I only want to report shortfall > 0
An alternative is to report shortfall >= 0, or any possible shortfall (or overflow)
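A minimal sketch of the two reporting policies, reusing the requestedTimestamps / returnedTimestamps names from the snippet above (the helper name and consumer type are assumptions, not part of the PR):

static void reportShortfall(
        int requestedTimestamps, long returnedTimestamps, java.util.function.LongConsumer shortfallConsumer) {
    long shortfall = requestedTimestamps - returnedTimestamps;

    // Policy discussed above: only report when we actually fell short.
    if (shortfall > 0) {
        shortfallConsumer.accept(shortfall);
    }

    // Alternative mentioned above: always report, including zero (and negative
    // values if the server over-delivers), so "no shortfall" also produces a
    // data point.
    // shortfallConsumer.accept(shortfall);
}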
Realistically, we care about the count here rather than the histogram. But, it may be needed eventually
Why do we need a histogram?
I think we will struggle with interpreting these. If you care about magnitude, just log it. But I'd just log a binary based on wasFullyFulfilled > 0
and if that turns out to be false and we need more, we'll do that later, but at this point I'm struggling to think why I'd want to see the exact number.
Why do we need a histogram?
More data? Most appropriate would be a meter then? I don't usually like counters because of how they are reported but can use that as well
yeah, just a counter
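For illustration, a counter-based version of this telemetry could look like the sketch below, using Dropwizard Metrics; the metric name and the wiring are assumptions (the real name comes from the metrics YAML in this PR):

import com.codahale.metrics.Counter;
import com.codahale.metrics.MetricRegistry;

final class ShortfallTelemetrySketch {
    private final Counter notFullyFulfilled;

    ShortfallTelemetrySketch(MetricRegistry registry) {
        // Hypothetical metric name for illustration only.
        this.notFullyFulfilled = registry.counter("timestampLease.requestsNotFullyFulfilled");
    }

    void recordShortfall(long shortfall) {
        if (shortfall > 0) {
            // A counter only records that a shortfall happened and how often,
            // not the distribution of magnitudes a histogram would give.
            notFullyFulfilled.inc();
        }
    }
}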
@@ -30,4 +30,10 @@ namespaces:
  type: timer
  docs: observed call duration during leader election

timestampLease:
This will generate a class named the same as the other metrics in this PR. I think that's fine
import javax.annotation.concurrent.GuardedBy;

final class NamedMinTimestampTrackerImpl implements NamedMinTimestampTracker {
    private final String timestampName;
    private final AtomicInteger numLocksHeld;
- Write path: this is modified in lock and unlock calls, which are already synchronized.
- Read path: this will be run in some telemetry wc thread, and the read seems to be a volatile read only. I can't imagine a less perf-impacting read (other than a final field) if we are to have memory visibility. I guess we could have a variable with no synchronization and hope for the best; the metrics don't need to be 100% accurate.
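A minimal sketch of the pattern being described; the class and method names are placeholders, not the PR's actual code:

import java.util.concurrent.atomic.AtomicInteger;

final class NumLocksHeldSketch {
    private final AtomicInteger numLocksHeld = new AtomicInteger();

    // Write path: only called from the already-synchronized lock/unlock methods.
    synchronized void lock() {
        numLocksHeld.incrementAndGet();
    }

    synchronized void unlock() {
        numLocksHeld.decrementAndGet();
    }

    // Read path: a gauge calling this from a telemetry thread performs a single
    // volatile read; no locking is needed for visibility.
    int currentLocksHeld() {
        return numLocksHeld.get();
    }
}

A gauge registered as something like tracker::currentLocksHeld would then perform exactly that volatile read.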
import javax.annotation.concurrent.GuardedBy;

final class NamedMinTimestampTrackerImpl implements NamedMinTimestampTracker {
    private final String timestampName;
    private final AtomicInteger numLocksHeld;
    private final AtomicLong approximateMinLocked;
This one I only update when there are calls to getMinimumTimestamp, as calling firstKey is O(tree height) in the tree map iirc.
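A minimal sketch of that lazy-update idea, using the field and method names quoted in this review (the surrounding class is trimmed down for illustration):

import java.util.Optional;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

final class ApproximateMinSketch {
    private final AtomicLong approximateMinLocked = new AtomicLong();
    private final SortedMap<Long, UUID> holdersByTimestamp = new TreeMap<>();

    synchronized Optional<Long> getMinimumTimestamp() {
        if (holdersByTimestamp.isEmpty()) {
            return Optional.empty();
        }
        long minimum = holdersByTimestamp.firstKey();
        // Only refresh the approximation here, so lock/unlock do not pay the
        // O(tree height) cost of firstKey() on every call; the gauge reads the
        // possibly-stale value from approximateMinLocked.
        approximateMinLocked.set(minimum);
        return Optional.of(minimum);
    }
}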
@GuardedBy("this")
private final SortedMap<Long, UUID> holdersByTimestamp = new TreeMap<>();

NamedMinTimestampTrackerImpl(String timestampName) {
private NamedMinTimestampTrackerImpl(
I also considered adding meters for lock/unlock, as generally those give more signal/granularity than a size metric. But, looking at Meter, updating it involves more CASs etc.
@@ -46,6 +64,7 @@ public synchronized void lock(long timestamp, UUID requestId) {
        SafeArg.of("requestId", requestId),
One thing to note: the code path is now different than before, as this impacts the immutable timestamp. I can special-case it.
Of course, with all the concurrency comments on this PR, I may be too paranoid here.
type: gauge
tags:
  - name
approximateSmallestLeased:
Named it this way to make sure any operator/dev knows this is best-effort and does not over-index on it.
@@ -61,24 +63,38 @@ public final class TimestampLeaseAcquirerImpl implements TimestampLeaseAcquirer
    private final NamespacedTimestampLeaseService delegate;
    private final Unlocker unlocker;
    private final Supplier<UUID> uuidSupplier;
    private final Consumer<Long> telemetryTimestampShortFallConsumer;

    private final Retryer<RequestAndResponse> retryer = RetryerBuilder.<RequestAndResponse>newBuilder()
            .retryIfResult(requestAndResponse -> !wasRequestFullyFulfilled(requestAndResponse))
            .retryIfException()
            .withStopStrategy(StopStrategies.stopAfterAttempt(3))
Should we add some backoff?
Discussed offline that we won't be adding this now
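For reference, if backoff were added later, one option is a wait strategy slotted into the existing guava-retrying builder; the sketch below assumes the PR's RequestAndResponse type and wasRequestFullyFulfilled helper, and the 500ms cap is a placeholder, not an agreed value:

private final Retryer<RequestAndResponse> retryer = RetryerBuilder.<RequestAndResponse>newBuilder()
        .retryIfResult(requestAndResponse -> !wasRequestFullyFulfilled(requestAndResponse))
        .retryIfException()
        // Hypothetical backoff: exponential wait capped at 500ms between attempts
        // (WaitStrategies comes from the same guava-retrying library as StopStrategies;
        // TimeUnit is java.util.concurrent.TimeUnit).
        .withWaitStrategy(WaitStrategies.exponentialWait(500, TimeUnit.MILLISECONDS))
        .withStopStrategy(StopStrategies.stopAfterAttempt(3))
        .build();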
});

long wereNotFullyFulfilled = requests.keySet().stream()
        .filter(timestampName -> {
            int requestedTimestamps = requests.get(timestampName);
Do we validate somewhere that these are non-zero?
No. I can add a precondition check before I start the whole retry block?
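A minimal sketch of such a check, assuming Guava-style Preconditions and the requests map from the snippet above:

// Hypothetical validation before entering the retry block: every requested
// timestamp count must be positive, otherwise the fulfilment check is meaningless.
requests.forEach((timestampName, requestedTimestamps) ->
        Preconditions.checkArgument(
                requestedTimestamps > 0,
                "Expected a positive number of requested timestamps for %s, got %s",
                timestampName,
                requestedTimestamps));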
}

@Override
public synchronized Optional<Long> getMinimumTimestamp() {
    if (holdersByTimestamp.isEmpty()) {
        return Optional.empty();
    }
    return Optional.of(holdersByTimestamp.firstKey());
    Long minimum = holdersByTimestamp.firstKey();
Can we just report it in the same way we do right now in the immutable timestamp case? Just have it be done by Atlas and in the consumer application. Doing this in timelock breaks the usecase here of being able to write dashboard for a particular consumer (or at least maybe makes it more snowflakey/different than how we do all other metrics?)
And in general, I'm less bought in to the idea of this metric right now / care about it the least for the initial milestone, so at least split this out into a separate PR, so we can ship the counter first.
I will spin out a PR only for the shortfall thing
Setting to draft - need to: