Macro to easily create rate-limited logs #76

Open
fisherdarling (Collaborator) opened this issue Oct 9, 2024 · 0 comments

We added an error log line for an internal lib and wanted to make sure it was only emitted a limited number of times. It could potentially be called on a per-connection basis, so we were concerned about log volume. The initial implementation used a simple atomic counter that capped the line at 1000 invocations. @Noah-Kennedy contributed a patch for a token-bucket based approach, which limits how many invocations can be emitted at once (a burst) but allows an unbounded total over time (no cap).
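
For context, a minimal sketch of the counter-based approach described above; the 1000-line cap matches the description, but the function and names are illustrative rather than the actual internal code:

use std::sync::atomic::{AtomicU64, Ordering};

// Hard cap: once the budget is exhausted, the line is never logged again.
static REMAINING: AtomicU64 = AtomicU64::new(1000);

fn log_capped(message: &str) {
    // checked_sub(1) returns None at zero, so fetch_update fails and we drop the line.
    if REMAINING
        .fetch_update(Ordering::AcqRel, Ordering::Relaxed, |v| v.checked_sub(1))
        .is_ok()
    {
        eprintln!("{message}"); // stand-in for the real log macro
    }
}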

Feature request: can the foundations logging macros add the ability to set up a rate limiter for a log line?

Something like:

log::warn_ratelimited!("failed to process QUIC initial packet"; "error" => e, MAX_BURST=20, RESET_PERIOD=60, TOKENS_PER_PERIOD=3);

The macro would let you configure how quickly the bucket refills and the maximum burst size. It would then expand into something like the redacted implementation below:

/// Use a token bucket to log QUIC initials `Unexpected` errors, but not too quic(k)ly, so that we
/// can get an idea of the verbosity and types of errors we see.
///
/// This token bucket gets additional tokens once per minute, and is allowed to have up to 20
/// tokens in reserve to burst above its normal rate limits if logging is slower than the maximum
/// normal rate.
///
/// Some log investigation shows log rates between 200 lines per minute down to just
/// 1 line in 24h. Hence, we choose a conservative rate of up to 3 lines per minute to strike
/// a balance between added log load and useful results. Based on existing metrics, we expect
/// these types of errors to be very rare.
///
/// The max burst threshold will allow us to potentially burst a bit above if we get a sudden spike
/// of these logs, but not for too long, and only if we already aren't logging too much.
fn log_unexpected_initial_error(e: &std::io::Error) {
    use foundations::telemetry::log;
    use std::cmp::min;
    use std::sync::Once;
    use std::sync::atomic::{AtomicU64, Ordering};
    use std::time::Duration;

    const TOKEN_BUCKET_MAX_BURST: u64 = 20;
    const TOKEN_BUCKET_RESET_PERIOD: Duration = Duration::from_secs(60);
    const TOKENS_PER_PERIOD: u64 = 3;

    // give a full burst initially, just so we can see things better on startup
    static LOG_COUNT: AtomicU64 = AtomicU64::new(TOKEN_BUCKET_MAX_BURST);
    static FILLER_INIT: Once = Once::new();

    async fn bucket_filler() {
        // by default missed tick behavior is burst
        let mut interval = tokio::time::interval(TOKEN_BUCKET_RESET_PERIOD);
        loop {
            interval.tick().await;

            // Skip update if v == TOKEN_BUCKET_MAX_BURST by returning None
            let _ = LOG_COUNT.fetch_update(
                Ordering::AcqRel,
                Ordering::Relaxed,
                |v| (v < TOKEN_BUCKET_MAX_BURST)
                    .then_some(min(v + TOKENS_PER_PERIOD, TOKEN_BUCKET_MAX_BURST))
            );
        }
    }
    // Spawn the refill task exactly once; this requires a running Tokio runtime.
    FILLER_INIT.call_once(|| {
        tokio::spawn(bucket_filler());
    });

    // Try to take a token: fetch_update returns Err (and leaves the count unchanged)
    // when the bucket is empty, in which case the log line is dropped.
    let token_acquired = LOG_COUNT.fetch_update(
        Ordering::AcqRel,
        Ordering::Relaxed,
        |v| (v > 0).then_some(v - 1),
    ).is_ok();
    if token_acquired {
        log::warn!("failed to process QUIC initial packet"; "error" => e);
    }
}
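
If this were built as a macro, it could expand to a per-call-site version of the function above. The following is a hypothetical sketch only: warn_ratelimited!, its parameter names (max_burst, reset_period_secs, tokens_per_period), and the exact invocation syntax are illustrative, not an existing foundations API.

macro_rules! warn_ratelimited {
    (
        $msg:literal; $($key:literal => $val:expr),*;
        max_burst = $burst:literal, reset_period_secs = $period:literal, tokens_per_period = $tokens:literal
    ) => {{
        use std::sync::Once;
        use std::sync::atomic::{AtomicU64, Ordering};

        // One token bucket per call site: these statics live inside this expansion block.
        static TOKENS: AtomicU64 = AtomicU64::new($burst);
        static INIT: Once = Once::new();

        // Lazily spawn the refill task on first use (requires a running Tokio runtime).
        INIT.call_once(|| {
            tokio::spawn(async {
                let mut interval =
                    tokio::time::interval(std::time::Duration::from_secs($period));
                loop {
                    interval.tick().await;
                    // Top up by $tokens per period, saturating at the burst limit.
                    let _ = TOKENS.fetch_update(Ordering::AcqRel, Ordering::Relaxed, |v| {
                        (v < $burst).then_some(std::cmp::min(v + $tokens, $burst))
                    });
                }
            });
        });

        // Only emit the line if a token is available.
        if TOKENS
            .fetch_update(Ordering::AcqRel, Ordering::Relaxed, |v| (v > 0).then_some(v - 1))
            .is_ok()
        {
            use foundations::telemetry::log;
            log::warn!($msg; $($key => $val),*);
        }
    }};
}

// Example invocation under this sketch's syntax:
// warn_ratelimited!(
//     "failed to process QUIC initial packet"; "error" => e;
//     max_burst = 20, reset_period_secs = 60, tokens_per_period = 3
// );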