Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Close TLS and QUIC links on certificate expiration #1564

Merged
merged 23 commits into from
Nov 19, 2024
Merged
Show file tree
Hide file tree
Changes from 14 commits
Commits
Show all changes
23 commits
Select commit Hold shift + click to select a range
188c460
Add boilerplate for spawning expiration task
oteffahi Oct 22, 2024
032a54b
Add initial implementation of expiration_task
oteffahi Oct 23, 2024
c6b0735
Rework logic to avoid unsoundness of sleep_until with long durations
oteffahi Oct 24, 2024
b4ba3f5
Fix instant resolve of mpsc channel recv after closing it
oteffahi Oct 24, 2024
c1042e7
Integrate expiration task in listen/connect logic
oteffahi Oct 24, 2024
8e35114
Clarify error message
oteffahi Oct 24, 2024
59a3944
Move cert expiration logic to zenoh-link-commons crate, replace mpsc …
oteffahi Oct 25, 2024
fd20c5e
Add cert expiration to quic link
oteffahi Oct 25, 2024
65d3b41
Fix formatting
oteffahi Oct 25, 2024
efb6cb9
Switch design to one expiration task per link instance
oteffahi Nov 8, 2024
05f0bc2
Fix constant name in comment
oteffahi Nov 8, 2024
06d43a3
Add trace log for expiration task
oteffahi Nov 8, 2024
ab0cb1a
Use tokio::sync::OnceCell instead of std::sync::OnceLock
oteffahi Nov 8, 2024
2f4613c
Make close link on certificate expiration configurable from json file
oteffahi Nov 12, 2024
e583532
Fix computation of sleep duration from expiration time
oteffahi Nov 15, 2024
d2e873c
Use Arc::new_cyclic instead of OnceCell, Remove LinkCertExpirationInf…
oteffahi Nov 15, 2024
171ec01
Refactor sleep logic out of expiration task
oteffahi Nov 15, 2024
5c98c32
Move start log from links to expiration task
oteffahi Nov 15, 2024
2f25983
chore: Rename variable
oteffahi Nov 15, 2024
6e2a6e8
Replace String with &'static str
oteffahi Nov 15, 2024
bde2aad
Avoid calling get_cert_chain_expiration_time when expiration is not m…
oteffahi Nov 15, 2024
cb0d6c4
Rework to grant exclusive access to close operation and properly hand…
oteffahi Nov 19, 2024
bdba659
Optimize concurrency of closing link and cancelation of expiration_task
oteffahi Nov 19, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -167,6 +167,7 @@ socket2 = { version = "0.5.7", features = ["all"] }
stop-token = "0.7.0"
syn = "2.0"
tide = "0.16.0"
time = "0.3.36"
token-cell = { version = "1.5.0", default-features = false }
tokio = { version = "1.40.0", default-features = false } # Default features are disabled due to some crates' requirements
tokio-util = "0.7.12"
Expand Down
4 changes: 4 additions & 0 deletions DEFAULT_CONFIG.json5
Original file line number Diff line number Diff line change
Expand Up @@ -466,6 +466,10 @@
// This could be dangerous because your CA can have signed a server cert for foo.com, that's later being used to host a server at baz.com. If you wan't your
// ca to verify that the server at baz.com is actually baz.com, let this be true (default).
verify_name_on_connect: true,
// Whether or not to close links when remote certificates expires.
// If set to true, links that require certificates (tls/quic) will automatically disconnect when the time of expiration of the remote certificate chain is reached
// note that mTLS (client authentication) is required for a listener to disconnect a client on expiration
close_link_on_expiration: false,
},
},
/// Shared memory configuration.
Expand Down
1 change: 1 addition & 0 deletions commons/zenoh-config/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -483,6 +483,7 @@ validated_struct::validator! {
connect_private_key: Option<String>,
connect_certificate: Option<String>,
verify_name_on_connect: Option<bool>,
close_link_on_expiration: Option<bool>,
// Skip serializing field because they contain secrets
#[serde(skip_serializing)]
root_ca_certificate_base64: Option<SecretValue>,
Expand Down
1 change: 1 addition & 0 deletions io/zenoh-link-commons/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@ futures = { workspace = true }
rustls = { workspace = true, optional = true }
rustls-webpki = { workspace = true, optional = true }
serde = { workspace = true, features = ["default"] }
time = { workspace = true }
tokio = { workspace = true, features = [
"fs",
"io-util",
Expand Down
118 changes: 118 additions & 0 deletions io/zenoh-link-commons/src/tls.rs
Original file line number Diff line number Diff line change
Expand Up @@ -86,3 +86,121 @@ impl WebPkiVerifierAnyServerName {
Self { roots }
}
}

pub mod expiration {
use std::sync::Weak;

use time::OffsetDateTime;
use tokio::task::JoinHandle;
use tokio_util::sync::CancellationToken;
use zenoh_protocol::core::Locator;

use crate::LinkUnicastTrait;

#[derive(Debug)]
pub struct LinkCertExpirationManager {
pub token: CancellationToken,
handle: Option<JoinHandle<()>>,
}

impl LinkCertExpirationManager {
pub fn new(
expiration_info: LinkCertExpirationInfo,
token: Option<CancellationToken>,
) -> Self {
let token = token.unwrap_or_default();
let handle = zenoh_runtime::ZRuntime::Acceptor
.spawn(expiration_task(expiration_info, token.clone()));
Self {
token,
handle: Some(handle),
}
}
}

// Cleanup expiration task when manager is dropped
impl Drop for LinkCertExpirationManager {
fn drop(&mut self) {
if let Some(handle) = self.handle.take() {
zenoh_runtime::ZRuntime::Acceptor.block_in_place(async {
self.token.cancel();
let _ = handle.await;
})
}
}
}

async fn expiration_task(expiration_info: LinkCertExpirationInfo, token: CancellationToken) {
// TODO: Expose or tune sleep duration
const MAX_EXPIRATION_SLEEP_DURATION: tokio::time::Duration =
tokio::time::Duration::from_secs(600);

tracing::trace!(
"Expiration task started for link {} => {}",
expiration_info.src_locator,
expiration_info.dst_locator,
);

loop {
let now = OffsetDateTime::now_utc();
if expiration_info.expiration_time <= now {
// close link
if let Some(link) = expiration_info.link.upgrade() {
tracing::warn!(
"Closing link {} => {} : remote certificate chain expired",
expiration_info.src_locator,
expiration_info.dst_locator,
);
if let Err(e) = link.close().await {
tracing::error!(
"Error closing link {} => {} : {}",
expiration_info.src_locator,
expiration_info.dst_locator,
e
)
}
}
break;
}
// next sleep duration is the minimum between MAX_EXPIRATION_SLEEP_DURATION and the duration till next expiration
// this mitigates the unsoundness of using `tokio::time::sleep_until` with long durations
OlivierHecart marked this conversation as resolved.
Show resolved Hide resolved
let next_expiration_duration = std::time::Duration::from_secs_f32(
(expiration_info.expiration_time - now).as_seconds_f32(),
);
let next_wakeup_instant = tokio::time::Instant::now()
+ tokio::time::Duration::min(
MAX_EXPIRATION_SLEEP_DURATION,
next_expiration_duration,
);

tokio::select! {
_ = token.cancelled() => break,
_ = tokio::time::sleep_until(next_wakeup_instant) => {},
}
}
}

pub struct LinkCertExpirationInfo {
// Weak is used instead of Arc, in order to allow cleanup at Drop of the underlying link which owns the expiration manager
link: Weak<dyn LinkUnicastTrait>,
expiration_time: OffsetDateTime,
src_locator: Locator,
dst_locator: Locator,
}

impl LinkCertExpirationInfo {
pub fn new(
link: Weak<dyn LinkUnicastTrait>,
expiration_time: OffsetDateTime,
src_locator: &Locator,
dst_locator: &Locator,
) -> Self {
Self {
link,
expiration_time,
src_locator: src_locator.clone(),
dst_locator: dst_locator.clone(),
}
}
}
}
1 change: 1 addition & 0 deletions io/zenoh-links/zenoh-link-quic/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@ rustls-pemfile = { workspace = true }
rustls-pki-types = { workspace = true }
rustls-webpki = { workspace = true }
secrecy = { workspace = true }
time = { workspace = true }
tokio = { workspace = true, features = [
"fs",
"io-util",
Expand Down
3 changes: 3 additions & 0 deletions io/zenoh-links/zenoh-link-quic/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -112,4 +112,7 @@ pub mod config {

pub const TLS_VERIFY_NAME_ON_CONNECT: &str = "verify_name_on_connect";
pub const TLS_VERIFY_NAME_ON_CONNECT_DEFAULT: bool = true;

pub const TLS_CLOSE_LINK_ON_EXPIRATION: &str = "close_link_on_expiration";
pub const TLS_CLOSE_LINK_ON_EXPIRATION_DEFAULT: bool = false;
}
83 changes: 78 additions & 5 deletions io/zenoh-links/zenoh-link-quic/src/unicast.rs
Original file line number Diff line number Diff line change
Expand Up @@ -21,13 +21,16 @@ use std::{

use async_trait::async_trait;
use quinn::crypto::rustls::{QuicClientConfig, QuicServerConfig};
use tokio::sync::Mutex as AsyncMutex;
use time::OffsetDateTime;
use tokio::sync::{Mutex as AsyncMutex, OnceCell};
use tokio_util::sync::CancellationToken;
use x509_parser::prelude::*;
use x509_parser::prelude::{FromDer, X509Certificate};
use zenoh_core::zasynclock;
use zenoh_link_commons::{
get_ip_interface_names, LinkAuthId, LinkAuthType, LinkManagerUnicastTrait, LinkUnicast,
LinkUnicastTrait, ListenersUnicastIP, NewLinkChannelSender,
get_ip_interface_names,
tls::expiration::{LinkCertExpirationInfo, LinkCertExpirationManager},
LinkAuthId, LinkAuthType, LinkManagerUnicastTrait, LinkUnicast, LinkUnicastTrait,
ListenersUnicastIP, NewLinkChannelSender,
};
use zenoh_protocol::{
core::{EndPoint, Locator},
Expand All @@ -48,6 +51,7 @@ pub struct LinkUnicastQuic {
send: AsyncMutex<quinn::SendStream>,
recv: AsyncMutex<quinn::RecvStream>,
auth_identifier: LinkAuthId,
expiration_manager: OnceCell<LinkCertExpirationManager>,
}

impl LinkUnicastQuic {
Expand All @@ -68,6 +72,7 @@ impl LinkUnicastQuic {
send: AsyncMutex::new(send),
recv: AsyncMutex::new(recv),
auth_identifier,
expiration_manager: OnceCell::new(),
}
}
}
Expand Down Expand Up @@ -259,6 +264,8 @@ impl LinkManagerUnicastTrait for LinkManagerUnicastQuic {
.map_err(|e| zerror!("Can not create a new QUIC link bound to {}: {}", host, e))?;

let auth_id = get_cert_common_name(&quic_conn)?;
let certchain_expiration_time =
get_cert_chain_expiration(&quic_conn)?.expect("server should have certificate chain");

let link = Arc::new(LinkUnicastQuic::new(
quic_conn,
Expand All @@ -269,6 +276,20 @@ impl LinkManagerUnicastTrait for LinkManagerUnicastQuic {
auth_id.into(),
));

if client_crypto.tls_close_link_on_expiration {
// setup expiration manager
let link_trait_object: Arc<dyn LinkUnicastTrait> = link.clone();
let expiration_info = LinkCertExpirationInfo::new(
Arc::downgrade(&link_trait_object),
certchain_expiration_time,
&link.src_locator,
&link.dst_locator,
);
link.expiration_manager
.set(LinkCertExpirationManager::new(expiration_info, None))
.expect("should be the only call to initialize expiration manager");
}

Ok(LinkUnicast(link))
}

Expand Down Expand Up @@ -337,7 +358,15 @@ impl LinkManagerUnicastTrait for LinkManagerUnicastQuic {
let token = token.clone();
let manager = self.manager.clone();

async move { accept_task(quic_endpoint, token, manager).await }
async move {
accept_task(
quic_endpoint,
token,
manager,
server_crypto.tls_close_link_on_expiration,
)
.await
}
};

// Initialize the QuicAcceptor
Expand Down Expand Up @@ -369,6 +398,7 @@ async fn accept_task(
quic_endpoint: quinn::Endpoint,
token: CancellationToken,
manager: NewLinkChannelSender,
tls_close_link_on_expiration: bool,
) -> ZResult<()> {
async fn accept(acceptor: quinn::Accept<'_>) -> ZResult<quinn::Connection> {
let qc = acceptor
Expand Down Expand Up @@ -419,6 +449,9 @@ async fn accept_task(
// Get Quic auth identifier
let auth_id = get_cert_common_name(&quic_conn)?;

// Get certificate chain expiration
let maybe_expiration_time = get_cert_chain_expiration(&quic_conn)?;
oteffahi marked this conversation as resolved.
Show resolved Hide resolved

tracing::debug!("Accepted QUIC connection on {:?}: {:?}", src_addr, dst_addr);
// Create the new link object
let link = Arc::new(LinkUnicastQuic::new(
Expand All @@ -430,6 +463,28 @@ async fn accept_task(
auth_id.into()
));

if tls_close_link_on_expiration {
// setup expiration manager if applicable
if let Some(certchain_expiration_time) = maybe_expiration_time {
let link_trait_object: Arc<dyn LinkUnicastTrait> = link.clone();
let expiration_info = LinkCertExpirationInfo::new(
Arc::downgrade(&link_trait_object),
certchain_expiration_time,
&link.src_locator,
&link.dst_locator,
);
link.expiration_manager
.set(LinkCertExpirationManager::new(expiration_info, Some(token.child_token())))
.expect("should be the only call to initialize expiration manager");
} else {
tracing::warn!(
"Cannot monitor expiration for link {} -> {} : client does not have certificates",
link.src_locator,
link.dst_locator
);
}
}

// Communicate the new link to the initial transport manager
if let Err(e) = manager.send_async(LinkUnicast(link)).await {
tracing::error!("{}-{}: {}", file!(), line!(), e)
Expand Down Expand Up @@ -475,6 +530,24 @@ fn get_cert_common_name(conn: &quinn::Connection) -> ZResult<QuicAuthId> {
Ok(auth_id)
}

/// Returns the minimum value of the `not_after` field in the remote certificate chain.
/// Returns `None` if the remote certificate chain is empty
fn get_cert_chain_expiration(conn: &quinn::Connection) -> ZResult<Option<OffsetDateTime>> {
let mut link_expiration: Option<OffsetDateTime> = None;
if let Some(pi) = conn.peer_identity() {
if let Ok(remote_certs) = pi.downcast::<Vec<rustls_pki_types::CertificateDer>>() {
for cert in *remote_certs {
let (_, cert) = X509Certificate::from_der(cert.as_ref())?;
let cert_expiration = cert.validity().not_after.to_datetime();
link_expiration = link_expiration
.map(|current_min| current_min.min(cert_expiration))
.or(Some(cert_expiration));
}
}
}
Ok(link_expiration)
}

#[derive(Debug, Clone)]
struct QuicAuthId {
auth_value: Option<String>,
Expand Down
Loading