Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Impl Serde for PackedPatternsV1; export Serde impls for PluralElements #5592

Closed
wants to merge 5 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions components/datetime/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -88,6 +88,7 @@ datagen = [
"dep:litemap",
"icu_calendar/datagen",
"icu_timezone/datagen",
"icu_plurals/datagen",
"serde",
"std",
]
Expand Down
118 changes: 114 additions & 4 deletions components/datetime/src/provider/packed_pattern.rs
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,8 @@ use crate::pattern::runtime::Pattern;

/// A field of [`PackedPatternsBuilder`].
#[derive(Debug, Clone, PartialEq, Eq)]
#[cfg_attr(feature = "serde", derive(serde::Deserialize))] // human-readable only
#[cfg_attr(feature = "datagen", derive(serde::Serialize))] // human-readable only
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You say this, but there's nothing stopping anyone from using this directly in a data struct, or with a non-human-readable format.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a type exported in a provider module.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't stop anyone from using it in a data struct

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm closing this PR based on the other comments, but I don't think I agree with the premise "That doesn't stop anyone from using it in a data struct": I think it is our duty to help prevent common bugs in clients of ICU4X, but it seems wrong to restrict ICU4X development for the sake of preventing ICU4X developers from doing that. We have a high standard for code with thorough review processes and CI in place.

pub struct LengthPluralElements<T> {
/// The "long" length pattern plural elements.
pub long: PluralElements<T>,
Expand All @@ -31,12 +33,20 @@ pub struct LengthPluralElements<T> {

/// A builder for a [`PackedPatternsV1`].
#[derive(Debug, Clone, PartialEq, Eq)]
#[cfg_attr(feature = "serde", derive(serde::Deserialize))] // human-readable only
#[cfg_attr(feature = "datagen", derive(serde::Serialize))] // human-readable only
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think a builder type should implement serde.

pub struct PackedPatternsBuilder<'a> {
/// Patterns always available.
#[cfg_attr(feature = "serde", serde(borrow))]
#[cfg_attr(feature = "serde", serde(flatten))]
pub standard: LengthPluralElements<Pattern<'a>>,
/// Patterns for variant 0. If `None`, falls back to standard.
#[cfg_attr(feature = "serde", serde(borrow))]
#[cfg_attr(feature = "serde", serde(skip_serializing_if = "Option::is_none"))]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only works for human-readable formats, which is not enforced anywhere.

pub variant0: Option<LengthPluralElements<Pattern<'a>>>,
/// Patterns for variant 1. If `None`, falls back to standard.
#[cfg_attr(feature = "serde", serde(borrow))]
#[cfg_attr(feature = "serde", serde(skip_serializing_if = "Option::is_none"))]
pub variant1: Option<LengthPluralElements<Pattern<'a>>>,
}

Expand Down Expand Up @@ -362,6 +372,61 @@ impl PackedPatternsV1<'_> {
}
}

#[cfg(feature = "serde")]
mod _serde {
use super::*;
use zerovec::VarZeroSlice;

#[cfg(feature = "serde")]
#[derive(serde::Deserialize)]
#[cfg_attr(feature = "datagen", derive(serde::Serialize))]
struct PackedPatternsMachine<'data> {
pub header: u32,
#[serde(borrow)]
pub elements: &'data VarZeroSlice<PluralElementsPackedULE<ZeroSlice<PatternItem>>>,
}

impl<'de, 'data> serde::Deserialize<'de> for PackedPatternsV1<'data>
where
'de: 'data,
{
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: serde::Deserializer<'de>,
{
if deserializer.is_human_readable() {
let builder = <PackedPatternsBuilder>::deserialize(deserializer)?;
Ok(builder.build())
} else {
let machine = <PackedPatternsMachine>::deserialize(deserializer)?;
Ok(Self {
header: machine.header,
elements: machine.elements.as_varzerovec(),
})
}
}
}

#[cfg(feature = "datagen")]
impl serde::Serialize for PackedPatternsV1<'_> {
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
if serializer.is_human_readable() {
let builder = self.to_builder();
builder.serialize(serializer)
} else {
let machine = PackedPatternsMachine {
header: self.header,
elements: &self.elements,
};
machine.serialize(serializer)
}
}
}
}

#[cfg(test)]
pub mod tests {
use super::*;
Expand All @@ -379,16 +444,20 @@ pub mod tests {
"y MMMM",
];

#[test]
fn test_basic() {
let patterns = PATTERN_STRS
fn get_patterns() -> Vec<Pattern<'static>> {
PATTERN_STRS
.iter()
.map(|s| {
s.parse::<reference::Pattern>()
.unwrap()
.to_runtime_pattern()
})
.collect::<Vec<_>>();
.collect()
}

#[test]
fn test_basic() {
let patterns = get_patterns();
let mut it = patterns.iter().cloned();
let lms0 = LengthPluralElements {
long: PluralElements::new(it.next().unwrap()),
Expand Down Expand Up @@ -501,4 +570,45 @@ pub mod tests {
assert_eq!(builder, recovered_builder);
}
}

#[cfg(feature = "datagen")]
#[test]
fn test_serde() {
let patterns = get_patterns();
let lms0a = LengthPluralElements {
long: PluralElements::new(patterns[0].clone()),
medium: PluralElements::new(patterns[0].clone()),
short: PluralElements::new(patterns[1].clone()),
};
let lms1 = LengthPluralElements {
long: PluralElements::new(patterns[3].clone()),
medium: PluralElements::new(patterns[4].clone()),
short: PluralElements::new(patterns[5].clone()),
};

let builder = PackedPatternsBuilder {
standard: lms0a,
variant0: Some(lms1),
variant1: None,
};
let packed = builder.clone().build();

let bincode_bytes = bincode::serialize(&packed).unwrap();
assert_eq!(
bincode_bytes.as_slice(),
&[
26, 11, 0, 0, 76, 0, 0, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 16, 0, 26, 0, 30, 0, 46,
0, 0, 128, 32, 1, 0, 0, 47, 128, 64, 1, 0, 0, 47, 128, 16, 1, 2, 128, 114, 2, 0, 0,
58, 128, 128, 2, 0, 128, 80, 1, 0, 128, 80, 1, 0, 0, 32, 128, 32, 3, 0, 0, 32, 128,
64, 1, 0, 128, 64, 2, 0, 0, 46, 128, 32, 2, 0, 0, 46, 128, 16, 2
][..]
);
let bincode_recovered = bincode::deserialize::<PackedPatternsV1>(&bincode_bytes).unwrap();
assert_eq!(builder, bincode_recovered.to_builder());

let json_str = serde_json::to_string(&packed).unwrap();
assert_eq!(json_str, "{\"long\":{\"other\":\"M/d/y\"},\"medium\":{\"other\":\"M/d/y\"},\"short\":{\"other\":\"HH:mm\"},\"variant0\":{\"long\":{\"other\":\"E\"},\"medium\":{\"other\":\"E MMM d\"},\"short\":{\"other\":\"dd.MM.yy\"}}}");
let json_recovered = serde_json::from_str::<PackedPatternsV1>(&json_str).unwrap();
assert_eq!(builder, json_recovered.to_builder());
}
}
14 changes: 12 additions & 2 deletions components/plurals/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -878,12 +878,22 @@ where
}

#[derive(Debug, Clone, PartialEq, Eq)]
#[cfg_attr(
feature = "serde",
derive(serde::Deserialize, serde::Serialize),
serde(transparent)
)]
/// A bag of values for different plural cases.
///
/// # Serialization
///
/// Although this type implements Serde traits, ICU4X developers and clients
/// seeking a more efficient bitpacked representation should consider
/// [`crate::provider::PluralElementsPackedCow`].
pub struct PluralElements<T>(PluralElementsInner<T>);
Comment on lines +887 to 893
Copy link
Member Author

@sffc sffc Sep 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@robertbastian I added you to the review primarily to give you an opportunity to raise an objection to this change, in icu_plurals (I noted this in the OP). Given the lack of comment in this file, is it safe to assume that you are okay with this change?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out, I do have objections. The serde impls on PluralElementsInner are currently only used in the human readable code path; this change uses them for all serializers through PluralElements. skip_serializing_if does not work for non-self-describing formats like postcard.


#[derive(Debug, Clone, PartialEq, Eq)]
#[cfg_attr(feature = "serde", derive(serde::Deserialize))]
#[cfg_attr(feature = "datagen", derive(serde::Serialize))]
#[cfg_attr(feature = "serde", derive(serde::Deserialize, serde::Serialize))]
pub(crate) struct PluralElementsInner<T> {
#[cfg_attr(feature = "serde", serde(skip_serializing_if = "Option::is_none"))]
zero: Option<T>,
Expand Down