Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Custom derive macro for rust_xlsxwriter specific serialization options #66

Closed
22 tasks done
jmcnamara opened this issue Jan 4, 2024 · 57 comments
Closed
22 tasks done
Assignees
Labels
enhancement New feature or request help wanted in progress Currently under investigation or development

Comments

@jmcnamara
Copy link
Owner

jmcnamara commented Jan 4, 2024

In a comment on #63 @claudiofsr said:

Another suggestion would be to choose to format the data structure like this:

use serde::{Serialize, Deserialize};

  // Create a deserializable/serializable test struct.
  #[derive(Deserialize, Serialize)]
  struct Produce {
      #[serde(rename = "Fruit FooBar")]
      #[rust_xlsxwriter(set_min_width=8)]
      fruit: &'static str,
      #[serde(rename = "Value")]
      #[rust_xlsxwriter(set_num_format="#,##0.00")] // 2 digits after the decimal point
      cost: f64,
      #[serde(rename = "Date DMY")]
      #[rust_xlsxwriter(set_num_format="dd/mm/yyyy")]
      dmy: Option<NaiveDate>,
      #[serde(rename = "Date MDY",)]
      #[rust_xlsxwriter(set_num_format="mm/dd/yyyy")]
      mdy: NaiveDate,
      #[serde(rename = "Long Description")]
      #[rust_xlsxwriter(set_min_width=8, set_max_width=60)]
      long_description: String
  }

I've started work on implementing a custom derive macro to enable this functionality on the derive branch.

Here is the feature set to be implemented

SerializeFieldOptions:

  • set_header_format()
  • hide_headers()
  • set_custom_headers()

CustomSerializeField

  • rename()
  • set_header_format()
  • set_column_format()
  • set_num_format()
  • set_value_format()
  • skip()
  • set_column_width()
  • set_column_width_pixels()

Serde Container attributes:

  • #[serde(rename = "name")]
  • #[serde(rename_all = "...")]
  • #[serde(rename_all(serialize = "..."))]

Serde Field attributes:

  • #[serde(rename = "name")]
  • #[serde(skip)]
  • #[serde(skip_serializing)]

Packaging:

  • Re-export trait
  • Examples
  • Docs
  • Feature flag
  • Release

As a start, you can now do this:

use derive::ExcelSerialize;
use rust_xlsxwriter::{ExcelSerialize, Workbook, XlsxError};
use serde::Serialize;

fn main() -> Result<(), XlsxError> {
    let mut workbook = Workbook::new();

    // Add a worksheet to the workbook.
    let worksheet = workbook.add_worksheet();

    #[derive(ExcelSerialize, Serialize)]
    struct Produce {
        fruit: &'static str,
        cost: f64,
        in_stock: bool,
    }

    // Create some data instances.
    let items = [
        Produce {
            fruit: "Peach",
            cost: 1.05,
            in_stock: true,
        },
        Produce {
            fruit: "Plum",
            cost: 0.15,
            in_stock: false,
        },
        Produce {
            fruit: "Pear",
            cost: 0.75,
            in_stock: true,
        },
    ];

    worksheet.set_serialize_headers::<Produce>(0, 0)?;
    worksheet.serialize(&items)?;

    // Save the file.
    workbook.save("serialize.xlsx")?;

    Ok(())
}

Output:

screenshot

Note the ExcelSerialize derived trait. The set_serialize_headers() doesn't use Serde serialization or deserialization (for the headers). Instead the macro generates code a custom impl for the type inline like this:

    impl ExcelSerialize for Produce {
        fn to_rust_xlsxwriter() -> rust_xlsxwriter::SerializeFieldOptions {
            let mut custom_headers: Vec<rust_xlsxwriter::CustomSerializeField> 
                = ::alloc::vec::Vec::new();
            custom_headers.push(rust_xlsxwriter::CustomSerializeField::new("fruit"));
            custom_headers.push(rust_xlsxwriter::CustomSerializeField::new("cost"));
            custom_headers.push(rust_xlsxwriter::CustomSerializeField::new("in_stock"));
            rust_xlsxwriter::SerializeFieldOptions::new()
                .set_struct_name("Produce")
                .set_custom_headers(&custom_headers)
        }
    }

It should be straight forward to support attributes like #[rust_xlsxwriter(set_num_format="dd/mm/yyyy")]. However, interacting (correctly) with Serde attributes will be a little trickier.

I'll continue to work on this for the next few weeks and post updates when there is enough functionality to try it out with a more realistic use case.

@lucatrv for information.

@jmcnamara
Copy link
Owner Author

I asked a question about this on users.rust-lang.org. Some interesting suggestions there.

@jmcnamara
Copy link
Owner Author

jmcnamara commented Jan 5, 2024

Just an update on progress on this. You can now mix and match rust_xlsxwriter and some (currently only 1) serde attributes.

Like this:

se rust_xlsxwriter::{ExcelSerialize, Workbook, XlsxError};
use rust_xlsxwriter_derive::ExcelSerialize;
use serde::Serialize;

fn main() -> Result<(), XlsxError> {
    let mut workbook = Workbook::new();

    // Add a worksheet to the workbook.
    let worksheet = workbook.add_worksheet();

    #[derive(ExcelSerialize, Serialize)]
    #[allow(dead_code)]
    struct Produce {
        #[rust_xlsxwriter(rename = "Item")]
        fruit: &'static str,

        #[rust_xlsxwriter(rename = "Price")]
        #[rust_xlsxwriter(num_format = "$0.00")]
        cost: f64,

        #[serde(skip)]
        in_stock: bool,
    }

    // Create some data instances.
    let items = [
        Produce {
            fruit: "Peach",
            cost: 1.05,
            in_stock: true,
        },
        Produce {
            fruit: "Plum",
            cost: 0.15,
            in_stock: false,
        },
        Produce {
            fruit: "Pear",
            cost: 0.75,
            in_stock: true,
        },
    ];

    worksheet.set_serialize_headers::<Produce>(0, 0)?;
    worksheet.serialize(&items)?;

    // Save the file.
    workbook.save("serialize.xlsx")?;

    Ok(())
}

Output:

screenshot

There is still quite a bit of work to make this robust but at least it demonstrates that it is feasible.

@zjp-CN
Copy link
Contributor

zjp-CN commented Jan 5, 2024

Can we make the attribute name rust_xlsxwriter shorter as xlsxwriter?

@jmcnamara
Copy link
Owner Author

Can we make the attribute name rust_xlsxwriter shorter as xlsxwriter?

That is a good suggestion. rust_xlsxwriter is a bit long for an attribute.

There is a different xlsxwriter library that I didn't write but wraps the libxlxwriter library which I did write. I would have preferred to use just xlsxwriter for the rust crate but that wasn't possible.

So the shorted attribute "path" of xlsxwriter would be better but I wonder if it would be confusing. Presumably not to the person writing the code but maybe to a reader.

I'll definitely consider it though.

@lucatrv your 2 cents?

@lucatrv
Copy link
Contributor

lucatrv commented Jan 5, 2024

I think that using xlsxwriter as a shorted attribute path, while keeping the rust_xlsxwriter library name (also considering that another xlsxwriter library already exists), would be confusing. What if someone uses both libraries in the same code?
Instead (but maybe it is too late to suggest this...) one could consider a shorter name for the library itself. I know that many *xlsx* names are already reserved, but for instance xlsxw is available and it sounds good to me.

@lucatrv
Copy link
Contributor

lucatrv commented Jan 5, 2024

#[rust_xlsxwriter(rename = "Item")]
fruit: &'static str,

Is it also possible to rely on #[serde(rename = "Item")], for instance in case it is already declared for other serializers?

@jmcnamara
Copy link
Owner Author

Is it also possible to rely on #[serde(rename = "Item")], for instance in case it is already declared for other serializers?

Yes. That is the goal. But it is also the hard part. :-) Hence my question on users.rust-lang.org.

@jmcnamara
Copy link
Owner Author

Instead (but maybe it is too late to suggest this...) one could consider a shorter name for the library itself. I know that many *xlsx* names are already reserved, but for instance xlsxw is available and it sounds good to me.

Naming things is hard. I had also registered excelwriter and made an earlier attempt to move the code to that but in the end I didn't.

@lucatrv
Copy link
Contributor

lucatrv commented Jan 5, 2024

Naming things is hard. I had also registered excelwriter and made an earlier attempt to move the code to that but in the end I didn't.

That's a nice name indeed, and there's already an eminent homonymous ;)

@zjp-CN
Copy link
Contributor

zjp-CN commented Jan 6, 2024

Is it also possible to rely on #[serde(rename = "Item")], for instance in case it is already declared for other serializers?

Yes. That is the goal. But it is also the hard part. :-)

Well, actually it's not that hard... Here's the MVP/POV code: https://github.com/zjp-CN/parse-serde-macro/blob/main/_impl/src/lib.rs

// key implementation
            if ident == "rust_xlsxwriter" || ident == "xlsxwriter" || ident == "serde" {
                // e.g. #[ident(one)] or #[ident(one, two)] or #[ident(one, two, ...)]
                let parsed = Punctuated::<AttributeTypes, Token![,]>::parse_separated_nonempty
                    .parse2(list.tokens)?;
                return Ok(parsed.into_iter().collect());
            }

user interface :

// field `b` => [Rename(LitStr { token: "serde rename for b" })]
// field `c` => [Skip]
// field `d` => [Rename(LitStr { token: "rust_xlsxwriter rename for d" }), NumFormat(LitStr { token: "$0.00" })]
// field `e` => [Rename(LitStr { token: "xlsxwriter rename for d" }), NumFormat(LitStr { token: "$0.00" })]
// field `f` => [Skip]
#[derive(_impl::ExcelSerialize, serde::Serialize)]
pub struct A {
    #[serde(rename = "serde rename for b")]
    b: (),
    #[serde(skip)]
    c: (),

    #[rust_xlsxwriter(rename = "rust_xlsxwriter rename for d", num_format = "$0.00")]
    d: (),

    #[xlsxwriter(rename = "xlsxwriter rename for d", num_format = "$0.00")]
    e: (),
    #[xlsxwriter(skip)]
    f: (),
}

// error: `not_exist` is not supported by ExcelSerialize derive macro
//   --> src/main.rs:40:18
//    |
// 40 |     #[xlsxwriter(not_exist)]
//    |                  ^^^^^^^^^
#[derive(_impl::ExcelSerialize, serde::Serialize)]
pub struct B {
    #[xlsxwriter(not_exist)]
    f: (),
}

@lucatrv
Copy link
Contributor

lucatrv commented Jan 6, 2024

Yes. That is the goal. But it is also the hard part. :-) Hence my question on users.rust-lang.org.

IMHO once the integration with Serde is completed, so that Serde's rename, skip, etc are supported, then the corresponding rust_xlsxwriter implementations should be removed, because they would represent overlapping and practically useless annotations which would only create confusion and possible bugs. Only the additional features specific for Excel writing should be left, such as set_header_format, set_column_format, etc.

@lucatrv
Copy link
Contributor

lucatrv commented Jan 6, 2024

Naming things is hard. I had also registered excelwriter and made an earlier attempt to move the code to that but in the end I didn't.

That's a nice name indeed, and there's already an eminent homonymous ;)

@jmcnamara, the more I think about this, the more I would suggest to change name into excelwriter, for two reasons. The first one is to reduce library and path name lengths, as suggested by @zjp-CN. The second one, in my opinion even more important, is to make this project much more reachable and to avoid confusion with the other existing *xlsxwriter libraries.
I remember the first time I was looking for a library to use in my code, I started my research on crates.io with the excel keyword, before the xlsx keyword. Then, it was quite difficult to orient myself among the *xlsxwriter libraries, and to finally understand that rust_xlsxwriter was the best one. Even nowadays, when I google rust_xlsxwriter docs, the link xlsxwriter - Rust comes before rust_xlsxwriter - Rust, and this is an additional source of confusion. Now I know the difference, but I can imagine new users getting confused among these two Rust libraries, and libxlsxwriter.
You now have 88k downloads, it is unfortunate to loose them but soon you will have much more, and I think changing name will help that. It may actually be the last opportunity to think about this. You are now implementing these Serde integration features which will be a game changer. You could freeze the rust_xlsxwriter development at release v0.6.0, and notify that all new features will be implemented in excelwriter. I think this is still possible, and if you need some help in changing name references I could help.

@jmcnamara
Copy link
Owner Author

jmcnamara commented Jan 6, 2024

Well, actually it's not that hard... Here's the MVP/POV code

@zjp-CN Thank you. I really appreciate that.

I actually got the #[serde(rename = "...")] part working late yesterday. The code is here.

In terms of difficultly I was thinking more about the container attribute #[serde(rename_all = "...")] but even that probably won't be too difficult.

Your code looks much cleaner and more generic. Could you have a look at my implementation and make any suggestions that you have.

One issue that I have and haven't managed to come up with a solution for is that I would like to an rust_xlsxwriter Format object as an attribute parameter. Something like this:

    #[derive(ExcelSerialize, Serialize)]
    struct Produce {
        fruit: &'static str,

        #[rust_xlsxwriter(value_format = Format::new().set_bold())]
        cost: f64,

        in_stock: bool,
    }

Is that possible to handle non-primitives as an attribute value? Anything that I have tried so far end up with a literal string or a parse error.

I would also need to modify the token stream to get rust_xlsxwriter::Format::new().set_bold() (and maybe some other qualifiers).

Any thoughts on whether that is possible and if so how?

@jmcnamara
Copy link
Owner Author

jmcnamara commented Jan 6, 2024

the more I think about this, the more I would suggest to change name into

@lucatrv I think you are right. Could you open a separate Feature Request for that so we can have the discussion separately to this thread. I think it is worth doing.

@zjp-CN
Copy link
Contributor

zjp-CN commented Jan 6, 2024

One issue that I have and haven't managed to come up with a solution for is that I would like to an rust_xlsxwriter Format object as an attribute parameter.

You can simply add it as follows:

@@ -57,6 +57,7 @@ enum AttributeTypes {
     Skip,
     Rename(LitStr),
     NumFormat(LitStr),
+    FormatObj(Expr),
 }

@@ -73,6 +74,9 @@ fn parse(input: syn::parse::ParseStream) -> syn::Result<Self> {
         } else if ident == "num_format" {
             let _ = input.parse::<Token![=]>()?;
             Ok(Self::NumFormat(input.parse()?))
+        } else if ident == "value_format" {
+            let _ = input.parse::<Token![=]>()?;
+            Ok(Self::FormatObj(input.parse()?))
         } else {
             Err(syn::Error::new(
                 ident.span(),

Could you have a look at my implementation and make any suggestions that you have.

Yes, and I've already wrote the code partially based on yours. I can open a PR for your derive branch, but I don't know if there are tests for the feature.

@jmcnamara
Copy link
Owner Author

jmcnamara commented Jan 6, 2024

@zjp-CN

You can simply add it as follows:

Thank you. I'll try that.

I can open a PR for your derive branch, but I don't know if there are tests for the feature.

There are integration test to cover it. I normally run the integration tests like the following to avoid the long compilation and test time for the examples:

cargo test --test '*'

Currently the following tests cover this functionality. I'll add more once I get the Format support working.

tests/integration/serde01.rs
tests/integration/serde12.rs
tests/integration/serde07.rs

Just some notes:

  • #[serde(rename = "...")] and #[rust_xlsxwriter(rename = "...")] are not the same. Same with #[serde(skip)] and #[rust_xlsxwriter(skip)]. There are subtle differences so they should be handled differently.
  • I'm not (very) interested in supporting multiple key/value pairs in the attributes like your example rust_xlsxwriter(rename = "NewName", num_format = "$0.00"). I don't know if it works in my current code but if it doesn't then don't put effort into it.

Thanks once more.

@zjp-CN
Copy link
Contributor

zjp-CN commented Jan 6, 2024

cargo test --test '*'

Thanks. I'll go with cargo test --test integration serde to only run tests the name of which contains serde in tests/integration.

There are subtle differences so they should be handled differently.

I know they bring different meanings, but I think we need both of #[serde(skip)] and #[rust_xlsxwriter(skip)] (same for #[serde(rename] and #[rust_xlsxwriter(rename] ), because

#[serde(skip)] = #[rust_xlsxwriter(skip)] in ExcelSerialize + original meaning in serde::Serialize
  • someone would like to write #[serde(skip)] to skip serialization in any way,
  • but someone else would like to write #[rust_xlsxwriter(skip)] to skip serialization only in rust_xlsxwriter, but keep it serialized to any other output (like as part of JSON/CSV/etc).

I don't know if it works in my current code but if it doesn't then don't put effort into it.

I checked the code in parse_nested_meta, and it indeed handles multiple arguments separated by a comma with an optional trailing comma: it basically parses a path first for you, and offers you the responsibility in the callback to push the token cursor forward when the tokens follows the checked path but without thinking about the comma separator. So it works really well, and you don't need my code, as both versions almost do the same thing (but parse_nested_meta is more powerful).

@jmcnamara
Copy link
Owner Author

jmcnamara commented Jan 7, 2024

I've updated the code to handle all of the Serde rename variants including one nested variant.

I'm moving on to the Serde skip_serializing_if and the various rust_xlsxwriter format variants.

Update 1: I have never used this option before but it turns out that skip_serializing_if is used to skip individual values and not the field. This means that it doesn't have to be handled by the rust_xlsxwriter proc macros but it also means that using this will cause mismatched output in Excel. That is probably a bug so I'll handle that in a separate Bug tracker after this work.

@jmcnamara
Copy link
Owner Author

Good news for a small audience. I got all of the major functionality working including the format support.

Here is an example:

use rust_xlsxwriter::{ExcelSerialize, Workbook, XlsxError};
use serde::Serialize;

fn main() -> Result<(), XlsxError> {
    let mut workbook = Workbook::new();

    // Add a worksheet to the workbook.
    let worksheet = workbook.add_worksheet();

    // Create a serializable struct.
    #[derive(ExcelSerialize, Serialize)]
    #[rust_xlsxwriter(header_format = Format::new()
        .set_bold()
        .set_border(FormatBorder::Thin)
        .set_background_color("C6EFCE"))]
    #[serde(rename_all = "PascalCase")]
    struct Produce {
        fruit: &'static str,

        #[rust_xlsxwriter(value_format = Format::new().set_num_format("$0.00"))]
        cost: f64,
    }

    // Create some data instances.
    let item1 = Produce {
        fruit: "Peach",
        cost: 1.05,
    };

    let item2 = Produce {
        fruit: "Plum",
        cost: 0.15,
    };

    let item3 = Produce {
        fruit: "Pear",
        cost: 0.75,
    };

    // Set the serialization location and headers.
    worksheet.set_serialize_headers::<Produce>(0, 0)?;

    // Serialize the data.
    worksheet.serialize(&item1)?;
    worksheet.serialize(&item2)?;
    worksheet.serialize(&item3)?;

    // Save the file to disk.
    workbook.save("serialize.xlsx")?;

    Ok(())
}

Which generates this output:

screenshot

The header format is a bit long in this example but I wanted to replicate the previous example. The diff with the previous example looks like this:

diff --git a/examples/app_serialize.rs b/examples/app_serialize.rs
index 1b4810a..3e7980b 100644
--- a/examples/app_serialize.rs
+++ b/examples/app_serialize.rs
@@ -5,10 +5,8 @@
 //! Example of serializing Serde derived structs to an Excel worksheet using
 //! `rust_xlsxwriter`.
 
-use rust_xlsxwriter::{
-    CustomSerializeField, Format, FormatBorder, SerializeFieldOptions, Workbook, XlsxError,
-};
-use serde::{Deserialize, Serialize};
+use rust_xlsxwriter::{ExcelSerialize, Workbook, XlsxError};
+use serde::Serialize;
 
 fn main() -> Result<(), XlsxError> {
     let mut workbook = Workbook::new();
@@ -16,19 +14,17 @@ fn main() -> Result<(), XlsxError> {
     // Add a worksheet to the workbook.
     let worksheet = workbook.add_worksheet();
 
-    // Set some formats.
-    let header_format = Format::new()
+    // Create a serializable struct.
+    #[derive(ExcelSerialize, Serialize)]
+    #[rust_xlsxwriter(header_format = Format::new()
         .set_bold()
         .set_border(FormatBorder::Thin)
-        .set_background_color("C6EFCE");
-
-    let value_format = Format::new().set_num_format("$0.00");
-
-    // Create a serializable struct.
-    #[derive(Deserialize, Serialize)]
+        .set_background_color("C6EFCE"))]
     #[serde(rename_all = "PascalCase")]
     struct Produce {
         fruit: &'static str,
+
+        #[rust_xlsxwriter(value_format = Format::new().set_num_format("$0.00"))]
         cost: f64,
     }
 
@@ -48,13 +44,8 @@ fn main() -> Result<(), XlsxError> {
         cost: 0.75,
     };
 
-    // Set the custom headers.
-    let header_options = SerializeFieldOptions::new()
-        .set_header_format(&header_format)
-        .set_custom_headers(&[CustomSerializeField::new("Cost").set_value_format(&value_format)]);
-
     // Set the serialization location and headers.
-    worksheet.deserialize_headers_with_options::<Produce>(0, 0, &header_options)?;
+    worksheet.set_serialize_headers::<Produce>(0, 0)?;
 
     // Serialize the data.
     worksheet.serialize(&item1)?;

The nice thing about it is that you get standard rustc error messages:

error[E0599]: no method named `set_scold` found for struct `Format` in the current scope
  --> examples/app_serialize.rs:20:10
   |
19 |       #[rust_xlsxwriter(header_format = Format::new()
   |  _______________________________________-
20 | |         .set_scold()
   | |         -^^^^^^^^^ help: there is a method with a similar name: `set_bold`
   | |_________|
   | 

// or

error: unknown rust_xlsxwriter attribute: `skipper`
  --> examples/app_serialize.rs:27:27
   |
27 |         #[rust_xlsxwriter(skipper)]
   |                           ^^^^^^^

In case anyone is interested, here is what the expanded derive output for the example above looks like:

    #[doc(hidden)]
    const _: () = {
        #[allow(unused_imports)]
        use rust_xlsxwriter::{
            Color, Format, FormatAlign, FormatBorder, FormatDiagonalBorder,
            FormatPattern, FormatScript, FormatUnderline,
        };
        impl ExcelSerialize for Produce {
            fn to_serialize_field_options() -> rust_xlsxwriter::SerializeFieldOptions {
                let mut custom_headers: Vec<rust_xlsxwriter::CustomSerializeField> = ::alloc::vec::Vec::new();
                custom_headers.push(rust_xlsxwriter::CustomSerializeField::new("Fruit"));
                custom_headers
                    .push(
                        rust_xlsxwriter::CustomSerializeField::new("Cost")
                            .set_value_format(Format::new().set_num_format("$0.00")),
                    );
                rust_xlsxwriter::SerializeFieldOptions::new()
                    .set_header_format(
                        Format::new()
                            .set_bold()
                            .set_border(FormatBorder::Thin)
                            .set_background_color("C6EFCE"),
                    )
                    .set_struct_name("Produce")
                    .set_custom_headers(&custom_headers)
            }
        }
    };

I stole the const _: () = {} wrapper idea from serde.

Overall I'm very happy with the way this turned out. I may drop/deprecate/hide the existing serialize/deserialize way of setting up the headers and only document this.

I now have a heartfelt appreciation for the functionality of the syn, quote and serde crates. There is some impressive software engineering in there. All the above functionality comes from around 500 lines of code. (500 hard fought lines of code but nevertheless.)

I'll move this back onto main shortly and fix up the feature flags so that folks can try it out. Thanks @zjp-CN and @lucatrv for the help and feedback.

@lucatrv
Copy link
Contributor

lucatrv commented Jan 7, 2024

I'll move this back onto main shortly and fix up the feature flags so that folks can try it out. Thanks @zjp-CN and @lucatrv for the help and feedback.

Thanks @jmcnamara , I'm looking forward to testing this out!

jmcnamara added a commit that referenced this issue Jan 7, 2024
Add support for ExcelSerialize trait via a proc macro. Support
rust_xlsxwriter attributes and also Serde attributes.

See #66
jmcnamara added a commit that referenced this issue Jan 7, 2024
Make rust_xlsxwriter_derive a sub-dependency of the serde
feature. It is enabled with the serde flag.

Feature #66
jmcnamara added a commit that referenced this issue Jan 7, 2024
Make rust_xlsxwriter_derive a sub-dependency of the serde
feature. It is enabled with the serde flag.

Feature #66
@jmcnamara
Copy link
Owner Author

I've merged this back to main. The rust_xlsxwriter_derive macro is a dependency on serde.

You should add the following to your Cargo.toml for testing.

[dependencies]
rust_xlsxwriter = { git = "https://github.com/jmcnamara/rust_xlsxwriter.git", version = "0.60.0", features = ["serde"] }
serde = { version = "1.0.195", features = ["derive"] }

I seem to have broken installation via cargo add:

$ cargo add --git https://github.com/jmcnamara/rust_xlsxwriter.git -F serde
    Updating git repository `https://github.com/jmcnamara/rust_xlsxwriter.git`
error: multiple packages found at `https://github.com/jmcnamara/rust_xlsxwriter.git`:
    rust_xlsxwriter, rust_xlsxwriter_derive
To disambiguate, run `cargo add --git https://github.com/jmcnamara/rust_xlsxwriter.git <package>`

$ cargo add --git https://github.com/jmcnamara/rust_xlsxwriter.git -F serde rust_xlsxwriter
    Updating git repository `https://github.com/jmcnamara/rust_xlsxwriter.git`
      Adding rust_xlsxwriter (git) to dependencies.
error: unrecognized feature for crate rust_xlsxwriter: serde
no features available for crate rust_xlsxwriter

I'm not sure if this is a cargo add issue or if I need to set up a workspace since there are now to packages/crates in the repo. If anyone knows let me know. (It works when installing via cargo add --path ...).

@zjp-CN
Copy link
Contributor

zjp-CN commented Jan 8, 2024

To disambiguate, run cargo add --git https://github.com/jmcnamara/rust_xlsxwriter.git <package>

That tells us how to do it right by sepecifying the package in the workshop:

cargo add --git https://github.com/jmcnamara/rust_xlsxwriter.git rust_xlsxwriter -F serde

@jmcnamara
Copy link
Owner Author

That tells us how to do it right by sepecifying the package in the workshop:

I had tried that but it can't seem to process the feature flag:

$ cargo add --git https://github.com/jmcnamara/rust_xlsxwriter.git rust_xlsxwriter -F serde
    Updating git repository `https://github.com/jmcnamara/rust_xlsxwriter.git`
      Adding rust_xlsxwriter (git) to dependencies.
error: unrecognized feature for crate rust_xlsxwriter: serde

I'll see if moving to a workspace improves things.

@zjp-CN
Copy link
Contributor

zjp-CN commented Jan 8, 2024

I had tried that but it can't seem to process the feature flag:

How come??? It worked on my machine with rustc 1.77.0-nightly (595bc6f00 2024-01-05)

$ cargo add --git https://github.com/jmcnamara/rust_xlsxwriter.git rust_xlsxwriter -F serde
    Updating git repository `https://github.com/jmcnamara/rust_xlsxwriter.git`
remote: Enumerating objects: 10, done.
remote: Counting objects: 100% (10/10), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 10 (delta 6), reused 8 (delta 6), pack-reused 0
Unpacking objects: 100% (10/10), 991 bytes | 20.00 KiB/s, done.
From https://github.com/jmcnamara/rust_xlsxwriter
   9465a21..1f648f6             -> origin/HEAD
      Adding rust_xlsxwriter (git) to dependencies.
             Features:
             + serde
             - chrono
             - js-sys
             - polars
             - test-resave
             - wasm
             - wasm-bindgen
             - zlib

@zjp-CN
Copy link
Contributor

zjp-CN commented Jan 8, 2024

Oh, it was a bug in cargo, and it was fixed two weeks ago. But it will land on stable Rust in 1.77.0, which will be released on 21 March, 2024.

@lucatrv
Copy link
Contributor

lucatrv commented Jan 8, 2024

It's a little verbose to repeat Format::new().set_align(FormatAlign::Center), but that's how macros work in Rust, by pasting tokens. Some improvements can be

* (not feasible today) make methods on Format const: replace these tokens with a const `FORMAT` where there is `const FORMAT: Format = ...;` in scope

* encode all the cases as attribute parameters: `#[rust_xlsxwriter(align = "center")]` or something

I would consider this point, because formats can get pretty long and they may need to be applied to several struct fields.

@jmcnamara
Copy link
Owner Author

jmcnamara commented Jan 8, 2024

Do you think it would be possible to refer somehow to defined formats, or is this not feasible at all? Something like

As @zjp-CN points out the format parts need to be const/non-dynamic since they will end up in a fn block within the impl.

However you could use functions to give you re-use and reduce verbosity. Something like this:

use rust_xlsxwriter::{ExcelSerialize, Format, Workbook, XlsxError, FormatBorder};
use serde::Serialize;

fn main() -> Result<(), XlsxError> {
    let mut workbook = Workbook::new();

    // Add a worksheet to the workbook.
    let worksheet = workbook.add_worksheet();

    // Set up some formats.
    fn my_header_format() -> Format {
        Format::new()
            .set_bold()
            .set_border(FormatBorder::Thin)
            .set_background_color("C6EFCE")
    }

    fn my_red_num_format(s: &str) -> Format {
        Format::new().set_font_color("FF0000").set_num_format(s)
    }

    // Create a serializable struct.
    #[derive(ExcelSerialize, Serialize)]
    #[rust_xlsxwriter(header_format = my_header_format())]
    #[serde(rename_all = "PascalCase")]
    struct Produce {
        fruit: &'static str,

        #[rust_xlsxwriter(value_format = my_red_num_format("$0.00"))]
        cost: f64,
    }

    // Create some data instances.
    let item1 = Produce {
        fruit: "Peach",
        cost: 1.05,
    };

    let item2 = Produce {
        fruit: "Plum",
        cost: 0.15,
    };

    let item3 = Produce {
        fruit: "Pear",
        cost: 0.75,
    };

    // Set the serialization location and headers.
    worksheet.set_serialize_headers::<Produce>(0, 0)?;

    // Serialize the data.
    worksheet.serialize(&item1)?;
    worksheet.serialize(&item2)?;
    worksheet.serialize(&item3)?;

    // Save the file to disk.
    workbook.save("serialize.xlsx")?;

    Ok(())
}

screenshot

@lucatrv
Copy link
Contributor

lucatrv commented Jan 8, 2024

This works, hurrah!

#![feature(lazy_cell)]

use std::sync::LazyLock;

static CENTER_FORMAT: LazyLock<Format> = LazyLock::new(|| {
    Format::new().set_align(FormatAlign::Center)
});

#[derive(Deserialize, Serialize, ExcelSerialize)]
#[serde(rename_all = "UPPERCASE")]
#[rust_xlsxwriter(header_format = &*CENTER_FORMAT)]
struct Record<'a> {
    #[rust_xlsxwriter(column_format = &*CENTER_FORMAT)]
    articolo: &'a str,
    descrizione: &'a str,
    #[rust_xlsxwriter(column_format = &*CENTER_FORMAT)]
    stato: u8,
}

I think this is the best way to apply formats, so I would document it in the examples.

You need rustc nightly to compile this, waiting for LazyCell / LazyLock to be stabilized:
https://blog.rust-lang.org/2023/06/01/Rust-1.70.0.html#oncecell-and-oncelock
https://doc.rust-lang.org/std/cell/struct.LazyCell.html
https://doc.rust-lang.org/std/sync/struct.LazyLock.html

@zjp-CN
Copy link
Contributor

zjp-CN commented Jan 9, 2024

I think this is the best way to apply formats, so I would document it in the examples.

I don't think so... I think I should withdraw the first improvement because defining a custom format function is an awesome solution: function is flexible, IDE friendly, and repeatable.

As for your static variable + &*CENTER_FORMAT solution here, it's actually based on the impl From<&Format> for Format fact which fundamentally calls Format::clone(). That is you end up with a new Format instance anyway, so why not use a trivial function returning a new Format instance to do this. Plus, IMHO LazyLock is really abused here.


For my second suggestion

  • encode all the cases as attribute parameters: #[rust_xlsxwriter(align = "center")] or something

I would like to implement it if you guys like the idea. The API in design

// column format: header + value format
#[derive(ExcelSerialize, Serialize)]
#[column_format(align = "center", font_name = "...", border = "thin")] // applies for all columns
struct Data { ... }
#[derive(ExcelSerialize, Serialize)] 
struct Data { #[column_format(align = "center", border = "thin")] field: i32 } // applies only for one field column

// header format
#[derive(ExcelSerialize, Serialize)]
#[header_format(align = "center", font_size = 45, bold, italic)] // applies for all headers
struct Data { ... }
#[derive(ExcelSerialize, Serialize)]
struct Data { #[header_format(bold) field: i32 } // applies only for one field header

// value format 
#[derive(ExcelSerialize, Serialize)]
#[value_format(num_format = "0.00")] // applies for all values of all fields
struct Data { ... }
#[derive(ExcelSerialize, Serialize)]
struct Data { #[value_format(num_format = "0.00")] field: i32 } // applies only for values of one field

// combined formats
#[derive(ExcelSerialize, Serialize)]
struct Data {
    #[header_format(border = "thick")]
    #[value_format(border = "thin", num_format = "0.00")]
    field: i32
}

// overridden formats
#[derive(ExcelSerialize, Serialize)]
#[column_format(background_color= "#123456")]
struct Data {
    #[header_format(background_color= "#7890AB")]
    #[value_format(background_color= "#ABCDEF")]
    field: i32
}

@jmcnamara
Copy link
Owner Author

jmcnamara commented Jan 9, 2024

I would like to implement it if you guys like the idea. The API in design

@zjp-CN Could you hold off on this. I'll admit that is what I thought the API would look like initially because I wasn't sure if it was possible to parse an expression. However I'd prefer not to get into having to document and test a secondary interface to Format. There are currently 40 public set_xxx() APIs on that struct.

I think a better overall formatting solution would be to add support for Worksheet Tables to get something like this:

screenshot 1

That is what the Polars folks use as the default format on output dataframes.

I am also currently working on some of the refactoring to make that happen as part of the fix for #71.

@zjp-CN
Copy link
Contributor

zjp-CN commented Jan 9, 2024

That is what the Polars folks use as the default format on output dataframes.

Sorry, I'm not familiar with polars, what does it mean?

@lucatrv
Copy link
Contributor

lucatrv commented Jan 9, 2024

As for your static variable + &*CENTER_FORMAT solution here, it's actually based on the impl From<&Format> for Format fact which fundamentally calls Format::clone(). That is you end up with a new Format instance anyway, so why not use a trivial function returning a new Format instance to do this. Plus, IMHO LazyLock is really abused here.

I agree with you if in my example above CENTER_FORMAT is actually cloned three times.

For my second suggestion

  • encode all the cases as attribute parameters: #[rust_xlsxwriter(align = "center")] or something

I would like to implement it if you guys like the idea. The API in design

I think the best of both words would be to define a format! macro to construct new Formats, which takes for instance:

let custom_format = format![align = "center", font_name = "...", border = "thin"];

Then the same macro could also be used within the Serde struct annotations. Finally the documentation examples should suggest the best way to define a format in one place and apply it to several fields, for instance using functions:

fn custom_format() -> Format {
    format![align = "center", font_name = "...", border = "thin"]
}

@zjp-CN
Copy link
Contributor

zjp-CN commented Jan 9, 2024

Finally the documentation examples should suggest the best way to define a format in one place and apply it to several fields, for instance using functions:

Great! I like the design. Why not make format! generate the function and return it 😃

let custom_format = format![align = "center", font_name = "...", border = "thin"];

// expansion
let custom_format = {
    fn custom_format() -> Format { ... }
    custom_format
};
// then custom_format is copyable ~

Update: this results in #[rust_xlsxwriter(column_format = custom_format())], but we can make it shorter as
#[rust_xlsxwriter(column_format = custom_format)]:

  • CustomSerializeField::set_*_format already requires Into<Format>
  • we just need to add impl<F: FnOnce() -> Format> From<F> for Format
  • then done

Update2: wait, since most structs are defined under modules instead of scopes of functions, it would be better for format! to generate a Format value as @lucatrv suggests.

@jmcnamara
Copy link
Owner Author

jmcnamara commented Jan 9, 2024

Sorry, I'm not familiar with polars, what does it mean?

@zjp-CN Sorry, I should have added a link there. Polars is a Rust/Python Dataframe library.

They implemented Xlsx output based on the Python version of rust_xlsxwriter and use a default table format for each dataframe.

There are some visual examples here: pola-rs/polars#5568 and here.

@jmcnamara
Copy link
Owner Author

jmcnamara commented Jan 9, 2024

I think the best of both words would be to define a format! macro to construct new Formats, which takes for instance:

Folks, before anyone gets too far into this I want to say that I'm not in favour of a secondary interface to Format. Not at this point in time. If someone wants to open a Feature Request for it we can look at it once the basic serde support is complete. For now I think the function wrapper suggested above is a reasonable workaround.

At the same time I'm not completely against the format!() idea. That is basically what I implemented in the Python version of Format. In addition to the various setters you can also use an anonymous Dict like this:

header_format = workbook.add_format(
    {
        "bold": True,
        "text_wrap": True,
        "valign": "top",
        "fg_color": "#D7E4BC",
        "border": 1,
    }
)

However, that will need a fair amount of documentation and tests and it loses enum validation for the property values. At the moment that feels like one too many rabbit holes. So let's punt that out until later.

@lucatrv
Copy link
Contributor

lucatrv commented Jan 9, 2024

This does not seem to be working:

#[rust_xlsxwriter(header_format = Format::new().set_bold(), column_width = 10.0)]

It only applies the last annotation. In this case it applies only column_width = 10.0, while if I swap the two annotations it applies only header_format = Format::new().set_bold().

@jmcnamara
Copy link
Owner Author

This does not seem to be working:

Good catch. Fixed on main, with a test.

@lucatrv
Copy link
Contributor

lucatrv commented Jan 9, 2024

Fixed on main, with a test.

Now it works, thanks!

jmcnamara added a commit that referenced this issue Jan 10, 2024
jmcnamara added a commit that referenced this issue Jan 10, 2024
jmcnamara added a commit that referenced this issue Jan 10, 2024
@lucatrv
Copy link
Contributor

lucatrv commented Jan 10, 2024

@jmcnamara, please notice that something in main broke the serialization support for Result<T, E>, which works in v0.60.0.

@jmcnamara
Copy link
Owner Author

please notice that something in main broke the serialization support for Result<T, E>, which works in v0.60.0.

@lucatrv The tests for that feature are still passing. Could you add a small working sample application that demonstrates the break in the #64 thread.

@lucatrv
Copy link
Contributor

lucatrv commented Jan 11, 2024

Sorry yesterday it was too late... now I see that my code does not work with v0.60.0 neither, I'll try to dig into this.

@lucatrv
Copy link
Contributor

lucatrv commented Jan 11, 2024

This code does not work to me:

use rust_xlsxwriter::{Format, FormatBorder, Workbook, XlsxError};
use serde::{Serialize, Deserialize};

fn main() -> Result<(), XlsxError> {
    let mut workbook = Workbook::new();

    // Add a worksheet to the workbook.
    let worksheet = workbook.add_worksheet();

    // Add some formats to use with the serialization data.
    let header_format = Format::new()
        .set_bold()
        .set_border(FormatBorder::Thin)
        .set_background_color("C6E0B4");

    // Create a serializable struct.
    #[derive(Deserialize, Serialize)]
    #[serde(rename_all = "PascalCase")]
    struct Student<'a> {
        name: &'a str,
        age: Result<f64, String>,
        id: Result<f64, String>,
    }

    let students = [
        Student {
            name: "Aoife",
            age: Ok(1.0),
            id: Err(String::from("564351")),
        },
        Student {
            name: "Caoimhe",
            age: Err(String::new()),
            id: Ok(443287.0),
        },
    ];

    // Set up the start location and headers of the data to be serialized.
    worksheet.deserialize_headers_with_format::<Student>(1, 3, &header_format)?;

    // Serialize the data.
    worksheet.serialize(&students)?;

    // Save the file.
    workbook.save("serialize.xlsx")?;

    Ok(())
}

I get empty fields for age and id.

@jmcnamara
Copy link
Owner Author

@lucatrv I'll check when I'm online later but do the Result<> fields actually serialize with Serde. For example do they serialize to JSON? I don't see it as a handled type in the Serde data model and I don't remember it from the required serialization methods. Option<> is handled as separate actions for Some and None but I'm not sure if Result is handled at all. It might need a serialize_with helper function.

@lucatrv
Copy link
Contributor

lucatrv commented Jan 11, 2024

@jmcnamara, I confirm that the following works, but I thought you could add serialization support for Result<> within rust_xlsxwriter, now that I think twice about it, is it not possible because Result<> is not in the Serde model?

use rust_xlsxwriter::{Format, FormatBorder, Workbook, XlsxError};
use serde::{Serialize, Serializer, Deserialize};

fn main() -> Result<(), XlsxError> {
    let mut workbook = Workbook::new();

    // Add a worksheet to the workbook.
    let worksheet = workbook.add_worksheet();

    // Add some formats to use with the serialization data.
    let header_format = Format::new()
        .set_bold()
        .set_border(FormatBorder::Thin)
        .set_background_color("C6E0B4");

    // Create a serializable struct.
    #[derive(Deserialize, Serialize)]
    #[serde(rename_all = "PascalCase")]
    struct Student<'a> {
        name: &'a str,
        #[serde(serialize_with = "se_f64_or_string")]
        age: Result<f64, String>,
        #[serde(serialize_with = "se_f64_or_string")]
        id: Result<f64, String>,
    }

    fn se_f64_or_string<S>(
        f64_or_string: &Result<f64, String>,
        serializer: S,
    ) -> Result<S::Ok, S::Error>
    where
        S: Serializer,
    {
        match f64_or_string {
            Ok(f64) => serializer.serialize_f64(*f64),
            Err(string) => serializer.serialize_str(string),
        }
    }

    let students = [
        Student {
            name: "Aoife",
            age: Ok(1.0),
            id: Err(String::from("564351")),
        },
        Student {
            name: "Caoimhe",
            age: Err(String::new()),
            id: Ok(443287.0),
        },
    ];

    // Set up the start location and headers of the data to be serialized.
    worksheet.deserialize_headers_with_format::<Student>(1, 3, &header_format)?;

    // Serialize the data.
    worksheet.serialize(&students)?;

    // Save the file.
    workbook.save("serialize.xlsx")?;

    Ok(())
}

@jmcnamara
Copy link
Owner Author

now that I think twice about it, is it not possible because Result<> is not in the Serde model?

Correct. rust_xslxwriter doesn't even see it during serialization (without a specific handler).

@jmcnamara
Copy link
Owner Author

jmcnamara commented Jan 13, 2024

Folks, the derive/attribute feature is now released in v0.61.0 on crates.io. Thanks for all the input, testing and suggestions.

See the updates docs at:

https://docs.rs/rust_xlsxwriter/latest/rust_xlsxwriter/serializer/index.html#setting-serialization-headers
https://docs.rs/rust_xlsxwriter/latest/rust_xlsxwriter/serializer/index.html#controlling-excel-output-via-xlsxserialize-and-struct-attributes

@lucatrv
Copy link
Contributor

lucatrv commented Jan 13, 2024

Correct. rust_xslxwriter doesn't even see it during serialization (without a specific handler).

I thought it could be seen as a Serde enum.

@jmcnamara
Copy link
Owner Author

I thought it could be seen as a Serde enum.

@lucatrv You are right, it is handled by Serde but not rust_xlsxwriter since it currently ignores all enum types. However, this particular type which Serde refers to as a "newtype variant" could, and probably should, be handled. If you open a new bug report for that with your first example (#66 (comment)) I push a fix for it.

@jmcnamara
Copy link
Owner Author

rust_xlsxwriter serialization mini-update. I've added support for adding worksheet Tables to serialized areas.

Note the #[xlsx(table_default)] attribute in the following example. There is also #[xlsx(table_style = TableStyle::value)] and #[xlsx(table = Table::new())].

This is currently on main. I'll release it by or on the weekend.

use rust_xlsxwriter::{Workbook, XlsxError, XlsxSerialize};
use serde::Serialize;

fn main() -> Result<(), XlsxError> {
    let mut workbook = Workbook::new();

    // Add a worksheet to the workbook.
    let worksheet = workbook.add_worksheet();

    // Create a serializable struct.
    #[derive(XlsxSerialize, Serialize)]
    #[xlsx(table_default)]
    struct Produce {
        fruit: &'static str,
        cost: f64,
    }

    // Create some data instances.
    let items = [
        Produce {
            fruit: "Peach",
            cost: 1.05,
        },
        Produce {
            fruit: "Plum",
            cost: 0.15,
        },
        Produce {
            fruit: "Pear",
            cost: 0.75,
        },
    ];

    // Set the serialization location and headers.
    worksheet.set_serialize_headers::<Produce>(0, 0)?;

    // Serialize the data.
    worksheet.serialize(&items)?;

    // Save the file to disk.
    workbook.save("serialize.xlsx")?;

    Ok(())
}

Output:

screenshot

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted in progress Currently under investigation or development
Projects
None yet
Development

No branches or pull requests

3 participants