Skip to content

Sort derives #154

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ayazhafiz opened this issue Jul 3, 2020 · 9 comments
Closed

Sort derives #154

ayazhafiz opened this issue Jul 3, 2020 · 9 comments

Comments

@ayazhafiz
Copy link

ayazhafiz commented Jul 3, 2020

See rust-lang/rustfmt#4112 and rust-lang/rustfmt#1867 for more context and discussion.

This rfc proposes the sorting of items listed in a derive attribute. Sorting derives keeps uniformity in a project, since derive attributes can often list many items, and expecting a consistent order helps readers more quickly scan an item's derives.

Alternatives

These three alternatives are taken from rust-lang/rustfmt#4112; here I enumerate them and provide what I think are advantages/disadvantages of each.

(1) Alphabetical sorting

#[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)]

Advantages:

  • Easy to read; it is quick to figure out where a derive is/should be given the placement of other derives
  • Trivial to implement

Disadvantages:

  • Relational grouping of derives is lost. For example, Eq and PartialEq are often derived together and placed adjacently, implicitly implying a relationship; such a relationship cannot be preserved with alphabetical ordering.

(2) "Canonical" sorting

Specifically, by "canonical" sorting I mean the sorting of derivable std traits in the order they are listed in the rust API guidelines. Traits not included in that list are ordered alphabetically at the end of the derive list.

#[derive(Copy, Clone, Eq, PartialEq, Ord, PartialOrd, Hash, Debug, Display, Default, CustomTrait, CrateSpecificTrait)]

Advantages:

  • Preserves implicit relationships among "common" (std) traits.
  • Somewhat complies with the convention already stated in the API guidelines.

Disadvantages:

  • Having two different sortings merged as one could be confusing, particularly when custom traits are derived. It may be harder to scan for custom traits in this sorting scenario, since they are always at the end of the "common" list, but what that end is depends on how many "common" traits were derived, and is thus not consistent.
  • We can only get halfway with preserving implicit relationships this way. Implicit relationships between custom derives cannot be preserved; for example, given
    #[derive(CustomDebug, Injective, Surjective, Bijective)]

    a tool like rustfmt cannot (consistently) deduce that the latter three traits are related, and would format the derive as

    #[derive(Bijective, CustomDebug, Injective, Surjective)]

    I'm not convinced that only partially supporting implicit relationships is a bad thing, but this is important to mention.

  • If the list in the API guidelines doc is ever amended, the order in rustfmt would have to be amended as well. (More insight from the docs/libraries team is wanted here wrt if this would actually happen). Alternatively, rustfmt could copy the order and adopt it independently.

(3) Sort by resolved trait path

Given the following derive

#[derive(
    Debug,
    Clone,
    PartialEq,
    Eq,
    PartialOrd,
    Ord,
    Hash,
    Serialize,
    Deserialize,
    TypeGenerator,
    CopyGetters,
    Getters,
    Builder,
)]
struct S;

try to format the derives alphabetically by their full path

that is, by sorting

#[derive(
    std::fmt::Debug,
    std::clone::Clone,
    std::cmp::PartialEq,
    std::cmp::Eq,
    std::cmp::PartialOrd,
    std::cmp::Ord,
    std::hash::Hash,
    serde::ser::Serialize,
    serde::de::Deserialize,
    bolero::generator::TypeGenerator,
    getset::CopyGetters,
    getset::Getters,
    derive_builder::Builder,
)]
struct S;

into

#[derive(
    bolero::generator::TypeGenerator,
    derive_builder::Builder,
    getset::CopyGetters,
    getset::Getters,
    serde::de::Deserialize,
    serde::ser::Serialize,
    std::clone::Clone,
    std::cmp::Eq,
    std::cmp::Ord,
    std::cmp::PartialEq,
    std::cmp::PartialOrd,
    std::fmt::Debug,
    std::hash::Hash,
)]
struct S;

So the original example would be sorted as

#[derive(
    TypeGenerator,
    Builder,
    CopyGetters,
    Getters,
    Deserialize,
    Serialize,
    Clone,
    Eq,
    Ord,
    PartialEq,
    PartialOrd,
    Debug,
    Hash,
)]
struct S;

Advantages:

  • Preserves crate relationships

Disadvantages:

  • This is probably not feasible to implement, since tools like rustfmt are generally purely syntactic. Implementing this kind of sorting would require name resolution, which is outside the scope of work tools like rustfmt should be doing. Asking rustfmt to include name resolution would incur a significant runtime and complexity penalty, and fail for non-resolvable names (rustfmt should still be able to format such names if they are well-formed).

Risks

  • It was previously mentioned that sorting derives may lead to a failure in compilation because derived traits can only see the traits ahead of them. While this may have been true in the past, it doesn't appear to be the case today (or at least its consequences don't seem to be). Input from someone with more insight is needed to verify this (maybe someone who works on the trait solver?).

Suggestion

My suggestion is to provide options (1): Alphabetical Sorting and (2): "Canonical" Sorting in rustfmt. I hypothesize that the most common use case is to only use std-provided derivable traits; in this situation, "Canonical" sorting is likely the nicest alternative for a user due to its preservation of implicit relationships. For more complex use cases, alphabetical sorting may be more readable and consistent across a code base.

As for the rust style guide, I suggest specifying that derives should be sorted alphabetically.

This is just my opinion; more insights are requested.

@joshtriplett
Copy link
Member

The style team discussed this when initially developing the default Rust style and style guide. Our concern at that time was partly that derives are not always reorderable (because a proc-macro derive might care what other derives go inside/outside), and partly that people do sometimes list derives in a specific order for a logical reason.

I'm not opposed to changing this, but we'd need to address those points.

@ayazhafiz
Copy link
Author

derives are not always reorderable (because a proc-macro derive might care what other derives go inside/outside)

Do you know who can be contacted for more insight on this? I have no knowledge of how ordering derives may make a semantic difference for proc macros, and cannot seem to find any resources online.

partly that people do sometimes list derives in a specific order for a logical reason

This is a good point. I think a reasonable solution would be to treat groups of derives like imports are treated. That is, traits in a single #[derive] clauses are sorted by some strategy, but traits across derive clauses are not merged.

@joshtriplett
Copy link
Member

@ayazhafiz

Do you know who can be contacted for more insight on this? I have no knowledge of how ordering derives may make a semantic difference for proc macros, and cannot seem to find any resources online.

Try the rust-lang Zulip under t-lang or t-compiler, or internals.rust-lang.org, and in either case mention specifically that you're seeking examples where a proc-macro derive could be order-dependent relative to other derives.

@Luro02
Copy link

Luro02 commented Aug 11, 2020

I have no knowledge of how ordering derives may make a semantic difference for proc macros, and cannot seem to find any resources online.

The problem is that proc-macros can do whatever they want at compile-time. For example one derive macro could write something to a file and the other would read it (if they were reordered the file might not exist), but to me this seems like an anti-pattern, because proc-macros are designed to be order independent (you can not infer which other traits have been derived from a proc-macro)

@calebcartwright
Copy link
Member

Going to close this. I appreciate the recurring interest in this and that many folks could leverage the feature without running into issues. However, we can't categorically prescribe something within the style guide that could knowingly be problematic and a subtle source of bugs, and similarly I don't want to have a feature in rustfmt that could cause such bugs.

@mgeisler
Copy link

Going to close this. I appreciate the recurring interest in this and that many folks could leverage the feature without running into issues. However, we can't categorically prescribe something within the style guide that could knowingly be problematic and a subtle source of bugs, and similarly I don't want to have a feature in rustfmt that could cause such bugs.

Could we consider designing the feature so that it won't trigger these bugs?

An idea for this: designate a set of well-known traits which we know we can reorder at will. That is, teach rustfmt about things like Clone, Default, etc and let rustfmt sort the derive statement when it contains only these traits. The list could be expanded to traits outside of the ecosystem on a case-by-case basis when the maintainers promise to not depend on the ordering.

Alternatively or as an extension to the above, you could let rustfmt users include/exclude traits from the list of "sortable traits". In the extreme case, rustfmt could start out with no traits in this list, and user could then expand it themselves as desired.

A system like that would be good enough for many usecases, while still being safe. I think it would make a lot of users happy here.

@MixusMinimax
Copy link

I agree with mgeisler, but I would extend it to this:

from the derive token tree, perform a take_while that only accepts well-known derives. Then, these can be sorted.

let's say, derives starting with A are well-known, as in they are part of the official rust documentation.

Then, the following example:

#[derive(A1, A2, A7, A4, B1, B4, B2, A3)]

Will only consider A1, A2, A7, and A4, and sort those.

This would guarantee that there can not be any weird bugs, as these derives are well understood, and absolutely none of the third-party derives are affected in any way.

@maksym-arutyunyan
Copy link

maksym-arutyunyan commented Sep 1, 2024

Hi all!

In 2017, a request was made about handling the order of Rust derive traits in rustfmt. This was revisited in 2020 and closed with the reasoning that enforcing a specific order could introduce subtle bugs.

It’s now 2024, and I'm unsure if this reasoning still applies. I have created a simple example with two traits, First and Second, where Second depends on First. In my tests, both derive orders worked fine, and I couldn’t find evidence that the order is crucial.

  1. Could you please provide an update on whether the order of derive traits is still considered important? If it is, could you provide a clear example demonstrating why?

  2. If the order is no longer important, could you consider reopening this feature request?

For now, I’ve created a tool to help maintain derive order until this feature is possibly implemented in rustfmt. You can find it here.

Thank you!

UPD: for one-time modifications one can use sort-derive-traits.github.io

github-merge-queue bot pushed a commit to dfinity/ic that referenced this issue Sep 2, 2024
This PR sorts Rust `#[derive(...)]` traits to maintain a consistent
style.

The idea comes from a (now closed) feature request to the `rust-fmt`
style team for adding default sorting,
[link](rust-lang/style-team#154).
The sorting follows the 'canonical' order, where `std` traits come
first, and the rest follow in alphabetical order.
It does not modify autogenerated rust code from protobuf files.

The changes were made automatically using the `keepsorted` CLI app,
[repo](https://github.com/maksym-arutyunyan/keepsorted), with the
`rust_derive_canonical` experimental feature.
@spectria-limina
Copy link

You're correct. This has now largely been fixed.

There is one remaining sensitivity to execution ordering: one derive can emit use items that bring another into scope, to "save" a subsequent derive that would otherwise have been an error. This is an extremely obscure case, but it is technically possible. If the attributes are reordered then, and only then, could this produce an error where previously there was none.

That said, this case is:

  1. Well-defined: The situation that gives rise to it is clearly understood and quite precise.
  2. Unlikely to arise by accident: Derives generally don't emit use to begin with, and there are additional conditions required before they could hit this.
  3. Has basically no valid use case: There's nothing you can do by saving a non-imported derive that you can't already do in better ways.
  4. Not going to produce subtle bugs: It's always a hard error or nothing. This can't change the interpretation of any future Derive, because it's an error for a macro to change the result of another macro's name resolution other than from "Not found" to "found", even retroactively.
  5. Easily fixed: Simply import the Derive properly.

All of these combinations together make it an excellent candidate, IMO, for a lint with an aim of turning it into a hard error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants