Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow autoderef and autoref in operators #2147

Closed
wants to merge 12 commits into from
201 changes: 201 additions & 0 deletions text/0000-eye-of-sauron.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,201 @@
- Feature Name: eye_of_sauron
- Start Date: 2017-09-11
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary
[summary]: #summary

Allow for method-dispatch-style trait-driven autoderef and autoref in operators.

# Motivation
[motivation]: #motivation

In today's Rust, working with operators is annoying, because they are supposed to be a "lightweight" syntax like methods and field accesses, but unlike them, they do not have coercions on their LHS and therefore require explicit autoderefs and autorefs.

One common example where this is a nuisance is when using iterator adapters, working which can easily create multiple-reference types such as `&&u32`. For example:

```Rust
let v : Vec<u32> = vec![0, 1, 2, 3];
// Why can't I just write `x > 1`? Why do I have to ** for the compiler???
v.iter().filter(|x| **x > 1).count();
```

There are several cases where these operators are used. The most popular case is indexing associative maps:

```Rust
use std::collections::HashMap;
let x: HashMap<_, _> =
vec![(format!("hello"), format!("world"))].into_iter().collect();
let s = format!("hello");

// I would like to write...
println!("{}", x[s]);
// But instead, I have to write...
println!("{}", x[&s]);
```

In this case, these are merely annoying papercuts. In some other cases, they can be a much worse problem.

One of thim is Isis Lovecruft's "Eye of Sauron" case, which is a problem when using non-`Copy` bignums:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/thim/them/?

BTW I think linking to https://twitter.com/isislovecruft/status/839333846415998976 may provide some context what is meant by Isis Lovecruft's "Eye of Sauron" case.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to nit pick, there are no bignums in curve25519, only 256bit numbers, which do not count as bignums because they do not use Box<>. Also, those numbers could easily be Copy so if they are not it's presumably Isis and Henry found that 32 bytes is already big enough for Copy to cause a performance penalty


```Rust,ignore
struct Bignum(..);
// fair enough
impl<'a> Add for &'a Bignum {
type Output = Bignum;
// ...
}
// ...

// I can't see what my code is doing! It looks like one big piece of &.
let a = &(-(&A)) * &(&one + &nrr).invert();
let z = &u * &(&(&u.square() + &(&A * &u)) + &one);

// Would be better:
let a = (-A) * (one + nrr).invert();
let z = u * (u.square() + A * u + one);
```

Allowing method-style autoref could make operator uses as clean as methods and fields, making code that uses them intensively far more ergonomic. It's also a fairly non-invasive extension.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

Like methods, operators and indexing support automatic referencing and dereferencing. When you use an operator, the compiler will automatically add `&`, `&mut` and `*` operators to match the signature of the operator.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean to include &mut here? The alternatives section mentions "Mutable autorefs ... is a bad idea".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was left over from an old version of the RFC.


# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

"Operators" here refers to several typeck operators:
- The "standard" eager binary operators: `+`, `-`, `*`, `/`, `%`, `^`, `&`, `|`, `<<`, `>>`, `==`, `<`, `<=`, `!=`, `>=`, `>`.
- The "standard" unary operators: `!`, `-`. This does *NOT* include the dereference operator `*`.
- The indexing operator `...[...]`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about compound assignment operators (+= etc)?


Operator type-checking behaves similarly to method type-checking. It works as follows:

## S1. Subexpression checking

Both the LHS and the RHS (if binary) of the operator are first type-checked with no expected type.

This differs from rustc 1.20, in which an expected type was sometimes propagated from the LHS into the RHS, potentially triggering coercions within the RHS. I should probably come up with an example in which this matters.
Copy link
Contributor

@oli-obk oli-obk Sep 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one's probably a crazy example, but the RHS is definitely dependent on the LHS and the case is rather horrible:

    let byte = 0u8;
    let word = 1u16;

    let x = word + &byte.into();
    println!("{}", x);

This needs the & for $reasons

Found in rust-lang/rust-clippy#2042

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one's probably a crazy example, but the RHS is definitely dependent on the LHS and the case is rather degenerate:

That's a trait-system issue. Operator overloading does succeed in this case. Without the autoref, you get these trait bounds:

u16: Add<$0>
u8: Into<$0>

Because the compiler only considers each trait bound alone, it can't see that the only solution is $0 = u16.

If you add a reference, instead yoou get

u16: Add<&'a $0>
u8: Into<$0>

And on the first bound, only the impl Add<&u16> for u16 impl matches, which allows a solution.

Copy link

@leoyvens leoyvens Sep 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also won't coerce: &1 + &mut 1. As soon as you have two different implementations for the RHS it stops working. That coercion only fires in such rare situations that I wonder if it's existence is by design or an accident of implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case it's orthogonal to this RFC.


## S2. Adjustment selection

Afterwards, an adjustment list is selected for both operands as follows:

Adjustment lists for the LHS of an indexing operation are selected from these matching the following regular expression:
```
"Deref"* "Autoref(Immutable)" "ConvertArrayToSlice"?
```

Adjustment lists for all other operands (including the RHS of indexing operations) are selected from these matching the following regular expression
```
"Deref"* ( "Autoref(Immutable)" "ConvertArrayToSlice"? )?
```

The adjustment lists selected are the lexicographically first pair of adjustment lists `(lhs_adjust, rhs_adjust)` (or with an unary op, just the `lhs_adjust`) such that
A1. Both adjustment lists match the relevant regular expressions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw this list does not render well in markdown. Perhaps add some bullets?

A2. Both adjustment lists must be valid to apply to their operand types.
A3. After applying both adjustment lists, the adjusted operand types are a potential match for the operator trait (if there is an ambiguity because of inference variables, it is counted as a match).
A3.1. NOTE: the operator trait for overloaded indexing is `Index`, not `IndexMut`, even if indexing is done in a mutable context. rustc 1.20 is inconsistent in that regard.

If the smallest adjustment can't be determined because of the presence of inference variables (because it is not obvious whether an adjustment list would be valid to apply), this is a compilation error.

## S3. Fixups

After adjustments are selected, the following fixups are made. They do not affect adjustment selection.

### Mutability Fixup

If overloaded indexing is used in a mutable context, the `Autoref(Immutable)` adjustment of the LHS is replaced with an `Autoref(Mutable)` adjustment, and the entire chain is required to be consistent with the new mutability (using the `DerefMut` and `IndexMut` traits when needed). If they can't be, this is a compilation error.

### Arithmetic Fixup

If an arithmetic operator was used, and both types are integer or float inference variables, their types are unified as if there existed an impl generic over integer inference variables, e.g.
```Rust
impl<I: IntegerType> Add<I> for I { // or modify for other operators
type Output = I;
// ..
}
```

This is required in order to make `1 + 2` (both parameters are integer inference variables) be known to be an integer before integer defaulting.

## Operator Adjustments

These are basically the same as method adjustments, but because these are underdocumented: for the purpose of overloaded operators, an adjustment is defined as follows:

```Rust
type Adjustments = Vec<Adjustment>;
#[derive(PartialOrd, Ord, PartialEq, Eq)]
enum Mutability {
Immutable,
Mutable
}

#[derive(PartialOrd, Ord, PartialEq, Eq)]
enum Adjustment {
Autoref(Mutability),
ConvertArrayToSlice,
// this must be last, and means that k+1 derefs is always > k derefs
Deref,
}
```

Adjustments have the following effect on types
```
adjust(Deref, ty) = /* do immutable autoderef */
adjust(Autoref(Immutable), ty) = Some(`&$ty`)
adjust(ConvertArrayToSlice, &[ty; N]) = Some(`&[$ty]`)
adjust(ConvertArrayToSlice, &mut [ty; N]) = Some(`&mut [$ty]`)
adjust(ConvertArrayToSlice, _) = None

adjust_list(adjustments, ty) =
let mut ty = ty;
for adjustment in adjustments {
ty = if let Some(ty) = adjust(adjustment, ty) {
ty
} else {
return None;
}
}
Some(ty)
```

And have the obvious effect on values. Adjustments are ordered using the standard lexicographical order.

# Drawbacks
[drawbacks]: #drawbacks

The 2-argument inference adds more complexity to type-checking. It probably has some "interesting" and unexpected uses in some situations, and we need to gain some experience with it before stabilization.

The automatic references can make it less obvious when values are passed by reference or by value. Because operator traits are very general, the created autorefs can easily escape outside of the operator.

# Rationale and Alternatives
[alternatives]: #alternatives

The "Eye of Sauron" case is very annoying for mathematical code, and therefore is something we want to fix. This fix feels rather elegant and is contained to binary operators.

We need to use special coercion logic rather than the "standard" expected type propagation because we are coercing on the basis of trait impls. Therefore, options such as [RFC 2111] will not help in all cases. We could instead try to add trait-impl-based propagation, but that would not solve the "binary" nature of operator traits.

[RFC 2111]: https://github.com/rust-lang/rfcs/pull/2111

Indexing is included in this feature basically because it behaves like the other operators. It's possible that this should not be done.

Because the arithmetic fixup does not apply with references, there could be inference issues with integer variables:
```
let x = &1 + &2; // this uses the <&i32 as Add<&i32>> impl,
// so `x` has type `_` before integer fallback
x.foo(); //~ ERROR cannot infer the type
```
therefore, We might want to extend the arithmetic fixup to references to integer variables in some way.

We might want to allow mutable autorefs in more cases. That would be more consistent with method lookup, but is probablly a bad idea.

# Unresolved questions
[unresolved]: #unresolved-questions

- Should we change the set of operators?
- What priority order should operators use? I think lexicographic is best, but there are other options.
- Should we allow for more general coercions than autoref and autoderef? e.g. function item to function pointer coercions? Is there any use for that? Does it bring disadvantages?