Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IndexMap::insert_unique_unchecked #200

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

stepancheg
Copy link
Contributor

@stepancheg stepancheg commented Aug 22, 2021

Insert a key-value pair into the map without checking if the key
already exists in the map.

This operation is safe if a key does not exist in the map.

However, if a key exists in the map already, the behavior is
unspecified: this operation may panic, or any following operation
with the map may panic or return arbitrary result.

This operation is faster than regular insert, because it does not
perform lookup before insertion.

This operation is useful during initial population of the map. For
example, when constructing a map from another map, we know that
keys are unique.

Simple benchmark of insert vs insert_unique_unchecked included:

test insert                  ... bench:      14,929 ns/iter (+/- 2,222)
test insert_unique_unchecked ... bench:      11,272 ns/iter (+/- 1,172)

Insert a key-value pair into the map without checking if the key
already exists in the map.

This operation is safe if a key does not exist in the map.

However, if a key exists in the map already, the behavior is
unspecified: this operation may panic, or any following operation
with the map may panic or return arbitrary result.

This operation is faster than regular insert, because it does not
perform lookup before insertion.

This operation is useful during initial population of the map.  For
example, when constructing a map from another map, we know that
keys are unique.

Simple benchmark of `insert` vs `insert_unique_unchecked` included:

```
test insert                  ... bench:      14,929 ns/iter (+/- 2,222)
test insert_unique_unchecked ... bench:      11,272 ns/iter (+/- 1,172)
```
@bluss
Copy link
Member

bluss commented Aug 23, 2021

Does hashbrown itself have any operation like this? Maybe insert_unique_unchecked or insert_new_unchecked would be a name that the user understands more intuitively? An IndexSet method with the same semantics is needed to maintain parity, if this method is accepted.

If this method is good - is it something we can use internally somewhere?

@cuviper
Copy link
Member

cuviper commented Aug 23, 2021

Hashbrown's several RawTable::*insert* methods works this way, where you're expect to check matches separately first. We do take advantage of that in the push being used here, e.g. via VacantEntry::insert, or in our rebuild_hash_table calling insert_no_grow.

I'm not aware of anything public like this on hashbrown::HashMap, but it would be nice to settle on a good name together...

@stepancheg
Copy link
Contributor Author

I like insert_unique_unchecked name.

Give me some time, I'll open a PR to hashbrown and and hear what people say there (about name, and generally about the idea of exposing map internals). One possibility is that they say, just use RawTable directly. And maybe this is what indexmap crate should do to stay consistent with others.

stepancheg added a commit to stepancheg/hashbrown that referenced this pull request Sep 12, 2021
Sometimes a map is constructed when it is known that all keys are
unique (e. e. if keys are coming from another map or from a
sorted/deduplicated iterator). In this case we can make insertion
faster by skipping a check that a key already exists in the map.

`insert_unique_unchecked` is guaranteed to be memory-safe, but does
not guarantee anything beyoyd that: if inserted key is not unique,
`HashMap` can panic, loop forever, return incorrect entry etc.

Added simple benchmark. `insert_unique_unchecked` is about 30%
faster than `insert`.  Your mileage may vary of course.

Similar PR was
[added to `indexmap` crate](indexmap-rs/indexmap#200)
and they asked to discuss the name of the operation with `hashbrown`
crate owners to come to the same naming convention (if `hashbrown`
is willing to have the same operation).
stepancheg added a commit to stepancheg/hashbrown that referenced this pull request Sep 12, 2021
Sometimes a map is constructed when it is known that all keys are
unique (e. e. if keys are coming from another map or from a
sorted/deduplicated iterator). In this case we can make insertion
faster by skipping a check that a key already exists in the map.

`insert_unique_unchecked` is guaranteed to be memory-safe, but does
not guarantee anything beyond that: if inserted key is not unique,
`HashMap` can panic, loop forever, return incorrect entry etc.

Added simple benchmark. `insert_unique_unchecked` is about 30%
faster than `insert`.  Your mileage may vary of course.

Similar PR was
[added to `indexmap` crate](indexmap-rs/indexmap#200)
and they asked to discuss the name of the operation with `hashbrown`
crate owners to come to the same naming convention (if `hashbrown`
is willing to have the same operation).
@stepancheg
Copy link
Contributor Author

Hashbrown PR: rust-lang/hashbrown#293

stepancheg added a commit to stepancheg/hashbrown that referenced this pull request Sep 12, 2021
Sometimes a map is constructed when it is known that all keys are
unique (e. e. if keys are coming from another map or from a
sorted/deduplicated iterator). In this case we can make insertion
faster by skipping a check that a key already exists in the map.

`insert_unique_unchecked` is guaranteed to be memory-safe, but does
not guarantee anything beyond that: if inserted key is not unique,
`HashMap` can panic, loop forever, return incorrect entry etc.

Added simple benchmark. `insert_unique_unchecked` is about 30%
faster than `insert`.  Your mileage may vary of course.

Similar PR was
[added to `indexmap` crate](indexmap-rs/indexmap#200)
and they asked to discuss the name of the operation with `hashbrown`
crate owners to come to the same naming convention (if `hashbrown`
is willing to have the same operation).
bors added a commit to rust-lang/hashbrown that referenced this pull request Sep 13, 2021
insert_unique_unchecked operation

Sometimes a map is constructed when it is known that all keys are
unique (e. e. if keys are coming from another map or from a
sorted/deduplicated iterator). In this case we can make insertion
faster by skipping a check that a key already exists in the map.

`insert_unique_unchecked` is guaranteed to be memory-safe, but does
not guarantee anything beyond that: if inserted key is not unique,
`HashMap` can panic, loop forever, return incorrect entry etc.

Added simple benchmark. `insert_unique_unchecked` is about 30%
faster than `insert`.  Your mileage may vary of course.

Similar PR was
[added to `indexmap` crate](indexmap-rs/indexmap#200)
and they asked to discuss the name of the operation with `hashbrown`
crate owners to come to the same naming convention (if `hashbrown`
is willing to have the same operation).
@stepancheg stepancheg changed the title IndexMap::insert_no_overwrite_unchecked IndexMap::insert_unique_unchecked Sep 14, 2021
@stepancheg
Copy link
Contributor Author

Updated the PR with insert_unique_unchecked name to match new hashbrown API.

However, unlike hashbrown, I opted not to return (&K, &mut V) from insert_unique_unchecked because returning these values caused significant regression even in insert operation (if IndexMapCore::push returns (usize, &K, &mut V) instead of usize). It should be no regression, but perhaps something is inefficient in generated machine code.

facebook-github-bot pushed a commit to facebookincubator/antlir that referenced this pull request Sep 15, 2021
Summary:
Before this diff `kwargs` were collected into `Dict` which is `SmallMap<Hashed<Value>, Value>`.

Now `kwargs` are collected into `SmallMap<Hashed<StringValue>, Value>` and that `SmallMap` is "coerced" into a map with `Value` key.

When inserting keys we lookup previous keys (this should be partially addressed by [this PR in indexmap](indexmap-rs/indexmap#200)), and equality operation on `StringValue` is cheaper that equality on `Value` because there's no dynamic casts. We don't do real equality often (because hash collisions are rare), but having `StringValue` instead of `Value` may generate more efficient machine code.

Also this makes code a little more type-safe.

Reviewed By: ndmitchell

Differential Revision: D30921794

fbshipit-source-id: cf2b4fa72eeef150e6308d2fe2d7c16f59166586
facebook-github-bot pushed a commit to facebook/starlark-rust that referenced this pull request Sep 15, 2021
Summary:
Before this diff `kwargs` were collected into `Dict` which is `SmallMap<Hashed<Value>, Value>`.

Now `kwargs` are collected into `SmallMap<Hashed<StringValue>, Value>` and that `SmallMap` is "coerced" into a map with `Value` key.

When inserting keys we lookup previous keys (this should be partially addressed by [this PR in indexmap](indexmap-rs/indexmap#200)), and equality operation on `StringValue` is cheaper that equality on `Value` because there's no dynamic casts. We don't do real equality often (because hash collisions are rare), but having `StringValue` instead of `Value` may generate more efficient machine code.

Also this makes code a little more type-safe.

Reviewed By: ndmitchell

Differential Revision: D30921794

fbshipit-source-id: cf2b4fa72eeef150e6308d2fe2d7c16f59166586
/// Insert a key-value pair into the map without checking
/// if the key already exists in the map.
///
/// Returns a reference to the key and value just inserted.
Copy link
Member

@bluss bluss Sep 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! It does not return a reference in this version (this doc and the other both say this).

I can see the problem, we'd need some unsafe code to do the index -> &mut V lookup without extra cost here. It's fine to avoid this and just return nothing I think (?) Unless there is a compelling nameable usecase

@bluss
Copy link
Member

bluss commented Sep 15, 2021

@cuviper This is a new feature and in theory we'd bump the version to 1.8 since it's a feature update. I guess it's only debatable where to draw the line between incremental fix and improvement, but I don't mind using the feature bump for new methods (for new trait impls, depends, then it depends I think).

@bluss
Copy link
Member

bluss commented Sep 15, 2021

I prefer to see that we try to keep map-set parity when possible. Should there be a set method for this?

src/map.rs Outdated Show resolved Hide resolved
src/map.rs Outdated Show resolved Hide resolved
@stepancheg
Copy link
Contributor Author

Updated docs.

There's a IndexSet::insert_unique_unchecked operation in the latest version of the PR.

@cuviper
Copy link
Member

cuviper commented Sep 22, 2021

Updated the PR with insert_unique_unchecked name to match new hashbrown API.

However, unlike hashbrown, I opted not to return (&K, &mut V) from insert_unique_unchecked because returning these values caused significant regression even in insert operation (if IndexMapCore::push returns (usize, &K, &mut V) instead of usize). It should be no regression, but perhaps something is inefficient in generated machine code.

I'm conflicted about matching the name without properly matching the API in the return value. I don't see why you would change push though -- e.g. VacantEntry::insert just calls push and then does the indexing itself.

@cuviper
Copy link
Member

cuviper commented Sep 22, 2021

Simple benchmark of insert vs insert_unique_unchecked included:

test insert                  ... bench:      14,929 ns/iter (+/- 2,222)
test insert_unique_unchecked ... bench:      11,272 ns/iter (+/- 1,172)

FWIW, the initial numbers are much closer on my machine, using rustc 1.57.0-nightly (5ecc8ad84 2021-09-19):

test insert                  ... bench:      12,863 ns/iter (+/- 171)
test insert_unique_unchecked ... bench:      11,312 ns/iter (+/- 114)

Adding the return references does consistently slow that down:

test insert                  ... bench:      12,875 ns/iter (+/- 298)
test insert_unique_unchecked ... bench:      11,917 ns/iter (+/- 96)

I also tried running with 10k insertions, and the benefit narrowed further:

test insert_10k                  ... bench:     158,763 ns/iter (+/- 1,472)
test insert_unique_unchecked_10k ... bench:     143,972 ns/iter (+/- 1,155)

and with return references:

test insert_10k                  ... bench:     158,910 ns/iter (+/- 2,191)
test insert_unique_unchecked_10k ... bench:     149,291 ns/iter (+/- 1,396)

@stepancheg
Copy link
Contributor Author

@cuviper looks bad. Let me check again.

@stepancheg
Copy link
Contributor Author

On my laptop, MacBook Pro, Intel Core i9, rustc 1.57.0-nightly (8c2b6ea37 2021-09-11) the numbers I gave are correct.

@stepancheg
Copy link
Contributor Author

For 10k items:

test insert                  ... bench:     137,650 ns/iter (+/- 13,904)
test insert_unique_unchecked ... bench:     116,824 ns/iter (+/- 25,202)

For 100:

test insert                  ... bench:       1,530 ns/iter (+/- 141)
test insert_unique_unchecked ... bench:       1,288 ns/iter (+/- 237)

@cuviper
Copy link
Member

cuviper commented Sep 22, 2021

@bluss

This is a new feature and in theory we'd bump the version to 1.8 since it's a feature update. I guess it's only debatable where to draw the line between incremental fix and improvement, but I don't mind using the feature bump for new methods (for new trait impls, depends, then it depends I think).

I'm not too concerned about that version criteria, personally. I do prefer minor bumps for pseudo-breaking things like MSRV, but otherwise it's fine either way. We also have #195 and #196 pending release that are featureful.

@stepancheg

On my laptop, MacBook Pro, Intel Core i9, rustc 1.57.0-nightly (8c2b6ea37 2021-09-11) the numbers I gave are correct.

That's fine, benchmarks vary. FWIW mine is a desktop running Fedora 34 with an AMD Ryzen 7 3800X.
(I'm surprised your 10k numbers are so much better when the others are pretty close!)

@stepancheg
Copy link
Contributor Author

I don't see why you would change push though -- e.g. VacantEntry::insert just calls push and then does the indexing itself.

And if I do that, there's no regression anymore (at least I don't see it easily: rust builtin benchmarks are too unreliable). I'll update the PR now.

@stepancheg
Copy link
Contributor Author

stepancheg commented Sep 22, 2021

OK, proper (more or less) benchmark for 1000 inserts. 100 runs.

Raw data: https://gist.github.com/stepancheg/b234abc8da06de88acc18c3a1c7adfe3
Averages of 100 run averages:
insert: 15.632
insert_unique_unchecked: 13.105

Yes, I gave a too high number initially. Sorry.

@stepancheg
Copy link
Contributor Author

I'll post an update after another benchmark finishes.

@stepancheg
Copy link
Contributor Author

stepancheg commented Sep 22, 2021

OK, more benchmarks. Returning references from insert_unique_unchecked as implemented in current version of PR (969115b) actually causes small regression for insert_unique_unchecked operation compared to returning nothing, but regression is even smaller when using unsafe.

How I checked.

I created a simple utility:

#![feature(bench_black_box)]

use indexmap::IndexMap;
use std::hint::black_box;

fn main() {
    for _ in 0..200000 {
        let mut m = IndexMap::with_capacity(1000);
        for i in 0..1000 {
            m.insert_unique_unchecked(i, i);
            black_box(&mut m);
        }
    }
}

Compiled it with this branch.

Then applied a patch which removes returning values and compiled the utility again (version C).

And applied a patch to this branch which uses unsafe (version B):

--- a/src/map.rs
+++ b/src/map.rs
@@ -390,10 +390,11 @@ where
     /// This operation is useful during initial population of the map.
     /// For example, when constructing a map from another map, we know
     /// that keys are unique.
+    #[allow(unsafe_code)]
     pub fn insert_unique_unchecked(&mut self, key: K, value: V) -> (&K, &mut V) {
         let hash = self.hash(&key);
         let index = self.core.push(hash, key, value);
-        let entry = &mut self.core.as_entries_mut()[index];
+        let entry = unsafe { self.core.as_entries_mut().get_unchecked_mut(index) };
         (&entry.key, &mut entry.value)
     }

Then I used my absh utility https://github.com/stepancheg/absh which I used for proper A/B benchmarking.

Utility is invoked like this:

absh -a ./default -b ./with-unsafe -c ./no-return -r

After 500 iterations the output is:

A: n=501 mean=2.556 std=0.129 se=0.005 min=2.355 max=3.073 med=2.526
B: n=501 mean=2.497 std=0.128 se=0.005 min=2.313 max=3.203 med=2.462
C: n=501 mean=2.479 std=0.139 se=0.006 min=2.273 max=2.975 med=2.438
A: distr=[      ▁▃▅▆▇▅█▃▆▆▅▂▄▄▃▂▂▃▃▂▁▁▁▁ ▂  ▁                      ]
B: distr=[   ▁▃▆▆▆██▆▇▅▃▄▄▄▁▃▂▁▃▂▁ ▁▁▁▁▁  ▁                        ]
C: distr=[ ▁▄▄▄▆▆█▇▅▅▅▅▃▃▃▃▁▃▁▂▁▁▁▁▁▁▁▁▁▁▁   ▁                     ]
B/A: 0.977 0.971..0.983 (95% conf)
C/A: 0.970 0.963..0.976 (95% conf)

Again:
A: this branch with returning references
B: this branch with returning references with unsafe
C: this branch without returning references

So, simply returning references (A) causes about 3% regression against version with returning nothing (C), but using unsafe wins most of it back.

@cuviper cuviper added the waiting-for-std Changes waiting for stabilization in the standard library, so we can match the API. label Jul 15, 2022
@Sagebati
Copy link

Hi, this is an interesting feature, what can I do too push it forward ? The PR in hashbrown got merged. Do we need it also on the HashMap of std ?

@cuviper
Copy link
Member

cuviper commented Jun 6, 2023

The ideal would be to have this added and stabilized in std first, so the API precedent is locked in. AFAIK, that hasn't even been proposed yet, which should go through this process:
https://std-dev-guide.rust-lang.org/feature-lifecycle/api-change-proposals.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
waiting-for-std Changes waiting for stabilization in the standard library, so we can match the API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants