Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion: lock the map during access to prevent recursive modification #40

Open
dead-claudia opened this issue Aug 29, 2021 · 3 comments · May be fixed by #71 or #73
Open

Suggestion: lock the map during access to prevent recursive modification #40

dead-claudia opened this issue Aug 29, 2021 · 3 comments · May be fixed by #71 or #73

Comments

@dead-claudia
Copy link

Essentially, what I'm proposing is that emplace should set a flag to block recursive updates to the same map or weak map, whether it be through set, delete, or emplace. This enables implementations to not have to do nearly the bookkeeping when managing updates.

  • In the current form, engines may have to re-look up the hash - this is always necessary if the entry is deleted on the update callback, and engines optimizing for space may end up doing a second hash table lookup always anyways as it's just simpler and smaller to implement.*
  • With this suggestion, they can just do it once and know for certain it will continue to exist at that exact position when the callback returns.
    • This also means all they need to keep around on the stack is a pointer to the target hash table entry, and all they do before returning is a single pointer-sized store. They need neither the map itself nor the hash key to store, and so it becomes a very lightweight process overall.

It also exposes potential bugs to the user - recursive updates are rarely intended, and if absolutely desired, you can just do something like this to still work around the lock and avoid it:

let deleteMarker = Symbol("deleteMarker")
// Push `[key, deleteMarker]` to trigger delete or `[key, value]` to trigger set
let queued = [[initialKey, initialValue]]

while (queued.length) {
  let [key, value] = queued.shift()
  if (value === deleteMarker) {
    map.delete(key)
  } else {
    map.emplace(key, {
      insert: () => {
        // do insert things
        return newValue
      },
      update: v => {
        // do update things
        return newValue
      },
    })
  }
}

*This is only technically the case if the hash table is represented as an array of inline key/value pairs (as in something like std::vector<HashTable>) rather than an array of pointers to entries (as in something like std::vector<std::shared_ptr<HashTable>>). For efficiency, nearly every hash table implementation in practice does the former if they want to aim for any serious performance, and of course, every engine I'm aware of also does it that way, and so yeah, it's widely enough applicable for me to say "always" here as that's in practice the case because of how everyone implements it.

@dead-claudia
Copy link
Author

I just found this bit about lower-spec VMs, and this just reinforces this suggestion - this really feels like an algorithmic oversight.

@acutmore
Copy link

The latest spec still seems to have this issue. Would be great to decide on a solution.

Right now there is no check after the callback is invoked

1. Let _value_ be ? Call(_callbackfn_, *undefined*, « _key_ »).

That mapData hasn't been updated before we get to

1. Append _p_ to _M_.[[MapData]].

Meaning there could end up with two records with matching keys.

I personally like the idea of throwing a re-entrancy error in map.set. Though as that is a method that does not throw today I can see how this may be unpopular

@dead-claudia
Copy link
Author

Alternatively, a spec note could be used to recommend a special boolean "has reserved entry" flag to make the added check cheap:

  • On emplace:
    1. Lookup entry.
    2. Set "has reserved entry".
    3. Execute insert/update callback.
    4. If "has reserved entry" is clear, re-lookup entry again.
    5. Clear "has reserved entry".
    6. Insert/update/remove value.
  • On every other mutable operator:
    1. Clear "has reserved entry".
    2. Proceed as before.

This could be further optimized by keeping the old value (as just a raw integer reference value - it can't be followed by GC) and comparing that for equality instead. Then, the "clear" woukd be setting it to a sentinel, the "set" would be setting it to the key, and the test would be comparing for equality. And this pattern could then be extended to has/get/update by storing the value of the most recently looked-up entry. (IIRC this is not far off of how WebKit optimized has+get pairs, but this is much more general.)

michaelficarra added a commit to michaelficarra/proposal-upsert that referenced this issue Nov 18, 2024
michaelficarra added a commit to michaelficarra/proposal-upsert that referenced this issue Nov 18, 2024
dminor added a commit to dminor/tc39-agendas that referenced this issue Nov 21, 2024
michaelficarra pushed a commit to tc39/agendas that referenced this issue Nov 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants