Skip to content

RFC: Minimal Implementation of Indexed Components #10671

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

bushrat011899
Copy link
Contributor

@bushrat011899 bushrat011899 commented Nov 21, 2023

Objective

Goals

  • To create an Index system parameter which will provide an automatically managed way to retrieve an Entity by Component value. The planned public API for this is:
    • A system parameter Index
    • An Index::get method for retrieving all entities indexed against a particular value
    • An Index::iter method for retrieving all entities grouped by their index value
  • Opt-In. An index must have zero negative impact on users if they are not using it.
  • Configurable at compile time. The index should share a similar API with the Query type, allowing for a filter.
  • Independent of component definition. A user should be able to create an index for a type they do not own, and was not designed for an index.

Non-Goals (for now!)

  • Any actual implementation of relationship functionality (e.g., changing Parent or Children components)
  • Helper/convenience methods for integrating queries and indexes (e.g., Join<Left, Q1, Q2, Index<PlayerId>>
  • Optimal performance. I am aiming for the simplest agreeable API first and foremost. As long as the performance is as good as the current workarounds, I consider that a success that can be improved upon later.
  • Non-hash based indexing (e.g., ordered indices, spatial index, etc.). I think these are also very valuable, but different enough from hash-based indexing that I want to avoid spiraling the scope out any more than I absolutely have to.

Details

A user would first register an index during app creation:

// Hypothetical API, TBC
app.add_index<PlayerId>();

This will allow users to access the Index<PlayerId> system parameter:

fn log_players(index: Index<PlayerId>) {
    /// snip
}

An Index will then be used to provide access to an Entity based on component value through a method get:

/// Multiple entities could have the same component!
let entities = index.get(PlayerId(3));

This is what I consider to be the core functionality required to make an Index worth using at all.

Current Implementation Details

Index is a SystemParam bundling access to several relevant items from the ECS. It is currently generic over 3 parameters, T, F, and I.:

  • T: The Component this Index is interested in (e.g., Transform)
  • F: A filter query. This can be used to restrict the indexing to only a subset of components T (e.g., With<PlayerFlag>)
  • I: The Indexer, which describes how to get a usable index from a component T.

F and I are both optional, utilising a default Indexer provided for components if they are able to implement Hash + Clone + Eq. Hash and Eq are required because the bidirectional map is built on a HashMap. Clone is required in order to create a constant time bijection. It may be possible to relax this requirement with an alternate implementation of a bidirectional map.

Because Index includes a Query<'w, 's, (Entity, Ref<'static, T>), (Changed<T>, F)>, it already flags to the scheduler that this parameter requires immutable access to the component T, prevent mutable access from an alternate system. One current pain point is the performance of Changed<T>, which walks over all T (O(n)). #5097 could improve this performance, but it is also a potentially contentious change.

In order to catch removals, a RemovedComponents<'w, 's, T> parameter is included. Because this is an event reader, to read the events requires mutable access, making Index itself require mutable access. This is also a pain point, since it will require all systems using the index to operate sequentially, even if they could in principle run in parallel. I do believe it would be possible to remove the mutable access requirement from Index using a custom implementation of SystemParam that hides these items using interior mutability, but this is a future goal for now.

To ensure all removals and changes are witnessed, adding an Index adds a once-per-frame (Update) system which updates the backend bidirectional map. This is unfortunate, and needs to be revisited.

To ensure the Index is up-to-date within a frame (e.g., in-between mutable accesses of T), reading the Index triggers an update check. If the this_run Tick from SystemChangeTick is not equal to the value currently stored with the Index, the map is updated. This should ensure the update only occurs at most once after mutation.

Performance 🏠🐘

To be blunt, I don't believe it is currently feasible to make an index with performance comparable to Bevy's current (and excellent) ECS query model. It's already very well optimised, and if your algorithm can work with queries in the order they come in, I don't think indexing will improve anything for you at all. Caching is incredibly friendly to the SoA storage that Bevy uses for in-order access, and indexing is inherently for out-of-order access.

However, I think for certain algorithms, and certain stages of development, the ergonomics of accessing an entity by a value outweighs the negatives. This is why my focus for this PR is "what API should an Index have?". I believe that agreeing on this API will create a fixed target that can be optimised for far better than trying to tackle this problem in the reverse direction.

It may turn out that the only way to get a performant index is to modify the underlying Table / SparseSet types (or create a whole new alternative), but that represents such a large body of work, that I feel it should only be undertaken if there's an API for users that will benefit from its addition.


Changelog

TBA

Migration Guide

There should be zero changes to existing APIs, with this being a purely additive PR.

Notes

This is still heavily work-in-progress and I would highly encourage feedback from anyone with a vested interest in how this is implemented to provide feedback, desired goals, and non-goals. I am willing to make major changes to the current proposed design based on feedback so please feel free to provide it!

See Also

Copy link
Contributor

The generated examples/README.md is out of sync with the example metadata in Cargo.toml or the example readme template. Please run cargo run -p build-templated-pages -- update examples to update it, and commit the file change.

Comment on lines 115 to 117
for (entity, component) in self.changed.iter() {
self.index.update(entity, Some(component.clone()));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will require iterating over all entities that have a T component every time a Index may be out of sync, which IMO isn't ideal. AFAIK bevy_mod_index explored this possibility and it wasn't very efficient.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this can definitely be improved. #5097 will naturally improve this problem as well, and by adding a filter query onto the Index, I'm hoping to get performance good enough when users opt-in to indexing to make it worthwhile. Right now, it just needs to be more ergonomic and more performant than what a user could write themselves in a reasonable amount of effort.

@alice-i-cecile alice-i-cecile added C-Feature A new feature, making something new possible A-ECS Entities, components, systems, and events X-Controversial There is active debate or serious implications around merging this PR labels Nov 21, 2023
@alice-i-cecile
Copy link
Member

@laundmo, your perspectives here will be valuable :) IMO we should try to pull together some benchmarks early in this process to validate performance under a few scenarios.

@alice-i-cecile
Copy link
Member

We may need #5097 to accelerate this.

@laundmo
Copy link
Contributor

laundmo commented Nov 21, 2023

@laundmo, your perspectives here will be valuable :) IMO we should try to pull together some benchmarks early in this process to validate performance under a few scenarios.

the more i think about it, the more what most people want out of indexes/indexed components is very different from what i'd need or have experience with. getting entities by component values is very different to getting entities by how their values relate to other entities. this seems to implement the common case of getting them by their own components value.

@bushrat011899
Copy link
Contributor Author

@laundmo, your perspectives here will be valuable :) IMO we should try to pull together some benchmarks early in this process to validate performance under a few scenarios.

the more i think about it, the more what most people want out of indexes/indexed components is very different from what i'd need or have experience with. getting entities by component values is very different to getting entities by how their values relate to other entities. this seems to implement the common case of getting them by their own components value.

This is my current goal just to keep scope tight. I believe that if we had access to this kind of Index, we could then create Join queries on those indices. I believe with Index and Join, the Bevy ECS will cover enough of the standard SQL language features to borrow industry standard techniques from the relational database world for proper relationship graphs.

But that's all much further in the future. Right now, I know that bevy_ggrs would immediately benefit from the simple ability to get an Entity by a Component value, so that's my goal for a minimal solution.

@bushrat011899
Copy link
Contributor Author

We may need #5097 to accelerate this.

I agree. Right now, the only way I can see to make an Index performant outside of change detection would be to actually modify the Table structure itself to manage the indexing. But I think that has the potential to impact the performance for everyone even if they don't use indexing if it's not done perfectly.

Ideally, an Index should be something a user adds to their ECS, and only then will anything about data storage actually change. Similar to how most databases work. Change detection hooks seem like the best way to achieve that as something additive to me.

@alice-i-cecile
Copy link
Member

Yep, for this PR, I would focus on making this a useful and reliable primitive, regardless of whether or not it's used to implement relations in the future.

- Moved most of the implementation out of the new example and into an `indexing` submodule of `bevy_ecs`
- Added basic documentation to satisfy linting
Ensures access to plugin system without modifying the ECS crate.
Replaced with a simple demonstration of a graph orthogonal to the standard `Parent` / `Child` relationships. Items can have a `Parent` and/or an `Owner`. `Owner` is analogous to `Parent`, and with `Index`, `Owned` (analog of `Children`) is not required.
@TotalKrill
Copy link
Contributor

Is this still under evaluation? Because this basic index functionality that is described is something that pops up quite frequently in networking code.

We are visualizing data from http APIs, and this is a feature we would like the engine to provide for us if possible

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ECS Entities, components, systems, and events C-Feature A new feature, making something new possible X-Controversial There is active debate or serious implications around merging this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants