Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Component polymorphism support feature & prototype #859

Closed
zheka2304 opened this issue Mar 28, 2022 · 45 comments
Closed

Component polymorphism support feature & prototype #859

zheka2304 opened this issue Mar 28, 2022 · 45 comments
Assignees
Labels
triage pending issue, PR or whatever

Comments

@zheka2304
Copy link

zheka2304 commented Mar 28, 2022

Hello,

I'd like to introduce my work on prototype feature, that adds component polymorphism support. Such thing might seem a little controversial, so I will immediately mention two things:

  • Despite not being a part of pure ECS concept, such feature can be very useful in many real world cases
  • It is enabled only for certain component types at compile time, so it will not affect performance of non-polymorphic components in any way (pay only for what you use)

I already have working prototype in the experimental branch of my fork. Of course it requires some review and improvements, and some parts may not be made in a nicest way possible right now, but it is fully functional.

Here is a basic example:

struct Ticking : public entt::polymorphic {
    virtual void tick(const entt::entity) = 0;
};

class Physics : public entt::inherit<Ticking> {
    // ...
    void tick(const entt::entity e) override {
         // ...
    }
};

class AI : public entt::inherit<Ticking> {
    // ...
    void tick(const entt::entity e) override {
         // ...
    }
};


registry.emplace<Physics>(e); // emplace polymorphic component
Ticking& ticking1 = registry.get<Ticking>(e); // will return Physics as Ticking
registry.emplace<AI>(e); // emplace another polymorphic component
Ticking& ticking2 = registry.get<Ticking>(e); // will return any of two components, because they are both inherit from Ticking, which component is returned is not defined

// get iterable to iterate each component, that is inherited from Ticking, order of iteration is not defined
for (Ticking& ticking : registry.view<entt::every<Ticking>>(e).each()) {
    // ...
}

// view can also get every component instead of one
registry.view<entt::every<Ticking>>(e).each([](const entt::entity, entt::every<Ticking> every_ticking) {
    for(Ticking& ticking : every_ticking) // ...
});

// will remove EVERY component, that is inherited from Ticking
registry.remove<Ticking>(e);

You can try it yourself, it is already in the fork, along with some tests. I will appreciate any feedback about how to improve both the concept and the solution, and hope it will make it way into EnTT in the future.

Now lets dive into some details:

Problems, I try to address

Maybe the main problem, this feature solves, is accessing components without knowing their exact type. It is sometimes can be very useful to access component as some interface/base type without knowing what exactly it is (so basically this is what polymorphism is about in general, and this feature is applying this concept to ECS).

In the context of ECS it results in 2 main ideas: we should be able to access component by any of its base types and we also should be able to access all such components, attached to an entity, because there of course can be more than one. Example of this idea is already shown above, we create two component types, that implement Ticking interface and add both of them to an entity, and then we can just iterate all ticking components and do something with them (probably just call tick in this case).

Additionally, to follow "pay only for what you use" rule, such polymorphic components must be treated separately and must not affect performance of other components.

Some specs

The basic idea is simple: component can be accessed not only by its own type, but by any of its parent types. From the example above we can see, that both Physics and AI components inherit from, and so can be accessed as reference to Ticking. Here I tried to come up with a bit more formal set of rules:

  1. We define polymorphic components separately from non-polymorphic, by inheriting from entt::polymorphic or entt::inherit<Parent1, Parent2, ...>. Polymorphic component can inherit only from other polymorphic components to keep them separate from non-polymorphic. With this requirement we can completely separate all logic at the compile time and preserve performance for non-polymorphic components.
  2. When polymorphic component is added to entity, it can then be accessed not only by its own type, but as any of its parent types. This applies both for getting it for one entity and for iterating with a view.
  3. Now for some parent type there can be several components of its child types attached to an entity. So we must be able to:
    3.1. Still get one of this components, if we just need one. This operation ideally must be as fast, as getting one non-polymorphic component. Which component will be returned cannot be specified in this case, so we just return any of them.
    3.2. Iterate over all components of this type, attached to an entity. Same as with one component, we must be able to use it both with get and with view. To achieve this, we pass entt::every<ParentType> instead of just ParentType to registry.get or registry.view, and get it back. Then we just iterate entt::every<ParentType> to get ParentType&.
    3.3. Access with entt::every must work completely similar for one and for several components.
  4. Calling erase and remove must delete all components inherited from given type. This might seem a bit controversial, because in some cases we need to delete component of exactly this type, so maybe in the future separate method should be added to do such things.
  5. Actions insert, get_or_emplace are deleted for polymorphic components right now, and patch is also controversial one, as they deserve separate discussion.
    ... This list will be expanded as the discussion progresses

Some implementation details

Wrappers (or containers, don't know, which name is better)

Main idea here is to store polymorphic components inside wrappers/containers (polymorphic_component_container<Component>). These wrappers are able to store component value of exactly given type and reference list to other components. They have storage for component value, list of references to other components and 2 bits for flags to store its state.

Wrapper has methods for value construction/destruction, adding/removing references to other components from reference list and for getting reference to one component or entt::every to iterate all of them. When constructing a value, wrapper adds reference to this value into all wrappers for parent component types, when destructing - it will remove all this references. So it requires registry and entity to access those wrappers for parent types.

Polymorphic components reference each other by pointer, so stable pointer is required, in_place_delete is true for all polymorphic types, and empty type optimization is also disabled, because wrappers are never empty.

Changes in basic_storage

Because polymorphic components are stored in wrappers and accessed in a slightly different way, separate basic_storage and iterator implementations are created for them. It changes implementation of get, emplace, erase and remove and also adds hidden methods for polymorphic_component_container like emplace_ref and erase_ref.

I have also added get_as<T> (and similar) methods for all basic_storage implementations and similar as<T> methods for storage iterators. The purpose of those is resolving entt::every (and maybe some other type modifiers in the future). For non-polymorphic storage those methods are completely similar to get/operator*, they actually just call them with an extra static_assert, so no performance overhead here.

And here comes an uglier part: emplace and erase now require registry for polymorphic storage, so right now they are just passed as an additional parameter, that is ignored for non-polymorphic component storages. Again, no performance overhead here, as the registry parameter is just optimized out, but the whole idea of such inverse dependence is pretty nasty. It is another open discussion on how to implement this in a more elegant way.

Changes in basic_view

Here things are simpler: I use get_as<T>/as<T> methods I have created in storage and storage iterators to just forward everything in a right way. Because all of these methods are basically same as get/operator* used before for non-polymorphic types, no overhead is expected for them.

There is also some changes to get unwrap entt::every to get storages for required component, but these are fully resolved at compile time.

Changes in basic_registry

Only changes here are passing registry to emplace/erase/remove and unwrapping entt::every for views.

Dive into polymorphic_component_container implementation

As was said before, polymorphic_component_container aka polymorphic component wrapper can store both component value and reference list. In a very basic implementation it can be done like this (and also thinking of it in this way may help a lot):

template<typename Component>
struct polymorphic_component_container {
    std::optional<Component> value;
    std::vector<Component&> refs;
};

But when I came to thinking about erase implementation the following problem kicked in. When erasing component, we must also erase all components listed in refs (those are child types). When deleting component, it must delete all of its references from parent wrappers, so we must know exact types of components in refs to do it or come up with another way. My idea was to store pointer to 'deleter' function along with component reference, such deleter will be placed when reference is added and called, when it is required to erase component by its reference. So wrapper implementation should look like this:

template<typename Component>
struct polymorphic_component_container {
    struct component_ref {
        Component& ref;
        void* deleter;
    };
    std::optional<Component> value;
    std::vector<component_ref> refs;
};

Now lets move on to real implementation. It does not use std::vector nor std::optional. Instead it has data - buffer for component value or at least one pointer and uintptr_t pointer, which can store pointer to one reference or pointer to list, depending on wrapper state. And to store wrapper state I use a solution with some pointer hacks - I rely on memory alignment, so 2 bits of pointer are always zero and can be used to store the state (it is achieved by aligning entt::inherit by at least 4 bytes). This saves us one memory access when getting reference to the component and also sizeof(pointer) bytes of memory per wrapper, that would be otherwise added by padding. I thought it was worth it, so there is a lot of strange stuff going on inside wrapper, but if you consider this an overkill I of course will rethink it.

Now lets look on the possible storage states: there are 2 bits, 2 flags that are mostly independent of each other, 1st flag - if wrapper has a value, 2nd flag - if wrapper has a reference list. Describing whole implementation here would take forever, I will try to do it separately, so here are some important notes:

  • There is always a value or a reference stored directly inside the wrapper, so one component reference can be always accessed without any indirection and looking into list. If nothing is stored, wrapper is deleted.
  • When wrapper has list, it stores all the references, owned by this wrapper, including reference to the value inside this wrapper (if in has one), so iterating the list does not require extra logic.
  • When wrapper contains only one reference or only value, it does not allocate a list (and also it deallocates a list, if only one reference/value remains)
  • Lists are allocated via component_ref_list_page_source to optimize many small memory allocations of the same size, required for them. Note: right now template of component_ref_list_page_source depends only on underlying allocator type, but maybe it should instead be instantiated per component type (the page size must be reduced greatly in this case).

I am planning to clean up, and move some of this logic out of the wrapper class to abstract underlying memory layout a bit, right now the solution is messy.

...Implementation details are also to be expanded
@skypjack skypjack self-assigned this Mar 29, 2022
@skypjack skypjack added the triage pending issue, PR or whatever label Mar 29, 2022
@skypjack
Copy link
Owner

Hi, I see that your branch has 10 changed files with 2,170 additions and 112 deletions.
My first suggestion is to provide the reader with a walkthrough of your work and reasoning. Otherwise, it may take a while before getting a feedback.
Another thing that helps to focus on your proposal is to clearly define the goals of this work. Polymorphism support is somewhat vague. A list of use cases and problems you're trying to address would help on the other hand. Otherwise, agani, it may take a while before knowing what this code adds to the whole picture.

That being said, let's start with scratching the surface though. A bunch of questions/doubts after a quick glance:

  • The fact that both page_size and in_place_delete depend on the polymorphic nature of a type is really odd to me in terms of design. Why is it required at all?
  • The sparse set doesn't know about the registry class today and and I'd rather stick with this approach.
  • How does this affect the view perf? I see many changes but I can't easily figure out if they add up for the non-polymorphic case.

I won't go into the details of storage.hpp and component.hpp right away, mainly because they are very huge and with many changes. I'm a little short in time today. 😅
Something that looks odd is the storage specialization for is_polymorphic_component_v<Type>. I don't get why being a polymorphic type should affect the storage type.

As I was reading your proposal, I thought of a much less impactful implementation to support polymorphic types. It would also fit with an upcoming feature, literally entering the same design.
However, I'd like more details about your work and it would be really helpful if you could guide me through it as suggested at the beginning. It might just be enough with a clean up already but I really struggle to grasp all the details from the code alone, sorry.

@zheka2304
Copy link
Author

zheka2304 commented Mar 29, 2022

Thanks for your reply! I will write in details about ideas and implementation and put it in the heading comment for easier access as soon as possible.

And also here is a brief response for your first questions:

  • Polymorphic components are stored in wrappers inside the storage, and reference each other by pointer, so they must have stable pointer and also prevent empty type optimization. page_size is zero for empty type components, so polymorphic check was added to prevent this for empty, but polymorphic types. in_place_delete is an overkill, because it was duplicated in entt::inherit, I will clean it up.
  • When polymorphic component is added, it adds its reference into all parent type storages, and when it is removed, it must delete all those references. So the idea that it requires registry to get those storages. Making duplicate emplace/remove methods for storages and sparse sets to pass registry were the fast way to do it, but I admit it is an ugly one and must be refactored. Even when I was writing it I have already decided to discuss it with you, because you will suggest the best way to solve this problem.
  • I haven't yet tested two versions of the library (with and without my changes) side by side. But the whole idea here was to separate all the logic at compile time, so logic for non-polymorphic components stays completely the same. All alternative methods, that were added, contain additional logic only for polymorphic components types, for non-polymorphic they just have additional unused parameters or wrap a call to another function. It should all be optimized out completely, when it is inlined.
  • And for the storage: as was mentioned earlier, polymorphic components are stored inside wrappers, so they can hold both value and references to other components. These wrappers require separate logic for access, especially for handling entt::every. I first started with making this logic inside original storage class, but it quickly became a complete mess with a lot of if constexpr(is_polymorphic), so it seems making separate class for polymorphic components is just a cleaner solution.

@zheka2304
Copy link
Author

zheka2304 commented Mar 30, 2022

I have added a lot of details into the heading comment, and will add more with progression of this discussion. Those should help to dive into what I am trying to achieve, and how this whole thing even works (or, at least, how its supposed to work). If some parts are incorrect or unclear, please, notify me and I will rewrite them.

@skypjack
Copy link
Owner

That's great. Thank you very much. I'll try to go through it as soon as possible.
Sorry if it takes a while but it's really a lot of stuff to read and digest before moving on. 🙂

A quick question (and sorry if it sounds stupid but I still have to dig into your explanation).
Do I get it correctly that the whole thing doesn't apply to types for which the user doesn't have the control?
For example, imagine a third party hierarchy aimed to create UIs (cough coff Qt coff cough). I can't put its graphic items in my registry because I can't make them inherit from entt::polymorphic, is it correct?
Please, also ignore for a moment the fact that probably it doesn't even make sense to put these objects in the registry. It's just to provide an example. 😅

@Innokentiy-Alaytsev
Copy link
Contributor

Innokentiy-Alaytsev commented Mar 30, 2022

I'm not sure how physics libraries work, but if they have some sort of shape hierarchy, then it might be an example of non-extensible third-party class as well. From what I understand, having physics-related components is somewhat common.

@zheka2304
Copy link
Author

zheka2304 commented Mar 30, 2022

It is a good question, right now there are no way to declare hierarchy for third-party components. However it seems possible to add alternative way of declaring hierarchy separately from the classes. entt::inherit is just a template to form type_list of all parent types for given component type, and nothing stops such list from being created outside the component type itself. The only difficulty may be in hidden relation between entt::inherit and hacks in wrapper implementation.

@skypjack
Copy link
Owner

skypjack commented Mar 30, 2022

I'm trying to wrap my head around the implementation details. A bit at a time.
I'm not sure I get them correctly though. Please, correct me if I'm wrong.

Let's suppose there are 3 pools for types T1, T2 and T3. All these types derive from a common base B (polymorphic and inherit and all the rest, that part is pretty clear and I'll drop a bunch of questions/doubts later on probably 🙂).
When I add an instance of T2, what happens is that a wrapper is created within its pool while another two wrappers are also created in the pools for T1 and T3? All these wrappers have the size of 🤔 B plus a pointer?
I think I don't get what Component is here:

template<typename Component>
struct polymorphic_component_container;

That is, if it's the base type or the final type. This also makes it difficult to understand how this type is used.
Can you elaborate a little further on this point? Thanks!

@zheka2304
Copy link
Author

I suppose, you have hierarchy like this:

    B
+---|---+
|   |   |
T1  T2  T3

When you add component T2, it adds references to itself into all parent pools (B in this case), so you following structure:

pool T2: T2        // stores value of newly created T2
pool B : T2&       // stores no value, but has a reference to T2 value in T2 pool
// pools T1, T3 are empty

So T2 can be accessed both as T2 and as B.

When you add, for example T1, you will get the following:

pool T1: T1              // stores value of newly created T1
pool T2: T2              // still stores a value of T2
pool B : [T1&, T2&]      // still stores no value, but now has both reference to T2 value in T2 pool and T1 value in T1 pool
// pool T3 is still empty

Returning to your question about Component template parameter - it is a exact (not parent or child) type, that this wrapper can store by value. Only one value can be stored per wrapper, as there is only one component of this exact type can be attached to a single entity.

When attached to an entity, component (lets assume it is T2) is first put into corresponding pool of T2 into the wrapper for the same type, and then it will go through all of its parents and add reference to itself.

Hope this will help a bit.

@skypjack
Copy link
Owner

Hope this will help a bit.

Yeah, definitely. Thank you a lot.
The first question that pops to my mind is the following: isn't it an issue when it comes to multithreading?
The design of EnTT makes it possible to add and remove (but even sort) components in parallel. In this case, if I decided to store aside a hierarchy like that of Qt where all items inherit from a QObject or QWidget type, multithreading the access to their pools isn't an option anymore.
Do I get it correctly? Standalone and fully independent pools are kinda expected most of the times.

@zheka2304
Copy link
Author

zheka2304 commented Mar 31, 2022

Yes, it limits multithreading for polymorphic components, which share a parent, because accesses to these components will also affect shared parent pool. Only pools, related to separate hierarchies can be accessed in parallel. However I think it is expected, that components, that are somehow related to each other, are not safe to access from different threads.

@skypjack
Copy link
Owner

This is a contrived point imho.
When I access them from the base, I think you're right, I expect to lock all pools in a sense.
However, if I want to access all buttons and all labels in parallel, I can't really see why I can't (I mean, as an user).
I'm accessing different explicit types that are stored in different pools from what I know. Imagine what kind of bugs one could introduce in a codebase by changing the hierarchy definition when all systems are multithreaded already.
That's pretty risky and makes the use of this feature rather limited probably. Food for thoughts anyway.

@zheka2304
Copy link
Author

Yes, I see, from this perspective this issue seems much scarier. From the first glance it seems impossible to make viable polymorphic component implementation with safe multithreaded access for intersecting hierarchy, while keeping whole idea of EnTT unchanged and without introducing some kind of hidden locks/atomics, which of course is not viable too.

The one idea is accessing all child pools while trying to get component by its base type, instead of adding references to parent pools. This way it will be safe to modify different pools, because this modifications will not touch their shared parent pool. But in this way get and iterate operations will be extremely slow and will require a lot of random memory access. So this is not an option.

I think the possible solution from design perspective could be adding some sort of compile time guards, when you can maybe declare what components you will access from which thread (I am not sure, how exactly this will look, it is just my first idea). This will require additional declaration, but it will not compile, in case of conflicts. This may also help with non-polymorphic components, to guard same types from being accessed from different threads.

@Innokentiy-Alaytsev
Copy link
Contributor

Hope this will help a bit.

Yeah, definitely. Thank you a lot. The first question that pops to my mind is the following: isn't it an issue when it comes to multithreading? The design of EnTT makes it possible to add and remove (but even sort) components in parallel. In this case, if I decided to store aside a hierarchy like that of Qt where all items inherit from a QObject or QWidget type, multithreading the access to their pools isn't an option anymore. Do I get it correctly? Standalone and fully independent pools are kinda expected most of the times.

In the particular case of Qt, all QObject-based types are expected to only be used through pointer so storing them by value in a pool will be a bit complicated. On the other hand, storing pointers won't work with polymorphic storage because pointer types are not directly related through inheritance.

It has little to nothing to do with the proposal, just something to keep an eye on: sometimes you shouldn't store things by value by design.

@zheka2304
Copy link
Author

It has little to nothing to do with the proposal, just something to keep an eye on: sometimes you shouldn't store things by value by design.

I think such problem will be solved along with adding the possibility to declare third party hierarchies, I assume that you will be able to say something like "A* is a parent to B*" for types, you want to store by pointer, instead of "A is parent to B" for what you want to store by value.

@Innokentiy-Alaytsev
Copy link
Contributor

What about making this whole thing external with respect to the pools? Something like entt::registry::poly_for_each<Base, Derived1, Derived2,...>() that just iterates all those pools if they are present, returning a reference to the Base. As for functions like entt::registry::get(), I'm not sure how they can support polymorphic components - AFAIK, they assume that only one component of the given type is associated with an entity, it's impossible to achieve adequate behaviour without an optional<T&>.

@zheka2304
Copy link
Author

zheka2304 commented Mar 31, 2022

What about making this whole thing external with respect to the pools? Something like entt::registry::poly_for_each<Base, Derived1, Derived2,...>() that just iterates all those pools if they are present, returning a reference to the Base.

The idea here is same to this one:

The one idea is accessing all child pools while trying to get component by its base type, instead of adding references to parent pools. This way it will be safe to modify different pools, because this modifications will not touch their shared parent pool. But in this way get and iterate operations will be extremely slow and will require a lot of random memory access. So this is not an option.

And also this will require update all of such calls each time new derived type is added. And there can be dozens of child types, all of which will be required to check for each iteration, which is a major drawback, even if you fix the first one by separately declaring derived list once.

As for functions like entt::registry::get(), I'm not sure how they can support polymorphic components - AFAIK, they assume that only one component of the given type is associated with an entity, it's impossible to achieve adequate behaviour without an optional<T&>.

In the heading comment there is explanation about how this is solved right now - search for entt::every. It can handle returning multiple components, and also can be used as optional (but right now it is yet to be implemented unfortunately).

@Innokentiy-Alaytsev
Copy link
Contributor

entt::registry::poly_for_each<Base, Derived1, Derived2,...>()

Could be improved (kinda):

template< class T >
struct derived_types 
{
  using value = entt::type_list<T>;
};

template<>
struct derived_types< Base > 
{
  using value = entt::type_list< Base, Derived1, Derived2 >;
};

template< class... T >
entt::registry::poly_for_each (entt::type_list< T... >) 
{
  entt::registry::each< T > ()...;
}

template< class T >
entt::registry::poly_for_each (...) 
{
  entt::registry::poly_for_each (typename derived_types< T >::value{});
}

@zheka2304
Copy link
Author

Yep, as I've said. However there are still performance issues.

And there can be dozens of child types, all of which will be required to check for each iteration, which is a major drawback, even if you fix the first one by separately declaring derived list once.

@skypjack
Copy link
Owner

What if one implemented this with a storage mixin?
When a pool for a derived type is created, it registers itself as a candidate for elements that derive from a given base class.
Then you can just iterate all these pools to visit all the elements that derive from that class.

@zheka2304
Copy link
Author

zheka2304 commented Mar 31, 2022

This will remove the need for separate declaration, but the main, in my opinion, problem still remains. Say, I have 10 component types, that derive from one base (common case, if the base is some kind of an interface).

With the original approach iterating all the components by this base will be nearly as fast, as doing same thing with non-polymorphic component - just iterate through one storage, everything is already there. However addition and deletion of derived components is slower, because we need to emplace/remove references for one parent. And it will not be thread safe between this 10 components.

On the other hand, @Innokentiy-Alaytsev's approach will be completely thread safe and will not slow down addition and deletion. But get and iterate operations will suffer a lot. It is not the same case, as iterating through entitites with all of given components, it is now any of given components. And to do this, you are required to iterate through all entities, no smallest pool optimization is available + for each of all entitites you will need to check all 10 pools. But, maybe there is a better way to do this, which I cant see right away, so please, correct me, if I am wrong.

@skypjack
Copy link
Owner

Mmm no, wait. What I'm saying is that all pools created with types that derive from B could register themselves on creation in a list (somehow, ignore the technical details for a moment). So, when you want to iterate all and only the entities that have a B, you've a bunch of pools that you visit linearly.
The small set optimization would still apply here, since the size of B is just the sum of their sizes. Not sure why you would have to check all other pools for each entity instead, I'm kinda missing this point. 🤔

Anyway, just thinking aloud at the moment to see if we can have the cake and eat it. 🙂

@zheka2304
Copy link
Author

Oh, I see now, dont know how I've missed this. In this way it will be a good solution, and the only cost will be just remembering already visited entities.

I can try implementing something like this in the upcoming days, what do you think?

@skypjack
Copy link
Owner

Uhm, I mean, as you prefer. I won't stop you for sure. 🙂
At the moment I'm mostly throwing questions and random ideas at you to dig into the topic together. 😅
However, yeah, you're more than welcome if you want to experiment with different approaches to see if we can refine it.
I can share more details on what I have in mind if you like (tomorrow morning though, it's 7PM here).
Just let me know. See you and thanks for the discussion. 👋

@skypjack
Copy link
Owner

skypjack commented Mar 31, 2022

Just to be clear, what I mean is something along this line.
The relevant code is the base_inspector_mixin class and the entt::storage_traits specialization.
It doesn't even require any change to the codebase and also solves the problem with multithreading.
If it feels hacky, that's because I packed the example in less than 30 minutes around midnight and so, yeah, it's hacky. 😅
However, it should give you a grasp of what I mean. Let me know if you've any doubts.

@zheka2304
Copy link
Author

zheka2304 commented Apr 1, 2022

Yes, thank you a lot, I understand what you mean. I just was focused on the one solution I came up with, because I had such idea in mind before and wanted to implement it, it was obviously a mistake, I should have spend more time comparing different approaches.

Although it may provide some advantages like iterating several polymorphic components at once, those just seem to be not so useful compared to the main disadvantages - how complex the implementation is and the multithreading one. And they can be implemented with your idea, maybe even with insufficient additional performance cost.

As for your solution, I have one doubt right now: because registering pools happens at runtime, we don't know the exact type inside them, so converting pointer between them might be a problem with multiple inheritance. Suppose you have this types:

struct A { int a; };
struct B { int b; }
struct C : public A, B {};

Now, when converting to from C& to B& we need to add 4 to the underlying pointer. When doing this conversion at the compile time, the compiler will do this for us in a right way, but at runtime it will become a problem.

dynamic_cast does not seem to be an option here, as the components can be anything (if we assume we can have external hierarchy declaration, and also they just might not be virtual). The one solution I can propose is storing pointer difference along with the sparse set somehow (I hope you wont think badly of me after seeing this one, it is just a random idea) and then just do conversion by pointer arithmetic. But this one seems to be violating standard (although I think it will work in most compilers).

If we cant solve the above problem, then we will need to declare all children for each type, and I really want to avoid such requirement. If you will come up with a better solution, please, let me know. And also I might just again be missing some obvious solution 😅

@Innokentiy-Alaytsev
Copy link
Contributor

Or don't allow multiple inheritance.

@zheka2304
Copy link
Author

Yes, but I think it can be pretty useful, so it is not the best option, imho

@zheka2304
Copy link
Author

Stepping back to earlier discussions, here I came up with some templates to allow hierarchy declaration in several different ways, it is now not dependent on entt::inherit and allows declaration outside of the type itself. Please, tell me, what do you think about it.

@skypjack
Copy link
Owner

skypjack commented Apr 1, 2022

It's really a shame that we can't extract the base classes from a type. At least the most direct ones. It's not that the compiler doesn't know about them, so why not? It would make a lot of things a lot easier.

I just was focused on the one solution I came up with, because I had such idea in mind before and wanted to implement it, it was obviously a mistake, I should have spend more time comparing different approaches.

It was not a mistake actually. It has its pros and cons while the road from point A to point B is never straight. 🤷‍♂️

because registering pools happens at runtime, we don't know the exact type inside them, so converting pointer between them might be a problem with multiple inheritance

Yeah, for sure. When the this pointer changes in a hierarchy, my example breaks. It's just a proof of concept to show what I had in mind.
It's easy to solve though, at least at a first glance. We can bind a fake vtable with the pool and use it when needed. Something like this at the end of the day.

I came up with some templates to allow hierarchy declaration in several different ways

Far better imho. Lot of times we devs use third party libraries and it's important to desing things in such a way that they can also work with these tools. 👍

@zheka2304
Copy link
Author

Thank you! I think now I will go and make an implementation, that could possibly already go into the library, so it would be easier to discuss it in details. I will post it here, when it will be ready.

@zheka2304
Copy link
Author

zheka2304 commented Apr 5, 2022

Hello, again! I've at last got my hands on implementing the new approach. Sorry, if it took a while, unfortunately, last days were busy for me.

So here it is: https://github.com/zheka2304/entt/tree/polymorphic. It is smaller and much cleaner, than the previous one, and its based on ideas, that were proposed here, so it will be easier to describe. It lacks some functionality and is not completely done yet, but it has the core features.

Lets first start with the updated polymorphic component declaration. Now, polymorphic components can be declared in different ways:

  • Old way using entt::inherit:
struct A : public entt::inherit<> {}; // base type
struct B : public entt::inherit<A> {}; // derived type
  • New way, by declaring parents externally:
struct A {}; // base type
struct B : A {}; // derived type

// hierarchy declaration
template<>
struct entt::poly_direct_parent_types<A> {
    using parent_types = type_list<>;
};
template<>
struct entt::poly_direct_parent_types<B> {
    using parent_types = type_list<A>;
};
  • Additionally, parent types can be declared manually inside a polymorphic type. However this one should be used with care, because if you wont redeclare type list, it could be inherited from a parent type without any warning.
struct A {
    using direct_parent_types = entt::type_list<>;
};
struct B : A {
    using direct_parent_types = entt::type_list<A>;
};

There are also some utils to determine, if a type is polymorphic, and for getting all parent types of a given type:

entt::is_poly_type_v<T>
entt::poly_parent_types_t<T>

Also, now polymorphic component hierarchy can be declared for pointers, instead of the components themselves:

template<>
struct entt::poly_direct_parent_types<A*> {
    using parent_types = type_list<>;
};
template<>
struct entt::poly_direct_parent_types<B*> {
    using parent_types = type_list<A*>;
};

In the above example A* will be considered polymorphic, and A will not. Implementation fully supports it by separately handling conversion between A* and B*, internally dereferencing them, so converting pointers with multiple inheritance will also work fine.

The registry now has three new methods:

  • poly_get_all<Type>(entity) - returns iterable to go through all components, derived from Type for a given entity
  • poly_get_any<Type>(entity) - returns pointer to the any component, derived from Type or nullptr
  • poly_each<Type>(func) - iterate all components in the registry, derived from Type

Now, lets move on to the core of the implementation. The example, given by @skypjack used emplace_hint to store child pools for given polymorphic types. This one works basically in a same way, but I've created separate class entt::poly_types_data to store and access all data related to polymorphic components in the registry. It allows to get polymorphic type data by given type, which gives access to all related pools via entt::poly_type and entt::poly_pool_holder classes.

entt::poly_type<Entity, Type> represents runtime data for one polymorphic type, it stores entt::poly_pool_holder for each child pool. It allows iterating through those pool holders, and iterating all components in those pools for a given entity by using specialized iterators.

entt::poly_pool_holder<Entity, Type> holds one child pool and converts components from this pool into Type. It implements one method - try_get, which will return converted pointer to Type or nullptr. Also it allows to access underlying pool as a sparse_set.

Sorry, if I've forgotten something, I still don't have much free time, so this text was written in a hurry. Please, write to me, if something is not clear or there are some mistakes. Thank you!

@zheka2304
Copy link
Author

zheka2304 commented Apr 11, 2022

During the past few days I've worked on some improvements in the same branch, so here they are:

  • registry.poly_get_all, registry.poly_get_any and registry.poly_each and all polymorphic data containers, described above, are now able to correctly handle const qualifiers on input component types: using const Type instead of Type will result in passing back a const reference to the component. This is also applied for pointer components: requesting const Type* will result in const Type* to be passed back.

  • Added registry.poly_remove<Type>(entity) - a method to remove all polymorphic components, derived from Type for a given entity. Will return count of removed components. This one works by carrying additional function pointer in poly_pool_holder for removing component by entity from underlying pool.

  • Implemented signal support for polymorphic storage mixins. Before, on_construct, on_update and on_destroy weren't implemented for poly types, but now they are. Corresponding events are called for all parent types, not only for the exact type of the component, that was constructed/updated/destroyed. To allow listening events for non-constructible base classes, separate poly storage mixin is used for such types, that does not derive from storage and instead just allows signal handling.

I think right now it has most of the required functionality and the implementation is clean enough, so maybe I should move to polishing it and writing tests. Please, write your thoughts on it, I will appreciate any feedback.

@skypjack
Copy link
Owner

@zheka2304 have you pushed the last changes? The most recent commit is from 11 days ago from what I can see. Am I missing something?

@zheka2304
Copy link
Author

It is located in a separate branch - https://github.com/zheka2304/entt/tree/polymorphic

@skypjack
Copy link
Owner

I see, thanks. The first comments that come to my mind after a first glance:

  • Not sure I get the purpose of poly_sigh_holder and why you would want to trigger N events for the same elements. Signals are meant for things like resource cleanup and similar. I can't really imagine an use case for this complex mecahnism honestly.

  • polymorphic_data and all the functions added to the registry interface go somewhat against both the logic and the policy of this library. That is, currently I'm moving towards splitting the logic from the registry to also allow a registry less model. Therefore, adding more logic to the registry won't help. Moreover, this implementation makes me pay (as an user) for something that I don't want necessarily. That is why my suggestion was to use the context to store the relevant data.

  • Similarly, your mixin forces the signal support on the user. This isn't optional, you can't disable it as you can do with the built-in signals. Is it intended? How so?

  • What does the register_for_all_parent_types function do? It just appends the storage to a list of derived classes for all the base types of Type?

Thanks and good job!! 🙂

@zheka2304
Copy link
Author

Thank you for your feedback!

  • I thought sometimes it might be useful to register signal for some base class, so it will be triggered when something is happening with any derived type. However, if signals are used mostly for something like resource cleanup, this is not very useful and better to be removed.
  • Fair enough, I will store and access polymorphic_data through the context, so it will be created, only if any polymorphic components are used. As for the new functions in the registry, if they should not be located there, what will be the best place for them in your opinion?
  • And for mixin I will separate signal handling part and polymorphic storage part of it, so user will be able to specialize polymorphic storage type without signal support.
  • Yes, it does exactly this. When looking at this right now, I think I should probably inline add_self_for_type function.

@skypjack
Copy link
Owner

I thought sometimes it might be useful to register signal for some base class, so it will be triggered when something is happening with any derived type.

I don't know honestly. I can figure out an use case here. Can you help me with that?

As for the new functions in the registry, if they should not be located there, what will be the best place for them in your opinion?

I plan to add an algorithm.hpp file or similar in the entity directory. However, for the purpose of your work, it doesn't really matter. Just create a new file to wrap them all. We can easily move them later eventually. 🤷‍♂️

@zheka2304
Copy link
Author

zheka2304 commented Apr 14, 2022

Hi, and sorry for my late reply.

I thought sometimes it might be useful to register signal for some base class, so it will be triggered when something is happening with any derived type.

I don't know honestly. I can figure out an use case here. Can you help me with that?

I've recently read about observers and it seems to me that they might benefit from this idea. We can, for example, observe updates of the base polymorphic component and then iterate over all entities, where any derived component has been updated, and do something with it. I am still exploring the conception and applications of observers in EnTT, so this might not fit into the idea of the library at all.

However I think, that even if we will decide to keep this feature, it should be disabled by default and enabled for certain component types manually, as it has a performance impact.

As for the new functions in the registry, if they should not be located there, what will be the best place for them in your opinion?

I plan to add an algorithm.hpp file or similar in the entity directory. However, for the purpose of your work, it doesn't really matter. Just create a new file to wrap them all. We can easily move them later eventually. 🤷‍♂️

Understood and working on it and on the other things mentioned in the comments above. Only one question here: should I make algorithms independent from the basic_registry?

@skypjack
Copy link
Owner

Hi, and sorry for my late reply.

Really, no need to apologize. 🙂
Open source works like this, we do our best when we have the time to do that. So, well, thank you for contributing!

I've recently read about observers and it seems to me that they might benefit from this idea.

Mmm, yeah, in theory it might be an option but I still can't figure out a proper use case in practice. Not sure it's worth it but let's keep it in a corner for later. We'll have the time to think about it a little more.

However I think, that even if we will decide to keep this feature, it should be disabled by default and enabled for certain component types manually, as it has a performance impact.

Definitely. Pay for what you use. I don't want to affect the normal uses with a corner case feature.

should I make algorithms independent from the basic_registry?

I think so 🤔 not sure if it's possible but it would be great if it only worked with storage classes and the like.

@zheka2304
Copy link
Author

I've recently read about observers and it seems to me that they might benefit from this idea.

Mmm, yeah, in theory it might be an option but I still can't figure out a proper use case in practice. Not sure it's worth it but let's keep it in a corner for later. We'll have the time to think about it a little more.

Ok, I will move to separating signal handling and polymorphic parts of the mixin. There is not much code, and it should be rewritten anyway, so it is safe to clean it up for now.

should I make algorithms independent from the basic_registry?

I think so 🤔 not sure if it's possible but it would be great if it only worked with storage classes and the like.

Here is my idea, how it should look. I've made a type poly_types_accessor, that is used, to access polymorphic type data. It can be specialized for any type, and by default is specialized for basic_registry to access data from its context.

Then, I've created assure_poly_type<Component> function, that uses poly_types_accessor to get poly_type<Entity, Component> from given polymorphic data holder (for example, registry). This one does some type checks, gets entity type from the holder and handles const qualifiers. In general it provides us with a simple way to get polymorphic type data from some opaque data holder type by just calling assure_poly_type<T>(holder).

With it, implementing polymorphic algorithms, in a way, they don't rely on registry type becomes easy. What do you think about it?

@skypjack
Copy link
Owner

You probably have the whole picture in mind and I'm glad. Although, honestly, I can't easily understand how it turns out. 😅
So, well, I'm tempted to say to give it a shot and see what happens. We can still refine it later eventually. 🤷‍♂️

@zheka2304
Copy link
Author

zheka2304 commented Apr 17, 2022

I've cleaned everything up, removed polymorphic header include in registry, so it can be included separately, only when it is required. I've also added some tests for polymorphic type traits and for algorithms & functionality.

So, well, I'm tempted to say to give it a shot and see what happens. We can still refine it later eventually. 🤷‍♂️

Does this mean, that I should open a pull request with what I have right now? And if it means yes, into which branch should I make it?

@skypjack
Copy link
Owner

Well, opening a PR is probably the best way to make the code easy to review actually.
It doens't always mean that it will be merged but it helps to discuss and refine an idea most of the times.
What I do with your branch usually is to create a fake PR towards EnTT and look into the diff. So, well... 😅 🤷‍♂️

@zheka2304
Copy link
Author

Thank you for making it clear, I've created a pull request #871. Looking forward to continue our discussion there :)

@skypjack
Copy link
Owner

Sounds good. I'm closing the issue then. Let's continue the discussione directly on the PR. 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
triage pending issue, PR or whatever
Projects
None yet
Development

No branches or pull requests

3 participants