Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

has_changed incompatible with model with vector_fields #132

Open
benwhalley opened this issue Sep 6, 2023 · 6 comments
Open

has_changed incompatible with model with vector_fields #132

benwhalley opened this issue Sep 6, 2023 · 6 comments
Labels
question Further information is requested

Comments

@benwhalley
Copy link

I just added a django-pgvector VectorField to my model and am now having an issue with using has_changed.

The error is ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all().

The source of the error is /mixins.py", line 91, in _diff_with_initial

This doesn't happen if is_changed=False but if True then every field in the model is checked for changes.

I think there are two issues here:

  1. The check for changes should cope with the case where the field value is a vector/list
  2. The check for changes should only be applied to the fields in when_any, not to all field

Cheers!

@EnriqueSoria
Copy link
Collaborator

EnriqueSoria commented Sep 6, 2023

Hi, can you give an example of how you're using the hook? i.e.:

@hook(AFTER_SAVE, has_changed=True)

This doesn't happen if is_changed=False but if True then every field in the model is checked for changes.

This is the expected behaviour

@benwhalley
Copy link
Author

benwhalley commented Sep 6, 2023

class Player(LifecycleModelMixin, AbstractUser):
    name_embedding = VectorField(dimensions=1536, blank=True, null=True)

    # has_changed=False at the moment because of bug with VectorField
    @hook(AFTER_CREATE)
    @hook(AFTER_SAVE, when_any=["username", "first_name", "last_name"])
    def update_name_embedding(self):
        ...

This doesn't happen if is_changed=False but if True then every field in the model is checked for changes.
This is the expected behaviour

I can see that could make sense, but it's ambiguous when using when_any.

Perhaps it would be better if has_changed could accept a boolean (checks all fields) or a string or list of strings (checks only these fields)?

@EnriqueSoria
Copy link
Collaborator

I'm sorry but I'm not sure if I understand what you want to achieve. Could you write in words when you expect your hooks to be fired so I can help you better?

@benwhalley
Copy link
Author

benwhalley commented Sep 7, 2023

I have firstname, last name username fields on the Player model (a User subclass).

I use these to create a text embedding of a combined string f"{firstname} {last_name} {username}.
This is used by a semantic search engine to match usernames in unstructured user-generated data.

I have a function that calls the OpenAI API to generate the embedding. I want this to be called:

  • When a player is added (AFTER_CREATE)
  • When any of the 3 fields mentioned above are changed (AFTER_SAVE, when_any=["username", "first_name", "last_name"])

I don't want it to be called if other fields on the Player model change, so would like the has_changed flag to apply only to the fields mentioned in the AFTER_SAVE decorator.

@EnriqueSoria
Copy link
Collaborator

so would like the has_changed flag to apply only to the fields mentioned in the AFTER_SAVE decorator.

That's how it works.

This should do the trick, it will fire the hook when any of these changes.
@hook(AFTER_SAVE, when_any=["username", "first_name", "last_name"], has_changed=True)

class Player(LifecycleModelMixin, AbstractUser):
    name_embedding = VectorField(dimensions=1536, blank=True, null=True)

    @hook(AFTER_CREATE)
    @hook(AFTER_SAVE, when_any=["username", "first_name", "last_name"], has_changed=True)
    def update_name_embedding(self):
        ...

@EnriqueSoria EnriqueSoria added the question Further information is requested label Sep 12, 2023
@hbradlow
Copy link

hbradlow commented Oct 3, 2024

Any update on this? I am also facing this issue and adding the when_any clause doesn't seem to resolve it.

What does seem to resolve this is overriding the _snapshot_state method to explicitly remove the vector field:

    def _snapshot_state(self):
        state = super()._snapshot_state()
        del state["vector"]
        return state

But this seems pretty hacky.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants