NDB: Query with UserProperty results in "An entity value is not allowed" error #1002

Open
yihaoWang opened this issue Oct 7, 2024 · 10 comments
Labels
api: datastore Issues related to the googleapis/python-ndb API.

yihaoWang commented Oct 7, 2024


Environment details

  1. API: Google Cloud NDB
  2. OS type and version: macOS 14.5
  3. Python version: Python 3.10.9 (using pyenv)
  4. google-cloud-ndb version: 2.3.2 (using pip show google-cloud-ndb)

Steps to reproduce

  1. Create a model class with a UserProperty field, such as TestModel.
  2. Use the users.User object to create and store an instance of TestModel.
  3. Attempt to query the stored instance based on the UserProperty.
  4. Observe the error when trying to retrieve the result.

Code example

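Model module (imported below as junyi.activity.test_model):

from google.cloud import ndb

class TestModel(ndb.Model):
    owner = ndb.UserProperty()

Test file (test_model_test.py in the stack trace):

from google.appengine.api import users
from junyi.activity.test_model import TestModel
from testutil.gae_model import GAEModelTestCase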

class TestTestModel(GAEModelTestCase):
    def test_get_user_data(self):
        user = users.User(email="[email protected]")
        test_model = TestModel(owner=user)
        test_model.put()
        query = TestModel.query().filter(TestModel.owner == user)
        result = query.get()
        print("result", result)

Stack trace

Traceback (most recent call last):
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/_datastore_api.py", line 98, in rpc_call
    result = yield rpc
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
        status = StatusCode.INVALID_ARGUMENT
        details = "An entity value is not allowed"
        debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"An entity value is not allowed", grpc_status:3, created_time:"2024-10-07T08:15:48.775244+08:00"}"
>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/yihaowang/junyi/junyiacademy/junyi/activity/test_model_test.py", line 11, in test_get_user_data
    result = query.get()
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/query.py", line 1201, in wrapper
    return wrapped(self, *dummy_args, _options=query_options)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/utils.py", line 118, in wrapper
    return wrapped(*args, **new_kwargs)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/utils.py", line 150, in positional_wrapper
    return wrapped(*args, **kwds)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/query.py", line 2067, in get
    return self.get_async(_options=kwargs["_options"]).result()
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/tasklets.py", line 210, in result
    self.check_success()
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/tasklets.py", line 157, in check_success
    raise self._exception
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/tasklets.py", line 319, in _advance_tasklet
    yielded = self.generator.throw(type(error), error, traceback)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/query.py", line 2101, in get_async
    results = yield _datastore_query.fetch(options)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/tasklets.py", line 319, in _advance_tasklet
    yielded = self.generator.throw(type(error), error, traceback)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/_datastore_query.py", line 116, in fetch
    while (yield results.has_next_async()):
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/tasklets.py", line 319, in _advance_tasklet
    yielded = self.generator.throw(type(error), error, traceback)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/_datastore_query.py", line 343, in has_next_async
    yield self._next_batch()  # First time
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/tasklets.py", line 319, in _advance_tasklet
    yielded = self.generator.throw(type(error), error, traceback)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/_datastore_query.py", line 373, in _next_batch
    response = yield _datastore_run_query(query)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/tasklets.py", line 319, in _advance_tasklet
    yielded = self.generator.throw(type(error), error, traceback)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/_datastore_query.py", line 1030, in _datastore_run_query
    response = yield _datastore_api.make_call(
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/tasklets.py", line 319, in _advance_tasklet
    yielded = self.generator.throw(type(error), error, traceback)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/_retry.py", line 97, in retry_wrapper
    raise error
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/_retry.py", line 82, in retry_wrapper
    result = yield result
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/tasklets.py", line 319, in _advance_tasklet
    yielded = self.generator.throw(type(error), error, traceback)
  File "/Users/yihaowang/.pyenv/versions/3.10.9/lib/python3.10/site-packages/google/cloud/ndb/_datastore_api.py", line 102, in rpc_call
    raise error
google.api_core.exceptions.InvalidArgument: 400 An entity value is not allowed

Thanks!

@product-auto-label product-auto-label bot added the api: datastore Issues related to the googleapis/python-ndb API. label Oct 7, 2024

fbukevin commented Oct 15, 2024

Some of the logic in your code looks weird. You declare a class named TestModel:

class TestModel(ndb.Model):
    owner = ndb.UserProperty()

Then you import a class with the same name from a custom module:

from junyi.activity.test_model import TestModel
...
        test_model = TestModel(owner=user)

Won't the two conflict when you invoke it?
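
For reference, here is a minimal standalone sketch of what would happen if both snippets lived in one file: the later import simply rebinds the name, so the imported class wins and no conflict is raised. The module name fake_models and the origin attribute are made up for illustration, and nothing here touches Cloud NDB.

```python
import sys
import types


class TestModel:
    origin = "defined in this file"


# Register a throwaway module (hypothetical name) that exports its own TestModel.
fake = types.ModuleType("fake_models")
fake.TestModel = type("TestModel", (), {"origin": "imported"})
sys.modules["fake_models"] = fake

# The import statement rebinds the name TestModel in this namespace.
from fake_models import TestModel  # noqa: E402

print(TestModel.origin)  # prints: imported
```

So a same-named import shadows the earlier definition silently. That is usually worth avoiding, but on its own it would not explain a Datastore INVALID_ARGUMENT error.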

@yihaoWang

Apologies for the confusion. I've simplified the sample code to make it more straightforward and reproducible. Please check whether the following code works for you:

import os
import sys
from google.appengine.api import users
from google.cloud import ndb
import dev_appserver

class TestModel(ndb.Model):
    owner = ndb.UserProperty()


def init_ndb():
    os.environ["AUTH_DOMAIN"] = "example.com"
    
    ndb_client = ndb.Client(project="test")
    return ndb_client


def test_get_user_data():
    ndb_client = init_ndb()
    with ndb_client.context():
        user = users.User(email="[email protected]")
        test_model = TestModel(owner=user)
        test_model.put()
        query = TestModel.query().filter(TestModel.owner == user)
        result = query.get()
        print("result", result)


if __name__ == "__main__":
    test_get_user_data()

@fbukevin

@yihaoWang

After removing import dev_appserver, which is supported only in the Python 2 client library, I can reproduce the same error.

This error, "An entity value is not allowed", mostly occurs when you attempt to store types like google.appengine.api.users.User through the Google Cloud NDB module. Values of this type are not supported by Cloud NDB. The Cloud NDB library was updated for Python 3 and no longer supports some App Engine data types and modules.

ndb.UserProperty() belongs to the old App Engine API. It is not compatible with Python 3 and the Cloud NDB library.

One solution is to use a custom property to store data such as the email address. You can create a user instance with google.appengine.api.users and store the email via user.email() instead of storing the entire users.User object (i.e. owner = ndb.UserProperty()).

Here is an example based on your code, amended:

import os
import sys
from google.appengine.api import users
from google.cloud import ndb

class TestModel(ndb.Model):
    owner_email = ndb.StringProperty()

def init_ndb():
    os.environ["AUTH_DOMAIN"] = "example.com"
    ndb_client = ndb.Client(project="example-project")
    return ndb_client

def test_get_user_data():
    ndb_client = init_ndb()
    with ndb_client.context():
        user = users.User(email="[email protected]")
        test_model = TestModel(owner_email=user.email())
        test_model.put()
        query = TestModel.query().filter(TestModel.owner_email == user.email())
        result = query.get()
        print("result", result)

if __name__ == "__main__":
    test_get_user_data()

And the result of query is:

result TestModel(key=Key('TestModel', 5644004762845184), owner_email='[email protected]')

@youchenlee

Actually, this error "An entity value is not allowed" mostly occurs when you attempt to store types like google.appengine.api.users.User into Google Cloud NDB module.

@fbukevin
test_model.put() succeeds, but the error appears at:

result = query.get()

@fbukevin

Sorry for the confusing wording. What I meant is that if you store a google.appengine.api.users.User value through the Google Cloud NDB module, trying to query it by the user property can lead to this error.

@youchenlee

Thank you, @fbukevin .

We are migrating from google.appengine.ext.db to Cloud NDB. Many existing models and queries rely on UserProperty. Is there any workaround to make these queries compatible without needing to migrate billions of rows of data? 😢

youchenlee commented Oct 31, 2024

Previous versions of the Google Cloud Datastore API had an explicit
``UserValue`` field. However, the ``google.datastore.v1`` API returns
previously stored user values as an ``Entity`` with the meaning set to
``ENTITY_USER=20``.

According to this comment, the issue can be resolved by adding meaning = 20 to the filter value in the final request:

project_id: "test"
partition_id {
  project_id: "test"
}
read_options {
}
query {
  kind {
    name: "TestModel"
  }
  filter {
    property_filter {
      property {
        name: "owner"
      }
      op: EQUAL
      value {
        entity_value {
          properties {
            key: "email"
            value {
              string_value: "[email protected]"
              exclude_from_indexes: true
            }
          }
          properties {
            key: "auth_domain"
            value {
              string_value: "example.com"
              exclude_from_indexes: true
            }
          }
        }
        meaning: 20 ### Added this ###
      }
    }
  }
  limit {
    value: 1
  }
}

This was done using a temporary hack in the code:

+++ /site-packages/google/cloud/datastore/helpers.py       2024-11-01 02:21:15.749998162 +0800
@@ -498,6 +498,8 @@
     elif attr == "entity_value":
         entity_pb = entity_to_protobuf(val)
         value_pb.entity_value.CopyFrom(entity_pb._pb)
+        if 'auth_domain' in val.keys() and 'email' in val.keys():
+            value_pb.meaning = 20
     elif attr == "array_value":
         if len(val) == 0:
             array_value = entity_pb2.ArrayValue(values=[])._pb

Looking forward to a better fix, where the correct meaning is applied when encountering a UserProperty or an App Engine User object.
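
To make the limitation of the key-based check concrete, here is a small standalone sketch. It is not the library code: SimpleNamespace is a hypothetical stand-in for the protobuf value, and the helper names are invented for illustration. Only the condition on auth_domain and email and the value 20 come from the patch and the comment quoted above.

```python
from types import SimpleNamespace

# Meaning value quoted earlier in this thread for stored user entities.
ENTITY_USER_MEANING = 20


def looks_like_user(val):
    """Key-based check taken from the patch: both user keys are present."""
    return "auth_domain" in val.keys() and "email" in val.keys()


def tag_if_user(value_pb, val):
    """Set value_pb.meaning when val passes the key-based check."""
    if looks_like_user(val):
        value_pb.meaning = ENTITY_USER_MEANING
    return value_pb


def new_value():
    # Hypothetical stand-in for a protobuf Value message (meaning defaults to 0).
    return SimpleNamespace(meaning=0)


user = tag_if_user(
    new_value(), {"email": "user@example.com", "auth_domain": "example.com"}
)
partial = tag_if_user(new_value(), {"email": "user@example.com"})
# An unrelated embedded entity that happens to carry both keys is tagged too.
lookalike = tag_if_user(
    new_value(),
    {"email": "ops@example.com", "auth_domain": "example.com", "plan": "pro"},
)

print(user.meaning, partial.meaning, lookalike.meaning)  # prints: 20 0 20
```

The last case is why a fix driven by the property type (UserProperty) would be safer than one that infers the type from the keys of the value.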

fbukevin commented Nov 1, 2024

Hi @youchenlee ,

You can meet your requirement this way as a workaround. I suggest you open a pull request with your fix.
Either the Google engineering team designed it this way deliberately, or it is a bug and the fix can be accepted 🙂.

fbukevin commented Nov 5, 2024

@googleapis

Steps to reproduce

  • Environment: Cloud Shell in a Google Cloud Platform project.
  • Python version: 3.12.3
  • Installed packages: pip install appengine-python-standard google-cloud-ndb
  • Reproduction code:
import os
import sys
from google.appengine.api import users
from google.cloud import ndb

class TestModel(ndb.Model):
    owner = ndb.UserProperty()


def init_ndb():
    os.environ["AUTH_DOMAIN"] = "ikala.ai"
    
    ndb_client = ndb.Client(project="cloud-sa-sandbox-1")
    return ndb_client


def test_get_user_data():
    ndb_client = init_ndb()
    with ndb_client.context():
        user = users.User(email="[email protected]")
        test_model = TestModel(owner=user)
        test_model.put()
        query = TestModel.query().filter(TestModel.owner == user)
        result = query.get()
        print("result", result)


if __name__ == "__main__":
    test_get_user_data()

Demo: screenshot_20241105151059.mp4 (attached screen recording)

@yihaoWang

I've submitted PR #1004 to fix this issue.

The root cause has been identified: UserProperty meanings need to be handled at the entity's top level in the Datastore, but this wasn't being properly set during query filter creation. The fix implements proper meaning propagation in the UserProperty._comparison method, ensuring that _MEANING_PREDEFINED_ENTITY_USER is correctly set at the entity level.

This should restore the ability to filter by UserProperty fields as expected.
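
As a rough illustration of "meaning at the entity's top level", the toy model below puts the meaning on the filter value that wraps the user entity and leaves the nested email and auth_domain values untagged. This is not the code from the PR and not the Cloud NDB API: ToyValue, ToyUser and user_filter are hypothetical names, and only the value 20 comes from this thread.

```python
from dataclasses import dataclass

MEANING_ENTITY_USER = 20  # value quoted earlier in this thread


@dataclass
class ToyValue:
    """Hypothetical stand-in for a Datastore value: a payload plus a meaning."""
    payload: object
    meaning: int = 0


@dataclass
class ToyUser:
    email: str
    auth_domain: str


def user_filter(name, user):
    """Build an equality filter whose top-level value carries the user meaning."""
    entity = {
        # Nested values stay untagged; only the wrapping value is marked.
        "email": ToyValue(user.email),
        "auth_domain": ToyValue(user.auth_domain),
    }
    return {
        "property": name,
        "op": "EQUAL",
        "value": ToyValue(entity, meaning=MEANING_ENTITY_USER),
    }


flt = user_filter("owner", ToyUser("user@example.com", "example.com"))
print(flt["value"].meaning)                   # prints: 20
print(flt["value"].payload["email"].meaning)  # prints: 0
```

Because the filter is built by the property itself, the tag is applied only to values that are known to be users, with no guessing from key names.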
