Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Model fields that contain a default value are not marked as required in response API #1299

Open
johnthagen opened this issue Sep 25, 2024 · 5 comments

Comments

@johnthagen
Copy link
Contributor

johnthagen commented Sep 25, 2024

Describe the bug

Using the default Django User model, creating a Serializer and ViewSet from this leads to many of the fields being optional in the response model that should always be "required"/assured to be there.

Consider:

from django.contrib.auth.models import User
from rest_framework.serializers import ModelSerializer


class UserSerializer(ModelSerializer):
    class Meta:
        model = User
        fields = "__all__"

And

from django.contrib.auth.models import User
from rest_framework.mixins import (
    CreateModelMixin,
    ListModelMixin,
    RetrieveModelMixin,
    UpdateModelMixin,
)
from rest_framework.permissions import AllowAny


class UserViewSet(
    CreateModelMixin, RetrieveModelMixin, UpdateModelMixin, ListModelMixin, GenericViewSet
):
    queryset = User.objects.all()
    serializer_class = UserSerializer
    permission_classes = [AllowAny]
    filter_backends = []

This creates a schema where many of the fields (such as is_staff) are optional even for the User response type:

Screenshot 2024-09-25 at 11 53 20 AM

But I'm not sure why they would be marked optional in the response, as my API will always return a bool for them? I see in the Model definitions they have default values, perhaps that is related?

class AbstractUser(AbstractBaseUser, PermissionsMixin):
    ...
    is_staff = models.BooleanField(
        _("staff status"),
        default=False,
        help_text=_("Designates whether the user can log into this admin site."),
    )

I'm still not sure if this is a bug, or some kind of expected behaviour for this model.

The problem this causes is that in generated clients such as Typescript the code is:

  /**
   * Designates whether the user can log into this admin site.
   */
  is_staff?: boolean;

And the ? denotes it is optional and all clients then have to do a null check even through the API will always return a value for this field.

In a generated Python client, this looks like the following, with an extra Unset state:

is_staff: Union[Unset, bool] = UNSET

Environment

  • Python 3.11.8
  • drf-spectacular 0.27.2
  • DRF 3.15.2
  • Django 4.2.15
@johnthagen johnthagen changed the title Default User Serializer makes many fields optional Default User Serializer makes many response fields optional Sep 25, 2024
@tfranzel
Copy link
Owner

tfranzel commented Sep 25, 2024

Hi @johnthagen,

One OpenAPI component is not able to fully cover what a DRF serializer can do. What you see is a carefully crafted compromise that would work both for the request and response. If you want to have a better representation, you need to enable COMPONENT_SPLIT_REQUEST as the docs suggest:

https://drf-spectacular.readthedocs.io/en/latest/client_generation.html#component-issues

This is especially important for client generation due to the stated reasons. You will then get 2 components that together very closely resemble what the serializer actually does.

Regarding the required flag, the field is marked as required if the serializer field is marked (or auto-generated) as required or if the field is readOnly. You have to realize that DRF's required field deals with the request side of things. It is not statement about the response. We can only make one simplification for readOnly fields, where making the field required it is not a regression on requests.

Can't really say anything about the specific generator target, but with the COMPONENT_SPLIT_REQUEST, most generator issues go away.

@johnthagen
Copy link
Contributor Author

@tfranzel Thanks for this response.

If you want to have a better representation, you need to enable COMPONENT_SPLIT_REQUEST as the docs suggest

Sorry, I should have also mentioned my drf-spectacular settings, as I always have that set to True due to the important rationale in the docs (this setting is very important!).

SPECTACULAR_SETTINGS = {
    "COMPONENT_SPLIT_REQUEST": True,
    # drf-spectacular-sidecar settings
    "SWAGGER_UI_DIST": "SIDECAR",
    "SWAGGER_UI_FAVICON_HREF": "SIDECAR",
    "REDOC_DIST": "SIDECAR",
    ...
}

In my first post, I tried to simplify my example by not including the generated UserRequest object, but I did generate it locally. Here are both:

Screenshot 2024-09-26 at 8 09 34 AM

I think this will help summarize the crux of my question. If I swap the built in User model for my own:

from django.db import models


class MyUser(models.Model):
    username = models.TextField()
    is_staff = models.BooleanField()
    is_superuser = models.BooleanField()


class UserSerializer(ModelSerializer):
    class Meta:
        model = MyUser
        fields = "__all__"

Then the generated OpenAPI spec is correctly generated as I expect. The response type in the API guarantees that is_staff and is_superuser will always be present.

Screenshot 2024-09-26 at 8 24 08 AM

This is the expected behaviour for my typical Serializers and ViewSets, but I'm trying to figure out why the built in User model does not exhibit this behavior, especially on the generated response type (User in the API).

@johnthagen
Copy link
Contributor Author

johnthagen commented Sep 26, 2024

Okay, I seem to have located the cause. It is the default in the Models field, as I suspected.

If I change to:

class MyUser(models.Model):
    username = models.TextField()
    is_staff = models.BooleanField(default=True)
    is_superuser = models.BooleanField()

is_staff is no longer set as required in the response object.

    User:
      type: object
      properties:
        id:
          type: integer
          readOnly: true
        username:
          type: string
        is_staff:
          type: boolean
        is_superuser:
          type: boolean
      required:
      - id
      - is_superuser
      - username
Screenshot 2024-09-26 at 8 29 15 AM

To me, this makes sense for the request type (e.g., a user does not need to provide is_staff because Django will put in the default if the user does not provide it).

But for the response type, DRF will always return the is_staff field, so it should be marked required, right?

Perhaps there is some logic with regards to looking at default values in Model fields that drf-spectacular should differientiate between response and request processing?


I think that I personally have not run into this issue before because I just checked our codebase and we do not mark any of our Model fields with a default=. But we do use the built in User from Django, and it has default= set, so this is why it finally surfaced.

@johnthagen johnthagen changed the title Default User Serializer makes many response fields optional Model fields that contain a default value are not marked as required in retrieve and list response APIs Sep 26, 2024
@johnthagen johnthagen changed the title Model fields that contain a default value are not marked as required in retrieve and list response APIs Model fields that contain a default value are not marked as required in response API Sep 26, 2024
@tfranzel
Copy link
Owner

tfranzel commented Sep 29, 2024

I think I get your point, but I believe we have a conceptual problem here.

Foundational statements

Perhaps there is some logic with regards to looking at default values in Model fields that drf-spectacular should differientiate between response and request processing?

We cannot really look at the model here, because DRF degenerates the Serializer from the Model for us. There is information loss there for sure. On top of that, the user can modify/patch onto that generated serializer. Trying to untangle what is intentional by the user or just auto-generated is at leased problematic.

This is the serializer we get from DRF for the django standard user and "in theory" that would be all we are allowed to look at:

class UserSerializer:
    id = IntegerField(label='ID', read_only=True)
    password = CharField(max_length=128)
    last_login = DateTimeField(allow_null=True, required=False)
    is_superuser = BooleanField(help_text='Designates that this user has all permissions without explicitly assigning them.', label='Superuser status', required=False)
    username = CharField(help_text='Required. 150 characters or fewer. Letters, digits and @/./ /-/_ only.', max_length=150, validators=[...])
    first_name = CharField(allow_blank=True, max_length=150, required=False)
    last_name = CharField(allow_blank=True, max_length=150, required=False)
    email = EmailField(allow_blank=True, label='Email address', max_length=254, required=False)
    is_staff = BooleanField(help_text='Designates whether the user can log into this admin site.', label='Staff status', required=False)
    is_active = BooleanField(help_text='Designates whether this user should be treated as active. Unselect this instead of deleting accounts.', label='Active', required=False)
    date_joined = DateTimeField(required=False)
    groups = PrimaryKeyRelatedField(help_text='The groups this user belongs to. A user will get all permissions granted to each of their groups.', many=True, queryset=Group.objects.all(), required=False)
    user_permissions = PrimaryKeyRelatedField(help_text='Specific permissions for this user.', many=True, queryset=Permission.objects.all(), required=False)

Regarding the default. This is what DRF created from a model:

    field_without_default = models.BooleanField()
    field_with_default = models.BooleanField(default=False)
    # generates this
    field_without_default = serializers.BooleanField()
    field_with_default = serializers.BooleanField(required=False) # the default makes this not required

So we do nothing special here and merely carry over the required flag, which is indeed problematic (see below)

My conclusion

IMHO, DRF's required flag ONLY applies to requests and makes not statement about the response.
With that in mind, to be consistent, we would either need remove all required on responses or make all required.

The fact that we consider the required flag on responses (with COMPONENT_SPLIT_REQUEST=True) could be considered wrong.

I came to that realization at a point where this would have been to big of a breaking change. However, I think we could add another setting to address this.

Is it a valid assumption that all fields are always present on serializer responses? It is certainly true most of the time, but it is not guaranteed, I think.

What do you think?

@johnthagen
Copy link
Contributor Author

johnthagen commented Sep 30, 2024

We cannot really look at the model here, because DRF degenerates the Serializer from the Model for us.

Oh, right. I forgot that there is a Serializer in between that you are actually processing.

Is it a valid assumption that all fields are always present on serializer responses?

For a response I do think this is valid, or at least this should be the default and users can override this with @extend_schema or otherwise. That's because when you're modeling data from a database what you return will always have values for each field (even it's its nullable, there will still be a NULL value in the database returned as None -> null in the API).

I think we still need to allow the @extend_schema escape hatch (which already exists, in fact I even created a work-around for this specific problem using it).

But the default, I think, should be that the API guarantees the fields the user defines will be returned (e.g., set as required). For ModelSerializer, at least, I certainly think this makes sense.

I'd be interested to hear a counter example where someone wouldn't want these marked as required for a response. The result is that clients couldn't be confident that any fields exist and would need to individually check that each exist before accessing them. To me that sounds like a very brittle way to use an API, especially as the default way to generate it.

In TypeScript (probably the most widely used API client language), this results is lots of ?., if x === undefined, and ?? undefined checking guards littering the client usage of the API.

As always, I appreciate the thoughtful feedback and consideration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants