Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Querying for allSpecies and homeworld executes 38 SQL queries #12

Open
simonw opened this issue Jul 7, 2017 · 7 comments
Open

Querying for allSpecies and homeworld executes 38 SQL queries #12

simonw opened this issue Jul 7, 2017 · 7 comments

Comments

@simonw
Copy link

simonw commented Jul 7, 2017

This query:

{
  allSpecies {
    edges {
      node {
        name
        homeworld {
          name
          gravity
        }
      }
    }
  }
}

Executes 38 sql queries, looking something like this:

(0.001) SELECT COUNT(*) AS "__count" FROM "starwars_species"; args=()
(0.000) SELECT "starwars_species"."id", "starwars_species"."created", 
"starwars_species"."edited", "starwars_species"."name", 
"starwars_species"."classification", "starwars_species"."designation", 
"starwars_species"."average_height", "starwars_species"."skin_colors", 
"starwars_species"."hair_colors", "starwars_species"."eye_colors", 
"starwars_species"."average_lifespan", "starwars_species"."homeworld_id", 
"starwars_species"."language" FROM "starwars_species" LIMIT 37; args=()

(0.000) SELECT "starwars_planet"."id", "starwars_planet"."created", 
"starwars_planet"."edited", "starwars_planet"."name", 
"starwars_planet"."rotation_period", "starwars_planet"."orbital_period", 
"starwars_planet"."diameter", "starwars_planet"."climate", 
"starwars_planet"."gravity", "starwars_planet"."terrain", 
"starwars_planet"."surface_water", "starwars_planet"."population" FROM 
"starwars_planet" WHERE "starwars_planet"."id" = 9; args=(9,)

(0.000) SELECT "starwars_planet"."id", "starwars_planet"."created", 
"starwars_planet"."edited", "starwars_planet"."name", 
"starwars_planet"."rotation_period", "starwars_planet"."orbital_period", 
"starwars_planet"."diameter", "starwars_planet"."climate", 
"starwars_planet"."gravity", "starwars_planet"."terrain", 
"starwars_planet"."surface_water", "starwars_planet"."population" FROM 
"starwars_planet" WHERE "starwars_planet"."id" = 14; args=(14,)

... repeat another 36 times

Full list here: https://gist.github.com/simonw/762a98de85a2f113d1d7ec6026f73ee4

Is it possible to detect this case and then use select_related() or prefetch_related() here to join against the relevant table and get everything done in just one or two queries?

If so, an example showing how to do that would be a fantastic illustration of a slightly more complex use-case for graphene-django (I tried myself and couldn't figure out how to do it).

@simonw simonw changed the title Querying for allSpecies and homeworld executes 37 SQL queries Querying for allSpecies and homeworld executes 38 SQL queries Jul 7, 2017
@simonw
Copy link
Author

simonw commented Jul 7, 2017

I generated the list of SQL queries by adding the following configuration to swapi_graphene/settings.py

LOGGING = {
    'version': 1,
    'handlers': {
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',
        },
    },
    'loggers': {
        'django.db.backends': {
            'level': 'DEBUG',
            'handlers': ['console'],
        },
    },
}

@simonw
Copy link
Author

simonw commented Jul 7, 2017

Even better: is it possible to detect the fields being requested and update the SELECT fields in those statements, using the Django .only() QuerySet method?

My ideal SQL queries for this GraphQL query would be either this (using prefetch_related):

SELECT COUNT(*) AS "__count" FROM "starwars_species";

SELECT "starwars_species"."id", "starwars_species"."name", "starwars_species"."homeworld_id",
FROM "starwars_species" LIMIT 37;

SELECT "starwars_planet"."name", "starwars_planet"."gravity"
FROM "starwars_planet"
WHERE "starwars_planet"."id" in (
  9, 14, 23, 24, 28, 29, 31, 7, 33, 18, 8, 34,
  35, 37, 38, 39, 40, 41, 43, 44, 45, 46, 47,
  48, 49, 50, 11, 51, 54, 55, 10, 56, 57, 58, 59, 12
);

Or this (using a select_related() join):

SELECT COUNT(*) AS "__count" FROM "starwars_species";

SELECT "starwars_species"."id", "starwars_species"."name", "starwars_species"."homeworld_id",
"starwars_planet"."name", "starwars_planet"."gravity"
FROM "starwars_species"
JOIN "starwars_planet" ON 
  "starwars_species"."homeworld_id" = "starwars_planet"."id"
LIMIT 37;

@simonw
Copy link
Author

simonw commented Jul 10, 2017

On closer inspection of the docs, it looks like the solution here is to use a DataLoader - any chance of some example code for that? http://docs.graphene-python.org/en/latest/execution/dataloader/

simonw pushed a commit to simonw/swapi-graphene that referenced this issue Jul 10, 2017
@simonw
Copy link
Author

simonw commented Jul 10, 2017

I got this working using a dataloader: simonw@cea0451

Corresponding SQL:

(0.001) SELECT COUNT(*) AS "__count" FROM "starwars_species"; args=()
(0.001) SELECT "starwars_species"."id", "starwars_species"."created", 
"starwars_species"."edited", "starwars_species"."name", 
"starwars_species"."classification", "starwars_species"."designation", 
"starwars_species"."average_height", "starwars_species"."skin_colors", 
"starwars_species"."hair_colors", "starwars_species"."eye_colors", 
"starwars_species"."average_lifespan", "starwars_species"."homeworld_id", 
"starwars_species"."language" FROM "starwars_species" LIMIT 37; args=()
(0.002) SELECT "starwars_planet"."id", "starwars_planet"."created", 
"starwars_planet"."edited", "starwars_planet"."name", 
"starwars_planet"."rotation_period", "starwars_planet"."orbital_period", 
"starwars_planet"."diameter", "starwars_planet"."climate", 
"starwars_planet"."gravity", "starwars_planet"."terrain", 
"starwars_planet"."surface_water", "starwars_planet"."population" FROM 
"starwars_planet" WHERE "starwars_planet"."id" IN (9, 14, 23, 24, 28, 29,
31, 7, 33, 18, 8, 34, 35, 37, 38, 39, 40, 41, 43, 44, 45, 46, 47, 48, 49, 50,
11, 51, 54, 55, 10, 56, 57, 58, 59, 12); args=(9, 14, 23, 24, 28, 29, 31, 7,
33, 18, 8, 34, 35, 37, 38, 39, 40, 41, 43, 44, 45, 46, 47, 48, 49, 50, 11,
51, 54, 55, 10, 56, 57, 58, 59, 12)

simonw pushed a commit to simonw/swapi-graphene that referenced this issue Jul 10, 2017
@simonw
Copy link
Author

simonw commented Jul 10, 2017

Oops, correct resolver on the Specie class looks like this:

    def resolve_homeworld(self, *args):
        if self.homeworld_id:
            return planet_loader.load(self.homeworld_id)
        else:
            return None

@simonw
Copy link
Author

simonw commented Jul 10, 2017

I think that's wrong too. The PlanetLoader batch_load_fn(keys) method needs to return objects for the keys in the same order that the keys were provided.

simonw pushed a commit to simonw/swapi-graphene that referenced this issue Jul 10, 2017
This function needs to return values in the exact same
order as the keys that were passed to the function.

See graphql-python#12
@simonw
Copy link
Author

simonw commented Jul 10, 2017

OK, I think this is the correct implementation of the DataLoader:

from promise import Promise
from promise.dataloader import DataLoader


class PlanetLoader(DataLoader):
    def batch_load_fn(self, keys):
        objects_by_key = {
            o.pk: o
            for o in models.Planet.objects.filter(pk__in=keys)
        }
        results = []
        for key in keys:
            results.append(objects_by_key.get(key, None))
        return Promise.resolve(results)

planet_loader = PlanetLoader()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant