Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace encode/databases with psycopg & pydantic model validation #128

Merged
merged 74 commits into from
Aug 26, 2024

Conversation

spwoodcock
Copy link
Member

@spwoodcock spwoodcock commented Aug 6, 2024

Full details here: hotosm/fmtm#254

psycopg vs asyncpg driver

While asyncpg may be more performant and good for many use cases, I actually think that psycopg has an edge for our application:

  • Type safety when used with Pydantic.
  • Easier static type checking via tools like Pyright / mypy (+ code completion).
  • Usage of standard libpq official Postgres driver, which in theory should have better longevity (asyncpg uses a custom implementation that would need to be maintained...).

Static Type Checking / Pydantic Integration

Other options

  • encode/databases is a great wrapper around asyncpg, but I do worry about it's support. It hasn't been updated to support psycopg in a long time. The syntax may be slightly nicer, but it doesn't do Pydantic model validation & provides little benefit when used with raw SQL (the bundled SQLAlchemy Core syntax may be helpful though).
  • SQLModel is also a good option, allowing for Pydantic model validation built-in. This has a much larger community & will likely be supported for some time. Three concerns though:
    • It's tied quite tightly to the development of FastAPI, and when that gets replaced who knows about the future of SQLModel.
    • It also makes it difficult to replace Pydantic, if we ever want to change out validation / modelling lib (psycopg can just change the row_factory).
    • As with FastAPI, I am a little wary of single developer projects with a tight control over it's maintenance. This could be an unfounded concern though.

Example Logic

  • Here we have much tighter coupling of the models and basic crud logic.
  • It's nice to contain the standard GET, PATCH, POST, DELETE logic in the model itself.
class DbProject(BaseModel):
    """Project model for extracting from database."""

    id: uuid.UUID
    name: str
    slug: Optional[str] = None
    short_description: Optional[str]
    description: str
    per_task_instructions: Optional[str] = None
    organisation_id: Optional[int]
    outline: Polygon
    centroid: Optional[Point]
    no_fly_zones: Optional[MultiPolygon]
    task_count: int = 0
    tasks: Optional[list[TaskOut]] = []

    @staticmethod
    async def one(db: Connection, project_id: uuid.UUID):
        """Get a single project by it's ID, including tasks and task count."""
        async with db.cursor(row_factory=class_row(DbProject)) as cur:
            await cur.execute(
                """
                SELECT
                    p.*,
                    ST_AsGeoJSON(p.outline)::jsonb AS outline,
                    ST_AsGeoJSON(p.centroid)::jsonb AS centroid,
                    COALESCE(JSON_AGG(t.*) FILTER (WHERE t.id IS NOT NULL), '[]'::json) AS tasks,
                    COUNT(t.id) AS task_count
                FROM
                    projects p
                LEFT JOIN
                    tasks t ON p.id = t.project_id
                WHERE
                    p.id = %(project_id)s
                GROUP BY
                    p.id;
                """,
                {"project_id": project_id},
            )
            project = await cur.fetchone()

            if not project:
                raise KeyError(f"Project {project_id} not found")

            return project

TODO

  • Init db / main logic

  • Project routes

  • Task routes

  • User routes

  • Drone routes

  • I have already updated all the logic getting the db connection and user in each endpoint.

  • We just need to update the remaining route logic, changing the usage of db.execute slightly from :project_id placeholders to %(project_id)s and ensure the raw SQL logic works as intended.

  • Also update the usage of project_crud.get_project_by_id to use the project_deps.get_project_by_id dependency in the endpoints (then delete project_crud.get_project_by_id).

@spwoodcock spwoodcock requested review from nrjadkry and Pradip-p August 6, 2024 23:56
@spwoodcock spwoodcock self-assigned this Aug 6, 2024
@github-actions github-actions bot added enhancement New feature or request backend Related to backend code devops Related to deployment or configuration frontend labels Aug 6, 2024
@spwoodcock spwoodcock assigned nrjadkry and Pradip-p and unassigned spwoodcock Aug 7, 2024
@spwoodcock spwoodcock force-pushed the feat/psycopg-pydantic branch from 5006a6a to b5a4b77 Compare August 7, 2024 09:50
@spwoodcock
Copy link
Member Author

I just rebased to the latest main.

Let me know if this looks manageable @nrjadkry @Pradip-p 🙏
Message me if issues / you would rather I finish it off 👍

@nrjadkry
Copy link
Member

nrjadkry commented Aug 9, 2024

@spwoodcock @Pradip-p I have started continuing this refactor process beginning with Drone module.

…ry placeholders, result fetching, and exception handling. Adjusted token expiration handling and password hashing. Updated user and profile operations to align with psycopg syntax.
@spwoodcock
Copy link
Member Author

Looks like this introduced a lot of errors that weren't picked up on initial testing?

What sort of errors are you experiencing @Pradip-p?
Do you have any logs / example?

This is quite a large refactor! 😅

@Pradip-p
Copy link
Collaborator

@spwoodcock I've fixed most of the errors that came up. I did catch a lot of the issues during initial testing, but since then, we've updated some logic and made changes to the main branch, which introduced new challenges. 😅

@spwoodcock
Copy link
Member Author

Oh ok, that makes sense!

We can merge this as soon as it's ready to prevent other conflicts 👍

@Pradip-p
Copy link
Collaborator

@nrjadkry @spwoodcock
Now, this PR is ready to merge with the develop branch. Before we merge, we need to make some changes to the frontend as well because the project details endpoint has changed API parameters like outline_geojson to outline and outline_no_fly_zone to no_fly_zones etc. Otherwise, it’s working perfectly.

TODO:

  • The frontend needs to be updated accordingly. The preview of the project details dashboard is currently broken. @suzit-10

@nrjadkry nrjadkry changed the base branch from main to develop August 22, 2024 05:50
@nrjadkry
Copy link
Member

@spwoodcock @Pradip-p
I have changed the base branch for this Pull Request to develop branch. Now, we have a development server as well. Let's make sure it works in the dev server and we can migrate it to the production server.

@spwoodcock
Copy link
Member Author

develop can be named dev if it's an easy change 🙏

4 less keystrokes 😆
(more importantly, less verbose BRANCH_NAME var, for the containers and volume names)

@nrjadkry nrjadkry merged commit d89cff1 into develop Aug 26, 2024
2 checks passed
@nrjadkry nrjadkry deleted the feat/psycopg-pydantic branch August 26, 2024 04:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend Related to backend code devops Related to deployment or configuration enhancement New feature or request frontend
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants