Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new fields to UserVisit #25

Merged
merged 6 commits into from
Nov 7, 2023

Conversation

mboboc
Copy link
Contributor

@mboboc mboboc commented Oct 30, 2023

This is an optimization to allow filtering by os/browser/device directly in the database.

django-user-visit is very useful in order to track user visits for websites, however, there are 2 main things happening when using this library:

  1. The UserVisit table gets big very fast
  2. One would like to filter after Device/OS/Browser which is not possible by default inside the database and needs additional parsing steps of the ua_string in Python

Because the user agent is stored as a string and the corresponding parsing functions are implemented in Python, the filtering is fated to be done in Python resulting in poor performance. Also, because the UserVisit table gets populated fast, the waiting times for filtering goes up fast too.

The straightforward solution is to store the string as separate fields inside the database.

Copy link
Collaborator

@hugorodgerbrown hugorodgerbrown left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea - I've added some basic edit suggestions just to bring it inline with our other libs.

user_visit/models.py Outdated Show resolved Hide resolved
user_visit/models.py Outdated Show resolved Hide resolved
user_visit/models.py Outdated Show resolved Hide resolved
user_visit/models.py Outdated Show resolved Hide resolved
@hugorodgerbrown
Copy link
Collaborator

@mboboc thanks for this update - good idea. Once the linting issues are fixed (and feedback applied) will be happy to merge it.

@hugorodgerbrown
Copy link
Collaborator

@mboboc just a couple of linting issues to fix - if you have the poetry environment set up you should be able to run pre-commit run --all-files and it'll fix them. (I hope.)

@mboboc
Copy link
Contributor Author

mboboc commented Nov 1, 2023

@hugorodgerbrown Got it, I'll update the code soon (today I hope).

@mboboc
Copy link
Contributor Author

mboboc commented Nov 1, 2023

The linting errors are from the .flake8 file that has inline comments.

flake8 docs:

Following the recommended settings for Python’s configparser, Flake8 does not support inline comments for any of the keys.

In order to reproduce the error locally I had to update the .pre-commit-config.yaml.

# Flake8 includes pyflakes, pycodestyle, mccabe, pydocstyle, bandit
- repo: https://gitlab.com/pycqa/flake8
rev: 3.9.2

with

# Flake8 includes pyflakes, pycodestyle, mccabe, pydocstyle, bandit
  - repo: https://github.com/PyCQA/flake8
    rev: 6.1.0

Saw you fixed it in #26.

@hugorodgerbrown
Copy link
Collaborator

Ah - yes, we're migrating all our repos from flake8 to ruff, so those errors don't matter in the long run. I'll merge your changes in and fix on master.

@hugorodgerbrown
Copy link
Collaborator

Although, just looking at this - the migrations need to be recreated, as they are now out of sync with the fields (default, null).

@hugorodgerbrown
Copy link
Collaborator

re. flake8 - if you "sync upstream" on your repo that should bring in the new linting?

@mboboc
Copy link
Contributor Author

mboboc commented Nov 2, 2023

After you merge #26 on master, I could bring the changes to this branch. This should solve it.

@mboboc
Copy link
Contributor Author

mboboc commented Nov 7, 2023

@hugorodgerbrown I merged the master branch into this branch, it should be okay now.

@hugorodgerbrown hugorodgerbrown self-requested a review November 7, 2023 11:18
@hugorodgerbrown hugorodgerbrown merged commit f5b2a27 into yunojuno:master Nov 7, 2023
24 checks passed
@hugorodgerbrown
Copy link
Collaborator

@mboboc merged - I'll cut a release in due course - and will add you to the changelog as the author.

@radudum10
Copy link

@hugorodgerbrown

What about the existing entries of UserVisits?
The migration should contain an operation that iterates through the existing entries, sets the new fields (browser, device, os) and saves them, shouldn't it?

@hugorodgerbrown
Copy link
Collaborator

@hugorodgerbrown

What about the existing entries of UserVisits? The migration should contain an operation that iterates through the existing entries, sets the new fields (browser, device, os) and saves them, shouldn't it?

Hi @radudum10 - yes, these will need backfilling. Our (hard won) experience with large tables and data migrations is that they can easily block a release or lock up a website at release time, and so our pattern for these types of releases is a management command that can a.) be aborted, and b.) be run out of hours.

There is a bunch of work to be done before the release as this is a major one (upgrading Django / Python classifiers), so I'll add in the management command and instructions on the upgrade.

@hugorodgerbrown
Copy link
Collaborator

I have released this as v2.0. There is a bug in the management command (:facepalm:) which means you have to run it with the --force flag for it to pick up anything, but I've run it over 1 million rows of real data, and it's working. It'll take time to run. I will fix the management command and release a patch later.

Screenshot 2023-11-07 at 16 45 29

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants