fix: remove modulestore init to avoid pymongo deadlocks #30

Merged: 1 commit, Jun 14, 2024

@johanseto commented on Jun 14, 2024

Description

Backport of the fix for Mongo errors related to serverSelectionTimeOut:
openedx#34663

Prior to this commit, the LMS would log the following warning in Tutor
production:

  pymongo/topology.py:175: UserWarning: MongoClient opened before fork.
  Create MongoClient only after forking. See PyMongo's documentation for
  details:
  https://pymongo.readthedocs.io/en/stable/faq.html#is-pymongo-fork-safe

Quoting from that page:

> PyMongo is not fork-safe. Care must be taken when using instances of
> MongoClient with fork(). Specifically, instances of MongoClient must
> not be copied from a parent process to a child process. Instead, the
> parent process and each child process must create their own instances
> of MongoClient. Instances of MongoClient copied from the parent
> process have a high probability of deadlock in the child process due
> to the inherent incompatibilities between fork(), threads, and locks
> described below. PyMongo will attempt to issue a warning if there is a
> chance of this deadlock occurring.

For edx-platform, the MongoClient connection is initialized with the
modulestore() invocation. That call creates and caches a global variable
that Studio or the LMS will reuse across the life of the worker process.
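
To make the fork hazard concrete, here is a minimal sketch of a lazily
created, process-local cached client. It is not the edx-platform code:
`FakeClient` and `get_client` are hypothetical names, `FakeClient` stands in
for a real connection object such as `MongoClient`, and the PID check is an
extra safeguard added for illustration (the actual fix only removes the eager
call so that creation happens in the worker).

```python
import os


class FakeClient:
    """Hypothetical stand-in for a connection object such as MongoClient."""

    def __init__(self):
        # Remember which process created this instance.
        self.owner_pid = os.getpid()


_client = None  # module-level cache, one per process


def get_client(factory=FakeClient):
    """Return a client created by the current process.

    The client is built on first use rather than at import time, so a
    pre-fork parent never creates one unless it actually needs it. If a
    cached instance was inherited from a parent process, it is replaced.
    """
    global _client
    if _client is None or _client.owner_pid != os.getpid():
        _client = factory()
    return _client
```

With this shape, nothing is connected at import or startup time; the first
call made inside a worker process builds that worker's own instance.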

That initialization was put into lms/wsgi.py in 7c758ec, but originated
in lms/startup.py with 51d0dd1. It was originally added because, at
that time (2013), we still supported the XML Modulestore, which stored
courses on disk as directories of OLX files and static assets. The XML
Modulestore would then read the entirety of those courses into memory at
startup. This meant that the startup process was *extremely* expensive,
so we needed to have it happen before the workers started serving
requests to users, instead of having the system lazily read them in when
the first user request arrived.

Loading course content in this form hasn't been supported since 2016,
meaning that modulestore initialization is no longer the performance
time bomb that it once was. The fact that this code remained here is
likely an oversight, which was considered harmless until @ztraboo
reported these pymongo log messages during the course of investigating
performance issues:

https://discuss.openedx.org/t/atlas-mongodb-performance-issues-un-indexed-queries/12803/16

Two potential followups that should be explored after this:

1. Tutor should probably be forking earlier than this, before we load
   Django settings and initialize database and cache connections.
2. It's possible that the caching mechanism for modulestore should be
   revisited to operate at the request cache level. The performance
   benefit of keeping it around may not be worth the potential memory
   leaks. Anything we do here would have to be very carefully monitored
   though, since connection costs may add up.
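
As a rough illustration of the second followup, a request-scoped cache could
look something like the sketch below. This is only an assumption about the
general shape: `RequestCache`, `get_store`, and `end_request` are hypothetical
names, it does not use edx-platform's own request cache, and `factory` stands
in for whatever would construct the store.

```python
import threading


class RequestCache(threading.local):
    """Hypothetical per-thread cache, cleared at the end of each request."""

    def __init__(self):
        self.data = {}


_request_cache = RequestCache()


def get_store(factory):
    """Return the store cached for the current request, building it on first use."""
    if "store" not in _request_cache.data:
        _request_cache.data["store"] = factory()
    return _request_cache.data["store"]


def end_request():
    """Drop everything cached for the request that just finished."""
    _request_cache.data.clear()
```

The trade-off named above shows up directly: every request pays for one
`factory()` call, which is why connection costs would need monitoring.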

(cherry picked from commit d8aab3f)

@johanseto requested a review from andrey-canon on June 14, 2024 17:10
@andrey-canon left a comment

LGTM

@johanseto merged commit 893c70b into open-release/palm.nelp on Jun 14, 2024
40 checks passed