From 97c58e08eac72d8328467af4c06c95265004c834 Mon Sep 17 00:00:00 2001
From: Alexander Dusenbery <adusenbery@2u.com>
Date: Tue, 23 Jul 2024 15:26:52 -0400
Subject: [PATCH] docs: WIP ADR on replicated content metadata

---
 .../0023-replicated-content-metadata.rst      | 78 +++++++++++++++++++
 1 file changed, 78 insertions(+)
 create mode 100644 docs/decisions/0023-replicated-content-metadata.rst

diff --git a/docs/decisions/0023-replicated-content-metadata.rst b/docs/decisions/0023-replicated-content-metadata.rst
new file mode 100644
index 00000000..f7a9d5d2
--- /dev/null
+++ b/docs/decisions/0023-replicated-content-metadata.rst
@@ -0,0 +1,78 @@
+0023 Replicated Content Metadata
+********************************
+
+Status
+======
+**Proposed** (July 2024)
+
+Context
+=======
+We rely on content metadata from an upstream service, enterprise-catalog,
+for many of our flows. In particular, all of our flows related to content assignments
+depend on content metadata, beyond just the ``content_key`` related
+to a given ``LearnerContentAssignment``. Specifically, these flows
+all depend on content start dates for determining the expiration time of an assignment:
+* Our assignment read API
+* The assignment allocation write API
+* The ``credits-available`` endpoint
+* The assignment expiration management command
+* Email-sending tasks to create, remind, and nudge learners about assignments
+
+Additionally, future feature enhancements will rely on knowing the ``parent_content_key``
+related to a particular assignment. Specifically, assigning course *runs*, as opposed
+to top-level courses, will depend on this field of content metadata.
+
+Decision
+========
+Instead of fetching and caching content metadata from enterprise-catalog
+(the upstream system of record), we should fetch and persist this data in
+the database via a Django Model. Persisting content metadata in a dedicated model
+would allow us to create FKs from other models that refer to such a model. This gives
+us the following advantages:
+* We could allow for filtering/sorting/searching on content metadata fields in the
+  typical Django/DRF manner.
+* We could include content metadata fields in serialized API response payloads
+  via typical Django/DRF methods.
+* We could write/modify business logic in management commands, tasks, etc. based
+  on such a model, allowing us to bypass calls to the upstream service.
+
+High-level implementation
+-------------------------
+General outline of a phased approach here:
+1. Introduce model(s) to persist replicated content metadata.
+2. Define a serializer (or similar) for turning the response payload
+   into an instance of the model above.
+3. Define nullable foreign key from ``LearnerContentAssignment`` to the new model.
+4. Begin asynchronously populating model instances on reads or writes during a subset
+   of the assignment flows enumerated above.
+5. Define and implement a strategy for refreshing replicated metadata
+   (e.g. via event bus, or cron, or on read, etc.).
+6. Begin reading from replicated model instead of relying on ``get_and_cache_catalog_content_metadata()``
+   to fetch/cache without persistence.
+
+There will naturally be iteration between steps. For example, we'll likely
+need to modify or augment our serialization logic and/or fields in (2)
+as we attempt to execute step (6).
+
+Consequences
+============
+* We make more explicit the dependency of the assignments domain on content metadata.
+* ...
+
+Alternatives Considered
+=======================
+
+Continue to fetch and cache
+---------------------------
+This is our current state. While it does support the flows stated above, it doesn't
+easily support filtering/sorting/searching.  Furthermore, much of the serialization logic
+is complex (particularly with the context of ``credits-available``), which is a tradeoff
+we've made to enhance the performance of that serialization and its dependent views.
+
+Fetch and persist on models of other domains
+--------------------------------------------
+This is partially our current state as well - for instance, we store the content title
+directly on the ``LearnerContentAssignment`` model. This alternative becomes less
+attractive as we start to replicate fields onto models from *multiple* dependent domains.
+For example, there are flows in the domain of ``subsidy_access_policy`` that depend
+on content metadata fields.