Commit
Reworked README for improved clarity (#1002)
1 parent 3d4084e, commit 067880e
Showing 1 changed file with 92 additions and 164 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,225 +1,153 @@ | ||
The Aspects plugin for Tutor | ||
============================ | ||
============================= | ||
The Aspects Plugin for Tutor | ||
============================= | ||
|
||
Aspects Learner Analytics combines several free, open source, tools to add analytics and reporting capabilities to the Open edX platform. This plugin offers easy installation, configuration, and deployment of these tools using `Tutor <https://docs.tutor.overhang.io>`__. The tools Aspects uses are: | ||
Aspects Learner Analytics integrates several open-source tools to add powerful analytics and reporting capabilities to the Open edX platform. This plugin enables seamless installation, configuration, and deployment of these tools via `Tutor <https://docs.tutor.overhang.io>`_. The tools integrated by Aspects are: | ||
|
||
- `ClickHouse <https://clickhouse.com>`__, a fast, scalable analytics database that can be run anywhere | ||
- `Apache Superset <https://superset.apache.org>`__, a data visualization platform and data API | ||
- `OpenFUN Ralph <https://openfun.github.io/ralph/>`__, a Learning Record store (and more) that can validate and store xAPI statements in ClickHouse | ||
- `Vector <https://vector.dev/>`__, a log forwarding tool that can be used to forward tracking log and xAPI data to ClickHouse | ||
- `event-routing-backends <https://event-routing-backends.readthedocs.io/en/latest/>`__, an Open edX plugin that transforms tracking logs into xAPI and optionally forwards them to one or more Learning Record Stores in near real time | ||
- `event-sink-clickhouse <https://github.com/openedx/openedx-event-sink-clickhouse>`__, an Open edX plugin that exports course structure and high level data to ClickHouse at publish time | ||
- `dbt <https://www.getdbt.com/>`__, a tool to build data pipelines from SQL queries. The dbt project used by this plugin is `aspects-dbt <https://github.com/openedx/aspects-dbt>`__. | ||
- `ClickHouse <https://clickhouse.com>`_: A fast and scalable analytics database. | ||
- `Apache Superset <https://superset.apache.org>`_: A data visualization and exploration platform. | ||
- `OpenFUN Ralph <https://openfun.github.io/ralph/>`_: A Learning Record Store that validates and stores xAPI statements in ClickHouse. | ||
- `Vector <https://vector.dev>`_: A tool for forwarding logs and xAPI data to ClickHouse. | ||
- `Event-Routing-Backends <https://event-routing-backends.readthedocs.io/en/latest/>`_: An Open edX plugin that transforms tracking logs into xAPI and forwards them to Learning Record Stores in near real-time. | ||
- `dbt <https://www.getdbt.com>`_: A SQL-based data pipeline builder, utilizing the `aspects-dbt <https://github.com/openedx/aspects-dbt>`_ project. | ||
|
||
See https://github.com/openedx/openedx-aspects for more details about the Aspects architecture and high level documentation. | ||
For more information, refer to the `Aspects architecture documentation <https://docs.openedx.org/projects/openedx-aspects/en/latest/technical_documentation/concepts/aspects_overview.html>`_. | ||
|
||
Aspects is a community developed effort combining the Cairn project by Overhang.io and the OARS project by EduNEXT, OpenCraft, and Axim Collaborative. | ||
Key Features | ||
============ | ||
|
||
Note: Aspects is beta and not yet production ready! Please feel free to experiment with the system and offer feedback about what you'd like to see by adding Issues in this repository. Current details on the beta progress can be found here: https://openedx.atlassian.net/wiki/spaces/COMM/pages/3861512203/Aspects+Beta | ||
- Streamlined deployment of analytics and reporting tools. | ||
- Integration with Open edX for real-time and historical data analytics. | ||
- Extensible architecture supporting customization. | ||
|
||
Compatibility | ||
------------- | ||
|
||
This plugin is compatible with Tutor 15.0.0 and later and is expected to be compatible with Open edX releases from Nutmeg forward. | ||
|
||
Installation | ||
------------ | ||
|
||
Aspects is implemented as a Tutor plugin. Documentation will be coming soon to cover how to install Aspects in non-Tutor environments, but by far the easiest way to try and install it is via Tutor. These instructions assume you are running a `tutor local` install, which is the fastest and easiest way to get started. | ||
|
||
#. Install Tutor: https://docs.tutor.overhang.io/install.html#install | ||
|
||
#. Create an admin user on the LMS: https://docs.tutor.overhang.io/whatnext.html#logging-in-as-administrator | ||
|
||
#. Install the Aspects plugin (in your Tutor Python environment):: | ||
|
||
pip install tutor-contrib-aspects | ||
|
||
#. Enable the plugins:: | ||
|
||
tutor plugins enable aspects | ||
Compatibility | ||
============= | ||
|
||
#. Save the changes to the environment:: | ||
The plugin is compatible with Tutor 15.0.0 and later and supports Open edX releases from Nutmeg onward. | ||
|
||
tutor config save | ||
Installation | ||
============ | ||
|
||
#. Because we're installing new applications in LMS (event-routing-backends, event-sink-clickhouse) you will need to rebuild your openedx Docker image:: | ||
Aspects is implemented as a Tutor plugin. For now, the easiest installation method is via Tutor. Follow these steps for a ``tutor local`` installation: | ||
|
||
tutor images build openedx --no-cache | ||
1. **Install Tutor**: | ||
Follow the instructions at `Tutor Installation Guide <https://docs.tutor.overhang.io/install.html#install>`_. | ||
|
||
#. Build the Aspects-flavored Superset image to bake your settings (such as database passwords) into the Superset assets:: | ||
2. **Create an Admin User**: | ||
Refer to the `Tutor Setup Guide <https://docs.tutor.overhang.io/whatnext.html#logging-in-as-administrator>`_. | ||
|
||
tutor images build aspects-superset | ||
3. **Install and Enable the Plugin**: | ||
|
||
#. Run the initialization scripts:: | ||
.. code-block:: bash | ||
tutor local do init | ||
pip install tutor-contrib-aspects | ||
tutor plugins enable aspects | ||
tutor config save | ||
At this point you should have a working Tutor / Aspects environment, but with no way to create data! There are a few options for how to proceed. | ||
4. **Rebuild Docker Images**: | ||
|
||
#. If you would just like to see some data populated in the charts without loading a real course in the LMS you can create test data in the database (use ``--help`` for usage):: | ||
.. code-block:: bash | ||
tutor local do load-xapi-test-data | ||
tutor images build openedx --no-cache | ||
tutor images build aspects aspects-superset | ||
#. OR Load the test course and generate real data from the LMS: | ||
5. **Initialize the Environment**: | ||
|
||
#. https://docs.tutor.overhang.io/whatnext.html#importing-a-demo-course | ||
.. code-block:: bash | ||
#. Log into the LMS with your admin user and enroll / proceed through the demo course | ||
tutor local do init | ||
#. OR If you are adding Aspects to an existing LMS that already has data | ||
Data Population Options | ||
------------------------ | ||
|
||
#. Sink course data from the LMS to clickhouse (see https://github.com/openedx/openedx-event-sink-clickhouse for more information):: | ||
To visualize data: | ||
|
||
tutor local do dump-data-to-clickhouse --options "--object course_overviews" | ||
- Generate test data: | ||
|
||
#. Sink Historical event data to ClickHouse:: | ||
.. code-block:: bash | ||
tutor [dev|local] do transform-tracking-logs \ | ||
--source_provider LOCAL --source_config '{"key": "/openedx/data", "container": | ||
"logs", "prefix": "tracking.log"}' \ | ||
--transformer_type xapi | ||
tutor local do load-xapi-test-data | ||
# Note that this will work only for default tutor installation. If you store your tracking logs any other way, you need to change the source_config option accordingly. | ||
# See https://event-routing-backends.readthedocs.io/en/latest/howto/how_to_bulk_transform.html#sources-and-destinations for details on how to change the source_config option. | ||
- Import a demo course and create real data: | ||
|
||
#. If your assets have changed since the last time you ran init, you will need to rebuild the aspects-superset image and re-import the assets:: | ||
Follow `these steps <https://docs.tutor.overhang.io/whatnext.html#importing-a-demo-course>`_. | ||
|
||
tutor images build aspects-superset --no-cache | ||
tutor local do import-assets | ||
- Interact with the course to generate data: | ||
|
||
#. Make sure to build and push your Superset image in the following cases: | ||
Complete a few activities within the course (e.g., enroll, take quizzes, watch videos) to generate real data. This will provide a more realistic dataset for analytics. | ||
|
||
|
||
#. If you have made changes to the Superset assets. | ||
#. If you have made changes to the Clickhouse/DBT schema. | ||
#. If you are using custom translations. | ||
|
||
- Sync data from an existing Tutor installation with default settings: | ||
|
||
You should now have data to look at in Superset! Log in to https://superset.local.overhang.io/ with your admin account and you should see charts with your data. | ||
.. code-block:: bash | ||
Aspects Autoscaling | ||
------------------- | ||
tutor local do dump-data-to-clickhouse --options "--object course_overviews" | ||
tutor local do transform-tracking-logs --source_provider LOCAL --source_config '{"key": "/openedx/data", "container": "logs", "prefix": "tracking.log"}' --transformer_type xapi | ||
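The ``--source_config`` value is JSON embedded in a shell argument, so quoting mistakes are easy to make. The following is a minimal sketch, not part of Aspects, that assembles the same command in Python. The helper name ``build_transform_command`` is hypothetical, and its defaults simply mirror the default Tutor values shown above; adjust them if your tracking logs are stored elsewhere.

```python
import json
import shlex


def build_transform_command(key="/openedx/data", container="logs",
                            prefix="tracking.log", mode="local"):
    """Return a shell-safe transform-tracking-logs command line.

    The function name and parameters are illustrative only; the flags are
    the ones used in the README command above.
    """
    # Serialize the source configuration so the JSON is always well formed.
    source_config = json.dumps(
        {"key": key, "container": container, "prefix": prefix}
    )
    args = [
        "tutor", mode, "do", "transform-tracking-logs",
        "--source_provider", "LOCAL",
        "--source_config", source_config,
        "--transformer_type", "xapi",
    ]
    # shlex.join quotes each argument so the JSON survives the shell intact.
    return shlex.join(args)


print(build_transform_command())
```

Generating the command this way keeps the JSON valid and correctly quoted when the key, container, or prefix differ from the defaults.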
Aspects adds default autoscaling values for `Ralph`, `Superset` and the `Superset Worker` deployments via | ||
`tutor-contrib-pod-autoscaling <https://github.com/eduNEXT/tutor-contrib-pod-autoscaling>`_. To apply the | ||
autoscaling settings make sure to install the plugin and enable it. To modify the autoscaling values | ||
see the `Configuration <https://github.com/eduNEXT/tutor-contrib-pod-autoscaling?tab=readme-ov-file#configuration>`_ section. | ||
Superset and Autoscaling | ||
========================= | ||
|
||
Superset Assets | ||
--------------- | ||
|
||
Aspects maintains the Superset assets in this repository, specifically the dashboards, | ||
charts, datasets, and databases. That means that any updates made here will be reflected | ||
on your Superset instance when you update your deployment. | ||
|
||
But it also means that any local changes you make to these assets will be overwritten | ||
when you update your deployment. To prevent your local changes from being overwritten, | ||
please create new assets and make your changes there instead. You can copy an existing | ||
asset by editing the asset in Superset and selecting "Save As" to save it to a new name. | ||
|
||
# Note: If you are using custom assets you will need to rebuild your aspects-superset | ||
# image on your local machine with `tutor images build aspects-superset --no-cache`. | ||
|
||
Assets (charts/datasets) created for Aspects that are no longer used can be listed in | ||
`aspects_asset_list.yaml`. These assets & any translated assets created from them, | ||
are deleted from Superset during `init` (specifically `import-assets`). The corresponding | ||
YAML files are deleted during `import_superset_zip` or `check_superset_assets`. | ||
|
||
Sharing Charts and Dashboards | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
To share your charts with others in the community, use Superset's "Export" button to | ||
save a zip file of your charts and related datasets. | ||
|
||
.. warning:: | ||
The exported datasets will contain hard-coded references to your particular | ||
databases, including your database hostname, port, and username, in some cases | ||
it may also contain database passwords. It is vital that you review the | ||
database and dataset files before sharing them. | ||
|
||
To import charts or dashboards shared by someone in the community: | ||
|
||
#. Expand the zip file and look for any files added under ``databases``. | ||
Update the ``sqlalchemy_uri`` to match your database's connection details. | ||
#. Compress the files back into a ``.zip`` file. | ||
#. On the Charts or Dashboards page, use the "Import" button to upload your ``.zip`` file. | ||
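The three import steps above can be sketched in Python using only the standard library. This is an illustrative helper, not something shipped with Aspects: the function name ``rewrite_database_uris`` is hypothetical, and it assumes each file under a ``databases`` directory stores its connection string on a single ``sqlalchemy_uri:`` line.

```python
import re
import zipfile

# Matches a single-line "sqlalchemy_uri: ..." entry (assumed export layout).
URI_LINE = re.compile(r"^([ \t]*sqlalchemy_uri:[ \t]*).*$", re.MULTILINE)


def rewrite_database_uris(src_zip, dst_zip, new_uri):
    """Copy src_zip to dst_zip, replacing sqlalchemy_uri in databases/*.yaml.

    Returns the list of archive members that were changed.
    """
    changed = []
    with zipfile.ZipFile(src_zip) as src, \
            zipfile.ZipFile(dst_zip, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.is_dir():
                continue  # directory entries carry no content to copy
            data = src.read(info.filename)
            folders = info.filename.split("/")[:-1]
            if "databases" in folders and info.filename.endswith(".yaml"):
                text = data.decode("utf-8")
                # A function replacement avoids backslash escapes in new_uri.
                new_text = URI_LINE.sub(lambda m: m.group(1) + new_uri, text)
                if new_text != text:
                    changed.append(info.filename)
                data = new_text.encode("utf-8")
            dst.writestr(info.filename, data)
    return changed
```

A call such as ``rewrite_database_uris("shared.zip", "local.zip", "<your connection string>")`` would produce a new archive ready for the "Import" button. Review the returned list to confirm that only the expected database files were modified.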
|
||
|
||
Contributing Charts and Dashboards to Aspects | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
The Superset assets provided by Aspects can be found in the templated | ||
`tutoraspects/templates/aspects/build/aspects-superset/openedx-assets/assets/` directory. For the most part, | ||
these files are what Superset exports, but with some crucial differences | ||
which make these assets usable across all Tutor deployments. | ||
Aspects maintains its Superset assets (dashboards, charts, datasets) in the repository. Local changes to these assets will be overwritten during updates unless saved as new assets. | ||
|
||
To contribute assets to Aspects: | ||
To rebuild and re-import assets: | ||
|
||
#. Fork this repository and have a locally running Tutor set up with this plugin | ||
installed. | ||
#. Export the assets you want to contribute as described in `Sharing Charts and Dashboards` | ||
#. Run the command: | ||
`tutor aspects import_superset_zip ~/Downloads/your_file.zip` | ||
#. This command will copy the files from your zip to the assets directory and | ||
attempt to warn you if there are hard coded connection settings where it expects | ||
template variables. These are usually in database and dataset assets, and those are | ||
often assets that already exist. The warnings look like: | ||
.. code-block:: bash | ||
`WARN: fact_enrollments.yaml has schema set to reporting instead of a setting.` | ||
#. Check the diff of files and update any database connection strings or table names | ||
to use Tutor configuration template variables instead of hard-coded strings, e.g. | ||
replace ``clickhouse`` with ``{{CLICKHOUSE_HOST}}``. Passwords can be left as | ||
``{{CLICKHOUSE_PASSWORD}}``, though be aware that if you are adding new | ||
databases, you'll need to update ``SUPERSET_DB_PASSWORDS`` in the init scripts. | ||
Here is the default connection string for reference:: | ||
tutor images build aspects-superset --no-cache | ||
tutor local do import-assets | ||
``clickhousedb+connect://{{CLICKHOUSE_REPORT_URL}}`` | ||
#. You will likely also run into issues where our SQL templates have been expanded into | ||
their actual SQL. If you haven't changed the SQL of these queries (stored in | ||
`tutoraspects/templates/openedx-assets/queries`), you can just revert that change back | ||
to their `include` values such as: | ||
`sql: "{% include 'openedx-assets/queries/fact_enrollments_by_day.sql' %}"` | ||
#. The script will also warn about missing `_roles` in dashboards. Superset does not export | ||
these, so you will need to manually add this key with the roles that are necessary to | ||
view the dashboard. See the existing dashboards for how this is done. | ||
#. Re-build your ``aspects-superset`` image with `tutor images build aspects-superset --no-cache` | ||
#. Run the command `tutor aspects check_superset_assets` to confirm there are no | ||
duplicate assets, which can happen when you rename an asset, and will cause import | ||
to fail. The command will automatically delete the older file if it finds a duplicate. | ||
#. Check that everything imports correctly by running `tutor local do import-assets` | ||
and confirming there are no errors. | ||
#. Double check that your database password did not get exported before committing! | ||
#. Commit and submit a PR with screenshots of your new chart or dashboards, along with an | ||
explanation of what data question they answer. | ||
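Before committing, it can help to scan the imported asset files for values that should have been template variables. The sketch below is a hypothetical pre-commit check, independent of the warnings already printed by ``import_superset_zip``. It assumes the relevant settings appear as single-line ``sqlalchemy_uri:`` and ``schema:`` entries, and it treats any value without ``{{`` as hard-coded.

```python
import re

# Keys whose values are expected to be Tutor template variables (assumption).
CHECKED_KEYS = ("sqlalchemy_uri", "schema")
LINE_RE = re.compile(
    r"^[ \t]*(%s):[ \t]*(\S.*)$" % "|".join(CHECKED_KEYS), re.MULTILINE
)


def find_hardcoded_settings(yaml_text):
    """Return (key, value) pairs whose value contains no Jinja variable."""
    findings = []
    for match in LINE_RE.finditer(yaml_text):
        key, value = match.group(1), match.group(2).strip()
        # Explicit nulls are not connection details, so they are skipped.
        if "{{" not in value and value != "null":
            findings.append((key, value))
    return findings
```

An empty result does not prove that an asset is safe to share, so a manual review of the diff is still needed, particularly for passwords.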
Autoscaling | ||
----------- | ||
|
||
Aspects supports Kubernetes autoscaling configurations for Ralph, Superset, and Superset Worker via the `Pod Autoscaling plugin <https://github.com/eduNEXT/tutor-contrib-pod-autoscaling>`_. Modify autoscaling settings as needed. | ||
|
||
Virtual datasets in Superset | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
Contributing Charts and Dashboards | ||
=================================== | ||
|
||
Superset supports creating virtual datasets, which are datasets defined using a SQL query instead of mapping directly to an underlying database object. Aspects leverages virtual datasets, along with `SQL templating <https://superset.apache.org/docs/installation/sql-templating/>`_, to make better use of table indexes. | ||
To contribute Superset assets: | ||
|
||
To make it easier for developers to manage virtual datasets, there is an extra step that can be done on the output of ``tutor aspects serialize``. The ``sql`` section of the dataset yaml can be moved to its own file in the `queries`_ directory and included in the yaml like so: | ||
1. Fork this repository and set up a local Tutor instance with Aspects installed. | ||
2. You should work on the non-localized versions of the Superset dashboards. Export the new or updated dashboard(s) using Superset’s “Export” feature. It is best to export the entire dashboard instead of just charts or datasets to ensure that all of the correct changes are captured. | ||
3. Use the command: | ||
|
||
.. code-block:: yaml | ||
.. code-block:: bash | ||
sql: "{% include 'openedx-assets/queries/query.sql' %}" | ||
tutor aspects import_superset_zip ~/Downloads/your_file.zip | ||
4. Update database connection strings to use template variables. | ||
5. Validate and rebuild: | ||
|
||
However, please keep in mind that the assets declaration is itself a jinja template. That means that any jinja used in the dataset definition should be escaped. There are examples of how to handle this in the existing queries, such as `dim_courses.sql`_. | ||
.. code-block:: bash | ||
.. _queries: tutoraspects/templates/openedx-assets/queries/ | ||
tutor images build aspects-superset --no-cache | ||
tutor aspects check_superset_assets | ||
tutor local do import-assets | ||
.. _dim_courses.sql: tutoraspects/templates/openedx-assets/queries/dim_courses.sql | ||
6. Submit a pull request with screenshots and details of your contributions. | ||
|
||
Release Workflow | ||
================ | ||
|
||
Releasing tutor-contrib-aspects | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
Releases are handled by repository maintainers via GitHub Actions: | ||
|
||
Changelog, package version, PyPI release, and image building are all handled via manually triggered GitHub Actions. | ||
- Trigger the **Bump version and changelog** action to update the version and changelog. | ||
- Merge the PR to initiate the **release** and **build-image** workflows. | ||
|
||
To trigger a build you must have access to manually trigger the "Bump version and changelog" action. This will update the version and changelog in a new PR. If the PR looks good, you can approve and merge it. Merging this PR will: | ||
Ensure the updated version appears on `PyPI <https://pypi.org>`_ and DockerHub. | ||
|
||
- Trigger the "release" workflow, which tags a GitHub release with the new version number and then pushes the release to PyPI | ||
- Trigger the "build-image" workflow, which builds our images for aspects, aspects-superset, and openedx and pushes them to the EduNEXT DockerHub repositories | ||
Additional Resources | ||
===================== | ||
|
||
When the workflows are finished you should confirm that you see the new version on PyPI and images in DockerHub. | ||
- `Tutor Documentation <https://docs.tutor.overhang.io>`_ | ||
- `Aspects Beta Progress <https://openedx.atlassian.net/wiki/spaces/COMM/pages/3861512203/Aspects+Beta>`_ | ||
- `Superset Documentation <https://superset.apache.org/docs>`_ | ||
- `DBT Documentation <https://www.getdbt.com/docs/>`_ | ||
- `Event Routing Backends Documentation <https://event-routing-backends.readthedocs.io/en/latest/>`_ | ||
- `Tracking Logs Documentation <https://vector.dev/docs/>`_ |