-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make NPM post-processing agnostic to node_modules location #31519
Make NPM post-processing agnostic to node_modules location #31519
Conversation
Apologies for calling the original task into question a bit, but this very first sentence feels suspect to me:
This is a very odd situation that doesn't follow the normal rules of npm packages. I'd expect non-Webpack frontend code living in a vendor directory to have its own package.json file that lists its dependencies, and that they should also be I guess it's already happening that we're sharing the stuff we're copying out of node_modules? Do we ever have version skew issues? I'd expect so if edx-platform's version of its dependencies is being used for other, arbitrary JS code... seems like it would come up from time to time and be hard to debug. Another option that comes to mind would be for us to install the needed "stuff" as global npm packages (i.e., Sidebar, what is the vendor stuff we're referring to?! Got a quick pointer? |
I find the rules of how npm determines which node_modules directory to use to be a bit confusing, but my understanding is that it crawls up until it finds a package. When installing, it does the same thing... my concern is that by moving this up a directory, we create a weird situation where any siblings of edx-platform are finding a node_modules directory above them and resolving dependencies to it. Maybe that's not a thing that'll actually happen in reality, but it sort of puts us in a really confusing spot. Some more information we could stare at here: https://docs.npmjs.com/cli/v8/configuring-npm/folders |
Thanks for taking a look @davidjoy .
Indeed it is odd. I do not know exactly how we got to this point, but it feels like the result of moving from one framework to another without cleaning up the old stuff first. git-blame on most of the files in question sent me back 5-10 years.
Yup. I'm just tweaking what's already going on. Unless you pass a custom
I mean, the JS that is using these copied depencies is not "arbitrary other" JS; it's just older JS in edx-platform. Anyway, if you take a look at the list of dependencies, they don't strike me as things that we are keeping particularly fresh. To pick a random example, the last time our pin of (To be clear, I'm not defending the system, I'm just guessing as to why it's held together all this time :)
That would defeat the purpose of this PR, which is to allow node_modules to be placed in a parent directory edx-platform in a Docker image.
Yes--in fact, that's the behavior I'm counting on for the Tutor side of this change. The idea is that developers should be able to mount their own copy of edx-platform into a container /openedx/edx-platform, such that:
In the context of Tutor, I'm not concerned about this, because edx-platform is the only app on the container image. There is no sibling app to worry about. In this edx-platform PR, I've made sure not say where node_modules could be moved to. If you're ignoring Tutor and using edx-platform directly, then node_modules will still be faithfully located under edx-platform/node_modules. This PR just allows for node_modules to be moved by external tooling. |
You know, I think there's some terminology misuse going on either by me, edx-platform, or both. Let me try to clarify: edx-platform has a package.json and running There are JS modules in edx-platform that were never wired up look at node_modules; I guess this is the "non-Webpack" JS. Instead, they expect the code to be located in certain "vendor" directories. If I had to speculate, I'd assume they're called "vendor" dirs because the code was previously vendored-in (commited) to the git repo at those paths; once npm+node_modules was introduced, we wrote tooling to copy said libraries from node_modules into the vendor dirs instead. |
Sounds like we're doing this to cut down on install time, and just adding configuration for how you decide where the vendor files are originally copied from so that those using the tutor image can use the pre-baked node_modules in the image. I think it should work fine, according to the npm docs. My only hesitation is that node_modules can be particularly finicky at times, and this adds a layer to debugging dependency resolution that will be different depending on whether or not a given operator is using the image. Maybe it'll never come up, or rarely, at least. 😄 But I get it, and if it helps shorten the development cycle in some situations, I say Which is also to say it's reversible and doesn't sound like one need rely on this behavior. Yeehah. |
Ah, I think I see what happened here. My interpretation:
Does that sound accurate? Does any of it help? |
That's super helpful context, I think, yeah... @kdmccormick and I chatted in Slack and I summarized the conversation above for posterity. Without worrying too much about the off-the-rails-ness of the situation we're in, the PR adds some helpful flexibility (especially for backend work!) that doesn't destabilize anything in any significant way, just adds a tiny bit more wackiness for a big benefit. :) |
scripts/process-npm-assets.sh
Outdated
"$NODE_MODULES_PATH/requirejs/require.js" \ | ||
"$NODE_MODULES_PATH/underscore.string/dist/underscore.string.js" \ | ||
"$NODE_MODULES_PATH/underscore/underscore.js" \ | ||
"$NODE_MODULES_PATH/which-country/index.js" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This cp
command is not going to replicate the previous paver command. For instance: $NODE_MODULES_PATH/which-country/index.js
will be moved to JS_VENDOR_PATH/index.js
. But it should be moved to "JS_VENDOR_PATH/which-country/index.js
. (if I understand the code correctly)
I suspect that the same comment applies to the other cp
commands below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the code here is correct, although I wrote it months ago so I'm glad you are having me double-check.
If you run tutor local run lms openedx-assets npm
on tutor master (which calls the old process_npm_assets
), you get:
...
Copying vendor files into static directory
/bin/cp -rf node_modules/backbone.paginator/lib/backbone.paginator.js common/static/common/js/vendor
/bin/cp -rf node_modules/backbone/backbone.js common/static/common/js/vendor
/bin/cp -rf node_modules/bootstrap/dist/js/bootstrap.bundle.js common/static/common/js/vendor
... etc
Copying vendor library dir: node_modules/@edx/studio-frontend/dist/
/bin/cp -rf node_modules/@edx/studio-frontend/dist/assets.min.js.map common/static/common/js/vendor
/bin/cp -rf node_modules/@edx/studio-frontend/dist/courseHealthCheck.min.js common/static/common/js/vendor
/bin/cp -rf node_modules/@edx/studio-frontend/dist/common.min.js common/static/common/js/vendor
/bin/cp -rf node_modules/@edx/studio-frontend/dist/common.min.css common/static/common/css/vendor
/bin/cp -rf node_modules/@edx/studio-frontend/dist/courseHealthCheck.min.js.map common/static/common/js/vendor
... etc
...
It's copying the files directly into the common/static/common/[js|css]/vendor
directories. This matches my reading of copy_vendor_library.
The corresponding output from this PR's process-npm-assets.sh
script is:
+ find ./node_modules/@edx/studio-frontend/dist -type f ! -name *.css ! -name *.css.map -print0
+ xargs --null cp --target-directory=./common/static/common/js/vendor
+ find ./node_modules/@edx/studio-frontend/dist -type f ( -name *.css -o -name *.css.map+ ) -print0
xargs --null cp --target-directory=./common/static/common/css/vendor
+ rm -rf ./common/static/edx-ui-toolkit/js
+ cp --recursive ./node_modules/edx-ui-toolkit/src/js ./common/static/edx-ui-toolkit
+ cp --force ./node_modules/backbone.paginator/lib/backbone.paginator.js ./node_modules/backbone/backbone.js ./node_modules/bootstrap/dist/js/bootstrap.bundle.js ./node_modules/hls.js/dist/hls.js ./node_modules/jquery-migrate/dist/jquery-migrate.js ./node_modules/jquery.scrollto/jquery.scrollTo.js ./node_modules/jquery/dist/jquery.js ./node_modules/moment-timezone/builds/moment-timezone-with-data.js ./node_modules/moment/min/moment-with-locales.js ./node_modules/picturefill/dist/picturefill.js ./node_modules/requirejs/require.js ./node_modules/underscore.string/dist/underscore.string.js ./node_modules/underscore/underscore.js ./node_modules/which-country/index.js ./common/static/common/js/vendor
(The find ... | xargs cp ...
calls correspond to the Copying vendor library dir: node_modules/@edx/studio-frontend/dist/
section. The simple cp
calls correspond to the Copying vendor files into static directory
section.)
@@ -604,60 +571,11 @@ def process_npm_assets(): | |||
""" | |||
Process vendor libraries installed via NPM. | |||
""" | |||
def copy_vendor_library(library, skip_if_missing=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this process_npm_assets
function now call the process-npm-assets.sh
script? It seems to me that we would need this to be backward compatible.
As a side note, the openedx-assets.py
script in Tutor calls the process_npm_assets
function, so we would have to modify it if the function does not call the bash script: https://github.com/overhangio/tutor/blob/b903c69fac95e797532288d4e6f60530faf1db94/tutor/templates/build/openedx/bin/openedx-assets#L117
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this process_npm_assets function now call the process-npm-assets.sh script? It seems to me that we would need this to be backward compatible.
It does, see this line of the diff. And yes, goal is backwards-compatibility.
As a side note, the openedx-assets.py script in Tutor calls the process_npm_assets function, so we would have to modify it if the function does not call the bash script: https://github.com/overhangio/tutor/blob/b903c69fac95e797532288d4e6f60530faf1db94/tutor/templates/build/openedx/bin/openedx-assets#L117
For what it's worth, I do plan to change Tutor's openedx-assets.py script to call this bash script directly--we will need to do so in order to pass in the --node-modules=/openedx/node_modules
. That change is not necessary in order to merge this current PR, though.
When you've got the chance @regisb, this is ready for review. |
I updated the script's path from
|
Reimplement `pavelib/assets.py:process_npm_assets` equivalently as `scripts/assets/copy-node-modules.sh`. For backwards compatibility, `pavelib/assets.py:process_npm_assets` will still exist, just as a thin wrapper around the new shell script. Rationale: * Other parts of pavelib have already been reimplemented, like Python unit tests. We're following that trend. * The Python logic in pavelib is harder to understand than simple shell scripts. * pavelib has dependencies (Python, paver, edx-platform, other libs) which means that any pavelib scripts must be executed later in the edx-platform build process than we might want them to. For example, in a Dockerfile, it might be more performant to process npm assets *before* installing Python, but as long as we are still using pavelib, that is not an option. * The benefits of paver have been eclipsed by other tools, like Docker (for requisite management) and Click (for CLI building). * In the next couple commits, we make improvements to process-npm-assets.sh. These improvements would have been possible in the pavelib implementation, but would have been more complicated. Part of openedx-unsupported/wg-developer-experience#150 Closes #31606
The problem: In Tutor, NPM packages are installet d into the openedx image at /openedx/edx-platform/node_modules. So, when you bind-mount your own edx-platform, you override that node_modules folder. Then, you must re-install node_modules yourself (`tutor dev run --mount=edx-platform lms npm install`), which takes a long time and is completely redundant with the node_modules that you had to download when you downloaded the openedx image. If you forget to do this, then your edx-platform frontend will be broken. The solution: In Tutor: Move node_modules somewhere else in the openedx image, such as /openedx/node_modules. What this commit changes: Add a `--node-modules` parameter to copy-node-modules.sh, thus helping remove the assumption that node_modules must be a child directory of edx-platform. Part of openedx-unsupported/wg-developer-experience#150
The problem: (See previous commit) The solution: (See previous commit) What this commit changes: In order to allow certain especially old parts of edx-platform to use edx-ui-toolkit without involving webpack, the platform has a symlink: $EDX_PLATFORM/common/static/edx-ui-toolkit/js -> $EDX_PLATFORM/node_modules Unfortunately, this symlink is based on the assumption that node_modules will always be a child directory of edx-platform. In order to eliminate this assumption, we delete the symlink, and then have the existing `scripts/assets/copy-node-modules.sh` script copy edx-ui-toolkit/js from node_modules over to where the symlink used to be. Part of openedx-unsupported/wg-developer-experience#150
This is ready for review again. |
Thanks everyone for the reviews on this! I decided it'd be best to rewrite all of assets.py in bash first, separate from the node_modules changes here. Here's the ADR for that: #31790 Once the asset build is fully rewritten, I'll come back to this specific node_modules change, which will be much smaller and easier to review. |
Description
See openedx-unsupported/wg-developer-experience#150 for the full background.
This PR handles this particular piece of that issue:
Specifically, this PR:
edx-platform/pavelib/assets.py:process_npm_assets
as a shell script (first commit),common/static/edx-ui-toolkit/js -> ../../../node_modules/edx-ui-toolkit/src/js
in favor of just copying edx-ui-tookit/js as part of process_npm_assets (third commit).You can view the individual commits for more depth on each step.
You can see here how Tutor will use this new shell script.
Why not just edit edx-platform/pavelib/assets.py instead? Frankly, I could have. I just find the pavelib code to be overly complex and way too dependency-heavy what is really just a set of build scripts. Furthermore, Arbi-BOM's good work with unit tests affirms that we are moving away from pavelib. With that in mind, I decided to fully re-implement rather than update process_npm_assets in place. If you disagree with this decision, let me know; I'm happy to revisit it.
This is not meant to be a breaking change for any consumers of edx-platform.
Supporting information
Part of:
Blocks:
Related to
Testing instructions
tutor config save && tutor images pull openedx && tutor dev dc build lms
Deadline
No hard deadline, but sooner is better, since this is one small piece in a greater effort to make
tutor dev
an acceptable replacement for devstack.Other information
N/A