Extract Python and Dask Executor classes from Workflow #1609
Click to view CI ResultsGitHub pull request #1609 of commit 4f3e941e62750333eccd6899cccf6181575b9b1e, no merge conflicts. Running as SYSTEM Setting status of 4f3e941e62750333eccd6899cccf6181575b9b1e to PENDING with url http://10.20.17.181:8080/job/nvtabular_tests/4573/ and message: 'Build started for merge commit.' Using context: Jenkins Unit Test Run Building on master in workspace /var/jenkins_home/workspace/nvtabular_tests using credential nvidia-merlin-bot Cloning the remote Git repository Cloning repository https://github.com/NVIDIA-Merlin/NVTabular.git > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git > git --version # timeout=10 using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/pull/1609/*:refs/remotes/origin/pr/1609/* # timeout=10 > git rev-parse 4f3e941e62750333eccd6899cccf6181575b9b1e^{commit} # timeout=10 Checking out Revision 4f3e941e62750333eccd6899cccf6181575b9b1e (detached) > git config core.sparsecheckout # timeout=10 > git checkout -f 4f3e941e62750333eccd6899cccf6181575b9b1e # timeout=10 Commit message: "Clean up `MerlinDaskExecutor.fit()`" > git rev-list --no-walk 1be6d8849ce7ced685fb755e168766b150e37536 # timeout=10 [nvtabular_tests] $ /bin/bash 
/tmp/jenkins4082524424769022190.sh ============================= test session starts ============================== platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0 rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: pyproject.toml plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0 collected 1428 items |
Click to view CI ResultsGitHub pull request #1609 of commit 64914a5f8965c646133e4417b807717ebfde610f, no merge conflicts. Running as SYSTEM Setting status of 64914a5f8965c646133e4417b807717ebfde610f to PENDING with url http://10.20.17.181:8080/job/nvtabular_tests/4583/ and message: 'Build started for merge commit.' Using context: Jenkins Unit Test Run Building on master in workspace /var/jenkins_home/workspace/nvtabular_tests using credential nvidia-merlin-bot Cloning the remote Git repository Cloning repository https://github.com/NVIDIA-Merlin/NVTabular.git > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git > git --version # timeout=10 using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/pull/1609/*:refs/remotes/origin/pr/1609/* # timeout=10 > git rev-parse 64914a5f8965c646133e4417b807717ebfde610f^{commit} # timeout=10 Checking out Revision 64914a5f8965c646133e4417b807717ebfde610f (detached) > git config core.sparsecheckout # timeout=10 > git checkout -f 64914a5f8965c646133e4417b807717ebfde610f # timeout=10 Commit message: "Merge branch 'main' into refactor/decouple-dask" > git rev-list --no-walk d5d379101ec42f6ba7b7f31fc9f3237f29d1b5fb # timeout=10 First time build. Skipping changelog. 
[nvtabular_tests] $ /bin/bash /tmp/jenkins1193956549961660074.sh ============================= test session starts ============================== platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0 rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: pyproject.toml plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0 collected 1428 items |
I suppose this would (partially) intersect with NVIDIA-Merlin/core#70
Yeah, good point @rjzamora. I would like to be able to do Dask computations across all the Merlin libraries, and also use Merlin graphs to run computations without Dask in some contexts (e.g. in Triton), so I ended up with a somewhat different design.
Click to view CI ResultsGitHub pull request #1609 of commit 7ca7c0def80043f81602f0400142d8e866a5d562, no merge conflicts. Running as SYSTEM Setting status of 7ca7c0def80043f81602f0400142d8e866a5d562 to PENDING with url http://10.20.17.181:8080/job/nvtabular_tests/4600/ and message: 'Build started for merge commit.' Using context: Jenkins Unit Test Run Building on master in workspace /var/jenkins_home/workspace/nvtabular_tests using credential nvidia-merlin-bot Cloning the remote Git repository Cloning repository https://github.com/NVIDIA-Merlin/NVTabular.git > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git > git --version # timeout=10 using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/pull/1609/*:refs/remotes/origin/pr/1609/* # timeout=10 > git rev-parse 7ca7c0def80043f81602f0400142d8e866a5d562^{commit} # timeout=10 Checking out Revision 7ca7c0def80043f81602f0400142d8e866a5d562 (detached) > git config core.sparsecheckout # timeout=10 > git checkout -f 7ca7c0def80043f81602f0400142d8e866a5d562 # timeout=10 Commit message: "Merge branch 'main' into refactor/decouple-dask" > git rev-list --no-walk 54c0038e16bfb8603e3f6ec7cbebb8ae5a4dc4a9 # timeout=10 First time build. Skipping changelog. 
[nvtabular_tests] $ /bin/bash /tmp/jenkins2239481737267509604.sh ============================= test session starts ============================== platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0 rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: pyproject.toml plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0 collected 1428 items |
Arbitration: which initiative is this under?
)
)

def fit(self, ddf, nodes):
📄 The nodes argument is missing from the Parameters section of the docstring here.

def __getstate__(self):
    # dask client objects aren't picklable - exclude from saved representation
    return {k: v for k, v in self.__dict__.items() if k != "client"}
❓ I'm wondering where the client attribute that this code is trying to exclude gets set on the object. I don't see a self.client in here. It could be set by something outside this module, I suppose. I'm not suggesting we remove this now: it was here before, so keeping it makes sense to reduce risk.
I think this is left over from a version before I realized I should use set_client_deprecated, so it is likely safe to remove. I know from the process of writing this that NVT tests will fail when saving a Workflow if there's a non-serializable client attribute on this object, so if it's problematic to remove, we'll find out quickly.
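For readers following the thread above, here is a minimal, self-contained sketch of the `__getstate__` pattern under discussion. It is not the code from this PR: `ExecutorSketch` and its `settings` attribute are hypothetical, and a `threading.Lock` stands in for a Dask client only because it is a standard-library object that also refuses to pickle.

```python
import pickle
import threading


class ExecutorSketch:
    """Hypothetical stand-in for an object that may hold an unpicklable handle."""

    def __init__(self, client=None):
        # Assumption: the lock plays the role of a Dask client in this sketch.
        self.client = client
        self.settings = {"npartitions": 2}

    def __getstate__(self):
        # Exclude the unpicklable attribute from the saved representation.
        return {k: v for k, v in self.__dict__.items() if k != "client"}

    def __setstate__(self, state):
        self.__dict__.update(state)
        # Re-create the attribute as None so later lookups don't raise AttributeError.
        self.client = None


executor = ExecutorSketch(client=threading.Lock())
restored = pickle.loads(pickle.dumps(executor))
```

If nothing ever sets `client` on the real object, the filter in `__getstate__` is a no-op, which fits the suggestion that it is likely safe to remove.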
😃 This PR is a great example of separating changes into well-defined commits, which makes a refactor like this easy to review. 🚀 It looks like a great step toward being able to run these transforms in different modes. I imagine we may identify further changes as we try to use this in Systems. In the interest of keeping the changes relatively small, it seems to be in a mergeable state to me.
@viswa-nvidia This PR was opened on the premise that we'd be working on offline batch recs generation in 22.08, as we'd planned before session-based bumped it out of the way. Since we still plan to work on offline batch (albeit later than we'd originally hoped), this PR is still relevant but not tied to one of the pieces of work slated for 22.08.
Click to view CI ResultsGitHub pull request #1609 of commit 242fc3657c847d7ed026dc657dc5a331c73ca015, no merge conflicts. Running as SYSTEM Setting status of 242fc3657c847d7ed026dc657dc5a331c73ca015 to PENDING with url http://10.20.17.181:8080/job/nvtabular_tests/4612/ and message: 'Build started for merge commit.' Using context: Jenkins Unit Test Run Building on master in workspace /var/jenkins_home/workspace/nvtabular_tests using credential nvidia-merlin-bot Cloning the remote Git repository Cloning repository https://github.com/NVIDIA-Merlin/NVTabular.git > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git > git --version # timeout=10 using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/pull/1609/*:refs/remotes/origin/pr/1609/* # timeout=10 > git rev-parse 242fc3657c847d7ed026dc657dc5a331c73ca015^{commit} # timeout=10 Checking out Revision 242fc3657c847d7ed026dc657dc5a331c73ca015 (detached) > git config core.sparsecheckout # timeout=10 > git checkout -f 242fc3657c847d7ed026dc657dc5a331c73ca015 # timeout=10 Commit message: "Merge branch 'main' into refactor/decouple-dask" > git rev-list --no-walk 302f7c355a27bd485f293a4494785ea89d29949e # timeout=10 First time build. Skipping changelog. 
[nvtabular_tests] $ /bin/bash /tmp/jenkins2058300991048675202.sh ============================= test session starts ============================== platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0 rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: pyproject.toml plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0 collected 1432 items |
I'm not able to reproduce these test failures locally, even in the |
rerun tests |
Click to view CI ResultsGitHub pull request #1609 of commit 242fc3657c847d7ed026dc657dc5a331c73ca015, no merge conflicts. GitHub pull request #1609 of commit 242fc3657c847d7ed026dc657dc5a331c73ca015, no merge conflicts. Running as SYSTEM Setting status of 242fc3657c847d7ed026dc657dc5a331c73ca015 to PENDING with url http://10.20.17.181:8080/job/nvtabular_tests/4613/ and message: 'Build started for merge commit.' Using context: Jenkins Unit Test Run Building on master in workspace /var/jenkins_home/workspace/nvtabular_tests using credential nvidia-merlin-bot Cloning the remote Git repository Cloning repository https://github.com/NVIDIA-Merlin/NVTabular.git > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git > git --version # timeout=10 using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/pull/1609/*:refs/remotes/origin/pr/1609/* # timeout=10 > git rev-parse 242fc3657c847d7ed026dc657dc5a331c73ca015^{commit} # timeout=10 Checking out Revision 242fc3657c847d7ed026dc657dc5a331c73ca015 (detached) > git config core.sparsecheckout # timeout=10 > git checkout -f 242fc3657c847d7ed026dc657dc5a331c73ca015 # timeout=10 Commit message: "Merge branch 'main' into refactor/decouple-dask" > git rev-list 
--no-walk 242fc3657c847d7ed026dc657dc5a331c73ca015 # timeout=10 [nvtabular_tests] $ /bin/bash /tmp/jenkins7892443554037532412.sh ============================= test session starts ============================== platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0 rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: pyproject.toml plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0 collected 1432 items |
Click to view CI ResultsGitHub pull request #1609 of commit 9df466c566c9f80b1282693baecbd07c6a2d6bb6, no merge conflicts. Running as SYSTEM Setting status of 9df466c566c9f80b1282693baecbd07c6a2d6bb6 to PENDING with url http://10.20.17.181:8080/job/nvtabular_tests/4626/ and message: 'Build started for merge commit.' Using context: Jenkins Unit Test Run Building on master in workspace /var/jenkins_home/workspace/nvtabular_tests using credential nvidia-merlin-bot Cloning the remote Git repository Cloning repository https://github.com/NVIDIA-Merlin/NVTabular.git > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git > git --version # timeout=10 using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/pull/1609/*:refs/remotes/origin/pr/1609/* # timeout=10 > git rev-parse 9df466c566c9f80b1282693baecbd07c6a2d6bb6^{commit} # timeout=10 Checking out Revision 9df466c566c9f80b1282693baecbd07c6a2d6bb6 (detached) > git config core.sparsecheckout # timeout=10 > git checkout -f 9df466c566c9f80b1282693baecbd07c6a2d6bb6 # timeout=10 Commit message: "Merge branch 'main' into refactor/decouple-dask" > git rev-list --no-walk 8bd1260ba233898308f1416f79cefbd75013f4ff # timeout=10 First time build. Skipping changelog. 
[nvtabular_tests] $ /bin/bash /tmp/jenkins15266300073057526636.sh ============================= test session starts ============================== platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0 rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: pyproject.toml plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0 collected 1430 items / 1 skipped |
rerun tests |
Click to view CI ResultsGitHub pull request #1609 of commit 9df466c566c9f80b1282693baecbd07c6a2d6bb6, no merge conflicts. GitHub pull request #1609 of commit 9df466c566c9f80b1282693baecbd07c6a2d6bb6, no merge conflicts. Running as SYSTEM Setting status of 9df466c566c9f80b1282693baecbd07c6a2d6bb6 to PENDING with url http://10.20.17.181:8080/job/nvtabular_tests/4627/ and message: 'Build started for merge commit.' Using context: Jenkins Unit Test Run Building on master in workspace /var/jenkins_home/workspace/nvtabular_tests using credential nvidia-merlin-bot Cloning the remote Git repository Cloning repository https://github.com/NVIDIA-Merlin/NVTabular.git > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git > git --version # timeout=10 using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/pull/1609/*:refs/remotes/origin/pr/1609/* # timeout=10 > git rev-parse 9df466c566c9f80b1282693baecbd07c6a2d6bb6^{commit} # timeout=10 Checking out Revision 9df466c566c9f80b1282693baecbd07c6a2d6bb6 (detached) > git config core.sparsecheckout # timeout=10 > git checkout -f 9df466c566c9f80b1282693baecbd07c6a2d6bb6 # timeout=10 Commit message: "Merge branch 'main' into refactor/decouple-dask" > git rev-list 
--no-walk 9df466c566c9f80b1282693baecbd07c6a2d6bb6 # timeout=10 [nvtabular_tests] $ /bin/bash /tmp/jenkins5374540968505043348.sh ============================= test session starts ============================== platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0 rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: pyproject.toml plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0 collected 1430 items / 1 skipped |
rerun tests |
Click to view CI ResultsGitHub pull request #1609 of commit 9df466c566c9f80b1282693baecbd07c6a2d6bb6, no merge conflicts. GitHub pull request #1609 of commit 9df466c566c9f80b1282693baecbd07c6a2d6bb6, no merge conflicts. Running as SYSTEM Setting status of 9df466c566c9f80b1282693baecbd07c6a2d6bb6 to PENDING with url http://10.20.17.181:8080/job/nvtabular_tests/4628/ and message: 'Build started for merge commit.' Using context: Jenkins Unit Test Run Building on master in workspace /var/jenkins_home/workspace/nvtabular_tests using credential nvidia-merlin-bot Cloning the remote Git repository Cloning repository https://github.com/NVIDIA-Merlin/NVTabular.git > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git > git --version # timeout=10 using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/pull/1609/*:refs/remotes/origin/pr/1609/* # timeout=10 > git rev-parse 9df466c566c9f80b1282693baecbd07c6a2d6bb6^{commit} # timeout=10 Checking out Revision 9df466c566c9f80b1282693baecbd07c6a2d6bb6 (detached) > git config core.sparsecheckout # timeout=10 > git checkout -f 9df466c566c9f80b1282693baecbd07c6a2d6bb6 # timeout=10 Commit message: "Merge branch 'main' into refactor/decouple-dask" > git rev-list 
--no-walk 9df466c566c9f80b1282693baecbd07c6a2d6bb6 # timeout=10 [nvtabular_tests] $ /bin/bash /tmp/jenkins16619067910149829981.sh ============================= test session starts ============================== platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0 rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: pyproject.toml plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0 collected 1430 items / 1 skipped |
rerun tests |
Click to view CI ResultsGitHub pull request #1609 of commit 9df466c566c9f80b1282693baecbd07c6a2d6bb6, no merge conflicts. GitHub pull request #1609 of commit 9df466c566c9f80b1282693baecbd07c6a2d6bb6, no merge conflicts. Running as SYSTEM Setting status of 9df466c566c9f80b1282693baecbd07c6a2d6bb6 to PENDING with url http://10.20.17.181:8080/job/nvtabular_tests/4632/ and message: 'Build started for merge commit.' Using context: Jenkins Unit Test Run Building on master in workspace /var/jenkins_home/workspace/nvtabular_tests using credential nvidia-merlin-bot Cloning the remote Git repository Cloning repository https://github.com/NVIDIA-Merlin/NVTabular.git > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git > git --version # timeout=10 using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/pull/1609/*:refs/remotes/origin/pr/1609/* # timeout=10 > git rev-parse 9df466c566c9f80b1282693baecbd07c6a2d6bb6^{commit} # timeout=10 Checking out Revision 9df466c566c9f80b1282693baecbd07c6a2d6bb6 (detached) > git config core.sparsecheckout # timeout=10 > git checkout -f 9df466c566c9f80b1282693baecbd07c6a2d6bb6 # timeout=10 Commit message: "Merge branch 'main' into refactor/decouple-dask" > git rev-list 
--no-walk 5e149c8a6f16a47cd99a23f4c060318f247fca7b # timeout=10 First time build. Skipping changelog. [nvtabular_tests] $ /bin/bash /tmp/jenkins3170339596225298332.sh ============================= test session starts ============================== platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0 rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: pyproject.toml plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0 collected 1430 items / 1 skipped |
The tests for this keep hanging on the multi-GPU Jenkins machine. Not sure if it's an issue with this PR specifically, or NVTabular PRs in general...
Click to view CI ResultsGitHub pull request #1609 of commit 35f7c158c6023ef878644de0b65dbdfa3d28b609, no merge conflicts. Running as SYSTEM Setting status of 35f7c158c6023ef878644de0b65dbdfa3d28b609 to PENDING with url http://10.20.17.181:8080/job/nvtabular_tests/4633/ and message: 'Build started for merge commit.' Using context: Jenkins Unit Test Run Building on master in workspace /var/jenkins_home/workspace/nvtabular_tests using credential nvidia-merlin-bot Cloning the remote Git repository Cloning repository https://github.com/NVIDIA-Merlin/NVTabular.git > git init /var/jenkins_home/workspace/nvtabular_tests/nvtabular # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git > git --version # timeout=10 using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10 > git config remote.origin.url https://github.com/NVIDIA-Merlin/NVTabular.git # timeout=10 Fetching upstream changes from https://github.com/NVIDIA-Merlin/NVTabular.git using GIT_ASKPASS to set credentials This is the bot credentials for our CI/CD > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/NVTabular.git +refs/pull/1609/*:refs/remotes/origin/pr/1609/* # timeout=10 > git rev-parse 35f7c158c6023ef878644de0b65dbdfa3d28b609^{commit} # timeout=10 Checking out Revision 35f7c158c6023ef878644de0b65dbdfa3d28b609 (detached) > git config core.sparsecheckout # timeout=10 > git checkout -f 35f7c158c6023ef878644de0b65dbdfa3d28b609 # timeout=10 Commit message: "Merge branch 'main' into refactor/decouple-dask" > git rev-list --no-walk 9df466c566c9f80b1282693baecbd07c6a2d6bb6 # timeout=10 [nvtabular_tests] $ /bin/bash 
/tmp/jenkins5945207459896974934.sh ============================= test session starts ============================== platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0 rootdir: /var/jenkins_home/workspace/nvtabular_tests/nvtabular, configfile: pyproject.toml plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0 collected 1430 items / 1 skipped |
We'd like to re-use some of the mechanics of graph execution (both local and distributed) in other parts of Merlin, so this is a step in the direction of disentangling graph execution from Workflow itself. It removes direct dependencies on Dask from Workflow and centralizes them in MerlinDaskExecutor, which Workflow can then use in conjunction with a Merlin operator DAG to run distributed computations.

In the future, we'd like to use these Executor classes in Merlin Systems too, so that we can run the full process of generating recommendations (also represented as a Merlin DAG) interchangeably either in Triton (using MerlinPythonExecutor) or on Dask (using MerlinDaskExecutor).
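The separation described above can be illustrated with a small sketch in plain Python. All class names here (`LocalExecutorSketch`, `PartitionedExecutorSketch`, `WorkflowSketch`) and their `transform` signatures are hypothetical and are not the API introduced by this PR. The partitioned executor only mimics a distributed backend by processing chunks sequentially, and a list of callables stands in for an operator DAG.

```python
from typing import Callable, List, Sequence

Node = Callable[[int], int]


class LocalExecutorSketch:
    """Hypothetical executor: applies each node in-process to the whole dataset."""

    def transform(self, data: List[int], nodes: Sequence[Node]) -> List[int]:
        for node in nodes:
            data = [node(x) for x in data]
        return data


class PartitionedExecutorSketch:
    """Hypothetical executor: splits data into partitions and runs nodes per partition.

    A real distributed backend would ship partitions to workers; this sketch
    simply loops over them to show that the caller does not need to care.
    """

    def __init__(self, npartitions: int = 2):
        self.npartitions = npartitions

    def transform(self, data: List[int], nodes: Sequence[Node]) -> List[int]:
        size = max(1, -(-len(data) // self.npartitions))  # ceiling division
        partitions = [data[i:i + size] for i in range(0, len(data), size)]
        local = LocalExecutorSketch()
        out: List[int] = []
        for part in partitions:
            out.extend(local.transform(part, nodes))
        return out


class WorkflowSketch:
    """Holds the nodes and delegates all execution to whichever executor it is given."""

    def __init__(self, nodes: Sequence[Node], executor):
        self.nodes = list(nodes)
        self.executor = executor

    def transform(self, data: List[int]) -> List[int]:
        return self.executor.transform(data, self.nodes)


nodes = [lambda x: x + 1, lambda x: x * 2]
data = [1, 2, 3, 4, 5]
local_result = WorkflowSketch(nodes, LocalExecutorSketch()).transform(data)
partitioned_result = WorkflowSketch(nodes, PartitionedExecutorSketch(npartitions=2)).transform(data)
```

Because `WorkflowSketch` never imports or references the backend directly, swapping executors changes where the work runs without touching the workflow definition, which is the property the description is aiming for.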