Run GC compaction before applying solutions #5376
How much time does the compaction take on your machine? (The easiest way to find out would be to add another debug line before it and run with debugging.) In the grand scheme of things, I’m wondering if we could just do the compaction unconditionally. It could also be done as a (priority) parallel task at the start of the action graph, which would further mitigate any speed impact. There’s a slight worry for the future, in that OCaml 5 at present has no compactor!
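As a rough illustration of the measurement being asked for, here is a minimal sketch that times one `Gc.compact` call. The helper name `time_compact` is hypothetical and is not part of opam; `Sys.time` reports processor time rather than wall-clock time, which should be close enough for a CPU-bound compaction.

```ocaml
(* Hypothetical helper, not opam code: time a single heap compaction.
   Sys.time returns processor time in seconds, so the difference is the
   CPU time spent inside Gc.compact. *)
let time_compact () =
  let t0 = Sys.time () in
  Gc.compact ();
  let elapsed = Sys.time () -. t0 in
  Printf.eprintf "GC compact took %.0f ms\n%!" (elapsed *. 1000.);
  elapsed

let () = ignore (time_compact ())
```

In opam itself the message would presumably go through the existing debug logging rather than `Printf.eprintf`.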
You should be able to get the same effect with a best-fit GC (the default in OCaml 4.14) and a |
Thank you for the replies. I will try the suggestions/experiments.
Ooops, I see. |
It took 747 ms. Hmmm, much longer than I expected. Is this small enough for us to run it unconditionally?
I am taking some time to understand how it works and where it is. Any pointers would be appreciated.
Unfortunately,
|
That's what I'd expect: best-fit might reduce the peak heap requirement (although I expect opam's memory profile is a stream of allocations, followed by a large amount of garbage immediately after the solution is converted to an action graph, so fragmentation is hopefully not causing much of a problem). @rjbou - what I was thinking here was that possibly at the point where opamProcess first is going to wait for the completion of running jobs, we could add a `Gc.compact`?
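The idea of compacting only at the first wait can be sketched as follows. This is an assumption-laden illustration, not the actual opamProcess code: `compacted`, `compact_once` and `wait_for_job` are hypothetical names, and `wait` stands in for whatever blocking call the job runner really makes.

```ocaml
(* Hypothetical sketch, not opam code: compact the heap only the first
   time the main process is about to block on its child jobs. *)
let compacted = ref false

let compact_once () =
  if not !compacted then begin
    compacted := true;
    Gc.compact ()
  end

(* [wait] is a stand-in for the blocking call that collects a finished
   job. The compaction runs before the first such call only. *)
let wait_for_job wait =
  compact_once ();
  wait ()
```

Since the children are already running when the parent reaches this point, the cost of the compaction should overlap with their work instead of delaying the start of the actions.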
In ocaml#5376, @dra27 suggested running `Gc.compact` when the main process is waiting for the children processes for the first time.

> what I was thinking here was that possibly at the point where opamProcess first is going to wait
> for the completion of running jobs, we could add a Gc.compact?

In my local running on `opam install ppxlib`, "GC compact" ran in the middle of parallel processing of actions.

```
The following actions will be performed:
=== install 1 package
  ∗ ppxlib 0.28.0

00:07.216 XSYS Adding to env { LC_ALL=C }
00:09.099 STATE depexts loaded in 1.883s
00:09.100 SOLUTION parallel_apply
00:09.100 SOLUTION Regroup shared source packages: {}

<><> Processing actions <><><><><><><><><><><><><><><><><><><><><><><><><><> 🐫
00:09.106 PARALLEL Iterate over 3 task(s) with 11 process(es)
00:09.106 PARALLEL Starting job 444950918 (worker -/11 -/1 1/3): ⬇ ppxlib.0.28.0
00:09.106 SOLUTION Fetching sources for ppxlib.0.28.0
00:09.106 ACTION download_package: ppxlib.0.28.0
00:09.106 SYSTEM rmdir /Users/scho/.opam/4.14.0/.opam-switch/sources/ppxlib.0.28.0
00:09.109 SYSTEM mkdir /var/folders/gx/6809fgsd3nndyyf69y3h12_h0000gn/T/opam-14220-67d907
00:09.110 PARALLEL Next task in job 444950918: /usr/bin/tar xfj /Users/scho/.opam/download-cache/sha256/d8/d87ae5f9a081206308ca964809b50a66aeb8e83d254801e8b9675448b60cf377 -C /var/folders/gx/6809fgsd3nndyyf69y3h12_h0000gn/T/opam-14220-67d907
00:09.619 PARALLEL GC compact (heap 490 MB -> 328 MB)
00:09.619 PARALLEL Collected task for job 444950918 (ret:0)
00:10.158 SYSTEM rmdir /var/folders/gx/6809fgsd3nndyyf69y3h12_h0000gn/T/opam-14220-67d907
⬇ retrieved ppxlib.0.28.0 (cached)
00:10.321 PARALLEL Job 444950918 finished
```

Similar to ocaml#5376, this PR enabled my 1GB-RAM machine to install `ppxlib` or `js_of_ocaml` without OOM.
Hello. I am curious whether you are interested in running additional garbage collections for small-RAM machines.

This PR runs garbage collection (`Gc.compact`) before applying solutions, with an additional option `--gc-before-action`.

Context: I have a tiny machine with 1 GB of RAM. When opam-installing `ppxlib` or `js_of_ocaml` on it, the process died with OOM. However, building the libraries directly from their source code was fine.

In my local runs, this PR saved about 240 MB (from 482 to 242) before starting library builds (e.g. when running `opam install ppxlib`). While the saving is not huge (and it is not a big deal for most modern machines), it was enough for my tiny machine to build `ppxlib` or `js_of_ocaml` without OOM.
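For readers who want to reproduce this kind of before/after figure, here is a minimal sketch that reports the major heap size around a `Gc.compact` call. The helpers `heap_mb` and `compact_and_report` are hypothetical names rather than the code in this PR, and the numbers will differ by program and OCaml version.

```ocaml
(* Hypothetical helpers, not the PR's code: report the major heap size
   before and after a compaction. *)
let heap_mb () =
  (* heap_words counts words in the major heap; convert words to MB. *)
  let words = (Gc.quick_stat ()).Gc.heap_words in
  words * (Sys.word_size / 8) / (1024 * 1024)

let compact_and_report () =
  let before = heap_mb () in
  Gc.compact ();
  let after = heap_mb () in
  Printf.eprintf "GC compact (heap %d MB -> %d MB)\n%!" before after;
  (before, after)

let () = ignore (compact_and_report ())
```

`Gc.quick_stat` is used instead of `Gc.stat` because it does not walk the heap, so the measurement itself stays cheap.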