Memory leakage upon repeated training #230

Open
CasBex opened this issue Jan 24, 2024 · 1 comment

CasBex commented Jan 24, 2024

Hi, I've been training random forest regressors lately and noticed high memory usage during hyperparameter tuning. The package appears to leak memory: for some reason, Julia does not free the trees once they become unreachable.

Below is an MWE. After each call to run_forests returns, the forest is unreachable and its memory should be reclaimed, but that doesn't happen and memory usage keeps increasing. In the second loop, however, memory usage stays constant.

using DecisionTree
function run_forests(features, labels)
    forest = build_forest(labels, features)
    labels .+= apply_forest(forest, features)
    labels ./= 2
end

function run_something_else(features, labels)
    C = repeat(features, inner=(2,2))
    labels ./= vec(sum(C, dims=2))[1:length(labels)]
end

const features = rand(10_000, 10)
const labels = sum(features, dims=2) |> vec

# notice memory consumption increases every couple of iterations
for i = 1:1_000
    run_forests(features, labels)
    @info "Iteration $i current memory used" Sys.maxrss()
end

# notice memory consumption does not increase every couple of iterations
for i = 1:1_000
    run_something_else(features, labels)
    @info "Iteration $i current memory used" Sys.maxrss()
end

Any idea what might cause this?

@rikhuijzer
Member

Yes, I can confirm this on Julia 1.10 (aarch64 Apple Darwin). Memory usage as shown in htop slowly increases each time I call include("tmp.jl").

> Any idea what might cause this?

It could be a (multi-threading-related) leak somewhere in Julia itself: https://github.com/search?q=repo%3AJuliaLang%2Fjulia+memory&type=issues&p=2
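
One way to narrow this down, as a rough sketch rather than a definitive diagnosis: `Sys.maxrss()` reports the peak resident set size of the process, so it can never decrease even when memory is freed. Forcing a full collection and reading `Base.gc_live_bytes()` shows how much the Julia heap itself still holds. The helper name `live_bytes_after` and the stand-in workload are assumptions made for illustration; they are not part of DecisionTree.jl.

```julia
# Hypothetical helper (not part of DecisionTree.jl): run `work` repeatedly and
# record the bytes still live on the Julia heap after a forced full collection.
function live_bytes_after(work; iterations = 5)
    samples = Int64[]
    for _ in 1:iterations
        work()
        GC.gc(true)                           # force a full collection
        push!(samples, Base.gc_live_bytes())  # bytes the GC still considers live
    end
    return samples
end

# Stand-in workload that allocates a temporary array and drops it again.
# To probe the MWE, pass `() -> run_forests(features, labels)` instead.
samples = live_bytes_after(() -> sum(rand(1_000, 100)))

@info "Live heap bytes after each iteration" samples
@info "Peak RSS (monotonic, for comparison)" Sys.maxrss()
```

If the live-byte count stays roughly flat while htop keeps climbing, the growth is probably outside the GC-managed heap, or freed memory is not being returned to the OS. If the count grows as well, something is likely still holding references to the trees.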
