CUDA Support for ALS #37

Open
wants to merge 24 commits into
base: master
24 commits
d48f0f1
Fixes #36 (with simple MTTKRP implementation)
alexmul1114 Jan 8, 2024
4596484
Change to Arrays AbstractArrays
alexmul1114 Jan 8, 2024
6e2a476
Replace skipmissing
alexmul1114 Jan 10, 2024
01e2ab9
Use ismissing in mapreduce instead of skipmissing
alexmul1114 Jan 10, 2024
1e6097e
Add new _gcp method for CuArray
alexmul1114 Jan 10, 2024
fd71ca4
Add CUDA gcp as extension, use simple MTTKRP implementation and khatr…
alexmul1114 Jan 10, 2024
e5b30ea
Add CUDA to deps for backward compatibility
alexmul1114 Jan 10, 2024
c34cc3c
Move CUDA to extras
alexmul1114 Jan 10, 2024
966e30b
Change CUDA version compatibility
alexmul1114 Jan 10, 2024
9938c46
Modify project.toml
alexmul1114 Jan 10, 2024
e8077a3
Test with new MTTKRP from #35
alexmul1114 Jan 11, 2024
fb1d0ce
Change .= back to =
alexmul1114 Jan 11, 2024
e368433
Fix typos
alexmul1114 Jan 11, 2024
614929d
Use latest MTTKRP from #35
alexmul1114 Jan 11, 2024
7854ec8
collect selectdim for now
alexmul1114 Jan 11, 2024
df803e1
Try reusing inner
alexmul1114 Jan 11, 2024
2a7f0c2
Use copy in MTTKRP middle mode Rn computation for now
alexmul1114 Jan 11, 2024
b72f6e9
Change abstract arrays back to arrays
alexmul1114 Jan 12, 2024
12f0004
Remove typo
alexmul1114 Jan 12, 2024
e9680f6
Switch khatrirao to accept abstractmatrix
alexmul1114 Jan 12, 2024
ebc5455
Fix typo
alexmul1114 Jan 12, 2024
ab40583
Temporarily use Float32 for comparison against GPU
alexmul1114 Jan 15, 2024
f4696c0
Replace norm function
alexmul1114 Jan 18, 2024
d7f68ed
Merge branch 'master' into pr/alexmul1114/37
dahong67 Mar 6, 2024
7 changes: 7 additions & 0 deletions Project.toml
@@ -4,6 +4,7 @@ authors = ["David Hong <[email protected]> and contributors"]
version = "0.1.2"

[deps]
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"
Compat = "34da2185-b29b-5c13-b0c7-acf172513d20"
ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
IntervalSets = "8197267c-284f-5f27-9208-e0e47529a953"
@@ -13,12 +14,15 @@ LossFunctions = "30fc2ffe-d236-52d8-8643-a9d8f7c094a7"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"

[weakdeps]
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"
LossFunctions = "30fc2ffe-d236-52d8-8643-a9d8f7c094a7"

[extensions]
CUDAExt = "CUDA"
LossFunctionsExt = "LossFunctions"

[compat]
CUDA = ">= 4.4.1"
Compat = "3.42, 4"
ForwardDiff = "0.10.36"
IntervalSets = "0.7.7"
@@ -27,3 +31,6 @@ LinearAlgebra = "1.6"
LossFunctions = "0.11.1"
Random = "1.6"
julia = "1.6"

[extras]
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"
47 changes: 47 additions & 0 deletions ext/CUDAExt.jl
@@ -0,0 +1,47 @@
module CUDAExt

using GCPDecompositions, CUDA

GCPDecompositions.gcp(
X::CuArray,
r,
loss = LeastSquaresLoss();
constraints = (),
algorithm = GCPAlgorithms.ALS(),
) = _gcp(X, r, loss, constraints, algorithm)
function _gcp(
X::CuArray{TX,N},
r,
loss::LeastSquaresLoss,
constraints::Tuple{},
algorithm::GCPAlgorithms.ALS,
) where {TX<:Real,N}
T = promote_type(TX, Float32)

# Random initialization
M0 = CPD(ones(T, r), rand.(T, size(X), r))
M0norm = sqrt(sum(abs2, M0[I] for I in CartesianIndices(size(M0))))
Review comment (Contributor, Author):

Added CUDA as an extension that defines a gcp method for CuArray input. Right now M0 is created and normalized on the CPU, then moved to the GPU for ALS (and moved back to the CPU to be returned at the end). Need to figure out how to rewrite line 24 without scalar indexing so that M0 can be created directly as a CuArray.
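
One possible way to drop the scalar indexing in the M0norm computation, offered as a sketch rather than as what this PR does: the norm of a CP model can be computed from the r × r Gram matrices of its factors, using the identity ‖M‖² = λᵀ (U₁ᵀU₁ ∘ … ∘ U_NᵀU_N) λ, where ∘ is the elementwise product. This uses only matrix products, broadcasting, and `dot`, so it should in principle also run on CuArray factors, though that is untested here. The helper name `cpd_norm` is hypothetical and not part of GCPDecompositions.

```julia
using LinearAlgebra

# Hypothetical helper (not part of the PR): norm of the CP model with weights λ
# and factor matrices U, computed without indexing individual tensor entries.
function cpd_norm(λ::AbstractVector, U)
    # r × r elementwise product of the Gram matrices of all factors
    G = reduce(.*, (Uk' * Uk for Uk in U))
    # abs guards against a tiny negative value caused by round-off
    return sqrt(abs(dot(λ, G * λ)))
end

# CPU check against the entry-by-entry definition of the model
λ = ones(3)
U = (rand(4, 3), rand(5, 3), rand(6, 3))
dense = [sum(λ[j] * U[1][i1, j] * U[2][i2, j] * U[3][i3, j] for j in 1:3)
         for i1 in 1:4, i2 in 1:5, i3 in 1:6]
@assert cpd_norm(λ, U) ≈ norm(dense)
```

If this approach were adopted, the factors could be generated on the device and normalized there, assuming device-side random generation is acceptable for the initialization.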

Xnorm = sqrt(mapreduce(x -> isnan(x) ? 0 : abs2(x), +, X, init=0f0))
for k in Base.OneTo(N)
M0.U[k] .*= (Xnorm / M0norm)^(1 / N)
end
λ, U = M0.λ, collect(M0.U)

# Move λ, U to gpu
λ = CuArray(λ)
U = [CuArray(U[i]) for i in 1:N]

# Inefficient but simple implementation
for _ in 1:algorithm.maxiters
for n in 1:N
V = reduce(.*, U[i]'U[i] for i in setdiff(1:N, n))
U[n] = GCPDecompositions.mttkrp(X, U, n) / V
λ = vec(sqrt.(sum(abs2, U[n]; dims=1)))
U[n] = U[n] ./ permutedims(λ)
end
end

return CPD(Array(λ), Tuple([Array(U[i]) for i in 1:N]))
end

end
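
For readers who want to follow the "inefficient but simple" loop above without a GPU, here is a CPU-only sketch of one ALS sweep. It assumes a naive MTTKRP built from a mode-n unfolding and a Khatri-Rao product; `khatrirao_naive`, `naive_mttkrp`, and `als_sweep!` are hypothetical names written for illustration and are not the package's `mttkrp` or `khatrirao`.

```julia
using LinearAlgebra

# Hypothetical column-wise Kronecker (Khatri-Rao) product, for illustration only
khatrirao_naive(A, B) = reduce(hcat, [kron(A[:, j], B[:, j]) for j in axes(A, 2)])

# Hypothetical naive MTTKRP: mode-n unfolding times Khatri-Rao of the other factors
function naive_mttkrp(X, U, n)
    N = ndims(X)
    others = setdiff(1:N, n)
    # Mode n becomes the rows; remaining modes stay in increasing order (first one fastest)
    Xn = reshape(permutedims(X, [n; others]), size(X, n), :)
    # Column-major unfolding pairs with the Khatri-Rao product taken in reverse mode order
    Z = reduce(khatrirao_naive, [U[i] for i in reverse(others)])
    return Xn * Z
end

# One ALS sweep, mirroring the update in the loop above
function als_sweep!(U, X)
    N = ndims(X)
    λ = ones(eltype(X), size(U[1], 2))
    for n in 1:N
        V = reduce(.*, U[i]'U[i] for i in setdiff(1:N, n))  # normal-equations matrix
        U[n] = naive_mttkrp(X, U, n) / V                     # least-squares update of factor n
        λ = vec(sqrt.(sum(abs2, U[n]; dims=1)))              # column norms become the weights
        U[n] = U[n] ./ permutedims(λ)                        # normalize columns
    end
    return λ, U
end

X = rand(3, 4, 5)
U = [rand(3, 2), rand(4, 2), rand(5, 2)]
λ, U = als_sweep!(U, X)
```

After a sweep, only the last updated factor is guaranteed to have unit-norm columns, and λ holds the norms removed from it.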
4 changes: 3 additions & 1 deletion src/GCPDecompositions.jl
@@ -24,6 +24,7 @@ include("gcp-algorithms.jl")

if !isdefined(Base, :get_extension)
include("../ext/LossFunctionsExt.jl")
include("../ext/CUDAExt.jl")
end

# Main fitting function
@@ -113,7 +114,8 @@ default_init(X, r, loss, constraints, algorithm) =
function default_init(rng, X, r, loss, constraints, algorithm)
# Generate CPD with random factors
T, N = nonmissingtype(eltype(X)), ndims(X)
T = promote_type(T, Float64)
#T = promote_type(T, Float64)
T = promote_type(T, Float32)
M = CPD(ones(T, r), rand.(rng, T, size(X), r))

# Normalize
2 changes: 1 addition & 1 deletion src/gcp-algorithms/als.jl
@@ -34,7 +34,7 @@ function _gcp(
V = reduce(.*, M.U[i]'M.U[i] for i in setdiff(1:N, n))
mttkrp!(M.U[n], X, M.U, n, mttkrp_buffers[n])
rdiv!(M.U[n], lu!(V))
M.λ .= norm.(eachcol(M.U[n]))
M.λ .= vec(sqrt.(sum(abs2, M.U[n]; dims=1)))
M.U[n] ./= permutedims(M.λ)
end
end
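
The replaced norm line above swaps a per-column `norm` call for a single reduction along dimension 1. The commit message ("Replace norm function") does not state the motivation; presumably the reduction form maps better onto GPU arrays. A quick CPU check that the two forms agree, with `colnorms` as a hypothetical helper name:

```julia
using LinearAlgebra

# Hypothetical helper: column 2-norms via one reduction over dimension 1
colnorms(A::AbstractMatrix) = vec(sqrt.(sum(abs2, A; dims=1)))

A = rand(Float32, 5, 3)
@assert colnorms(A) ≈ norm.(eachcol(A))   # matches the per-column form it replaces
```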