feat: nanoGPT implementation using Reactant #1062

Draft
avik-pal wants to merge 3 commits into main from ap/nanogpt_reactant
Conversation

@avik-pal (Member) commented Nov 9, 2024:

Needs

error: could not compute the adjoint for this operation %51 = "stablehlo.dynamic_gather"(%15, %0, %12) <{dimension_numbers = #stablehlo.gather<offset_dims = [0], collapsed_slice_dims = [1], start_index_map = [1], index_vector_dim = 1>}> : (tensor<64x64xf32>, tensor<64xi64>, tensor<2xi64>) -> tensor<64x64xf32>
error: could not compute the adjoint for this operation %47 = "stablehlo.dynamic_gather"(%14, %46, %12) <{dimension_numbers = #stablehlo.gather<offset_dims = [0], collapsed_slice_dims = [1], start_index_map = [1], index_vector_dim = 1>}> : (tensor<64x67xf32>, tensor<8192xi64>, tensor<2xi64>) -> tensor<64x8192xf32>
ERROR: LoadError: "failed to run pass manager on module"
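
For context, here is a hedged sketch (all names are illustrative, not taken from the PR) of the kind of indexing that lowers to stablehlo.dynamic_gather: an embedding-style lookup of matrix columns by an integer index vector. The forward pass traces; the errors above come from the gradient pass, which cannot derive an adjoint for that op.

using Reactant

embedding = Reactant.to_rarray(rand(Float32, 64, 64))   # (embed_dim, vocab_size)
tokens    = Reactant.to_rarray(rand(1:64, 64))          # integer token ids

# Column lookup with a traced integer vector lowers to `stablehlo.dynamic_gather`.
lookup(E, idx) = E[:, idx]

lookup_compiled = @compile lookup(embedding, tokens)    # forward pass compiles
lookup_compiled(embedding, tokens)                      # 64×64 result, as in the error above
# Requesting a reverse-mode gradient through `lookup` during training is where
# the "could not compute the adjoint" failure would surface.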

@avik-pal marked this pull request as draft on November 9, 2024 02:19
examples/NanoGPT/main.jl review thread (outdated)
@avik-pal force-pushed the ap/nanogpt_reactant branch 4 times, most recently from 844a274 to d2e7299 on November 15, 2024 02:58
@avik-pal force-pushed the ap/nanogpt_reactant branch from d2e7299 to f4504bc on November 16, 2024 03:33
@avik-pal force-pushed the ap/nanogpt_reactant branch from f4504bc to 9b97721 on November 16, 2024 18:53

train_loader = DataLoader(
    (trainX, trainY); batchsize, shuffle=true, parallel=true
) |> dev


Suggested change:
- ) |> dev
+ ) .|> dev

@avik-pal (Member, Author) replied:

Why is this needed? Broadcasting over the dataloader will avoid constructing the DeviceIterator.

Reply:

At least on GPU, without this the labels weren't being transferred to the device.
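
For context, a hedged sketch of the two variants under discussion, using dummy data and assuming the usual MLDataDevices semantics (this is not code from the PR):

using MLUtils, MLDataDevices

dev = gpu_device()                 # assumed stand-in; the PR targets a Reactant/GPU device
trainX = rand(Float32, 64, 256)    # dummy feature array
trainY = rand(Float32, 64, 256)    # dummy label array
batchsize = 32

loader = DataLoader((trainX, trainY); batchsize, shuffle=true, parallel=true)

# `loader |> dev` wraps the DataLoader in a DeviceIterator: each (x, y) batch
# is moved to the device lazily, as it is iterated.
train_loader_lazy = loader |> dev

# `loader .|> dev` broadcasts `dev` over the batches instead: every (x, y)
# pair is transferred up front (fixing the shuffle order) and no
# DeviceIterator is constructed.
train_loader_eager = loader .|> dev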

end

test_loss = loss_fn(
    Array(first(model_compiled(testX, ps, Lux.testmode(st)))), testY


Suggested change:
- Array(first(model_compiled(testX, ps, Lux.testmode(st)))), testY
+ first(model_compiled(testX, ps, Lux.testmode(st))), testY

@avik-pal (Member, Author) replied:

We need to compile the loss_fn for the non-Array version to work

Reply:

Ah yeah, this might not be needed for your case here.
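
For reference, a hedged sketch of what compiling the loss end to end could look like here; the eval_loss wrapper is hypothetical and the other names are reused from the snippet above, so this is not the PR's actual code:

# Hypothetical: compile the forward pass and the loss together, so the loss
# consumes device arrays directly and the Array(...) host copy is unnecessary.
function eval_loss(model, x, y, ps, st)
    ŷ, _ = model(x, ps, Lux.testmode(st))   # Lux call: returns (output, new_state)
    return loss_fn(ŷ, y)
end

eval_loss_compiled = @compile eval_loss(model, testX, testY, ps, st)
test_loss = eval_loss_compiled(model, testX, testY, ps, st)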

@avik-pal requested a review from Copilot on December 11, 2024 14:23


Copilot reviewed 3 out of 4 changed files in this pull request and generated no suggestions.

Files not reviewed (1)
  • examples/NanoGPT/main.jl: Language not supported