
Move optimizer from the network level to the layer level #184


Draft: wants to merge 4 commits into main

Conversation

@jvdp1 (Collaborator) commented on Jun 14, 2024

As discussed, here is a draft in which I suggest moving the optimizer from the network level to the layer level.

This is just a draft with an implementation for the dense layer only.
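
A minimal Fortran sketch of the idea, assuming hypothetical names (`sgd`, `dense_layer`, `weights`, `dw`, `minimize`, `update` are illustrative, not the actual neural-fortran API): each layer stores its own optimizer instance and applies the update directly to its own parameters, instead of the network collecting all parameters and running a single network-level optimizer step.

```fortran
! Sketch only: layer-level optimizer ownership (names are illustrative).
module layer_level_optimizer_sketch
  implicit none
  private
  public :: sgd, dense_layer

  type :: sgd
    real :: learning_rate = 0.01
  contains
    procedure :: minimize
  end type sgd

  type :: dense_layer
    real, allocatable :: weights(:,:), dw(:,:)  ! parameters and their gradients
    type(sgd) :: optimizer                      ! optimizer lives in the layer
  contains
    procedure :: update
  end type dense_layer

contains

  subroutine minimize(self, param, grad)
    ! Plain SGD step applied in place to one parameter array.
    class(sgd), intent(in) :: self
    real, intent(inout) :: param(:,:)
    real, intent(in) :: grad(:,:)
    param = param - self % learning_rate * grad
  end subroutine minimize

  subroutine update(self)
    ! The layer updates its own parameters; no gathering or copying of
    ! all network parameters into one flat array is needed.
    class(dense_layer), intent(inout) :: self
    call self % optimizer % minimize(self % weights, self % dw)
    self % dw = 0
  end subroutine update

end module layer_level_optimizer_sketch
```

Because each layer works in place on its own arrays, the network-level gathering of parameters and gradients at every step is avoided, which is presumably where the reduction in update time below comes from.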

Here are the wall clock times using my dataset (with 2 hidden dense layers):

v0.17.0

  • Forward + backward: 4.79s
  • Update: 4.59s

Current PR

  • Forward + backward: 4.81s
  • Update: 1.40s

@OneAdder (Collaborator) commented on Mar 5, 2025

@jvdp1 That's actually a great idea. Apart from the obvious performance gains, it can simplify the code for combined layers. I will arrange everything in a similar fashion in my project here: https://github.com/OneAdder/llm.f
Then we can backport it here along with implementations for all the other layers.
