
Fix/nn transformer block #367

Draft · wants to merge 30 commits into main
Conversation

cxzhang4 (Collaborator) commented:

A sketch of the FT-Transformer graph.

)
mlr3pipelines::mlr_pipeops$add("transformer_layer", PipeOpTorchTransformerLayer)

# TODO: remove default values from here
cxzhang4 (Collaborator, Author) commented:

Defaults should be handled by the PipeOp wrapper class, hence the TODO to remove them here.
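For context, once the layer is registered it can be constructed through the usual mlr3pipelines interface, which is also where defaults would live. A minimal usage sketch (assuming the wrapper defines defaults for all its parameters; the ids are made up for illustration):

library(mlr3pipelines)

# construct the wrapped layer; hyperparameters would be set via param_vals
layer = po("transformer_layer")

# stack two layers into a small graph
graph = po("transformer_layer", id = "block_1") %>>%
  po("transformer_layer", id = "block_2")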


# TODO: remove layer_idx and ask about how we want to handle this condition
# layer_idx = -1
if (!is_first_layer || !prenormalization || first_prenormalization) {
cxzhang4 (Collaborator, Author) commented:

This should match the official implementation now, but it is still confusing that !prenormalization appears in the condition...
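To make the condition concrete, here is a self-contained sketch (flag values made up) of how it chooses between nn_layer_norm and nn_identity, which matches the note below that the first layer sometimes has no normalization (nn_identity instead of nn_layer_norm):

library(torch)

is_first_layer = TRUE
prenormalization = TRUE
first_prenormalization = FALSE
d_embedding = 32

attention_normalization = if (!is_first_layer || !prenormalization || first_prenormalization) {
  nn_layer_norm(d_embedding)
} else {
  nn_identity()  # the first pre-norm layer skips the leading LayerNorm
}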

d_embedding = 32

# # TODO: access x[, -1] first
# # TODO: sometimes there is no normalization, i.e. nn_identity instead of nn_layer_norm, figure out how to handle this
cxzhang4 (Collaborator, Author) commented:

This could be parameterized like the pre-implemented nn_ft_head, i.e. with an activation parameter.
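A rough sketch of what that could look like (the module name nn_sketch_head and its arguments are made up; the point is only that the activation constructor is passed in rather than hard-coded):

library(torch)

nn_sketch_head = nn_module(
  "nn_sketch_head",
  initialize = function(d_in, d_out, activation = nn_relu, bias = TRUE) {
    self$normalization = nn_layer_norm(d_in)
    self$activation = activation()
    self$linear = nn_linear(d_in, d_out, bias = bias)
  },
  forward = function(x) {
    # use only the [CLS] token, i.e. the last position of the sequence dimension
    x = x[, -1, ]
    self$linear(self$activation(self$normalization(x)))
  }
)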

self$last_layer_query_idx = param_vals$last_layer_query_idx
}
),
private = list(
cxzhang4 (Collaborator, Author) commented:

Confirm the ordering of the tensor dimensions. It may be that the first dimension here is the sequence dimension (NOT the batch dimension).
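One quick way to pin the convention down while debugging (shapes made up): with batch-first tensors of shape (batch, sequence, d_embedding), slicing the last sequence position must give (batch, d_embedding).

library(torch)

x = torch_randn(8, 5, 32)  # batch = 8, sequence length = 5, d_embedding = 32
cls = x[, -1, ]            # last sequence position, i.e. the [CLS] token slot
stopifnot(all(dim(cls) == c(8, 32)))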

sebffischer (Member) commented on Mar 21, 2025:

#' @title Custom Function
#' @description
#' Applies a user-supplied function to the input tensor.
#' @section nn_module:
#' Calls the function stored in the `fn` parameter on the input tensor when trained.
#' @section Parameters:
#' * `fn` :: `function`\cr
#'   The function that is applied to the input tensor.
#'
#' @templateVar id nn_fn
#' @template pipeop_torch_channels_default
#' @templateVar param_vals fn = torch::nnf_relu
#' @template pipeop_torch
#' @template pipeop_torch_example
#'
#'
#' @export
PipeOpTorchFn = R6Class("PipeOpTorchFn",
  inherit = PipeOpTorch,
  public = list(
    #' @description Creates a new instance of this [R6][R6::R6Class] class.
    #' @template params_pipelines
    initialize = function(id = "nn_fn", param_vals = list()) {
      param_set = ps(fn = p_uty(tags = c("train", "required"))) # tags are an assumption for this sketch
      super$initialize(
        id = id,
        param_set = param_set,
        param_vals = param_vals,
        module_generator = NULL # no generator here, since .make_module() is overridden below
      )
    }
  ),
  private = list(
    .shapes_out = function(shapes_in, param_vals, task) {
      # Implement this.
      # 1. Generate a tensor of shape shapes_in (fill NA with something)
      # 2. Apply the function private$.fn
      # 3. Measure shapes and fill dimensions with NA again

      # Should also be possible to implement shapes_out properly

      # Also take inspiration from pipeop_preproc_torch
    },
    .make_module = function(shapes_in, param_vals, task) {
      self$param_set$values$fn
    },
    .fn = NULL
  )
)

#' @include aaa.R
register_po("nn_fn", PipeOpTorchFn)
