fix: return float instead of tensor from get_rotary_seq_len #1419

Open · wants to merge 1 commit into base: main

Conversation

jasonchiu-codeium commented:

The error stack trace below occurs because get_rotary_seq_len can return a tensor rather than its float value. Downstream, the torch.arange call in get_freqs_non_repeated then receives a tensor where a plain number is expected.

[...]

  File "/ephemeral/devcontainer/jasonchiu/cache/bazel/_bazel_jasonchiu/996a28cb1c2af162dca7531bd6a2de53/execroot/_main/bazel-out/k8-opt/bin/exa/trainer/megatron/megatron_trainer_test.runfiles/local_pip_torch/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/ephemeral/devcontainer/jasonchiu/cache/bazel/_bazel_jasonchiu/996a28cb1c2af162dca7531bd6a2de53/execroot/_main/bazel-out/k8-opt/bin/exa/trainer/megatron/megatron_trainer_test.runfiles/local_pip_torch/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/ephemeral/devcontainer/jasonchiu/cache/bazel/_bazel_jasonchiu/996a28cb1c2af162dca7531bd6a2de53/execroot/_main/bazel-out/k8-opt/bin/exa/trainer/megatron/megatron_trainer_test.runfiles/_main/third_party/megatron_lm/Megatron-LM/megatron/core/distributed/data_parallel_base.py", line 22, in forward
    return self.module(*inputs, **kwargs)
  File "/ephemeral/devcontainer/jasonchiu/cache/bazel/_bazel_jasonchiu/996a28cb1c2af162dca7531bd6a2de53/execroot/_main/bazel-out/k8-opt/bin/exa/trainer/megatron/megatron_trainer_test.runfiles/local_pip_torch/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/ephemeral/devcontainer/jasonchiu/cache/bazel/_bazel_jasonchiu/996a28cb1c2af162dca7531bd6a2de53/execroot/_main/bazel-out/k8-opt/bin/exa/trainer/megatron/megatron_trainer_test.runfiles/local_pip_torch/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/ephemeral/devcontainer/jasonchiu/cache/bazel/_bazel_jasonchiu/996a28cb1c2af162dca7531bd6a2de53/execroot/_main/bazel-out/k8-opt/bin/exa/trainer/megatron/megatron_trainer_test.runfiles/_main/third_party/megatron_lm/Megatron-LM/megatron/core/transformer/module.py", line 178, in forward
    outputs = self.module(*inputs, **kwargs)
  File "/ephemeral/devcontainer/jasonchiu/cache/bazel/_bazel_jasonchiu/996a28cb1c2af162dca7531bd6a2de53/execroot/_main/bazel-out/k8-opt/bin/exa/trainer/megatron/megatron_trainer_test.runfiles/local_pip_torch/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/ephemeral/devcontainer/jasonchiu/cache/bazel/_bazel_jasonchiu/996a28cb1c2af162dca7531bd6a2de53/execroot/_main/bazel-out/k8-opt/bin/exa/trainer/megatron/megatron_trainer_test.runfiles/local_pip_torch/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/ephemeral/devcontainer/jasonchiu/cache/bazel/_bazel_jasonchiu/996a28cb1c2af162dca7531bd6a2de53/execroot/_main/bazel-out/k8-opt/bin/exa/trainer/megatron/megatron_trainer_test.runfiles/_main/third_party/megatron_lm/Megatron-LM/megatron/core/models/gpt/gpt_model.py", line 265, in forward
    rotary_pos_emb = self.rotary_pos_emb(
  File "/ephemeral/devcontainer/jasonchiu/cache/bazel/_bazel_jasonchiu/996a28cb1c2af162dca7531bd6a2de53/execroot/_main/bazel-out/k8-opt/bin/exa/trainer/megatron/megatron_trainer_test.runfiles/local_pip_torch/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/ephemeral/devcontainer/jasonchiu/cache/bazel/_bazel_jasonchiu/996a28cb1c2af162dca7531bd6a2de53/execroot/_main/bazel-out/k8-opt/bin/exa/trainer/megatron/megatron_trainer_test.runfiles/local_pip_torch/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/ephemeral/devcontainer/jasonchiu/cache/bazel/_bazel_jasonchiu/996a28cb1c2af162dca7531bd6a2de53/execroot/_main/bazel-out/k8-opt/bin/exa/trainer/megatron/megatron_trainer_test.runfiles/_main/third_party/megatron_lm/Megatron-LM/megatron/core/models/common/embeddings/rotary_pos_embedding.py", line 172, in forward
    freqs = self.get_freqs_non_repeated(max_seq_len, offset)
  File "/ephemeral/devcontainer/jasonchiu/cache/bazel/_bazel_jasonchiu/996a28cb1c2af162dca7531bd6a2de53/execroot/_main/bazel-out/k8-opt/bin/exa/trainer/megatron/megatron_trainer_test.runfiles/_main/third_party/megatron_lm/Megatron-LM/megatron/core/models/common/embeddings/rotary_pos_embedding.py", line 137, in get_freqs_non_repeated
    torch.arange(max_seq_len, device=self.inv_freq.device, dtype=self.inv_freq.dtype)

This PR fixes that by returning a float from get_rotary_seq_len instead of a tensor.
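
A minimal sketch of the kind of conversion this change implies. The helper name `to_python_float` and its shape are illustrative assumptions, not the actual diff; the only fact taken from this PR is that get_rotary_seq_len should return a float rather than a tensor.

```python
import torch

def to_python_float(value):
    # Hypothetical helper mirroring the intent of this PR (not its exact
    # patch): unwrap a 0-dim tensor into a plain Python float so callers
    # such as torch.arange(max_seq_len, ...) receive a number, not a tensor.
    if isinstance(value, torch.Tensor):
        return float(value.item())
    return float(value)
```

Converting once at the source keeps every consumer of the sequence length, such as the torch.arange call in get_freqs_non_repeated shown in the trace above, working with an ordinary Python number.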
