Hi! Loving the textbook so far :) I've encountered a minor issue though in the chapter 3 section Choosing a single token from the probability distribution (sampling / decoding)...
When I run lm_head_output.shape I get an output shape of [1, 5, 32064], whereas the source code and the textbook state that it should be [1, 6, 32064]. I'm not sure why there's a difference — I've kept all the preceding code the same...
Interestingly, running the next line of code returns the expected output ("Paris"):
token_id = lm_head_output[0, -1].argmax(-1)
tokenizer.decode(token_id)
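The indexing above can be sketched with dummy data (NumPy as a stand-in for the torch tensor; the shapes are taken from the post, the values are made up):

```python
import numpy as np

# Dummy logits: batch=1, sequence length=5, vocab size=32064
# (shapes from the post; the actual values here are random placeholders)
rng = np.random.default_rng(0)
lm_head_output = rng.standard_normal((1, 5, 32064))

# [0, -1] selects the logits for the LAST input position of the first
# batch item; argmax(-1) then picks the highest-scoring vocabulary id.
token_id = lm_head_output[0, -1].argmax(-1)

print(lm_head_output.shape)  # (1, 5, 32064)
print(int(token_id))         # some id in [0, 32064)
```

Because `[0, -1]` always selects the last position, the greedy next-token prediction is the same whether the sequence length is 5 or 6 — which would explain why "Paris" still comes out despite the shape mismatch.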