When I use the skipped layers provided by the project to run CodeLlama2-13B and LLaMA2-13B-Chat, the speculative decoding time in evaluate_sum and evaluate_code is significantly longer than that of the base model. Could you please explain why this might be the case?
Yes, the acceptance rate is normal, so decoding should be accelerated. To rule out a problem with the sss mode, can you try essg instead? In addition, you can update the environment and re-run the search for the skipped layers.
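When comparing sss, essg, and the base model, it may help to time each one the same way, with a warm-up pass, so that one-off setup cost is not counted as decode time. The following is only a rough sketch: `time_decode`, `speedup`, and the `decode_fn` callable are hypothetical names for illustration and are not part of the project's code.

```python
import time
from statistics import median


def time_decode(decode_fn, prompts, warmup=1, repeats=3):
    """Return the median wall-clock seconds for one pass of decode_fn over prompts.

    decode_fn is a hypothetical callable that takes one prompt and runs
    generation. When timing on a GPU, decode_fn should block until the
    device has finished (for example by synchronizing before it returns),
    otherwise the measured time can be misleading.
    """
    # Warm-up passes are run but not timed.
    for _ in range(warmup):
        for prompt in prompts:
            decode_fn(prompt)

    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        for prompt in prompts:
            decode_fn(prompt)
        timings.append(time.perf_counter() - start)
    return median(timings)


def speedup(base_seconds, speculative_seconds):
    """Ratio above 1.0 means the speculative run was faster than the base run."""
    if speculative_seconds <= 0:
        raise ValueError("speculative_seconds must be positive")
    return base_seconds / speculative_seconds
```

Running `time_decode` once per mode on the same prompts and passing the results to `speedup` gives one number per mode, which makes it easier to see whether the slowdown is specific to sss.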