The experimental results are inconsistent. #20

HuYunhai-Alex · 2024-09-25T16:26:16Z

When I run CodeLlama2-13B and LLaMA2-13B-Chat with the skipped layers provided by the project, the speculative decoding time in evaluate_sum and evaluate_code is significantly longer than the base model's decoding time. Could you please explain why this might be the case?

HimanshuJanbandhu · 2024-10-02T17:40:27Z

Can you elaborate on what system you are running this on? As far as I can see, the matchness is quite high, so this problem shouldn't occur.

junzhang-zj · 2024-10-12T01:59:25Z

Yes, the acceptance rate is normal, so decoding should be accelerated. Can you check whether the problem is specific to the sss mode by trying essg instead? In addition, you can update the environment and re-search the skipped layers.

junzhang-zj closed this as completed Jan 3, 2025
