how to change y1? #6

Lucas-TY · 2024-06-25T18:13:45Z

It seems like `gamma` is $\gamma_2$, but how do you change $\gamma_1$?

preminstrel · 2024-06-25T19:50:37Z

Hello,

Thanks for your interest in our work! In our provided implementation, we set $\gamma_1 = 1$ because we observed that performance is nearly the same for $\gamma_1 = 2$ and decreases for larger values of $\gamma_1$. This is due to the low acceptance rate of the Llama-68M draft model. To keep things simple, our open-source code uses $\gamma_1 = 1$.

If you’d like to try using better draft models with higher acceptance rates, you can directly modify the function linked below. You only need to add an extra inner loop for $\gamma_1$:

TriForce/utils/decoding.py

Lines 182 to 222 in e865a1d

    
           while n < gamma: 
        
               speculation_prob = graph_engine.graph_draft_inference(input_ids=verify_tokens[:,:n+1], gamma_offset = n) 
        
               pred_token_idx = sample(speculation_prob) 
        
               token_idx = pred_token_idx.item() 
        
               draft_count += 1 
        
               verify_tokens[:, n+1:n+2] = pred_token_idx 
        
               verify_prob = graph_engine.graph_verify(input_ids=verify_tokens, position_ids=position_ids) 
        
               r = torch.rand(1, device = graph_engine.engine.model.device) 
        
               if r < torch.min(torch.tensor([1], device=r.device), (verify_prob[n, token_idx] / speculation_prob[token_idx])): 
        
                   return_speculation_probs.append(verify_prob[n]) 
        
                   return_generated_ids.append(token_idx) 
        
                   if verbose: 
        
                       spec_stream(pred_token_idx, tokenizer, 'green') 
        
                   accepted_count += 1 
        
                   n += 1 
        
                   pred_token_idx = sample(verify_prob[n]) 
        
                   return_speculation_probs.append(verify_prob[n]) 
        
                   return_generated_ids.append(pred_token_idx.item()) 
        
                   if verbose: 
        
                       spec_stream(pred_token_idx, tokenizer, 'blue') 
        
                   target_sample_count += 1 
        
                   n += 1 
        
                   verify_tokens[:, n:n+1] = pred_token_idx 
        
               else: 
        
                   pred_token_idx = sample(verify_prob[n]) 
        
                   return_speculation_probs.append(verify_prob[n]) 
        
                   return_generated_ids.append(pred_token_idx.item()) 
        
                   if verbose: 
        
                       spec_stream(pred_token_idx, tokenizer, 'red') 
        
                   resample_count += 1 
        
                   n += 1 
        
                   verify_tokens[:, n:n+1] = pred_token_idx 
        
           acceptance_rate = accepted_count / draft_count
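
As a rough illustration of what "an extra inner loop for $\gamma_1$" could look like, here is a toy sketch in plain Python. It is not the TriForce code: `speculate_step`, `toy_model`, and `toy_verify` are hypothetical names, the models are stand-in probability tables, and there is no KV cache or CUDA graph. The idea it shows is drafting `gamma1` tokens first and only then making one verifier call over all of them. Like the snippet above, it samples from the verifier's distribution on rejection.

```python
import random

def sample(probs, rng):
    """Draw one token index from a list of probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def speculate_step(draft_fn, verify_fn, tokens, gamma1, rng):
    """Draft gamma1 tokens, then verify them with a single verifier call.

    draft_fn(ctx) returns the draft model's next-token distribution.
    verify_fn(ctx, k) returns k + 1 distributions: one for each of the last
    k positions of ctx, plus one for the position that follows them.
    Returns (new_tokens, accepted_count).
    """
    ctx = list(tokens)
    drafted, draft_probs = [], []
    for _ in range(gamma1):  # the extra inner loop over gamma_1
        q = draft_fn(ctx)
        t = sample(q, rng)
        drafted.append(t)
        draft_probs.append(q)
        ctx.append(t)

    verify_probs = verify_fn(ctx, gamma1)  # one call covers all gamma1 drafts
    out, accepted = [], 0
    for i, t in enumerate(drafted):
        p, q = verify_probs[i], draft_probs[i]
        if rng.random() < min(1.0, p[t] / q[t]):
            out.append(t)  # draft token accepted
            accepted += 1
        else:
            out.append(sample(p, rng))  # rejected: take the verifier's token
            return out, accepted
    out.append(sample(verify_probs[gamma1], rng))  # all accepted: bonus token
    return out, accepted

def toy_model(ctx, vocab=4):
    """Hypothetical stand-in model: distribution depends on the last token."""
    last = ctx[-1] if ctx else 0
    weights = [1.0 + ((last + j) % vocab) for j in range(vocab)]
    total = sum(weights)
    return [w / total for w in weights]

def toy_verify(ctx, k):
    """Hypothetical verifier built from toy_model, one distribution per position."""
    start = len(ctx) - k
    return [toy_model(ctx[:start + i]) for i in range(k + 1)]

new_tokens, accepted = speculate_step(toy_model, toy_verify, [1, 2], 3, random.Random(0))
```

Each step yields `accepted + 1` tokens, so a larger `gamma1` only pays off when the draft model's tokens are accepted often enough to offset the extra draft calls.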

If you have any further questions, feel free to ask.

preminstrel added the "good first issue" label Jun 25, 2024

preminstrel self-assigned this Jun 25, 2024
