flexGrow behaviour - token limit reached #156

Open · onlyutkarsh opened this issue Feb 26, 2025 · 5 comments
@onlyutkarsh commented Feb 26, 2025

My Copilot extension requires me to send a rather large JSON payload (1000 elements) to the model.

I frequently encountered token limits with string prompts, which is why I've made the switch to prompt-tsx. You can see my new component here: https://gist.github.com/onlyutkarsh/b972c5a0942d757de1c599e927069e3a.

I render the prompt as below:

const dataPrompt = await renderPrompt(
  DataPrompt,
  { data: data },
  {
    modelMaxPromptTokens: request.model.maxInputTokens
  },
  request.model
);

Although I have set flexGrow, all of the JSON data is still sent, and I keep getting a "message exceeds token limit" error.

<UserMessage flexGrow={1}>{state.summaryData}</UserMessage>

I searched the prompt-tsx repo for more examples, but most of them use flexGrow the same way I do, so I am unclear about what I am doing wrong.

The extension is private at the moment, but I'm happy to share further details if required.

  1. What is wrong with my prompt that I am hitting the token limit?
  2. How do I get the number of tokens used by the prompt (in my case, DataPrompt)? dataPrompt.tokenCount returns 24, which seems wrong given the large JSON.
@connor4312 (Member)

I would ensure that you're using the correct tokenizer for your model. Note that the default tokenizer this library ships with is the GPT 3 O100k tokenizer.

Also, simply wrapping a string in a component with flexGrow won't truncate the string to fit within your limit. Sizing is cooperative, and a string is just a string. You can use TextChunk to wrap the string, or implement your own logic, to size as necessary: https://github.com/microsoft/vscode-prompt-tsx/tree/main?tab=readme-ov-file#flex-behavior
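
For example, something along these lines (a minimal sketch, not your exact component; the props interface and message text are illustrative, and breakOnWhitespace tells TextChunk where it may cut):

import { BasePromptElementProps, PromptElement, TextChunk, UserMessage } from '@vscode/prompt-tsx';

interface DataPromptProps extends BasePromptElementProps {
  data: object;
}

export class DataPrompt extends PromptElement<DataPromptProps> {
  render() {
    return (
      <UserMessage>
        {/* flexGrow={1} renders this element after its siblings have
            claimed their tokens; TextChunk then trims the text to the
            remaining budget, breaking on whitespace */}
        <TextChunk flexGrow={1} breakOnWhitespace>
          {JSON.stringify(this.props.data, null, 2)}
        </TextChunk>
      </UserMessage>
    );
  }
}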

@onlyutkarsh (Author)

Thank you, @connor4312. I mistakenly thought it would be automatic. I'll check out TextChunk. By the way, just to confirm, is dataPrompt.tokenCount the right way to determine the tokens used by the prompt?

@connor4312 (Member)

That's right

@onlyutkarsh (Author)

@connor4312 I have made some progress, thanks for your advice. Since I am sending JSON, I am splitting the data into chunks with custom logic (sketched below).

I do have a quick question, though. My prompt consists of child elements, and I add the JSON to one of them. The JSON has a budget of 7000 tokens.

// dataPrompt.tsx
async render(state: DataPromptState, sizing: PromptSizing) {
  return (
    ...
    <ChunkedJsonPrompt data={this.props.data} />
  );
}

// chunkedJsonPrompt.tsx
export class ChunkedJsonPrompt extends PromptElement<ChunkedJsonPromptProps, ChunkedJsonPromptState> {
  override async prepare(sizing: PromptSizing): Promise<ChunkedJsonPromptState> {
    // tokenCount is approx. 7000 here
    const tokenCount = await sizing.countTokens(JSON.stringify(this.props.data, null, 2));
    ...
  }
  ...
}
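
The chunking itself is roughly this, a simplified sketch of my prepare logic (it assumes data is an array and that ChunkedJsonPromptState carries the serialized string; the real code has more edge-case handling):

// keep adding items while the serialized result still fits the budget
// this element was given via PromptSizing
override async prepare(sizing: PromptSizing): Promise<ChunkedJsonPromptState> {
  const items: unknown[] = [];
  for (const item of this.props.data) {
    const candidate = JSON.stringify([...items, item], null, 2);
    if ((await sizing.countTokens(candidate)) > sizing.tokenBudget) {
      break; // the next item would exceed the budget, so stop here
    }
    items.push(item);
  }
  return { json: JSON.stringify(items, null, 2) };
}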

// extension.ts
const urtaPrompt = await renderPrompt(
  MyPrompt,
  { data: data },
  { modelMaxPromptTokens: maxTokensAllowed },
  request.model,
  undefined,
  token
);

// !!urtaPrompt.tokenCount is 39 here!! WHY?
log.info(
  `Analyzing data... Used tokens: ${urtaPrompt.tokenCount}, max tokens: ${maxTokensAllowed}`
);

However, when I call renderPrompt(), the reported tokenCount is surprisingly low. Could you provide more details on how prompt-tsx calculates the token count? I'm trying to understand it better. Thank you for the help!


@connor4312 (Member)

The token count comes from the countMessageTokens method on the ITokenizer object passed to renderPrompt.
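
If you want to sanity-check the number yourself, you can re-count the rendered messages with the model's own tokenizer. A rough sketch, assuming request.model is a vscode.LanguageModelChat (its countTokens method is part of the stable VS Code API); small differences from prompt-tsx's count are expected, since the counting strategies differ:

// re-count the rendered output with the model's own tokenizer
let total = 0;
for (const message of urtaPrompt.messages) {
  // content may be a string or an array of parts depending on the
  // prompt-tsx version; stringifying is a rough approximation
  const text = typeof message.content === 'string' ? message.content : JSON.stringify(message.content);
  total += await request.model.countTokens(text, token);
}
log.info(`Recomputed token total: ${total}`);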
