Merge pull request #94 from microsoft/connor4312/local-priorities

refactor: make priority values local

connor4312 authored Oct 9, 2024
2 parents 89dbf39 + af5ac44 commit 385914d

Showing 13 changed files with 1,229 additions and 758 deletions.

README.md: 32 changes, 27 additions & 5 deletions

Please note:
- If your prompt does asynchronous work, e.g. VS Code extension API calls or additional requests to the Copilot API for chunk reranking, you can precompute this state in an optional async `prepare` method. `prepare` is called before `render`, and the prepared state will be passed back to your prompt component's sync `render` method; see the sketch after these notes.
- Newlines are not preserved in JSX text or between JSX elements when rendered, and must be explicitly declared with the builtin `<br />` element.
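
For example, here is a minimal sketch of such a component (the `fetchRelevantContext` helper and the exact `PromptProps`/`PromptState` shapes are illustrative assumptions, not the library's API):

```tsx
import { BasePromptElementProps, PromptElement, UserMessage } from '@vscode/prompt-tsx';

interface PromptProps extends BasePromptElementProps {
	userQuery: string;
}

interface PromptState {
	context: string;
}

// Hypothetical stand-in for an extension API call or a Copilot request.
async function fetchRelevantContext(query: string): Promise<string> {
	return `Context for: ${query}`;
}

export class TestPrompt extends PromptElement<PromptProps, PromptState> {
	// All async work happens here; the resolved state is handed to the sync `render`.
	async prepare(): Promise<PromptState> {
		return { context: await fetchRelevantContext(this.props.userQuery) };
	}

	render(state: PromptState) {
		return (
			<UserMessage>
				{state.context}
				<br />
				{this.props.userQuery}
			</UserMessage>
		);
	}
}
```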

### Prioritization

If a rendered prompt has more message tokens than can fit into the available context window, the prompt renderer prunes messages with the lowest priority from the `ChatMessage`s result.

In the above example, each message had the same priority, so they would be pruned in the order in which they were declared, but we could control that by passing a priority to each element:

```html
<>
	<SystemMessage priority={100}>You are a helpful assistant.</SystemMessage>
	<UserMessage priority={0}>{this.props.userQuery}</UserMessage>
</>
```

In this case, a very long `userQuery` would get pruned from the output first if it's too long. Priorities are local in the element tree, so for example the tree of nodes...

```html
<UserMessage priority={1}>
	<TextChunk priority={100}>A</TextChunk>
	<TextChunk priority={0}>B</TextChunk>
</UserMessage>
<SystemMessage priority={2}>
	<TextChunk priority={200}>C</TextChunk>
	<TextChunk priority={20}>D</TextChunk>
</SystemMessage>
```

...would be pruned in the order `B->A->D->C`. If two sibling elements share the same priority, the renderer looks ahead at their direct children and prunes from whichever element has the child with the lowest priority: if the `SystemMessage` and `UserMessage` in the above example did not declare priorities, the pruning order would be `B->D->A->C`.

Continuous text strings and elements can both be pruned from the tree. If you have a set of elements that you want included either in their entirety or not at all, you can use the simple `Chunk` utility element:

```html
<Chunk>
	The file I'm editing is: <FileLink file={f} />
</Chunk>
```

### Flex Behavior

Wholesale pruning is not always ideal. Instead, we'd prefer to include as much of the query as possible. To do this, we can use the `flexGrow` property, which allows an element to use the remainder of its parent's token budget when it's rendered.

`prompt-tsx` provides a utility component that supports this use case: `TextChunk`. Given input text, and optionally a delimiting string or regular expression, it'll include as much of the text as possible to fit within its budget.
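
A sketch of what this can look like (the surrounding `UserMessage`, the `breakOn` delimiter value, and the `flexGrow` placement here are illustrative, not prescriptive):

```html
<UserMessage>
	<TextChunk flexGrow={1} breakOn=" ">
		{this.props.userQuery}
	</TextChunk>
</UserMessage>
```

With `flexGrow={1}`, the chunk renders after its siblings and receives the remainder of the parent's token budget, so an overlong query is trimmed at word boundaries instead of being pruned wholesale.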
