[Do not merge] When chunk_size=0, skip vector db #99

Draft: wants to merge 3 commits into main

gaya3-zipstack (Contributor) commented Sep 3, 2024

What

The current implementation always uses the indexed nodes when fetching context for prompts. However, when chunk_size=0 the entire context has to be sent anyway, so we can send the extracted text directly instead of fetching the chunk from the vector DB.

Why

This improves response time for prompts when chunk_size=0, because the vector DB does not need to be accessed.

How

When chunk_size=0, the context is fetched from the extracted text present in the container file system.
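To make the branching concrete, here is a minimal sketch of the idea, not the code in this PR. The names `fetch_context`, `ExtractedTextMissingError`, `extracted_text_path`, and `query_vector_db` are hypothetical and chosen only for illustration; the actual function names and file layout in the repository may differ.

```python
from pathlib import Path
from typing import Callable


class ExtractedTextMissingError(FileNotFoundError):
    """Hypothetical error for an extracted text file that is absent on disk."""


def fetch_context(
    chunk_size: int,
    extracted_text_path: Path,
    query_vector_db: Callable[[], str],
) -> str:
    """Return the prompt context, bypassing the vector DB when chunk_size is 0."""
    if chunk_size == 0:
        # Whole-document context: read the extracted text straight from disk.
        if not extracted_text_path.is_file():
            raise ExtractedTextMissingError(
                f"Extracted text not found: {extracted_text_path}"
            )
        return extracted_text_path.read_text(encoding="utf-8")
    # Chunked profile: retrieve the context from the vector DB as before.
    return query_vector_db()
```

In this sketch the vector DB lookup is passed in as a callable, so the chunk_size=0 path never touches it. A missing extracted file surfaces as an explicit error rather than a silent fallback, which matches the behaviour described in the screenshots section.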

Relevant Docs

Related Issues or PRs

Unstract PR

Dependencies Versions

Notes on Testing

Screenshots

Profile with chunk_size=0
Manual indexing on a document. After indexing completes, no nodes are added to the vector DB, as shown.
image

Prompt run on top of manual indexing. After the prompt run, there are still no records in the vector DB, yet the prompt answers are correct because the context is picked up from the extracted text.
image

Running a prompt before manual indexing (dynamic indexing kicks in).
image

image

Manually remove the extracted file after indexing, then run the prompt. This gives an error saying the extracted file is missing.
image

Now do a manual re-index. The extracted file is re-created. Then run the prompt.

image

image

image

Profile with chunk_size=1024

Manual indexing on a document. After indexing completes, nodes are added to the vector DB, as shown.
image

Prompt run on top of manual indexing. The prompt run works fine, picking up context from the vector DB.
image

Running a prompt before manual indexing (dynamic indexing kicks in because there are no records in the vector DB).
image

Dynamic indexing kicked in and the prompt run worked fine.

image

Manually remove the records from the vector DB.
image

Running the prompt now produces an error.
image

Manually re-index and run the prompt again. The prompt should work fine, and nodes are added to the vector DB.
image

image

Checklist

I have read and understood the Contribution Guidelines.

@gaya3-zipstack marked this pull request as draft September 4, 2024 09:33
@gaya3-zipstack changed the title from "When chunk_size=0, skip vector db" to "[Do not merge] When chunk_size=0, skip vector db" Sep 4, 2024