[Do not merge] When chunk_size=0, skip vector db #99
What
Currently, the implementation always uses the indexed nodes when fetching context for prompts. However, when chunk_size=0 the entire document has to be sent as context, so we can send the extracted text directly instead of fetching the chunk from the vector DB.
Why
This improves response time for prompts when chunk_size=0, since the vector DB does not need to be accessed.
How
When chunk_size=0, the context can be fetched from the extracted text present in the container file system
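
A minimal sketch of the branching described above, not the actual code in this PR. The function name `fetch_context`, its parameters, and the `retrieve_from_vector_db` callable are hypothetical names chosen for illustration; the real retrieval call and the real location of the extracted file are not shown in this description.

```python
from pathlib import Path
from typing import Callable


def fetch_context(
    chunk_size: int,
    extracted_text_path: Path,
    retrieve_from_vector_db: Callable[[], str],
) -> str:
    """Return prompt context, skipping the vector DB when chunk_size is 0.

    All names here are illustrative assumptions, not the project's API.
    """
    if chunk_size == 0:
        # Whole-document context: read the extracted text straight from disk.
        if not extracted_text_path.is_file():
            # Mirrors the tested scenario where the extracted file was removed.
            raise FileNotFoundError(
                f"Extracted file is missing: {extracted_text_path}"
            )
        return extracted_text_path.read_text(encoding="utf-8")
    # Chunked context: defer to the existing vector DB retrieval.
    return retrieve_from_vector_db()
```

With this shape, a missing extracted file surfaces as an explicit error rather than an empty context, which matches the behaviour exercised in the chunk_size=0 test steps below.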
Relevant Docs
Related Issues or PRs
Unstract PR
Dependencies Versions
Notes on Testing
Screenshots
Profile with chunk_size=0
Manual indexing on a document. After indexing is completed, no nodes are added to the vector DB, as shown.
Prompt run on top of manual indexing. After the prompt run there are still no records in the vector DB, but the prompt answers are correct because the context is picked up from the extracted text.
Running a prompt before manual indexing (dynamic indexing would kick in).
Manually remove the extracted file after indexing, then run the prompt. This gives an error saying the extracted file is missing.
Now do a manual re-indexing. The extracted file is re-created. Then run the prompt.
Profile with chunk_size=1024
Manual indexing on a document. After indexing is completed, nodes are added to the vector DB, as shown.
Prompt run on top of manual indexing. The prompt run works fine, picking up context from the vector DB.
Running a prompt before manual indexing (dynamic indexing would kick in) as there are no records in vector db.
Dynamic indexing kicked in and the prompt run worked fine.
Manually remove the records from the vector DB.
On running the prompt, we see an error.
Manually re-index, then run the prompt again. The prompt should work fine, with nodes added to the vector DB.
Checklist
I have read and understood the Contribution Guidelines.