Considerations about metadata and MemoryRecord #2689
afederici75
started this conversation in
General
Replies: 0 comments
Hi, yesterday I ran into a challenge storing vectors in kernel.memory (I am using the Redis connector), and I'd like to know how other people deal with this.
This diagram should help follow the rest of my ramblings:
The application essentially stores metadata and transcripts of videos (e.g. YouTube's) alongside their embeddings. I wanted to use the additionalMetadata string to store a bunch of data that goes with the 'record', and this is what originally prompted my message.
Today I realized that some of the information I was trying to add to the metadata is already in the record via the key or other properties, so I removed the redundant data. That takes away much (or all) of the need for additionalMetadata.
With that resolved, I'd still like to know what you guys think of the following:
-------------------- Yesterday's post --------------------
Hi, I have a few questions about metadata and MemoryRecord.
Given the diagram:
1 & 2) 'externalSource' makes it into the vector database even when unused.
MemoryRecord uses MemoryMetadata as a common data type regardless of whether it is created from SaveInformationAsync or SaveReferenceAsync. As a result, the vector database gets an extra, redundant value (e.g. externalSourceName in this case, after using SaveInformationAsync) that holds nothing but a zero-length string. In other words, should we trim the text [externalSourceName:""] when we save information (vs. references)?
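To make the trimming idea concrete, here is a minimal Python sketch. It is not the Semantic Kernel API: the `Metadata` class, its field names, and `to_compact_json` are hypothetical stand-ins, and I am assuming the connector stores the metadata as a JSON object.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Metadata:
    # Hypothetical stand-in for the metadata stored with a record (not the SK type).
    id: str
    text: str
    description: str = ""
    external_source_name: str = ""
    additional_metadata: str = ""


def to_compact_json(meta: Metadata) -> str:
    """Serialize the metadata, leaving out fields whose value is an empty string."""
    fields = {key: value for key, value in asdict(meta).items() if value != ""}
    return json.dumps(fields)


# A record saved as plain information: no external source is involved.
info = Metadata(id="video-1", text="transcript chunk")

full_json = json.dumps(asdict(info))   # keeps "external_source_name": ""
compact_json = to_compact_json(info)   # omits every empty field
```

The trade-off is that a reader of the compact form has to treat a missing field the same as an empty one.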
3 & 4) Since additionalMetadata is passed as a string, additional encoding gets applied to what I wanted to be pure JSON.
- What's the recommended approach for this type of metadata 'injection'?
- Related to the question above, is it possible to search using the metadata? I mean, is it meant to be searched on, too, or is it meant for use after retrieval only?
- Can we make some wrappers so we can pass object instances (and have them serialized automatically, consistently with how we configure SK), or should I really think in terms of strings here?
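Here is a small Python sketch of both the double encoding and the kind of wrapper I have in mind. It assumes the store serializes the whole record to JSON; `VideoInfo`, `pack`, and `unpack` are hypothetical names for illustration and not part of Semantic Kernel.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class VideoInfo:
    # Hypothetical application type; not part of Semantic Kernel.
    channel: str
    duration_seconds: int


def pack(obj: VideoInfo) -> str:
    """Serialize an instance to the string passed as additionalMetadata."""
    return json.dumps(asdict(obj))


def unpack(raw: str) -> VideoInfo:
    """Rebuild the instance from the string read back from a record."""
    return VideoInfo(**json.loads(raw))


extra = pack(VideoInfo(channel="demo", duration_seconds=90))

# Assumption: the store serializes the whole record, so the inner JSON
# is a string value and its quotes get escaped a second time.
stored = json.dumps({"id": "video-1", "additional_metadata": extra})

# Reading it back takes two decoding steps: the record, then the inner string.
record = json.loads(stored)
restored = unpack(record["additional_metadata"])
```

With a wrapper like this the escaping is still there in the stored value, but the application code never has to deal with it directly.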
Thanks,
Alessandro Federici
PS The "I have done a [search for similar discussions]" checkbox is brilliant! :)