[Milvus] Store array and JSON metadata fields directly #7429

Open
5 tasks done
rakuzen25 opened this issue Dec 25, 2024 · 0 comments
Labels
auto:bug Related to a bug, vulnerability, unexpected error with an existing feature

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain.js documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain.js rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

// Based on https://js.langchain.com/docs/integrations/vectorstores/milvus/
import { Milvus } from "@langchain/community/vectorstores/milvus";
import { OpenAIEmbeddings } from "@langchain/openai";
import { Document } from "langchain/document";

const docs: Document[] = [
    new Document({
        pageContent: "This is a test document.",
        metadata: {
            source: "test.txt",
            foo: {
                bar: "baz",
            },
            qux: [1, 2, 3],
        },
    })
]

const vectorStore = await Milvus.fromDocuments(docs, new OpenAIEmbeddings(), {
    collectionName: "foobar",
});

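// Filtering on the array field with array_contains won't work...
const failing = await vectorStore.similaritySearch(
    "test",
    2,
    "array_contains(qux, 1)",
);

// ...only a substring match on the stringified value will.
const working = await vectorStore.similaritySearch(
    "test",
    2,
    "qux like '%1%'",
);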
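
To illustrate why the `like` workaround is only an approximation, here is a minimal sketch that runs without Milvus. It assumes the array is stored as its JSON string in a VarChar field, and it uses a plain substring test and an element test as rough stand-ins for `like '%1%'` and `array_contains(qux, 1)`. The sample values are made up for illustration.

```typescript
// Assumption: each array is stored as the text produced by JSON.stringify.
const stored: string[] = [
    JSON.stringify([1, 2, 3]), // really contains the element 1
    JSON.stringify([10, 21]),  // no element 1, but the text contains the character "1"
    JSON.stringify([4, 5]),    // no "1" anywhere
];

// Rough stand-in for `qux like '%1%'`: substring match on the stored text.
const likeMatches = stored.filter((text) => text.indexOf("1") !== -1);

// Rough stand-in for `array_contains(qux, 1)`: membership test on the real array.
const containsMatches = stored.filter((text) =>
    (JSON.parse(text) as number[]).some((n) => n === 1),
);

console.log(likeMatches);     // [ '[1,2,3]', '[10,21]' ]
console.log(containsMatches); // [ '[1,2,3]' ]
```

The substring match returns a false positive for `[10,21]`, which is the kind of imprecision a native array field would avoid.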

Error Message and Stack Trace (if applicable)

No response

Description

Milvus 2.2.9 (June 2023) and 2.3.2 (October 2023) added support for the JSON and array data types, respectively. These types enable more efficient filter operators such as json_contains and array_contains. However, LangChain.js's current implementation stores these metadata fields as VarChar:

// use json for other types
try {
    fields.push({
        name: key,
        description: `Metadata JSON field`,
        data_type: DataType.VarChar,
        type_params: {
            max_length: jsonFieldMaxLength.toString(),
        },
    });
} catch (e) {
    throw new Error("Failed to parse metadata field as JSON");
}

Would it be possible to offer native JSON and array storage as an option to the user, or to detect support automatically from the server version through MilvusClient.getVersion?
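
A rough sketch of what the version-detection route could look like. Every name here (`FieldKind`, `parseVersion`, `atLeast`, `chooseFieldKind`) is hypothetical and not part of LangChain.js or the Milvus SDK. The version string is assumed to look like `v2.3.2` or `2.3.2`, and the thresholds are the 2.2.9 and 2.3.2 releases mentioned above.

```typescript
// Hypothetical helper types and functions, for illustration only.
type FieldKind = "VarChar" | "JSON" | "Array";

// Parse "v2.3.2" or "2.3.2-dev" into [2, 3, 2]; a missing patch part becomes 0.
function parseVersion(version: string): [number, number, number] {
    const match = /(\d+)\.(\d+)(?:\.(\d+))?/.exec(version);
    if (!match) {
        throw new Error(`Unrecognized version string: ${version}`);
    }
    return [Number(match[1]), Number(match[2]), Number(match[3] ?? 0)];
}

// True when `version` is greater than or equal to `minimum`.
function atLeast(version: string, minimum: string): boolean {
    const a = parseVersion(version);
    const b = parseVersion(minimum);
    for (let i = 0; i < 3; i++) {
        if (a[i] !== b[i]) return a[i] > b[i];
    }
    return true;
}

// Pick a storage type for one metadata value, falling back to VarChar
// (the current behaviour) when the server is too old.
function chooseFieldKind(value: unknown, serverVersion: string): FieldKind {
    if (Array.isArray(value) && atLeast(serverVersion, "2.3.2")) return "Array";
    // Arrays on servers from 2.2.9 up to (not including) 2.3.2 land here too,
    // since a JSON field can hold them.
    if (typeof value === "object" && value !== null && atLeast(serverVersion, "2.2.9")) {
        return "JSON";
    }
    return "VarChar";
}

console.log(chooseFieldKind([1, 2, 3], "v2.3.2"));      // Array
console.log(chooseFieldKind({ bar: "baz" }, "2.2.9"));  // JSON
console.log(chooseFieldKind([1, 2, 3], "2.3.1"));       // JSON
console.log(chooseFieldKind({ bar: "baz" }, "2.2.8"));  // VarChar
```

An explicit user option may still be the safer default: as far as I understand, Milvus array fields need a fixed element type and capacity, so mixed-type arrays would have to fall back to JSON or VarChar anyway.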

System Info

Not sure how pnpm info langchain would be useful since it always shows the latest version, but my installed versions are:

@langchain/community 0.2.33
langchain 0.2.20

Windows, node v23.4.0, pnpm v9.15.1

dosubot (bot) added the auto:bug label on Dec 25, 2024