SNOW-1936378: Add possibility for Loader to support Vector type #2079

Open
mhryb opened this issue Feb 17, 2025 · 3 comments
Assignees
Labels
feature status-triage_done Initial triage done, will be further handled by the driver team

Comments


mhryb commented Feb 17, 2025

Hello, is it possible to support the Vector type in StreamLoader?

What is the current behavior?

Currently the Loader does not support the Vector type, only arrays.

What is the desired behavior?

It would be good if the Loader supported the Vector type.

How would this improve snowflake-jdbc?

It would let the Loader handle the Vector type, so users could load vector data the same way as other types.

References, Other Background

As far as I can see, for a loading operation the driver creates a temp table and then uses a select/merge operation to move the data. It would be nice if, during this operation, an array could be converted to a Vector when the target column is a vector.

@mhryb mhryb added the feature label Feb 17, 2025
@github-actions github-actions bot changed the title Add possibility for Loader to support Vector type SNOW-1936378: Add possibility for Loader to support Vector type Feb 17, 2025
@sfc-gh-sghosh sfc-gh-sghosh self-assigned this Feb 18, 2025
@sfc-gh-sghosh
Contributor

Hello @mhryb ,

Thanks for raising the issue.
At present, the VECTOR data type is only supported for SQL, the Python connector, etc.
The only supported VECTOR element types are FLOAT and INT.
If you want to convert ARRAY to the VECTOR data type in the target table while loading data, you can use a cast:

```java
// Assumes `stmt` is an open java.sql.Statement on a Snowflake connection.

// Create the target table with a VECTOR column.
String createTableSQL = "CREATE OR REPLACE TABLE my_vector_table (vector_col VECTOR(FLOAT, 3));";
stmt.execute(createTableSQL);
System.out.println("Table 'my_vector_table' created successfully.");

// Create a source table that holds the data as a plain ARRAY.
String createArrayTableSQL = "CREATE OR REPLACE TABLE array_example (array_column ARRAY);";
stmt.execute(createArrayTableSQL);
System.out.println("Table 'array_example' created successfully.");

// Insert one three-element array into the source table.
String insertArrayDataSQL = "INSERT INTO array_example (array_column) "
                          + "SELECT ARRAY_CONSTRUCT(12, 14.0, 100);";
stmt.execute(insertArrayDataSQL);
System.out.println("Data inserted into 'array_example' table successfully.");

// Copy the data into the target table, casting ARRAY to VECTOR on the way.
String insertVectorDataSQL = "INSERT INTO my_vector_table (vector_col) "
                           + "SELECT array_column::VECTOR(FLOAT, 3) "
                           + "FROM array_example;";
stmt.execute(insertVectorDataSQL);
System.out.println("Data inserted into 'my_vector_table' successfully.");

// Read the vectors back.
String selectSQL = "SELECT * FROM my_vector_table;";
ResultSet rs = stmt.executeQuery(selectSQL);

while (rs.next()) {
    System.out.println("VECTOR Column: " + rs.getString("VECTOR_COL"));
}
```

Output:

```
VECTOR Column: [12.000000,14.000000,100.000000]
```

Regards,
Sujan

@sfc-gh-sghosh sfc-gh-sghosh added the status-triage_done Initial triage done, will be further handled by the driver team label Feb 18, 2025
@mhryb
Author

mhryb commented Feb 18, 2025

Hello @sfc-gh-sghosh, thanks for the answer.
Yes, I can do that, but it would be nice if the snowflake-jdbc StreamLoader supported this.
Then we could perform all the operations ("Insert", "Update", "Upsert") with vectors using StreamLoader.
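To make the "Upsert" case concrete, here is a minimal sketch of the kind of MERGE statement that would be needed when the staged data is an ARRAY and the target column is a VECTOR. The class `VectorUpsertSketch`, the method `buildMerge`, the staging table name, and the `id` key column are all hypothetical and chosen for illustration; this is not the SQL that snowflake-jdbc's Loader actually generates.

```java
// Hypothetical illustration only: none of these names exist in snowflake-jdbc.
class VectorUpsertSketch {

    // Builds a MERGE that casts the staged ARRAY column to the target VECTOR type
    // in both the UPDATE and the INSERT branch.
    static String buildMerge(String target, String stage, String keyCol,
                             String vectorCol, String vectorType) {
        String cast = "s." + vectorCol + "::" + vectorType;
        return "MERGE INTO " + target + " t USING " + stage + " s"
             + " ON t." + keyCol + " = s." + keyCol
             + " WHEN MATCHED THEN UPDATE SET t." + vectorCol + " = " + cast
             + " WHEN NOT MATCHED THEN INSERT (" + keyCol + ", " + vectorCol + ")"
             + " VALUES (s." + keyCol + ", " + cast + ")";
    }

    public static void main(String[] args) {
        // Table and column names here are assumptions, not taken from the driver.
        System.out.println(buildMerge(
            "my_vector_table", "my_stage_table", "id", "vector_col", "VECTOR(FLOAT, 3)"));
    }
}
```

The resulting string could be run with `stmt.execute(...)` in the same way as the cast workaround above; whether the Loader could emit something similar internally is exactly what this issue asks for.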

@mhryb
Author

mhryb commented Feb 18, 2025

Let me describe what I mean in more detail.
For example, here we are inserting/merging data from the temporary table into the real table:

So it would be nice to add the possibility to cast an array to a vector if the target column is a vector.
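As a rough sketch of that idea, the select list for the temp-table-to-target step could be built from the target column types, adding a cast only where the target type is a VECTOR. Everything below (`VectorCastSketch`, `selectExpr`, `buildInsertSelect`, and the map of column names to type strings) is hypothetical and written for illustration; it does not reflect how the Loader is implemented or how it looks up column metadata.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only: none of these names exist in snowflake-jdbc.
class VectorCastSketch {

    // Returns the column itself, or the column cast to the target type
    // when the target type string starts with "VECTOR(".
    static String selectExpr(String column, String targetType) {
        String type = targetType.trim();
        if (type.toUpperCase(Locale.ROOT).startsWith("VECTOR(")) {
            return column + "::" + type;
        }
        return column;
    }

    // Builds INSERT ... SELECT from the staging table, casting only VECTOR columns.
    // The map is assumed to preserve column order (e.g. a LinkedHashMap).
    static String buildInsertSelect(String target, String stage,
                                    Map<String, String> targetColumnTypes) {
        List<String> exprs = new ArrayList<>();
        for (Map.Entry<String, String> e : targetColumnTypes.entrySet()) {
            exprs.add(selectExpr(e.getKey(), e.getValue()));
        }
        return "INSERT INTO " + target
             + " (" + String.join(", ", targetColumnTypes.keySet()) + ")"
             + " SELECT " + String.join(", ", exprs)
             + " FROM " + stage;
    }

    public static void main(String[] args) {
        // Column names and types are assumptions for the example.
        Map<String, String> cols = new LinkedHashMap<>();
        cols.put("id", "NUMBER");
        cols.put("vector_col", "VECTOR(FLOAT, 3)");
        System.out.println(buildInsertSelect("my_vector_table", "my_stage_table", cols));
    }
}
```

Non-vector columns pass through unchanged, so a change along these lines would only affect tables that actually have a VECTOR target column.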
