Feature/vector support #201

Merged
merged 6 commits into from
Sep 22, 2023

Conversation

Collaborator

@msmygit msmygit commented Sep 20, 2023

What this PR does: Introduces support for migrating tables that use the vector CQL data type.

Which issue(s) this PR fixes:
Fixes #200

Checklist:

  • Automated Tests added/updated
  • Documentation added/updated
  • CLA Signed: DataStax CLA

@msmygit msmygit added the enhancement New feature or request label Sep 20, 2023
@msmygit msmygit requested a review from a team as a code owner September 20, 2023 23:38
@msmygit msmygit self-assigned this Sep 20, 2023
@msmygit msmygit force-pushed the feature/vector_support branch from 9594224 to 2a04ba9 Compare September 20, 2023 23:41
@@ -5,17 +5,17 @@ assertCmd="egrep 'JobSession.* Final ' \${OUTPUT_FILE} | sed 's/^.*Final //'"
_usage() {
cat <<EOF

-usage: $0 -f output_file -a assertFile [-d directory]
+usage: $0 -f output_file -a assert_file [-d directory]
Collaborator Author

Cosmetic rename to match file name pattern.

@@ -0,0 +1,4 @@
Read Record Count: 2
Collaborator Author

These files are added in their entirety to smoke_inflight, which isn't actively tested. They will be moved to the smoke directory once we have a Docker container that supports the vector CQL data type.

@msmygit msmygit force-pushed the feature/vector_support branch from 2a04ba9 to ec1c449 Compare September 20, 2023 23:52
<artifactId>java-driver-query-builder</artifactId>
<version>${java-driver.version}</version>
</dependency>
<!-- <dependency>
Collaborator Author

This dependency was previously pulled in via the SCC, but it is not actually required. Keeping it here, commented out, just in case.

@@ -13,9 +13,10 @@
<spark.version>3.3.1</spark.version>
<scalatest.version>3.2.12</scalatest.version>
<connector.version>3.2.0</connector.version>
-<cassandra.version>3.11.13</cassandra.version>
+<cassandra.version>5.0-alpha1</cassandra.version>
Collaborator Author

@msmygit msmygit Sep 21, 2023

FYI - There is a datastax/dse-server:7.0.0-a image (from this blog) that supports the vector data type, but I didn't want to use it here just yet.

@@ -50,7 +50,7 @@ public ByteBuffer encode(Double value, @NotNull ProtocolVersion protocolVersion)
@Override
public Double decode(ByteBuffer bytes, @NotNull ProtocolVersion protocolVersion) {
String stringValue = TypeCodecs.TEXT.decode(bytes, protocolVersion);
-return new Double(stringValue);
+return Double.valueOf(stringValue);
Collaborator Author

FYI - This replaces the deprecated new Double(String value) constructor with Double.valueOf(String), which is the correct call to use in newer Java versions.
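
For reference, a minimal standalone sketch of the decode path after this change. It assumes the column bytes are UTF-8 text; `decodeText` is a hypothetical stand-in for `TypeCodecs.TEXT.decode` so the snippet runs without the Java Driver on the classpath, and the class name is made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class TextDoubleDecodeSketch {

    // Hypothetical stand-in for TypeCodecs.TEXT.decode(bytes, protocolVersion):
    // reads the buffer as UTF-8 text without moving the caller's position.
    static String decodeText(ByteBuffer bytes) {
        return StandardCharsets.UTF_8.decode(bytes.duplicate()).toString();
    }

    // Mirrors the changed line: Double.valueOf(String) instead of the
    // deprecated new Double(String) constructor.
    static Double decode(ByteBuffer bytes) {
        String stringValue = decodeText(bytes);
        return Double.valueOf(stringValue);
    }

    public static void main(String[] args) {
        ByteBuffer raw = ByteBuffer.wrap("3.14".getBytes(StandardCharsets.UTF_8));
        System.out.println(decode(raw)); // prints 3.14
    }
}
```

Both forms parse the string the same way and throw NumberFormatException on malformed input, so the change should be behavior-preserving apart from removing the deprecation warning.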

@@ -74,7 +75,48 @@
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_${scala.main.version}</artifactId>
<version>${connector.version}</version>
<exclusions>
Collaborator Author

We're excluding the Java Driver from the SCC dependency and adding it separately because the version that ships with the SCC (i.e. 4.13.0) doesn't support vectors.

@msmygit msmygit force-pushed the feature/vector_support branch from ec1c449 to 056cda0 Compare September 22, 2023 16:24
@msmygit msmygit merged commit 125eefe into main Sep 22, 2023
@msmygit msmygit deleted the feature/vector_support branch September 22, 2023 16:54
Development

Successfully merging this pull request may close these issues.

Introduce support for new vector CQL data type
3 participants