SNOW-1812949 add get object and get bytes support for native arrow structured types #1968

Open: sfc-gh-mkubik wants to merge 26 commits into master

sfc-gh-mkubik (Contributor) commented Nov 18, 2024

Add getObject and getBytes implementations for native arrow structured types

SNOW-1812949

The getString unification has already been introduced in the JDBC driver. This PR makes the corresponding changes for getObject and getBytes:

  • toObject on structured types returns a StructObjectWrapper (cast to Object) that contains both the Object and the String representation. This takes advantage of the structured type and allows casting to Map or Array, while keeping backward compatibility with semi-structured types by returning the proper string representation.
  • toBytes returns the bytes of the string representation of a structured type.
  • toBytes was also added to the vector type, where it was not previously available.
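
The behaviour described in the list above can be sketched roughly as follows. This is a simplified illustration, not the driver's code: `StructObjectWrapperSketch`, its accessors, and `StructWrapperDemo` are hypothetical stand-ins, and UTF-8 is an assumed encoding for the byte form.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for StructObjectWrapper: keeps the string form
// and the typed object side by side.
final class StructObjectWrapperSketch {
  private final String jsonString;
  private final Object object;

  StructObjectWrapperSketch(String jsonString, Object object) {
    this.jsonString = jsonString;
    this.object = object;
  }

  String getJsonString() {
    return jsonString;
  }

  Object getObject() {
    return object;
  }

  // Bytes are derived from the string representation (UTF-8 is an assumption).
  byte[] toBytes() {
    return jsonString == null ? null : jsonString.getBytes(StandardCharsets.UTF_8);
  }
}

final class StructWrapperDemo {
  // Builds a sample wrapper holding a Map and its JSON text.
  static StructObjectWrapperSketch sample() {
    Map<String, Object> fields = new LinkedHashMap<>();
    fields.put("name", "Alice");
    fields.put("age", 30);
    return new StructObjectWrapperSketch("{\"name\":\"Alice\",\"age\":30}", fields);
  }

  public static void main(String[] args) {
    StructObjectWrapperSketch wrapper = sample();
    // Callers that know the structure can cast the object form to a Map.
    @SuppressWarnings("unchecked")
    Map<String, Object> asMap = (Map<String, Object>) wrapper.getObject();
    System.out.println(asMap.get("name"));
    // Callers that relied on the semi-structured behaviour still get the string.
    System.out.println(wrapper.getJsonString());
    System.out.println(wrapper.toBytes().length);
  }
}
```

Keeping both forms in one wrapper lets older callers keep reading the string while newer callers cast the object, at the cost of carrying two representations per value.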

Pre-review self checklist

  • PR branch is updated with all the changes from master branch
  • The code is correctly formatted (run mvn -P check-style validate)
  • New public API is not unnecessarily exposed (run mvn verify and inspect target/japicmp/japicmp.html)
  • The pull request name is prefixed with SNOW-XXXX:
  • Code is in compliance with internal logging requirements

@sfc-gh-mkubik sfc-gh-mkubik marked this pull request as ready for review November 20, 2024 10:51
@sfc-gh-mkubik sfc-gh-mkubik requested a review from a team as a code owner November 20, 2024 10:51
return createJsonSqlInput(columnIndex, obj);
} else if (converter instanceof StructConverter) {
return createArrowSqlInput(columnIndex, (Map<String, Object>) obj);
if (type == Types.STRUCT && isStructuredType && converter instanceof VarCharConverter) {

Contributor

Could we extract these conditions?

Contributor Author

Do you mean something like an isVarcharConvertedStructuredType() function? I also considered rearranging the conditions, but I'm not sure how to make it pretty:

    if (type != Types.STRUCT) {
      return obj;
    }
    if (!resultSetMetaData.isStructuredTypeColumn(columnIndex)) {
      return obj;
    }
    if (!(converter instanceof VarCharConverter)) {
      return obj;
    }
    if (obj != null) {
      return new StructObjectWrapper((String) obj, createJsonSqlInput(columnIndex, obj));
    }

    return null;
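
One possible shape for the extracted condition, as a sketch only. `ConverterSketch`, `VarCharConverterSketch`, `StructConverterSketch`, and `StructuredTypeConditions` are simplified stand-ins rather than the driver's real classes, the metadata lookup is reduced to a boolean parameter, and the wrapping step is simulated with a string prefix so the example stays self-contained.

```java
import java.sql.Types;

// Simplified stand-ins for the driver's converter hierarchy (hypothetical).
interface ConverterSketch {}

final class VarCharConverterSketch implements ConverterSketch {}

final class StructConverterSketch implements ConverterSketch {}

final class StructuredTypeConditions {
  // Names the three guard clauses from the snippet above as a single predicate.
  static boolean isVarcharConvertedStructuredType(
      int type, boolean isStructuredTypeColumn, ConverterSketch converter) {
    return type == Types.STRUCT
        && isStructuredTypeColumn
        && converter instanceof VarCharConverterSketch;
  }

  // Mirrors the control flow of the snippet above; the real code would build
  // a StructObjectWrapper where this sketch only adds a prefix.
  static Object convert(
      int type, boolean isStructuredTypeColumn, ConverterSketch converter, Object obj) {
    if (!isVarcharConvertedStructuredType(type, isStructuredTypeColumn, converter)) {
      return obj;
    }
    return obj == null ? null : "wrapped:" + obj;
  }
}
```

With the predicate named, the calling method shrinks to one guard plus the null check, which may read more easily than three separate early returns.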

return text;
}

private static String buildJsonStringFromElements(Object elements) throws SQLException {
try {
return SnowflakeUtil.mapJson(elements);

Contributor

Here we create JSON using ObjectMapper. Shouldn't it be created by our type mappers?

Contributor Author

I think this method is used for both JSON and Arrow data. For Arrow structured types the text field will always be set, so this code won't be executed. That said, it may be worth checking for potential divergence in primitive types.

private int baseType;
private Object elements;

public SfSqlArray(String text, int baseType, Object elements) {

Collaborator

  1. Can we reuse the new constructor in the other constructors?
  2. Can we remove the other constructors?

Contributor Author

Good catch. I'll reuse it, but I'll also keep the old one to avoid passing null in usages that remain untouched.
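
A sketch of what reusing the new constructor could look like while keeping an older one. The class name `SfSqlArraySketch`, the two-argument constructor, and the getters are assumptions made for illustration; only the three-argument signature `(String text, int baseType, Object elements)` comes from the diff above.

```java
// Hypothetical, trimmed-down version of the array class discussed above.
final class SfSqlArraySketch {
  private final String text;
  private final int baseType;
  private final Object elements;

  // Assumed shape of an older constructor: it delegates to the new one,
  // so existing call sites never have to pass null themselves.
  SfSqlArraySketch(int baseType, Object elements) {
    this(null, baseType, elements);
  }

  // New constructor from the diff: also carries the string representation.
  SfSqlArraySketch(String text, int baseType, Object elements) {
    this.text = text;
    this.baseType = baseType;
    this.elements = elements;
  }

  String getText() {
    return text;
  }

  int getBaseType() {
    return baseType;
  }

  Object getElements() {
    return elements;
  }
}
```

Delegating keeps the field assignments in one place, so a later change to initialization only needs to touch the three-argument constructor.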

public void testRunAsGetString() throws SQLException {
withFirstRow(
connections.get(queryResultFormat),
selectSql,
(resultSet) -> assertGetStringIsCompatible(resultSet, expectedStructureTypeRepresentation));
}

@Test
public void testRunAsGetObject() throws SQLException {

Collaborator

We are fetching and reading the same data for Object and Bytes. Can we run both checks (or even more) on the same result set?
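
The idea of running several checks against one fetched row might look like the sketch below. It deliberately avoids a real connection: `RowCheck` and `SharedRowChecks` are hypothetical helpers, and the row is a generic value standing in for the ResultSet that `withFirstRow` passes to its lambda.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical check that may throw, similar in spirit to a ResultSet consumer.
interface RowCheck<T> {
  void verify(T row) throws Exception;
}

final class SharedRowChecks {
  // Runs every check against the same already-fetched row and collects the
  // failure messages, so one query serves all assertions.
  @SafeVarargs
  static <T> List<String> runAll(T row, RowCheck<T>... checks) {
    List<String> failures = new ArrayList<>();
    for (RowCheck<T> check : checks) {
      try {
        check.verify(row);
      } catch (Exception | AssertionError e) {
        failures.add(String.valueOf(e.getMessage()));
      }
    }
    return failures;
  }
}
```

In the real test the checks would be the getObject and getBytes assertions, passed together to a single `withFirstRow` call. Collecting failures instead of stopping at the first one is a design choice of this sketch, not something the PR prescribes.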

3 participants