org.apache.spark.sql.mleap.TypeConverters can not convert 2D tensor to Matrix #854

Open
austinzh opened this issue Jun 27, 2023 · 0 comments
@austinzh
Contributor

The current implementation always converts a Tensor to a Vector.
The bug is hidden in tt.dimensions.size: tt.dimensions is an Option[Seq[Int]], so calling size on it returns 1 for Some and 0 for None, regardless of how many dimensions the inner Seq holds. The assertion guarantees that dimensions is defined, so the size == 1 branch always matches and, in the following code, a TensorType is always converted to VectorUDT:

  def mleapTensorToSpark(tt: types.TensorType): DataType = {
    assert(TypeConverters.VECTOR_BASIC_TYPES.contains(tt.base),
      s"cannot convert tensor with base ${tt.base} to vector")
    assert(tt.dimensions.isDefined, "cannot convert tensor with undefined dimensions")

    if(tt.dimensions.isEmpty) {
      mleapBasicTypeToSparkType(tt.base)
    } else if(tt.dimensions.size == 1) {
      new VectorUDT
    } else if(tt.dimensions.size == 2) {
      new MatrixUDT
    } else {
      throw new IllegalArgumentException("cannot convert tensor for non-scalar, vector or matrix tensor")
    }
  }
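
One possible direction for a fix, as a minimal sketch rather than the actual patch: match on the Seq inside the Option instead of calling size on the Option itself. The names `TensorTypeSketch`, `RankSketch`, and `describeRank` are hypothetical stand-ins so the example runs without MLeap or Spark on the classpath, and the strings stand in for the Spark types. It also assumes that a scalar tensor is represented by an empty dimension list, which the code above does not confirm.

```scala
// Hypothetical stand-in for types.TensorType; only `dimensions` matters here.
final case class TensorTypeSketch(dimensions: Option[Seq[Int]])

object RankSketch {
  // Size of the Option wrapper: 1 for Some, 0 for None (this is the bug).
  def optionSize(tt: TensorTypeSketch): Int = tt.dimensions.size

  // Classify by the length of the inner Seq, not the size of the Option.
  // The returned strings stand in for the basic type, VectorUDT and MatrixUDT.
  def describeRank(tt: TensorTypeSketch): String = tt.dimensions match {
    case None =>
      throw new IllegalArgumentException("cannot convert tensor with undefined dimensions")
    case Some(dims) if dims.isEmpty   => "scalar" // assumption: empty Seq means scalar
    case Some(dims) if dims.size == 1 => "vector"
    case Some(dims) if dims.size == 2 => "matrix"
    case Some(_) =>
      throw new IllegalArgumentException("cannot convert tensor for non-scalar, vector or matrix tensor")
  }
}
```

With a 2D tensor such as `TensorTypeSketch(Some(Seq(3, 4)))`, `optionSize` still evaluates to 1, while `describeRank` sees the two dimensions and takes the matrix branch.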

The same bug exists in the mleapToSparkValue function as well.

@austinzh austinzh added the bug label Jun 27, 2023
@austinzh austinzh self-assigned this Jun 27, 2023