[BUG] cudf::read_json incorrectly parses invalid JSON string #16999

Closed
Tracked by #11630
ttnghia opened this issue Oct 4, 2024 · 1 comment
Labels: bug, cuIO

ttnghia commented Oct 4, 2024

Given the invalid input {"key": "}, read_json returns " (a single double-quote character) for the key column instead of NULL.

Code to reproduce:

#include <cudf_test/base_fixture.hpp>
#include <cudf_test/column_wrapper.hpp>
#include <cudf_test/debug_utilities.hpp>

#include <cudf/io/json.hpp>

#include <map>
#include <string>

struct Test : public cudf::test::BaseFixture {};

template <typename T>
auto dtype()
{
  return cudf::data_type{cudf::type_to_id<T>()};
}

TEST_F(Test, X)
{
  auto const data = std::string{"{\"key\": \"}"};
  std::map<std::string, cudf::io::schema_element> schema{{"key", {dtype<cudf::string_view>()}}};
  auto const opts =
    cudf::io::json_reader_options::builder(cudf::io::source_info{data.data(), data.size()})
      .dtypes(schema)
      .lines(true)
      .recovery_mode(cudf::io::json_recovery_mode_t::RECOVER_WITH_NULL)
      .build();
  auto const result = cudf::io::read_json(opts);
  // Expected: one null row. Actual (the bug): a row holding a single quote character.
  cudf::test::print(result.tbl->view().column(0));
}

Note that the bug only shows up if the invalid JSON object is at the end of the input data. If it is followed by a valid JSON object, the output is correct.

In particular, the bug will show up with this input:

{"key": "1"}
{"key": "}

but it will not show up with this:

{"key": "}
{"key": "1"}
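
To make the expected result concrete, here is a minimal host-side sketch of which rows should come back as NULL under RECOVER_WITH_NULL for the two inputs above. It is not cudf's parser and only detects an unterminated string, not every kind of invalid JSON. The names has_unterminated_string and row_validity are hypothetical helpers made up for this illustration.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper: returns true if `line` ends while still inside a
// double-quoted string. Backslash escapes inside strings are honored.
// This is an illustration only, not a full JSON validator.
bool has_unterminated_string(std::string_view line)
{
  bool in_string = false;
  bool escaped   = false;
  for (char c : line) {
    if (escaped) {
      escaped = false;  // the character after a backslash is taken literally
    } else if (in_string && c == '\\') {
      escaped = true;
    } else if (c == '"') {
      in_string = !in_string;
    }
  }
  return in_string;
}

// Hypothetical helper: splits newline-delimited input and returns one flag
// per non-empty line; false marks a row that should be reported as NULL.
std::vector<bool> row_validity(std::string_view input)
{
  std::vector<bool> valid;
  std::size_t begin = 0;
  while (begin <= input.size()) {
    auto end = input.find('\n', begin);
    if (end == std::string_view::npos) { end = input.size(); }
    auto const line = input.substr(begin, end - begin);
    if (!line.empty()) { valid.push_back(!has_unterminated_string(line)); }
    begin = end + 1;
  }
  return valid;
}
```

With this sketch, the invalid row is flagged regardless of its position, which is the behavior the issue expects from read_json in both orderings.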

ttnghia commented Oct 31, 2024

Closed by #17098.

ttnghia closed this as completed Oct 31, 2024