[DOP-15561] Cast DateTimeHWM to DateTime64 in Clickhouse #267

Merged: 1 commit, Apr 27, 2024
6 changes: 3 additions & 3 deletions .github/workflows/data/clickhouse/matrix.yml
@@ -25,16 +25,16 @@ matrix:
       clickhouse-version: 23.6.1-alpine
       <<: *max
   full:
-    # the lowest supported Clickhouse version by JDBC driver
+    # Clickhouse version with proper DateTime > DateTime64 comparison
     - clickhouse-image: yandex/clickhouse-server
-      clickhouse-version: '20.7'
+      clickhouse-version: '21.1'
       <<: *min
     - clickhouse-image: clickhouse/clickhouse-server
       clickhouse-version: 23.6.1-alpine
       <<: *max
   nightly:
     - clickhouse-image: yandex/clickhouse-server
-      clickhouse-version: '20.7'
+      clickhouse-version: '21.1'
       <<: *min
     - clickhouse-image: clickhouse/clickhouse-server
       clickhouse-version: latest-alpine
26 changes: 26 additions & 0 deletions docs/changelog/next_release/267.breaking.rst
@@ -0,0 +1,26 @@
+Serialize DateTimeHWM to Clickhouse's ``DateTime64(6)`` (precision up to microseconds) instead of ``DateTime`` (precision up to seconds).
+
+In Clickhouse versions below 21.1, comparing a column of type ``DateTime`` with a value of type ``DateTime64`` is not supported, and the query returns an empty dataframe.
+To avoid this, cast the column explicitly so that both sides of the comparison are ``DateTime64``. Replace:
+
+.. code:: python
+
+    DBReader(
+        ...,
+        hwm=DBReader.AutoDetectHWM(
+            name="my_hwm",
+            expression="hwm_column",  # <--
+        ),
+    )
+
+with:
+
+.. code:: python
+
+    DBReader(
+        ...,
+        hwm=DBReader.AutoDetectHWM(
+            name="my_hwm",
+            expression="CAST(hwm_column AS DateTime64)",  # <--
+        ),
+    )
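
Note on the precision change described in this changelog entry: the sketch below shows one way a seconds-only HWM literal could affect an incremental read. It is not onETL code. The sample rows, the `rows_after` helper, and the strict `>` filter are assumptions made for illustration, and it presumes the source column stores sub-second values.

```python
from datetime import datetime

# Hypothetical timestamps in a microsecond-precision column.
rows = [
    datetime(2024, 4, 27, 12, 0, 0, 100_000),  # read in the previous run
    datetime(2024, 4, 27, 12, 0, 0, 900_000),  # read in the previous run
    datetime(2024, 4, 27, 12, 0, 1, 200_000),  # arrived after the previous run
]

# HWM saved by the previous run: the maximum value it has seen.
hwm = datetime(2024, 4, 27, 12, 0, 0, 900_000)


def rows_after(values, threshold):
    """Assumed incremental filter: keep values strictly greater than the HWM."""
    return [v for v in values if v > threshold]


# Seconds-only literal: microseconds are dropped before the comparison.
truncated_hwm = hwm.replace(microsecond=0)
reread = rows_after(rows, truncated_hwm)

# Microsecond literal: the comparison uses the exact stored HWM.
exact = rows_after(rows, hwm)

print(len(reread))  # 3
print(len(exact))   # 1
```

Under these assumptions, the truncated threshold selects the two already-processed rows again, while the exact threshold selects only the new one.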
2 changes: 1 addition & 1 deletion docs/connection/db_connection/clickhouse/prerequisites.rst
@@ -6,7 +6,7 @@ Prerequisites
 Version Compatibility
 ---------------------

-* Clickhouse server versions: 20.7 or higher
+* Clickhouse server versions: 21.1 or higher
 * Spark versions: 2.3.x - 3.5.x
 * Java versions: 8 - 20

8 changes: 5 additions & 3 deletions onetl/connection/db_connection/clickhouse/dialect.py
@@ -26,9 +26,11 @@ def get_min_value(self, value: Any) -> str:
         return f"minOrNull({result})"

     def _serialize_datetime(self, value: datetime) -> str:
-        result = value.strftime("%Y-%m-%d %H:%M:%S")
-        return f"CAST('{result}' AS DateTime)"
+        # this requires at least Clickhouse 21.1, see:
+        # https://github.com/ClickHouse/ClickHouse/issues/16655
+        result = value.strftime("%Y-%m-%d %H:%M:%S.%f")
+        return f"toDateTime64('{result}', 6)"

     def _serialize_date(self, value: date) -> str:
         result = value.strftime("%Y-%m-%d")
-        return f"CAST('{result}' AS Date)"
+        return f"toDate('{result}')"
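
For reference, the old and new literal formats from this hunk can be compared in a standalone sketch. The free functions below are hypothetical stand-ins for the dialect methods (the real ones are methods on the dialect class), and the sample value is made up.

```python
from datetime import date, datetime


def serialize_datetime_old(value: datetime) -> str:
    # Previous behaviour: seconds precision, microseconds are dropped.
    result = value.strftime("%Y-%m-%d %H:%M:%S")
    return f"CAST('{result}' AS DateTime)"


def serialize_datetime_new(value: datetime) -> str:
    # New behaviour: %f always emits six zero-padded microsecond digits,
    # matching the scale 6 passed to toDateTime64.
    result = value.strftime("%Y-%m-%d %H:%M:%S.%f")
    return f"toDateTime64('{result}', 6)"


def serialize_date_new(value: date) -> str:
    result = value.strftime("%Y-%m-%d")
    return f"toDate('{result}')"


sample = datetime(2024, 4, 27, 12, 30, 45, 123456)
print(serialize_datetime_old(sample))  # CAST('2024-04-27 12:30:45' AS DateTime)
print(serialize_datetime_new(sample))  # toDateTime64('2024-04-27 12:30:45.123456', 6)
print(serialize_date_new(sample.date()))  # toDate('2024-04-27')
```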
2 changes: 1 addition & 1 deletion tests/fixtures/processing/clickhouse.py
@@ -20,7 +20,7 @@ class ClickhouseProcessing(BaseProcessing):
         "text_string": "String",
         "hwm_int": "Int32",
         "hwm_date": "Date",
-        "hwm_datetime": "DateTime",
+        "hwm_datetime": "DateTime64(6)",
         "float_value": "Float32",
     }

@@ -302,12 +302,37 @@ def test_clickhouse_strategy_incremental_explicit_hwm_type(
             ColumnDateHWM,
             lambda x: x.isoformat(),
         ),
+        pytest.param(
+            "hwm_date",
+            "CAST(text_string AS Date32)",
+            ColumnDateHWM,
+            lambda x: x.isoformat(),
+            marks=pytest.mark.xfail(reason="Date32 type was added in ClickHouse 21.9"),
+        ),
         (
             "hwm_datetime",
             "CAST(text_string AS DateTime)",
             ColumnDateTimeHWM,
             lambda x: x.isoformat(),
         ),
+        (
+            "hwm_datetime",
+            "CAST(text_string AS DateTime64)",
+            ColumnDateTimeHWM,
+            lambda x: x.isoformat(),
+        ),
+        (
+            "hwm_datetime",
+            "CAST(text_string AS DateTime64(3))",
+            ColumnDateTimeHWM,
+            lambda x: x.isoformat(),
+        ),
+        (
+            "hwm_datetime",
+            "CAST(text_string AS DateTime64(6))",
+            ColumnDateTimeHWM,
+            lambda x: x.isoformat(),
+        ),
     ],
 )
 def test_clickhouse_strategy_incremental_with_hwm_expr(