From b1fa6be2a4083be8bf93e02dda56a2646eb2b3e9 Mon Sep 17 00:00:00 2001
From: Grace Cai
Date: Fri, 25 Apr 2025 10:42:40 +0800
Subject: [PATCH 01/15] Add temp.md

---
 temp.md | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 temp.md

diff --git a/temp.md b/temp.md
new file mode 100644
index 0000000000000..af27ff4986a7b
--- /dev/null
+++ b/temp.md
@@ -0,0 +1 @@
+This is a test file.
\ No newline at end of file

From 7028e4daabe881131c0a56616fe37bbb4364231c Mon Sep 17 00:00:00 2001
From: Grace Cai
Date: Fri, 25 Apr 2025 10:42:44 +0800
Subject: [PATCH 02/15] Delete temp.md

---
 temp.md | 1 -
 1 file changed, 1 deletion(-)
 delete mode 100644 temp.md

diff --git a/temp.md b/temp.md
deleted file mode 100644
index af27ff4986a7b..0000000000000
--- a/temp.md
+++ /dev/null
@@ -1 +0,0 @@
-This is a test file.
\ No newline at end of file

From f768ad817d5715833d6e29e0ce9fa5c2488a112b Mon Sep 17 00:00:00 2001
From: Test User
Date: Fri, 25 Apr 2025 11:11:15 +0800
Subject: [PATCH 03/15] Update sql-faq.md

---
 faq/sql-faq.md | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 67 insertions(+)

diff --git a/faq/sql-faq.md b/faq/sql-faq.md
index 1d1351dfd67ab..bb2603531c0da 100644
--- a/faq/sql-faq.md
+++ b/faq/sql-faq.md
@@ -337,6 +337,73 @@ Whether your cluster is a new cluster or an upgraded cluster from an earlier ver
 - If the owner does not exist, try manually triggering owner election with: `curl -X POST http://{TiDBIP}:10080/ddl/owner/resign`.
 - If the owner exists, export the Goroutine stack and check for the possible stuck location.
 
+## Collation used in JDBC connections
+
+This section describes the collation behavior of JDBC connections and provides solutions for collation changes after TiDB upgrades. For information about character sets and collations supported by TiDB, see [Character Set and Collation](/character-set-and-collation.md).
+
+### What collation is used in a JDBC connection when `connectionCollation` is not configured in the JDBC URL?
+
+When `connectionCollation` is not configured in the JDBC URL, there are two scenarios:
+
+**Scenario 1**: Neither `connectionCollation` nor `characterEncoding` is configured in the JDBC URL
+
+- For Connector/J 8.0.25 and earlier versions, the JDBC driver attempts to use the server's default character set. Because the default character set of TiDB is `utf8mb4`, the driver uses `utf8mb4_bin` as the connection collation.
+- For Connector/J 8.0.26 and later versions, the JDBC driver uses the `utf8mb4` character set and automatically selects the collation based on the return value of `SELECT VERSION()`.
+
+    - When the return value is less than `8.0.1`, the driver uses `utf8mb4_general_ci` as the connection collation. TiDB follows the driver and uses `utf8mb4_general_ci` as the collation.
+    - When the return value is greater than or equal to `8.0.1`, the driver uses `utf8mb4_0900_ai_ci` as the connection collation. TiDB v7.4.0 and later versions follow the driver and use `utf8mb4_0900_ai_ci` as the collation, while TiDB versions earlier than v7.4.0 fall back to using the default collation `utf8mb4_bin` because the `utf8mb4_0900_ai_ci` collation is not supported in these versions.
+
+**Scenario 2**: `characterEncoding=utf8` is configured in the JDBC URL but `connectionCollation` is not configured. The JDBC driver uses the `utf8mb4` character set according to the mapping rules. The collation is determined according to the rules described in Scenario 1.
+
+### How to solve collation changes after TiDB upgrade?
+
+In TiDB v7.4 and earlier versions, if `connectionCollation` is not configured, and `characterEncoding` is either not configured or set to `utf8` in the JDBC URL, the TiDB [`collation_connection`](/system-variable-reference.md#collation_connection) variable defaults to the `utf8mb4_bin` collation.
+
+Starting from TiDB v7.4, if `connectionCollation` is not configured, and `characterEncoding` is either not configured or set to `utf8` in the JDBC URL, the value of the [`collation_connection`](/system-variable-reference.md#collation_connection) variable depends on the JDBC driver version. For example, for Connector/J 8.0.26 and later versions, the JDBC driver defaults to the `utf8mb4` character set and uses `utf8mb4_general_ci` as the connection collation. TiDB follows the driver, and the [`collation_connection`](/system-variable-reference.md#collation_connection) variable uses the `utf8mb4_0900_ai_ci` collation. For more information, see [Collation used for JDBC connections](#what-collation-is-used-for-a-jdbc-connection-when-connectioncollation-is-not-configured-in-the-jdbc-url).
+
+When upgrading from an earlier version to v7.4 or later (for example, from v6.5 to v7.5), if you need to maintain the `collation_connection` as `utf8mb4_bin` for JDBC connections, it is recommended to configure the `connectionCollation` parameter in the JDBC URL.
+
+The following is a common JDBC URL configuration in TiDB v6.5:
+
+```
+spring.datasource.url=JDBC:mysql://{TiDBIP}:{TiDBPort}/{DBName}?characterEncoding=utf8&useSSL=false&useServerPrepStmts=true&cachePrepStmts=true&prepStmtCacheSqlLimit=10000&prepStmtCacheSize=1000&useConfigs=maxPerformance&rewriteBatchedStatements=true&defaultfetchsize=-2147483648&allowMultiQueries=true
+```
+
+After upgrading to TiDB v7.5 or a later version, it is recommended to configure the `connectionCollation` parameter in the JDBC URL:
+
+```
+spring.datasource.url=JDBC:mysql://{TiDBIP}:{TiDBPort}/{DBName}?characterEncoding=utf8&connectionCollation=utf8mb4_bin&useSSL=false&useServerPrepStmts=true&cachePrepStmts=true&prepStmtCacheSqlLimit=10000&prepStmtCacheSize=1000&useConfigs=maxPerformance&rewriteBatchedStatements=true&defaultFetchSize=-2147483648&allowMultiQueries=true
+```
+
+### What are the differences between the `utf8mb4_bin` and `utf8mb4_0900_ai_ci` collations?
+
+| Collation | Case-sensitive | Ignore trailing spaces | Ignore diacritics | Comparison method |
+|----------------------|----------------|------------------|--------------|------------------------|
+| `utf8mb4_bin` | Yes | Yes | Yes | Compare binary values |
+| `utf8mb4_0900_ai_ci` | No | No | No | Use Unicode sorting algorithm |
+
+For example:
+
+```sql
+-- utf8mb4_bin is case-sensitive
+SELECT 'apple' = 'Apple' COLLATE utf8mb4_bin; -- Returns 0 (FALSE)
+
+-- utf8mb4_0900_ai_ci is not case-sensitive
+SELECT 'apple' = 'Apple' COLLATE utf8mb4_0900_ai_ci; -- Returns 1 (TRUE)
+
+-- utf8mb4_bin ignores trailing spaces
+SELECT 'Apple ' = 'Apple' COLLATE utf8mb4_bin; -- Returns 1 (TRUE)
+
+-- utf8mb4_0900_ai_ci does not ignore trailing spaces
+SELECT 'Apple ' = 'Apple' COLLATE utf8mb4_0900_ai_ci; -- Returns 0 (FALSE)
+
+-- utf8mb4_bin distinguishes diacritics
+SELECT 'café' = 'cafe' COLLATE utf8mb4_bin; -- Returns 0 (FALSE)
+
+-- utf8mb4_0900_ai_ci does not distinguish diacritics
+SELECT 'café' = 'cafe' COLLATE utf8mb4_0900_ai_ci; -- Returns 1 (TRUE)
+```
+
 ## SQL optimization
 
 ### TiDB execution plan description

From 78a8e0566660a659f507106a5c02f984285fc01b Mon Sep 17 00:00:00 2001
From: Grace Cai
Date: Fri, 25 Apr 2025 11:15:40 +0800
Subject: [PATCH 04/15] Update faq/sql-faq.md

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
---
 faq/sql-faq.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/faq/sql-faq.md b/faq/sql-faq.md
index bb2603531c0da..d88a4bacdadcd 100644
--- a/faq/sql-faq.md
+++ b/faq/sql-faq.md
@@ -359,7 +359,7 @@ When `connectionCollation` is not configured in the JDBC URL, there are two scen
 
 In TiDB v7.4 and earlier versions, if `connectionCollation` is not configured, and `characterEncoding` is either not configured or set to `utf8` in the JDBC URL, the TiDB [`collation_connection`](/system-variable-reference.md#collation_connection) variable defaults to the `utf8mb4_bin` collation.
 
-Starting from TiDB v7.4, if `connectionCollation` is not configured, and `characterEncoding` is either not configured or set to `utf8` in the JDBC URL, the value of the [`collation_connection`](/system-variable-reference.md#collation_connection) variable depends on the JDBC driver version. For example, for Connector/J 8.0.26 and later versions, the JDBC driver defaults to the `utf8mb4` character set and uses `utf8mb4_general_ci` as the connection collation. TiDB follows the driver, and the [`collation_connection`](/system-variable-reference.md#collation_connection) variable uses the `utf8mb4_0900_ai_ci` collation. For more information, see [Collation used for JDBC connections](#what-collation-is-used-for-a-jdbc-connection-when-connectioncollation-is-not-configured-in-the-jdbc-url).
+Starting from TiDB v7.4, if `connectionCollation` is not configured, and `characterEncoding` is either not configured or set to `utf8` in the JDBC URL, the value of the [`collation_connection`](/system-variable-reference.md#collation_connection) variable depends on the JDBC driver version. For example, for Connector/J 8.0.26 and later versions, the JDBC driver defaults to the `utf8mb4` character set and uses `utf8mb4_general_ci` as the connection collation. TiDB follows the driver, and the [`collation_connection`](/system-variable-reference.md#collation_connection) variable uses the `utf8mb4_0900_ai_ci` collation. For more information, see [Collation used in JDBC connections](#what-collation-is-used-in-a-jdbc-connection-when-connectioncollation-is-not-configured-in-the-jdbc-url).
 
 When upgrading from an earlier version to v7.4 or later (for example, from v6.5 to v7.5), if you need to maintain the `collation_connection` as `utf8mb4_bin` for JDBC connections, it is recommended to configure the `connectionCollation` parameter in the JDBC URL.
 

From 2b7b4f63cffe1a739f0919de2d6523df856efb01 Mon Sep 17 00:00:00 2001
From: Test User
Date: Fri, 25 Apr 2025 11:44:02 +0800
Subject: [PATCH 05/15] Update sql-faq.md

---
 faq/sql-faq.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/faq/sql-faq.md b/faq/sql-faq.md
index bb2603531c0da..c73344ba8b3a6 100644
--- a/faq/sql-faq.md
+++ b/faq/sql-faq.md
@@ -339,7 +339,7 @@ Whether your cluster is a new cluster or an upgraded cluster from an earlier ver
 
 ## Collation used in JDBC connections
 
-This section describes the collation behavior of JDBC connections and provides solutions for collation changes after TiDB upgrades. For information about character sets and collations supported by TiDB, see [Character Set and Collation](/character-set-and-collation.md).
+This section lists questions related to collations used in JDBC connections. For information about character sets and collations supported by TiDB, see [Character Set and Collation](/character-set-and-collation.md).
 
 ### What collation is used in a JDBC connection when `connectionCollation` is not configured in the JDBC URL?
 
@@ -355,7 +355,7 @@ When `connectionCollation` is not configured in the JDBC URL, there are two scen
 
 **Scenario 2**: `characterEncoding=utf8` is configured in the JDBC URL but `connectionCollation` is not configured. The JDBC driver uses the `utf8mb4` character set according to the mapping rules. The collation is determined according to the rules described in Scenario 1.
 
-### How to solve collation changes after TiDB upgrade?
+### How to handle collation changes after upgrading TiDB?
 
 In TiDB v7.4 and earlier versions, if `connectionCollation` is not configured, and `characterEncoding` is either not configured or set to `utf8` in the JDBC URL, the TiDB [`collation_connection`](/system-variable-reference.md#collation_connection) variable defaults to the `utf8mb4_bin` collation.
 

From 610c3a35d430cd748c46151b0a37af3eba40f28a Mon Sep 17 00:00:00 2001
From: Test User
Date: Fri, 25 Apr 2025 13:44:05 +0800
Subject: [PATCH 06/15] Update dev-guide-sample-application-java-jdbc.md

---
 develop/dev-guide-sample-application-java-jdbc.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/develop/dev-guide-sample-application-java-jdbc.md b/develop/dev-guide-sample-application-java-jdbc.md
index 7fd96eced87f3..111fab0ade697 100644
--- a/develop/dev-guide-sample-application-java-jdbc.md
+++ b/develop/dev-guide-sample-application-java-jdbc.md
@@ -16,7 +16,8 @@ In this tutorial, you can learn how to use TiDB and JDBC to accomplish the follo
 
 > **Note:**
 >
-> This tutorial works with TiDB Cloud Serverless, TiDB Cloud Dedicated, and TiDB Self-Managed.
+> - This tutorial works with TiDB Cloud Serverless, TiDB Cloud Dedicated, and TiDB Self-Managed.
+> - Starting from TiDB v7.4, if `connectionCollation` is not configured, and `characterEncoding` is either not configured or set to `utf8` in the JDBC URL, the collation used in a JDBC connection depends on the JDBC driver version. For more information, see [Collation used in JDBC connections](/faq/sql-faq.md#collation-used-in-jdbc-connections).
 
 ## Prerequisites
 

From 4b2ffa994b7f74ce999d5f9d858549d55e712d88 Mon Sep 17 00:00:00 2001
From: Grace Cai
Date: Fri, 25 Apr 2025 13:47:14 +0800
Subject: [PATCH 07/15] Update faq/sql-faq.md

Co-authored-by: Aolin
---
 faq/sql-faq.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/faq/sql-faq.md b/faq/sql-faq.md
index 6dd3af76f2ad7..2af7ddaf47250 100644
--- a/faq/sql-faq.md
+++ b/faq/sql-faq.md
@@ -388,7 +388,7 @@ For example:
 -- utf8mb4_bin is case-sensitive
 SELECT 'apple' = 'Apple' COLLATE utf8mb4_bin; -- Returns 0 (FALSE)
 
--- utf8mb4_0900_ai_ci is not case-sensitive
+-- utf8mb4_0900_ai_ci is case-insensitive
 SELECT 'apple' = 'Apple' COLLATE utf8mb4_0900_ai_ci; -- Returns 1 (TRUE)
 
 -- utf8mb4_bin ignores trailing spaces

From 5b1ca9dc8715d930aa295da0337f8bfd3361b5a0 Mon Sep 17 00:00:00 2001
From: Grace Cai
Date: Fri, 25 Apr 2025 14:08:53 +0800
Subject: [PATCH 08/15] Apply suggestions from code review

Co-authored-by: Aolin
---
 faq/sql-faq.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/faq/sql-faq.md b/faq/sql-faq.md
index 2af7ddaf47250..60a0a86624457 100644
--- a/faq/sql-faq.md
+++ b/faq/sql-faq.md
@@ -353,7 +353,7 @@ When `connectionCollation` is not configured in the JDBC URL, there are two scen
     - When the return value is less than `8.0.1`, the driver uses `utf8mb4_general_ci` as the connection collation. TiDB follows the driver and uses `utf8mb4_general_ci` as the collation.
     - When the return value is greater than or equal to `8.0.1`, the driver uses `utf8mb4_0900_ai_ci` as the connection collation. TiDB v7.4.0 and later versions follow the driver and use `utf8mb4_0900_ai_ci` as the collation, while TiDB versions earlier than v7.4.0 fall back to using the default collation `utf8mb4_bin` because the `utf8mb4_0900_ai_ci` collation is not supported in these versions.
 
-**Scenario 2**: `characterEncoding=utf8` is configured in the JDBC URL but `connectionCollation` is not configured. The JDBC driver uses the `utf8mb4` character set according to the mapping rules. The collation is determined according to the rules described in Scenario 1.
+**Scenario 2**: `characterEncoding=utf8` is configured in the JDBC URL but `connectionCollation` is not configured. The JDBC driver uses the `utf8mb4` character set according to the mapping rules. The collation is determined according to the rules described in scenario 1.
 
 ### How to handle collation changes after upgrading TiDB?
 
@@ -397,10 +397,10 @@ SELECT 'Apple ' = 'Apple' COLLATE utf8mb4_bin; -- Returns 1 (TRUE)
 -- utf8mb4_0900_ai_ci does not ignore trailing spaces
 SELECT 'Apple ' = 'Apple' COLLATE utf8mb4_0900_ai_ci; -- Returns 0 (FALSE)
 
--- utf8mb4_bin distinguishes diacritics
+-- utf8mb4_bin is accent-sensitive
 SELECT 'café' = 'cafe' COLLATE utf8mb4_bin; -- Returns 0 (FALSE)
 
--- utf8mb4_0900_ai_ci does not distinguish diacritics
+-- utf8mb4_0900_ai_ci is accent-insensitive
 SELECT 'café' = 'cafe' COLLATE utf8mb4_0900_ai_ci; -- Returns 1 (TRUE)
 ```
 

From 217cf4fffbf4bad774c2b1d078c5f5686aeb38f1 Mon Sep 17 00:00:00 2001
From: Grace Cai
Date: Fri, 25 Apr 2025 14:09:50 +0800
Subject: [PATCH 09/15] Update faq/sql-faq.md

Co-authored-by: Aolin
---
 faq/sql-faq.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/faq/sql-faq.md b/faq/sql-faq.md
index 60a0a86624457..c8ecc3b4f55a2 100644
--- a/faq/sql-faq.md
+++ b/faq/sql-faq.md
@@ -377,7 +377,7 @@ spring.datasource.url=JDBC:mysql://{TiDBIP}:{TiDBPort}/{DBName}?characterEncodin
 
 ### What are the differences between the `utf8mb4_bin` and `utf8mb4_0900_ai_ci` collations?
 
-| Collation | Case-sensitive | Ignore trailing spaces | Ignore diacritics | Comparison method |
+| Collation | Case-sensitive | Ignore trailing spaces | Accent-sensitive | Comparison method |
 |----------------------|----------------|------------------|--------------|------------------------|
 | `utf8mb4_bin` | Yes | Yes | Yes | Compare binary values |
 | `utf8mb4_0900_ai_ci` | No | No | No | Use Unicode sorting algorithm |

From 3c5d69b1b18841e82180ae3fa1cc45cef77d2054 Mon Sep 17 00:00:00 2001
From: Test User
Date: Sun, 27 Apr 2025 10:57:49 +0800
Subject: [PATCH 10/15] Update upgrade-faq.md

---
 faq/upgrade-faq.md | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/faq/upgrade-faq.md b/faq/upgrade-faq.md
index d60feef4df6c0..6db7a808fbf4d 100644
--- a/faq/upgrade-faq.md
+++ b/faq/upgrade-faq.md
@@ -36,6 +36,12 @@ It is not recommended to upgrade TiDB using the binary. Instead, it is recommend
 
 This section lists some FAQs and their solutions after you upgrade TiDB.
 
+### The collation in JDBC connections changes after upgrading TiDB
+
+When upgrading from an earlier version to v7.4 or later, if the `connectionCollation` is not configured, and the `characterEncoding` is either not configured or configured as `UTF-8` in the JDBC URL, the default collation in your JDBC connections might change from `utf8mb4_bin` to `utf8mb4_0900_ai_ci` after upgrading. If you need to maintain the collation as `utf8mb4_bin`, configure `connectionCollation=utf8mb4_bin` in the JDBC URL.
+
+For more information, see [Collation used in JDBC connections](/faq/sql-faq.md#the-collation-used-in-jdbc-connections).
+
 ### The character set (charset) errors when executing DDL operations
 
 In v2.1.0 and earlier versions (including all versions of v2.0), the character set of TiDB is UTF-8 by default. But starting from v2.1.1, the default character set has been changed into UTF8MB4.

From 4bfb1656008121db4f60dbfd8e8eb3eef9846b1d Mon Sep 17 00:00:00 2001
From: Test User
Date: Sun, 27 Apr 2025 14:42:28 +0800
Subject: [PATCH 11/15] characterEncoding: utf8 ->UTF-8

---
 develop/dev-guide-sample-application-java-jdbc.md | 2 +-
 faq/sql-faq.md | 8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/develop/dev-guide-sample-application-java-jdbc.md b/develop/dev-guide-sample-application-java-jdbc.md
index 111fab0ade697..cba2dd7c5347f 100644
--- a/develop/dev-guide-sample-application-java-jdbc.md
+++ b/develop/dev-guide-sample-application-java-jdbc.md
@@ -17,7 +17,7 @@ In this tutorial, you can learn how to use TiDB and JDBC to accomplish the follo
 > **Note:**
 >
 > - This tutorial works with TiDB Cloud Serverless, TiDB Cloud Dedicated, and TiDB Self-Managed.
-> - Starting from TiDB v7.4, if `connectionCollation` is not configured, and `characterEncoding` is either not configured or set to `utf8` in the JDBC URL, the collation used in a JDBC connection depends on the JDBC driver version. For more information, see [Collation used in JDBC connections](/faq/sql-faq.md#collation-used-in-jdbc-connections).
+> - Starting from TiDB v7.4, if `connectionCollation` is not configured, and `characterEncoding` is either not configured or set to `UTF-8` in the JDBC URL, the collation used in a JDBC connection depends on the JDBC driver version. For more information, see [Collation used in JDBC connections](/faq/sql-faq.md#collation-used-in-jdbc-connections).
 
 ## Prerequisites
 
diff --git a/faq/sql-faq.md b/faq/sql-faq.md
index c8ecc3b4f55a2..68e89c2c0ac28 100644
--- a/faq/sql-faq.md
+++ b/faq/sql-faq.md
@@ -357,22 +357,22 @@ When `connectionCollation` is not configured in the JDBC URL, there are two scen
 
 ### How to handle collation changes after upgrading TiDB?
 
-In TiDB v7.4 and earlier versions, if `connectionCollation` is not configured, and `characterEncoding` is either not configured or set to `utf8` in the JDBC URL, the TiDB [`collation_connection`](/system-variable-reference.md#collation_connection) variable defaults to the `utf8mb4_bin` collation.
+In TiDB v7.4 and earlier versions, if `connectionCollation` is not configured, and `characterEncoding` is either not configured or set to `UTF-8` in the JDBC URL, the TiDB [`collation_connection`](/system-variable-reference.md#collation_connection) variable defaults to the `utf8mb4_bin` collation.
 
-Starting from TiDB v7.4, if `connectionCollation` is not configured, and `characterEncoding` is either not configured or set to `utf8` in the JDBC URL, the value of the [`collation_connection`](/system-variable-reference.md#collation_connection) variable depends on the JDBC driver version. For example, for Connector/J 8.0.26 and later versions, the JDBC driver defaults to the `utf8mb4` character set and uses `utf8mb4_general_ci` as the connection collation. TiDB follows the driver, and the [`collation_connection`](/system-variable-reference.md#collation_connection) variable uses the `utf8mb4_0900_ai_ci` collation. For more information, see [Collation used in JDBC connections](#what-collation-is-used-in-a-jdbc-connection-when-connectioncollation-is-not-configured-in-the-jdbc-url).
+Starting from TiDB v7.4, if `connectionCollation` is not configured, and `characterEncoding` is either not configured or set to `UTF-8` in the JDBC URL, the value of the [`collation_connection`](/system-variable-reference.md#collation_connection) variable depends on the JDBC driver version. For example, for Connector/J 8.0.26 and later versions, the JDBC driver defaults to the `utf8mb4` character set and uses `utf8mb4_general_ci` as the connection collation. TiDB follows the driver, and the [`collation_connection`](/system-variable-reference.md#collation_connection) variable uses the `utf8mb4_0900_ai_ci` collation. For more information, see [Collation used in JDBC connections](#what-collation-is-used-in-a-jdbc-connection-when-connectioncollation-is-not-configured-in-the-jdbc-url).
 
 When upgrading from an earlier version to v7.4 or later (for example, from v6.5 to v7.5), if you need to maintain the `collation_connection` as `utf8mb4_bin` for JDBC connections, it is recommended to configure the `connectionCollation` parameter in the JDBC URL.
 
 The following is a common JDBC URL configuration in TiDB v6.5:
 
 ```
-spring.datasource.url=JDBC:mysql://{TiDBIP}:{TiDBPort}/{DBName}?characterEncoding=utf8&useSSL=false&useServerPrepStmts=true&cachePrepStmts=true&prepStmtCacheSqlLimit=10000&prepStmtCacheSize=1000&useConfigs=maxPerformance&rewriteBatchedStatements=true&defaultfetchsize=-2147483648&allowMultiQueries=true
+spring.datasource.url=JDBC:mysql://{TiDBIP}:{TiDBPort}/{DBName}?characterEncoding=UTF-8&useSSL=false&useServerPrepStmts=true&cachePrepStmts=true&prepStmtCacheSqlLimit=10000&prepStmtCacheSize=1000&useConfigs=maxPerformance&rewriteBatchedStatements=true&defaultfetchsize=-2147483648&allowMultiQueries=true
 ```
 
 After upgrading to TiDB v7.5 or a later version, it is recommended to configure the `connectionCollation` parameter in the JDBC URL:
 
 ```
-spring.datasource.url=JDBC:mysql://{TiDBIP}:{TiDBPort}/{DBName}?characterEncoding=utf8&connectionCollation=utf8mb4_bin&useSSL=false&useServerPrepStmts=true&cachePrepStmts=true&prepStmtCacheSqlLimit=10000&prepStmtCacheSize=1000&useConfigs=maxPerformance&rewriteBatchedStatements=true&defaultFetchSize=-2147483648&allowMultiQueries=true
+spring.datasource.url=JDBC:mysql://{TiDBIP}:{TiDBPort}/{DBName}?characterEncoding=UTF-8&connectionCollation=utf8mb4_bin&useSSL=false&useServerPrepStmts=true&cachePrepStmts=true&prepStmtCacheSqlLimit=10000&prepStmtCacheSize=1000&useConfigs=maxPerformance&rewriteBatchedStatements=true&defaultFetchSize=-2147483648&allowMultiQueries=true
 ```
 
 ### What are the differences between the `utf8mb4_bin` and `utf8mb4_0900_ai_ci` collations?

From ff3f18a4824c1702f8c5030595462d973669782c Mon Sep 17 00:00:00 2001
From: Test User
Date: Mon, 28 Apr 2025 10:03:45 +0800
Subject: [PATCH 12/15] fix broken links of TiDB Cloud docs

---
 develop/dev-guide-sample-application-java-jdbc.md | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/develop/dev-guide-sample-application-java-jdbc.md b/develop/dev-guide-sample-application-java-jdbc.md
index cba2dd7c5347f..d757ac55b014a 100644
--- a/develop/dev-guide-sample-application-java-jdbc.md
+++ b/develop/dev-guide-sample-application-java-jdbc.md
@@ -14,11 +14,25 @@ In this tutorial, you can learn how to use TiDB and JDBC to accomplish the follo
 - Connect to your TiDB cluster using JDBC.
 - Build and run your application. Optionally, you can find [sample code snippets](#sample-code-snippets) for basic CRUD operations.
 
+
+
+
 > **Note:**
 >
 > - This tutorial works with TiDB Cloud Serverless, TiDB Cloud Dedicated, and TiDB Self-Managed.
 > - Starting from TiDB v7.4, if `connectionCollation` is not configured, and `characterEncoding` is either not configured or set to `UTF-8` in the JDBC URL, the collation used in a JDBC connection depends on the JDBC driver version. For more information, see [Collation used in JDBC connections](/faq/sql-faq.md#collation-used-in-jdbc-connections).
 
+
+
+
+
+> **Note:**
+>
+> - This tutorial works with TiDB Cloud Serverless, TiDB Cloud Dedicated, and TiDB Self-Managed.
+> - Starting from TiDB v7.4, if `connectionCollation` is not configured, and `characterEncoding` is either not configured or set to `UTF-8` in the JDBC URL, the collation used in a JDBC connection depends on the JDBC driver version. For more information, see [Collation used in JDBC connections](https://docs.pingcap.com/tidb/stable/sql-faq#collation-used-in-jdbc-connections).
+
+
+
 ## Prerequisites
 
 To complete this tutorial, you need:

From 888a3af055cbd377b97706203aef6bd9d9501a0e Mon Sep 17 00:00:00 2001
From: Grace Cai
Date: Mon, 28 Apr 2025 10:04:29 +0800
Subject: [PATCH 13/15] Update faq/upgrade-faq.md

---
 faq/upgrade-faq.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/faq/upgrade-faq.md b/faq/upgrade-faq.md
index 6db7a808fbf4d..d7fc3eeb3a560 100644
--- a/faq/upgrade-faq.md
+++ b/faq/upgrade-faq.md
@@ -40,7 +40,7 @@ This section lists some FAQs and their solutions after you upgrade TiDB.
 
 When upgrading from an earlier version to v7.4 or later, if the `connectionCollation` is not configured, and the `characterEncoding` is either not configured or configured as `UTF-8` in the JDBC URL, the default collation in your JDBC connections might change from `utf8mb4_bin` to `utf8mb4_0900_ai_ci` after upgrading. If you need to maintain the collation as `utf8mb4_bin`, configure `connectionCollation=utf8mb4_bin` in the JDBC URL.
 
-For more information, see [Collation used in JDBC connections](/faq/sql-faq.md#the-collation-used-in-jdbc-connections).
+For more information, see [Collation used in JDBC connections](/faq/sql-faq.md#collation-used-in-jdbc-connections).
 
 ### The character set (charset) errors when executing DDL operations
 

From 19353bd7323c959e43c74757a71e4a09e2d242ee Mon Sep 17 00:00:00 2001
From: Grace Cai
Date: Mon, 28 Apr 2025 13:41:52 +0800
Subject: [PATCH 14/15] Update develop/dev-guide-sample-application-java-jdbc.md

---
 develop/dev-guide-sample-application-java-jdbc.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/develop/dev-guide-sample-application-java-jdbc.md b/develop/dev-guide-sample-application-java-jdbc.md
index d757ac55b014a..8266e17bcc9ff 100644
--- a/develop/dev-guide-sample-application-java-jdbc.md
+++ b/develop/dev-guide-sample-application-java-jdbc.md
@@ -14,7 +14,6 @@ In this tutorial, you can learn how to use TiDB and JDBC to accomplish the follo
 - Connect to your TiDB cluster using JDBC.
 - Build and run your application. Optionally, you can find [sample code snippets](#sample-code-snippets) for basic CRUD operations.
 
-
 
 
 > **Note:**

From e7f8bfcf3dd0e50a86564733d845d5f5fe349066 Mon Sep 17 00:00:00 2001
From: Test User
Date: Mon, 28 Apr 2025 13:52:03 +0800
Subject: [PATCH 15/15] fix the link to `collation_connection`

---
 faq/sql-faq.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/faq/sql-faq.md b/faq/sql-faq.md
index 68e89c2c0ac28..334209b7dfd83 100644
--- a/faq/sql-faq.md
+++ b/faq/sql-faq.md
@@ -357,9 +357,9 @@ When `connectionCollation` is not configured in the JDBC URL, there are two scen
 
 ### How to handle collation changes after upgrading TiDB?
 
-In TiDB v7.4 and earlier versions, if `connectionCollation` is not configured, and `characterEncoding` is either not configured or set to `UTF-8` in the JDBC URL, the TiDB [`collation_connection`](/system-variable-reference.md#collation_connection) variable defaults to the `utf8mb4_bin` collation.
+In TiDB v7.4 and earlier versions, if `connectionCollation` is not configured, and `characterEncoding` is either not configured or set to `UTF-8` in the JDBC URL, the TiDB [`collation_connection`](/system-variables.md#collation_connection) variable defaults to the `utf8mb4_bin` collation.
 
-Starting from TiDB v7.4, if `connectionCollation` is not configured, and `characterEncoding` is either not configured or set to `UTF-8` in the JDBC URL, the value of the [`collation_connection`](/system-variable-reference.md#collation_connection) variable depends on the JDBC driver version. For example, for Connector/J 8.0.26 and later versions, the JDBC driver defaults to the `utf8mb4` character set and uses `utf8mb4_general_ci` as the connection collation. TiDB follows the driver, and the [`collation_connection`](/system-variable-reference.md#collation_connection) variable uses the `utf8mb4_0900_ai_ci` collation. For more information, see [Collation used in JDBC connections](#what-collation-is-used-in-a-jdbc-connection-when-connectioncollation-is-not-configured-in-the-jdbc-url).
When upgrading from an earlier version to v7.4 or later (for example, from v6.5 to v7.5), if you need to maintain the `collation_connection` as `utf8mb4_bin` for JDBC connections, it is recommended to configure the `connectionCollation` parameter in the JDBC URL.