Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MySQL Emoji character replicate "?" for PostgreSQL #157

Open
giant-ksj opened this issue Apr 13, 2023 · 4 comments
Open

MySQL Emoji character replicate "?" for PostgreSQL #157

giant-ksj opened this issue Apr 13, 2023 · 4 comments
Assignees
Labels

Comments

@giant-ksj
Copy link

giant-ksj commented Apr 13, 2023

Describe the bug
A clear and concise description of what the bug is.

  1. Source Data (MySQL v8)
    mysql> select * from test;
    +----+------+
    | id | nm |
    +----+------+
    | 2 | abc |
    | 3 | def |
    | 4 | def |
    | 5 | 😥 |
    +----+------+

  2. init_replica & start_replica

  3. Target Data (PostgreSQL v13)
    postgres> select * from test;
    id | nm
    ----+-----
    2 | abc
    3 | def
    4 | def
    5 | ? <- Emoji character print "?"

  4. Inset new Emoji on MySQL
    mysql> insert into test values(6, '😥');
    mysql> select * from test;;
    +----+------+
    | id | nm |
    +----+------+
    | 2 | abc |
    | 3 | def |
    | 4 | def |
    | 5 | 😥 |
    | 6 | 😥 |
    +----+------+

  5. PostgreSQL print correct Emoji.
    postgres> select * from test;
    id | nm
    ----+-----
    2 | abc
    3 | def
    4 | def
    5 | ?
    6 | 😥

I tried file and direct options when execute init_replica. but both option has same problem.
After updated id=5 row on MySQL, then PostgreSQL print Emoji correctly.

mysql> alter table test add col varchar(10);
mysql> update test set col = 'help' where id = 5;
mysql> select * from test;
+----+------+------+
| id | nm | col |
+----+------+------+
| 2 | abc | NULL |
| 3 | def | NULL |
| 4 | def | NULL |
| 5 | 😥 | help |
| 6 | 😥 | NULL |
+----+------+------+

postgres> select * from test;
id | nm | col
----+-----+--------
2 | abc | (null)
3 | def | (null)
4 | def | (null)
6 | 😥 | (null)
5 | 😥 | help

To Reproduce
Steps to reproduce the behavior:

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your problem.
image

image

Environment(please complete the following information):

  • OS: [CentOS7]
  • MySQL Version[8.0.19]
  • PostgreSQL Version[13]
  • Python Version [3.6]
  • Cloud hosted database [self managed vm]

Additional context
Add any other context about the problem here.

@the4thdoctor the4thdoctor self-assigned this Apr 16, 2023
@the4thdoctor
Copy link
Owner

I'll try to install mysql 8 and check what's going on.
Can you provide me with the create table statement please?

@giant-ksj
Copy link
Author

@the4thdoctor

CREATE TABLE test (
id int NOT NULL,
nm varchar(10) COLLATE utf8mb4_general_ci DEFAULT NULL,
PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci

@splainez
Copy link

The problem is that using the configuration copy_mode: 'file' and the mysql connection charset as charset: 'utf8mb4', it gives the following error when running it

MainProcess CRITICAL mysql_lib.py (1585): init replica for source mysql failed
Traceback (most recent call last):
  File "/workspaces/pg_chameleon/venv/bin/chameleon.py", line 58, in <module>
    getattr(replica, args.command)()
  File "/workspaces/pg_chameleon/venv/lib/python3.9/site-packages/pg_chameleon/lib/global_lib.py", line 357, in init_replica
    self.__init_mysql_replica()
  File "/workspaces/pg_chameleon/venv/lib/python3.9/site-packages/pg_chameleon/lib/global_lib.py", line 367, in __init_mysql_replica
    self.mysql_source.init_replica()
  File "/workspaces/pg_chameleon/venv/lib/python3.9/site-packages/pg_chameleon/lib/mysql_lib.py", line 1565, in init_replica
    self.__copy_tables()
  File "/workspaces/pg_chameleon/venv/lib/python3.9/site-packages/pg_chameleon/lib/mysql_lib.py", line 826, in __copy_tables
    master_status = self.copy_data(schema, table)
  File "/workspaces/pg_chameleon/venv/lib/python3.9/site-packages/pg_chameleon/lib/mysql_lib.py", line 659, in copy_data
    csv_file = codecs.open(out_file, 'wb', self.charset)
  File "/usr/lib/python3.9/codecs.py", line 910, in open
    info = lookup(encoding)
LookupError: unknown encoding: utf8mb4

To fix this change the configuration to copy_mode: 'direct' and keep charset: 'utf8mb4'.

It seems you are trying to create the file with the mysql charset so these charsets are not valid.

@giant-ksj
Copy link
Author

@splainez
Thank you. In my case mysql:charset is 'utf8' and mysql:copy_mode is 'file'.
I changed that utf8 -> utf8mb and file -> direct. It works! I will test more case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants