Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot bulk insert from one local table to another local table if one has an identity #7887

Open
nick-oehmen opened this issue Feb 5, 2025 · 4 comments
Assignees

Comments

@nick-oehmen
Copy link

Hello! While implementing Citus distributed and reference tables, we encountered an issue inserting data from one local table to another when the target local table has an identity. Repro steps below:

Citus version: Citus 12.1.6 on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit

--create a test schema
create schema test;

-- add our first local table
create table test.local1(
    id text not null primary key
);

-- add our reference table
create table test.reference1(
    id int not null primary key,
    reference_col1 text not null
);

-- apply the reference table type
select create_reference_table('test.reference1');

-- add our second local table
create table test.local2(
    id int not null generated always as identity,
    local1fk text not null,
    reference1fk int not null,
    constraint loc1fk foreign key (local1fk) references test.local1(id),
    constraint reference1fk foreign key (reference1fk) references test.reference1(id),
    constraint testlocpk primary key (id)
);

-- create a distributed table
create table test.distributed1(
    id int not null generated always as identity,
    distro_key int not null,
    constraint testdistro1pk primary key (id, distro_key)
);

-- apply distribution on the distro key
select create_distributed_table('test.distributed1', 'distro_key');

-- add our first local table values
insert into test.local1(id) values ('aaaaa'), ('bbbbb'), ('ccccc');

-- add our reference data
insert into test.reference1(id, reference_col1) values (1, 'test'), (2, 'test2'), (3, 'test3');

-- add data to our distributed table
insert into test.distributed1(distro_key) values (1), (1), (1), (2), (2), (3);

-- we confirm that we have two local tables and one reference table
select
    a.table_name,
    b.citus_table_type
from information_schema.tables a
    left join citus_tables b on b.table_name::text = concat(a.table_schema::text, '.', a.table_name::text)
where a.table_schema = 'test'
	and a.table_type = 'BASE TABLE'
order by a.table_name

-- bulk insert from our first local table into our second
insert into test.local2(local1fk, reference1fk)
select id, 1 from test.local1

-- why does this fail?  invalid string enlargement request size: -4

At first I assumed this might be an unsupported operation, but I tried it again in the exact same scenario in which the local table has a serial PK rather than an identity, and did not receive the error. As a result, I suspect this is a bug related to support of identities.

Successful test with serial instead:

--create a test schema
create schema testserial;

-- add our first local table
create table testserial.local1(
    id text not null primary key
);

-- add our reference table
create table testserial.reference1(
    id int not null primary key,
    reference_col1 text not null
);

-- apply the reference table type
select create_reference_table('testserial.reference1');

-- add our second local table
create table testserial.local2(
    id bigserial,
    local1fk text not null,
    reference1fk int not null,
    constraint loc1fk foreign key (local1fk) references testserial.local1(id),
    constraint reference1fk foreign key (reference1fk) references testserial.reference1(id),
    constraint testlocpk primary key (id)
);

-- create a distributed table
create table testserial.distributed1(
    id bigserial,
    distro_key int not null,
    constraint testdistro1pk primary key (id, distro_key)
);

-- apply distribution on the distro key
select create_distributed_table('testserial.distributed1', 'distro_key');

-- add our first local table values
insert into testserial.local1(id) values ('aaaaa'), ('bbbbb'), ('ccccc');

-- add our reference data
insert into testserial.reference1(id, reference_col1) values (1, 'test'), (2, 'test2'), (3, 'test3');

-- add data to our distributed table
insert into testserial.distributed1(distro_key) values (1), (1), (1), (2), (2), (3);

-- we confirm that we have two local tables and one reference table
select
    a.table_name,
    b.citus_table_type
from information_schema.tables a
    left join citus_tables b on b.table_name::text = concat(a.table_schema::text, '.', a.table_name::text)
where a.table_schema = 'testserial'
	and a.table_type = 'BASE TABLE'
order by a.table_name

-- bulk insert from our first local table into our second
insert into testserial.local2(local1fk, reference1fk)
select id, 1 from test.local1

-- why did this work, but an identity did not?
@colm-mchugh
Copy link
Contributor

@nick-oehmen we need to investigate and fix the error you are hitting in your first test, but as a workaround could you omit the reference1fk constraint from the CREATE TABLE test.local2 statement and add it after test.local2 has been populated with the INSERT from test.local1 ?

@nick-oehmen
Copy link
Author

Thank you for the response @colm-mchugh! The example I provided is a pretty simple one just to demonstrate the issue, the actual issue is occurring in a much larger DB as we're testing implementing distribution prior to pushing to the live production DB. The bulk insert operations occur throughout the DB in stored procedures and there's no way to prevent them from running after distribution is implemented, if anything this may just delay our implementation of table distribution until this issue is fixed. Do you happen to know a timeline for resolution?

@colm-mchugh
Copy link
Contributor

Do you happen to know a timeline for resolution?

We will aim to fix this in Citus 13.0.1, so approx. 6-8 weeks from now.

@m3hm3t m3hm3t self-assigned this Feb 17, 2025
@m3hm3t
Copy link
Contributor

m3hm3t commented Feb 19, 2025

Analysis suggests that in Citus’s multi-copy logic, the row descriptor becomes mismatched for the identity column. Instead of allowing Postgres to handle the default identity value, Citus incorrectly maps the returning SELECT columns onto the table columns. This leads to corrupted data in the slot, resulting in the meltdown within textsend/enlargeStringInfo().

Actual Behavior

Citus triggers a corruption in the row descriptor for the local table’s identity column. The code attempts to call textsend on a non-text datum (or interprets memory incorrectly).

Technical Analysis

  • Stack Trace Excerpts show multi_copy.c -> AppendCopyRowData -> textsend -> enlargeStringInfo(...).
  • Debugger Logs indicate that for the identity column, the code sets up int4send or tries to skip it, but the actual valueArray[] gets mismatched with the text column.
  • This mismatch typically arises in partial inserts where only 2 columns are provided for a table of 3 columns, and the identity is not explicitly listed.
  • Using a SERIAL column instead of IDENTITY avoids the bug, implying that Citus’s logic for partial inserts has been well-tested for serial-like defaults but not for identity definitions.

Why SERIAL Works but IDENTITY Fails

  • SERIAL is simply INT + DEFAULT nextval(...). Citus historically handles partial inserts that omit SERIAL columns without trouble.
  • IDENTITY (especially GENERATED ALWAYS) uses a different internal catalog mechanism (attidentity), so the Citus code path that aligns columns may not treat it as a default properly.

Potential Fix Outline

  • In Citus’s multi_copy.c, ensure that identity columns omitted from the INSERT target list are treated the same as columns with a default.
  • Confirm that column → slot assignment in partial inserts does not shift or misalign the rest of the columns.

References

  • Postgres error message chain: textsend() -> pq_sendtext() -> enlargeStringInfo() -> “invalid string enlargement request size: -4”.
  • Citus code paths: CitusSendTupleToPlacements, AppendCopyRowData, AddSlotToBuffer in multi_copy.c.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants