Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WAL: encrypt segment ranges #64

Draft
wants to merge 4 commits into
base: TDE_REL_17_STABLE
Choose a base branch
from

Conversation

dAdAbird
Copy link
Member

@dAdAbird dAdAbird commented Feb 13, 2025

WIP

Encryption and decryption works. Server would create WAL internal key during the start if needed (if wal_encryption GUC has changed)

TODO:

  • Refactoring
  • Comments
  • Docs

@it-percona-cla
Copy link

it-percona-cla commented Feb 13, 2025

CLA assistant check
All committers have signed the CLA.

Pass additional args to seg_read/seg_write functions
so they don't rely on page header to calculate IV offsets
We don't rely on page headers anymore. All nececary encryption data
now in internal keys
@@ -58,6 +58,8 @@ extern PGDLLIMPORT int wal_decode_buffer_size;

extern PGDLLIMPORT int CheckPointSegments;

extern PGDLLIMPORT bool wal_need_seg_switch;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover variable? I see no use anywhere.

{
if (key == NULL)
/*
* Check if the key's range overlaps with the buffer's and decypt
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the optimal case, but if we can have multiple keys in the same segment file, not necessarily true. We can turn on / off / on / off / on / off ... wal encryption, and this function will have to deal with scenarios where the specified range is encrypted by many keys in this case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works with multiple keys. We decrypt everything up to the key's end_lsn or the buffer end (whatever comes first). And the go to the next key and so on and so forth.
The implementation here isn't the most efficient though. See my reply to your loop comments below.

XLogSegNoOffsetToRecPtr(segno, offset + count, wal_segsz_bytes, data_end);

curr_key = keys;
while (curr_key)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This while loop (and generally the use of a linked list here) seems to be far from optimal, especially for a long running server.

  • We can assume that usually we want to read the more recent records - iterating from the last key would make more sense
  • We can assume that multiple keys to the seq read function usually read sequentially. The key for the second call will be most likely the same key as the first key, or if not, the next one.
  • And if we have many keys, using a vector and logarithmic search would be much better.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I know. This is one of TODOs, but I forgot to put it in writing.
Linked lists are rubbish and the implementation here is far from ideal. We shouldn't go through the key but rather pick buffer the needed range(s). But as you said, there has to be some b-tree or heap or whatever log data structure for that. There plethora of nuances that can be improved. But I'd like to make it work altogether and then improve from there.

PG_TDE_DECRYPT_DATA(iv_prefix, dec_off,
(char *) buf + (offset - dec_off),
dec_sz, (char *) buf + (offset - dec_off),
curr_key->key);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing break here if we the current lsn of the key is > end lsn of data?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

if (key == NULL)
{
ereport(ERROR,
(errmsg("WAL encryption can only be enabled with a properly configured key. Disable pg_tde.wal_encrypt and create one using pg_tde_crete_wal_key() before enabling it.")));
(errmsg("WAL encryption can only be enabled with a properly configured principal key. Disable pg_tde.wal_encrypt and create one using pg_tde_set_server_principal_key() before enabling it.")));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

default principal key also should work.

#define TDE_KEY_TYPE_HEAP_BASIC 0x01
#define TDE_KEY_TYPE_SMGR 0x02
#define TDE_KEY_TYPE_GLOBAL 0x04
#define TDE_KEY_TYPE_WAL_DECRYPTED 0x08
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unencrypted?

@dutow
Copy link
Collaborator

dutow commented Feb 17, 2025

Also, in this implementation we only generate a new internal key for WAL on server restart, if the encryption status is changed - the server has to be started with wal encryption off, shut down then again with wal encryption on, as it is only done in the initialization code after changes.

When we discussed ideas, I thought the concept was to have a different internal key for each wal segment file, to make it more robust/secure.

@@ -144,6 +144,7 @@ bool XLOG_DEBUG = false;

int wal_segment_size = DEFAULT_XLOG_SEG_SIZE;

bool wal_need_seg_switch = false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to swtich segments? I thought with the new solution we would not need to do that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a leftover, I'll remove it

RelKeyData *key;

struct WALKeyCacheRec *next;
} WALKeyCacheRec;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the plan to handle timelines? Do they cause any issues?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the thing a wanted to discuss today

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants