-
Notifications
You must be signed in to change notification settings - Fork 286
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for uuid7 or ulid #528
Comments
I'm assuming something like this would generally not need ongoing partition generation? As in, you'd define your child tables up front for each block of values and then be done? That's pretty well supported without needing something like pg_partman which primarily provides ongoing maintenance features. Or is there some scenario where you see that being needed with this data type? |
Also, is this data type supported in PG? Don't see it specifically listed here in the uuid data types. |
Both UUIDv7 and ULID encode a unix timestamp which allow partitioning by time (hours, day, etc...) so the idea is to be able to use time partitioning using a unique identifier. Also, both UUIDv7 and ULID use UUID format and therefore are compatible with the PostgreSQL |
Is it possible to decode the time value out of that? Otherwise, how do you see defining the partitioning interval on this? Not quite that familiar with UUID usage myself. |
Absolutely, see for the format: https://www.ietf.org/archive/id/draft-peabody-dispatch-new-uuid-format-01.html#section-4.4 In a nutshell because the first 36-bit of the 128-bit UUID is a big-endian unsigned Unix Timestamp value, I suppose we can easily define the partitions range by time. Also found https://github.com/fboulnois/pg_uuidv7 |
Ok, I see how it might work now, but it's a pretty specialized request. I have a lot of other stuff I'd like to get done first (see open issues) before I'd really have time to consider looking into adding this myself. However, after 5.x is out, if you'd like to work on a PR to support this before I would have time to look into this, I'd be happy to review it. |
I came here to ask for a similar feature, but from a different angle. UUID7/ULID is a good identifier that is time-based and hence grows monotonically. It’s very natural to use it not just as Primary Key, but also for partitioning based on timestamp derived from it. One way to solve this would have been maintaining a generated/unstored column that would extract timestamp from the UUID, but it would bring in extra effort for the application to account for the generated column on the read path, not to mention changing the PK to include it. I thought a generic solution to this would be a way for pgpartman to support another way of declaring partitions: not by a date-type column, but by a function, so that the function would take the PK(s) and return the applicable time range for the partitioning. |
@rauanmayemir declaring functions by a function seems like a brilliant idea. |
Hi, is there any solution to this problem? |
@linglom1 you may want to consider timescaledb that supports the described scheme out of the box. |
I have a similar requirement where we use time ordered uids similar to twitter's snowflake ids. This practice is becoming pretty common in many places. I was also thinking along the lines of @rauanmayemir 's suggestion to externalize the logic by a function, except something lightweight in the form of "encoder" function that encodes a timestamp as per one's id specification. This way the time range generation logic itself can continue to be leveraged from the extension. |
@keithf4 bumping this up, let me know if you're open for a PR along the above lines ^ |
I'm certainly open to a PR. Thank you! |
Hey @keithf4, wondering if you had a chance to look at this? |
Sorry I have not yet. Pretty busy with some other projects at the moment. Hoping to get back to partman more in the next month or so |
Version 5.2.0 has been released. Thanks to everyone that helped get this feature implemented! |
This is a much better solution to what I opened #388 to discuss, especially since I have been using timestamptz encode/decode functions for other things. Having the extension ready to go for mainstream UUIDv7 support in PG18 is also fantastic. Unfortunately, my snowflake ID columns are bigint, which is not a control column type allowed by the changes added in #683. Up to this point I've been using non-epoch ID partitioning, with large values for interval and retention. Enough data comes in that unexpected gaps almost never happen. To make it time-based, I propose a special type value for |
@calebj |
In addition to time and id range it would be nice to have support for uuid7 or ulid (Universally Unique Lexicographically Sortable Identifiers).
Thanks!
The text was updated successfully, but these errors were encountered: