Faster/easier loading of types #2046
@jackc when tests run against cockroachdb, I see:
I don't have any experience with this database, and I don't see any mention of being incompatible with this function. If pgx needs to support cockroachdb, how would you suggest we do so? I could try using an alternate syntax, but I'd like your advice before doing so. |
All regression tests apart from cockroachdb are passing: https://github.com/nicois/pgx/actions/runs/9526586068 |
Force-pushed: b026acd to a9dbc1b
Regarding CockroachDB, if it is something that is not supported by that database engine then there is a helper method to skip tests. But as far as I know, I also was wondering if the loading of all custom types includes table types.
Regarding the loading of all types: yes, there is further work needed there. I've added a simple test and there are complications. I think this needs some experimentation/discussion, and it's probably better to cover that in its own PR. I will remove the public method to invoke the loading of all types from this PR. |
Force-pushed: 9fba6cf to f238d1d
It's passing with cockroachdb now. I've also reinstated the original code for |
Force-pushed: f238d1d to e7e89b3
I have refined the SQL further, to deal with more complex hierarchies of types/domains/arrays/composite types.
Force-pushed: 8b6bf61 to 924f798
I suppose there is no way around the big recursive query. But, wow, that is scary looking. I'd hate to have to debug that. 🤷 I also think the server version should be an internal function.
Sure, I'll make it internal. I thought it might be helpful for a client, but perhaps it's better to not expose it. That query isn't nice, and it's quite possible it could be optimised in some way too; it takes longer than I'd like it to. However, my upcoming pgxpool PR has support for caching this mapping (for non-replicated environments), meaning the overhead is only felt once per invocation. |
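The "once per invocation" caching mentioned above could look roughly like the following. This is a minimal sketch, not code from pgx or from the pgxpool PR: `typeInfo`, `typeCache`, and the `loader` function are hypothetical names, and it assumes every connection reaches servers that share the same OID mapping.

```go
package main

import (
	"fmt"
	"sync"
)

// typeInfo is a hypothetical stand-in for the per-type data returned by the
// expensive introspection query.
type typeInfo struct {
	Name string
	OID  uint32
}

// typeCache runs the loader at most once and hands the same result (or the
// same error) to every later caller.
type typeCache struct {
	once  sync.Once
	types []typeInfo
	err   error
}

// get returns the cached types, invoking loader only on the first call.
func (c *typeCache) get(loader func() ([]typeInfo, error)) ([]typeInfo, error) {
	c.once.Do(func() {
		c.types, c.err = loader()
	})
	return c.types, c.err
}

func main() {
	calls := 0
	loader := func() ([]typeInfo, error) {
		calls++ // stands in for the slow SQL round trip
		return []typeInfo{{Name: "my_enum", OID: 16385}}, nil
	}

	var cache typeCache
	for i := 0; i < 3; i++ { // three simulated new connections
		if _, err := cache.get(loader); err != nil {
			fmt.Println("load failed:", err)
			return
		}
	}
	fmt.Println("loader calls:", calls)
}
```

Note that `sync.Once` also caches a failed load, so a real implementation would probably want a retry path; the sketch ignores that.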
I'll add a few comments to the scary SQL to make it a little less nightmarish. |
Force-pushed: 924f798 to ca18af4
I've changed the |
Hmmm... The functionality looks good, but I'm not so sure about the interface. At first glance, I didn't like Then I think I understand at this point why Upon further reflection, I also question if So here's some suggestions for your consideration. I'm not sure if they are good ideas, but I think they are worth looking into.
Also, one other random thing. I noticed |
Force-pushed: ca18af4 to 06c0451
I've addressed the feedback.
There are a couple of reasons I didn't change the original
Regarding the use of the simple protocol: when I remove this, at least when running locally the test fails:
I am quite new to pgx so perhaps I am missing something obvious which would allow the connection to work in the default mode. |
Force-pushed: c2503dd to f68972a
Regarding scanning I like the progress that is being made here. But I've been reviewing this PR, looking back at the existing pgx type code, and mulling it over. And I think there might be more fundamental improvements that can be made. As I understand it, you are trying to improve two issues. 1. This PR is making loading types faster. 2. The pgxpool PR is trying to make it unnecessary to introspect the database for each connection. They are somewhat related and improvements in one may affect what is desirable or necessary for the other. Regarding loading types faster, as we've discussed above, I consider it unfortunate that we end up with 2 similar things ... 🤔 ... 🤔 ... 🤔 ... Actually, wait a second. Why do we need the My previous observation was wrong:
It is possible to safely share the I think this means this PR and the pgxpool PR can be significantly simplified. We still need the complicated SQL query, but I think much of the rest could go away. Here's an idea: We add a
Add another small change and we also solve the This should allow for all type introspection to be done in a single SQL query regardless of how many types or what kinds of types are used. Regarding the pgxpool change to avoid having to introspect the database for each connection, I'm not sure how much of that would still be necessary either. Since
Ok, I see where you're going with this. I will try to avoid the need to define a second type, if I can. Regarding pgxpool: the type map has to be populated for each connection, doesn't it? And there is still the need to support situations where the pool does not always connect to servers sharing the same OID mapping. Assuming that, there are wins in making the custom and automatic loaders easier to define, and in caching the results when it's safe to do so. In any case, it's good to keep the PRs separate: the first one is needed regardless, and once it's solid, it's easier to look at the incremental benefit of the other.
Force-pushed: f68972a to 4660f1d
I've made some changes which didn't take very long at all, and am pretty happy with the results:
I've tried removing the simple protocol designator, then using |
I mentioned this idea on #2056, but I will restate it here. What if This would remove the need to manually load types after connection, and would probably remove most uses of the after connect hooks provided by pgxpool and stdlib. It would also remove the need to have any new functions or methods. The only new interface would be additional fields on This would be an even simpler interface for loading types. |
Force-pushed: 4660f1d to 80ddeed
@jackc I have tried to modify these 3 PRs to minimise the change in the interface, as you suggested. It did work out a bit less messy than I anticipated, so let me know what you think. |
Force-pushed: abda9f0 to 22fb4e4
Thanks for making all these changes. I still think this configuration should not live on the pool. But now actually seeing it on the connection and especially seeing how it complicates #2048 ( Here's what I think we should do. Let's go back to some of your original design.
I'll merge just that core functionality. Then we can evaluate what is the most convenient interface for exposing the functionality. With regard to that convenient interface, the pgxpool interface and possibly configuration on |
OK, I will undo those changes. It is hard to get this right; there are competing desires and we have to choose a single implementation to service them all. There is one refinement I think I need to make to the autoloader, which I am testing locally at the moment: I believe that we need to register both the namespace and non-namespace name with the type map. I have been running into trouble where some types are defined in namespaces not in the search path, and I think this is needed. It would be nicer if a single type could be assigned multiple names (to result in a single OID being associated with a single typemap entry), but I don't think this is currently possible. |
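Registering a type under both its bare and its schema-qualified name can be illustrated with a toy registry. This is a minimal sketch under stated assumptions: `registry` and `registerBoth` are hypothetical names that only map names to OIDs, and they are not pgx's actual type map.

```go
package main

import "fmt"

// registry is a hypothetical, simplified stand-in for a type map: it only
// maps type names to OIDs.
type registry struct {
	byName map[string]uint32
}

func newRegistry() *registry {
	return &registry{byName: make(map[string]uint32)}
}

// registerBoth records the OID under the bare type name and under the
// schema-qualified name, so a lookup succeeds whether or not the schema is
// in the search path.
func (r *registry) registerBoth(schema, name string, oid uint32) {
	r.byName[name] = oid
	r.byName[schema+"."+name] = oid
}

func main() {
	r := newRegistry()
	r.registerBoth("billing", "invoice_status", 16390)

	// Both spellings resolve to the same OID.
	fmt.Println(r.byName["invoice_status"], r.byName["billing.invoice_status"])
}
```

One consequence of this design: if two schemas define a type with the same bare name, the bare entry is overwritten by whichever is registered last, while the qualified entries stay distinct.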
Force-pushed: 84103a1 to dcb32f9
I've pushed the earlier version back, with the duplicate creation of types for the namespaced version of each type's name. You can more easily see this here: https://github.com/jackc/pgx/compare/4660f1d7546adb2f6091625ed9fe42f393daf522..dcb32f94021386bf4f450b858281687faa4a0641 |
I made a few notes above in the review, but none of them are critical. We also might want to add a test for the namespaced type path.
But I think this is mergeable at this point. Just let me know if you are planning on making any more changes.
supportsMultirange := (pgVersion >= 14)
var typeNamesClause string

if typeNames == nil {
Is this branch still valid? As far as I can see the only caller, LoadTypes, enforces len(typeNames) > 1.
I would prefer this branch to be kept, as the alternative is to fail at runtime if a nil is passed in. It is an unexercised branch right now, and the compiler will optimise it away. If there were a way to make it a compile-time error if a nil was passed in, it would be even better.
I have modified this branch to run, but not match any records. This shouldn't usually happen, but at least avoids a runtime error.
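The "run, but not match any records" behaviour can be sketched as a clause builder that falls back to an always-false predicate. This is only an illustration: `buildTypeNamesClause` is a hypothetical helper, and filtering on the `typname` column with positional placeholders is an assumption, not the SQL used in this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// buildTypeNamesClause is a hypothetical helper that returns a SQL predicate
// and the arguments for its placeholders.
func buildTypeNamesClause(typeNames []string) (string, []any) {
	if len(typeNames) == 0 {
		// Nothing was requested: an always-false predicate keeps the query
		// valid and makes it return zero rows instead of failing at runtime.
		return "false", nil
	}
	placeholders := make([]string, len(typeNames))
	args := make([]any, len(typeNames))
	for i, name := range typeNames {
		placeholders[i] = fmt.Sprintf("$%d", i+1)
		args[i] = name
	}
	return "typname IN (" + strings.Join(placeholders, ", ") + ")", args
}

func main() {
	clause, args := buildTypeNamesClause(nil)
	fmt.Println(clause, len(args))

	clause, args = buildTypeNamesClause([]string{"my_enum", "my_composite"})
	fmt.Println(clause, args)
}
```

A nil slice and an empty slice are treated the same way here, since `len` is zero for both.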
derived_types.go
for i, fieldName := range ti.Attnames {
	//if fieldOID64, err = strconv.ParseUint(composite_fields[i+1], 10, 32); err != nil {
	//	return nil, fmt.Errorf("While extracting OID used in composite field: %w", err)
	//}
Should this be removed?
I'll address your feedback above, then let you know. We can follow up further if need be.
Force-pushed: dcb32f9 to 7f85b19
ok, this should be good to merge now. |
When loading even a single type into pgx's type map, multiple SQL queries are performed in series. Over a slow link, this is not ideal. Worse, if multiple types are being registered, this is repeated multiple times. This commit adds LoadTypes, which can retrieve type mapping information for multiple types in a single SQL call, including recursive fetching of dependent types. RegisterTypes performs the second stage of this operation.
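Why dependent types matter for the second stage can be shown with a small ordering sketch: a type can only be registered once the types it is built from are known. The code below is an illustration under stated assumptions, not the pgx implementation. `loadedType` and `registrationOrder` are hypothetical names, the input is assumed to be free of cycles, and dependencies missing from the input are treated as already-known built-in types.

```go
package main

import "fmt"

// loadedType is a hypothetical description of one type returned by a batched
// introspection query.
type loadedType struct {
	Name      string
	DependsOn []string // e.g. an array's element type or a composite's field types
}

// registrationOrder returns the type names ordered so that every type comes
// after the types it depends on (a depth-first topological sort).
func registrationOrder(types []loadedType) []string {
	byName := make(map[string]loadedType, len(types))
	for _, t := range types {
		byName[t.Name] = t
	}

	visited := make(map[string]bool, len(types))
	var order []string
	var visit func(name string)
	visit = func(name string) {
		t, ok := byName[name]
		if !ok || visited[name] {
			return // unknown names are assumed to be built-in types
		}
		visited[name] = true
		for _, dep := range t.DependsOn {
			visit(dep)
		}
		order = append(order, name)
	}
	for _, t := range types {
		visit(t.Name)
	}
	return order
}

func main() {
	// An array of a composite whose field uses a domain, listed in the
	// "wrong" order on purpose.
	types := []loadedType{
		{Name: "_order_line", DependsOn: []string{"order_line"}},
		{Name: "order_line", DependsOn: []string{"positive_int"}},
		{Name: "positive_int"},
	}
	fmt.Println(registrationOrder(types)) // [positive_int order_line _order_line]
}
```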
👍 |
As discussed in #2030, improve the LoadType method to use a single SQL call.
Add LoadTypes() which adds multiple types in a single call, and supports recursively loading of other implicitly-required types in the same SQL call.