-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
✨ Query endpoint #1643
base: main
Are you sure you want to change the base?
✨ Query endpoint #1643
Conversation
✅ Deploy Preview for olmv1 ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
1a96032
to
a27fada
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1643 +/- ##
==========================================
- Coverage 67.42% 66.22% -1.21%
==========================================
Files 55 57 +2
Lines 4632 4950 +318
==========================================
+ Hits 3123 3278 +155
- Misses 1284 1427 +143
- Partials 225 245 +20
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
17e0f9a
to
729645c
Compare
What's the relation of this PR to #1642? Which PR is a duplicate and which should be reviewed? |
729645c
to
478c152
Compare
@azych apologies for the confusion. You're just seeing the result of rapid experimentation. This is the one to review, but it's still a WIP. I've converted this PR to a draft. |
d8dcf61
to
f94eb2e
Compare
httpError(w, fs.ErrNotExist) | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should change to write a 200 status and write no content to the response body.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that index.Get()
was only ever returning true
(so this code path was never actually being hit). Something else solved the test case failure though, but I went ahead and cleaned up Get()
's signature 53692a9
storeMetaFuncs = append(storeMetaFuncs, storeIndexData) | ||
} | ||
|
||
var ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Throughout these lines where we setup and run concurrent channel writers and readers, we should add some comments to explain what is happening.
func serveJsonLines(w http.ResponseWriter, r *http.Request, modTime time.Time, rs io.ReadSeeker) { | ||
w.Header().Add("Content-Type", "application/jsonl") | ||
http.ServeContent(w, r, "", modTime, rs) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With my suggestion of backing away from the multi-read-seeker code, we'll have to have a different version of this function that the query handler can use (the reader returned by the index will no longer be a io.ReadSeeker, it'll just be a reader)
Unfortunately, if we don't use http.ServeContent
for the query handler, we also do not get the Last-Modified
/If-Modified-Since
header handling. Does the standard library have a way for us to implement this caching logic so that we avoid implementing bugs in our own implementation? If we copy code from the Go stdlib, let's just make sure it's license allows that and we do whatever attribution is necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was able to find a way to http.ServeContent
with io.Reader
79da4de
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copied the checkPreconditions
function and all the code associated with it from the net/http package in the http_preconditions_check.go
file. I did have to make modifications to the copied code too. Imho it's smells funky to me, it's a lot of copied code that we'll have to hope won't carry any bugs, especially with the modifications I did. But, for now it's in there.
|
||
func (m *mockStorageInstance) StorageServerHandler() http.Handler { | ||
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { | ||
w.Write([]byte(m.content)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we actually care what this content is, or do we just care about its length? If the latter, I'd suggest removing the testCompressableJSON
variable and replacing with with something like strings.Rand(sizeThreshold+1)
"github.com/operator-framework/operator-registry/alpha/declcfg" | ||
) | ||
|
||
type index struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add some GoDoc to index
to help our future selves understand how the indexing works.
type index struct { | |
// index is a simple index of sections of an FBC file used to lookup FBC blobs that | |
// match any combination of their schema, package, and name fields. A section is the | |
// byte offset and length of an FBC blob within the file. | |
// | |
// This index strikes a balance between space and performance. It indexes each field | |
// separately, and performs logical set intersections at lookup time in order to implement | |
// a multi-parameter query. | |
// | |
// Note: it is permissible to change the indexing algorithm later if it is necessary to | |
// tune the space / performance tradeoff. However care should be taken to ensure | |
// that the actual content returned by the index remains identical, as users of the index | |
// may be sensitive to differences introduced by index algorithm changes (e.g. if the | |
// order of the returned sections changes). | |
type index struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realized that we'd want robust test cases for the indexing algorithm too, but that too is probably a follow up in the interest of time.
return newMultiReadSeeker(srs...), true | ||
} | ||
|
||
func (i *index) getSectionSet(schema, packageName, name string) sets.Set[section] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing that occurs to me is that the indexed sections are always guaranteed to be sorted by the offset. We could likely take advantage of that fact and implement a more performant algorithm that can step directly through each slice rather than building a set for each slice, performing a set intersection, the converting back into a slice and resorting.
But this is something we can capture as a follow-up.
ee14624
to
9009472
Compare
// in os.ErrNotExists | ||
type filesOnlyFilesystem struct { | ||
FS fs.FS | ||
func serveJsonLines(w http.ResponseWriter, r *http.Request, modTime time.Time, rs io.ReadSeeker) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we have a separate function to serve data for the query handler, I think we should have the all handler directly call http.ServeFile
(after explicitly setting the "Content-Type")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about something like 046c66c. We get to retain the benefits of serverContent
that way.
9009472
to
aa78a6c
Compare
Signed-off-by: Joe Lanford <[email protected]>
Signed-off-by: Joe Lanford <[email protected]>
Signed-off-by: Joe Lanford <[email protected]>
Signed-off-by: Joe Lanford <[email protected]>
aa78a6c
to
3b6b263
Compare
Description
Implementation of Query Endpoint API RFC
Reviewer Checklist