Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(base): change base to SeaweedFS #1

Merged
merged 7 commits into from
Jan 22, 2024

Conversation

andrewazores
Copy link
Member

  • chore(base): change base to SeaweedFS
  • add customized config template and entrypoint script
  • add timed rebuilds

@andrewazores andrewazores requested a review from ebaron January 15, 2024 20:18
@andrewazores andrewazores added the feat New feature or request label Jan 15, 2024
@andrewazores andrewazores changed the title chore(base): change base to SeaweedFS feat(base): change base to SeaweedFS Jan 15, 2024
@tthvo
Copy link
Member

tthvo commented Jan 15, 2024

Just curious here but wondering what are the reasons to switch to seaweed here?

@andrewazores
Copy link
Member Author

A couple of things:

  • SeaweedFS is Apache licensed, same as our projects, which is more comfortable than Minio's AGPL
  • Configuration for SeaweedFS authentication/authorization is a bit simpler and easier. Should be feasible to hide it completely behind an auth proxy and run it with no auth on its own, though this at least breaks the presigned URL feature that we're currently using in 3.0, so that would need to be redesigned
  • (supposedly) better performance and data consistency guarantees

@tthvo
Copy link
Member

tthvo commented Jan 15, 2024

Oh i see! Thanks for explaining!!

though this at least breaks the presigned URL feature that we're currently using in 3.0, so that would need to be redesigned

I am pretty curious how this issue actually is going to work out. But make sense to me.

@andrewazores
Copy link
Member Author

andrewazores commented Jan 15, 2024

In 3.0 when the client asks to download a file, Cryostat puts the requested data into the S3 provider (seaweed) if necessary (ie for an active recording. For an archived recording it's already there so we skip this first step), then responds to the client with a redirect response to another endpoint. When the client hits that endpoint, Cryostat asks S3 to generate a presigned URL for downloading that file, which comes with some query parameters and things that ensure the request has not been tampered with and the client is only able to access that one file. Cryostat then responds to the client again with another redirect response, this time pointing it to the S3 presigned URL. The client finally goes and makes a request to this URL, including the query parameters, and S3 sends the file back to the client.

In 2.2 or 2.3 2.1 I implemented a system where the client requests to download a particular file and the server generates a JWT and returns that back to the client, and the client can then make a new request to a different URL and pass that JWT as a query parameter instead of passing an Authorization header. The JWT is signed and encrypted and embeds information about the resource being requested etc., so when Cryostat receives a request with a JWT and no Authorization header, it's able to check that the signature is still valid, decrypt the token, and validate that the data embedded in the JWT matches the request it was attached to. If it does then Cryostat sends back the file the client requested.

So if 3.0 can't use presigned URLs in the end due to the auth proxy architecture, then it'll just need to reimplement something like that 2.x JWT download flow. From the client's perspective they are actually quite similar already, so this is just internal implementation detail.

@tthvo
Copy link
Member

tthvo commented Jan 15, 2024

Understood now! Thanks for the detailed explanations!

@andrewazores
Copy link
Member Author

I still need to get around to deploying 3.0 with Seaweed in OpenShift and an auth proxy, set up the Services and Routes, and test how authentication between Cryostat and Seaweed works, and then how authentication between Seaweed and an external client (your browser) works.

  • we want presigned URL support, but this requires Seaweed to be configured with user accounts (not no-auth), like I added as a template config file in this PR
  • but if Seaweed requires authentication, are we able to pass those credentials through the auth proxy along with the credentials required for the auth proxy itself? ie does Cryostat need to somehow supply both OpenShift credentials to the proxy, and have the forwarded request also include Seaweed credentials?
  • we could probably avoid that requirement by exposing Seaweed through the proxy and a Route externally, and directly (no proxy) on a Service internally, but it would be even better security to always pass traffic through the proxy
  • we could also avoid that by not using an auth proxy in front of Seaweed at all and simply depending on strong generated secrets as in this PR's template config. This is not ideal since Seaweed is exposed directly internally and externally, but we can restrict that to a single port and with strong secrets for credentials

@tthvo
Copy link
Member

tthvo commented Jan 15, 2024

how authentication between Seaweed and an external client (your browser) works.

Actually, I am wondering if the client needs to authenticate with Seaweed at all if using presigned url. I thought the credentials will only be used to generate the url and specify the scope/action. Then, wouldn't it be a publicly available url to download anonymously?

Then, I guess seaweed can just be behind proxy with OpenShift credentials (some tweaks on web client to include the bearer when accessing presigned url) or no proxy at all. Or I might be missing sth.

@andrewazores
Copy link
Member Author

The client won't need Seaweed credentials when using a presigned URL, but it will still need to pass through the auth proxy with OpenShift credentials. That should be easy, I don't see any reason that won't look basically the same as how it already looks with oauth2-proxy in smoketest.

@tthvo
Copy link
Member

tthvo commented Jan 16, 2024

Sounds good! Thanks for the details ^^

@andrewazores
Copy link
Member Author

andrewazores commented Jan 17, 2024

@ebaron I think there's more work to do in here around tuning the volume server parameters - for some reason it didn't like the defaults of "no max (number of) volumes" and 1GB per volume, so over in cryostatio/cryostat-helm#111 I have it deployed using #4 with some smaller hardcoded limits that seem to work for a basic smoketest.

Otherwise, I think this is settled enough to use as a launching point for more work like cryostatio/cryostat-helm#111 , cryostatio/cryostat#241 , cryostatio/cryostat-operator#710 .

Dockerfile Show resolved Hide resolved
Copy link
Member

@ebaron ebaron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Dockerfile Show resolved Hide resolved
@andrewazores andrewazores merged commit 528cb40 into cryostatio:main Jan 22, 2024
1 check passed
@andrewazores andrewazores deleted the seaweed branch January 22, 2024 20:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat New feature or request
Projects
No open projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants