
feat(server): Support dynamic SAVE path #4729

Open · wants to merge 4 commits into main from mkaruza/github#4660
Conversation

@mkaruza (Contributor) commented Mar 7, 2025

Add functionality to use the SAVE and BGSAVE commands with a dynamic cloud storage path.

The new syntax (as discussed with @romange) is:

`SAVE [RDB|DF] [CLOUD_URI] [BASENAME]`, where CLOUD_URI must start with an S3 or GCS prefix (`s3://` or `gs://`).

For example, with the working directory pointing to a local folder, executing `SAVE DF s3://bucket/snapshots my_snapshot` will save the snapshot to `s3://bucket/snapshots` with the basename `my_snapshot`.
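For quick reference, these are the invocations this enables; the `gs://` line is a hypothetical example of the same syntax with a GCS prefix:

```
SAVE                                          # local snapshot under --dir, as before
SAVE DF s3://bucket/snapshots                 # DF snapshot written to the cloud path
SAVE DF s3://bucket/snapshots my_snapshot     # same, with basename my_snapshot
BGSAVE DF gs://bucket/snapshots my_snapshot   # BGSAVE follows the same syntax
```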

Resolves #4660

@mkaruza requested a review from adiholden on March 7, 2025 12:36
@mkaruza force-pushed the mkaruza/github#4660 branch 2 times, most recently from 127fa72 to 759b39c on March 7, 2025 12:44
@romange (Collaborator) commented Mar 9, 2025

I think there is a misunderstanding about the feature, so I would like to clarify it.

We need to support the following: when dragonfly is configured with `dragonfly --dir /disk/local/snapshots`, we should still be able to save into an arbitrary cloud path using `save df s3://mybucket/path` (ignoring `dir`). Regular SAVE will work as usual and use `dir` to write local snapshots.

The challenge here is that the file system controller is created upon server initialization based on `dir`, and it chooses its filesystem type at that point. Once it has been set up for, say, the local fs, it cannot be used for saving into the cloud. So today, with `dragonfly --dir /disk/local/snapshots` you cannot save to `s3://mybucket/path`, and with `dragonfly --dir s3://bucket1/` you can save into `s3://bucket2/anotherdir` but not into `gs://googlebucket/path`. So the goal is to create a temporary controller during `save df`. We should do the same for DFLY LOAD in the future, and maybe even remove `save_controller_` altogether and always use a temporary one.
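A minimal sketch of that idea, using the names that appear in this PR; `SnapshotStorage`, `MakeCloudStorage`, and the exact member list of `SaveCmdOptions` are illustrative assumptions, not the actual Dragonfly API:

```cpp
#include <memory>
#include <string_view>

// Illustrative declarations only -- the real types live in the Dragonfly codebase.
struct SnapshotStorage;                                                   // local fs, S3 or GCS backend
bool IsS3Path(std::string_view path);                                     // prefix check, defined elsewhere
bool IsGCSPath(std::string_view path);                                    // prefix check, defined elsewhere
std::shared_ptr<SnapshotStorage> MakeCloudStorage(std::string_view uri);  // hypothetical factory

struct SaveCmdOptions {
  bool new_version = false;     // RDB vs DF format
  std::string_view cloud_uri;   // set when SAVE/BGSAVE received an s3:// or gs:// argument
  std::string_view basename;    // optional basename override
};

// Pick the storage for a single save: a cloud URI in the command overrides the
// storage created from --dir at server startup, so a server with a local --dir
// can still write this one snapshot to the cloud.
std::shared_ptr<SnapshotStorage> SelectSnapshotStorage(
    const SaveCmdOptions& opts, std::shared_ptr<SnapshotStorage> default_storage) {
  if (!opts.cloud_uri.empty() && (IsS3Path(opts.cloud_uri) || IsGCSPath(opts.cloud_uri)))
    return MakeCloudStorage(opts.cloud_uri);  // temporary, per-save storage
  return default_storage;                     // regular SAVE keeps using --dir
}
```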

mkaruza added 2 commits March 10, 2025 08:17
Add functionality to use the SAVE and BGSAVE commands with a dynamic path.
The third argument to these two commands was previously used to change the
basename of the snapshot; now we also support writing to different locations
(directories).

For example, the command `SAVE DF snapshots/my-snapshot` will write to the
directory `snapshots` with the basename `my-snapshot`. If that directory
doesn't exist, it will be created relative to the configured working directory.

When cloud storage is used, the same logic is applied: executing the previous
command writes snapshots to the `s3://bucket/my-snapshot` folder with the
basename `my-snapshot`.

Resolves #4660

Signed-off-by: mkaruza <[email protected]>
@mkaruza force-pushed the mkaruza/github#4660 branch from 759b39c to efb4351 on March 10, 2025 12:58
@romange self-requested a review on March 10, 2025 13:24
@romange (Collaborator) left a comment


LGTM; what's missing is a test in snapshot_test.py. We already have S3 snapshot tests, so let's add one that uses a local dir but saves into the cloud.

}
snapshot_storage_ = std::move(gcs);

if (IsS3Path(flag_dir) || IsGCSPath(flag_dir)) {

Collaborator:

save_stages_controller defines an internal IsCloudPath function. Consider exposing it and using it here.
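A sketch of the suggested refactor, assuming `IsCloudPath` is (or can be) implemented as the disjunction of the two existing prefix checks:

```cpp
// save_stages_controller.h (sketch): promote the internal helper so callers can reuse it.
bool IsCloudPath(std::string_view path);

// save_stages_controller.cc (sketch): assumed implementation in terms of the existing checks.
bool IsCloudPath(std::string_view path) {
  return IsS3Path(path) || IsGCSPath(path);
}

// Call site from this diff, rewritten to use the shared helper:
if (IsCloudPath(flag_dir)) {
  // ... choose the cloud snapshot storage ...
}
```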

// if new_version is true, saves DF specific, non redis compatible snapshot.
// if basename is not empty it will override dbfilename flag.
- GenericError DoSave(bool new_version, std::string_view basename, Transaction* transaction,
+ GenericError DoSave(SaveCmdOptions save_cmd_opts, Transaction* transaction,

Collaborator:

const SaveCmdOptions&

Contributor:

It's trivially copyable, so IMO by value is fine. I guess the problem is that if we add a member that makes the struct non-trivially copyable, then we need to remember to change that (or enforce it via static_assert so that if it ever changes we are aware). Sooooo const& it is 🤣
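For reference, the static_assert variant mentioned above could look roughly like this; the member list is an assumption based on the fields used in this diff:

```cpp
#include <string_view>
#include <type_traits>

struct SaveCmdOptions {
  bool new_version = false;     // RDB vs DF snapshot format
  std::string_view cloud_uri;   // s3:// or gs:// target, empty for local saves
  std::string_view basename;    // optional basename override
};

// If pass-by-value were kept, this guards the assumption discussed above: the
// moment someone adds a member that is no longer trivially copyable, the build
// fails and the DoSave/DoSaveCheckAndStart signatures have to be revisited.
static_assert(std::is_trivially_copyable_v<SaveCmdOptions>,
              "SaveCmdOptions is passed by value; keep it trivially copyable "
              "or switch the SAVE/BGSAVE call chain to const&.");
```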


void BgSaveFb(boost::intrusive_ptr<Transaction> trans);

- GenericError DoSaveCheckAndStart(bool new_version, string_view basename, Transaction* trans,
+ GenericError DoSaveCheckAndStart(SaveCmdOptions save_cmd_opts, Transaction* trans,

Collaborator:

const SaveCmdOptions&

} else if (IsGCSPath(ArgS(args, 1))) {
save_cmd_opts.cloud_uri = ArgS(args, 1);
} else {
// Probably basename than

Collaborator:

Suggested change:
- // Probably basename than
+ // Probably basename then.

}
}

if (save_cmd_opts.basename.empty() && args.size() == 3) {

Collaborator:

Consider moving this into the `if (args.size() >= 2) {` block.
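A sketch of the parsing with that suggestion applied; the argument indices and surrounding control flow are reconstructed from the fragments visible in this diff, so treat them as assumptions:

```cpp
// SAVE [RDB|DF] [CLOUD_URI] [BASENAME] -- sketch of the option parsing.
SaveCmdOptions save_cmd_opts;

if (args.size() >= 2) {
  std::string_view arg = ArgS(args, 1);
  if (IsS3Path(arg)) {
    save_cmd_opts.cloud_uri = arg;
  } else if (IsGCSPath(arg)) {
    save_cmd_opts.cloud_uri = arg;
  } else {
    // Probably a basename then.
    save_cmd_opts.basename = arg;
  }

  // Reviewer suggestion: keep the third-argument handling inside this block,
  // since a basename in third position only exists when a cloud URI was given.
  if (save_cmd_opts.basename.empty() && args.size() == 3) {
    save_cmd_opts.basename = ArgS(args, 2);
  }
}
```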


Successfully merging this pull request may close these issues.

Support dynamic s3 path to save df command
3 participants