Introduce new CSAL ftl bdev #1771

Open
wants to merge 3 commits into
base: develop

MaisenbacherD

Before this PR can be merged, openebs/spdk-rs#67 needs to be resolved
and the spdk-rs submodule reference must be updated accordingly. :)

The SPDK ftl bdev allows creating a layered device with a (fast) cache
device for buffering writes that eventually get flushed out sequentially
to a base device. SPDK ftl is also known as the Cloud Storage
Acceleration Layer (CSAL).

This kind of device potentially enables the use of emerging storage
interfaces like Zoned Namespace (ZNS) or Flexible Data Placement (FDP)
capable NVMe devices. Up to this point, those NVMe command sets are not
yet supported in SPDK upstream. However, the acceleration aspect of a
fast cache device already adds value.
With future support of new devices for SPDK ftl, Mayastor would already
be capable of utilizing those features simply by upgrading SPDK.

For now, the ftl_mount_fs test cases are hidden behind the ignore
attribute because PCIe devices are required for this test until
SPDK v24.09 is picked up, which introduces variable sector size emulation
for ftl devices.
To run these tests use the following:

RUST_LOG=TRACE cargo test -- --test-threads 1 --test ftl_mount_fs --nocapture --ignored

This patch introduces a new ftl device uri scheme:

ftl:///<ftl-device-name>?bbdev=<dev-uri-nested-encoding>&cbdev=<dev-uri-nested-encoding>

<dev-uri-nested-encoding> can be any already valid device uri where '?'
are replaced with '!' and '&' are replaced with '|'.
With SPDK v24.05 only PCIe devices will work where the cache device is
formatted to 4KiB+64B LBA format and the base device to 4KiB LBA format.

From SPDK v24.09 on, any device with a block size of 4KiB will work for
both cache and base device.
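
The nested encoding described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the helper names `nest_encode` and `ftl_uri` are hypothetical, and the device URIs in the usage note are only there to show the character substitution.

```rust
/// Hypothetical helper: apply the nested encoding to a device URI by
/// replacing '?' with '!' and '&' with '|'.
fn nest_encode(uri: &str) -> String {
    uri.replace('?', "!").replace('&', "|")
}

/// Hypothetical helper: assemble an ftl URI from a device name, a base
/// device URI (bbdev) and a cache device URI (cbdev).
fn ftl_uri(name: &str, base_uri: &str, cache_uri: &str) -> String {
    format!(
        "ftl:///{}?bbdev={}&cbdev={}",
        name,
        nest_encode(base_uri),
        nest_encode(cache_uri)
    )
}
```

For example, `ftl_uri("ftl0", "aio:///dev/sdb?blk_size=4096", "pcie:///0000:01:00.0")` would yield `ftl:///ftl0?bbdev=aio:///dev/sdb!blk_size=4096&cbdev=pcie:///0000:01:00.0`, so the outer '?' and '&' stay unambiguous.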

CSAL paper reference:
https://dl.acm.org/doi/10.1145/3627703.3629566

@MaisenbacherD
Author

Discussion question: The spdk_ftl_conf allows us to set configurations like core_mask and l2p_dram_limit. Are those configurations worth exposing through the ftl uri, or should we go with the default configuration?

```
sudo PCI_ALLOWED="<PCI-ADDRESS>" ./spdk-rs/spdk/scripts/setup.sh reset
```

Please do not submit pull requests with active cargo test cases that require PCIe devices to be present.
Contributor

Maybe in the future we can have a feature flag for PCIe tests which can be run on a dedicated test system :)
Btw, can we add virtual PCIe devices with QEMU? That could be a way of running those tests without having spare PCIe NVMe devices.

Author

Do you mean a Rust feature flag, such that PCIe support can be selected at compile time?
It makes sense to me! I briefly thought about adding a feature flag because of the failing test case but went with the ignore attribute for simplicity.
I can rework this to be a Rust feature flag within this PR. However, if I attempt this (https://doc.rust-lang.org/cargo/reference/features.html), I get the following error:

error: failed to parse manifest at `/home/dennis/src/mayastor/Cargo.toml`

Caused by:
  this virtual manifest specifies a `features` section, which is not allowed

Do you know what needs to be done here?

A QEMU NVMe emulated device via PCIe would work for testing. One just needs to make sure that the cache device is configured with ms=8 https://qemu-project.gitlab.io/qemu/system/devices/nvme.html#metadata.

Contributor

> Do you know what needs to be done here?

The feature flag would need to be added to the io-engine crate, see some examples:

extended-tests = [] # Extended I/O engine tests: not intended for daily runs.

I'll leave it up to you, fine to keep ignore for now as well.
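
A rough sketch of how such a feature could gate the PCIe-dependent tests. It assumes a feature named `nvme-pci-tests`, declared as `nvme-pci-tests = []` under `[features]` in the io-engine crate's `Cargo.toml`; the function and test names below are hypothetical placeholders, not code from this PR.

```rust
/// Reports whether this build was compiled with the (assumed)
/// `nvme-pci-tests` Cargo feature.
fn pci_tests_enabled() -> bool {
    cfg!(feature = "nvme-pci-tests")
}

// Compiled only when running `cargo test --features nvme-pci-tests`.
// Without the feature the test does not exist at all, so default test
// runs never need PCIe devices.
#[cfg(feature = "nvme-pci-tests")]
#[test]
fn ftl_needs_pcie_devices() {
    // Hypothetical placeholder for a test body that requires real devices.
    assert!(pci_tests_enabled());
}
```

Compared to `#[ignore]`, the gated test is not even compiled by default, which keeps it out of plain `cargo test -- --ignored` runs on machines without the hardware.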

> A QEMU NVMe emulated device via PCIe would work for testing. One just needs to make sure that the cache device is configured with ms=8 https://qemu-project.gitlab.io/qemu/system/devices/nvme.html#metadata.

Maybe that's something we can do here then. We run the Python BDD tests in a VM; maybe we can configure it with an emulated PCIe device so these tests can run there?
Alternatively, we can set up VMs on demand from the tests themselves. With nix we can set up test VMs easily, example: https://github.com/openebs/mayastor-extensions/blob/develop/tests/helm/test.nix
But I don't think there's emulated NVMe support there...

Author

I now introduced a new feature flag :)

Yes we should definitely look into running the test in an emulated QEMU environment or something similar.

That on-demand VM setup you linked looks interesting. It would take a bit of time to play with it and figure out the capabilities.
Should we address the CI setup within this PR?

Contributor

We can do that in another PR, let's get this one merged :)

Author

I will talk with you outside of this PR to discuss this :)

@tiagolobocastro
Contributor

bors try

bors-openebs-mayastor bot pushed a commit that referenced this pull request Nov 22, 2024
@bors-openebs-mayastor

try

Build failed:

MaisenbacherD and others added 3 commits November 22, 2024 11:41
The SPDK ftl bdev allows creating a layered device with a (fast) cache
device for buffering writes that eventually get flushed out sequentially
to a base device. SPDK ftl is also known as the Cloud Storage
Acceleration Layer (CSAL).

This kind of device potentially enables the use of emerging storage
interfaces like Zoned Namespace (ZNS) or Flexible Data Placement (FDP)
capable NVMe devices. Up to this point, those NVMe command sets are not
yet supported in SPDK upstream. However, the acceleration aspect of a
fast cache device already adds value.
With future support of new devices for SPDK ftl, Mayastor would already
be capable of utilizing those features simply by upgrading SPDK.

For now, the `ftl_mount_fs` test cases are hidden behind the
`nvme-pci-tests` flag because PCIe devices are required for this test
until SPDK v24.09 is picked up, which introduces variable sector size
emulation for ftl devices.
To run these tests use the following:
```
RUST_LOG=TRACE cargo test --features nvme-pci-tests -- --test-threads 1 \
--test ftl_mount_fs --nocapture
```

This patch introduces a new ftl device uri scheme:
```
ftl:///$ftl_device_name?bbdev=$bbdev_uri_percent_encoded&cbdev=$cbdev_uri_percent_encoded
```
The bbdev_uri and cbdev_uri need to use percent encoding on '?' (= '%3F')
and '&' (= '%26') segment dividers. This is needed so we can successfully
parse the ftl uri. Optionally, the sub-uris may also be fully
percent-encoded.
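
To illustrate why the dividers must be encoded, here is a minimal sketch of building and splitting such a URI. It is an assumption-laden illustration rather than the parser from this patch: the helper names are hypothetical, only '?' and '&' are handled, and fully percent-encoded sub-uris would need a complete percent-decoder instead.

```rust
/// Hypothetical helper: percent-encode only the '?' and '&' dividers.
fn encode_dividers(uri: &str) -> String {
    uri.replace('?', "%3F").replace('&', "%26")
}

/// Hypothetical helper: reverse of `encode_dividers`
/// (accepts upper- and lowercase hex for '?').
fn decode_dividers(s: &str) -> String {
    s.replace("%3F", "?").replace("%3f", "?").replace("%26", "&")
}

/// Hypothetical helper: extract the decoded (bbdev, cbdev) sub-uris
/// from an ftl URI, or None if either parameter is missing.
fn split_ftl_query(uri: &str) -> Option<(String, String)> {
    // Everything after the first '?' is the ftl query string.
    let (_, query) = uri.split_once('?')?;
    let mut bbdev = None;
    let mut cbdev = None;
    // Splitting on '&' is only safe because nested '&' are encoded.
    for pair in query.split('&') {
        match pair.split_once('=') {
            Some(("bbdev", v)) => bbdev = Some(decode_dividers(v)),
            Some(("cbdev", v)) => cbdev = Some(decode_dividers(v)),
            _ => {}
        }
    }
    Some((bbdev?, cbdev?))
}
```

Without the encoding, a nested `&blk_size=4096` would be read as a third parameter of the ftl URI instead of as part of the sub-uri.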

With SPDK v24.05 only PCIe devices will work where the cache device is
formatted to 4KiB+64B LBA format and the base device to 4KiB LBA format.

From SPDK v24.09 on, any device with a block size of 4KiB will work for
both cache and base device.

CSAL paper reference:
https://dl.acm.org/doi/10.1145/3627703.3629566

Co-authored-by: Indraneel M <[email protected]>
Signed-off-by: Dennis Maisenbacher <[email protected]>
Init SPDK tracing when `MayastorEnvironment.num_entries` is set to a
positive integer. This can be helpful when developing new features.
Traces can be found in `/dev/shm` and are suffixed with the pid or, if
it is non-negative, `MayastorEnvironment.shm_id`.

Further information about traces and how to read the captured traces can
be found here: https://spdk.io/doc/nvmf_tgt_tracepoints.html

Signed-off-by: Dennis Maisenbacher <[email protected]>
@MaisenbacherD
Author

Addressed the linter errors :)
