Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Which issue(s) this PR fixes:
Issues longhorn/longhorn#5159 , longhorn/longhorn#6600
I am reposting this PR because i think it is a solid solution on integrating ublk with longhorn updated with more benchmarks on a more real-life enviroment
What this PR does / why we need it:
This PR uses @derekbit 's POC ubdsrv to integrate ublk with longhorn-engine and adds support for multiple frontend queues.
Using ublk as a frontend option significantly boosts the IOPS in the frontend part of the engine. In our tests, we measure up to ~280k IOPS at the frontend (before the controller communicates with the replicas). Those IOPS are bounded because of my setup.
For the benchmarks i used the folowing setup :
CPU : Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz
RAM: 128GB
Disk: Samsung 860 Evo SSD 250GB
Network : 10Gbps
One replica , one controller , 2 machines in a local cluster
On the ublk :
Ublk frontend queues : 6
Controller-Replica connections : 6
The fio command used for testing IOPS:
sudo fio --name=read_iops --filename=/dev/ublkb0 --size=5G --numjobs=12 --time_based --runtime=30s --ramp_time=2s --ioengine=libaio --direct=1 --verify=0 --bs=4K --iodepth=64 --rw=randwrite --group_reporting=1
fio command used for testing Bandwidth:
sudo fio --name=bandwidth --filename=/dev/ublkb0 --size=5G --numjobs=12--time_based --runtime=30s --ramp_time=2s --ioengine=libaio --direct=1 --verify=0 --bs=1M --iodepth=64 --rw=read --group_reporting=1
*Replica R/W disabled means that the replica code is altered in order to answer dummy replies and not do the actual R/W in the Linux sparse files.
Note that the frontend performance was bounded by my pc's process power. Using a more powerful pc , ublk can achieve close to raw disk speeds. Based on these findings we can see that there is also need to improve other parts of the system too in order to utilize ublk fully.
Special notes for your reviewer:
In order to replicate the results you will need my version of ubdsrv installed: https://github.com/Kampadais/ubdsrv
To run the controller with ublk frontend, 6 frontend queues and 6 replica-streams:
./longhorn controller test --frontend ublk --size 1g --current-size 1g --frontend-queues 6 --replica-streams 6 --replica tcp://....:9502
We are still the need of a go-ublk-helper because the whole startup procedure is within the frontend.go file and monitoring status is limited. There is also a limitation regarding the name of the block device provided by ublk.
Additional documentation or context