-
Notifications
You must be signed in to change notification settings - Fork 18k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce overhead when doing SPI transactions #28093
base: master
Are you sure you want to change the base?
Conversation
7ed2419
to
c8da9c9
Compare
@@ -87,6 +87,9 @@ void Shared_DMA::unregister() | |||
// lock one stream | |||
bool Shared_DMA::lock_stream(uint8_t stream_id) | |||
{ | |||
if (!is_shared(stream_id)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is safe. I'm pretty sure the dshot code for instance uses the DMA lock to mediate access, so you would need to also check whether another thread has taken the lock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andyp1per can you point to the code where this is done, I don't think Shared DMA is used that way in RCOutput, its only role is to allocate dma for use for dshot and bdshot, as far as I am aware dshot stuff happens entirely from the single thread. What is DMA lock used to mediate access between?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ardupilot/libraries/AP_HAL_ChibiOS/RCOutput.cpp
Line 1628 in 86f51ee
group.dma_handle->lock(); |
It's mediating access between the _UP DMA channel and the IC DMA channel. Looking at it further it may be ok since these should end up meaning it actually shows up as shared here, but would need some careful testing.
SPIDEV rm3100 SPI2 DEVID2 RM3100_CS MODE3 1*MHZ 1*MHZ | ||
|
||
SPIDEV icm20649 SPI4 DEVID1 ICM_2_CS MODE3 4*MHZ 8*MHZ | ||
# alternative to icm20649 | ||
SPIDEV icm45686 SPI4 DEVID1 ICM_2_CS MODE3 2*MHZ 8*MHZ | ||
SPIDEV icm45686 SPI4 DEVID1 ICM_2_CS MODE3 20*MHZ 20*MHZ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the slower rate was slower for a reason, i.e. in case of recovery?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andyp1per the reason is simply legacy, we have been using that frequency in older Sensors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but for instance we needed that to solve a problem on ICM42688 (changed it on Matek for instance), so not 100% convinced this is an older sensor problem.
last_spicfg.cfg2 == spicfg.cfg2 | ||
#else | ||
last_spicfg.cr1 == spicfg.cr1 && | ||
last_spicfg.cr2 == spicfg.cr2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like you might need to check the DMA lock state here as well. If the lock is shared (and therefore the channel shared), the periperal may need to be shutdown to allow another to use the DMA channel.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andyp1per we only get to this line near the end of acquire bus which must have been successful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that then begs the question as to why stop/start is necessary here at all? Also why not put the start() inside the if as well?
uint8_t n = MIN(n_samples, INV3_FIFO_BUFFER_LEN); | ||
if (!block_read(reg_data, (uint8_t*)fifo_buffer, n * fifo_sample_size)) { | ||
goto check_registers; | ||
do { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know how much difference this makes? I ask because there is a bug in the v3 driver to do with DMA usage which we should fix - a56660e - that said this also has the cost of two SPI transactions, so I am wondering if there is a way of doing both?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need a comment for the IMU changes
uint8_t n; | ||
void *fifo_result = nullptr; | ||
if (n_samples == 0) { | ||
if (!block_read(reg_counth, (uint8_t*)fifo_buffer, 2 + fifo_sample_size)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs a comment. I think I understand what you are trying to achieve here, but its not at all clear.
I did a short flight with just the SPI changes - had no issues and this gives a nice bump of about 5% CPU gain, which with my Dynamic FIFO change (#27841) is giving me about 15% CPU gain |
@bugobliterator I'd like to discuss this a bit more with you. There are risks to not restarting the SPI peripheral. If (due to eg. a voltage spike) the SPI peripheral gets into a bad state, then with a single sensor bus that state could persist forever and take down the vehicle. We switched to always re-init some time ago when we saw this sort of bad persistent state on I2C peripherals. |
uint8_t n; | ||
void *fifo_result = nullptr; | ||
if (n_samples == 0) { | ||
if (!block_read(reg_counth, (uint8_t*)fifo_buffer, 2 + fifo_sample_size)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be using _dev->transfer() like in InvensenseV2 and V1, to save the stack space and copy, and preventing a watchdog
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be split into SPI changes and IMU changes
Tested on CubeOrangePlus and CubeRedPrimary with Invensensev3 driver
I used debug trace to analyse timeline for read_fifo to check if we have any substantial overhead while doing SPI transactions. It turned out that calling spiStart and spiStop which in turn call dmaStreamAlloc and and dealloc to take substantial amount of CPU call almost 3-4us. Also we were calling Shared DMA mutex lock and unlock when the DMA is not shared between peripherals.
Following changes made SPI transactions much less CPU intensive:
Results:
CubeRedPrimary:
Before:
SPI1 PRI=181 sp=0x20001AA8 STACK= 792/1464 LOAD=13.4%
SPI4 PRI=181 sp=0x200021C0 STACK= 872/1464 LOAD=13.4%
SPI2 PRI=181 sp=0x20005380 STACK= 896/1464 LOAD=13.3%
After:
SPI1 PRI=181 sp=0x20001B08 STACK= 792/1464 LOAD= 9.7%
SPI4 PRI=181 sp=0x20002220 STACK= 792/1464 LOAD=11.2%
SPI2 PRI=181 sp=0x200053E0 STACK= 840/1464 LOAD= 8.3%
CubeOrangePlus:
Before:
SPI4 PRI=181 sp=0x3002AF58 STACK= 668/1464 LOAD=25.5%
SPI1 PRI=181 sp=0x3002B668 STACK= 668/1464 LOAD=14.5%
After:
SPI4 PRI=181 sp=0x3002AF78 STACK= 668/1464 LOAD=19.2%
SPI1 PRI=181 sp=0x3002B688 STACK= 668/1464 LOAD= 9.5%