-
Notifications
You must be signed in to change notification settings - Fork 235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SPI DMA PSRAM QuadSPI half_duplex_write() doesn't resolve #2863
Comments
Update: found that the bus speed is relevant: |
Thanks for the tidbit about bus speed. It's possible that PSRAM can't keep up with the SPI. Would you mind printing out the SPI interrupt registers (with the PAC) a few seconds after the transfer starts? |
@Dominaezzz attempted to add your change but it seems adding a delay of >600us (bisected to +-50us, in this range it occasionally works) causes it to succeed again. So it seems like there might be a race condition with an interrupt or something. The result I'm getting is always log::info!("SPI2: {}", unsafe {
*(SPI2::ptr().add(0x0040) as *const u32).as_ref().unwrap()
}); |
I'm looking for INT_RAW rather than INT_ST, which is Did you add the delay to the example or to the SPI driver? I was thinking of here. Since this example is blocking and not async, a race condition with an interrupt is very unlikely. |
Yes exactly there, I've pushed my latest version now. I'll check 0x003C Update: |
Can you add a larger delay? I still want to see what the interrupts look like if it succeeds |
Hmm switched to log::info!("pre-SPI2: {}", unsafe {
core::ptr::read_volatile(SPI2::ptr().add(0x003C) as *const u32)
});
delay.delay_millis(2000);
log::info!("post-SPI2: {}", unsafe {
core::ptr::read_volatile(SPI2::ptr().add(0x003C) as *const u32)
});
(spi, dma_buf) = transfer.wait();
However it does not resolve again anymore. Given that; maybe it has to do with memory layout since that will only be randomized with some significant change? This seems to be confirmed by placing the delay between the prints also fails with 593us, even though placing it before the prints will succeed once then fail once. Likewise placing 750us between prints fails, placing it before the prints get repeated successes. I'm not sure which parts require a specific alignment and how I can check if the are aligned. |
The alignment requirements should be checked by Can you print out the addresses of |
Oh looks like it's already done here.
|
Hmm yeah the descriptors/buffer addresses don't seem to different: Working:
Failing:
|
I've run out of ideas for what could be going on but as a final hail mary, try adding a delay to the update function here. You can copy that into the gdma section and make it bigger, like 10 instead of just 1. |
I tried the linked code and can reproduce the issue. When I change Placing most of the involved code into RAM lets me go up to 32MHz Seems to be a bandwidth issue (i.e. placing code in RAM "eliminates" concurrently accessing flash) |
I'll try both suggestions and report back. @bjoernQ assuming it's the same issue I encountered; any chance it can be detected; I'd rather it panic or Err than hang. |
Hmm I put everything relevant in ram removed logs before wait, and added:
However the env variables actually causes it to fail (commenting succeeds again). If it is flash contention, it seems like it would be at the @Dominaezzz I tried your hail-mary, unfortunately doesn't seem to change anything. I'll see if I can get a minimal break condition (e.g. call to an empty function or similar) to maybe get a better look in ghidra. |
I've reduced the break condition to a few I've also tried manually allocating and generating some PSRAM traffic (see commented fail code) but that didn't seem to matter much. In the fail=false condition, it just works. In the fail=true condition without tuning I'm getting the original failure mode (infinite wait).
Which seems to trigger before even starting a DMA transfer, which again would indicate a codegen layout issue. |
The panic looks sus - we have seen some miscompiles / misoptimizations lately. However I tried your changed code on 36095e4 plus my changes and some more changes and it seems to work Here are my changes (on top of the latest commit on main)diff --git a/esp-hal/src/delay.rs b/esp-hal/src/delay.rs
index a361aabc..5fb26144 100644
--- a/esp-hal/src/delay.rs
+++ b/esp-hal/src/delay.rs
@@ -68,6 +68,7 @@ impl Delay {
}
/// Delay for the specified number of microseconds
+ #[procmacros::ram]
pub fn delay_micros(&self, us: u32) {
let delay = MicrosDurationU64::micros(us as u64);
self.delay(delay);
diff --git a/esp-hal/src/spi/master.rs b/esp-hal/src/spi/master.rs
index 48116c03..6e3d6d01 100644
--- a/esp-hal/src/spi/master.rs
+++ b/esp-hal/src/spi/master.rs
@@ -1316,6 +1316,7 @@ mod dma {
/// This method blocks until the transfer is finished and returns the
/// `SpiDma` instance and the associated buffer.
#[instability::unstable]
+ #[procmacros::ram]
pub fn wait(mut self) -> (SpiDma<'d, Dm, T>, Buf) {
self.spi_dma.wait_for_idle();
let retval = unsafe {
@@ -1581,6 +1582,7 @@ mod dma {
#[allow(clippy::type_complexity)]
#[cfg_attr(place_spi_driver_in_ram, ram)]
#[instability::unstable]
+ #[ram]
pub fn half_duplex_write<TX: DmaTxBuffer>(
mut self,
data_mode: DataMode,
diff --git a/examples/.cargo/config.toml b/examples/.cargo/config.toml
index 600f74b2..d25c3d15 100644
--- a/examples/.cargo/config.toml
+++ b/examples/.cargo/config.toml
@@ -35,5 +35,9 @@ GATEWAY_IP = "1.1.1.1"
HOST_IP = "1.1.1.1"
ESP_WIFI_CSI_ENABLE = "true"
+ESP_HAL_PLACE_SPI_DRIVER_IN_RAM = "true"
+ESP_HAL_PLACE_ANON_IN_RAM = "true"
+ESP_HAL_PLACE_SWITCH_TABLES_IN_RAM = "true"
+
[unstable]
build-std = ["alloc", "core"]
diff --git a/examples/src/bin/spi_loopback_dma_psram.rs b/examples/src/bin/spi_loopback_dma_psram.rs
index 41b785b4..61937507 100644
--- a/examples/src/bin/spi_loopback_dma_psram.rs
+++ b/examples/src/bin/spi_loopback_dma_psram.rs
@@ -16,13 +16,16 @@
//! If your module is quad PSRAM then you need to change the `psram` feature in the
//! in the features line below to `quad-psram`.
-//% FEATURES: esp-hal/log esp-hal/octal-psram esp-hal/unstable
+//% FEATURES: esp-hal/log esp-hal/quad-psram esp-hal/unstable
//% CHIPS: esp32s3
#![no_std]
#![no_main]
+#![feature(vec_push_within_capacity)]
use esp_backtrace as _;
+use esp_hal::macros::ram;
+use esp_hal::peripherals::SPI2;
use esp_hal::{
delay::Delay,
dma::{DmaRxBuf, DmaTxBuf, ExternalBurstConfig},
@@ -51,13 +54,48 @@ macro_rules! dma_alloc_buffer {
}};
}
-const DMA_BUFFER_SIZE: usize = 8192;
const DMA_ALIGNMENT: ExternalBurstConfig = ExternalBurstConfig::Size64;
const DMA_CHUNK_SIZE: usize = 4096 - DMA_ALIGNMENT as usize;
+const DMA_BUFFER_SIZE: usize = 8192;
+
+// #[ram]
+fn effectively_empty(fail: bool) // -> bool
+{
+ const size: usize = 2;
+ let mut x: alloc::vec::Vec<usize> = alloc::vec::Vec::with_capacity(size);
+ // unsafe { x.set_len(size) };
+
+ // let layout = core::alloc::Layout::new::<[usize; size]>();
+ // let ptr = unsafe { alloc::alloc::alloc(layout) as *mut usize };
+ // let mut lsize = 0;
+
+ if fail {
+ // for i in 0..size {
+ // // unsafe { *(ptr.add(i)) = i }
+ // unsafe { core::ptr::write(ptr.add(i), i) };
+ // let j = unsafe { *(ptr.add(i)) };
+ // lsize += j;
+ // }
+ for i in 0..size {
+ // ram op
+ // x.push(i);
+ x.push_within_capacity(i);
+ // x.get(i).unwrap_or(&0);
+ }
+ } else {
+ for i in 0..size {
+ // nop
+ unsafe { core::ptr::read_volatile(SPI2::ptr().add(0x003C) as *const u32) };
+ }
+ }
+
+ // x[size - 2] % 2 == 0
+}
+#[link_section = ".rwtext"]
#[entry]
fn main() -> ! {
- esp_println::logger::init_logger(log::LevelFilter::Info);
+ esp_println::logger::init_logger(log::LevelFilter::Trace);
info!("Starting SPI loopback test");
let peripherals = esp_hal::init(esp_hal::Config::default());
esp_alloc::psram_allocator!(peripherals.PSRAM, esp_hal::psram);
@@ -67,6 +105,8 @@ fn main() -> ! {
let mosi = peripherals.GPIO48;
let miso = unsafe { mosi.clone_unchecked() };
let cs = peripherals.GPIO38;
+ let sio2 = peripherals.GPIO7;
+ let sio3 = peripherals.GPIO6;
let (_, tx_descriptors) =
esp_hal::dma_descriptors_chunk_size!(0, DMA_BUFFER_SIZE, DMA_CHUNK_SIZE);
@@ -78,66 +118,85 @@ fn main() -> ! {
tx_descriptors.len()
);
let mut dma_tx_buf =
- DmaTxBuf::new_with_config(tx_descriptors, tx_buffer, DMA_ALIGNMENT).unwrap();
- let (rx_buffer, rx_descriptors, _, _) = esp_hal::dma_buffers!(DMA_BUFFER_SIZE, 0);
- info!(
- "RX: {:p} len {} ({} descripters)",
- rx_buffer.as_ptr(),
- rx_buffer.len(),
- rx_descriptors.len()
+ Some(DmaTxBuf::new_with_config(tx_descriptors, tx_buffer, DMA_ALIGNMENT).unwrap());
+
+ log::info!(
+ "Alignment: {}, Chunk size: {}",
+ ExternalBurstConfig::Size64 as usize,
+ DMA_CHUNK_SIZE
);
- let mut dma_rx_buf = DmaRxBuf::new(rx_descriptors, rx_buffer).unwrap();
- // Need to set miso first so that mosi can overwrite the
- // output connection (because we are using the same pin to loop back)
+
+ let (rx_buffer, rx_descriptors, tx_buffer, tx_descriptors) =
+ esp_hal::dma_buffers!(DMA_BUFFER_SIZE);
+
+ let dma_rx_buf2 = DmaRxBuf::new(rx_descriptors, rx_buffer).unwrap();
+ let mut dma_tx_buf2 = DmaTxBuf::new(tx_descriptors, tx_buffer).unwrap();
+
let mut spi = Spi::new(
peripherals.SPI2,
Config::default()
- .with_frequency(100.kHz())
+ .with_frequency(25.MHz())
.with_mode(Mode::Mode0),
)
.unwrap()
.with_sck(sclk)
.with_miso(miso)
.with_mosi(mosi)
+ .with_sio2(sio2)
+ .with_sio3(sio3)
.with_cs(cs)
.with_dma(peripherals.DMA_CH0);
delay.delay_millis(100); // delay to let the above messages display
- for (i, v) in dma_tx_buf.as_mut_slice().iter_mut().enumerate() {
+ for (i, v) in dma_tx_buf
+ .as_mut()
+ .unwrap()
+ .as_mut_slice()
+ .iter_mut()
+ .enumerate()
+ {
*v = (i % 256) as u8;
}
+ log::info!("First dmabuffer: {:?}", dma_tx_buf.as_mut());
+
let mut i = 0;
loop {
- dma_tx_buf.as_mut_slice()[0] = i;
- *dma_tx_buf.as_mut_slice().last_mut().unwrap() = i;
- i = i.wrapping_add(1);
-
+ use esp_hal::spi::master::{Address, Command};
+ let mut dma_buf = dma_tx_buf.take().unwrap();
let transfer = spi
- .transfer(dma_rx_buf.len(), dma_rx_buf, dma_tx_buf.len(), dma_tx_buf)
- .map_err(|e| e.0)
+ .half_duplex_write(
+ esp_hal::spi::DataMode::Quad,
+ Command::None,
+ Address::Address24(0, esp_hal::spi::DataMode::Quad),
+ 0u8,
+ dma_buf.len(),
+ dma_buf,
+ )
.unwrap();
+ // delay.delay_micros(1000);
+ // delay.delay_micros(1000);
+ // delay.delay_micros(1000);
+ // let transfer = spi.write(dma_buf.len(), dma_buf).unwrap();
+ // log::info!("pre-SPI2: {}", unsafe {
+ // core::ptr::read_volatile(SPI2::ptr().add(0x003C) as *const u32)
+ // });
+
+ // core::hint::black_box(effectively_empty(false));
+
+ // delay.delay_micros(1000);
+ // delay.delay_millis(1000);
+
+ // log::info!("post-SPI2: {}", unsafe {
+ // core::ptr::read_volatile(SPI2::ptr().add(0x003C) as *const u32)
+ // });
+ (spi, dma_buf) = transfer.wait();
+ dma_tx_buf.replace(dma_buf);
+
+ log::info!("Done");
- (spi, (dma_rx_buf, dma_tx_buf)) = transfer.wait();
- for (i, v) in dma_tx_buf.as_mut_slice().iter_mut().enumerate() {
- if dma_rx_buf.as_slice()[i] != *v {
- error!(
- "Mismatch at index {}: expected {}, got {}",
- i,
- *v,
- dma_rx_buf.as_slice()[i]
- );
- break;
- }
- }
- info!(
- "{:0x?} .. {:0x?}",
- &dma_rx_buf.as_slice()[..10],
- &dma_rx_buf.as_slice().last_chunk::<10>().unwrap()
- );
- dma_tx_buf.as_mut_slice().reverse();
delay.delay_millis(1000);
}
-}
+}
\ No newline at end of file (I copied your new example as the original one) |
@bjoernQ Can confirm your example works (similar to fail=false in my case), however I'm a bit concerned on the fragility. Adding the log back will halt it again, and so the thing I'm trying to find out is if |
Yeah it doesn't work in the project I originally found the issue (although I can't set Update: I removed the |
Bug description
Exact cause in not clear to me, but attempting to send a larger buffer with SPI DMA using PSRAM causes no interrupts to trigger (manually verified), and thus never resolves
is_done()
.I can only reliably reproduce it with half_duplex_write, although I think I've also seen it with regular
write()
.To Reproduce
Run example: https://github.com/i404788/esp-hal/blob/spi-dma-wip/examples/src/bin/spi_loopback_dma_psram.rs
Logs:
Expected behavior
Expected it to either error, or complete successfully. See the
Done
log repeatedly.Environment
The text was updated successfully, but these errors were encountered: