SDRAM controller intermittent problems #802
Comments
The two failing outputs are either:
Another form the failure can take is that some bits of the first byte in a row of 8 are stuck high or low. Again, this looks like a variation of (2) above. Actually, it is reading the image of the same bit from another byte, in some cases byte 2 of the row of 8, suggesting that it is reading a bit too late in the DDR if the SDRAM has been put into the correct latency mode (which can't be assumed).

In terms of things that trigger changes between PASS and FAIL behaviour:
The above is consistent with the SDRAM controller attempting to set the latency mode on bitstream start-up, but not on reset button press. This makes sense, because the SDRAM controller does not have a /RESET input. Maybe it should?
Trying a software-induced reset of the SDRAM chip by writing to $C000000 doesn't work (although the SDRAM controller does report that it is trying to reset when this occurs).
Note that the "slow mode" for the SDRAM controller only changes the read latency; it doesn't change the SDRAM clock from 162MHz to 81MHz. That is why the test continues to fail if the Mode Register Set operation fails.
SDRAM is directly clocked from the 162MHz clock via an ODDR primitive, not that it should matter.
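For a sense of scale, here is a small sketch of the clock periods involved. It only does the arithmetic for the two frequencies mentioned above; it is not taken from the controller source, and the helper name `clock_period_ns` is made up for illustration.

```python
def clock_period_ns(freq_hz: float) -> float:
    """Return the period of a clock in nanoseconds."""
    return 1e9 / freq_hz


# The two SDRAM clock rates discussed in this issue.
FAST_HZ = 162e6
SLOW_HZ = 81e6

for name, hz in (("162 MHz", FAST_HZ), ("81 MHz", SLOW_HZ)):
    period = clock_period_ns(hz)
    # Half a period is the spacing between a rising and a falling edge.
    print(f"{name}: period {period:.2f} ns, half period {period / 2:.2f} ns")
```

At 162MHz a full cycle is a little over 6 ns, so latching on the opposite clock edge moves the sample point by roughly 3 ns. Halving the clock doubles both figures.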
Things that might be worth trying:
Commit above should allow (2).
Goals for tonight: (1) Test the above. (2) Drop clock from 162 to 81MHz and see if that fixes it, if so, check whether it is writing or reading that is failing.
With the above we have 8 settings that can be explored. This points to an issue during writing in terms of the underlying problem. (Note that if the problem is with write timing latching, this would explain why writing to the configuration register is not always working.)

Now trying to get it into the passing state, to see how these affect it in that situation. Ok, in the passing state, reading works fine with $02, and writing with $05. Reading with $05 in the passing state works, but with an extra cycle of delay on the read data.

Common theme: Switching the clock phase between reading and writing helps. Now, what's weird is that bit 2 of this value should switch whether we latch reads on rising or falling edge of the clock, so we shouldn't need to switch the clock phase... but it doesn't. So there is some other thing going on here.

Anyway, I'm adding the ability to add an extra cycle of latency to the read side, so that in the passing state it should be possible to have a single setting that works for both read and write. If that works, then I can look at having a flag that allows swapping clock phase between read and write. With those two things, it should be possible to have the SDRAM work correctly in both states. This would then just require a bit of boot-strap code that determines the correct timing setting on boot.

I'd still like to figure out why it doesn't reliably configure into one or the other state, however.
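The boot-strap idea above can be sketched roughly as follows. This is not the real test program: `FakeSdram` is a hypothetical stand-in that simply pretends some configuration values work and others lose writes and corrupt reads, and `find_working_configs`, its parameters and the test patterns are all assumptions chosen for illustration.

```python
class FakeSdram:
    """Hypothetical model: only configuration values in `good` behave."""

    def __init__(self, good):
        self.good = set(good)
        self.config = 0
        self.cells = {}

    def write_config(self, value):
        # Stands in for a write to the configuration register.
        self.config = value

    def write(self, addr, value):
        # In a bad configuration the write is silently lost.
        if self.config in self.good:
            self.cells[addr] = value & 0xFF

    def read(self, addr):
        value = self.cells.get(addr, 0x00)
        # In a bad configuration the read comes back inverted.
        return value if self.config in self.good else value ^ 0xFF


def find_working_configs(ram, configs, base=0, length=16):
    """Try each config; keep those where every pattern reads back intact."""
    working = []
    for cfg in configs:
        ram.write_config(cfg)
        ok = True
        for pattern in (0x00, 0xFF, 0x55, 0xAA):
            # Write a different value to every byte in the range...
            for i in range(length):
                ram.write(base + i, pattern ^ i)
            # ...then check that each byte reads back as written.
            for i in range(length):
                if ram.read(base + i) != pattern ^ i:
                    ok = False
        if ok:
            working.append(cfg)
    return working


ram = FakeSdram(good={0x09, 0x0D})
print([f"${c:02X}" for c in find_working_configs(ram, range(0x40))])
```

Using several patterns per configuration matters: a single pattern can match stale data left behind by an earlier, working configuration.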
The above commits should allow automatic inversion of the clock when writing via bit 4 of $C000000.
Using the above new bitstream, I wrote the following program to detect which configurations in $C000000 work:
This suggests that several settings work, at least on this boot (not sure if it is a fail or pass mode case).
I've then had to patch attictest.prg: to bust the cache it writes to ADDR + something, which means that for the end part of the test it writes into the $Cxxxxxx range, where the SDRAM controller treats any write as setting the configuration.
It looks like that previous run was with it in the "pass" state. With it in the fail state, no combination passes the 2nd half of the test, where we write different values over the 16 byte range to check. Okay.... I just somehow flipped the "fail" mode to "good" mode. Poked $D7FE to $10, $30, $00 and $10. But not sure if it is that sequence that did it or not. Anyway, this needs further investigation to figure out what is going on.
Reproduced by poking $D7FE,$00 and then $D7FE,$10. @lydon42's SDRAM test program then passes. The only other change here is that I poke $C000000 twice. But that shouldn't make a difference. Ah, okay, so actually it is that $D7FE must have $10, not $30. Just that single POKE $D7FE,$10 before running the SDRAM settings program is enough to make it work in the "fail" state now, it seems.
Now trying loading the bitstream many times, running @lydon42 's SDRAM test, and if it fails, running my updated SDRAM settings program that does $D7FE,$10 first, to see if we get reliable SDRAM passing. sdram_settings3.prg:
Patched attictest.prg:
Run 1 - Was in FAIL state, but running sdram_settings3.prg fixed it. Used $C000000 value $09 (but many others, more than the 4 in run 2, were indicated as likely good. I wasn't expecting changes in the list of good configs, so didn't write them down).
Above commit makes config $09 default. While that synthesises, continuing to test with previous bitstream:

Run 6 - Was in FAIL state, Configs $09, $0D, $1A, $1E, $29, $2D, $3A and $3E indicated as likely good. Config $09 worked.

12/12 is pretty good :)
Still waiting for that bitstream, but running the old hyperram test program.
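As an aside, the list of likely-good configs from Run 6 has a visible bit pattern. The sketch below just does the bit arithmetic on those eight values. The helper names are made up, and nothing here says what the bits mean beyond what is stated elsewhere in this thread.

```python
# Configs reported as likely good in Run 6.
GOOD = [0x09, 0x0D, 0x1A, 0x1E, 0x29, 0x2D, 0x3A, 0x3E]


def always_set_bits(values):
    """Bits that are 1 in every value."""
    result = ~0
    for v in values:
        result &= v
    return result


def free_bits(values, width=6):
    """Bits that can be toggled in any value without leaving the set."""
    members = set(values)
    return [
        bit
        for bit in range(width)
        if all((v ^ (1 << bit)) in members for v in members)
    ]


print(f"always set: ${always_set_bits(GOOD):02X}")
print(f"free bits: {free_bits(GOOD)}")
```

For this particular list, bit 3 is set in every entry, and toggling bit 2 or bit 5 always lands on another entry. That only describes this one run's list; later results in this thread show bit 2 mattering for other tests.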
Okay, new bitstream is cooked. Testing attictest2.prg on repeated bitstream loads... Run 1 - Instant pass :) So let's look at what config $09 actually means:
Basically we have flipped the SDRAM clock and removed the need for all the inverted clock stuff that we had. So why did this not work originally? Still some questions remain.
Now I need others to test on their R5/R6 boards...
Hmm... Here's my theory as to what might be going on: We were latching on the opposite side of the clock, and thus it was marginal timing as to whether the data arrived a cycle early or not, causing the appearance of the SDRAM maybe not having accepted the latency count via the configuration set command. It's still not totally satisfying, however, as we might expect to see some jitter between reads.
$FFD37FE must be $10, not $14 for SDRAM now, else some failures occur. The difference is that $14 enables the cache. So we have some bug with the slow RAM cache with SDRAM still. That said, the SDRAM without cache is not so much slower than the HyperRAM with cache. But it would still be good to get this working. Hmm... or maybe I have tickled some other problem. Okay, so there is still some problem: Even addressed bytes seem to be ok, but odd addressed bytes are being smeared between successive odd addresses it seems. e.g.:
(note that $48 and $B7 are complementary in binary) But is this happening during read or write? Changing from config $09 to $02 after the write allows us to read back the $FF correctly. So I'm suspecting it is on the read side.
BUT in config 2, we can't write anything. Configs 9 and 2 differ effectively in which side of the clock writes and commands happen on, because flipping bits 0, 1 and 3 inverts both the clock phase, and the phase on which reads occur. This means that the bit for switching whether clock phase is flipped during writes should be working around exactly this problem. But it's not. If we are issuing commands on the wrong side of the clock, then the incorrect command can be latched, including config register set commands. However, we'd expect then that reads and writes would go missing. Well, we are seeing writes go missing in some situations of this... Configs 2 and 9 should have identical read timing, but clearly don't:
Config 9 writes fine, but doesn't read fine.
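The two bit-level claims above are easy to check. A minimal sketch, with a made-up helper name:

```python
def differing_bits(a, b):
    """Return the bit positions in which a and b differ."""
    diff = a ^ b
    return [bit for bit in range(diff.bit_length()) if diff & (1 << bit)]


# $48 and $B7 are bitwise complements of each other within a byte.
print(f"${0x48 ^ 0xB7:02X}")

# Configs $09 and $02 differ in exactly bits 0, 1 and 3.
print(differing_bits(0x09, 0x02))
```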
It should only affect reading, and not writing. And this time at least, it does. Passes the attictest2.prg
Updating sdram_settings4.prg to check odd address behaviour:
It now recommends config $D instead of config $9. hyperramtest.prg also likes config $D better. Read stability etc now happy with the cache enabled :)
Now to try many runs and make sure it always works...

Run 1 - good

The smearing of bits between the successive odd bytes cannot be resolved by any configuration. Tellingly, the 2-byte offset is happening again here, so I am still very much of the view that the root cause here is that the configuration register set command is not being accepted. Try issuing config register set twice in a row, so that the values are stable on the SDRAM bus, even if it is latching on the wrong clock half?
That change is synthesising, but let's see how often this problem is occurring. 4 more good runs in a row. Let's see if the new bitstream helps.
With the new bitstream:
Now doing repeated runs to confirm function, with attictest2.prg. 7 consecutive run(s) passed, just using default config $2. Changing read latching from positive to negative edge does not resolve the problem this time.
Okay, for the first time, I have flipped the SDRAM from fail to pass state, without resetting the FPGA:
How and why?
Get it to boot in fail state again, and then try to reproduce the mode fix. Ah, but the new bitstream has also built. So let's test performance again... Okay, so that looks good. Let's disable the $Cxxxxxx debug register, so that test programs can't mash it.
Something is still timing-sensitive, as the first commit above consistently fails the SDRAM test.
Commit c1ded00 does a few things that seem to make slow mode more stable. Fast mode still seems to be problematic.
SDRAM fails with test program sometimes.
Suspect that command to set SDRAM latencies doesn't always work, so it has latency set one cycle wrong, and timing requirements are then not met at full speed.
Attached test program fails sometimes
attictest.prg.gz
Failing test runs can look like this: