Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

u-boot: Fix the SPI and eMMC hang issue #570

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

BaochengSu
Copy link
Collaborator

@BaochengSu BaochengSu commented Dec 16, 2024

Another fix for the infamous #440 , should be the right one.

Refer to 1 and 2, the root cause is the wrong data type of the timer value.

Thanks [email protected] for the patches.

  1. https://lists.denx.de/pipermail/u-boot/2024-December/574817.html
  2. https://lists.denx.de/pipermail/u-boot/2024-December/574818.html

The SD patch is kept, however most probably the SD card issue is a different one.

Refer to 1 and 2, the root cause is the wrong data type of the timer
value.

Thanks [email protected] for the patches.

1. https://lists.denx.de/pipermail/u-boot/2024-December/574817.html
2. https://lists.denx.de/pipermail/u-boot/2024-December/574818.html

Signed-off-by: Baocheng Su <[email protected]>
@BaochengSu BaochengSu marked this pull request as ready for review December 17, 2024 03:36
file://0012-spi-cadence-quadspi-Fix-error-message-on-stuck-busy-.patch \
file://0013-spi-cadence-quadspi-fix-potential-malfunction-after-.patch \
file://0014-mmc-Fix-potential-timer-value-truncation.patch \
file://0015-driver-iot2050-Add-a-temporary-workaround-for-the-SD.patch \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need patch 15, given patch 14?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually during my speeded-up reproducing (by using a u16 start), I have not observed the SD card issue, this echoes the code exactly, no wrong sized timer variables are used in the SD driver part.

So I think the SD issue is a different one, so I keep the patch 15.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, I actually tested the SD on an advanced device, IIRC, the SD card issue is reported on a basic device, let me try it on the basic device then let's see.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just tested on both PG1 & 2 basic devices, SD card booting does not have such issue pattern. So better to keep the patch 15.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • the mmc fails when also SPI failed
  • patch 15 also deals with timeouts
  • the patched code also used an "int" for holding the timeout
  • patch 14 changes some uint to and ulong in the same code file to address an overflow issue

Isn't all that screaming for a closer look? I suspect sd_send_op_cond is just missing the same fix as __mmc_switch and mmc_send_op_cond receive via patch 14. Please do so, upstream first.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The int timeout in sd_send_op_cond does not bind to any get_timer.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct, but it does not change the key point: Either, patch 15 is no obsolete because the root cause was found and fixed, or we should look for any remaining timer overflows that caused that issues. It is more than likely that the same root cause applies to "our" mmc issue as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's also what I thought, however, after a deep dive into the SD card calling stack (not only, I've checked almost every get_timer calling in the u-boot -- indeed we are in lucky, there are few u32 cases, and most of them are command which does not matter in our issue...), I found there is no other place to overflow the timer, besides the reported place, so this SD card issue is either a false alarm or a different issue, in this case, better to keep the patch 15, since that patch won't cause anything harmful even if it does not fix the issue.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tune get_time to return an overflowing value until the mmc issue reoccurs and debug the root cause / validate that it is fixed by patch 14. Just keeping the hack of patch 15 is not really the way forward anymore, I would say.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants