issue #100 resolved - ASIC not always starting/hashing after boot (due to race condition) #152
Looks like a pretty simple fix. Can you elaborate on how you figured out the tasks were blocking each other? Which part of the tasks?
First of all, I'm sorry for the long post. I tried to organize the log with collapsed sections, but it was not readable anymore. Anyway, back to the topic.

I love to read the source code of things I'm using (if available). While I was playing with the Bitaxe and reading the code, I noticed that a timing factor was involved in hashing not starting. This indicated a possible race condition to me, so I added logging code and started troubleshooting. See the logs below (the first boot and wifi are not included).

Reordering the xTaskCreate task-workers resolved the race condition for me. However, while writing this post, I see that lowering the serial timeout in "BM1366_receive_work" might also be an option (not yet tested).

20:41:42.218 > I (2059) serial: Initializing serial

As you can see, after the line "20:41:43.504 > I (3329) bm1366Module: BM1366_receive_work: wait for a response." (reference below), the ASIC_result_task is IMHO almost dead. Lowering the timeout (currently 60000 ms) might also be a possible solution, but I have not tested it yet.
Below is the second part of the debug log (just in case someone wants to read it ;-)

20:41:49.406 > I (9249) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 12
I did some more troubleshooting in the meantime. Lowering the timeout (currently 60000 ms) in BM1366_receive_work did not help. However, IMHO the root cause might be related to serial communication with the chip, as ASIC_result_task and ASIC_task both deal with the serial port at the same time. When ASIC_result_task starts, one of the first things it does is call SERIAL_clear_buffer(), while ASIC_task, which was started earlier, is already using the serial port. Changing the order of the task-workers as outlined in this PR seems to improve the serial timing and resolved it (for me). I took several logs; just let me know if you want me to put them somewhere.
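To make the suspected interleaving concrete, here is a minimal sketch in plain C. It is not the Bitaxe firmware: the `rx_buffer` type, the `rx_*` helpers, and the reply bytes are all hypothetical stand-ins, and the thread does not confirm that a reply is actually discarded on the device. The sketch only shows why the point at which a receive buffer is cleared matters once another task may already be talking to the chip.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a UART receive buffer (not the real driver). */
#define RX_CAPACITY 64

struct rx_buffer {
    unsigned char data[RX_CAPACITY];
    size_t len;
};

/* Bytes arriving from the chip are appended to the buffer. */
static void rx_push(struct rx_buffer *rx, const unsigned char *bytes, size_t n)
{
    assert(rx->len + n <= RX_CAPACITY);
    memcpy(rx->data + rx->len, bytes, n);
    rx->len += n;
}

/* Models a clear-buffer call: everything not yet read is dropped. */
static void rx_clear(struct rx_buffer *rx)
{
    rx->len = 0;
}

/* Reads up to max bytes; a return value of 0 models a receive timeout. */
static size_t rx_read(struct rx_buffer *rx, unsigned char *out, size_t max)
{
    size_t n = rx->len < max ? rx->len : max;
    memcpy(out, rx->data, n);
    memmove(rx->data, rx->data + n, rx->len - n);
    rx->len -= n;
    return n;
}

/* Assumed bad ordering: the sending task is already active and a reply
 * arrives before the receiving task performs its start-up clear. */
static size_t clear_after_sender_started(void)
{
    struct rx_buffer rx = { .len = 0 };
    const unsigned char reply[4] = { 0xAA, 0x55, 0x01, 0x02 }; /* made-up bytes */
    unsigned char out[4];

    rx_push(&rx, reply, sizeof reply); /* reply to the sender's first command */
    rx_clear(&rx);                     /* receiver starts late and clears */
    return rx_read(&rx, out, sizeof out);
}

/* Assumed safe ordering: the buffer is cleared once, before any task
 * touches the serial port. */
static size_t clear_before_any_task(void)
{
    struct rx_buffer rx = { .len = 0 };
    const unsigned char reply[4] = { 0xAA, 0x55, 0x01, 0x02 }; /* made-up bytes */
    unsigned char out[4];

    rx_clear(&rx);                     /* one-time clear during start-up */
    rx_push(&rx, reply, sizeof reply); /* reply arrives afterwards */
    return rx_read(&rx, out, sizeof out);
}
```

In this model, `clear_after_sender_started()` returns 0 (the receiver finds nothing to read and would sit in its timeout), while `clear_before_any_task()` returns 4. Starting the receiving task first, or clearing the buffer once before either task is created, both avoid the first ordering.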
This is awesome, thanks for looking into it @MoellerDi. I have also noticed that on some occasions mining fails to start. It would make sense that this is a race condition. I'll give this a try on my Bitaxe and see how it works. We do have some smarter serial parsing on the hex branch that should be merged into master at some point.
This reverts commit c313a6f.
…ing serial from ASIC_task / ASIC_result_task to app_main()
@skot / @benjamin-wilson
It looks good, I'll test this on a max, ultra and supra. If it works, I'll merge.
This PR should fix #100 by resolving a race condition during boot. It seems the work tasks/queues were blocking each other.