issue #100 resolved - ASIC not always starting/hashing after boot (due to race condition) #152

Merged

Conversation

MoellerDi
Contributor

This PR should fix #100 by resolving a race condition during boot. It seems the work tasks/queues were blocking each other.

@benjamin-wilson
Collaborator

Looks like a pretty simple fix. Can you elaborate on how you figured out the tasks were blocking each other? Which part of the tasks?

@MoellerDi
Contributor Author

First of all, I'm sorry for the long, log-heavy post. I tried to organize the logs with collapsed sections, but the result was not readable anymore...

Anyway, back to the topic. I love reading the source code of the things I use (when available). While playing with the Bitaxe and reading the code, I noticed that there was a timing factor involved when hashing failed to start, which pointed to a possible race condition. So I added logging code and started troubleshooting.

See the logs below (the first boot and the WiFi setup are not included).

Reordering the xTaskCreate task workers resolved the race condition for me. However, while writing this post, I realized that lowering the serial timeout in BM1366_receive_work might also be an option (not yet tested).
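
To make the ordering point concrete, here is a minimal sketch of that part of the startup code. This is only an illustration: the task function names are taken from the log tags below, while the task labels, stack sizes, priorities and the GLOBAL_STATE argument are placeholders rather than the project's actual values, and the exact order used in this PR is in the diff.

// Sketch only -- the point is that the relative order of these xTaskCreate()
// calls decides which worker touches the UART first during boot.
xTaskCreate(stratum_task,     "stratum_task",     8192, (void *) &GLOBAL_STATE, 5, NULL);
xTaskCreate(create_jobs_task, "create_jobs_task", 8192, (void *) &GLOBAL_STATE, 5, NULL);
xTaskCreate(ASIC_result_task, "asic_result_task", 8192, (void *) &GLOBAL_STATE, 5, NULL);
xTaskCreate(ASIC_task,        "ASIC_task",        8192, (void *) &GLOBAL_STATE, 5, NULL);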

20:41:42.218 > I (2059) serial: Initializing serial
20:41:42.223 > I (2059) bm1366Module: Initializing BM1366
20:41:43.416 > I (3259) bm1366Module: 1 chip(s) detected on the chain
20:41:43.449 > final refdiv: 2, fbdiv: 220, postdiv1: 5, postdiv2: 1, min diff value: 0.000000
20:41:43.451 > I (3289) bm1366Module: Setting Frequency to 550.00MHz (0.01)
20:41:43.456 > I (3289) stratum_task: Get IP for URL: public-pool.io
20:41:43.457 >
20:41:43.465 > I (3299) stratum_task: Connecting to: stratum+tcp://public-pool.io:21496 (68.235.52.36)
20:41:43.465 >
20:41:43.473 > I (3309) stratum_task: Socket created, connecting to 68.235.52.36:21496
20:41:43.479 > I (3309) bm1366Module: Setting max baud of 1000000
20:41:43.484 > I (3319) asic_result: ASIC_result_task started
20:41:43.492 > I (3319) asic_result: ASIC_result_task still alive and trying to call receive_result_fn
20:41:43.497 > I (3329) serial: Changing UART baud to 1000000
20:41:43.504 > I (3329) bm1366Module: BM1366_receive_work: wait for a response. -> or for a 60000ms timeout
20:41:43.508 > I (3339) ASIC_task: ASIC Ready!
20:41:43.515 > I (3349) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 0
20:41:43.520 > I (3319) main_task: Returned from app_main()
20:41:43.651 > I (3489) stratum_api: tx: {"id": 1, "method": "mining.subscribe", "params": ["bitaxe/BM1366"]}
20:41:43.652 >
20:41:43.856 > I (3689) stratum_api: Received result {"id":1,"error":null,"result":[[["mining.notify","52ad8130"]],"52ad8130",4]}
20:41:43.867 > I (3699) stratum_api: tx: {"id": 2, "method": "mining.configure", "params": [["version-rolling"], {"version-rolling.mask": "ffffffff"}]}
20:41:43.869 >
20:41:44.061 > I (3899) stratum_api: Received result {"id":2,"error":null,"result":{"version-rolling":true,"version-rolling.mask":"1fffe000"}}
20:41:44.067 > I (3899) stratum_api: Set version mask: 1fffe000
20:41:44.076 > I (3909) stratum_api: tx: {"id": 3, "method": "mining.suggest_difficulty", "params": [1024]}
20:41:44.076 >
20:41:44.087 > I (3919) stratum_api: tx: {"id": 4, "method": "mining.authorize", "params": [".BitaxeUltra", "x"]}
20:41:44.090 >
20:41:44.096 > I (3929) stratum_task: stratum_queue.count: 0
20:41:44.268 > I (4109) stratum_task: rx: {"id":null,"method":"mining.set_difficulty","params":[1024]}
20:41:44.271 > I (4109) stratum_task: Set stratum difficulty: 1024
20:41:44.276 > I (4109) stratum_task: stratum_queue.count: 0
20:41:44.479 > I (4319) stratum_task: rx: {"id":4,"error":null,"result":true}
20:41:44.479 > I (4319) stratum_task: message result accepted
20:41:44.485 > I (4319) stratum_task: stratum_queue.count: 0
20:41:44.735 > I (4569) stratum_task: rx: {"id":null,"method":"mining.notify","params":["2aa484d","f8cc186bd5462799d796b113389a1de15e9bd62100003eb10000000000000000","02000000010000000000000000000000000000000000000000000000000000000000000000ffffffff1703b2c20c5075626c69632d506f6f6c","ffffffff02301d182700000000160014979f36b9a92622b82767820f6477bdefc03e86b70000000000000000266a24aa21a9ed55ac346ea3144fcb8d261a3194b18a367657d3a7415548980a1e356194b0fa1400000000",["7d1f19df501e44c041d4658d2d4ae49636baeb0d94b7043ce8fd9e9762897c45","7ec186fe54b7671c58c5ea7e59eec6c419a799a02b26ad0b9f3aefc03be8aef0","b53860582d16530a8946f3d5e11e46cf91f02cafdd637ea6f83a13950464c99a","1e3efad6c254c0b73705499dfaebabdc4d80a795e8ac82df866bba47538bbce5","3435bb107fcd3d82c6f3a27cabeade530ba73d81b4dede9039e84613ee7342e5","14f72162593abfc2d287d1e3654cffddb317068313961643ab95ecec53e00ee5","225c74467015ca7b4bc5b0a619b9532e0b0a92c45124236148d807550ff550f1","dec2d4d1ab2f5fd759d8efeea69f7015ed56d3a975c63f83e7450a8d1e632df7","1548aa1d0e57c68bc67afefad7247fa6ef90732c040eae2b5da33ab04a35081b","640b7abd71c9463a94220857442fc8baff4d4144abd907683f7a71cb4b5e7797","87c03045fe66b0f6fd01eabbbd9cf4bc2ffce21c03fd3550f53389c422b26443"],"20000000","17035a59","6601d35a",false]}
20:41:44.836 > I (4669) SystemModule: Syncing clock
20:41:44.841 > I (4679) stratum_task: stratum_queue.count: 1
20:41:44.846 > I (4679) create_jobs_task: New Work Dequeued 2aa484d
20:41:44.855 > I (4689) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 0
20:41:44.862 > I (4689) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 1
20:41:44.870 > I (4689) ASIC_task: Task dequeued job: 2aa484d. ASIC_jobs_queue.count: 0
20:41:44.878 > I (4699) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 0
20:41:44.884 > I (4709) ASIC_task: send Task to ASIC. job: 2aa484d
20:41:44.896 > I (4719) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 1
20:41:44.896 > I (4719) bm1366Module: pthread_mutex_lock
20:41:44.906 > I (4729) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 1
20:41:44.909 > I (4739) bm1366Module: TX Job ID: 08
20:41:44.918 > I (4739) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 2
20:41:44.923 > I (4749) bm1366Module: pthread_mutex_unlock
20:41:44.931 > I (4759) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 2
20:41:44.936 > I (4769) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 3
20:41:44.945 > I (4779) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 3
20:41:44.952 > I (4779) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 4
20:41:44.963 > I (4789) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 4
20:41:44.968 > I (4799) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 5
20:41:44.976 > I (4809) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 5
20:41:44.984 > I (4819) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 6
20:41:44.992 > I (4829) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 6
20:41:45.000 > I (4829) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 7
20:41:45.008 > I (4839) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 7
20:41:45.015 > I (4849) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 8
20:41:45.024 > I (4859) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 8
20:41:45.031 > I (4859) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 9
20:41:45.039 > I (4869) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 9
20:41:45.047 > I (4879) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 10
20:41:45.055 > I (4889) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 10
20:41:45.063 > I (4889) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 11
20:41:45.072 > I (4899) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 11
20:41:45.079 > I (4909) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 12
20:41:45.088 > I (4919) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 12
20:41:47.146 > I (6989) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 12
20:41:47.149 > I (6989) ASIC_task: Task dequeued job: 2aa484d. ASIC_jobs_queue.count: 11
20:41:47.157 > I (6989) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 12
20:41:47.163 > I (6989) ASIC_task: send Task to ASIC. job: 2aa484d
20:41:47.171 > I (6999) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 12
20:41:47.176 > I (6999) bm1366Module: pthread_mutex_lock
20:41:47.180 > I (7019) bm1366Module: TX Job ID: 10
20:41:47.185 > I (7019) bm1366Module: pthread_mutex_unlock

As you can see, after the line "20:41:43.504 > I (3329) bm1366Module: BM1366_receive_work: wait for a response." (code reference below), the ASIC_result_task is, IMHO, effectively dead. Lowering the timeout (currently 60000 ms) might also be a possible solution, but I have not tested it yet.

asic_result * BM1366_receive_work(void)
{
    // wait for a response, wait time is pretty arbitrary
    ESP_LOGI(TAG, "BM1366_receive_work: wait for a response.");
    int received = SERIAL_rx(asic_response_buffer, 11, 60000);
    ESP_LOGI(TAG, "BM1366_receive_work: ok received something");
<cut>
void ASIC_result_task(void *pvParameters)
{
    GlobalState *GLOBAL_STATE = (GlobalState *)pvParameters;
    SERIAL_clear_buffer();
    char *user = nvs_config_get_string(NVS_CONFIG_STRATUM_USER, STRATUM_USER);
    ESP_LOGI(TAG, "ASIC_result_task started");
    while (1)
    {
        ESP_LOGI(TAG, "ASIC_result_task still alive and trying to call receive_result_fn");
        task_result *asic_result = (*GLOBAL_STATE->ASIC_functions.receive_result_fn)(GLOBAL_STATE);
        ESP_LOGI(TAG, "ASIC_result_task still alive and receive_result_fn got called");
<cut>

Below is the second part of the debug log (just in case someone wants to read it ;-)

20:41:49.406 > I (9249) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 12
20:41:49.410 > I (9249) ASIC_task: Task dequeued job: 2aa484d. ASIC_jobs_queue.count: 11
20:41:49.417 > I (9249) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 12
20:41:49.423 > I (9249) ASIC_task: send Task to ASIC. job: 2aa484d
20:41:49.431 > I (9259) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 12
20:41:49.436 > I (9259) bm1366Module: pthread_mutex_lock
20:41:49.440 > I (9279) bm1366Module: TX Job ID: 18
20:41:49.445 > I (9279) bm1366Module: pthread_mutex_unlock
20:41:51.666 > I (11509) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 12
20:41:51.670 > I (11509) ASIC_task: Task dequeued job: 2aa484d. ASIC_jobs_queue.count: 11
20:41:51.677 > I (11509) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 12
20:41:51.683 > I (11509) ASIC_task: send Task to ASIC. job: 2aa484d
20:41:51.691 > I (11519) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 12
20:41:51.696 > I (11519) bm1366Module: pthread_mutex_lock
20:41:51.700 > I (11539) bm1366Module: TX Job ID: 20
20:41:51.705 > I (11539) bm1366Module: pthread_mutex_unlock
20:41:53.936 > I (13779) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 12
20:41:53.940 > I (13779) ASIC_task: Task dequeued job: 2aa484d. ASIC_jobs_queue.count: 11
20:41:53.947 > I (13779) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 12
20:41:53.953 > I (13779) ASIC_task: send Task to ASIC. job: 2aa484d
20:41:53.961 > I (13789) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 12
20:41:53.966 > I (13789) bm1366Module: pthread_mutex_lock
20:41:53.970 > I (13809) bm1366Module: TX Job ID: 28
20:41:53.975 > I (13809) bm1366Module: pthread_mutex_unlock
20:41:56.206 > I (16049) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 12
20:41:56.210 > I (16049) ASIC_task: Task dequeued job: 2aa484d. ASIC_jobs_queue.count: 11
20:41:56.217 > I (16049) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 12
20:41:56.223 > I (16049) ASIC_task: send Task to ASIC. job: 2aa484d
20:41:56.231 > I (16059) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 12
20:41:56.236 > I (16059) bm1366Module: pthread_mutex_lock
20:41:56.241 > I (16079) bm1366Module: TX Job ID: 30
20:41:56.245 > I (16079) bm1366Module: pthread_mutex_unlock
20:41:58.476 > I (18319) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 12
20:41:58.480 > I (18319) ASIC_task: Task dequeued job: 2aa484d. ASIC_jobs_queue.count: 11
20:41:58.487 > I (18319) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 12
20:41:58.493 > I (18319) ASIC_task: send Task to ASIC. job: 2aa484d
20:41:58.501 > I (18329) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 12
20:41:58.506 > I (18329) bm1366Module: pthread_mutex_lock
20:41:58.510 > I (18349) bm1366Module: TX Job ID: 38
20:41:58.515 > I (18349) bm1366Module: pthread_mutex_unlock
20:42:00.746 > I (20589) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 12
20:42:00.750 > I (20589) ASIC_task: Task dequeued job: 2aa484d. ASIC_jobs_queue.count: 11
20:42:00.757 > I (20589) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 12
20:42:00.763 > I (20589) ASIC_task: send Task to ASIC. job: 2aa484d
20:42:00.771 > I (20599) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 12
20:42:00.776 > I (20599) bm1366Module: pthread_mutex_lock
20:42:00.780 > I (20619) bm1366Module: TX Job ID: 40
20:42:00.785 > I (20619) bm1366Module: pthread_mutex_unlock
20:42:03.006 > I (22849) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 12
20:42:03.010 > I (22849) ASIC_task: Task dequeued job: 2aa484d. ASIC_jobs_queue.count: 11
20:42:03.017 > I (22849) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 12
20:42:03.023 > I (22849) ASIC_task: send Task to ASIC. job: 2aa484d
20:42:03.031 > I (22859) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 12
20:42:03.036 > I (22859) bm1366Module: pthread_mutex_lock
20:42:03.040 > I (22879) bm1366Module: TX Job ID: 48
20:42:03.046 > I (22879) bm1366Module: pthread_mutex_unlock
20:42:05.276 > I (25119) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 12
20:42:05.280 > I (25119) ASIC_task: Task dequeued job: 2aa484d. ASIC_jobs_queue.count: 11
20:42:05.287 > I (25119) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 12
20:42:05.293 > I (25119) ASIC_task: send Task to ASIC. job: 2aa484d
20:42:05.302 > I (25129) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 12
20:42:05.306 > I (25129) bm1366Module: pthread_mutex_lock
20:42:05.310 > I (25149) bm1366Module: TX Job ID: 50
20:42:05.315 > I (25149) bm1366Module: pthread_mutex_unlock
20:42:07.546 > I (27389) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 12
20:42:07.550 > I (27389) ASIC_task: Task dequeued job: 2aa484d. ASIC_jobs_queue.count: 11
20:42:07.557 > I (27389) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 12
20:42:07.563 > I (27389) ASIC_task: send Task to ASIC. job: 2aa484d
20:42:07.571 > I (27399) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 12
20:42:07.576 > I (27399) bm1366Module: pthread_mutex_lock
20:42:07.580 > I (27419) bm1366Module: TX Job ID: 58
20:42:07.586 > I (27419) bm1366Module: pthread_mutex_unlock
20:42:09.816 > I (29659) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 12
20:42:09.820 > I (29659) ASIC_task: Task dequeued job: 2aa484d. ASIC_jobs_queue.count: 11
20:42:09.827 > I (29659) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 12
20:42:09.833 > I (29659) ASIC_task: send Task to ASIC. job: 2aa484d
20:42:09.841 > I (29669) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 12
20:42:09.846 > I (29669) bm1366Module: pthread_mutex_lock
20:42:09.850 > I (29689) bm1366Module: TX Job ID: 60
20:42:09.856 > I (29689) bm1366Module: pthread_mutex_unlock
20:42:12.086 > I (31929) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 12
20:42:12.090 > I (31929) ASIC_task: Task dequeued job: 2aa484d. ASIC_jobs_queue.count: 11
20:42:12.097 > I (31929) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 12
20:42:12.103 > I (31929) ASIC_task: send Task to ASIC. job: 2aa484d
20:42:12.111 > I (31939) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 12
20:42:12.116 > I (31939) bm1366Module: pthread_mutex_lock
20:42:12.120 > I (31959) bm1366Module: TX Job ID: 68
20:42:12.126 > I (31959) bm1366Module: pthread_mutex_unlock
20:42:14.356 > I (34199) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 12
20:42:14.360 > I (34199) ASIC_task: Task dequeued job: 2aa484d. ASIC_jobs_queue.count: 11
20:42:14.367 > I (34199) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 12
20:42:14.373 > I (34199) ASIC_task: send Task to ASIC. job: 2aa484d
20:42:14.381 > I (34209) create_jobs_task: Ready to enqueue work 2aa484d. ASIC_jobs_queue.count: 12
20:42:14.386 > I (34209) bm1366Module: pthread_mutex_lock
20:42:14.390 > I (34229) bm1366Module: TX Job ID: 70
20:42:14.395 > I (34229) bm1366Module: pthread_mutex_unlock
20:42:14.986 > I (34819) stratum_task: rx: {"id":null,"method":"mining.notify","params":["2aa4860","f8cc186bd5462799d796b113389a1de15e9bd62100003eb10000000000000000","02000000010000000000000000000000000000000000000000000000000000000000000000ffffffff1703b2c20c5075626c69632d506f6f6c","ffffffff02301d182700000000160014979f36b9a92622b82767820f6477bdefc03e86b70000000000000000266a24aa21a9ed55ac346ea3144fcb8d261a3194b18a367657d3a7415548980a1e356194b0fa1400000000",["7d1f19df501e44c041d4658d2d4ae49636baeb0d94b7043ce8fd9e9762897c45","7ec186fe54b7671c58c5ea7e59eec6c419a799a02b26ad0b9f3aefc03be8aef0","b53860582d16530a8946f3d5e11e46cf91f02cafdd637ea6f83a13950464c99a","1e3efad6c254c0b73705499dfaebabdc4d80a795e8ac82df866bba47538bbce5","3435bb107fcd3d82c6f3a27cabeade530ba73d81b4dede9039e84613ee7342e5","14f72162593abfc2d287d1e3654cffddb317068313961643ab95ecec53e00ee5","225c74467015ca7b4bc5b0a619b9532e0b0a92c45124236148d807550ff550f1","dec2d4d1ab2f5fd759d8efeea69f7015ed56d3a975c63f83e7450a8d1e632df7","1548aa1d0e57c68bc67afefad7247fa6ef90732c040eae2b5da33ab04a35081b","640b7abd71c9463a94220857442fc8baff4d4144abd907683f7a71cb4b5e7797","87c03045fe66b0f6fd01eabbbd9cf4bc2ffce21c03fd3550f53389c422b26443"],"20000000","17035a59","6601d396",false]}
20:42:15.087 > I (34919) stratum_task: stratum_queue.count: 1
20:42:16.626 > I (36469) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 12
20:42:16.630 > I (36469) ASIC_task: Task dequeued job: 2aa484d. ASIC_jobs_queue.count: 11
20:42:16.637 > I (36469) create_jobs_task: Work enqueued 2aa484d. ASIC_jobs_queue.count: 12
20:42:16.643 > I (36469) ASIC_task: send Task to ASIC. job: 2aa484d
20:42:16.649 > I (36479) create_jobs_task: New Work Dequeued 2aa4860
20:42:16.654 > I (36479) bm1366Module: pthread_mutex_lock
20:42:16.662 > I (36489) create_jobs_task: Ready to enqueue work 2aa4860. ASIC_jobs_queue.count: 12
20:42:16.666 > I (36489) bm1366Module: TX Job ID: 78
20:42:16.671 > I (36509) bm1366Module: pthread_mutex_unlock
20:42:18.896 > I (38739) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 12
20:42:18.900 > I (38739) ASIC_task: Task dequeued job: 2aa484d. ASIC_jobs_queue.count: 11
20:42:18.907 > I (38739) create_jobs_task: Work enqueued 2aa4860. ASIC_jobs_queue.count: 12
20:42:18.913 > I (38739) ASIC_task: send Task to ASIC. job: 2aa484d
20:42:18.921 > I (38749) create_jobs_task: Ready to enqueue work 2aa4860. ASIC_jobs_queue.count: 12
20:42:18.926 > I (38749) bm1366Module: pthread_mutex_lock
20:42:18.930 > I (38769) bm1366Module: TX Job ID: 00
20:42:18.935 > I (38769) bm1366Module: pthread_mutex_unlock
20:42:21.166 > I (41009) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 12
20:42:21.170 > I (41009) ASIC_task: Task dequeued job: 2aa484d. ASIC_jobs_queue.count: 11
20:42:21.177 > I (41009) create_jobs_task: Work enqueued 2aa4860. ASIC_jobs_queue.count: 12
20:42:21.183 > I (41009) ASIC_task: send Task to ASIC. job: 2aa484d
20:42:21.191 > I (41019) create_jobs_task: Ready to enqueue work 2aa4860. ASIC_jobs_queue.count: 12
20:42:21.196 > I (41019) bm1366Module: pthread_mutex_lock
20:42:21.200 > I (41039) bm1366Module: TX Job ID: 08
20:42:21.205 > I (41039) bm1366Module: pthread_mutex_unlock
20:42:23.436 > I (43279) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 12
20:42:23.440 > I (43279) ASIC_task: Task dequeued job: 2aa484d. ASIC_jobs_queue.count: 11
20:42:23.447 > I (43279) create_jobs_task: Work enqueued 2aa4860. ASIC_jobs_queue.count: 12
20:42:23.453 > I (43279) ASIC_task: send Task to ASIC. job: 2aa484d
20:42:23.461 > I (43289) create_jobs_task: Ready to enqueue work 2aa4860. ASIC_jobs_queue.count: 12
20:42:23.466 > I (43289) bm1366Module: pthread_mutex_lock
20:42:23.470 > I (43309) bm1366Module: TX Job ID: 10
20:42:23.476 > I (43309) bm1366Module: pthread_mutex_unlock
20:42:25.706 > I (45549) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 12
20:42:25.710 > I (45549) ASIC_task: Task dequeued job: 2aa484d. ASIC_jobs_queue.count: 11
20:42:25.717 > I (45549) create_jobs_task: Work enqueued 2aa4860. ASIC_jobs_queue.count: 12
20:42:25.723 > I (45549) ASIC_task: send Task to ASIC. job: 2aa484d
20:42:25.731 > I (45559) create_jobs_task: Ready to enqueue work 2aa4860. ASIC_jobs_queue.count: 12
20:42:25.736 > I (45559) bm1366Module: pthread_mutex_lock
20:42:25.740 > I (45579) bm1366Module: TX Job ID: 18
20:42:25.745 > I (45579) bm1366Module: pthread_mutex_unlock
20:42:27.976 > I (47819) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 12
20:42:27.980 > I (47819) ASIC_task: Task dequeued job: 2aa484d. ASIC_jobs_queue.count: 11
20:42:27.987 > I (47819) create_jobs_task: Work enqueued 2aa4860. ASIC_jobs_queue.count: 12
20:42:27.993 > I (47819) ASIC_task: send Task to ASIC. job: 2aa484d
20:42:28.001 > I (47829) create_jobs_task: Ready to enqueue work 2aa4860. ASIC_jobs_queue.count: 12
20:42:28.006 > I (47829) bm1366Module: pthread_mutex_lock
20:42:28.010 > I (47849) bm1366Module: TX Job ID: 20
20:42:28.015 > I (47849) bm1366Module: pthread_mutex_unlock
20:42:30.246 > I (50089) ASIC_task: Ready to dequeue Task. ASIC_jobs_queue.count: 12
20:42:30.250 > I (50089) ASIC_task: Task dequeued job: 2aa484d. ASIC_jobs_queue.count: 11
20:42:30.257 > I (50089) create_jobs_task: Work enqueued 2aa4860. ASIC_jobs_queue.count: 12
20:42:30.263 > I (50089) ASIC_task: send Task to ASIC. job: 2aa484d
20:42:30.271 > I (50099) create_jobs_task: Ready to enqueue work 2aa4860. ASIC_jobs_queue.count: 12
20:42:30.276 > I (50099) bm1366Module: pthread_mutex_lock
20:42:30.280 > I (50119) bm1366Module: TX Job ID: 28
20:42:30.285 > I (50119) bm1366Module: pthread_mutex_unlock

@MoellerDi
Contributor Author

I did some more troubleshooting in the meantime. Lowering the timeout (currently 60000 ms) in BM1366_receive_work did not help.

However, IMHO the root cause might be related to the serial communication with the chip, since ASIC_result_task and ASIC_task are dealing with the serial port at the same time.

When ASIC_result_task gets started, one of the first things it does is call SERIAL_clear_buffer().

Since ASIC_task is started earlier and is already working with the serial port (calling SERIAL_set_baud, ASIC_functions.set_difficulty_mask_fn, ASIC_functions.send_work_fn, etc.), this might, IMHO, cause ASIC_result_task to get somewhat out of sync; likewise, ASIC_result_task calling SERIAL_clear_buffer while ASIC_task is already running might cause issues as well.

However, changing the order of the task workers as outlined in this PR seems to improve the serial timing and resolved the issue (for me). I took several logs; just let me know if you want me to put them somewhere.
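
This PR does not add any explicit synchronization, but to make the dependency described above concrete, here is a minimal sketch of one generic way it could be expressed with a FreeRTOS event group. SERIAL_clear_buffer is the project function discussed above; serial_ready_events, SERIAL_RX_READY_BIT and the elided bodies are hypothetical.

#include "freertos/FreeRTOS.h"
#include "freertos/event_groups.h"

// Hypothetical handle, created once in app_main() with xEventGroupCreate().
static EventGroupHandle_t serial_ready_events;
#define SERIAL_RX_READY_BIT (1 << 0)

void ASIC_result_task(void *pvParameters)
{
    // Prepare the RX side first, then announce that the UART is ready.
    SERIAL_clear_buffer();
    xEventGroupSetBits(serial_ready_events, SERIAL_RX_READY_BIT);
    // ... receive loop as in the snippet above ...
}

void ASIC_task(void *pvParameters)
{
    // Block until the receiver has cleared the buffer before sending anything.
    xEventGroupWaitBits(serial_ready_events, SERIAL_RX_READY_BIT,
                        pdFALSE, pdTRUE, portMAX_DELAY);
    // ... set baud / difficulty mask / send work as before ...
}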

@skot
Owner

skot commented Mar 27, 2024

This is awesome, thanks for looking into it @MoellerDi

I have also noticed on some occasions mining fails to start. It would make sense that this is a race condition.

I'll give this a try on my Bitaxe and see how it works.

We do have some smarter serial parsing on the hex branch that should be merged into master at some point.

@MoellerDi
Contributor Author

@skot / @benjamin-wilson
Thanks for your comments. I changed my mind and moved some of the critical serial setup into app_main(). This seems to be a much cleaner approach. May I ask you to review it once again?
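
Roughly, that approach has the following shape (a sketch under assumptions: SERIAL_clear_buffer is the function named earlier in this thread, the task list in the comment matches the log tags, and everything else is illustrative rather than the actual diff):

void app_main(void)
{
    // ... existing init (NVS, WiFi, stratum connection setup) ...

    // One-time serial preparation moved out of the worker tasks, so that
    // nothing can clear or reconfigure the UART once the workers are running.
    SERIAL_clear_buffer();   // previously done at the top of ASIC_result_task

    // Only now create the stratum / create_jobs / ASIC / ASIC_result tasks
    // with xTaskCreate(), as in the earlier sketch.
}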

@benjamin-wilson
Collaborator

It looks good. I'll test this on a Max, Ultra, and Supra. If it works, I'll merge.

@benjamin-wilson merged commit ec4af81 into skot:master on May 24, 2024
1 check passed
aaron3481 pushed a commit to aaron3481/ESP-Miner that referenced this pull request Jun 11, 2024
… (due to race condition) (skot#152)

* fix ASIC not starting - order is important

* Revert "fix ASIC not starting - order is important"

This reverts commit c313a6f.

* better approach resolving race condition in serial communication; moving serial from ASIC_task / ASIC_result_task to app_main()
Successfully merging this pull request may close these issues.

Have to press the restart button in AxeOS to start hashing