Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

InstrFetchProhibited: AsyncClient::_close() -> _removeAllCallbacks #4

Open
DOSpt opened this issue Apr 11, 2021 · 5 comments
Open

InstrFetchProhibited: AsyncClient::_close() -> _removeAllCallbacks #4

DOSpt opened this issue Apr 11, 2021 · 5 comments

Comments

@DOSpt
Copy link

DOSpt commented Apr 11, 2021

Hi

I've started using this lib due to the locking problems with the original asyncTCP.

Last week while doing some tests I got this error on 2 different projects. It seems that the pointer to _removeAllCallbacks was cleared. The only way I see this happening is by the destructor being called, although it seems very strange. I was trying to replicate the problem but it doesn't seem to be easy to force it to happen. Regarding heap I know that I'm ok. Just before it happened it was like this: Heap: Free:157588, Min:117616, Size:340244, Alloc:73304.

I believe we can increase the scope of the _asyncsock_mutex until the end of the close method. Other thing that might be happening is that the program counter is pointing to the next instruction after the problem. In this case it could be while doing this (if (_discard_cb) _discard_cb(_discard_cb_arg, this);). If the _error method is called right after checking if the pointer is null it will clear this pointer and generate the error. Either way increasing the scope of the _asyncsock_mutex should solve the problem.

This happened immediately after a simple post with less than 150bytes.

Guru Meditation Error: Core  0 panic'ed (InstrFetchProhibited). Exception was unhandled.
Core 0 register dump:
PC      : 0x11ef03e1  PS      : 0x00060130  A0      : 0x800e970f  A1      : 0x3ffd8790  
A2      : 0x3ffdf258  A3      : 0x00000000  A4      : 0x3ffdf258  A5      : 0x00000001  
A6      : 0x00060720  A7      : 0x00000000  A8      : 0x800e8d94  A9      : 0x3ffd8770  
A10     : 0x3ffdf2b8  A11     : 0x3ffdf2b8  A12     : 0x00000003  A13     : 0x00000000  
A14     : 0x00000000  A15     : 0x00000000  SAR     : 0x0000000a  EXCCAUSE: 0x00000014  
EXCVADDR: 0x11ef03e0  LBEG    : 0x4000c2e0  LEND    : 0x4000c2f6  LCOUNT  : 0xffffffff  

ELF file SHA256: 0000000000000000

Backtrace: 0x11ef03e1:0x3ffd8790 0x400e970c:0x3ffd87b0 0x400e9b73:0x3ffd87d0 0x400e95ee:0x3ffd8800 0x4008a332:0x3ffd8860
PC: 0x11ef03e1
EXCVADDR: 0x11ef03e0

Decoding stack results
0x400e970c: AsyncClient::_close() at C:\Users\Daniel\Documents\Arduino\libraries\AsyncTCP\src\AsyncTCP.cpp line 744
0x400e9b73: AsyncClient::_sockPoll() at C:\Users\Daniel\Documents\Arduino\libraries\AsyncTCP\src\AsyncTCP.cpp line 706
0x400e95ee: _asynctcpsock_task(void*) at C:\Users\Daniel\Documents\Arduino\libraries\AsyncTCP\src\AsyncTCP.cpp line 206
0x4008a332: vPortTaskWrapper at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/port.c line 143
@DOSpt
Copy link
Author

DOSpt commented Apr 11, 2021

I've just increased the scope and added an extra validation that _socket is still -1 and got the following error after some minutes with 2 PCs updating values every 300ms.

Guru Meditation Error: Core  0 panic'ed (InstrFetchProhibited). Exception was unhandled.
Core 0 register dump:
PC      : 0x00070000  PS      : 0x00060330  A0      : 0x800e8dcc  A1      : 0x3ffd8490  
A2      : 0x3ffde3e4  A3      : 0x3ffd84ac  A4      : 0x3ffd84a8  A5      : 0x00000000  
A6      : 0x00000000  A7      : 0x000001a3  A8      : 0x8017fdc4  A9      : 0x3ffd8460  
A10     : 0x3ffde3e4  A11     : 0x3ffde3e4  A12     : 0x00000003  A13     : 0x00000001  
A14     : 0x00060120  A15     : 0x00000000  SAR     : 0x0000000a  EXCCAUSE: 0x00000014  
EXCVADDR: 0x00070000  LBEG    : 0x4000c2e0  LEND    : 0x4000c2f6  LCOUNT  : 0xffffffff  

ELF file SHA256: 0000000000000000

Backtrace: 0x00070000:0x3ffd8490 0x400e8dc9:0x3ffd84b0 0x400e9797:0x3ffd84d0 0x400e97ae:0x3ffd84f0 0x400e3276:0x3ffd8510 0x400e353e:0x3ffd8530 0x400e3561:0x3ffd8550 0x400e9222:0x3ffd8570 0x400e9c1e:0x3ffd85a0 0x400e967e:0x3ffd85d0 0x4008a332:0x3ffd8630
PC: 0x00070000
EXCVADDR: 0x00070000

Decoding stack results
0x400e8dc9: AsyncClient::_removeAllCallbacks() at C:\Users\Daniel\Documents\Arduino\libraries\AsyncTCP\src\AsyncTCP.cpp line 717
0x400e9797: AsyncClient::_close() at C:\Users\Daniel\Documents\Arduino\libraries\AsyncTCP\src\AsyncTCP.cpp line 745
0x400e97ae: AsyncClient::close(bool) at C:\Users\Daniel\Documents\Arduino\libraries\AsyncTCP\src\AsyncTCP.cpp line 863
0x400e3276: AsyncWebServerRequest::_onAck(unsigned int, unsigned int) at C:\Users\Daniel\Documents\Arduino\libraries\ESPAsyncWebServer\src\WebRequest.cpp line 224
0x400e353e: AsyncWebServerRequest::_onPoll() at C:\Users\Daniel\Documents\Arduino\libraries\ESPAsyncWebServer\src\WebRequest.cpp line 203
0x400e3561: std::_Function_handler   >::_M_invoke(const std::_Any_data &,  ,  ) at C:\Users\Daniel\Documents\Arduino\libraries\ESPAsyncWebServer\src\WebRequest.cpp line 73
0x400e9222: std::function ::operator()(void*, AsyncClient*) const at c:\users\daniel\appdata\local\arduino15\packages\esp32\tools\xtensa-esp32-elf-gcc\1.22.0-97-gc752ad5-5.2.0\xtensa-esp32-elf\include\c++\5.2.0/functional line 2271
0x400e9c1e: AsyncClient::_sockPoll() at C:\Users\Daniel\Documents\Arduino\libraries\AsyncTCP\src\AsyncTCP.cpp line 711
0x400e967e: _asynctcpsock_task(void*) at C:\Users\Daniel\Documents\Arduino\libraries\AsyncTCP\src\AsyncTCP.cpp line 206
0x4008a332: vPortTaskWrapper at /home/runner/work/esp32-arduino-lib-builder/esp32-arduino-lib-builder/esp-idf/components/freertos/port.c line 143

Changes:

void AsyncClient::_close(void)
{
    //Serial.print("AsyncClient::_close: "); Serial.println(_socket);
    xSemaphoreTakeRecursive(_asyncsock_mutex, (TickType_t)portMAX_DELAY);
	if(_socket != -1)
	{
		_conn_state = 0;
		lwip_close_r(_socket);
		_socket = -1;

		_clearWriteQueue();
		if (_discard_cb) _discard_cb(_discard_cb_arg, this);
		_removeAllCallbacks();
	}
	xSemaphoreGiveRecursive(_asyncsock_mutex);
}

void AsyncClient::_error(int8_t err)
{
    xSemaphoreTakeRecursive(_asyncsock_mutex, (TickType_t)portMAX_DELAY);
	if(_socket != -1)
	{
		_conn_state = 0;
		lwip_close_r(_socket);
		_socket = -1;

		_clearWriteQueue();
		if (_error_cb) _error_cb(_error_cb_arg, this, err);
		if (_discard_cb) _discard_cb(_discard_cb_arg, this);
		_removeAllCallbacks();
	}
	xSemaphoreGiveRecursive(_asyncsock_mutex);
}

void AsyncServer::end()
{
	xSemaphoreTakeRecursive(_asyncsock_mutex, (TickType_t)portMAX_DELAY);
    if (_socket != -1)
	{
		lwip_close_r(_socket);
		_socket = -1;
	}
    xSemaphoreGiveRecursive(_asyncsock_mutex);
}

Exception:

void AsyncClient::_removeAllCallbacks(void)
{
    _connect_cb = NULL;  \\  <-- line 717
    _connect_cb_arg = NULL;
    _discard_cb = NULL;
    _discard_cb_arg = NULL;
    _sent_cb = NULL;
    _sent_cb_arg = NULL;
    _error_cb = NULL;
    _error_cb_arg = NULL;
    _recv_cb = NULL;
    _recv_cb_arg = NULL;
    _timeout_cb = NULL;
    _timeout_cb_arg = NULL;
    _poll_cb = NULL;
    _poll_cb_arg = NULL;
}

@avillacis
Copy link
Member

I see you are using the ESPAsyncWebServer library. Which fork and branch are you using? I do all my tests with https://github.com/yubox-node-org/ESPAsyncWebServer/tree/yuboxfixes-0xFEEDC0DE64-cleanup because the base ESPAsyncWebServer had many locking issues that were analyzed and fixed by @0xFEEDC0DE64 , and which might be confused with issues in AsyncTCPSock.

@DOSpt
Copy link
Author

DOSpt commented Apr 11, 2021

I see you are using the ESPAsyncWebServer library. Which fork and branch are you using? I do all my tests with https://github.com/yubox-node-org/ESPAsyncWebServer/tree/yuboxfixes-0xFEEDC0DE64-cleanup because the base ESPAsyncWebServer had many locking issues that were analyzed and fixed by @0xFEEDC0DE64 , and which might be confused with issues in AsyncTCPSock.

I'm using that fork and for the filesystem I'm using LittleFS

@avillacis
Copy link
Member

Do you have a minimal reproducing case, and a clear set of instructions to reproduce this issue? As in, a sketch that sets up just enough to trigger the bug, and the instructions, such as "repeatedly fetch URL such and such until failure"?

@avillacis
Copy link
Member

BTW, as of commit fea600a installed callbacks are never removed under any condition other than straight-up destruction of the object. You should test whether this fixes your bug.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants