"Slow" reaction on TCP telegrams #142
All networking is done by the SocketDispatcher, which runs on its own thread and transfers data to and from other tasks via queues. It polls the sockets at a 1 ms interval when there was nothing to do on the last call; see the note here: https://github.com/PerMalmberg/Smooth/blob/master/lib/smooth/core/network/SocketDispatcher.cpp#L112 There is also a 10 ms timeout on the select() call here: https://github.com/PerMalmberg/Smooth/blob/master/lib/smooth/core/network/SocketDispatcher.cpp#L78 which might contribute to the response times, but I honestly don't remember what the criterion is for reaching the timeout, as the sockets operate in blocking mode. The transfer between tasks may also delay things, especially if the tasks are written to use delays instead of waiting for events to arrive. Does it make a difference in response time if you have a constant stream of data compared to single messages? |
I experimented a bit with the two timing parameters you mentioned (sleep_for and the timeout for select()). The sleep_for value seems to have little influence: if I increase it to 2 or 3, the response time hardly changes. The timeout value for select() has more influence. Reducing it to zero gives somewhat faster responses. The Linux man page for select() says:
I think setting this to zero makes sense, since we are actually polling here. Shall I also change this in the next PR? Nevertheless, the response times are still not what I expect. Currently I am trying to implement CPU load measurement for each core in order to balance the load better. Unfortunately, I am getting weird results right now: under higher network load, the CPU load of both cores drops significantly. Strange... |
Well, we are polling non-blocking sockets, so I expect select() to return immediately anyway, but I might be wrong. If it does indeed make a difference, then we should perhaps make it configurable so that it's easy to adjust without code changes. I'm sure I set it to the current value for a reason. |
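For reference, a minimal sketch of what a configurable select() timeout could look like. The names `to_timeval` and `wait_readable` are hypothetical and not part of Smooth; the sketch assumes a POSIX-style select() as provided by Linux and lwIP. With a zero timeout, select() only polls and returns at once. With a non-zero timeout, it returns as soon as a descriptor is ready, so the timeout should only be an upper bound on the wait when nothing is ready.

```cpp
#include <sys/select.h>
#include <unistd.h>
#include <chrono>

// Hypothetical helper (not Smooth API): convert a configurable timeout to the
// timeval structure that select() expects.
inline timeval to_timeval(std::chrono::microseconds timeout)
{
    timeval tv{};
    tv.tv_sec = static_cast<decltype(tv.tv_sec)>(timeout.count() / 1000000);
    tv.tv_usec = static_cast<decltype(tv.tv_usec)>(timeout.count() % 1000000);
    return tv;
}

// Hypothetical helper (not Smooth API): returns true if fd becomes readable
// within the given timeout. A timeout of zero makes select() a pure poll.
inline bool wait_readable(int fd, std::chrono::microseconds timeout)
{
    fd_set read_set;
    FD_ZERO(&read_set);
    FD_SET(fd, &read_set);

    timeval tv = to_timeval(timeout);

    // select() returns the number of ready descriptors, 0 on timeout, -1 on error.
    int ready = select(fd + 1, &read_set, nullptr, nullptr, &tv);
    return ready > 0 && FD_ISSET(fd, &read_set);
}
```

Passing the timeout as a `std::chrono` duration would let it come from a configuration value instead of a hard-coded constant, e.g. `wait_readable(fd, std::chrono::milliseconds{1})`.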
I used
It seems as if SocketDispatcher consumes a lot of resources. Do you have any idea whether I could optimize its runtime behaviour somehow? I already tried setting, for some functions, the attribute |
I did similar tests way back, also got weird results. It gets better if you run on a single core iirc. I have no ideas as to what to optimize, but I'd start with ensuring that SocketDispatcher doesn't perform work when it doesn't need to. Perhaps I have made an error in how it waits for work? |
In order to check for possible optimizations, I am currently trying to understand how Smooth works internally. As far as I understand, SocketDispatcher is calling select() in order to check for work. If there is work, it calls |
You've probably figured it out already, but each Task gets notified via a condition_variable when there is data available for that task. So the data is handed over through queues between the SocketDispatcher and the tasks. Or, in other words: no, readable() and writable() are only run in the SocketDispatcher. |
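The handover described here follows a common pattern, sketched below with standard C++ only. This is an illustration under assumptions, not Smooth's actual queue implementation; the class name `HandoverQueue` is hypothetical. The producer (the role SocketDispatcher plays) pushes an item and signals a condition_variable; the consumer (a task) blocks until data arrives or a timeout expires, rather than sleeping for a fixed delay.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <optional>
#include <queue>
#include <utility>

// Hypothetical illustration of a queue handover between two threads.
// Not Smooth's implementation.
template <typename T>
class HandoverQueue
{
public:
    // Called by the producer thread: store the item and wake one waiting consumer.
    void push(T item)
    {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(item));
        }
        has_data_.notify_one();
    }

    // Called by the consumer thread: wait until an item is available or the
    // timeout expires. Returns std::nullopt on timeout.
    std::optional<T> pop(std::chrono::milliseconds timeout)
    {
        std::unique_lock<std::mutex> lock(mutex_);
        if (!has_data_.wait_for(lock, timeout, [this] { return !items_.empty(); }))
        {
            return std::nullopt;
        }
        T item = std::move(items_.front());
        items_.pop();
        return item;
    }

private:
    std::mutex mutex_;
    std::condition_variable has_data_;
    std::queue<T> items_;
};
```

With this pattern the consumer wakes as soon as `push()` signals, so the handover itself should add only scheduling latency; a fixed delay in the consumer loop would add up to one full delay period instead.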
I still don't understand fully the internals of Smooth, especially the passing of messages through queues. And so I don't understand if somewhere there might be some delay. |
Oh, I'm certain that there are things in Smooth that need optimization. I really wish I had the time to be of more active assistance; I know this would be a fun thing to dig into. Perhaps I can make some time and do some testing on Linux, where at least I wouldn't have to set up the hardware. Smooth doesn't see a difference between the two platforms. It might be tricky to see the problem on Linux, though, as the speed of the computer isn't really comparable with the ESP32.
Looking forward to your results.
…On Wed, 17 Mar 2021, 18:31 Lothar, ***@***.***> wrote:
I still don't understand fully the internals of Smooth, especially the
passing of messages through queues. And so I don't understand if somewhere
there might be some delay.
Nevertheless, I still see a response time of approx. 15 ms (on average) in Wireshark. Approx. 1.5 ms of this is due to the UART communication. In our company we are doing a similar thing with a 300 MHz µC with an integrated Ethernet MAC (no UART in between). There we have a response time of approx. 1 ms. So the communication with the ESP32 is more than a factor of 10 slower. I would have expected the 240 MHz ESP32 (with UART in between) to have a response time of at most 5 ms, or even less. But maybe this is also related to the fact that the ESP32 executes a lot of code from serial flash, which is tremendously slow. Maybe the new version of the ESP32 (the ESP32-S3) will be faster, since it has an octal serial flash (and not 4 bit like today).
In order to figure out whether lwIP or Smooth is the bottleneck, I will set up a new project doing this communication with bare esp-idf and lwIP functionality only. I will let you know the result, but it will take some time.
|
Now I experimented a little bit with the TCP socket server example. After applying some "optimizations" I achieved TCP echo response time of approx. 2-3ms. The bare example code had a response time of approx. 6-8ms. These optimizations are:
After all this I come to the conclusion that the ESP32 is quite slow - probably due to the fact that a lot of code is executed from slow serial flash memory. The "overhead" of the socket management in Smooth makes it even slower. Next I will check if I could optimize Smooth a little somehow. But I have some doubt that I will be able to achieve my target of 5ms response time using Smooth. |
That's a good reduction in time just from fairly small changes. And yes, it should be possible to operate sockets in parallel with Smooth. SocketDispatcher only works with the sockets that are registered with it. |
In the meantime I tried to speed up Smooth a little. I made the following optimizations:
After all this I was able to bring the response time down to 7-8 ms (from initially 10-15 ms). So these changes seem worthwhile, but I currently don't know how much each of them contributes; some of the modifications may not be necessary. |
|
Edit: I just checked the influence of the select timeout: reducing it from 10ms to 1ms gives on average a 4ms faster response. So this is one of the more interesting improvements for my scenario. |
|
Now I also checked the performance difference of making several functions inline. With compiler options set to "optimize for performance" it seems to have only a small advantage; probably less than 1ms. So we can neglect this. |
I see the problem that TCP communication seems to be relatively "slow". I put "slow" in quotes because it is fast, but not fast enough for what I intend to do. In my application I send TCP telegrams from a PC to the ESP32. The telegram is more or less tunneled to the asynchronous interface without major modifications. The reply coming from the asynchronous interface is then packed into a TCP telegram and sent back as the response. The response time of this system is approx. 10-15 ms (measured with Wireshark on the PC). I was hoping for a response time of maybe 5 ms. The response time of the serial communication is only approx. 1 ms, so this cannot be the problem. The delay evidently occurs somewhere in Wi-Fi, lwIP, or Smooth.
I have already checked all menuconfig options several times for possible optimizations, and I also distributed all tasks across the two cores of the ESP32 in order to get a reasonably balanced load.
So, my question is: do you have an idea whether there is a delay somewhere in Smooth, coming e.g. from a task cycle time? Do you have any other hint on how I could speed up the communication?