Offload HTTP work into dedicated process #1223
zmq vs multiprocessing.queue benchmarks

1M tiny binary messages:

1M big JSON messages:

The biggest discovery here is that the cost of IPC is almost negligible: we can pass roughly 1 million messages in ten seconds, which is pretty amazing. Pyzmq may be slightly slower for larger messages, but it has two very important features that multiprocessing.queue doesn't: router/dealer topology and async support.

1M big JSON messages (async):
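The multiprocessing.queue side of a benchmark like the one above can be sketched as follows. This is a minimal stand-in, not the actual benchmark script from the issue: `run_benchmark` and `_consumer` are hypothetical names, and the message count is reduced so it runs quickly (the issue benchmarked 1M messages).

```python
import multiprocessing
import time

PAYLOAD = b"x" * 16  # stand-in for the "tiny binary messages"

def _consumer(queue, n_messages, done):
    # Drain the queue in a separate process, then signal completion.
    for _ in range(n_messages):
        queue.get()
    done.set()

def run_benchmark(n_messages):
    """Return throughput (messages/sec) for n_messages through a Queue."""
    queue = multiprocessing.Queue()
    done = multiprocessing.Event()
    proc = multiprocessing.Process(
        target=_consumer, args=(queue, n_messages, done)
    )
    proc.start()
    start = time.perf_counter()
    for _ in range(n_messages):
        queue.put(PAYLOAD)
    done.wait()
    elapsed = time.perf_counter() - start
    proc.join()
    return n_messages / elapsed

if __name__ == "__main__":
    # The issue benchmarked 1,000,000 messages; use fewer for a quick run.
    print(f"{run_benchmark(100_000):,.0f} msg/s")
```

The pyzmq equivalent would swap the `Queue` for a PUSH/PULL or router/dealer socket pair over an `ipc://` endpoint.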
Sorry to barge in, but I would like to recommend Niquests as a solution embedding advanced multiplexing, DNS-over-QUIC/HTTPS/TLS, async, happy eyeballs, etc. Given the project goals, I think it's a good match. If interested, I can assist. Regards,
@Ousret I hadn't heard of niquests, thanks for the recommendation! I am finished overhauling DNS and almost ready to start on HTTP. Right now we use httpx for our web library, but my plan was to benchmark its speed vs aiohttp before the overhaul. Having seen niquests, I think we should try it too; from its readme it seems like it wins on features. Do you want to write a benchmark comparing the speed of all three -- httpx, aiohttp, and niquests?

EDIT: I see you've already done some benchmarks. I didn't realize you're the author of the tool. Congrats and nice work on those features! I'll handle writing the benchmark. For BBOT, speed and stability matter a lot, since we can easily issue tens of thousands of requests during a single scan. I'm especially interested in async performance with a big pool size (e.g. 50 concurrent requests).
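The concurrency pattern such a benchmark would exercise can be sketched with the stdlib alone: a semaphore caps the pool at 50 in-flight requests. Here `fetch` is a hypothetical stand-in; the real benchmark would call httpx, aiohttp, or niquests instead of sleeping.

```python
import asyncio
import time

CONCURRENCY = 50  # the pool size discussed above

async def fetch(url, semaphore):
    # Placeholder for an actual HTTP request with one of the three clients.
    async with semaphore:
        await asyncio.sleep(0.01)
        return url

async def run_pool(urls, concurrency=CONCURRENCY):
    """Run all requests with at most `concurrency` in flight at once."""
    semaphore = asyncio.Semaphore(concurrency)
    start = time.perf_counter()
    results = await asyncio.gather(*(fetch(u, semaphore) for u in urls))
    elapsed = time.perf_counter() - start
    return results, elapsed

if __name__ == "__main__":
    urls = [f"https://example.com/{i}" for i in range(500)]
    results, elapsed = asyncio.run(run_pool(urls))
    print(f"{len(results)} requests in {elapsed:.2f}s")
```

Timing each client library inside the same harness keeps the comparison apples-to-apples.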
Here's what my experience tells me about those benchmarks. aiohttp is a fairly low-to-mid-level HTTP client with a C extension; it's nearly unbeatable as of right now in terms of raw performance. But one cannot compare httpx or niquests against it: the features served are on another level. If you want to beat aiohttp, you'll have no choice but to implement urllib3-future itself, but usually such speed won't be productive, as I have seen many remote peers (WAFs) simply block you due to the excessive throughput.

My advice is to leverage Niquests (asyncio + multiplexing + happy eyeballs + multiple DNS-over-HTTPS providers) with a pool of 50 to 100 connections. You'll keep nice and clean code with a lot of flexibility. This shall be interesting to witness working alongside your software.

Lastly, all the others are currently blocked by advanced WAFs (TLS fingerprinting), and ours is really closer to a real browser, especially with HTTP/3. Let me know if you need anything.

For the WAF proof:

```python
import niquests
import requests
import httpx

URL = "https://kick.com/video/f60e4115-7b7b-4680-a4b9-a48e7b74d45c"

if __name__ == "__main__":
    r = requests.get(URL)
    print("Requests", r)

    r = httpx.get(URL)
    print("HTTPX", r)

    r = niquests.get(URL)
    print("Niquests", r)
```

gives you
Thanks, that's really insightful, especially about the WAF. Excited to try it out.
HTTP engine added in #1340.
Eventually I would like to give HTTP and DNS each their own process with their own event loop, etc. This would enable scans to go much faster, both by decreasing the CPU usage in the main process and by freeing up the async event loop to do other things. It would also make managing DNS/HTTP rate limits easier, and allow us to finally replace projectdiscovery's httpx.

Check out this example that uses ZeroMQ and unix sockets to farm out concurrent web requests to another process:
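The referenced example itself isn't reproduced here, but a minimal stand-alone sketch of the same idea might look like the following. It assumes pyzmq is installed, uses a simple REQ/REP pair over an `ipc://` (unix socket) endpoint rather than the router/dealer topology mentioned earlier, and replaces the actual HTTP call with a dummy transform; `worker` and `farm_out` are hypothetical names.

```python
import multiprocessing
import zmq

# A unix-socket ZeroMQ endpoint (path is arbitrary for this demo).
ENDPOINT = "ipc:///tmp/bbot-http-demo.sock"

def worker(n_jobs):
    # Dedicated process: receives URLs and replies with results.
    ctx = zmq.Context()
    sock = ctx.socket(zmq.REP)
    sock.bind(ENDPOINT)
    for _ in range(n_jobs):
        url = sock.recv_string()
        # In the real design, this process would run its own event loop
        # and perform the HTTP request with an async client here.
        sock.send_string(f"fetched:{url}")
    sock.close()
    ctx.term()

def farm_out(urls):
    """Send each URL to the worker process and collect the replies."""
    proc = multiprocessing.Process(target=worker, args=(len(urls),))
    proc.start()
    ctx = zmq.Context()
    sock = ctx.socket(zmq.REQ)
    sock.connect(ENDPOINT)  # ZeroMQ allows connect before the bind lands
    results = []
    for url in urls:
        sock.send_string(url)
        results.append(sock.recv_string())
    sock.close()
    ctx.term()
    proc.join()
    return results

if __name__ == "__main__":
    print(farm_out(["https://example.com", "https://example.org"]))
```

A router/dealer topology would let many requests be in flight at once instead of the strict send/reply lockstep REQ/REP imposes, which is why the benchmarks above call it out as a key pyzmq advantage.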