| Thank you very much! Yes, on my 2nd generation low-power i3, running my normal desktop at the time, I get ~200 ms response time per request, and that is with 115 32 KB requests per second. I am rewriting the high-level protocol code, which is a bit rickety because it snowballed from the earliest code in the project (other than the asyncio SSH library I implemented from scratch). When that rewrite is done (a week or two), it will decrease latency greatly and improve efficiency greatly, and thus should even improve throughput (although that already maxes out your pipe with actual data, as it is low overhead). If I switch from 32k blocks to 512k blocks, which I did some testing on (Freenet uses 1 MB blocks), that gives me a 10x !! throughput improvement with the same CPU usage and no increase in per-request latency. The only reason I went with 32k blocks originally is that the SSH protocol has a 35k max packet size, and I don't want to break spec, so as to be able to hide as normal SSH traffic :) The 512k block test I did sent each block as multiple packets and was 10x faster, because that means 512k per FindNode operation instead of just 32k :) I was sitting on the fence about switching to it because I want to do the rewrite of the high-level code first, since multiple data packets per request complicate that snowballed code even more :) Also, I am using pycrypto, which initial tests show is actually much slower than the other library I will likely switch to (it is called simply cryptography; it wraps the platform OpenSSL instead of implementing the crypto itself as pycrypto does). I went with pycrypto to minimize dependencies. I will have it detect whether you have cryptography installed and use that optionally. I've already abstracted the pycrypto API so I can easily make it switchable at runtime. This should decrease latency a good amount as well. |
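For what it's worth, the "detect cryptography and fall back to pycrypto" idea from the quote could look roughly like the sketch below. This is only an illustration, not the project's code: `module_available` and `pick_crypto_backend` are hypothetical names, and the only facts assumed are that the `cryptography` package imports as `cryptography` and pycrypto imports as `Crypto`.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be found without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def pick_crypto_backend(preferred=("cryptography", "Crypto"),
                        available=module_available):
    """Return the first installed backend name, or None if neither is present.

    The names are the real import names of the two libraries; the function
    itself is a hypothetical helper for illustration.
    """
    for name in preferred:
        if available(name):
            return name
    return None


if __name__ == "__main__":
    print("selected backend:", pick_crypto_backend())
```

The `available` parameter is there only so the choice can be tested (or forced) without installing or uninstalling anything.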
A 'good' result for an i5-i7 core is to get at least 10k requests per second in that situation.
You are gonna live or die by this measure dude, so work on improving it. You are around 100x away from it, but if you're lucky you can get there with 'just' two 10x improvements. I suggest you look into Flame Graphs [1], they are awesome. I have used them to pinpoint exactly where the 10-100x bottlenecks in my code were and unclog them.
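If you want a quick first pass before setting up flame graphs, Python's built-in `cProfile` can already show where the time goes. To be clear, this is not a flame graph, just a sorted table from the standard library, and `handle_request` and `serve` below are made-up stand-ins for your real request path:

```python
import cProfile
import io
import pstats


def handle_request(block):
    # Hypothetical stand-in for the real per-request work.
    return sum(block) % 256


def serve(n_requests, block_size=32 * 1024):
    # Hypothetical driver: process n_requests blocks of block_size bytes.
    block = bytes(block_size)
    return [handle_request(block) for _ in range(n_requests)]


def profile_top(func, *args, limit=10):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(limit)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_top(serve, 200))
```

Whatever sits at the top of the cumulative column is the first candidate for one of those 10x fixes.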
Also, about your website, just make it sound less like an infomercial and you'll be fine.
And last but not least, best of luck!
[1] http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html