by A04eArchitect 96 days ago
The real pitfall is the overhead of the standard memory allocator. On ARM v8-A, I bypassed it entirely for my audit engine. Result: 85 ns latency for 10.8T data points on a $100 board. I recorded the memory profiler and benchmarks as proof, since the numbers look 'impossible'. See the video here:

https://x.com/NayakaPambudi
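
For readers wondering what "bypassing the standard allocator" can look like in practice: the post does not give implementation details, so the following is only a minimal sketch of one common approach, a fixed-size bump (arena) allocator in C. The names `Arena`, `arena_alloc`, `arena_reset`, and the 64 KiB `ARENA_SIZE` are hypothetical and chosen for illustration; this is not the poster's code and says nothing about the numbers quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical arena size, chosen only for illustration. */
#define ARENA_SIZE (64u * 1024u)

typedef struct {
    unsigned char buf[ARENA_SIZE]; /* backing storage, reserved up front */
    size_t offset;                 /* index of the next free byte in buf */
} Arena;

/* Bump-allocate `size` bytes aligned to `align` (must be a power of two).
   Returns NULL when the arena cannot satisfy the request. */
static void *arena_alloc(Arena *a, size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    uintptr_t base = (uintptr_t)a->buf;
    uintptr_t cur = base + a->offset;
    /* Round the current address up to the requested alignment. */
    uintptr_t aligned = (cur + (align - 1)) & ~(uintptr_t)(align - 1);
    size_t start = (size_t)(aligned - base);

    if (start > ARENA_SIZE || size > ARENA_SIZE - start)
        return NULL; /* out of space */

    a->offset = start + size;
    return (void *)aligned;
}

/* Free everything at once by rewinding the offset; there is no per-object free. */
static void arena_reset(Arena *a)
{
    a->offset = 0;
}
```

Each allocation here is an address round-up plus an offset update, with no locks, free lists, or system calls on the hot path. The trade-off is that objects cannot be freed individually, so this only fits workloads whose allocations share a lifetime.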

1 comment

Actually, the bottleneck wasn't the I/O; it was the context switching. If anyone wants the specific memory map addresses I used for the ARM v8-A bypass, let me know.