by user2342 769 days ago
I'm not fluent in Swift and async, but the line:

   for try await byte in bytes { ... }

for me reads like the time/delta is determined for every single byte received over the network. I.e. millions of times for megabytes sent. Isn't that a point for optimization or do I misunderstand the semantics of the code?

The code, as the author makes clear, is an MWE. It provides a brief framework for benchmarking the behavior of the clocks. It's not intended to illustrate how to efficiently perform the task it's meant to resemble.
But it seems consequential. If the time were sampled every kilobyte, the timing overhead would drop roughly 1,000-fold, which is a bigger win than the proposed switch to other time functions.

At that point, even these slow methods are using about 0.5ms per million bytes, so it should be good up to gigabit speeds.

If that’s not fast enough, then sample every million bytes. Or, if the complexity is worth it, sample in an adaptive fashion.
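To make the "sample every N bytes" idea concrete, here is a minimal sketch. `ThrottledTimer`, `sampleEvery`, and the injectable `now` closure are hypothetical names for illustration, not anything from the article, and the default clock (`ProcessInfo.processInfo.systemUptime`) is just one convenient monotonic source.

```swift
import Foundation

/// Hypothetical helper (not from the article): reads the clock once every
/// `sampleEvery` bytes instead of once per byte.
struct ThrottledTimer {
    let sampleEvery: Int
    let now: () -> TimeInterval
    private(set) var bytesSinceSample = 0
    private(set) var clockReads = 0   // samples taken after init
    private(set) var lastSample: TimeInterval

    init(sampleEvery: Int,
         now: @escaping () -> TimeInterval = { ProcessInfo.processInfo.systemUptime }) {
        precondition(sampleEvery > 0, "sampleEvery must be positive")
        self.sampleEvery = sampleEvery
        self.now = now
        self.lastSample = now()
    }

    /// Call once per received byte. Returns the time elapsed since the
    /// previous sample when a sample is taken, otherwise nil.
    mutating func byteReceived() -> TimeInterval? {
        bytesSinceSample += 1
        guard bytesSinceSample >= sampleEvery else { return nil }
        bytesSinceSample = 0
        clockReads += 1
        let t = now()
        let delta = t - lastSample
        lastSample = t
        return delta
    }
}

// One million bytes, sampled every 1024 bytes.
var timer = ThrottledTimer(sampleEvery: 1024)
for _ in 0..<1_000_000 { _ = timer.byteReceived() }
print(timer.clockReads)  // 976 clock reads instead of 1,000,000
```

Taking the 0.5 ms per million bytes figure at face value, a saturated gigabit link (about 125 million bytes per second) would spend roughly 62.5 ms of every second reading the clock, or about 6%. Raising `sampleEvery` shrinks that proportionally.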

I’m not sure about Swift, but in C# an async method doesn’t have to complete asynchronously. For example, when reading from files, a buffer is first read asynchronously, then subsequent calls complete synchronously until the buffer needs to be “filled” again. So it feels like most languages can do these optimizations.

This is what Swift does.
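A rough sketch of that buffered pattern, for anyone who wants to see the shape of it. `BufferedBytes` and `demo` are made-up names, the in-memory `chunks` array stands in for real chunked I/O, and `Task.yield()` stands in for the actual wait. This is an illustration of the idea, not how Foundation's byte sequences are implemented.

```swift
/// Hypothetical byte source: `chunks` stands in for data arriving in blocks.
struct BufferedBytes: AsyncSequence {
    typealias Element = UInt8
    let chunks: [[UInt8]]

    struct AsyncIterator: AsyncIteratorProtocol {
        var remaining: ArraySlice<[UInt8]>
        var buffer: ArraySlice<UInt8> = []
        var refills = 0

        mutating func next() async -> UInt8? {
            // Fast path: serve from the buffer without reaching a suspension.
            if let byte = buffer.popFirst() { return byte }
            // Slow path: only here do we actually wait for more data.
            while let chunk = remaining.popFirst() {
                await Task.yield()  // placeholder for awaiting real I/O
                refills += 1
                buffer = chunk[...]
                if let byte = buffer.popFirst() { return byte }
            }
            return nil
        }
    }

    func makeAsyncIterator() -> AsyncIterator {
        AsyncIterator(remaining: chunks[...])
    }
}

/// Drains the sequence by hand so the refill count can be inspected.
func demo() async -> (sum: Int, refills: Int) {
    var iterator = BufferedBytes(chunks: [[1, 2, 3], [4, 5]]).makeAsyncIterator()
    var sum = 0
    while let byte = await iterator.next() { sum += Int(byte) }
    return (sum, iterator.refills)
}
```

Called from an async context, `demo()` walks five bytes but only hits the slow path twice, once per chunk. The `await` on every `next()` marks a possible suspension point, not a guaranteed one.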
Yeah, this is horrifying from a performance design perspective. But in this case you'd still expect the "current time" retrieval[1] to be small relative to all the other async overhead (context switching for every byte!), and apparently it isn't?

[1] On x86 Linux, it's just a quick call into the vDSO that reads the TSC and some calibration data, a dozen cycles or so.
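If you want to check the per-read cost on your own machine, something like the following gives a ballpark number. It is a crude micro-benchmark sketch, not the article's harness. `averageClockReadNanoseconds` is a hypothetical name, and `DispatchTime.now()` is just one of several clock APIs, with whatever it calls underneath being platform-dependent.

```swift
import Foundation

/// Rough micro-benchmark: average cost of one monotonic clock read,
/// in nanoseconds. Loop overhead is included, so treat it as an upper bound.
func averageClockReadNanoseconds(iterations: Int) -> Double {
    precondition(iterations > 0, "iterations must be positive")
    var sink: UInt64 = 0  // accumulate results so the reads are not optimized away
    let start = DispatchTime.now().uptimeNanoseconds
    for _ in 0..<iterations {
        sink &+= DispatchTime.now().uptimeNanoseconds
    }
    let end = DispatchTime.now().uptimeNanoseconds
    if sink == 0 { print("unexpected zero sink") }
    return Double(end - start) / Double(iterations)
}

print(averageClockReadNanoseconds(iterations: 1_000_000), "ns per read")
```

Build with optimizations on, and compare the result against the per-byte cost of the loop it would sit in.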

Note the end of the article acknowledges this, so this is clearly a deliberate part of the constructed example to make a particular point and not an oversight by the author. But it is helpful to highlight this point, since it is certainly a live mistake I've seen in real code before. It's an interesting test of how rich one's cost model for running code is.
The stream reader userspace libraries are very well optimized for handling that kind of "dumb" usage that should obviously create problems. (That's one of the reasons Linux expects you to use glibc instead of making a syscall directly.)

But I imagine the time-reading ones aren't as heavily optimized. People don't normally call them all the time.

They look very similar on macOS.