by deltaci 1552 days ago
This is a very naive reimplementation of the C# version. I managed to reduce the runtime on the same file from 5.7 seconds to just 800 ms.

    using var file = File.OpenRead("file.bin");
    var counter = 0;
    var sw = Stopwatch.StartNew();
    var buf = new byte[4096];
    while (file.Read(buf,0,buf.Length) > 0)
    {
        foreach (var t in buf)
        {
            if (t == '1')
            {
                counter++;
            }
        }
    }

    sw.Stop();
    Console.WriteLine($"Counted {counter:N0} 1s in {sw.Elapsed.TotalMilliseconds:N4} milliseconds");
2 comments

This code has a bug that can cause the count to be overreported. The last `file.Read` may only partially fill the buffer, but this code will look for 1s in the entire buffer.

(This bug won't affect the performance comparison, but I was just reminded of how error-prone these kinds of APIs can be compared with the PHP/Python route of having the library function allocate a new buffer on each call.)

Would something like this help? (No of course I haven't compiled it.)

    int len;
    while ((len=file.Read(buf,0,buf.Length)) > 0) {
        for (int i=0; i<len; i++) {
            if (buf[i] == '1') {
                counter++;
            }
        }
    }

If you know the file isn’t too large, there is also File.ReadAllBytes (https://docs.microsoft.com/en-us/dotnet/api/system.io.file.r...)

My C# is very rusty (no pun intended) but I would guess the core of the program could be something like

    File.ReadAllBytes("file.bin").Count(x => x == '1')

And a nitpick: the code you gave has a bug. file.Read can return fewer than 4096 bytes. If it does, you should only loop over the part of the buffer it filled.
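
For anyone who wants to see the overcount in action, here is a rough sketch that runs both loops over the same in-memory data. The helper names `CountOnes` and `CountOnesWholeBuffer`, the 4-byte buffer, and the `"111000"` input are all made up for illustration; a `MemoryStream` stands in for the file so the short final read is easy to reproduce.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper (not from the thread): counts '1' bytes, looking only
// at the bytes each Read call actually returned.
static long CountOnes(Stream stream, int bufferSize = 4096)
{
    var buf = new byte[bufferSize];
    long counter = 0;
    int len;
    while ((len = stream.Read(buf, 0, buf.Length)) > 0)
    {
        for (int i = 0; i < len; i++)
        {
            if (buf[i] == (byte)'1') counter++;
        }
    }
    return counter;
}

// Hypothetical helper that mirrors the original loop: it scans the whole
// buffer after every read, including stale bytes left by the previous read.
static long CountOnesWholeBuffer(Stream stream, int bufferSize = 4096)
{
    var buf = new byte[bufferSize];
    long counter = 0;
    while (stream.Read(buf, 0, buf.Length) > 0)
    {
        foreach (var t in buf)
        {
            if (t == '1') counter++;
        }
    }
    return counter;
}

// Made-up input: six bytes, three of them '1', read through a 4-byte buffer.
// The second read fills only two bytes, so "10" from the first read lingers.
var data = Encoding.ASCII.GetBytes("111000");
Console.WriteLine(CountOnes(new MemoryStream(data), 4));            // 3
Console.WriteLine(CountOnesWholeBuffer(new MemoryStream(data), 4)); // 4
```

The second number is one too high because the stale `1` left in the buffer gets counted again. With a real file the overcount depends on what the last full read left in the tail of the buffer, so it may well be zero for some inputs.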