Hacker News new | ask | show | jobs
by fwip 864 days ago
For what it's worth, I couldn't reproduce the benchmarks cited in the post, which claimed a 50% speedup over Rust on M1. The rust implementation was consistently about two to three times as fast as Mojo with the provided test scripts and datasets. It's possible I was compiling the Mojo program suboptimally, though.

  hyperfine -N --warmup 5 test/test_fastq_record 
  'needletail_test/target/release/rust_parser data/fastq_test.fastq'
  Benchmark 1: test/test_fastq_record
    Time (mean ± σ):      1.936 s ±  0.086 s    [User: 0.171 s, System: 1.386 s]
    Range (min … max):    1.836 s …  2.139 s    10 runs
  
  Benchmark 2: needletail_test/target/release/rust_parser data/fastq_test.fastq
    Time (mean ± σ):     838.8 ms ±   4.4 ms    [User: 578.2 ms, System: 254.3 ms]
    Range (min … max):   833.7 ms … 848.2 ms    10 runs
  
  Summary
    needletail_test/target/release/rust_parser data/fastq_test.fastq ran
      2.31 ± 0.10 times faster than test/test_fastq_record
(Edit: I built the Rust version with `cargo build --release` on Rust 1.74, and Mojo with `mojo build` on Mojo 0.7.0.)
2 comments

It was later noted on Twitter/X by someone that the rust version was not compiled with `--release`
That’s a fairly big omission for such an attention grabbing performance comparison
Hey, the Mojo parser author here. the test folder is just for the unit tests. All the benchmarking code is located in the /benchmark folder. It would be great if you can give it another go on your machine. https://github.com/MoSafi2/MojoFastTrim/tree/restructed/benc...
Thanks for the pointer, I had only checked the 'main' branch, which doesn't have the benchmarking code present.

When running on the commit & code you point to here, here are my new results:

  $ hyperfine -N --warmup 5 './benchmark/fast_parser data/fastq_test.fastq'  './benchmark/needletail_benchmark/target/release/rust_parser data/fastq_test.fastq '
  Benchmark 1: ./benchmark/fast_parser data/fastq_test.fastq
    Time (mean ± σ):     675.0 ms ±   2.4 ms    [User: 399.3 ms, System: 269.4 ms]
    Range (min … max):   670.5 ms … 677.5 ms    10 runs
  
  Benchmark 2: ./benchmark/needletail_benchmark/target/release/rust_parser data/fastq_test.fastq
    Time (mean ± σ):     840.8 ms ±   3.0 ms    [User: 578.0 ms, System: 257.0 ms]
    Range (min … max):   837.0 ms … 847.7 ms    10 runs
  
  Summary
    ./benchmark/fast_parser data/fastq_test.fastq ran
      1.25 ± 0.01 times faster than ./benchmark/needletail_benchmark/target/release/rust_parser data/fastq_test.fastq
Which indeed shows your parser running about 25% faster than the needletail version.
That's great to see, thanks a ton. care to share your system information? I am trying to understand where the difference is coming from and that would really helpful.
Sure, it's the 14-inch 2021 Macbook Pro (Apple M1 Pro chip), 16GB. Connected to power, other programs running, but nothing actively working. Timings were pretty stable when running a few times.
Thanks a lot!
Interesting.