I couldn't get your 150 MB file, so I took one of the smaller files from the first set shown in the table (the FASTA file was only 30 KB) and duplicated it until it was around 150 MB. Here's a comparison with Common Lisp:

    ~/fasta-dna $ time python3 run.py
    0.3797277865097147
    21.828 secs
    ~/fasta-dna $ time sbcl --script run.lisp
    0.37972778
    2.415 secs
    ~/fasta-dna $ ls -al nc_045512.2.fasta
    -rw-r--r-- 1 156095639 2021-09-25 11:15 nc_045512.2.fasta

So, almost as fast as Nim (the time includes compilation time)? Here's the Common Lisp code:

    (with-open-file (in "nc_045512.2.fasta")
      (loop for line = (read-line in nil)
            while line
            with gc = 0 with total = 0 do
              (unless (eql (aref line 0) #\>)
                (loop for i from 0 below (length line)
                      for ch = (char line i) do
                        (setf total (1+ total))
                        (when (or (eql ch #\C) (eql ch #\G))
                          (setf gc (1+ gc)))))
            finally (format t "~f~%" (/ gc total))))
With a top-level function and some type declarations it could run even faster, I think.

EDIT: compiling the Lisp code to FASL and annotating the types brings the total runtime to 2.0 seconds. Running it from source increases the time only very slightly, to 2.08 seconds, which shows how incredibly fast the SBCL compiler is. Taking 0.7 seconds to compile a few lines of code is crazy; imagine when your project grows to many thousands of lines. The Lisp code still can't really match Nim, which is really C at runtime, in speed when excluding compile time, but if you need a scripting language, CL is great (especially when used with the REPL and SLIME).
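For anyone comparing numbers, here is a minimal Python sketch of the same computation the Lisp loop performs. It is not the `run.py` timed above (that script isn't shown in this thread), and `gc_fraction` is just a name I picked for illustration. Unlike the Lisp version, it also skips empty lines instead of indexing into them.

```python
def gc_fraction(lines):
    """Return the fraction of C/G characters among all sequence characters.

    `lines` is any iterable of text lines, e.g. an open FASTA file.
    Header lines (starting with '>') and empty lines are ignored.
    """
    gc = 0
    total = 0
    for line in lines:
        line = line.rstrip("\n")
        # Skip FASTA headers and blank lines (assumption: blank lines carry no data).
        if not line or line.startswith(">"):
            continue
        total += len(line)
        gc += line.count("C") + line.count("G")
    # Avoid dividing by zero when no sequence data was seen.
    return gc / total if total else 0.0


if __name__ == "__main__":
    import sys

    # The path comes from the command line; no particular file is assumed.
    with open(sys.argv[1]) as f:
        print(gc_fraction(f))
```

Counting with `str.count` per line rather than per character is a design choice of this sketch, so its timing should not be read as representative of the 21.8 seconds measured above.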
Also, @benjamin-lee this version of the Nim program is a bit lower level, but probably much faster:
Compile with -d:danger and so on, of course. { On a small 30 KB test file I got about a 1.7x speed-up over that of the blog post. I also could not find the 150 MB file. Multiplying up the tiny 30 KB file like @brabel did, I got only a 1.25x speed-up, down to 0.5 seconds. So, it might not be worth the low-levelness, but a real file might tilt more towards the 1.7x end. }
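In case someone wants to reproduce the roughly 150 MB input, the "multiply up the small file" step could be sketched like this. The function name `inflate` and its parameters are hypothetical, and the sketch assumes the small file ends with a newline so that repeated copies don't fuse the last sequence line with the next header.

```python
def inflate(src_path, dst_path, target_bytes):
    """Write whole copies of src_path into dst_path until it holds at
    least target_bytes. Returns the number of bytes written."""
    with open(src_path, "rb") as src:
        chunk = src.read()
    if not chunk:
        raise ValueError("source file is empty")
    written = 0
    with open(dst_path, "wb") as dst:
        # Only whole copies are written, so every record stays intact.
        while written < target_bytes:
            dst.write(chunk)
            written += len(chunk)
    return written
```

Each copy repeats the header line too, which is harmless for this benchmark because header lines are skipped when counting. A file built this way is far more repetitive than a real 150 MB genome, which may be one reason the speed-ups differ between the small and the multiplied-up file.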