by johnklos 643 days ago
I've always interpreted the definition of storage as arbitrarily large, not specifically infinite. The universe, after all, is finite. The "well, acshually" arguments aren't interesting, because they're 100% abstract.
4 comments

It is defined as arbitrarily large but not infinite. That's not because of physical concerns, but because some of the theorems don't work if the memory is actually infinite.
You're comparing an a priori concept with an a posteriori one. It's like claiming the number five doesn't "acshually" exist. Like yeah, it's a concept, and concepts don't physically exist.

A universe isn't a Turing machine because it can't run all the programs that can run on a Turing machine. This isn't exactly controversial.

What's the difference between arbitrarily large and infinite? Would you say the number of possible Turing-computable functions is merely arbitrarily large and not actually infinite?
There is a very clear distinction: one is finite, the other is infinite.

If you only allow arbitrarily large Turing machines, there is a fixed number of programs which can run.

When you're talking about something like neural networks on a 4004, the "well ackshually" argument does become very much relevant. The limitations of that kind of platform are hard enough that they do not approximate a Turing machine with respect to modern software.
Running Linux on a 4004 is possible, as we've seen, but running llama is just way too far? Interesting take.
Llama takes a lot more MIPS and a lot more RAM than Linux. Linux is more complicated, but computers were running Linux 30 years ago. In this case, quantity has a quality all of its own.
It takes 14,493,515,821 cycles to boot Alpine Linux in QEMU.

    perf stat -Bddd qemu-system-x86_64   -m 2048   -cdrom alpine.iso   -boot d   -enable-kvm   -cpu host   -smp 2   -net nic -net user,hostfwd=tcp::2222-:22   -nographic   -serial mon:stdio   -monitor telnet:127.0.0.1:1234,server,nowait   -d in_asm,cpu   -D qemu.log
It takes 1,927,757,029,221 cycles to summarize a 1625-token Dijkstra essay with LLaMA 8B.

    perf stat -Bddd llamafile -m Meta-Llama-3.1-8B-Instruct.BF16.gguf -f ~/prompt1625.txt -c 4096 -n 40
Ignoring things like AVX512, you're looking at roughly 130x more compute than booting Linux to do something serious with LLaMA.

However! If you just want to demo it working, then you could generate 4 tokens using TinyLLaMA 1.1B, which takes 25,164,386,466 cycles. That's under 2x the cost of booting Linux. So you could do TinyLLaMA if you can do Linux.
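
For anyone who wants to check the arithmetic, here is a minimal sketch that divides the cycle counts quoted above by the Linux boot figure. The constant and function names are made up for illustration; only the three numbers come from the perf runs.

```python
# Cycle counts quoted in the comment above (from the perf stat runs).
LINUX_BOOT_CYCLES = 14_493_515_821      # booting Alpine Linux in QEMU
LLAMA_8B_CYCLES = 1_927_757_029_221     # LLaMA 8B summarizing the 1625-token essay
TINYLLAMA_CYCLES = 25_164_386_466       # TinyLLaMA 1.1B generating 4 tokens


def ratio_to_linux_boot(cycles: int) -> float:
    """Return how many Linux boots' worth of cycles a workload costs."""
    return cycles / LINUX_BOOT_CYCLES


if __name__ == "__main__":
    print(f"LLaMA 8B:  {ratio_to_linux_boot(LLAMA_8B_CYCLES):.1f}x a Linux boot")
    print(f"TinyLLaMA: {ratio_to_linux_boot(TINYLLAMA_CYCLES):.2f}x a Linux boot")
```

These are cycles on a modern x86 host, so the ratios only say how the workloads compare with each other, not how long either would take on a 4004.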

That's closer than I thought, to be honest.

Note also that the 4004 lacks a floating-point unit of any kind - not just a vector unit. I think people make 8-bit integer quantizations of LLMs, though, which would be the fastest versions to run on a 4004.
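
To make the 8-bit idea concrete, here is a rough sketch of symmetric int8 quantization applied to a dot product. It is not how any particular LLM runtime does it; the function names and the one-scale-per-vector scheme are assumptions chosen to keep the example short.

```python
def quantize_int8(values):
    """Symmetric quantization: map floats to integers in [-127, 127].

    Returns the integer list and the single float scale needed to map back.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(v / scale) for v in values], scale


def int_dot(q_a, q_b):
    """Dot product using only integer multiplies and adds."""
    return sum(a * b for a, b in zip(q_a, q_b))


def quantized_dot(a, b):
    """Approximate the float dot product of a and b via int8 values."""
    q_a, scale_a = quantize_int8(a)
    q_b, scale_b = quantize_int8(b)
    # The inner loop is integer-only; the scales are applied once at the end.
    return int_dot(q_a, q_b) * scale_a * scale_b


if __name__ == "__main__":
    a = [0.5, -1.0, 0.25]
    b = [1.0, 2.0, -4.0]
    print("exact:    ", sum(x * y for x, y in zip(a, b)))
    print("quantized:", quantized_dot(a, b))
```

The final rescale is still a float multiply here. On a chip with no FPU that step would have to become fixed-point as well, and the 4004 being a 4-bit part means even the 8-bit multiplies would be done in software.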

A lot of quants just upcast to floats. Some of them work on integer multiplication using pmaddubsw. But oof, it looks like the i4004 doesn't even have that.
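
For reference, this is roughly what pmaddubsw does, emulated in plain Python: multiply unsigned bytes by signed bytes, then add adjacent pairs with signed 16-bit saturation. It is a sketch of the instruction's arithmetic on Python lists, not of any real quantized kernel, and the helper names are made up.

```python
def saturate_i16(x):
    """Clamp an integer to the signed 16-bit range."""
    return max(-32768, min(32767, x))


def pmaddubsw(unsigned_bytes, signed_bytes):
    """Emulate the arithmetic of x86 pmaddubsw on two equal-length lists.

    unsigned_bytes holds values in 0..255, signed_bytes holds -128..127.
    Each adjacent pair of products is summed and saturated to int16, so the
    result has half as many elements as the inputs.
    """
    if len(unsigned_bytes) != len(signed_bytes) or len(unsigned_bytes) % 2:
        raise ValueError("inputs must have the same even length")
    if any(not 0 <= u <= 255 for u in unsigned_bytes):
        raise ValueError("first operand must be unsigned bytes")
    if any(not -128 <= s <= 127 for s in signed_bytes):
        raise ValueError("second operand must be signed bytes")
    out = []
    for i in range(0, len(unsigned_bytes), 2):
        pair_sum = (unsigned_bytes[i] * signed_bytes[i]
                    + unsigned_bytes[i + 1] * signed_bytes[i + 1])
        out.append(saturate_i16(pair_sum))
    return out


if __name__ == "__main__":
    # Second pair overflows int16 (2 * 255 * 127 = 64770) and saturates.
    print(pmaddubsw([1, 2, 255, 255], [3, -4, 127, 127]))
```

On x86 this happens across a whole vector register in one instruction, which is the part a 4004 would have to spell out one nibble at a time.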