Hacker News
by genewitch 460 days ago
My desktop GPU can run small models at 185 tokens a second. Larger models with speculative decoding: 50 t/s. With a small, fine-tuned model as the draft model, no, this won't take much power at all to run inference.

Training, sure, but that's buy once cry once.

Whether this means it's a good idea, I don't think so, but the energy usage for parsing isn't why.

5 comments

A simple text parser would probably be 10,000,000 times as fast, so saying that this won't take much power at all is a bit of an overstatement.
50 tokens per second. Compared to a quick and dirty parser written in Python or even a regex? That's going to be many orders of magnitude slower and costlier.
awk would run millions of times faster, not to mention mawk and awka.
In order to make the point that

> energy usage for parsing isn't why

You'll need to provide actual figures and benchmark these against an actual parser.

I've written parsers for larger-scale server stuff. And while I too don't have these benchmarks available, I'd wager quite a lot that a dedicated parser for almost anything will outperform an LLM by orders of magnitude. I won't be surprised if a parser written in Rust uses upwards of 10k times less energy than the most efficient LLM setup today. Hell, even a sed/awk/bash monstrosity probably outperforms such an LLM hundreds of times over, energy-wise.
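For anyone who wants actual figures on the parser side, here is a minimal sketch of a throughput measurement in Python. The `key=value` input format, the sample data, and the function names `parse_line` and `measure_throughput` are all assumptions made up for illustration, and it measures speed only, not energy.

```python
import re
import time

# Hypothetical input format: whitespace-separated key=value pairs with integer values.
LINE_RE = re.compile(r"(?P<key>\w+)=(?P<value>\d+)")

def parse_line(line):
    """Return a dict of key -> int for every key=value pair on the line."""
    return {m.group("key"): int(m.group("value")) for m in LINE_RE.finditer(line)}

def measure_throughput(lines):
    """Parse every line once and return (records, lines_per_second)."""
    start = time.perf_counter()
    records = [parse_line(line) for line in lines]
    elapsed = time.perf_counter() - start
    rate = (len(lines) / elapsed) if elapsed > 0 else float("inf")
    return records, rate

if __name__ == "__main__":
    # Synthetic sample data, generated here only so the sketch is self-contained.
    sample = [f"id={i} size={i * 3} status=200" for i in range(100_000)]
    records, rate = measure_throughput(sample)
    print(f"parsed {len(records)} lines at {rate:,.0f} lines/s")
```

The lines-per-second figure is only comparable to a tokens-per-second figure once you decide how many tokens a line costs the model, and turning either into energy still needs a power reading from the hardware.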

How many times would you need to parse before using an LLM to write a parser, and then using that parser, saves energy over using an LLM to do the parsing directly?
It sounds like you need to learn how to program without using an LLM, but even if you used one to write a parser, and it took you 100 requests to do so, you would very quickly get the desired energy savings.
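The break-even point asked about here is simple arithmetic, sketched below. The function name `break_even_parses` and all the numbers are hypothetical, and it assumes each request made while writing the parser costs about as much energy as one LLM parsing call; only the "100 requests" figure comes from the comment above.

```python
import math

def break_even_parses(requests_to_write, energy_per_llm_call, energy_per_parser_run):
    """Smallest number of parses N at which writing a parser once has paid off.

    Solves N * energy_per_llm_call >= one_off_cost + N * energy_per_parser_run,
    where one_off_cost = requests_to_write * energy_per_llm_call (an assumption).
    Energy units are arbitrary as long as they are consistent.
    """
    saving_per_parse = energy_per_llm_call - energy_per_parser_run
    if saving_per_parse <= 0:
        raise ValueError("the parser must be cheaper per run than the LLM call")
    one_off_cost = requests_to_write * energy_per_llm_call
    return math.ceil(one_off_cost / saving_per_parse)

if __name__ == "__main__":
    # Made-up ratio: one parser run costs 1/10,000 of one LLM call.
    print(break_even_parses(100, 1.0, 0.0001))
```

Under these assumptions the answer stays close to the number of requests spent writing the parser, because the parser's own per-run cost is negligible next to an LLM call.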

This is the kind of thinking that leads to modern software being slower than software from 30 years ago, even though it is running on hardware that's hundreds of times faster.

People not using The AWK Programming Language as a reference for parsing stuff, and maybe The C Programming Language with AWKA (an AWK-to-C translator) and a simple CSP library for threading, yields a disaster for computing.

LLMs are not the solution; they are the source of big trouble.

> using an llm to write a parser

You're assuming OP needs an LLM to write a parser. Since they mention writing many during their career, they probably don't need one ;)

I was thinking more of when a sufficiently advanced device would be able to “decide” the task would be worth using its own capabilities to write some code to tackle the problem rather than brute force.

For small problems it’s not worthwhile, for large problems it is.

It’s similar to choosing to manually do something vs automate it.

I didn't use an LLM back then, but I would totally do that today (Copilot).

Especially since the parser(s) I wrote were rather straightforward finite state machines with stream handling in front, parallel/async tooling around it, and at the core business logic (domain).

Streaming, job/thread/mutex management, and FSMs are all solved and clear. And I'm convinced an LLM like Copilot is very good at writing code for things that have been solved.

The LLM, however, would get very much in the way in the domain/business layer, because it hasn't got the statistical body of examples to handle our case.

(Parsers I wrote were, among others: IBAN, GPS trails, user-defined calculations (simple math formulas), and a DSL to describe hierarchies. I wrote them in Ruby, PHP, Rust and Perl.)
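To give a sense of how small such a parser can be, here is a rough Python sketch of an IBAN check written as a tiny state machine followed by the standard mod-97 checksum. It is not the commenter's code; the function name `is_valid_iban` and the state names are made up, and it only checks the general 15 to 34 character bounds, not per-country lengths or BBAN formats.

```python
def is_valid_iban(text):
    """Check the overall IBAN shape with a small FSM, then the mod-97 checksum."""
    chars = [c for c in text.upper() if not c.isspace()]
    if not 15 <= len(chars) <= 34:
        return False

    # States: two country letters, two check digits, then an alphanumeric BBAN.
    state = "COUNTRY"
    for i, c in enumerate(chars):
        is_letter = "A" <= c <= "Z"
        is_digit = "0" <= c <= "9"
        if state == "COUNTRY":
            if not is_letter:
                return False
            if i == 1:
                state = "CHECK"
        elif state == "CHECK":
            if not is_digit:
                return False
            if i == 3:
                state = "BBAN"
        elif not (is_letter or is_digit):
            return False

    # Move the first four characters to the end, map A-Z to 10-35,
    # and fold everything into a running remainder modulo 97.
    remainder = 0
    for c in chars[4:] + chars[:4]:
        value = int(c, 36)  # '0'-'9' -> 0-9, 'A'-'Z' -> 10-35
        shift = 10 if value < 10 else 100
        remainder = (remainder * shift + value) % 97
    return remainder == 1

if __name__ == "__main__":
    # Commonly published example IBAN, used here only as test input.
    print(is_valid_iban("GB82 WEST 1234 5698 7654 32"))
```

The running remainder keeps the arithmetic in small integers, which is the part that ports most directly to the other languages mentioned.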

It’s not just about the energy usage, but also purchase cost of the GPUs and opportunity cost of not using those GPUs for something more valuable (after you have bought them). Especially if you’re doing this at large scale and not just on a single desktop machine.

Of course you were already saying it’s not a good idea, but I think the above definitely plays a role at scale as well.

You’re right, I could be trying to get Crysis to run at 120 fps.
If you have spare GPU time you could donate it to projects like Folding@Home.
My Atom N270 netbook with mawk and a few lines parsing the files with a simple regex will crush your GPU+LLM setup on both time and power usage.