If LLMs were local and cheap, sure. They’re just too heavyweight a tool to use for simple CLI output manipulation today. I don’t want to send everything to the cloud (and pay a fee), and even if it were a local LLM, I don’t want it to eat all my RAM and battery to do simple text manipulation.
In 20 years, assuming some semblance of Moore’s law still holds for storage/RAM/GPU, I’m right there with you.
On my M1 Pro/16GB RAM Mac I get decently fast, fully local LLMs that are good enough to do this sort of thing. I use them in scripts all the time. Granted, I haven’t checked the impact on battery life, but I definitely haven’t noticed any difference in my regular use.
https://github.com/ggerganov/llama.cpp is a popular local-first approach. LLaMA is a good place to start, though I typically use a model from Vertex AI via API.
I have ollama's server running, and I interact with it via the REST API. My preferred model right now is Intel's neural chat, but I'm going to experiment with a few more over the holidays.
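For anyone wondering what that looks like in a script, here is a minimal sketch. It assumes ollama's default local endpoint (http://localhost:11434/api/generate) and a model tag of "neural-chat"; the helper names are made up for illustration and are not the parent commenter's actual setup.

```python
import json
import urllib.request

# Assumed default address of a locally running ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="neural-chat"):
    """Build a non-streaming POST request for the /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def extract_text(raw_reply):
    """Pull the generated text out of a non-streaming JSON reply."""
    return json.loads(raw_reply)["response"]


def ask(prompt, model="neural-chat"):
    """Send the prompt to the local server and return the model's text."""
    with urllib.request.urlopen(build_request(prompt, model)) as reply:
        return extract_text(reply.read())
```

Calling ask("...") only works while the server is running and the model has been pulled; the other two helpers are plain data handling and run without it.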
I use ollama (https://ollama.ai/), which supports most of the big new local models you might've heard of: llama2, mistral, vicuna, etc. Since I have 16GB of RAM, I stick to the 7B models.
Yeah, it would be much better if you could send a sample of the input and desired output and have the LLM write a highly optimized shell script for you, which you could then run locally on your multi-gigabyte log files or whatever.
With fine-tuning, you can get really good results on specific tasks with models that run on regular CPU and memory. I'd suggest looking into the distillation research, where large-model expertise is transferred to much smaller models.
Also, an LLM trained to be good at this task has many more applications than just turning command output into structured data. It's actually one of the most compelling business use cases for LLMs.
The complaint is less whether it would work, and more a question of taste. Obviously taste can be a personal thing. My opinions are my own and not those of the BBC, etc.
You have a small C program that processes this data in memory, and dumps it to stdout in tabular text format.
Rather than simplify by stripping out the problematic bit (the text output), you suggest adding a large, cutting-edge, hard to inspect and verify piece of technology that transforms that text through uncountable floating point operations back into differently-formatted UTF8.
It might even work consistently (without you ever having 100% confidence it won't hallucinate at precisely the wrong moment).
You can certainly see it being justified for one-off tasks that aren't worth automating.
But to shove such byzantine inefficiency and complexity into an engineered system (rather than just modify the original program to give the format you want) offends my engineering sensibilities.
If you can modify the original program, then that is by far the best way to go. More often than not, you cannot change the program, and, more broadly, most unstructured content is not produced by programs.
Unfortunately, in my experience it is required more often than not. I have two open issues with upstreams that I'm trying not to work around. They haven't even replied, let alone considered changes or supported contributions. These aren't small projects either. You'd be surprised how many solo-maintained projects won't even entertain
Anyway, I do try, but in my experience, if it happens, it's not overnight, and you're going to have to maintain a workaround for some amount of time.
Yes, makes sense. Although this was originally a post about output of common command-line tools. Some of these are built on C libraries that you can just use directly. They are usually open source.
As someone who maintains a solution that solves similar problems to jc, I can assure you that you don’t need an LLM to parse most human-readable output.
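As a rough illustration of that point (not how jc or any particular tool does it), a few lines of deterministic code cover the common case of a header row followed by whitespace-separated columns. The sample table is invented, and the parser assumes that header names contain no spaces and that only the last column may.

```python
def parse_table(text):
    """Turn whitespace-separated tabular output into a list of dicts.

    Assumes the first non-empty line is the header, that header names
    contain no spaces, and that only the last column may contain spaces.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    headers = lines[0].split()
    rows = []
    for line in lines[1:]:
        # Limiting the split keeps spaces inside the final column intact.
        fields = line.split(None, len(headers) - 1)
        if len(fields) != len(headers):
            raise ValueError(f"unexpected column count in line: {line!r}")
        rows.append(dict(zip(headers, fields)))
    return rows


# Invented sample input, loosely shaped like process-listing output.
SAMPLE = """\
PID   USER   CPU   COMMAND
1     root   0.0   /sbin/init splash
412   me     1.3   python3 script.py
"""
```

A malformed line fails loudly with a ValueError that names the offending line, so a bad input is reported rather than guessed at.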
This is a terrible idea. I can't think of a less efficient method with worse correctness guarantees. What invariants does the LLM enforce? How do you make sure it always does the right thing? How do you debug it when it fails? What kind of error messages will you get? How will it react to bad inputs: will it detect them (unlikely), or will it hallucinate an interpretation (most likely)?
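To make the invariants question concrete: the model enforces none by itself, so any checking has to live in ordinary code that wraps it. A minimal sketch, assuming the model was asked to return a JSON array of objects with the hypothetical fields "pid" and "command":

```python
import json

# Hypothetical field names, chosen only for this example.
REQUIRED_FIELDS = ("pid", "command")


def validate_rows(raw):
    """Parse raw as JSON and enforce a few structural invariants.

    Raises ValueError with a specific message instead of passing bad data on.
    """
    try:
        rows = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(rows, list):
        raise ValueError("expected a JSON array of rows")
    for i, row in enumerate(rows):
        if not isinstance(row, dict):
            raise ValueError(f"row {i} is not an object")
        missing = [key for key in REQUIRED_FIELDS if key not in row]
        if missing:
            raise ValueError(f"row {i} is missing fields: {missing}")
        pid = row["pid"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(pid, bool) or not isinstance(pid, int) or pid < 0:
            raise ValueError(f"row {i} has an invalid pid: {pid!r}")
    return rows
```

This only catches structural failures. A well-formed row with a plausible but wrong value passes straight through, and catching that would require checking the values against the original text.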
I used to focus on the potential pitfalls and be overly negative. I've come to see that these tradeoffs are situational. After using them myself, I can definitely see upsides that outweigh the downsides
Developers make mistakes too, so there are no guarantees either way. Each of your questions can be asked of handwritten code too