| HN Mirror

	by visarga 1045 days ago
	In my tests LLaMA2-13B is useable for information extraction tasks and LLaMA2-70B is almost as good as GPT-4 (for IE). These models are the real thing. We can fine-tune LLaMAs, unlike OpenAI's models. Now we can have privacy, control and lower prices. We can introduce guidance, KV caching and other tricks to improve the models. The enthusiasm around it reminds me of JavaScript framework wars of 10 years ago - tons of people innovating and debating approaches, lots of projects popping up, so much energy!

7 comments

pavlov 1045 days ago

> “The enthusiasm around it reminds me of JavaScript framework wars of 10 years ago”

Hmm. If LLMs turned out like JS frameworks, that would mean that in ten years people will be saying:

“Maybe we don’t really need all this expensive ceremony, honestly this could be done with vanilla if/else heuristics…?”

spacebanana7 1045 days ago

I can imagine a bloated world where 500B param models are used for tasks where 7B param models perform adequately.

At that time, there could be complaints on Hacker News about messaging apps with autocomplete models that take up gigabytes.

pavlov 1045 days ago

I could see a world where people turn to an LLM for a task where the hand-rolled solution could be a simple state machine or some nested switch statements.

The irony would be that the LLM could write you that code, but if you don’t know to ask…

falcor84 1045 days ago

The real irony would be if the AI uses an LLM to write the code the second time it sees the same request repeated, deploys it to effortlessly deal with all future requests, and goes back to playing Quake while collecting the full paycheck.

phillipcarter 1045 days ago

A key difference is that the impact of these enormous models is abstracted away. I don't feel the impact of having to run GPT-3.5-turbo at scale because it's just an API call away, and it's even more reliable now.

The main critiques outside of data privacy I've read are related to energy consumption, but even then, it's...not compelling? I read an article[0] that estimated the training of ChatGPT (3.5) to emit as much CO2 as more than 3 round-trip flights between SF and NYC. That's not good! But it also really highlights that if we're to reduce emissions, there are clearly bigger targets than the largest ML models in the world.

[0]: https://themarkup.org/news/2023/07/06/ai-is-hurting-the-clim...

dmd 1045 days ago

"How can I sum this column of numbers?"

"IDK, throw it at the LLM"

azeirah 1045 days ago

I do stuff like this sometimes when I have some csv or something and need it in JSON.

Could easily be done with 1 line of bash or js or python or whatever but... it's easier to just let the LLM do it for me :|

satvikpendem 1045 days ago

Or you could ask the AI for a script to convert it (whose resultant JSON would be free of hallucinations) and even ask the AI to run the code, since I believe ChatGPT now has that ability. I've definitely noticed that ChatGPT will sometimes mess up the data when converting from one format to another.
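
For reference, the kind of conversion script being discussed is short. Here is a minimal sketch in Python, assuming the CSV has a header row. The function name `csv_to_json` and the sample data are made up for illustration.

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Convert CSV text with a header row into a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Values stay strings: the csv module does not infer numeric types.
    return json.dumps(list(reader), indent=2)


if __name__ == "__main__":
    # Hypothetical sample input, only to show the call.
    sample = "name,score\nalice,10\nbob,7\n"
    print(csv_to_json(sample))
```

Because the conversion is deterministic, every row in the output comes straight from the input, which is the point made above about avoiding hallucinated data.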

why_only_15 1045 days ago

That would be awesome, but we've tried for decades and haven't gotten there with basic if/else. I do think it's pretty plausible that if you combine some very slimmed down models with strong heuristics you could get far though. At the moment I think expense doesn't really matter -- an hour of a knowledge worker's time is worth 416,000 tokens of GPT-4, the most expensive model out there. For llama-2 it's even less time. Unless you're processing truly epic numbers of tokens, by far the most important question is whether we can use these things for real.
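
The comment does not say how the 416,000 figure was derived. One set of inputs that reproduces it is an hourly rate of $25 and a price of $0.06 per 1,000 tokens. Both values are assumptions chosen here only to illustrate the arithmetic, not numbers taken from the comment.

```python
def tokens_per_hour_of_labor(hourly_rate_usd: float, usd_per_1k_tokens: float) -> int:
    """Return how many tokens one hour of labor would buy at a given token price."""
    return int(hourly_rate_usd / usd_per_1k_tokens * 1000)


# Assumed inputs (not stated in the comment).
ASSUMED_HOURLY_RATE_USD = 25.0
ASSUMED_USD_PER_1K_TOKENS = 0.06

if __name__ == "__main__":
    tokens = tokens_per_hour_of_labor(ASSUMED_HOURLY_RATE_USD, ASSUMED_USD_PER_1K_TOKENS)
    print(f"{tokens:,} tokens per hour of labor")
```

With these inputs the result lands a little above 416,000. A different wage or price shifts the number proportionally, but the argument only depends on the ratio being large.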

pavlov 1045 days ago

> "That would be awesome, but we've tried for decades and haven't gotten there with basic if/else."

Oh, I know — I was trying to throw some shade on the state of JS frameworks rather than LLMs. With the pendulum now swinging back to vanilla DOM manipulation, it feels like the enormous effort spent on devising ways to wrap web UIs in endless variations of abstractions might have been somewhat of a waste.

antupis 1045 days ago

Well, these models at least tell you in the name what they do, e.g. llama-2-70b-Guanaco-QLoRA-fp16.

intelVISA 1045 days ago

Subtle :)

broast 1045 days ago

Any information about how long it takes to achieve a fine tune comparable to OpenAI's current fine tunes of ada models? On consumer hardware vs cloud? OpenAI's fine tune times are on the order of hours for tens of thousands of samples, but expensive. Any information on the effort and time involved in fine tuning compared to OpenAI's current process would be appreciated.

MrYellowP 1045 days ago

> information extraction task

I do that with orca-mini-3b in ggml format and it's pretty good at it, at twice the speed. Of all the LLMs I've tried, this one gave me the best results. It just requires a properly written prompt.

nsomaru 1045 days ago

Could you elaborate on the prompting strategies you have used that are more effective?

asabla 1045 days ago

> The enthusiasm around it reminds me of Javascript wars 10 years ago... so much energy!

I kind of have the same feeling as well. With all this energy it's really hard to keep up with all new ideas, implementations, frameworks and services.

Really excited for what this will bring us in the coming years.

sva_ 1045 days ago

> With all this energy it's really hard to keep up with all new ideas, implementations, frameworks and services.

The majority of them are mostly irrelevant. You just need to figure out which.

dchuk 1045 days ago

Can you share an example of information extraction prompts? I'm specifically interested in using LLMs as general-purpose web scrapers that can take HTML and extract matching data per a prompt into a structured JSON schema. Do you think this is possible with llama 2?
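
One way the setup described here could look is sketched below. It is not a prompt anyone in the thread shared: the function names `build_extraction_prompt` and `parse_extraction`, the prompt wording, and the field list are all hypothetical. No model is called, because the reply is validated as plain text whichever model produced it, and whether a given model follows the format reliably is something to test.

```python
import json


def build_extraction_prompt(html: str, fields: dict[str, str]) -> str:
    """Build a prompt asking a model to return the listed fields as one JSON object."""
    field_lines = "\n".join(f'- "{name}": {desc}' for name, desc in fields.items())
    return (
        "Extract the following fields from the HTML below.\n"
        "Respond with a single JSON object and nothing else. "
        "Use null for any field that is not present.\n\n"
        f"Fields:\n{field_lines}\n\n"
        f"HTML:\n{html}\n"
    )


def parse_extraction(reply: str, fields: dict[str, str]) -> dict:
    """Parse a model reply and check that it has exactly the requested keys."""
    data = json.loads(reply)  # raises ValueError (JSONDecodeError) on non-JSON output
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = set(fields) - set(data)
    extra = set(data) - set(fields)
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")
    return data


if __name__ == "__main__":
    # Hypothetical fields and a hand-written reply standing in for model output.
    fields = {"title": "the product title", "price": "the price as a string"}
    print(build_extraction_prompt("<h1>Widget</h1><span>$5</span>", fields))
    print(parse_extraction('{"title": "Widget", "price": "$5"}', fields))
```

Validating the keys after parsing catches the common failure where the model adds commentary or drops a field, so a bad reply can be retried instead of silently stored.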

behnamoh 1045 days ago

> ... and lower prices.

Not sure about this. atm, the cost of any cloud GPU (spot or not) far exceeds the cost of OpenAI's API. I'd be glad to be proven wrong because I, too, want to run L2 (the 70b model).

Also, buying a GPU, even a 4090, is not feasible for most people. And it's not just about the GPU: you'd have to build a PC for it to work, and there's the hidden maintenance cost of running desktop Linux (to use GPTQ for instance). It's not surprising that most users prefer someone else (OpenAI) to do it for them.

jddj 1045 days ago

I have to admit, I wouldn't have imagined even a few months ago that I'd be reading this comment.

Sure, you can run something comparable to OpenAI's flagship product at home, but it's moderately expensive and slightly inconvenient so people will still pay for the convenience.

worldsavior 1045 days ago

It looks like it will always be a war, like Android vs. iOS, only now it's with AI models.
