Hacker News new | ask | show | jobs
by tantalor 671 days ago
Is this just using LLM to be cool? How does pure LLM with basic "In the scale between 0-10 ..." prompt stack up against traditional, battle-tested sentiment analysis tools?

Gemini suggests NLTK and spaCy

https://www.nltk.org/

https://spacy.io/

3 comments

I'm wondering how their LLM parsing 250 mil words in 9 hours compares with performance of traditional sentiment analysis.

Also, many exisiting sentiment analysis tools have a lot of research behind them that can be referenced when interpreting the results (known confounds etc). I don't think there is yet an equivalent for the LLM approach

Pretty slow. I built a sentiment analysis service (https://classysoftware.io/) and 250M words @ ~384 words per message I’m pushing 5.6 hours to crunch all that data, and even at that I’m pretty sure there are ways to push it lower without sacrificing on accuracy.
And yet, it's so much easier to deploy an LLM, either through a service or on prem.
It's easier to do a lot of things. That doesn't make it better.
But it does make people to feel like that action is now possible. And once someone believes something is possible, they're more likely to do it
Often makes things built on top of them better because of improved speed of iteration.
Sometimes, doing something different does result in something better.

For example, EVs. Compare EVs to ICEVs and you can point out a lot of faults, but ICEVs have had 100 years of refinement. Perhaps you're comparing battle hardened SA with fledgling LLM based SA?

Never don't do new things, not only when its for fun, but especially when its for fun. If you want to be closed minded, that's your choice, but don't try to put that mentality onto others.

Keep your non-hacker mindset to yourself.

What do you mean? Deploying something like spaCy is far easier than deploying an LLM in my experience.
By "deploy", they almost certainly mean "set up to use" and they may have also included "learn how to use" and all its various forms, as well.

LLMs really are almost magic in how they can help in this space; and setting them up is often just getting an API key and throwing some money and webservice calls at them.

Set up to use, sure. Learn? Learning isn’t deployment in this context.

Setting up spaCy is just `pip install spacy`. No need to worry about GPUs or dedicated services like you do with LLMs.

yeah, then you have to learn the API. Basically every option for running LLMs have converged on the openAI (web) API.
pip install vllm, boom you have an openapi-compatible webserver. No further action necessary.
No, there is further action necessary. If you want any kind of decent performance, you need to run it with an appropriate GPU. This is not true of spaCy, which makes spaCy easier to deploy.
Because LLMs WILL dominate all NLP use cases, whether you like it or not.

Its like the linux of operating systems. Sure you can handwrite up some custom OS more specialized for a purpose. But its much easier to just use linux, which everyone understands on a basic level and is extremely robust, and modifying it slightly for the end goal.

And saying "Traditional sentiment analysis" tools are "Battle tested" is laughable. LLMs in the past year alone, probably has 1000x the cumulative usage of all sentiment analysis tools in history.

LLMs get 100 billion + each year in research, improvements, engineering, optimisations.

LLMs keep rapidly improving year to year in capabilities. Sonnet 3.5 already obliterates the original GPT-4 in every aspect.

LLMs keep getting cheaper year to year. Gemini flash is like 100x cheaper than the original GPT3.5.

You can onboard any person who can write python, to start using LLMs to perform language analysis in a day. Versus weeks to use these traditional tools.

Nearly all NLP tasks will be standardised to use LLMs as the baseline default tool. Sure there'll be some short term degradations in some specific aspect, but there's no stopping the tide.

By the way, traditional ML-based translation is also pretty much dead and replaced by LLMs. I've been seeing an explosion in fan-translations done by say Sonnet 3.5, the improvement in fluency and accuracy is just radical and extreme, I often don't even notice the AI-translation anymore.

Aside from a half dozen or so zeros, you're right on.
on what, the spending? Facebook alone said that it will spend $40b this year on AI. probably not all of it is on Llama but a sizable portion is.
Sorry, but not really. If you know what you do, you don't just pick an LLM. LLMs are trained/built for a specific task: text generation. Other models are trained on different tasks. If you know what you do, you compare models (I don't mean LLM models with that!) and choose the best performing. Just because LLMs receive more training doesn't mean they have a better performance. Very weird and flawed way of thinking. This is just hype thinking
I have to agree with the parent. LLMs are excellent at a large range of NLP tasks. Of course they are not going to replace all ML models, but when it comes to NLP they are clearly better than lots of trained models (e.g. https://arxiv.org/pdf/2310.18025).
LLMs are general purpose tools and absolutely are not better than trained models (using the latest techniques) for a specific task. I mean, that's obviously true if you think about it.

You can use similar datasets and the latest model architectures and if you train a model purely for sentiment analysis it will be better than frontier general purpose LLMs for sentiment analysis.

It's really mind-boggling that so many people disagree via downvotes that you compare models and choose the best performing one, independent of the hype ...
I can see a simliarity here in comparing Java/JavaScript/any other modern, more productive language to C. Yes, both can write more or less the same program, but you'll get the same result with less effort and more quickly with the modern languages. Yes, modern languages will always be slower and heavier on resources than C.
That's inaccurate, because traditional sentiment analysis, or rather the entire NLP ecosystem, is a very niche and underoptimized space.

Its not comparing C against Javascript, its comparing Ada against Javascript. Ada is not going to be any faster than javascript because its too niche and therefore underoptimised.

The theoretical minimum computation required by LLMs is far higher than traditional simple NLP algorithms. But the practical computation cost of LLMs will soon be cheaper, because LLMs get so much investment and use, there's massive full-stack optimizations all the way from the GPU to the end libraries.