Hacker News | HN Mirror


	by hustwindmaple1 769 days ago
	I wonder why companies like IBM are jumping on the LLM bandwagon and training/releasing models that have no chance of competing with Llama/Mistral? To me it just looks like a complete waste of $$ because nobody will use them in any serious scenarios.

9 comments

paxys 769 days ago

IBM made $60 billion in revenue last year. Where do you think it all came from? The same companies/governments that buy their overpriced crap are going to buy these new LLMs as well.

breezeTrowel 769 days ago

These are open weight models released under an Apache 2.0 license. There's nothing to buy.

kkielhofner 768 days ago

IBM is a sales and services org.

Their customers aren’t going to build their own RAG and agent frameworks, vector DBs, data ingest pipelines, finetunes, high scale inference serving solutions, etc, etc.

There’s an incredible amount of stuff to buy.

hustwindmaple1 768 days ago

Right, but they can just use Llama/Mistral for free, instead of their inferior models, which I'm sure take quite a bit of resources to train in the first place.

kkielhofner 767 days ago

Yes but using someone else's models doesn't make them an "AI company".

paxys 768 days ago

Who is going to host them?

xarope 769 days ago

Enterprises think differently. They want data provenance, privacy, ability to mitigate/transfer risk etc. If IBM is willing to offer that, there will be enterprises that bite.

_lvbh 769 days ago

Llama and Mistral are already local & fulfill these requirements

abdullin 769 days ago

IBM goes to great lengths to train models on clean data that has a lower risk of copyright or legal issues attached. Just take a look at the model description.

That data issue is important enough for some companies to pick a mediocre model over Llama or Mistral.

doctorpangloss 769 days ago

What if I told you that a lot of freely licensed code on GitHub is not clean? That the authors may have read something and rewritten it in a way that wasn’t transformative? So it basically has the same problems.

bayindirh 768 days ago

What if I told you the supposedly clean "The Stack" dataset contains at least one GPL repository inside, just because their license detection tool bugged out?

IBM and other big players are vigilant about these things, and this is what companies pay for.

Their software may not be better in some metrics, but they're cleaner in some and their support contracts allow people to sleep tight at night.

This is what money buys. Peace of mind and continuity.

maccard 768 days ago

> IBM and other big players are vigilant about these things, and this is what companies pay for.

And more importantly, IBM will guarantee it in the case that they're wrong. _That's_ what companies pay for.

doctorpangloss 768 days ago

Indemnity is moving the goal posts, no? So you’re conceding that their data isn’t clean. But they say it’s clean.

This support contract stuff: what are you talking about? You download these models, you use them. What would you pay for? It’s not clean data, they say it’s clean: why would I pay liars? Let’s game out the indemnity idea. I pay $10k/mo for 12 months. Then OpenAI loses v. NYTimes, ruled LLM training is not fair use, need express permission. IBM pulls the models. What the hell did I pay $120k for? And by the way, you can pay a law student 1 beer to tell you OpenAI is going to lose because of Warhol v Goldsmith. You can do whatever you want with your money, but I personally would not waste it on worthless indemnity.

rolisz 769 days ago

Yes, but nobody got fired for buying IBM

antod 769 days ago

Yeah if something you install doesn't work, you get the blame. If IBM supplies something that doesn't work (likely), you get to blame them instead.

szszrk 769 days ago

Between "works" and "doesn't work" there is a full rainbow of possible answers, KPIs, yearly reviews in a network of matrix reporting.

There will be market for their services. Maybe a different one, but there will be.

morgante 769 days ago

That's a pretty outdated phrase, even in enterprise.

kubami 769 days ago

By no means is it an outdated phrase. Ask any startup sales person!

morgante 769 days ago

I personally know a VP who was fired for buying "IBM Cloud." You can absolutely get flak for choosing IBM these days, even at a stodgy enterprise.

The gist is still current, but you need to fill in AWS as the current uncontroversial choice.

jazzyjackson 769 days ago

but who can you pay to run these models and fulfill these requirements /for you/ ;)

worthless-trash 768 days ago

I could be wrong, but I think that's what the RHEL AI topic was all about 24 hours ago?

keefle 769 days ago

Can I sue Llama and Mistral if things go wrong?

dartos 768 days ago

Llama is owned by Meta, so you’d be suing Meta.

But I’m pretty sure both models have “we’re not responsible” clauses.

keefle 768 days ago

That was my point. Whereas if you are using a service from IBM as an enterprise, you would be able to sue them

insane_dreamer 768 days ago

“Nobody was ever fired for hiring IBM”

mhh__ 769 days ago

IBM do a mixture of shovelware and extremely hardcore tech so they could honestly go either way with this.

semi-extrinsic 769 days ago

Agreed. For example, their research lab in Zurich has been absolutely world-leading in things like atomic force microscopy (AFM) for four decades, including the Nobel prizes in Physics in 1986 (scanning tunneling microscopy, the precursor to AFM) and 1987 (high-temperature superconductivity). They also invented things like trellis coding and token ring.

propter_hoc 769 days ago

> mixture of shovelware and extremely hardcore tech

Citation needed

All I've seen from them in my professional experience is actually legacy mainframe maintenance. Not shovelware, but very far from hardcore tech.

mistrial9 769 days ago

no - here is an example from Aug 2019 EE Times:

PALO ALTO, Calif. – IBM defined at (trade show ed.) Hot Chips a new interface for the 2020 version of its Power 9 CPUs. The Open Memory Interface (OMI) will enable packing on a server more main memory at higher bandwidth than DDR, and as a potential Jedec standard could rival GenZ and Intel’s CXL.

OMI basically removes the memory controller from the host, relying instead on a controller on a relatively small DIMM card. Microchip’s Microsemi division already has a DDR controller running on cards in IBM’s labs. The approach promises to deliver up to 4TBytes memory on a server at about 320GBytes/second or 512GB at up to 650GB/s sustained rates.

starspangled 763 days ago

https://research.ibm.com/semiconductors#publications

https://research.ibm.com/blog/albany-semiconductor-research-... etc

IBM doesn't have fabs, but they still do R&D into semiconductors that very much target future commercial processes. They do a fair bit on quantum computing too, to name just a couple of things.

andsoitis 769 days ago

“The South Korean technology giant Samsung Electronics was awarded a total of 6,165 United States patents in 2023, the most of any company. Qualcomm ranked second among companies, with 3,854 U.S. patents granted, followed by the likes of Taiwan Semiconductor Manufacturing Company and IBM.” — https://www.statista.com/statistics/274825/companies-with-th...

voidmain0001 769 days ago

Non-mainframe maintenance:

https://research.ibm.com/blog/ibm-molecule-generation-experi...

https://www.smithsonianmag.com/smart-news/ibm-engineers-push...

mhh__ 768 days ago

Those mainframes are actually pretty modern and interesting.

If IBM split off half of their mainframe division and let some competition get going I think the segment could actually be something to contend with.

The basic idea of the IBM mainframe is almost perfect for what a lot of companies actually need (massively reliable hardware to support lots of middling software; most work is shunting data around) but everyone knows they're going to get locked into IBM.

seankurtz 768 days ago

On the contrary, the maintenance and continued improvement of an entire ISA and ISA-specific operating systems is exactly my idea of hardcore tech, i.e. continuing to pay a chip org to design new chips for said ISA every generation and implement new instructions...and continuing to pay OS and compiler programmers to work those into their OSes and compilers...I'm not sure where we draw the line on maintenance vs. continued development here, but I'm not sure I'd call that purely maintenance.

There really aren't a lot of companies out there that can claim to do similar (and of course besides s390x, an ancient and venerable CISC, IBM also has Power, so they are doing this 2x over). You'll find a lot of IBM employees contributing to what I'd consider "hardcore" tech like LLVM and the Linux kernel as a result, because they genuinely have a large amount of expertise in those and similar areas. And here I'm not even really including Red Hat, but if you include them then they are even more overweight in the hardcore tech category.

If anything, a lot of the rest of the tech industry has left "hardcore tech" behind due to efficiency concerns as a result of a longrunning industry wide process of consolidation and commodification that IBM has resisted for obvious reasons. IBM is hardcore to a fault if anything.

TLDR: I actually think IBM punches above their weight in the "hardcore tech" area so long as our definition is sufficiently low level rather than say, cloud services, in which case fair enough you can probably fairly say they suck at that.

Here I've also chosen to entirely ignore IBM research.

esafak 769 days ago

IBM Research is hardcore.

Brajeshwar 769 days ago

When they pitch potential clients for their services, their slides on LLM, AI, ML, etc., must be their own. Whether they use it or not for the services does not matter. These are like the side projects that service companies release to help them close their clients.

cess11 769 days ago

Same reason they jumped on the clown bandwagon, it's the kind of offering it's expected to have when you're a company like that. Huge size, leading research departments, big enterprise customers.

They've been doing "AI" for ages. Notably Watson over the last couple of decades or so.

jujube3 763 days ago

What is "the clown bandwagon"?

Jedd 769 days ago

> ... models that have no chance of competing with ...

I've not seen any proper evaluations for Granite against, say, Llama or Mistral.

Until we do it's probably too early to say they can't compete, at least in some areas where others perform poorly.

abdullin 769 days ago

They are Ok-ish.

Previous Granite models were on the level of the first Llama in my benchmarks.

I’m expecting this version to be roughly comparable to Llama 2.

logicchains 769 days ago

>I wonder why companies like IBM are jumping on the LLM bandwagon and training/releasing models that have no chance of competing with Llama/Mistral

Did you even read the benchmarks they post on that link? Assuming they're not outright lying, their 8B model is superior to Llama/Mistral models of the same size for coding tasks.

papruapap 768 days ago

prob getting some reputation in AI space will help them to sell watsonx. tbf, watson predates Transformers paper.

halJordan 769 days ago

On the other hand, I spend my time wondering why people like you think someone should just throw away their ideas simply because there's already someone in the niche.
