Hacker News
by whytai 953 days ago
Every day this video ages more and more poorly [1].

categories of startups that will be affected by these launches:

- vectorDB startups -> don't need embeddings anymore

- file processing startups -> don't need to process files anymore

- fine tuning startups -> can fine tune directly from the platform now, with GPT4 fine tuning coming

- cost reduction startups -> they literally lowered prices and increased rate limits

- structuring startups -> json mode and GPT4 turbo with better output matching

- vertical ai agent startups -> GPT marketplace

- anthropic/claude -> now GPT-turbo has 128k context window!

That being said, Sam Altman is an incredible founder for being able to have this close a watch on the market. Pretty much any "ai tooling" startup that was created in the past year was affected by this announcement.

For those asking: vectorDB, chunking, retrieval, and RAG are all implemented in a new stateful API for you! No need to do it yourself anymore. [2] Exciting times to be a developer!

[1] https://youtu.be/smHw9kEwcgM

[2] https://openai.com/blog/new-models-and-developer-products-an...

23 comments

If you want to be a start-up using AI, you have to be in another industry with access to data and a market that OpenAI/MS/Google can't or won't touch. Otherwise you end up eaten like above.
> a market that OpenAI/MS/Google can't or won't touch.

But also one that their terms of service, which are designed to exclude the markets that they can't or won't touch, don't make it impractical for you to service with their tools.

Or you can treat what OpenAI is doing like a commodity like AWS and leverage it to solve a meaningful problem.
We just launched our AI-based API-testing tool (https://ai.stepci.com), despite having competitors like GitHub Copilot.

Why? Because they lack specificity. We're domain experts, and we know how to prompt it correctly to get the best results for a given domain. The moat is having the model do one task extremely well rather than do 100 things "alright".

Domain specialization could be the moat, not only in the business domain but the sheer cost of deployment/refinement.

Check out Will Bennett's "Small language models and building defensibility" - https://will-bennett.beehiiv.com/p/small-language-models-and... (free email newsletter subscription required)

If the primary value-proposition for your startup is just customized prompting with OpenAI endpoints, then unfortunately it's highly likely it could be easily replicated using the newly announced concept of GPTs.
If you just launched it is too soon to speak.
Of course! Today our assumption is that LLMs are commodities and our job is to get the most out of them for the type of problem we're solving (API Testing for us!)
Sorry to be blunt but they can be totally right, if you do not succeed and have to shut down your startup.
It certainly will be a fun experience. But our current belief is that LLMs are a commodity and the real value is in (application-specific) products built on top of them.
Exactly, everyone is so pessimistic but for every AWS sku there is a billion dollar startup that leads that market.
Time will tell
Even if you aren't eaten, the use case will just be copied and run on the same OpenAI models by competitors. Having good prompts is not a good enough moat. They win either way.
Writer.ai is quite successful, and is totally in another industry that Google+MS participate in.
depends on how much developers are willing to embrace the risk of building everything on OpenAI and getting locked onto their platform.

What's stopping OpenAI from cranking up the inference pricing once they choke out the competition? That combined with the expanded context length makes it seem like they are trying to lead developers towards just throwing everything into context without much thought, which could be painful down the road

> depends on how much developers are willing to […] getting locked onto their platform.

I mean.. the lock in risks have been known with every new technology since forever now, and not just the risk but the actual costs are very real. People still buy HP printers with InkDRM and companies willingly write petabytes of data into AWS that they can’t even afford to egress at current prices.

To be clear, I despise this business practice more than most, but those of us who care are screaming into the void. People are surprisingly eager to walk into a leaking boat, as long as thousands of others are as well.

A combination of 1) short-term business thinking (save $1 today = $1 more of EPS) and 2) fear of competitors building AI products and taking share. Thus the rush to use the first usable platform (e.g. OpenAI).

Psychology and FOMO play an interesting role in walking directly into a snake pit.

100%. I was even gonna add to my comment that these psychological biases seem to particularly affect business people, but omitted it to stay on point. I don’t think like that, but I also can’t say what works better on average, so I’ll try to stay humble.

Also, with AI there’s not really a “roll your own” option as with Cloud – the barrier of entry is gigantic, which obviously the VCs love, because as we all know they don’t like having to compete on price & quality on an open market.

I suspect it is in OpenAI's interest to have their API as a loss leader for the foreseeable future, and keep margins slim once they've cornered the market. The playbook here isn't to lock in developers and jack up the API price, it's the marketplace play: attract developers, identify the highest-margin highest-volume vertical segments built atop the platform, then gobble them up with new software.

They can then either act as a distributor and take a marketplace fee or go full Amazon and start competing in their own marketplace.

Reminds me of that sales entrapment approach from cloud providers. “Here is your free $400, go do your thing.” Next thing you know, you have built so much on there already that it is not worth the time and effort to try and relocate it, regardless of the 2k bill increase, haha. Good times.
i mean sure it's lock in, but it's lock in via technical superiority/providing features. Either someone else replicates a model of this level of capability or anyone who needs it doesn't really have a choice. I don't mind as much when it's because of technical innovation/feature set (as opposed to through your usual gamut of non-productive anti-competitive actions). If I want to use that much context, it's not OpenAI's fault that other folks aren't matching it - they didn't even invent transformers and it's not like their competitors are short on cash.
Well, if said startups were visionaries, they could've known better the business they're entering. On the other hand - there are plenty of VC-inflated balloons, making lots of noise, that everyone would be happy to see go. If you mean these startups - well, farewell.

There's plenty more to innovate, really. Saying OpenAI killed startups is like saying that PHP/Wordpress/NameIt killed small shops doing static HTML, or that IBM killed the... typewriter companies. Well, as I said - they could've known better. Competition is not always to blame.

TBH those are low-hanging fruit for OpenAI. Much of the value is still being captured by OpenAI's own model.

The sad thing is, GPT-4 is in its own league in the whole LLM game. Whatever those other startups are selling, it isn't competing with OpenAI.

I’ve been keeping my eye on a YC startup for the last few months that I interviewed with this summer. They’ve been set back so many times. It looks like they’re just “ball chasing”. They started as a chatbot app before chatgpt launched. Then they were a RAG file processing app, then enterprise-hosted chat. I lost track of where they are now but they were certainly affected by this announcement.

You know you’re doing the wrong thing if you dread the OpenAI keynotes. Pick a niche, stop riding on OpenAI’s coat tails.

> - vectorDB startups -> don't need embeddings anymore

they don't provide embeddings, but storage and query engines for embeddings, so still very relevant

> - file processing startups -> don't need to process files anymore

curious what is that exactly?..

> - vertical ai agent startups -> GPT marketplace

sure, those startups will be selling their agents on marketplace

> they don't provide embeddings, but storage and query engines for embeddings, so still very relevant

But if you use OpenAI's Assistants API, you don't need any of the chain: extract data, calculate embeddings, store data indexed by embeddings, detect the need to retrieve data by embeddings, and stuff it into the LLM context along with your prompt. In addition to letting you store your own prompts and manage associated threads, the API also lets you upload data for it to extract, store, and use for RAG at the level of either a defined Assistant or a particular conversation (Thread).
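For readers who have not built that chain by hand, here is a minimal sketch of the do-it-yourself version. It assumes a toy bag-of-words "embedding" in place of a real embedding model, and the names `TinyVectorStore`, `chunk`, and `build_prompt` are hypothetical, not part of any OpenAI or vector DB API.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def chunk(text: str, size: int = 8) -> list[str]:
    """Step 1: extract the data and split it into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


class TinyVectorStore:
    """Steps 2-3: compute embeddings and store chunks indexed by them."""

    def __init__(self) -> None:
        self.rows: list[tuple[Counter, str]] = []

    def add(self, text: str) -> None:
        for piece in chunk(text):
            self.rows.append((embed(piece), piece))

    def search(self, query: str, k: int = 2) -> list[str]:
        """Step 4: retrieve the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def build_prompt(store: TinyVectorStore, question: str) -> str:
    """Step 5: stuff the retrieved chunks into the LLM context with the prompt."""
    context = "\n".join(f"- {c}" for c in store.search(question))
    return f"Use only this context:\n{context}\n\nQuestion: {question}"


store = TinyVectorStore()
store.add(
    "Refunds are issued within 14 days of purchase. "
    "Shipping to Europe takes five business days. "
    "Support is available by email on weekdays only."
)
prompt = build_prompt(store, "How long do refunds take?")
print(prompt)
```

A real pipeline would swap `embed` for a model call and `TinyVectorStore` for a proper index; the point is only that each of the five steps is code someone has to own when it is not handled by a hosted service.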

It's easy to host your query engine somewhere else and integrate it as a search function in chatGPT. Quite easy to switch providers of search.
As in, use an existing search and call it via 'function calling' as part of the assistants routine - rather than uploading documents to the assistant API?
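A rough sketch of that idea, assuming the JSON-schema style of tool definition used for function calling. Nothing here calls OpenAI: the tool call is simulated, and `search_docs`, `DOCS`, and `handle_tool_call` are hypothetical names standing in for an existing search service and the glue code around it.

```python
import json

# Tool definition in the JSON-schema style used for function calling.
# The name "search_docs" and its parameters are hypothetical.
SEARCH_TOOL = {
    "type": "function",
    "function": {
        "name": "search_docs",
        "description": "Search a knowledge base hosted outside OpenAI.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}

# Stand-in corpus; in practice this would be your existing search backend.
DOCS = {
    "Refunds": "Refund requests are accepted within 14 days.",
    "Shipping": "Orders ship within two business days.",
}


def search_docs(query: str) -> list[str]:
    """Return titles of documents whose body contains any query term."""
    terms = query.lower().split()
    return [title for title, body in DOCS.items()
            if any(term in body.lower() for term in terms)]


# Switching search providers means changing only this mapping.
HANDLERS = {"search_docs": search_docs}


def handle_tool_call(name: str, arguments_json: str) -> str:
    """Run the function the model asked for and return a JSON string result."""
    args = json.loads(arguments_json)
    return json.dumps(HANDLERS[name](**args))


# Simulated tool call: a function name plus JSON-encoded arguments.
print(handle_tool_call("search_docs", '{"query": "refund policy"}'))
```

The documents stay wherever the search backend lives; only the search results are handed back to the model.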
they definitely do provide embeddings, https://openai.com/blog/new-models-and-developer-products-an... ctrl+f retrieval, "... won't need to ... compute or store embeddings"
I mean embeddingsDB startups don't provide embeddings. They provide databases which allow you to store and query computed embeddings (e.g. computed by ChatGPT), so they are complementary services.
Yeah, I still see a chat bot being able to look for related information in a database as useful. But I see it as just one of many tools a good chat experience will require. For me, 128k context means there are other applications to explore and larger tasks to accomplish with fewer API requests, with better chat history and context not getting lost.
HN is quite notorious for that Dropbox comment

I suspect that video is going to end up more notorious, it's even funnier given it's the VCs themselves

I'm firmly in the camp that in a vacuum that comment looks dumb but the thread was actually great.

Those were valid concerns at the time, and the market for non-technical file storage like they were building was nonexistent.

Perfectly rational to be skeptical and Drew answered all his questions with well thought out responses.

The infamous comment itself made sense in context, too.
More context, please.

EDIT: I guess it's this:

https://news.ycombinator.com/item?id=8863#9224

that's the one
Embeddings are still important (context windows can't contain all data + memorization and continuous retraining is not yet viable), and vertical AI agent startups can still lead on UX.
Separate embedding DBs are less important if you are working with OpenAI, since their Assistants API exists to (among other things) let you bring in additional data and let them worry about parsing it, storing it, and doing RAG with it. It's like "serverless", but for Vector DBs and RAG implementations instead of servers.
Context windows can't contain all data... yet.
Just because something is great doesn't mean that others can't compete. Even a secondary good product can easily be successful due to a company having invested too much, not being aware of openai (ai progress in general), due to some magic integration, etc.

If it were up to me, no one would buy Azure or AWS, just GCP.

Vector DBs should never have existed in the first place. I feel sorry for the agent startups though.
How does this absolve vectordbs
If you are using OpenAI, the new Assistants API looks like it will handle internally what you used to handle externally with a vector DB for RAG (and for some things, GPT-4-Turbo’s 128k context window will make it unnecessary entirely.) There are some other uses for Vector DBs than RAG for LLMs, and there are reasons people might use non-OpenAI LLMs with RAG, so there is still a role for VectorDBs, but it shrunk a lot with this.
OpenAI is still way too expensive to run a corporate knowledge base on top
It’s more reliable than ChatPDF-style tools that rely on vector search. With a vector DB, all you are doing is a fuzzy search, then sending the portion of text near each match to an LLM as part of a prompt. It misses info.
I'd be very surprised if the Assistants API is not doing RAG with a vector DB behind the scenes with the supplied files.
It doesn't, but semantic search is a lot less relevant if you can squeeze 350 pages of text into the context.
OpenAI charges for all those input tokens. An app that squeezes 350 pages of content into every request is going to cost more. Vector DBs are still relevant for cost and speed.
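A back-of-the-envelope version of that cost argument, as a sketch only. The price of $0.01 per 1K input tokens, the 1,000 requests per day, and the four 500-token retrieved chunks are all assumptions chosen for illustration; plug in current pricing and your own traffic.

```python
# All figures below are illustrative assumptions, not quoted prices.
PRICE_PER_1K_INPUT_TOKENS = 0.01   # assumed USD per 1K input tokens
FULL_CONTEXT_TOKENS = 128_000      # "350 pages" stuffed into every request
RETRIEVED_TOKENS = 4 * 500         # hypothetical: four retrieved 500-token chunks
REQUESTS_PER_DAY = 1_000           # hypothetical traffic


def daily_input_cost(tokens_per_request: int, requests: int, price_per_1k: float) -> float:
    """Input-token cost per day in USD (output tokens are ignored)."""
    return tokens_per_request / 1000 * price_per_1k * requests


full = daily_input_cost(FULL_CONTEXT_TOKENS, REQUESTS_PER_DAY, PRICE_PER_1K_INPUT_TOKENS)
rag = daily_input_cost(RETRIEVED_TOKENS, REQUESTS_PER_DAY, PRICE_PER_1K_INPUT_TOKENS)

print(f"full context: ${full:,.2f}/day")
print(f"retrieval:    ${rag:,.2f}/day")
print(f"ratio:        {full / rag:.0f}x")
```

Under these assumptions the full-context approach costs about $1,280 per day in input tokens against about $20 for retrieval, a 64x gap that grows linearly with traffic.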
Besides the cost factor, stuffing the context window can actually make the results worse. https://www.pinecone.io/blog/why-use-retrieval-instead-of-la...
Startups built around actual AI tools, like if one formed around automatic1111 or oogabooga, would be unaffected, but because so much VC money went to the wrong places in this space, a whole lot of people are about to be burned hard.
damn hahaha it's oobabooga not oogabooga
i'm excited for the open source, local inferencing tech to catch up. The bar's been raised.
None of those categories really fall under the second order category mentioned in the video. Using their analogy they all sound more like a mapping provider versus something like Uber.
Offtopic. I find it amusing that we not only have "chatGPT" but now also "vectorDB". Apple's influence is really strong.
Probably best not to make your company about features that a frontier AI company would have a high probability of adding in the next 6-12 months.
Why don’t you need embedding?
You might. Depends what you're trying to do. For RAG it seems like they can 'take care of it', but embeddings also offer powerful semantic search and retrieval, ignoring LLMs.
I haven't been paying attention, why are embeddings not needed anymore?
Retrieval: augments the assistant with knowledge from outside our models, such as proprietary domain data, product information or documents provided by your users. This means you don’t need to compute and store embeddings for your documents, or implement chunking and search algorithms. The Assistants API optimizes what retrieval technique to use based on our experience building knowledge retrieval in ChatGPT.

The model then decides when to retrieve content based on the user Messages. The Assistants API automatically chooses between two retrieval techniques:

it either passes the file content in the prompt for short documents, or performs a vector search for longer documents.

Retrieval currently optimizes for quality by adding all relevant content to the context of model calls. We plan to introduce other retrieval strategies to enable developers to choose a different tradeoff between retrieval quality and model usage cost.
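The choice between the two techniques can be pictured with a small sketch. This is not how the Assistants API is implemented; the threshold `SHORT_DOC_TOKEN_LIMIT`, the word-count token estimate, and the `plan_retrieval` helper are all hypothetical, since the actual cutoff and tokenizer are not described in the quoted text.

```python
from dataclasses import dataclass

# Hypothetical cutoff; the quoted text does not say where "short" ends.
SHORT_DOC_TOKEN_LIMIT = 8_000


def estimate_tokens(text: str) -> int:
    """Rough heuristic: about 4 tokens for every 3 whitespace-separated words."""
    return len(text.split()) * 4 // 3


@dataclass
class RetrievalPlan:
    strategy: str          # "full_content" or "vector_search"
    estimated_tokens: int


def plan_retrieval(document: str, limit: int = SHORT_DOC_TOKEN_LIMIT) -> RetrievalPlan:
    """Pass short documents whole; fall back to vector search for long ones."""
    tokens = estimate_tokens(document)
    strategy = "full_content" if tokens <= limit else "vector_search"
    return RetrievalPlan(strategy, tokens)


print(plan_retrieval("A two-line memo about refunds."))
print(plan_retrieval("word " * 10_000))
```

The tradeoff mentioned in the quote shows up in the `limit` parameter: raising it favors quality (more full-text passes), lowering it favors token cost.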

Really cool to see the Assistants API's nuanced document retrieval methods. Do you index over the text besides chunking it up and generating embeddings? I'm curious about the indexing and the depth of analysis for longer docs, like assessing an author's tone chapter by chapter—vector search might have its limits there. Plus, the process to shape user queries into retrievable embeddings seems complex. Eager to hear more about these strategies, at least what you can spill!
> or performs a vector search for longer documents

so, clients upload all their docs to OpenAI database?..

Embedding is poor man's context length increase. It essentially increases your context length but with loss.

There is a cost argument to make still, embedding-based approach will be cheaper and faster, but worse result than full text.

That being said, I don't see how those embedding startups compete with OpenAI, no one will be able to offer better embedding than OpenAI itself. It is hardly a convincing business.

The elephant in the room is that the open source models aren't able to match up to OpenAI models, and the gap is qualitative, not quantitative.

For embeddings specifically, there are multiple open source models that outperform OpenAI’s best model (text-embedding-ada-002) that you can see on the MTEB Leaderboard [1]

> embedding-based approach will be cheaper and faster, but worse result than full text

I’m not sure results would be worse, I think it depends on the extent to which the models are able to ignore irrelevant context, which is a problem [2]. Using retrieval can come closer to providing only relevant context.

1. https://huggingface.co/spaces/mteb/leaderboard

2. https://arxiv.org/abs/2302.00093

> on the MTEB Leaderboard

The point isn't about the leaderboard. With increasing context length, the question is whether we need embeddings at all. With longer context, embeddings are no longer a necessity, which lowers their value.

For more trivial use cases, sure, but not for harder stuff like working with US law and precedent.

The US Code is on the order of tens of millions of tokens and I shudder to think how many billions of tokens make up all the judicial opinions that set or interpreted precedent.

OP is incorrect. Embeddings are still needed since (1) context windows can't contain all data and (2) data memorization and continuous retraining are not yet viable.
But the common use case of using a vector DB to pull in augmentation appears to now be handled by the Assistants API. I haven't dug into the details yet but it appears you can upload files and the contents will be used (likely with some sort of vector searching happening behind the scenes).
"yet"
It's also much slower. LLMs generate text one token at a time. That's not very good for search.

Pre-search tokenization however, probably a good fit for LLMs.

There is not much info about retrieval/RAG in their docs at the moment - did you find any example on how is the retrieval supposed to work and how to give it access to a DB?
Checking HN and Product Hunt a few times a week gives you most of that awareness, and I don’t need to remind you about the person behind the HN handle ‘sama’.
more startups should focus on foundation models, it's where the meat is. Ideally there won't be a need for any startup as the platform should be able to self-build whatever the customer wants.
We don't want Open AI to win everything.
Where is the part about embeddings?
There will be a lot of startups who rely on marketing aggressively to boomer-led companies who don't know what email is and hoping their assistant never types OpenAI into Google for them.