| HN Mirror



	by captainbland 396 days ago
	Is cost a major consideration for you here? Like if you're dealing with firehose data which I'm assuming is fairly high throughput, do you see an incentive for potentially switching to a more specific NLP classifier model rather than sticking with generative LLMs? Or is it that this is good enough/the ROI of switching isn't attractive? Or is the generative aspect adding something else here?

4 comments

simonw 396 days ago

If you do the calculations against the cheapest available models (GPT-4.1-nano and Gemini 1.5 Flash 8B and Amazon Nova Micro for example - I have a table on https://www.llm-prices.com/ ) it is shockingly inexpensive to process even really large volumes of text.

$20 could cover half a billion tokens with those models! That's a lot of firehose.
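As a rough check on that figure, here is a minimal sketch of the arithmetic. The $20 budget and the half-billion token count come from the comment above; the per-million-token rate is simply derived from those two numbers and is not a quoted price for any particular model. The function names are illustrative.

```python
def implied_price_per_million(budget_usd: float, tokens: int) -> float:
    """Price per million tokens implied by a budget that covers `tokens` tokens."""
    return budget_usd / (tokens / 1_000_000)


def tokens_for_budget(budget_usd: float, price_per_million_usd: float) -> int:
    """How many tokens a budget buys at a given per-million-token price."""
    return round(budget_usd / price_per_million_usd * 1_000_000)


# Figures from the comment: $20 for half a billion tokens.
rate = implied_price_per_million(20.0, 500_000_000)
print(f"Implied rate: ${rate:.2f} per million tokens")
print(f"Tokens for $20 at that rate: {tokens_for_budget(20.0, rate):,}")
```

Swapping in a real per-million price from a pricing table would show how far a given budget stretches for a specific model.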


BowBun 396 days ago

I don't think everyone's using the term 'firehose' the same way here. A child comment refers to half a billion tokens for $20.

I did some really basic napkin math with some Rails logs. One request with some extra junk in it was about 400 tokens according to the OpenAI tokenizer[0]. 500M/400 = ~1.25 million log lines.

Paying linearly for logs at $20 per 1.25 million lines is not reasonable for mid-to-high scale tech environments.

I think this would be sufficient if a 'firehose of data' is a bunch of news/media/content feeds that needs to be summarized/parsed/guessed at.

[0] https://platform.openai.com/tokenizer
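The napkin math in this comment can be sketched as follows. The 400 tokens per log line, the $20 budget, and the 500M token figure are taken from the thread; the daily log volume at the end is a purely hypothetical number chosen to illustrate linear scaling, not something any commenter reported.

```python
TOKENS_PER_LOG_LINE = 400        # from the comment: one Rails request, per the OpenAI tokenizer
BUDGET_USD = 20.0                # from the thread
TOKENS_PER_BUDGET = 500_000_000  # from the thread: half a billion tokens for $20

# How many log lines the budget covers.
lines_per_budget = TOKENS_PER_BUDGET // TOKENS_PER_LOG_LINE

# Cost per million log lines at that rate.
cost_per_million_lines = BUDGET_USD * 1_000_000 / lines_per_budget


def daily_cost(lines_per_day: int) -> float:
    """Linear cost of sending every log line through the model."""
    return lines_per_day / lines_per_budget * BUDGET_USD


print(f"Lines covered by ${BUDGET_USD:.0f}: {lines_per_budget:,}")
print(f"Cost per million lines: ${cost_per_million_lines:.2f}")

# Hypothetical volume (an assumption for illustration only).
hypothetical_lines_per_day = 100_000_000
print(f"Daily cost at {hypothetical_lines_per_day:,} lines/day: "
      f"${daily_cost(hypothetical_lines_per_day):,.2f}")
```

Because the cost is linear in line count, the per-million-lines figure is the number to compare against whatever an environment already pays for log storage and indexing.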


petercooper 396 days ago

No. It's a tiny expense. I mostly use GPT 4.1 Mini for what I'm doing as it's the best balance between results and cost, but Gemini Flash can do the job just as well for a little less if I need it.

As other commenters have mentioned, a firehose can mean many things. For me it might be thousands of different, reasonably small things a day, which is dollars a day even in the worst case. If you were processing the raw X feed or the whole of Reddit or something, then all of your questions certainly become more relevant :-)


captainbland 395 days ago

Yeah that makes sense based on those specifics, thanks


scarface_74 395 days ago

I can’t tell you what I’m working on but I can give you a real world example of where traditional models don’t work well.

Sentiment analysis is like the “Hello World” of machine learning.

But I had a use case on a platform similar to Uber Eats, where a review can be critical of the service provider or critical of the platform itself. I needed to be able to distinguish sentiment about the platform from sentiment about someone on the platform.

No matter what you do, people are going to conflate the reviews.

As far as costs go, I mentioned in another comment that I sometimes work with online call centers. There, any time a person has to answer a call, it costs the company $2-$5.

One call deflection that saves the company $5 can pay for a lot of inference. It’s literally at least 100x cheaper to use an LLM.
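To put rough numbers on that claim, here is a minimal sketch. The $2-$5 per-call cost and the 100x factor are from this comment; the $0.04 per million tokens rate is an assumption borrowed from the "$20 for half a billion tokens" figure elsewhere in the thread, and the model actually used for call deflection may well be priced differently.

```python
def inference_budget_per_call(call_cost_usd: float, savings_factor: float = 100.0) -> float:
    """Max inference spend per call if the LLM is `savings_factor` times cheaper than a human."""
    return call_cost_usd / savings_factor


def tokens_within_budget(budget_usd: float, price_per_million_usd: float) -> float:
    """Tokens that fit in a budget at a given per-million-token price."""
    return budget_usd / price_per_million_usd * 1_000_000


ASSUMED_PRICE_PER_MILLION = 0.04  # assumption: derived from $20 / 500M tokens in this thread

for call_cost in (2.0, 5.0):  # per-call human cost range from the comment
    budget = inference_budget_per_call(call_cost)
    tokens = tokens_within_budget(budget, ASSUMED_PRICE_PER_MILLION)
    print(f"${call_cost:.2f} call -> ${budget:.2f} inference budget "
          f"-> about {tokens:,.0f} tokens")
```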
