by fnordpiglet 741 days ago
We use LLMs in dozens of different production applications for critical business flows. They allow for a lot of dynamism in our flows that aren’t amenable to direct quantitative reasoning or structured workflows. Double digit percents of our growth in the last year are entirely due to them. The biggest challenge is tool chain, limits on inference capacity, and developer understanding of the abilities, limits, and techniques for using LLMs effectively.

I often see these messages from the community doubting the reality, but LLMs are a powerful tool in the tool chest. But I think most companies are not staffed with skilled enough engineers with a creative enough bent to really take advantage of them yet, or willing to fund basic research and first-principles toolchain creation. That’s ok. But it’s foolish to assume this is all hype like crypto was. The parallels are obvious but the foundations are different.

No one is saying that all of AI is hype. It clearly isn't.

But the facts are that today LLMs are not suitable for use cases that need accurate results. And there is no evidence or research that suggests this is changing anytime soon, or maybe ever.

There are very strong parallels to crypto in that (a) people are starting with the technology and trying to find problems and (b) there is a cult-like atmosphere where non-believers are seen as being anti-progress and anti-technology.

Yeah I think a key is LLMs in business are not generally useful alone. They require classical computing techniques to really be powerful. Accurate computation is a generally well established field and you don’t need an LLM to do optimization or math or even deductive logical reasoning. That’s a waste of their power which is typically abstract semantic abductive “reasoning” and natural language processing. Overlaying this with constraints, structure, and augmenting with optimizers, solvers, etc, you get a form of computing that was impossible more than 5 years prior and is only practical in the last 9 months.
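A minimal sketch of the pattern described above, under stated assumptions: the model only turns free text into structured fields, ordinary code enforces the constraints, and the arithmetic is done deterministically. Everything here is hypothetical for illustration (`fake_llm`, `parse_order`, `order_total`, and the item catalog), and `fake_llm` is a canned stand-in rather than a real model call.

```python
import json

ALLOWED_ITEMS = {"widget", "gadget"}  # hypothetical catalog, for illustration only


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; always returns the same canned JSON text."""
    return '{"item": "widget", "quantity": 3, "unit_price": 4.5}'


def parse_order(raw: str) -> dict:
    """Check model output against hard constraints before trusting it."""
    data = json.loads(raw)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    item = data.get("item")
    quantity = data.get("quantity")
    unit_price = data.get("unit_price")
    if not isinstance(item, str) or item not in ALLOWED_ITEMS:
        raise ValueError(f"unknown item: {item!r}")
    if type(quantity) is not int or quantity <= 0:
        raise ValueError(f"bad quantity: {quantity!r}")
    if isinstance(unit_price, bool) or not isinstance(unit_price, (int, float)):
        raise ValueError(f"bad unit price: {unit_price!r}")
    if unit_price < 0:
        raise ValueError(f"negative unit price: {unit_price!r}")
    return {"item": item, "quantity": quantity, "unit_price": float(unit_price)}


def order_total(order: dict) -> float:
    """The exact computation stays in ordinary code, not in the model."""
    return order["quantity"] * order["unit_price"]


order = parse_order(fake_llm("three widgets at 4.50 each"))
print(order_total(order))
```

The design choice is that anything the model gets wrong surfaces as a `ValueError` at the boundary instead of as a silently wrong number downstream.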

On the crypto stuff yeah I get it - especially if you’re not in the weeds of its use. A lot of people formed opinions from GPT3.5, Gemini, copilot, and other crappy experiences and haven’t kept up with the state of the art. The rate of change in AI is breathtaking and I think hard to comprehend for most people. Also the recent mess of crypto and the fact grifters grift etc also hurt. But people who doubt -are- stuck in the past. That’s not necessarily their fault and it might not even apply to their career or lives in the present and the flaws are enormous as you point out. But it’s such a remarkably powerful new mode of compute that it, in combination with all the other powerful modes of compute, is changing everything and will continue to, especially if next generation models keep improving as they seem to be likely to.

That text applies to basically every new technology. Point is that you can't predict its usefulness in 20 years from that.

To me it still looks like a hammer made completely from rubber. You can practice to get some good hits, but it is pretty hard to get something reliable. And a beginner will basically just bounce it around. But it is sold as a rescue for beginners.

I didn't see anything in the article that indicated the authors believed that those who don't see use cases for LLMs are anti-progress or anti-technology. Is that comment related to the authors of this article, or just a general grievance you have unrelated to this article?
> We use LLMs in dozens of different production applications for critical business flows. They allow for a lot of dynamism in our flows that aren’t amenable to direct quantitative reasoning or structured workflows. Double digit percents of our growth in the last year are entirely due to them. The biggest challenge is tool chain, limits on inference capacity, and developer understanding of the abilities, limits, and techniques for using LLMs effectively.

That sounds like corporate buzzword salad. It doesn't tell much as it stands, not without at least one specific example to ground all those relative statements.

Hi, Hamel here. I'm one of the co-authors. I'm an independent consultant and not all clients allow me to talk about their work.

However, I have two that do, which I've discussed in the article. These are two production use cases that I have supported (which again, are explicitly mentioned in the article):

1. https://www.honeycomb.io/blog/introducing-query-assistant

2. https://www.youtube.com/watch?v=B_DMMlDuJB0

Other co-authors have worked on significant bodies of work:

Bryan Bischof led the creation of Magic in Hex: https://www.latent.space/p/bryan-bischof

Jason Liu created instructor, one of the most popular OSS libraries for structured data (https://github.com/jxnl/instructor), and works with some of the leading companies in the space like Limitless and Raycast (https://jxnl.co/services/#current-and-past-clients)

Eugene Yan works with LLMs extensively at Amazon and uses that to inform his writing: https://eugeneyan.com/writing/ (However he isn't allowed to share specifics about Amazon)

I believe you might find these worth looking at.

You've linked to a query generator for a custom programming language and a 1 hour video about LLM tools. The cynic in me feels like the former could probably be done by chatgpt off the shelf.

But those do not seem to be real world business cases.

Can you expand a bit more why you think they are? We don't have hours to spend reading, and you say you've been allowed to talk about them.

So can you summarise the business benefits for us, which is what people are asking for, instead of linking to huge articles?

> The cynic in me feels like the former could probably be done by chatgpt off the shelf.

Hello! I'm the owner of the feature in question who experimented with chatgpt last year in the course of building the feature (and working with Hamel to improve it via fine-tuning later).

Even today, it could not work with ChatGPT. To generate valid queries, you need to know which subset of a user's dataset schema is relevant to their query, which makes it as much a retrieval problem as a generation problem.
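To make the retrieval half concrete, here is a rough sketch of picking the schema columns relevant to a question before any query is generated. It assumes a crude keyword-overlap score as a stand-in for whatever retrieval method a real system would use, and it is not a description of how Query Assistant works. The function names and the example column names are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase, split on non-alphanumerics, and crudely strip trailing 's'."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w.rstrip("s") for w in words} - {""}


def relevant_columns(question: str, schema: list[str], limit: int = 5) -> list[str]:
    """Rank columns by how many tokens they share with the question."""
    question_tokens = tokenize(question)
    scored = []
    for column in schema:
        overlap = len(question_tokens & tokenize(column))
        if overlap:
            scored.append((overlap, column))
    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [column for _, column in scored[:limit]]


schema = ["error", "http.route", "duration_ms", "db.statement", "http.status_code"]
print(relevant_columns("count my errors by http route", schema, limit=2))
```

Only the columns that survive this step would be shown to the model, which keeps the prompt small when a dataset has a very wide schema.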

Beyond that, though, the details of "what makes a good query" are quite tricky and subtle. Honeycomb as a querying tool is unique in the market because it lets you arbitrarily group and filter by any column/value in your schema without pre-indexing and without any cost w.r.t. cardinality. And so there are many cases where you can quite literally answer someone's question, but there are multitudes of ways you can be even more helpful, often by introducing a grouping that they didn't directly ask for. For example, "count my errors" is just a COUNT where the error column exists, but if you group by something like the HTTP route, the name of the operation, etc. -- or the name of a child operation and its calling HTTP route for requests -- you end up actually showing people where and how these errors come from. In my experience, the large majority of power users already do this themselves (it's how you use HNY effectively), and the large majority of new users who know little about the tool simply have no idea it's this flexible. Query Assistant helps them with that and they have a pretty good activation rate when they use it.
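The "count my errors" example can be sketched as a toy query evaluator. This assumes a made-up query shape (`calculation`, `filter`, `group_by`) and made-up events; it is not Honeycomb's actual query format or engine, only an illustration of why adding a grouping makes the same count more useful.

```python
def count_errors_query(group_by=None):
    """Build a toy query: COUNT of events where the 'error' column exists."""
    query = {
        "calculation": "COUNT",
        "filter": {"column": "error", "op": "exists"},
    }
    if group_by:
        query["group_by"] = list(group_by)
    return query


def run(query, events):
    """Evaluate the toy query over a list of event dicts."""
    filter_column = query["filter"]["column"]
    group_keys = query.get("group_by", [])
    counts = {}
    for event in events:
        if filter_column not in event:
            continue  # the 'exists' filter drops events without the column
        group = tuple(event.get(key) for key in group_keys)
        counts[group] = counts.get(group, 0) + 1
    return counts


events = [
    {"http.route": "/cart", "error": "timeout"},
    {"http.route": "/cart", "error": "timeout"},
    {"http.route": "/login", "error": "bad password"},
    {"http.route": "/login"},
]

# The literal answer: one total.
print(run(count_errors_query(), events))
# The more helpful answer: the same count, broken down by route.
print(run(count_errors_query(["http.route"]), events))
```

The ungrouped query answers the question as asked, while the grouped one shows which routes the errors come from.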

Unfortunately, ChatGPT and even just good old-fashioned RAG are often not up to the task. That's why fine-tuning is so important for this use case.

Thanks for the reply. Huge fan of Honeycomb and the feature. Spent many years in observability and built some of the large log platforms in use. Tracing is the way of the future and I hope to see you guys eat that market. I did some executive tech strategy stuff at some megacorp on observability and it’s really hard to unwedge metrics and logs but I’ve done my best when it was my focus. Good luck and thanks for all you’re doing over there.
Glad you like HNY and the feature! Here’s hoping we can help move a lot more of the world over to tracing :)
> do not seem to be real world business cases

The first one is a real world product that lives in production that is user facing for a paid product.

The second video goes in depth about how an AI assistant was built for a real estate CRM company, also a paid product.

I don’t understand the assertion that it’s not “real world” or not “business”.

Here are additional articles about these:

https://help.rechat.com/guides/lucy

https://www.prnewswire.com/news-releases/honeycomb-launches-...

They think they are real business use cases, because real businesses use them to solve their use cases. They know that chatgpt can't solve this off the shelf, because they tried that first and were forced to do more in order to solve their problem.

There's a summary for ya! More details in the stuff that they linked if you want to learn. Technical skills do require a significant time investment to learn, and LLM usage is no different.

Sounds like something you could do with an LLM
Yet another post claiming "dozens" of production use cases without listing a single one.
I’ve listed plenty in my comment history. I don’t generally feel compelled to trot them all out all the time - I don’t need to “prove” anything and if you think I’m lying that’s your choice. Finally, many of our uses are trade secrets and a significant competitive advantage so I don’t feel the need to disclose them to the world if our competitors don’t believe in the tech. We can keep eating their lunch.