
by pgao 1050 days ago

In the last few months, there's been a Cambrian explosion of products integrating AI. A big reason is that LLMs enable users to get good performance on a lot of NLP tasks out of the box by calling an API or running a pre-trained model. Previously, applied AI/ML work revolved around collecting + labeling data, training a model, building infrastructure, etc. to get good enough performance for a given use case. But now that the models work well out of the box, the work shifts to building a great product experience around the model and getting to product-market fit with an AI-enabled product.

With visual user interfaces, there's a whole category of product analytics tooling that helps product teams understand how users interact with their products + make product decisions to optimize their product-market fit. LLM apps have introduced a new paradigm for interacting with software, where users can work iteratively with the software via a natural language interface, generating user inputs and model responses consisting of unstructured text.

Traditional analytics techniques don’t deal well with large amounts of unstructured text – it’s hard to summarize, it’s hard to aggregate, and it’s hard to effectively sample. AI developers resort to digging through piles of unstructured text by hand, anywhere from hundreds to hundreds of millions of datapoints, to understand how users interact with their product.

Tidepool tries to solve this problem using neural network embeddings. After you upload user text interaction events, Tidepool will:

- Automatically group your data by similarity. Tidepool runs embedding clustering on your users’ text interactions to surface interesting attributes: things like prompt topics, prompt languages, and common usage patterns that can be turned into shortcuts.

- Summarize common attributes in your data, using LLMs to determine what each cluster “contains.” For example, understanding that the most common topics that users discuss are business, education, and art.

- Track attributes in production traffic, allowing you to uncover how a specific attribute might be correlated to good / bad product outcomes. We utilize lightweight models running on foundation model embeddings to scalably extract these attributes from hundreds of millions of interaction events in production.
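To make the three steps above concrete, here is a rough, stdlib-only sketch of the general idea. It is not Tidepool's implementation: `embed` is a toy bag-of-words stand-in for a real neural embedding, `cluster` is a simple greedy "leader" clustering, `label` uses the most frequent terms in place of an LLM summary, and `assign` is a nearest-centroid lookup standing in for a lightweight model on top of embeddings. All function names, the stopword list, and the threshold are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical stopword list, just enough for the demo prompts below.
STOPWORDS = {"a", "an", "the", "my", "for", "to", "of", "in", "me"}

def embed(text):
    """Toy stand-in for a neural embedding: a sparse bag-of-words vector."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    return Counter(t for t in tokens if t and t not in STOPWORDS)

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def cluster(texts, threshold=0.3):
    """Greedy clustering: join the most similar cluster or start a new one."""
    clusters = []  # each cluster: {"centroid": Counter, "members": [str]}
    for text in texts:
        vec = embed(text)
        best = max(clusters, key=lambda c: cosine(vec, c["centroid"]), default=None)
        if best is not None and cosine(vec, best["centroid"]) >= threshold:
            best["members"].append(text)
            # Summed vector works as a centroid because cosine ignores scale.
            best["centroid"].update(vec)
        else:
            clusters.append({"centroid": Counter(vec), "members": [text]})
    return clusters

def label(c, n=2):
    """Stand-in for LLM summarization: the n most frequent terms in a cluster."""
    return [word for word, _ in c["centroid"].most_common(n)]

def assign(text, clusters):
    """Tag a new event with the index of its nearest cluster centroid."""
    vec = embed(text)
    return max(range(len(clusters)), key=lambda i: cosine(vec, clusters[i]["centroid"]))

prompts = [
    "Write a business plan for a coffee shop",
    "Draft a business plan for my startup",
    "Explain photosynthesis for a biology class",
    "Summarize mitosis for a biology class",
]
clusters = cluster(prompts)
for i, c in enumerate(clusters):
    print(i, label(c), len(c["members"]))
print(assign("Help me write a business plan", clusters))
```

With real embeddings you would swap `embed` for a model call and likely use a proper clustering algorithm; the overall shape (embed, group, label, then tag new traffic) stays the same.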

Lastly, we have a self-serve free tier! I thought this may be useful for people building AI applications. If you're interested, please try it - I'd love to hear any feedback on what works and what doesn't :) https://app.tidepool.so/