by swatcoder 828 days ago
> delivering results 50x faster at 50x less cost

What about quality of results? Are you measuring that too? Did you do so for the traditional reference practice? Using what sort of methodology? How did your technique compare in quality? What kind of errors was it most likely to make? What techniques have you devised for spotting those errors? Are they the same kind of errors that users would experience when outsourcing? Are the errors easier or harder to spot for one than the other? Are they faster to remediate with one?

I see a clever concept but given the state of LLMs and the nature of how they work, I don't know that nominal cost and speed differences are really enough to sell on. Not for something "crucial to big business decisions." I'd want to know that my failure/miss rate is no worse than when outsourcing and that my net cost and time (including error identification and recovery) still end up ahead. I don't see either of those vital issues touched upon here.

5 comments

This is a great point. We completely agree that high-quality results are essential for adoption. It's basically table stakes for any tool like this to be useful. We've had several versions of this tool that weren't quite "good enough" and never saw any real use. Our latest version seems to meet the first quality threshold for actual work use.

Our method of evaluating quality is not super systematic right now. For this competitive landscape task, we have a "test suite" of ~10 companies and for each we have a sort of "must-include", "should-include", "could-include" set of competitors that should be surfaced. We run these through our tool and others and look at precision and recall on the competitor sets.
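To make that concrete, here is a minimal sketch of how one case in such a test suite could be scored. It is an illustration only, not our actual harness: the `CompetitorCase` class, the `score` function, and the choice to report recall on the "must-include" tier separately are all assumptions, and the company names are placeholders.

```python
from dataclasses import dataclass


@dataclass
class CompetitorCase:
    """One test-suite entry: a target company and its tiered competitor sets (hypothetical layout)."""
    company: str
    must_include: set[str]
    should_include: set[str]
    could_include: set[str]


def score(case: CompetitorCase, surfaced: list[str]) -> dict[str, float]:
    """Compute precision and recall of a tool's surfaced competitors for one case."""
    found = set(surfaced)
    # Assumption: anything in any of the three tiers counts as a relevant result.
    relevant = case.must_include | case.should_include | case.could_include
    hits = found & relevant

    # Precision: share of surfaced companies that are relevant at all.
    precision = len(hits) / len(found) if found else 0.0
    # Recall over every tier, and over the must-include tier alone.
    recall_all = len(hits) / len(relevant) if relevant else 1.0
    recall_must = (
        len(found & case.must_include) / len(case.must_include)
        if case.must_include
        else 1.0
    )
    return {"precision": precision, "recall_all": recall_all, "recall_must": recall_must}


# Placeholder data: two of four surfaced names are relevant, one of two must-includes is found.
case = CompetitorCase(
    company="ExampleCo",
    must_include={"A", "B"},
    should_include={"C"},
    could_include={"D"},
)
result = score(case, ["A", "C", "X", "Y"])
```

Reporting must-include recall on its own is one way to separate a missed key competitor from a missed marginal one; averaging these numbers across the ~10 companies would give a rough suite-level score.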

In terms of errors, right now our results are a little noisy, since we're biased towards being exhaustive vs selective. There are obviously irrelevant companies in the results that no human would have ever included. Our users can fairly easily filter these out by reading the one-sentence overviews of the companies, but it's still not a great UX. We're actively working on this.

I wonder if it's more about convincing yourself that it faithfully follows the same workflow an analyst would follow. It's always possible to miss stuff, so the best a person or a machine can do is be demonstrably methodical, it sounds like... and that is easier to test. Unless there is really some magic tacit step that human analysts perform to get better answers.
Well, human analysts get on the phone and ask people questions.

Not that an AI can’t do that too!

Though I may hang up…

I am sure most people have never asked these questions of a human doing this research.
Most B2B-focused AI tools will do 10x better if they pretend to be a normal human-run company but just have the AI at the back.

Their clients want to know that the research report was written by a real person and not a bot.

But that doesn’t mean it actually has to be written by a real person and not a bot.

As somebody who has seen a lot of market research coming from "real humans"(TM), I'm pretty sure it's actually hard to make it worse. Maybe there are boutiques which produce quality research, but the average is abysmal.

I've been in a meeting with a senior executive, a PhD, who argued that the Total Addressable Market for supply chain tracking software (basically just a database) is the same as the market for the goods it tracks.

It is as if a startup which prints restaurant menus would claim the value of all food sold in restaurants as their TAM.

And this person was otherwise reasonable...

So, yeah, never underestimate the power of Natural Stupidity.

Perhaps

This was a huge problem when I was working on a similar product some years ago. The issue is that false positives or, worse, false negatives can be pretty catastrophic when you're thinking about valuing companies in particular. What's hard for LLMs (currently) is to decide what is most important or to infer the things that aren't explicitly stated.
Yeah, when it comes to anything involving numbers there is absolutely no room for hallucination. It must be 100% accurate 100% of the time. No exceptions.
Have you ever seen the research coming out of some of the outsourcing shops that the OP discusses in the post? They are hardly living up to that standard. It's important to realize that this is input for the analyst at a fund or investment bank to do some more digging on the companies and in the process potentially discover more. This isn't going straight to the CEO to form the basis of an investment decision.