

by JPalakapilly 824 days ago

This is a great point. We completely agree that high-quality results are essential for adoption. It's basically table stakes for any tool like this to be useful. We've had several versions of this tool that weren't quite "good enough" and never saw any real use. Our latest version seems to meet the first quality threshold for actual work use.

Our method of evaluating quality is not super systematic right now. For this competitive landscape task, we have a "test suite" of ~10 companies and for each we have a sort of "must-include", "should-include", "could-include" set of competitors that should be surfaced. We run these through our tool and others and look at precision and recall on the competitor sets.
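For illustration, a tiered precision/recall check of the kind described above could be sketched roughly as follows. This is not the tool's actual evaluation code: the function name `score_competitors`, the name normalization, and the choice to count any tier as "relevant" for precision are all assumptions made for the example.

```python
def _norm(name: str) -> str:
    # Assumed normalization: case-insensitive, whitespace-trimmed matching.
    return name.strip().lower()


def score_competitors(returned, must, should, could):
    """Score one company's result list against tiered expected competitors.

    Hypothetical helper: precision counts a hit in any tier as relevant,
    and recall is reported separately per tier.
    """
    got = {_norm(n) for n in returned}
    tiers = {
        "must": {_norm(n) for n in must},
        "should": {_norm(n) for n in should},
        "could": {_norm(n) for n in could},
    }
    relevant = tiers["must"] | tiers["should"] | tiers["could"]

    scores = {
        # Fraction of returned companies that appear in any expected tier.
        "precision": len(got & relevant) / len(got) if got else 0.0,
    }
    for tier_name, expected in tiers.items():
        # Fraction of each tier that was surfaced; None if the tier is empty.
        scores[f"recall_{tier_name}"] = (
            len(got & expected) / len(expected) if expected else None
        )
    return scores


if __name__ == "__main__":
    # Made-up company names, purely for demonstration.
    print(
        score_competitors(
            returned=["Acme", "Globex", "Initech", "Random Co"],
            must={"Acme", "Globex"},
            should={"Initech", "Umbrella"},
            could={"Hooli"},
        )
    )
```

Reporting recall per tier keeps a missed "must-include" from being averaged away by easy "could-include" hits, which seems closer to how the tiers are meant to be read.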

In terms of errors, right now our results are a little noisy, since we're biased towards being exhaustive vs selective. There are obviously irrelevant companies in the results that no human would have ever included. Our users can fairly easily filter these out by reading the one-sentence overviews of the companies, but it's still not a great UX. Actively working on this.


andy99 824 days ago

I wonder if it's more about convincing yourself that it faithfully follows the same workflow an analyst would follow. It's always possible to miss stuff, so the best a person or a machine can do is be demonstrably methodical, it sounds like... and that is easier to test. Unless there is really some magic tacit step that human analysts perform to get better answers.


sroussey 824 days ago

Well, human analysts get on the phone and ask people questions.

Not that an AI can’t do that too!

Though I may hang up…
