|
|
|
|
|
by n_u
142 days ago
|
|
> wrap a small number of third-party ChatGPT/Perplexity/Google AIO/etc scraping APIs Can you explain a little bit how this works? I'm guessing the third-parties query ChatGPT etc. with queries related to your product and report how often your product appears? How do they produce a distribution of queries that is close to the distribution of real user queries? |
|
If this weren’t automated, the process would look like this: someone manually reviews each response, pulls out the companies mentioned and their order, and then presents that information.
2) Distribution of queries This is a bit of a dirty secret in the industry (intentional or not): usually what happens is you want to take snapshots and measure them overtime to get distribution. However a lot of tools will run a query once across different AI systems, take the results, and call it done.
Obviously, that isn’t very representative. If you search “best running shoes,” there are many possible answers, and different companies behave differently. What better tools do like Profound is run the same prompt multiple times. From my estimates, Profound runs up to 8 times. This gives a broader snapshot of what tends to show up everyday. You then aggregate those snapshots over time to approximate a distribution.
As a side note: you might argue that running a prompt 8 times isn’t statistically significant, and that’s partially true. However, LLMs tend to regress toward the mean and surface common answers over repeated runs and we found 8 times to be a good indicator- the level of completeness depends on the prompt(i.e. "what should i have for dinner" vs "what are good accounting software for startups", i can touch on that more if you want