Hacker News
by pdntspa 187 days ago
If it delivers accurate data then I can hit that instead of scraping the full HTML. Everybody wins.

What I have found, however, with existing standards for this kind of data (yours is not the first!), is that shopping sites (the big ones) will lie, and you still need to read the HTML as ground truth.


You are right. Standardization often drifts from reality. That is why we built Section 9: Cross-Verification. The HTML remains the audit layer. The Agent does not trust blindly. It spot-checks. If commerce.txt says $50 but the HTML says $100, the merchant gets a Trust Score penalty. We do not replace the ground truth. We cache it, and we audit the cache to ensure it matches.
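
To make the spot-check concrete, here is a rough sketch of the idea in Python. It is not the spec's reference implementation: the `Merchant` class, the 0-to-1 trust scale, the tolerance, and the penalty size are all hypothetical values picked for illustration.

```python
from dataclasses import dataclass


@dataclass
class Merchant:
    domain: str
    # Hypothetical scale: 1.0 = fully trusted, 0.0 = untrusted.
    trust_score: float = 1.0


def cross_verify(merchant: Merchant, declared_price: float, html_price: float,
                 tolerance: float = 0.01, penalty: float = 0.2) -> bool:
    """Compare the commerce.txt price with the price read from the HTML.

    Returns True when they agree within `tolerance`. On a mismatch, lowers
    the merchant's trust score by `penalty` (floored at 0.0) and returns False.
    """
    if abs(declared_price - html_price) <= tolerance:
        return True
    merchant.trust_score = max(0.0, merchant.trust_score - penalty)
    return False


# The example from above: commerce.txt says $50, the HTML says $100.
shop = Merchant("shop.example")
matched = cross_verify(shop, declared_price=50.00, html_price=100.00)
```

After this runs, `matched` is False and the merchant's score has dropped by one penalty step. How the score then affects ranking or purchase decisions is left open here.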
Then why bother with commerce.txt?
Because you don't need to audit every single transaction.

Think of it like a cache. You use the commerce.txt for 99% of your agentic workflows because it’s 30% cheaper in tokens and 95% faster than parsing a 2MB HTML haystack.

You only 'bother' with the HTML for periodic spot-checks or when a high-value transaction requires absolute verification.
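
A minimal sketch of that policy, assuming a random sample for the periodic checks. The function name, the $500 threshold, and the 1% sample rate are illustrative assumptions, not values from the spec.

```python
import random


def needs_html_audit(order_value: float,
                     high_value_threshold: float = 500.0,
                     sample_rate: float = 0.01,
                     rng=random.random) -> bool:
    """Decide whether to verify this interaction against the HTML.

    High-value transactions are always verified. Everything else is
    verified only for a random `sample_rate` fraction of calls, so the
    rest can rely on commerce.txt alone.
    """
    if order_value >= high_value_threshold:
        return True
    # rng() returns a float in [0.0, 1.0); it is injectable for testing.
    return rng() < sample_rate
```

With these defaults, roughly 1 in 100 low-value interactions pays for an HTML parse, and every order at or above the threshold does.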

Without CommerceTXT, you are forced to pay the 'HTML tax' on every single interaction. With it, you get a fast lane for context while keeping the HTML as a decentralized source of truth for when trust needs to be verified. It’s about moving the baseline from 'expensive and fragile' to 'efficient and auditable'.