Hacker News
by tsazan 174 days ago
It definitely lowers the barrier. But relying on messy HTML as a defense against competitors is 'security through obscurity'. It does not stop them; it just costs you server CPU. The data is public. If you put it on the screen, a scraper can read it. CommerceTXT just ensures that the good bots (AI Agents bringing customers) get it efficiently, while you can still block the bad ones via WAF.

If it delivers accurate data then I can hit that instead of scraping the full HTML. Everybody wins.

What I have found, however, with existing standardization of this kind of data (yours is not the first!), is that shopping sites (big ones) will lie, and you still need to read the HTML as ground truth.

You are right. Standardization often drifts from reality. That is why we built Section 9: Cross-Verification. The HTML remains the audit layer. The Agent does not trust blindly. It spot-checks. If commerce.txt says $50 but the HTML says $100, the merchant gets a Trust Score penalty. We do not replace the ground truth. We cache it, and we audit the cache to ensure it matches.
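To make the spot-check idea concrete, here is a rough sketch of what one cross-verification step could look like. It is not taken from the spec: `Merchant`, `cross_verify`, the 1.0-based trust scale, the 1% tolerance, and the 0.2 penalty are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Merchant:
    name: str
    trust_score: float = 1.0  # hypothetical scale: 1.0 = fully trusted, 0.0 = untrusted


def cross_verify(merchant: Merchant, declared_price: float, html_price: float,
                 tolerance: float = 0.01, penalty: float = 0.2) -> bool:
    """Compare the price declared in commerce.txt with the price read from the HTML.

    Returns True when the two agree within a relative tolerance. Otherwise the
    merchant's trust score is lowered (never below 0.0) and False is returned.
    The tolerance and penalty values are assumptions, not part of any spec.
    """
    if html_price <= 0:
        raise ValueError("html_price must be positive")
    relative_mismatch = abs(declared_price - html_price) / html_price
    if relative_mismatch <= tolerance:
        return True
    merchant.trust_score = max(0.0, merchant.trust_score - penalty)
    return False


shop = Merchant("example-shop")
ok = cross_verify(shop, declared_price=50.0, html_price=100.0)  # the $50 vs $100 case
```

With the $50 vs $100 example, the check fails and the merchant's score drops by the penalty. A matching price leaves the score untouched.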
Then why bother with commerce.txt?
Because you don't need to audit every single transaction.

Think of it like a cache. You use the commerce.txt for 99% of your agentic workflows because it’s 30% cheaper in tokens and 95% faster than parsing a 2MB HTML haystack.

You only 'bother' with the HTML for periodic spot-checks or when a high-value transaction requires absolute verification.
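One way an agent might decide when to fall back to the HTML, sketched under assumptions: `should_verify_html`, the 500.0 threshold, and the 1% sampling rate are made up for illustration and are not defined by CommerceTXT.

```python
import random
from typing import Callable


def should_verify_html(order_value: float,
                       high_value_threshold: float = 500.0,
                       sample_rate: float = 0.01,
                       rng: Callable[[], float] = random.random) -> bool:
    """Decide whether this interaction should be checked against the HTML.

    High-value transactions are always verified. Everything else is verified
    only for a small random sample, so most requests use commerce.txt alone.
    Threshold and sample rate are illustrative assumptions.
    """
    if order_value >= high_value_threshold:
        return True
    return rng() < sample_rate


# Always verified: the order is above the (assumed) high-value threshold.
verify_big_order = should_verify_html(1200.0)

# Usually skipped: a small order is verified only when it falls in the random sample.
verify_small_order = should_verify_html(20.0)
```

Passing `rng` as a parameter keeps the sampling testable. In practice the rate would be tuned to how much drift you see between the file and the page.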

Without CommerceTXT, you are forced to pay the 'HTML tax' on every single interaction. With it, you get a high-speed fast lane for context, while keeping the HTML as a decentralized source of truth for when trust needs to be verified. It’s about moving the baseline from 'expensive and fragile' to 'efficient and auditable'.