Hacker News
by mgdev 443 days ago
This is so good. I've been using it with Claude Code with great success.

I just leave an instruction in CLAUDE.md to validate changes with Playwright. It automatically starts a dev server (I wrote a little MCP server to do that), navigates to the page with the changes it just made, and validates that its changes worked. If there is anything unexpected, it self-corrects.
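
For anyone wondering what the "start a dev server" piece might look like, here is a rough sketch using only the Python standard library. It is not the MCP server mentioned above: `wait_for_port` and `start_dev_server` are hypothetical helper names, and the command and port in the usage comment are placeholders. An MCP tool wrapper and the Playwright check would sit on top of something like this.

```python
import socket
import subprocess
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Poll until something accepts TCP connections on host:port, or time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Nothing listening yet; back off briefly and retry.
            time.sleep(0.2)
    return False


def start_dev_server(command: list[str], host: str, port: int,
                     timeout: float = 30.0) -> subprocess.Popen:
    """Launch the dev server command and return once it is reachable."""
    proc = subprocess.Popen(command, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    if not wait_for_port(host, port, timeout):
        proc.terminate()
        raise RuntimeError(f"dev server did not come up on {host}:{port}")
    return proc


# Hypothetical usage (command and port are assumptions, adjust to your project):
# proc = start_dev_server(["npm", "run", "dev"], "127.0.0.1", 3000)
```

Waiting for the port before handing control back matters here, since otherwise the browser check could run against a server that has not finished starting.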

It's like working with a really great mid-level engineer.

What a time to be alive.

2 comments

Interesting use case. Can you give an example of a prompt you use that triggers this tool? Are you validating UI changes (button color), navigation, or something more complex?
Claude Code is amazing. Unfortunately, it's also very expensive. How do interactions with MCP servers affect token usage/cost?
+1 for Claude Code being amazing, and especially +1 for the cost. I've spent $500 this week, $0.10 to $1 at a time, fixing bugs and adding features. It took a while to get used to not @-tagging all of the files and to realize it just "figures it out" (using tokens to do so, of course!).
I burned through $25 in just 3 hours. Claude Code will be great when they can get the cost down. If the cost is like 1/10th of that I’d be using it all the time, but +/- $10 / hour is too much.
A US-based dev costs $125/hr, on the low end.

A US-based dev directing Claude Code has like 3x output.

So the biz is spending $125/hr + AI costs, but saving $250/hr.

An individual dev might feel like a superhuman compared to those not using Claude Code. Could even earn them a substantial promotion.

Either way, seems to net out.
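
The back-of-the-envelope math above can be written out. This is a rough sketch that takes the $125/hr rate and the 3x multiplier from this comment and the roughly $10/hr Claude Code spend from the parent comment. The function name is made up, and it assumes each extra hour's worth of output is valued at the same hourly rate.

```python
def net_hourly_savings(dev_rate: float, output_multiplier: float,
                       ai_cost_per_hour: float) -> float:
    """Value of the extra output per hour, minus the hourly AI spend."""
    # Assumption: each extra "dev-hour" of output is worth the same hourly rate.
    extra_output_value = dev_rate * (output_multiplier - 1)
    return extra_output_value - ai_cost_per_hour


# $125/hr dev, 3x output, about $10/hr of Claude Code usage
print(net_hourly_savings(125, 3, 10))  # 240
```

So at these numbers the AI spend takes about $10 off the $250/hr figure, and the result only goes negative if the multiplier is close to 1x.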

>I burned through $25 in just 3 hours. Claude Code will be great when they can get the cost down. If the cost is like 1/10th of that I’d be using it all the time, but +/- $10 / hour is too much.

I've been trying to figure this out. I don't think it's malicious; it's just a matter of incentives. Anthropic devs are certainly not paying retail prices for Claude usage, so their benchmark (or just intuition) for efficiency is probably much different from the average user's. Without that hard constraint, the incentive just isn't there for them to squeeze out a few more pennies, and it ends up way more expensive than stuff like Cline or Cursor.