
by barefootsanders 141 days ago

OP here. I built skillthis.ai, a tool that takes a description of your professional expertise and generates a Claude Code skill file (a markdown prompt file that customizes Claude's behavior for specific tasks).

155 people used it over 3.5 weeks. I analyzed the results and found some patterns I didn't expect.

The headline finding: someone typed "I a bartender" (12 characters, with a typo) and scored 85/100. A 15,576-character technical specification about development process analysis scored 72/100. The bartender result was reproducible: I ran the same input twice.

More surprisingly, "hey bro" scored 88/100. The system generated a "Casual Communication Skill" and suggested adding "quantifiable success metrics." The grading algorithm clearly has issues (acknowledged in the post).

What actually predicted quality:

- Specific, well-understood domains (plumber, bartender, OKR expert)
- Task-oriented descriptions (what you do vs. what you are)
- Brevity with clarity (top scores averaged under 100 characters)
- Named frameworks or methodologies

What didn't: length (negatively correlated with score), vague enthusiasm, attempts to jailbreak or override Claude's behavior.
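For anyone who wants to check a length-vs-score relationship on their own data, here is a minimal sketch of a Pearson correlation in TypeScript. It uses only the three inputs quoted in this comment, not the full 155-user dataset, so it illustrates the calculation rather than reproducing the analysis. The `pearson` helper and the `samples` array are names made up for this example, and the post does not say which correlation measure was actually used.

```typescript
// Pearson correlation coefficient between two equal-length numeric series.
function pearson(xs: number[], ys: number[]): number {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error("need two series of equal length with at least 2 points");
  }
  const mean = (v: number[]): number => v.reduce((a, b) => a + b, 0) / v.length;
  const mx = mean(xs);
  const my = mean(ys);
  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (let i = 0; i < xs.length; i++) {
    const dx = xs[i] - mx;
    const dy = ys[i] - my;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  return cov / Math.sqrt(varX * varY);
}

// Illustration only: the three examples quoted in this comment.
// Lengths for the first two are as stated above; "hey bro" is counted from the string.
const samples = [
  { label: "bartender", chars: 12, score: 85 },
  { label: "technical specification", chars: 15576, score: 72 },
  { label: "hey bro", chars: "hey bro".length, score: 88 },
];

const r = pearson(
  samples.map((s) => s.chars),
  samples.map((s) => s.score),
);
console.log(`length vs. score correlation: ${r.toFixed(2)}`);
```

A negative value means longer inputs tended to score lower. Three points dominated by one very long input say little on their own, so this is only worth running against a full export.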

The tool uses Claude to generate the skill, then a separate Claude call to grade it. The grading inconsistency is a known problem. To address the input quality issue, I built a guided question flow that asks three follow-up questions when the input is too vague.
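A rough sketch of that generate-then-grade shape in TypeScript, for illustration only. The prompts, the `SCORE: <number>` reply format, and the names `Complete`, `parseScore`, and `generateAndGrade` are all assumptions made for this example, not the tool's actual code. The model call is passed in as a function so the sketch runs without network access; in a real app it would wrap a Claude API request.

```typescript
// Hypothetical "send a prompt, get text back" function.
// Assumed to wrap a Claude API call in production; injected here so it can be stubbed.
type Complete = (prompt: string) => Promise<string>;

interface GradedSkill {
  skill: string;        // generated markdown skill file
  score: number | null; // null when the grader's reply could not be parsed
}

// Extract an integer 0..100 from a reply containing "SCORE: <number>" (assumed format).
function parseScore(reply: string): number | null {
  const match = reply.match(/SCORE:\s*(\d{1,3})\b/);
  if (!match) return null;
  const n = Number(match[1]);
  return n >= 0 && n <= 100 ? n : null;
}

// Two separate model calls: one writes the skill, the other grades it.
async function generateAndGrade(
  description: string,
  complete: Complete,
): Promise<GradedSkill> {
  const skill = await complete(
    "Write a Claude Code skill file in markdown for someone with this expertise:\n\n" +
      description,
  );
  const reply = await complete(
    'Grade the following skill file from 0 to 100. Reply with a line of the form "SCORE: <number>".\n\n' +
      skill,
  );
  return { skill, score: parseScore(reply) };
}
```

Because the grader only ever sees the generated skill and never the original input, a terse input that happens to produce a tidy skill can outscore a long, detailed one, which is consistent with the "hey bro" result. Returning `null` instead of a default number keeps an unparseable reply from being mistaken for a real score.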

Stack: Next.js, Supabase, Claude API. Blog post has links to every skill mentioned so you can see the actual outputs.