Hacker News
by fuckinpuppers 25 days ago
I had a blast having all the major models work out the optimal strategy for themselves inside Cursor, with .cursorrules, AGENTS.md, .cursor/rules/ .mdc files or whatever, and learned some interesting things. For example, a model won’t reliably follow every instruction even when it’s told to.

Seems like the progressive disclosure approach is the best for context efficiency; I wound up with a somewhat tight generic AGENTS.md, and the .cursor/rules individual files with glob matching for file names. Cursor honored those well.
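To make the glob idea concrete, here is a minimal sketch of progressive disclosure: one always-loaded AGENTS.md plus rule files that only enter context when the file being edited matches a pattern. The patterns and rule file names are made up for illustration, and Python's fnmatch is only a stand-in; Cursor's own glob matching may behave differently.

```python
from fnmatch import fnmatch

# Hypothetical rule table: glob pattern -> rule file to pull into context.
# Neither the patterns nor the file names come from a real project.
RULES = {
    "*.py": ".cursor/rules/python.mdc",
    "server/*": ".cursor/rules/server.mdc",
    "*.test.ts": ".cursor/rules/testing.mdc",
}

# The small generic file that is always in context.
ALWAYS_LOADED = ["AGENTS.md"]


def rules_for(path: str) -> list[str]:
    """Return the instruction files that apply to the file at `path`.

    Note: fnmatch's `*` also matches `/`, so "*.py" matches files in any
    directory. That is an assumption of this sketch, not Cursor's semantics.
    """
    matched = [rule for pattern, rule in RULES.items() if fnmatch(path, pattern)]
    return ALWAYS_LOADED + matched


if __name__ == "__main__":
    for path in ("server/app.py", "web/button.test.ts", "README.md"):
        print(path, "->", rules_for(path))
```

The point is that editing README.md costs only the tokens in AGENTS.md, while server code pulls in the server rules on demand.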

I must have spent a couple hundred on the company dime having the models rephrase/rewrite instructions or change where they lived, and sort out what made sense as a skill vs a rule, all while trying to keep things as portable as possible. At this point the Cursor-specific files would need to be ported if I moved to a different agent/framework, but the content should be pretty solid.

It was an interesting (and productive) exploration for me.

1 comments

> Seems like the progressive disclosure approach is the best for context efficiency; I wound up with a somewhat tight generic AGENTS.md, and the .cursor/rules individual files with glob matching for file names. Cursor honored those well.

This is also generally where I've landed - keep the AGENTS.md super light, and link out to docs as needed. Same idea with skills as well. Basically, preserve the context window at all costs.

The part I'm curious about is, when we're making the sorts of behavior changes you're describing on shared repos, how do we actually measure and quantify impact? It's one thing to tell the team that the agent should perform better, and it's another to say that you made the agent 5% better across a variety of tasks for every dev in the repo.
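One rough way to put a number on it, as a sketch only: run the same fixed set of tasks several times with the old and the new instruction files, record pass/fail for each run, and compare pass rates per task. The task names and the idea of a boolean pass/fail per run are assumptions here; how a run gets graded is left out entirely.

```python
from statistics import mean


def pass_rate(results: list[bool]) -> float:
    """Fraction of runs that passed."""
    return sum(results) / len(results)


def compare(
    baseline: dict[str, list[bool]],
    candidate: dict[str, list[bool]],
) -> dict[str, float]:
    """Per-task change in pass rate (candidate minus baseline), in percentage points."""
    return {
        task: round(100 * (pass_rate(candidate[task]) - pass_rate(baseline[task])), 1)
        for task in baseline
    }


def overall_change(deltas: dict[str, float]) -> float:
    """Unweighted mean of the per-task changes, in percentage points."""
    return round(mean(deltas.values()), 1)


if __name__ == "__main__":
    # Made-up placeholder inputs, not measurements: one bool per run of each task.
    old_rules = {
        "add-endpoint": [True, False, True, True],
        "fix-test": [False, False, True, True],
    }
    new_rules = {
        "add-endpoint": [True, True, True, True],
        "fix-test": [True, False, True, True],
    }
    deltas = compare(old_rules, new_rules)
    print(deltas, overall_change(deltas))
```

With only a handful of runs per task the differences will be noisy, so a claim like "5% better" would need many more runs per task than this toy input has.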

I didn’t have to share it or quantify it… so I didn’t care.

I just relied on different agents/models and kept giving them a thorough prompt along the lines of “analyze the agents.md, cursorrules, etc. and ensure it’s token efficient and enforces everything” (it was very specifically worded; I may have even asked an agent how to ask agents for it). I kept jumping between the 3 big models at medium and high thinking. Each one kept finding little things, and at one point the setup moved entirely from one strategy to another, if I remember right.

Once I felt good enough about it, I started using it as the setup for my application, and it’s been pretty good without any modifications or tweaks. Originally I decided to do this because I got tired of realizing that it wasn’t honoring things I told it to do. For example, I’d tell it “restart the application after every modification to the server code” and it would often “forget” to do that… somehow now I’ve got it really well tuned for my particular codebase and approach to developing.