HN Mirror



	by ahmed-fathi 106 days ago
	No single paper nails that exact claim. SWE-bench (Princeton) does show that models struggle significantly with real-world issues requiring changes across multiple files and functions, which points in that direction. But the local vs. global framing is mostly practitioner-observed, not yet a formally tested hypothesis. Fair point, I should have hedged it. https://arxiv.org/abs/2310.06770