|
|
|
|
|
by Alex_001
412 days ago
|
|
That paper is a great pointer — the creativity vs. alignment trade-off feels a lot like the "risk-aversion" effect in humans under censorship or heavy supervision. It makes me wonder: as we push models to be more aligned, are we inherently narrowing their output distribution to safer, more average responses? And if so, where’s the balance? Could we someday see dual-mode models — one for safety-critical tasks, and another more "raw" mode for creative or exploratory use, gated by context or user trust levels? |
|
I feel that companies with top-down management would have more agency and perhaps creativity towards (but not at) the top, and the implementation would be delegated to bottom layers with increasing levels of specification and restriction.
If this translates, we might have multiple layers with varied specialization and control, and hopefully some feedback mechanisms about feasibility.
Since some hierarchies are familiar to us from real-life, we might prefer these to start with.
It can be hard to find humans that are very creative but also able to integrate consistently and reliably (in a domain). Maybe a model doing both well would also be hard to build compared to stacking few different ones on top of each other with delegation.
I know it's already being done by dividing tasks between multiple steps and models / contexts in order to improve efficiency, but having explicit strong differences of creativity between layers sounds new to me.