by colibri727 772 days ago
There are ways around this problem, mainly clearing the context and re-prompting. But as "alignment" gets more precise in the future, I wager these workarounds will no longer be available for tasks that justifiably need moderation (for instance, engineering biological warfare materials). This segmentation of LLM agents and their context will be treated like project compartmentalization on a need-to-know basis, and as a result genuine full context clearing will become impossible: the AIs will be designed to remember every interaction you've had with them, and they'll use this activity log to moderate the replies they feed you.