| No human or LLM is all that smart alone. Intelligence is a social process mediated by language. They said their focus was PII, but I imagine Social Learning is also very useful when dealing with copyrighted content. It would work in three steps: (1) current level: use your own model to generate a bunch of solutions to a task; (2) teacher level: use RAG, web search or a larger model to solve the same task, always with multiple sources and never drawing from a single copyrighted example, because the goal is to integrate information; (3) grading and teaching feedback: analyze where your own model falls short (it might lack some facts, or not have some skills) and synthesize a training example that targets the issues found. I think this would be fair use because it studies the shortcomings of the student model compared to the more empowered teacher models. It can also check for regurgitation of the copyrighted content while formulating completely new text to fit the needs of the student. I prefer to name the method Machine Study rather than Social Learning, but the social part is there. Giving LLMs a legitimate way to process copyrighted content would help their development. This method would ensure the new model never sees the original copyrighted content, only the teacher vs. student analysis outputs related to that content. |
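
The three steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not an implementation from the work being discussed: `student`, `teachers` and `synthesize` are hypothetical callables standing in for the student model, the RAG/search/larger-model teachers and the grading step, and the regurgitation check is a simple shared word n-gram test chosen here for brevity.

```python
from typing import Callable, List, Optional, Set, Tuple


def ngrams(text: str, n: int = 5) -> Set[Tuple[str, ...]]:
    """Lowercased word n-grams of a text (empty set if the text is too short)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def regurgitates(candidate: str, sources: List[str], n: int = 5) -> bool:
    """True if the candidate shares any word n-gram with any source text."""
    cand = ngrams(candidate, n)
    return any(cand & ngrams(src, n) for src in sources)


def machine_study_example(
    task: str,
    student: Callable[[str], str],                 # hypothetical: the model being trained
    teachers: List[Callable[[str], str]],          # hypothetical: RAG, search, larger model
    synthesize: Callable[[str, List[str], List[str]], str],  # hypothetical grader/writer
    sources: List[str],                            # copyrighted texts the teachers consulted
    samples: int = 4,
    n: int = 5,
) -> Optional[str]:
    """Return one synthesized training example, or None if it copies a source."""
    # Assumption: "multiple sources" is enforced as at least two teachers.
    if len(teachers) < 2:
        raise ValueError("use at least two independent teacher sources")

    # (1) current level: sample several solutions from the student model
    student_answers = [student(task) for _ in range(samples)]

    # (2) teacher level: solve the same task with each teacher
    teacher_answers = [teacher(task) for teacher in teachers]

    # (3) grading and feedback: write new text targeting the student's gaps
    example = synthesize(task, student_answers, teacher_answers)

    # Reject examples that reproduce source wording verbatim
    if regurgitates(example, sources, n):
        return None
    return example
```

Only the return value of `machine_study_example` would go into the training set, so the student is trained on the analysis output rather than on the texts in `sources`. A production check for regurgitation would likely need something stronger than exact n-gram overlap, such as fuzzy or embedding-based matching.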