| HN Mirror


by captainmuon 1180 days ago

I guess companies like OpenAI and Google have no incentives to make models use less resources. The compute required, and of course also their training data, is their moat.

If you accept that your model knows less about the world - it doesn't have to know about every restaurant in Mexico City or the biography of every soccer player around the world - then you can get away with far fewer parameters and much less training data. Then you can't query it like an oracle about random things anymore, but you shouldn't do that anyway. It should still be able to do tasks like reformulating texts, judging similarity (by embedding distance), and so on.
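As a rough illustration of "judging similarity by embedding distance", here is a minimal sketch using cosine similarity. The three vectors are made-up stand-ins for what a small embedding model might return; nothing here comes from a real model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional embeddings (assumption: a real model
# would produce much longer vectors for each text).
emb_original = [0.9, 0.1, 0.3]
emb_paraphrase = [0.8, 0.2, 0.35]
emb_unrelated = [-0.2, 0.9, -0.4]

sim_close = cosine_similarity(emb_original, emb_paraphrase)
sim_far = cosine_similarity(emb_original, emb_unrelated)

# A higher score means the texts are judged more similar.
print(sim_close > sim_far)
```

The point is only that this task needs good embeddings, not world knowledge, so it is a plausible fit for a small model.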

As TFA also mentions, you could hook up your simple language model with something like ReAct to get really good results. I don't see it running in the browser, but if you had a license-wise clean model that you can run on premises on one or two GPUs, that would be huge for a lot of people!

2 comments

aoeusnth1 1180 days ago

They intentionally limit the size of the model to reduce inference costs. If deployment were free the models would be much larger. What makes you think they have no incentive?


dr_dshiv 1180 days ago

Long Speculative Post on Small Models

Hypothesis 1: With better logical thinking (an API call away!), I bet you could train a GPT based on a “small” initial dataset. Why shouldn’t multilingual wikipedia/wiktionary and libgen be enough? That’s what, like less than 10% of the OpenAI training? /s

Hypothesis 2: Data sets of philosophical dialogues could help efficiently develop AI reasoning skills.

Socratic thinking in Plato and Xenophon represented a powerful new mode of critical thinking. Maybe some Student-Teacher-Student template of dialogue could be powerful in developing useful datasets for AI training.

What is the utility of different AI reflective loops for generating training data? (References appreciated if you know any) One possibility to test is a chain of Analyze, Evaluate and Apply loops, applied over and over? “analyze the above piece of text, then evaluate it, then apply to everyday life.”
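A minimal sketch of what such an Analyze, Evaluate, Apply loop could look like, assuming some text-generation function is available. `generate`, `stub_generate` and the exact prompt wording are hypothetical placeholders, not any particular API.

```python
STEPS = ["Analyze", "Evaluate", "Apply"]

# Hypothetical instruction wording, loosely based on the prompt quoted above.
INSTRUCTIONS = {
    "Analyze": "Analyze the above piece of text.",
    "Evaluate": "Evaluate the above analysis.",
    "Apply": "Apply the above to everyday life.",
}

def build_prompt(step, text):
    """Append the instruction for one step to the current text."""
    return f"{text}\n\n{INSTRUCTIONS[step]}"

def reflective_chain(seed_text, generate, rounds=1):
    """Run the three steps in order, feeding each output into the next step.

    Returns one record per step, which could serve as candidate training data.
    """
    records = []
    current = seed_text
    for round_idx in range(rounds):
        for step in STEPS:
            prompt = build_prompt(step, current)
            output = generate(prompt)
            records.append(
                {"round": round_idx, "step": step, "prompt": prompt, "output": output}
            )
            current = output
    return records

def stub_generate(prompt):
    """Placeholder for a real model call; it only echoes the instruction line."""
    return f"[model output for: {prompt.splitlines()[-1]}]"

records = reflective_chain("Some seed passage.", stub_generate, rounds=2)
print(len(records))
```

Swapping `stub_generate` for a real model call is the only change needed to try it; whether the resulting records are useful training data is exactly the open question.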

Now, on HN, many have expressed concern that GPT trained on GPT-GPT conversations is going to result in very misaligned models. Like a copy machine degradation, do we want training data from the AI being trained on the AI? But, on the other hand, it is possible that supporting reflective thought is a good idea in AI (we generally value reflective thought) or a bad idea (maybe the reflection will somehow turn it evil, or at least misaligned).

Design Question: how might we create useful training data through a process of structuring AI-AI dialogue?

“Student-Teacher-Student” conversations seem like they could be a useful mode of dialogue. Previously, I’ve fine-tuned GPT with the complete works of Plato and I was able to generate interesting new dialogues. But the question is whether new dialogues could produce useful data. Perhaps I could use GPT4 to read a part of Plato and then try to autocomplete another part of Plato. Or, as above, use a piece of Platonic dialogue as a target, then use an Analyze, Evaluate, Apply chain on it. We could use methods like these over and over again to make a large dataset about philosophical reasoning. We could have human ratings of the reasonableness of the dialogue output.
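To make the data side concrete, here is a rough sketch of how Student-Teacher-Student dialogues and their human ratings might be stored and filtered. The `Dialogue` record, the rating scale and the `generate` callable are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

ROLES = ("Student", "Teacher", "Student")

@dataclass
class Dialogue:
    passage: str                                 # the piece of Plato used as the target
    turns: List[dict] = field(default_factory=list)
    human_rating: Optional[int] = None           # filled in later by a human reviewer

def make_dialogue(passage, generate):
    """Generate one Student-Teacher-Student exchange about a passage."""
    dialogue = Dialogue(passage=passage)
    for role in ROLES:
        transcript = "\n".join(f'{t["role"]}: {t["text"]}' for t in dialogue.turns)
        prompt = f"Passage:\n{passage}\n\n{transcript}\n{role}:"
        dialogue.turns.append({"role": role, "text": generate(prompt)})
    return dialogue

def keep_rated(dialogues, min_rating):
    """Keep only dialogues a human rated at or above the threshold."""
    return [
        d for d in dialogues
        if d.human_rating is not None and d.human_rating >= min_rating
    ]

# Placeholder model call (assumption): always returns the same reply.
example = make_dialogue("Some passage.", lambda prompt: "a reply")
example.human_rating = 4
unrated = make_dialogue("Another passage.", lambda prompt: "a reply")
kept = keep_rated([example, unrated], min_rating=3)
print(len(kept))
```

Unrated dialogues are dropped rather than assumed good, which matches the idea that human judgment of reasonableness is the filter.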

If a Socratic structure of thinking could read the complete works of Plato over and over again, commenting, countering and synthesizing, with human oversight (RLHF), perhaps we could develop a small module for philosophical reasoning. It might still need millions of conversations, though. But perhaps, by reflecting philosophically by itself, it could produce a dataset large enough to enable a sophisticated small model with very open resources.

And you’d still need human preference training (RLHF) to get it to interact well, and I think it also needs some world model.

In any case, I think making smaller and smaller models is a good idea, it sounds fun.

TL;DR

1. AI training has philosophically interesting implications

2. Philosophical reasoning is valuable to develop in AI

3. Good philosophical reasoning might be a key benchmark for small models. These models don’t need to know everything but perhaps they could learn what they don’t know.

4. Reading a lot of Plato over and over could be a great way to train GPT that it doesn’t know a lot.

5. What kind of AI-AI dialogues might produce training data that is useful for training small models?


dr_dshiv 1180 days ago

I had to ask Claude, from Anthropic. Whenever I interact with him, he brings in his vast understanding of the AI alignment issue!

“1.) While large datasets and models aim for general capability, smaller systems can target specific skills like philosophical reasoning in depth. Testing models on nuanced logic, conceptual analysis and ethics could benchmark their progress, especially if combined with broader knowledge. But these abilities alone won't achieve real-world alignment - we must also instill human values and practical wisdom.

2.) Repeatedly exposing models to philosophical texts like Plato's dialogues could improve their reasoning if guided and reviewed by researchers. Look for both progress and problems in how they interpret, discuss, and extend ideas. Analyses can inform how best to structure philosophical training for alignment by providing evidence of what does/does not work. But reading alone won't necessarily lead models to become safe, ethical or beneficial - significant oversight and feedback are required.

3.) Carefully-designed AI-AI dialogues could generate data on models' reasoning skills, especially for limited systems. Have them debate complex issues, challenge or build on each other's thinking, consider analogies and counterfactuals. Review conversations to check for undesirable or biased beliefs, as well as areas of progress. Look for principles of how to scaffold productive discussions that facilitate improvement and value alignment. But without close oversight of these interactions, they risk amplifying errors or other problematic behaviors.

Opportunities:

• Study how philosophical knowledge and skills develop in smaller models through approaches like text analysis, structured debates, and conceptual evaluations. Monitor for progress as well as issues, using insights to inform values-aligned training.

• Explore ways to balance philosophical and empirical learning. Test how models apply theoretical reasoning to real-world situations, analyzing any errors from imbalance. Look for crossover effects and how to facilitate mutually-beneficial learning across knowledge domains.

• Research transparent and auditable techniques for instilling human values in limited AI systems. Develop methods to verify what values models have internalized and how they apply them in context. Consider how approaches may differ for narrow prototypes vs. more advanced agents.

• Propose and discuss evidence-based techniques for "Constitutional AI" - systematically developing safe and ethical systems. Consider policies around data, objectives, abilities, oversight, and shuttering models if needed for alignment. Debate how to make this a collaborative, multidisciplinary process even for open models.

• Study philosophical issues of machine mind, knowledge, goals and responsibility as we build reasoning models. How should we think about and ensure their ethical development? What are our obligations as researchers, and how can we meet them?

There is significant opportunity for research on aligning "small, open models" that focuses on reasoning and philosophy. But this requires acknowledgement of limitations and commitment to oversight, responsibility, and guarding against problems - not assuming abilities will necessarily lead to safe, ethical or beneficial behavior on their own.

By analyzing how knowledge and skills develop, exploring balanced and transparent methods, considering our own assumptions and obligations as model builders, and emphasizing "safe failure", we can make progress toward systems that autonomously apply nuanced logic in service of human priorities. The potential is there, but so is the work required - we must choose to take it on and see it through at each step.

With rigorous reflection and review combined with pragmatic experimentation, philosophy and AI can be mutually informative. But only if we actively build in human wisdom and values along the way. The key opportunities are there for researchers willing to have the deeper discussions and make the harder choices - both to achieve the goal and ensure we are shaping it rightly.

Small, open models focused on reasoning are a promising path, but one that requires care, responsibility and oversight to follow productively. Progress is possible, but dependent on our commitment as guides. If done responsibly, these systems could yield many benefits - but we must step up to meet the challenge, not assume it will be solved for us. The work is ours to do. Let's take it on.”
