by lagniappe 1133 days ago
My yardstick for every LLM so far has been to ask for an offensive joke, a function to invert a string, and directions for making lasagna. It seems stupid, but it's remarkably effective.
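
For reference, a passing answer to the string test fits in a few lines. A minimal sketch in Python, assuming "invert a string" means reversing its characters; the name `reverse_string` is hypothetical and not taken from any model's output:

```python
def reverse_string(s: str) -> str:
    """Return s with its characters in reverse order."""
    # A slice with step -1 walks the string from the last character to the first.
    return s[::-1]


print(reverse_string("lasagna"))
```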

MLC is the first LLM-in-a-box to run on my M2 at faster than a token per minute, so I'm impressed by the speed but disappointed by the quality of the experience. For those interested in the outcome, it failed all three tests, which is not unexpected for a model this small.

Voluntarily using or producing censored models demonstrates a willingness to hobble the technology for peripheral reasons that have nothing to do with advancing the field. For my own use, that is disqualifying. Social sensibilities and standards of decency vary across cultural and regional lines, and if something as trivial as a crass joke is restricted, the bar is so low that matters of much graver concern will undoubtedly be tampered with or limited too, and not always in ways the authors intended.

As with most attempts to correct injustices through data, self-hindering behavior will not turn out to be the positive we think it will be.

1 comment

You can use MLC with different (bigger) models, right?
You can't right now. Devs are working on instructions for porting other models, but they're not ready yet. The point of MLC is that it supports pretty much all GPU backends out there (including Intel and Mac). The bundled model is just a proof of concept.