by ltononro 21 hours ago
What kind of coding do you do? Do you keep track of frontier models to vibe-check the differences and re-evaluate constantly, or are you OK with having a nerfed model forever? (Not being judgmental, I just really want to know your framework here.)

Some of the work I do, I do for an (EU) organisation that doesn't have clear rules or guidelines on the use of AI yet. Though I have seen colleague-developers blatantly putting source code into external Claude-like models, I stay true to my principles and don't. I know for certain that everything that I run through my local, offline Pi Container Sandbox cannot leave the machine, and thus can't result in a data breach. I do this for the peace of mind.

I do (unscientifically) experiment whenever a new capable local LLM (<=130b) releases with a license that permits commercial use. As for knowing my models require more work than Opus, I don't mind still having to puzzle on getting the architecture right. In any case, it forces me to stay in the loop of what's being built, which is a good thing.

So you don't really trust the data policies (non-retention) of the big companies like Anthropic/OpenAI, nor the regulations in the EU. This is very interesting. I myself have been blindly trusting these organizations with my data, and I am still not sure whether I am trading code/trajectories for productivity.

Another POV is that most of the code in most of my codebases was generated by Codex/Claude, so they would be "stealing data from themselves" in a sense.

I worked on Transformers/LLM training in 2018-2021 and have recently come back to it. Things are very different now. I think they would be more interested in how you guided the model to satisfactory code than in the generated code itself. But mostly I personally trust that they are not really using my trajectories for that (unless I explicitly allow it in the configs).

I'm adding Pi to Nemesis8 right now because I saw your comment, so thank you!

https://github.com/DeepBlueDynamics/nemesis8

Could you give more details on how to build such a setup?

I'm not familiar with Pi, and I'm not sure which kind of container you are referring to. Something mainstream like Docker, or more classic like a BSD jail?

I started to experiment with local LLMs through Ollama and Lemonade. That was enough to throw simple prompts with code excerpts at them and get small-scope code refactors. I still struggle to make them work with external tools, like my IDE, so that they can operate at an agentic level with access to a full repository.
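For reference, the "simple prompt with a code excerpt" flow can be sketched roughly like this. It assumes Ollama is serving its HTTP API on the default local port 11434 and that a model has already been pulled; the helper names, the prompt wording, and the model name are placeholders for illustration, not anything from a particular setup.

```python
import json
import urllib.request

# Ollama's default local endpoint (assumption: default install, default port).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_refactor_request(model: str, code: str, instruction: str) -> dict:
    """Build the JSON body for a single, non-streaming generate call."""
    prompt = (
        f"{instruction}\n\n"
        "--- code ---\n"
        f"{code}\n"
        "--- end code ---\n"
        "Return only the refactored code."
    )
    # stream=False asks for one complete JSON object instead of chunks.
    return {"model": model, "prompt": prompt, "stream": False}


def refactor(model: str, code: str, instruction: str,
             url: str = OLLAMA_URL, timeout: float = 120.0) -> str:
    """Send the excerpt to the local server and return the model's text reply."""
    body = json.dumps(build_refactor_request(model, code, instruction)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling something like `refactor("your-model-name", snippet, "Simplify this function")` only talks to localhost, so the excerpt stays on the machine. This covers the one-shot case only; it does not give the model tool or repository access.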

That's mainly for work, as they push for using LLMs, though with the new Copilot license they provide, it doesn't even take me a week to burn through the whole token credit.

The tool can be useful, but in my experience not without heavy guard rails and loops over tests. I suspect recent models also burn many tokens going down rabbit holes of nonsensical hypotheses, instead of producing the straightforward, correct implementation you would expect from anything that has consumed such huge cumulative resources and has such an experimental playground to draw on. Maybe the incentives don't push model providers to minimize tokens sold; maybe the beast is just so hard to tame that even all these bright minds with virtually infinite resources are not good enough.

Anyway, sorry for the digression, but I would be extremely interested in a step-by-step tutorial on making a local LLM work at an agentic level, including what kind of hardware is required to make it work properly.