

by cletus 59 days ago

If I were the CTO of any of these companies I would be working my butt off to be making an internal version of Claude. Let me explain my reasoning using Google as an example (disclaimer: Xoogler).

Google has a lot of systems to make a very large monorepo manageable so builds and code search don't take forever. The build system is Blaze (on which Bazel is based). Its build files have a Pythonic syntax and were once actual Python, but that hasn't been the case (AFAIK) for over a decade. This means you build a massive digraph of build artifacts. By "large" I mean somewhere between 100M and 1B vertices (guessing). Just loading that graph became a significant cost for a build, so there's heavy caching around it. There's also heavy caching around build artifacts (i.e. Forge).
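To make the caching idea concrete, here's a toy sketch of content-addressed artifact caching. This is not how Blaze or Forge are implemented; the target names, `cache_key`, and `ArtifactCache` are all made up for illustration. The point is only that a target's key is a hash of its own source plus its dependencies' keys, so an unchanged subgraph always maps to the same cache entry.

```python
import hashlib

# Hypothetical toy build graph: target -> (source text, direct deps).
GRAPH = {
    "//base:strings": ("def join(): ...", []),
    "//lib:net": ("def fetch(): ...", ["//base:strings"]),
    "//app:server": ("def main(): ...", ["//lib:net", "//base:strings"]),
}


def cache_key(target, graph, memo=None):
    """Merkle-style key: hash of the target's source plus its deps' keys."""
    if memo is None:
        memo = {}
    if target in memo:
        return memo[target]
    source, deps = graph[target]
    h = hashlib.sha256(source.encode())
    for dep in sorted(deps):
        h.update(b"\0")  # separator so fields cannot run together
        h.update(cache_key(dep, graph, memo).encode())
    memo[target] = h.hexdigest()
    return memo[target]


class ArtifactCache:
    """In-memory stand-in for a shared artifact cache (illustrative only)."""

    def __init__(self):
        self.store = {}
        self.hits = 0
        self.misses = 0

    def build(self, target, graph):
        key = cache_key(target, graph)
        if key in self.store:
            self.hits += 1
        else:
            self.misses += 1
            # A real system would run the build action here.
            self.store[key] = f"artifact for {target}"
        return self.store[key]
```

If a second engineer builds the same targets from the same sources, every lookup is a hit. If they edit only `//app:server`, only that target's key changes and the keys for `//lib:net` and `//base:strings` stay cached.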

So, part of the issue with every developer using Claude is that you have a ton of inefficiency because everybody carries their own significant context. And what is context really? It's not too dissimilar to the build graph and/or code search you already have.

So the infra I would be working on would be some kind of "global context" or "context cache". A lot of context changes when you make a local change, but a lot doesn't. As an ordinary engineer, you aren't generally modifying /base. You're modifying leaf nodes, or branches that only a few leaf nodes depend on.
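A rough sketch of why that matters for a shared cache, assuming cached context is tracked per build target. The graph and the `invalidated_by` helper are hypothetical. The idea is that a change only invalidates the changed node plus everything that transitively depends on it.

```python
from collections import defaultdict, deque

# Hypothetical dependency graph: target -> targets it depends on.
DEPS = {
    "//base:strings": [],
    "//lib:net": ["//base:strings"],
    "//lib:db": ["//base:strings"],
    "//app:server": ["//lib:net", "//lib:db"],
    "//app:cli": ["//lib:net"],
}


def invalidated_by(changed, deps):
    """Return the changed target plus everything that transitively depends on it."""
    # Invert the edges so we can walk from a target to its dependents.
    rdeps = defaultdict(set)
    for target, direct in deps.items():
        for dep in direct:
            rdeps[dep].add(target)

    stale = {changed}
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for dependent in rdeps[node]:
            if dependent not in stale:
                stale.add(dependent)
                queue.append(dependent)
    return stale


leaf_change = invalidated_by("//app:cli", DEPS)        # only the leaf itself
base_change = invalidated_by("//base:strings", DEPS)   # the whole graph
```

In this toy graph, editing the leaf `//app:cli` leaves the other four targets' cached context reusable, while editing `//base:strings` invalidates all five. Most day-to-day changes look like the first case.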

The reasons I see to do this are:

1. Cost-savings by deduplication;

2. Speed if context is partially-cached;

3. You avoid the issues of sending your code out to third parties. In the case of Google or Amazon, if they use Claude at all, they would probably only be using it on their own clouds, so they avoid this. But Uber doesn't have that luxury;

4. You avoid any issues with your prompts and responses being used for training and potentially leaking sensitive information that way;

5. You can use off-peak resources for a lot of this work;

6. You can control resources within your own pervasive resource management (in the case of Google); and

7. You can more easily integrate into internal tooling.

I also think that expanding compute power is the biggest risk to Anthropic (and OpenAI). There's a vast difference between a model you need a cluster of NVIDIA's finest to run vs one you can run on a MacBook Pro. We aren't there yet on a MacBook Pro but it'll only be a few years before we are.

2 comments

minimaxir 59 days ago

The costs of a) self-hosting a >100B param LLM, b) scaling it to a full company, and c) maintaining it are all significant, risky investments that are even more expensive in the short term.

Those are generally the core reasons most SaaSes exist. Additionally, (a) is the biggest issue because there is no open-weights model that can match GPT 5.5/Opus 4.8.


lijok 59 days ago

Are you describing finetuning?
