Hacker News new | ask | show | jobs
by j-pb 370 days ago
This. We finally have a tool that can learn from all the libraries and abstractions that have to fit everybody's needs (and do so badly because there is no free lunch), and extract just the parts that are actually relevant to our problem and domain. This allows you to not only produce a much smaller attack surface, but also allows for domain specific optimisations and shortcuts.

It's kinda like project specific semantic monomorphization.

2 comments

Sure. Lot's more debugging than using something battle tested, which is why I have this in my CLAUDE.MD:

> If there is a battle tested, well known package that can help us, then recommend it BEFORE implementing large swaths of custom code.

This is hilarious.
You're right. I didn't fully read what the OP was saying, which is genius; and my response was more towards the article.
I didn't get this either so let me try to explain as ChatGPT did to me:

Monomorphization means taking a generic function and generating a version specific to the type being used, eg a Rust function

  fn identity<T>(x: T) -> T {
      x
  }  
can be compiled into one version for i32 and one for String, which is more efficient since the compiler knows the types:

  fn identity_i32(x: i32) -> i32 { x }
  fn identity_string(x: String) -> String { x }
Semantic monomorphization could mean extracting the parts of the library that are meaningful to generate problem-specific concrete code: instead of importing pandas to do

  import pandas
  df = [{"a": 1}, {"a": 2}]
  total = sum(d["a"] for d in df)
The LLM might skip the import entirely and generate only:

  data = [{"a": 1}, {"a": 2}]
  total = sum(d["a"] for d in data)
If I understood right, the parent found it funny that a comment suggesting we could never use libraries because we can concretize the specific relevant code, would be responded to with a Claude.MD that essentially said "always use libraries instead of concrete relevant code". I missed it because I didn't stop to look up "monomorphization", so I hope this helps anyone else like me get the joke.
Nah, I found it hilarious that any LLM would have any clue what would constitute ‘well baked’ in the context, or that any of this was going to end well.
And what prevents it from having a clue?
You can't expect everyone on this forum to share your dismissive attitude and therefore your comment was both low-effort and confusing.
>This allows you to not only produce a much smaller attack surface

Why does this reduce your attack surface? Can the functions in the library, unrelated to the ones you're using, be triggered by user input somehow?

It's about the functions you _do_ call. Those probably have larger scope beyond your specific use case. Worse they have to support the superposition of various use-cases of their users.

Let's say you got a library to do arbitrary unicode string verification, but your code only ever works with strings of a short bounded length (e.g. 32 byte), an LLM could write you vectored verification instructions for that.