| HN Mirror



	by seunosewa 370 days ago
	I disagree. Every Python package we install seems to pull in dozens of libraries, each of which could harbour malware. Many of them are only used for a single function. We have no idea what most of the packages are for. It's a lot.
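
To put a rough number on "dozens of libraries", here is a minimal Python sketch (not from the thread) that walks the declared requirements of an installed package with the standard-library importlib.metadata. The helper names requirement_name and transitive_requirements are made up for illustration, and the sketch ignores extras and environment markers, so it can over-count.

```python
import re
from importlib import metadata


def requirement_name(requirement: str) -> str:
    """Return the normalized project name from a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    # Normalize the way package indexes do: lowercase, runs of "-_." become "-".
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def transitive_requirements(package: str) -> set[str]:
    """Collect every requirement reachable from an installed package.

    Assumption: extras and environment markers are ignored, so optional
    dependencies are counted too (an over-approximation).
    """
    root = requirement_name(package)
    seen: set[str] = set()
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            declared = metadata.requires(current) or []
        except metadata.PackageNotFoundError:
            continue  # declared but not installed here, e.g. an optional extra
        for requirement in declared:
            name = requirement_name(requirement)
            if name not in seen:
                seen.add(name)
                stack.append(name)
    seen.discard(root)  # a dependency cycle should not list the root itself
    return seen
```

Calling len(transitive_requirements("some-installed-package")) in your own environment gives a quick, if pessimistic, count of what a single install drags along.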

5 comments

aDyslecticCrow 370 days ago

https://en.m.wikipedia.org/wiki/Log4j https://en.m.wikipedia.org/wiki/Npm_left-pad_incident

Languages and domains that have leaned too far into package managers and small libraries are prone to fragility and security nightmares.

For any "serious" application of critical code, every library used needs to be vetted and verified to be maintained and secure.

I'd much rather deal with a bug in our code than a deprecated library or a breaking version update.

If we are to use a library outside of standard Unix or the stdlib within my field, we had better expect a nightmarish code review and a meeting.

Besides being fun, implementing it ourselves improves our skill level for the future. Something vibe coding itself goes against as well.


skydhash 370 days ago

> For any "serious" application of critical code, every library used needs to be vetted and verified to be maintained and secure.

A project only becomes serious once legal is breathing down engineering's neck. Before that, it's usually the Wild West. After, it becomes a security circus trying to patch the technology deficiency (custom registries, complex linting and other analysis tooling, ...)


giantg2 370 days ago

If it's open source, it may be possible to create your own fork to fix issues.


j-pb 370 days ago

This. We finally have a tool that can learn from all the libraries and abstractions that have to fit everybody's needs (and do so badly because there is no free lunch), and extract just the parts that are actually relevant to our problem and domain. This allows you to not only produce a much smaller attack surface, but also allows for domain specific optimisations and shortcuts.

It's kinda like project specific semantic monomorphization.


handfuloflight 370 days ago

Sure. Lots more debugging than using something battle tested, which is why I have this in my CLAUDE.MD:

> If there is a battle tested, well known package that can help us, then recommend it BEFORE implementing large swaths of custom code.


lazide 370 days ago

This is hilarious.


handfuloflight 370 days ago

You're right. I didn't fully read what the OP was saying, which is genius; and my response was more towards the article.


Noumenon72 370 days ago

I didn't get this either so let me try to explain as ChatGPT did to me:

Monomorphization means taking a generic function and generating a version specific to each type it is used with, e.g. a Rust function

  fn identity<T>(x: T) -> T {
      x
  }

can be compiled into one version for i32 and one for String, which is more efficient since the compiler knows the types:

  fn identity_i32(x: i32) -> i32 { x }
  fn identity_string(x: String) -> String { x }

Semantic monomorphization could mean extracting only the parts of a library that matter for your problem and generating problem-specific concrete code: instead of importing pandas to do

  import pandas
  df = pandas.DataFrame([{"a": 1}, {"a": 2}])
  total = df["a"].sum()

The LLM might skip the import entirely and generate only:

  data = [{"a": 1}, {"a": 2}]
  total = sum(d["a"] for d in data)

If I understood right, the parent found it funny that a comment suggesting we may never need libraries, because we can concretize just the relevant code, was answered with a CLAUDE.MD that essentially says "always use libraries instead of concrete relevant code". I missed it because I didn't stop to look up "monomorphization", so I hope this helps anyone else like me get the joke.


lazide 370 days ago

Nah, I found it hilarious that any LLM would have any clue what would constitute ‘well baked’ in the context, or that any of this was going to end well.


closeparen 370 days ago

>This allows you to not only produce a much smaller attack surface

Why does this reduce your attack surface? Can the functions in the library, unrelated to the ones you're using, be triggered by user input somehow?


j-pb 370 days ago

It's about the functions you _do_ call. Those probably have a larger scope than your specific use case. Worse, they have to support the superposition of all their users' use cases.

Let's say you have a library that does arbitrary Unicode string verification, but your code only ever works with strings of a short bounded length (e.g. 32 bytes). An LLM could write you vectorized verification instructions for just that case.
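
A rough sketch of what that narrowing could look like, in Python rather than hand-written vector instructions. Everything here is an assumption for illustration: the 32-byte limit comes from the comment above, while the allowed alphabet and the names is_valid_token, MAX_TOKEN_BYTES and ALLOWED_BYTES are hypothetical. It is not real SIMD code; the bulk bytes.translate call merely stands in for "check the whole buffer in one pass".

```python
import string

# Assumed domain constraints: at most 32 bytes, ASCII letters, digits, "_" and "-".
MAX_TOKEN_BYTES = 32
ALLOWED_BYTES = (string.ascii_letters + string.digits + "_-").encode("ascii")


def is_valid_token(raw: bytes) -> bool:
    """Validate a short token without any general-purpose Unicode machinery."""
    if not 0 < len(raw) <= MAX_TOKEN_BYTES:
        return False
    # translate(None, delete) strips every allowed byte in a single C-level
    # pass; if anything is left over, the input contained a disallowed byte.
    return raw.translate(None, ALLOWED_BYTES) == b""
```

Because the function only accepts what this one use case needs, there is no normalization, no locale handling and no unbounded input to reason about.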


AlienRobot 370 days ago

LOL! I thought the article was going to be about reading books and ChatGPT!

And yes, I agree.

https://www.npmjs.com/package/boolean

>converts lots of things to boolean.

>3 million weekly downloads

This is insane.


what 370 days ago

3 million weekly downloads for a package that is “deprecated” and the source repo no longer exists. Truly insane.


AlienRobot 370 days ago

Even if it wasn't deprecated, this is literally

    ['yes', 'y', '1'].indexOf(input.toLowerCase()) !== -1

People adding a dependency to avoid writing one line of code...
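
For comparison, here is the same check as a small Python sketch. It mirrors only the one-liner above and is not a reimplementation of the npm package, whose exact behaviour is not shown in this thread. The name to_bool and the decision to return False for non-string input are assumptions.

```python
# The three truthy spellings from the one-liner above.
TRUTHY_STRINGS = frozenset({"yes", "y", "1"})


def to_bool(value: object) -> bool:
    """Return True only for a string that case-insensitively matches a truthy spelling."""
    if not isinstance(value, str):
        # Assumption: anything that is not a string counts as False here,
        # instead of raising as the JavaScript one-liner would.
        return False
    return value.lower() in TRUTHY_STRINGS
```

The only extra work compared with the one-liner is deciding what to do with non-string input, which is a choice you would have to understand even when a package makes it for you.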


Viliam1234 369 days ago

I guess some people were taught that the Not-Invented-Here Syndrome is a bad thing and needs to be avoided at all costs. And they took it literally.

I had a similar situation at a former job once, where as a subtask of some task we basically needed to implement a "partial deep copy" for a few specific classes (deep-copy some selected properties, shallow-copy the rest), which could either be implemented in ~50 lines of Java code, or by adding an extra library that provided this functionality using a domain-specific language, plus many other things, some of them potential vulnerabilities... and it took a lot of time to convince my team leader that "reinventing the wheel" is the right thing to do in this case.
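
To give a sense of scale, here is a minimal sketch of such a "partial deep copy", written in Python rather than the Java of the original task. The function partial_deep_copy and the Order class are hypothetical and only illustrate the idea: shallow-copy the object, then deep-copy the selected attributes.

```python
import copy
from dataclasses import dataclass, field
from typing import Iterable, TypeVar

T = TypeVar("T")


def partial_deep_copy(obj: T, deep_fields: Iterable[str]) -> T:
    """Shallow-copy obj, then replace the named attributes with deep copies."""
    clone = copy.copy(obj)  # every attribute still shares references with obj
    memo: dict = {}  # shared memo keeps objects referenced by several fields unified
    for name in deep_fields:
        setattr(clone, name, copy.deepcopy(getattr(obj, name), memo))
    return clone


@dataclass
class Order:
    """Hypothetical example class, not taken from the thread."""
    customer: dict
    items: list = field(default_factory=list)
```

With partial_deep_copy(order, ["items"]), the clone gets its own items list while customer stays shared with the original.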

"Why do you want to program something that already exists? Don't you realize that every line of code you write is a line of code someone else will have to maintain in the future?" Yeah, good points, but it's not like using a library is without costs either: you need to scan for vulnerabilities, increase versions, sometimes the API changes; in long term that is a lot of work to do just to avoid writing 50 lines of code.


lazide 370 days ago

This is the total leopards-eating-faces moment from all the greybeards.


rjsw 370 days ago

Same with Ruby: I have to use a package with 230 dependencies.


ozim 370 days ago

This sounds exactly like under-utilization: if someone needs only a function or two from a library, I guess making yourself dependent on a third party for such a small gain doesn't make sense.
