Hacker News
by gonab 1688 days ago
I don't think this is a fair assessment of "most ML people"

Some of the biggest distributed systems built today are used for statistical inference or scientific computation

Most "ML people" I know are highly versatile across software and networking and have deep hardware knowledge, i.e., they have a very good understanding of what a computer is and what it is capable of

It's very naive to think that you can assemble machine learning systems without having a solid understanding of computers and statistics

You know who also likes python a lot? Hackers. I wonder why

5 comments

The most insane thing about Python is how you can override single methods in classes and use the class like normal. One time I was working on getting FIFO working on Windows, and none of the Python built-ins were set up to handle an arbitrary process writing to a named pipe from outside the same Python instance. So I took the closest implementation Python offered, which was in the multiprocessing module [1], and overrode a single one of its methods to do what I wanted. The module still handled all of the cleanup, so I was confident there weren't any memory leak issues, and I only had to make a simple change to the flags it was passing to Windows to get the functionality I wanted. A language whose standard library is itself hackable is insane, and I don't think I've seen that in any other language. In addition, I've found the C/C++ bindings for Python wonderful and intuitive to work with. The setup takes no effort at all and it "just works", batteries included, via ctypes.

[1] https://github.com/python/cpython/blob/main/Lib/multiprocess...
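For anyone who hasn't done this before, here is a minimal sketch of overriding one method of a standard library class while the rest of the class keeps working. It uses `json.JSONEncoder` as a stand-in, not the multiprocessing class described above, and `DateFriendlyEncoder` is a made-up name for illustration.

```python
import json
from datetime import date


class DateFriendlyEncoder(json.JSONEncoder):
    """Hypothetical subclass: only `default` is overridden."""

    def default(self, o):
        # Handle the one type the stock encoder rejects...
        if isinstance(o, date):
            return o.isoformat()
        # ...and defer to the standard library for everything else.
        return super().default(o)


encoded = json.dumps({"released": date(2021, 1, 2)}, cls=DateFriendlyEncoder)
print(encoded)  # {"released": "2021-01-02"}
```

Everything else (string escaping, nesting, separators) is still the standard library's code.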

I understand exactly what you are saying. Let me add that Python has a beautiful learning curve

In the beginning it is very simple to churn out code and do whatever you want, but the more you get into it, the more you realize that there are endless possibilities

It's a great language for beginners and even better for experts who just want to solve problems with code without thinking too much about whether the code is beautiful, or feeling cool or arrogant about it

It just works

Monkey patching is a terrible practice outside of unit testing and can lead to extremely difficult to debug bugs.

Also monkey patching isn't unique to python.
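To illustrate the unit-testing case, here is a rough sketch of a patch that is scoped to a `with` block using `unittest.mock.patch.object`, so the original method comes back automatically. `Clock` and `report` are hypothetical names made up for the example.

```python
from unittest import mock


class Clock:
    """Hypothetical class standing in for something hard to test."""

    def now(self):
        return "real time"


def report(clock):
    return f"now: {clock.now()}"


# The replacement only exists inside the `with` block.
with mock.patch.object(Clock, "now", return_value="frozen"):
    inside = report(Clock())

# Outside the block the original method is restored.
outside = report(Clock())

print(inside)   # now: frozen
print(outside)  # now: real time
```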

FWIW I tried looking a few up and standard library seemed hit-or-miss:

In JavaScript it didn't work:

  class testClass {
    constructor() { }
    
    callHoHe() {
      console.log('ho', 'he');
    }
  }

  let hi = new testClass();
  hi.callHoHe();

  testClass.callHoHe = () => {
    console.log('haha');
  }

  let heh = new testClass();
  heh.callHoHe();
This ended up just printing 'ho', 'he' twice.

For Java, people didn't think it was possible:

https://stackoverflow.com/questions/47006118/is-there-any-wa...

For Java they said here that you just have to use your own similar implementation.

And for C# they have some pretty intense restrictions on overriding standard library stuff:

https://stackoverflow.com/questions/21302768/where-we-can-ov...

Golang doesn't seem to have this functionality either:

https://stackoverflow.com/questions/37079225/golang-monkey-p...

Ps. it would have been nice to have monkey patching when dealing with btoa and atob in JavaScript, since they have different function on NodeJS vs the browser.

Pretty close with the JS, just change testClass.callHoHe to testClass.prototype.callHoHe and you're good to go. Agreed about btoa and atob, since they're globally scoped and I'm not sure if they can be overwritten...

> Ps. it would have been nice to have monkey patching when dealing with btoa and atob in JavaScript, since they have different function on NodeJS vs the browser.

The better solution is to encapsulate the class and override the methods. Monkey patching is terrible because the behavior of the function is changing at run time. If someone is not aware that you are monkey patching a function the only way for them to determine what is going on is to step through the code with a debugger.

In Scala it's possible, but only at object creation time.

This was my thought exactly. I understand why someone would want to do it. However, when a problem comes up, good luck debugging it in Python.

Ruby has the same ability, although it can be a footgun if you (or a dependency of a dependency) modify or extend the original class.

> The most insane thing about Python is how you can override single methods in classes and use the class like normal.

Isn't that true of basically every language supporting class-based OOP and inheritance?

It's called 'monkey patching' and python does make it particularly easy, simply:

class.methodName = newMethod

.. kinda thing, future callers now get your method instead of the original.

This does seem a fair bit easier than in other languages?
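Spelled out as runnable Python, the assignment above looks roughly like this. `Greeter` and `shout` are hypothetical names used only for illustration.

```python
class Greeter:
    def greet(self):
        return "hello"


def shout(self):
    # A plain function works as a method once it is assigned to the class.
    return "HELLO"


# The monkey patch: replace the method on the class itself.
Greeter.greet = shout

result = Greeter().greet()
print(result)  # HELLO
```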

  class.methodName = newMethod
 
> .. kinda thing, future callers now get your method instead of the original.

Which is as powerful as it is a problem, since this kind of monkeypatching will change the behaviour of all other instances, including already-created ones, that know nothing about your trick.

Any part of the program can modify any other part of the program in a significant way, making local reasoning and debugging very hard.

So, great for quick-and-dirty single-file scripts/ipython notebooks. Terrible for large systems.

That's the very issue with Python. The way it doesn't enforce sane, clean programming behaviour makes it easy for a beginner/non-programmer to work with it. But a large system with a lot of external libraries is very hard to maintain.

Source: Python user since ~2004
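A small sketch of the "already-created ones" point, assuming a hypothetical `Counter` class: the instance is created before the patch, yet its behaviour changes afterwards, because methods are looked up on the class at call time.

```python
class Counter:
    def __init__(self):
        self.value = 0

    def step(self):
        self.value += 1
        return self.value


early = Counter()       # created before anyone patches anything
before = early.step()   # uses the original method


def step_by_ten(self):
    self.value += 10
    return self.value


# Some other part of the program patches the class...
Counter.step = step_by_ten

# ...and the old instance silently picks up the new behaviour.
after = early.step()

print(before, after)  # 1 11
```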

> It's called 'monkey patching' and python does make it particularly easy

Inheritance and overriding the method in a descendant class is cleaner, and more broadly supported. When you need monkey patching, sure, it's nice that most modern dynamic OO languages support it quite naturally. (Ruby even supports scoped monkey patching via refinements, as well as classic monkey patching and per-object overrides.) But this is not at all unique to Python.
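For comparison, a rough Python sketch of the two less invasive options mentioned here: overriding in a subclass, and a per-object override via `types.MethodType`. `Parser` and `CommaParser` are hypothetical names, and neither approach touches other instances of the base class.

```python
import types


class Parser:
    def parse(self, text):
        return text.split()


class CommaParser(Parser):
    # Subclass override: Parser itself is left untouched.
    def parse(self, text):
        return [part.strip() for part in text.split(",")]


plain = Parser()
special = Parser()

# Per-object override: bind a new function to this one instance only.
special.parse = types.MethodType(
    lambda self, text: text.upper().split(), special
)

subclass_result = CommaParser().parse("a, b")
special_result = special.parse("a b")
plain_result = plain.parse("a b")

print(subclass_result, special_result, plain_result)
# ['a', 'b'] ['A', 'B'] ['a', 'b']
```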

You see exactly what I mean!

Perhaps we're talking about different sets of people? You seem to be describing the people who build ML systems, while the previous poster was talking about the people who use them. Your average data scientist most definitely does not have (or, really, even need to have) deep hardware knowledge or any understanding of networking.

I personally consider the term "data scientist" a very successful creation by a clever marketer

It's sexy and most of the times ...

Ethical hackers have written some of the worst code I’ve ever seen. They have to know a ton about the security of frameworks, networking, etc… it’s a very complex role for sure, but they are not shining beacons of software engineering quality

I agree. Their objective is not to build beautiful code but to provide a working prototype

I'm sure every single one of them is capable of writing world class code if they feel like it

Their exploits are world class and their focus is to exploit

I have seen things in JavaScript exploitation whose surface I can't even scratch. I feel like I have been playing piano for 15 years and can't even tell whether what is playing is music

Python is used to run fast and dirty experiments. You haven't figured out the answer yet, so there is no point in building dedicated, optimized code which might be useless in the end

Almost always, ML production models end up being a binary file of weight matrices. This file can be loaded in whatever language or on whatever device you decide to use
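As a toy illustration of that last point (not any real model format), weights can be packed as raw little-endian float32 values with the standard `struct` module. The layout here is an assumption made up for the example; any language that can read 4-byte floats could load the same bytes.

```python
import struct

# Three made-up weights, chosen to be exactly representable as float32.
weights = [0.5, -1.25, 2.0]

# "<" = little-endian, "f" = 32-bit float; no header, just the raw values.
blob = struct.pack(f"<{len(weights)}f", *weights)

# Reading them back only requires knowing the layout, not Python.
count = len(blob) // 4
restored = list(struct.unpack(f"<{count}f", blob))

print(len(blob), restored)  # 12 [0.5, -1.25, 2.0]
```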

Most ML people (aka data scientists) I've met have little understanding of what a computer is, and that's fine. They understand stats.

The tiny subset of people who build ML systems (say TensorFlow core devs, people who write actual distributed systems, etc.) are actually HPC specialists, and have all the qualities you describe.

Of course, you may work somewhere where you're lucky enough to have everyone be good at everything!