by randomswede 1797 days ago
None of these require macros.

If (and only if; I have not looked at our hypothetical model and fit GF, but for the sake of argument I will assume that it does) "fit" specialises on "model", then "model = AnOtherModel" will cause "fit(model, x, y)" to be equivalent to Python's "model.fit(x, y)". If you need to provide a custom "fit" method, you do so by adding a method specialised on AnOtherModel to the "fit" GF.

At no point is there a macro involved.

As for the "module a, model nn" and "module b, model nn", I would naively assume that they actually are different models, and therefore something specialising on "a.nn" will not get dispatched to when you pass a "b.nn".
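A minimal sketch of that point, in Python rather than Julia or CLOS, using functools.singledispatch as a stand-in for a generic function. The stand-in modules "a" and "b", the make_module helper and the return strings are all hypothetical, made up for illustration:

```python
from functools import singledispatch
from types import SimpleNamespace


def make_module(label):
    """Build a stand-in 'module' that exposes its own class named nn (hypothetical)."""
    class nn:
        pass
    nn.__qualname__ = f"{label}.nn"
    return SimpleNamespace(nn=nn)


a = make_module("a")
b = make_module("b")


@singledispatch
def fit(model, x, y):
    # Fallback, used when no specialisation matches the model's type.
    return "no specialised fit"


@fit.register(a.nn)
def _(model, x, y):
    # Specialised on a.nn only; b.nn is a different type despite the shared name.
    return "fit for a.nn"


print(fit(a.nn(), [1], [2]))
print(fit(b.nn(), [1], [2]))
```

Both classes are called "nn", but they are distinct types, so the specialisation on a.nn is not selected for an instance of b.nn.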

Disclaimer: I don't actually know Julia, at all. But I have written substantial amounts of CLOS code (and Python, but I like CL and CLOS better).

I know nothing about Lisp, so at the risk of talking past each other...

I never said macros were required. I said implementing this type of code without OOP required more boilerplate, and MLJ uses macros to reduce that boilerplate.

As I understand module imports in Julia: Each module developer exports a list of publicly facing objects. Obviously "fit" and "model" would be among them. If you import two modules that both export a new "nn" subtype of shared parent type "model", and both extend "fit", "predict", etc. to accept their own subtype "nn", then you have to manually specify which module you are referring to every time you call nn, or fit, or predict, or whatever. Is that wrong? If I just import PyTorch, import TensorFlow, and then call "mymodel = TensorFlow.nn; fit(mymodel, mydata)" then Julia doesn't know that the "fit" I am calling is the TensorFlow implementation and not the PyTorch implementation; what if I had WANTED to use module A's "fit" on module B's model, and they intentionally adopted the same abstract type system to enable this interoperability? So instead I have to write "mymodel = TensorFlow.nn; TensorFlow.fit(mymodel, mydata); TensorFlow.predict(mymodel, mynewdata)". Obviously the extra typing is mildly annoying, but the bigger problem is potentially introducing bugs by mismatching modules, and the developer's cognitive overload of having to keep track of modules. Python-style OOP is a more elegant solution to the namespace problem and results in more readable, maintainable code, at least in my opinion. Anyways, maybe Julia has a more elegant solution I'm not aware of; if so I'd love to hear it.

In Julia it will just dispatch to the correct function.

In other words, one package would define `fit(mymodel::TensorFlowModel)` and the other would define `fit(mymodel::PyTorchModel)`, and then when you call `fit` it'll just dispatch to the appropriate one depending on the type of `mymodel`.
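For readers coming from Python, a rough analogue of this can be sketched with functools.singledispatch. This is only an approximation: singledispatch looks at the first argument alone, whereas Julia dispatches on all arguments. The classes TensorFlowModel and PyTorchModel below are hypothetical stand-ins, not real classes from either library:

```python
from functools import singledispatch


class TensorFlowModel:
    """Hypothetical stand-in; not a real TensorFlow class."""


class PyTorchModel:
    """Hypothetical stand-in; not a real PyTorch class."""


@singledispatch
def fit(model, data):
    # No implementation registered for this type of model.
    raise TypeError(f"no fit method for {type(model).__name__}")


@fit.register(TensorFlowModel)
def _(model, data):
    # What the first "package" would contribute.
    return "TensorFlow fit"


@fit.register(PyTorchModel)
def _(model, data):
    # What the second "package" would contribute.
    return "PyTorch fit"


# The caller writes plain fit(...); the type of the model picks the implementation.
print(fit(TensorFlowModel(), [1, 2, 3]))
print(fit(PyTorchModel(), [1, 2, 3]))
```

The call site never names a module; only the construction of the model does.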

This dispatch-oriented style also allows a shocking degree of composability, e.g. [1], where a lot of packages will just work together, such that you could for example just use the equivalent of PyTorch or TensorFlow on the equivalent of (e.g.) NumPy arrays without having to convert anything.

If you mean "what about the case where both packages just call their model type `Model`", while I've never run into that, the worst case scenario is just that you have to fall back to the Python style explicit Module.function usage (which was always allowed anyways...). And if you don't like names being exported, you can always just `import` a package instead of `using` it:

  help?> import
  search: import

  import

  import Foo will load the module or package Foo. Names from the imported Foo module can
  be accessed with dot syntax (e.g. Foo.foo to access the name foo). See the manual
  section about modules for details.

[1] https://www.youtube.com/watch?v=kc9HwsxE1OY

I very frequently run into namespace collisions like that. I think they are quite common in large codebases.

I am aware of the ability to do e.g. "import TensorFlow; model = TensorFlow.model; TensorFlow.fit(model,data)"

As I mentioned previously, I find Python's OOP "model.fit" syntax to be better, for a variety of reasons.

Thank you for your engagement. Have a nice day.

There's some serious misunderstanding here. You do not have to disambiguate the function call, only the construction of the object. You would write

  m1 = TensorFlow.model()
  fit(m1, data)
  m2 = Pytorch.model()
  fit(m2, data)

Julia knows which version of model you are using.

TensorFlow.fit and Pytorch.fit are just different methods of the same function.

You've formed some strong opinions based on a pretty big misunderstanding.

Do you have an example of a case where you ran into this in Julia with two packages that you wanted to use together? If the packages are still actively developed, I suspect the developers would be interested to resolve the situation to allow interop.

Never made it that far. This was a feature I use all the time in Python ML development (both consuming open source packages via an OOP interface and also writing in-house model classes) that I consider essential for my productivity and that Julia was missing.

<edit>I'm also nervous of relying on the recourse of asking package maintainers to edit their variable names to improve compatibility with the random third package I want to use; maybe the culture is different in Julia but in Python that's a good way to get laughed out of the issue tracker :) </edit>

If I were you I would maybe ask people like @systemvoltage who took the plunge and wrote a big project in Julia only to find they had trouble maintaining the project. Maybe one reason he can't upgrade without breaking everything is because of namespace collisions amongst his many dependencies? If not that, it's something like that.

@cbkeller

I know in Julia I can just precede every function call and object instantiation with "modulename." and solve the namespace problem that way. What I want to do instead, what I do in Python, is bind one namespace of function methods to each object, so that as I code, I don't have to remember which module each object came from. That is the appeal to me of "model.fit" over "module.fit(model)".

EDIT: This is not some "shave a few seconds off coding time" quality of life issue. This is a mission-critical requirement in many enterprise contexts.

Scenario A: Model Development Team trains a model, serializes it, and sends the serialized model object to Production. You want Production to have to look up which software Package the Model Development Team used for each model, just so they know which "predict" function to call? No, "predict" should just be packaged with the model object as a method, along with "retrain", "diagnose", etc.

Scenario B: I want to fit a dozen different types of models, from multiple packages, on the same data, to evaluate how they each do, and build some sort of ensemble meta model. So I need to loop over each model object in my complicated ensembling logic. You want me to also programmatically keep track of which module each model comes from?
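For concreteness, the kind of loop Scenario B describes might look like this in Python. It is a toy sketch only: MeanModel, LastValueModel and ensemble_predict are hypothetical names invented for illustration, and the averaging ensemble is the simplest one I could think of:

```python
class MeanModel:
    """Hypothetical model: predicts the mean of the training targets."""

    def fit(self, x, y):
        self.mean = sum(y) / len(y)
        return self

    def predict(self, x):
        return [self.mean for _ in x]


class LastValueModel:
    """Hypothetical model: predicts the last training target seen."""

    def fit(self, x, y):
        self.last = y[-1]
        return self

    def predict(self, x):
        return [self.last for _ in x]


def ensemble_predict(models, x_train, y_train, x_new):
    """Fit every model, then average their predictions point by point."""
    all_preds = [m.fit(x_train, y_train).predict(x_new) for m in models]
    return [sum(col) / len(col) for col in zip(*all_preds)]


# The loop never needs to know where each model class was defined.
result = ensemble_predict(
    [MeanModel(), LastValueModel()], [1, 2, 3], [2.0, 4.0, 6.0], [4, 5]
)
print(result)
```

The loop only relies on every model object exposing "fit" and "predict"; where each class was defined never appears in the ensembling code.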

These are important, happens-every-day use cases in the enterprise.

I think the best solution for this in Julia is to package the model state with all the "method" functions in one struct. Again, this is what MLJ does. This is the closest Julia gets to OOP. But then you either need a lot of boilerplate code, or a macro to unpack it all for you behind the scenes.
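To show what I mean by "state plus functions in one struct", here is a minimal sketch of the pattern, written in Python for brevity. It is not MLJ's actual code or API; BundledModel, mean_fit and mean_predict are hypothetical names:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BundledModel:
    """Hypothetical record holding model state next to its own functions."""
    state: dict
    fit_fn: Callable[[dict, list, list], dict]
    predict_fn: Callable[[dict, list], list]


def mean_fit(state, x, y):
    # Return a new state containing the fitted parameter.
    return {**state, "mean": sum(y) / len(y)}


def mean_predict(state, x):
    # Predict the stored mean for every input.
    return [state["mean"] for _ in x]


# The boilerplate: every model has to be wired up by hand like this.
model = BundledModel(state={}, fit_fn=mean_fit, predict_fn=mean_predict)
model.state = model.fit_fn(model.state, [1, 2], [3.0, 5.0])
preds = model.predict_fn(model.state, [10])
print(preds)
```

The hand wiring of fit_fn and predict_fn for each model is the boilerplate in question.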

Composability is one of the things the Julia community generally Takes Seriouslyᵀᴹ so definitely don't hesitate to ask if there are two packages that don't play as nicely as you would like!

I'm a bit confused still though why you say it's a "missing" feature, given that as we discussed above, there is absolutely nothing to stop anyone who wants to use the "Python OOP" style of namespacing in Julia from doing so? Most of us don't seem to find it necessary or to prefer it personally, but that doesn't restrict anyone else from choosing it.

Your explanation is spot on for Julia.