Hacker News
by reikonomusha 4365 days ago
I hate to be that guy, but I wish to share my opinion about closed source mathematical software.

There is no doubt that what Wolfram Research has done with Mathematica is amazing and tempting. It is a very complete and uniform engine, and can be very useful for very different kinds of mathematics.

However, Wolfram Research deliberately keeps their methods and source code closed. Note that this is more serious than just the "Stallman-esque" open-source-everything philosophy. Wolfram insists that users do not need to know implementation details of their methods. This is plainly in their documentation. You can see the uncompelling argument from Wolfram here [6]. The gist of the argument is that interfaces matter, not implementations.

I strongly argue that users, especially mathematicians and engineers, should care about the internals of mathematical software, especially when it's being used, even in a utilitarian fashion, for research and engineering.

Not only this, but Wolfram has litigated against his own employees for publishing mathematical proofs about cellular automata. Information about this lawsuit is sparse, but evidence of it can be seen in [0]. More information can be found here [1].

Unfortunately, most responses to the above from users of Mathematica are "well I just use Mathematica as a calculator, nothing serious" or "I wouldn't look at the source code anyway, so what gives?" It's an unfortunate response, and I don't have a technical rebuttal, only a moral one, which many don't want to hear.

It pains me to see the technical reliance on Mathematica (and other software such as MATLAB) in professional mathematicians, scientists, and engineers. It reminds me of an addictive drug; one of the best hackers I know does their work completely in Mathematica, and can no longer work without it.

As is the case with a lot of closed source, proprietary software, there aren't a ton of good alternatives. There is a plethora of logistical issues with existing computer algebra systems, but I nonetheless recommend them. Sage [2] is a continuously growing system based on Python which has backing from a lot of mathematicians. They are continually improving it. There's also Maxima [3]. Neither of these has quite as extensive an array of functionality and graphical capabilities as Mathematica.

I (and others) have written about this issue more extensively here [4] for those interested. This is an extension of the article written by Jordi G. Hermoso [5].

If you took the time to read this, thanks.

[0] https://groups.yahoo.com/neo/groups/theory-edge/conversation...

[1] http://vserver1.cscs.lsa.umich.edu/~crshalizi/reviews/wolfra...

[2] http://www.sagemath.org/

[3] http://andrejv.github.io/wxmaxima/

[4] http://symbo1ics.com/blog/?p=69

[5] http://www.symbo1ics.com/files/jordi.pdf

[6] http://reference.wolfram.com/language/tutorial/WhyYouDoNotUs...


My impression is that you don't hate to be that guy, but actually love to express this opinion.

Open source systems like Sage always look desirable, simply by virtue of being open source. But every time I look at it I'm left with a very bad taste in the mouth because of the constant badmouthing of non-open-source systems that is going on in that community. Companies do that sort of thing, and it doesn't inspire trust. But we know that it can happen just because a few people in the management made bad decisions. But when a community (!) around an open source (!) system takes on that attitude, it looks much worse. Don't you realize you're driving people away?

Why not put all that energy into improving your own system instead of trying to actively hinder others? Examples of that are forking GMP and making it GPL (not LGPL); actively pointing out to people (as Mr. Hermoso did to me) that no, you can't link Octave to Mathematica because Octave is GPL (which is just a hindrance to my research, as well as to others'); and building on the fallacy that results obtained with Sage are inherently better because Sage, being open source software, is _theoretically_ verifiable. All software is buggy, and the only thing that makes a research result more trustworthy is that it has actually been verified, not that it's theoretically verifiable when no one ever does it. Practical verification is almost never about reading the source code. It's about making sure the result is consistent and computing it with alternative tools.
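To make "computing it with alternative tools" concrete, here is a minimal Python sketch of that kind of cross-check. It is only an illustration: the two determinant routines stand in for two independent systems, and the function names are made up for this example. The result is accepted only when both methods agree exactly.

```python
from fractions import Fraction
from itertools import permutations


def det_by_elimination(matrix):
    """Determinant via Gaussian elimination over exact rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot in this column: singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def det_by_permutations(matrix):
    """Determinant via the Leibniz formula (sum over all permutations)."""
    n = len(matrix)
    total = Fraction(0)
    for perm in permutations(range(n)):
        # Sign of the permutation from its inversion count.
        inversions = sum(
            1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j]
        )
        term = Fraction(-1 if inversions % 2 else 1)
        for i in range(n):
            term *= matrix[i][perm[i]]
        total += term
    return total


def cross_check(matrix):
    """Return the determinant only if both independent methods agree."""
    first = det_by_elimination(matrix)
    second = det_by_permutations(matrix)
    if first != second:
        raise ValueError(f"methods disagree: {first} vs {second}")
    return first
```

Agreement between two methods does not prove correctness, but a disagreement is a clear signal that at least one of them is wrong, whichever system it came from.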

No, I don't love talking about it. I actually find it saddening and arduous.

I don't consider what I said "bad mouthing". Maybe it was. I tried to be as respectful as possible, and provide links where I could.

I am certainly trying to improve existing systems. I've written a library for doing computational group theory, for which a paper was just published, and I plan to include it in Maxima.

Regarding verifiability, Sage has a lot more going for it than "theoretical verification". Professional mathematicians, especially those in algebraic combinatorics, regularly hold conferences and write software along with papers to show correctness of the system, and write new mathematically grounded functionality.

I apologize to both the authors of open source systems and to potential consumers of such systems if I am driving them away. My goal is to at least spark the idea for one to step back and evaluate what it means/implies/etc. to make use of proprietary mathematical systems, especially in professional or academic settings.

Results obtained with Sage are better because Sage is open: in a mathematical research paper you can say:

"This reduces the problem to computing blah, which we did using the following Sage code. The function foo used here uses the algorithm of X and Y as described in their paper [XY2006]"

You can't say:

"This reduces the problem to calculating blah which the following Mathematica code computes using an unspecified algorithm for which there is no accompanying paper proving correctness."

Of course a paper proving that an algorithm is correct can contain errors, and also even if the proof of correctness is fine the actual implementation can contain bugs. But if you have no way of knowing how something is computed and whether anybody at any point in time even tried to prove mathematically that the method used is correct, you have no moral authority to rely on the result. That's just the standard adopted in mathematics: you can depend on results you have good reason to believe are true and are documented in the literature; you can't rely on stuff that's not written about. I don't know how citing results works in other areas, but that's how it is in mathematics (at least in the fields of mathematics I'm familiar with).

I guess it depends on one's field.

I've never cited software I didn't write myself by simply saying "trust this software, here's the reference." I wouldn't trust a result from Sage any more than one from Mathematica, simply based on which system produced it. I'd trust that x is a solution of an equation if substituting it back verifies it. I'll trust that two graphs are isomorphic if the software gives me a vertex permutation that makes the adjacency matrices identical. It doesn't matter how that isomorphism was computed.

When publishing work, I'll aim to make it verifiable this way.
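As a rough Python sketch of the two checks described above (the function names and input conventions are assumptions made for this illustration, not anything from a particular system): both take a claimed answer and verify it directly, with no reference to how the answer was found.

```python
def is_root(coeffs, x):
    """Check that x is a root of a polynomial by substituting it back.

    coeffs lists exact (e.g. integer) coefficients from the highest degree
    down to the constant term.
    """
    value = 0
    for c in coeffs:  # Horner's rule
        value = value * x + c
    return value == 0


def is_isomorphism(adj_g, adj_h, perm):
    """Check that perm is an isomorphism from graph G to graph H.

    adj_g and adj_h are square 0/1 adjacency matrices; perm[i] is the
    vertex of H that vertex i of G is mapped to.
    """
    n = len(adj_g)
    if len(adj_h) != n or sorted(perm) != list(range(n)):
        return False  # perm is not a bijection on the vertex set
    return all(
        adj_g[i][j] == adj_h[perm[i]][perm[j]]
        for i in range(n)
        for j in range(n)
    )
```

For example, with a path 0-1-2 as G and a star centred on vertex 0 as H, `is_isomorphism(G, H, [1, 0, 2])` holds while the identity permutation fails. The root check uses exact arithmetic; with floating-point inputs it would need a tolerance instead of `== 0`.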

I believe the vast majority of the use of these systems is not of the type when one needs to blindly trust the software and refer back to it in the paper. At least in my field (physics) it isn't. Yet I use programs like this daily, and I clearly depend on them for my work.

Most of the functionality available in Mathematica (or, I'd argue, most similar systems) is not of the type that one needs to cite. It either uses standard and well-known algorithms that are available in a multitude of systems (do you cite the methods for matrix multiplication or eigenvalue computation, and would it make a difference?), or produces results that are much easier to verify than to compute.

In those cases when I need to rely on a published method, like you mention, the method is very unlikely to be a built-in part of any system. So I either need to re-implement it, or use the original code of the authors. If the authors implemented their method in Mathematica instead of Python, does that make their program less reliable? No. It's still a published method, anyone can verify it.

My point is that I hear this argument about Sage very often, and the typical generalization is: "if you used Mathematica for your research, that's wrong, because it's not verifiable". This is a fallacy. It completely ignores how this software is used in practice, and implies that results from open source software are somehow magically reliable (they're not) and don't need verification (they do).

I've yet to come across a situation where the argument does apply at all: point me to a paper which goes truly wrong by citing Mathematica/MATLAB/Maple/etc this way.

Note that I'm not saying that one can claim that a result is correct, a theorem is true, etc. based on the fact that some undocumented algorithm produced it. That's clearly unacceptable.

Nor am I saying that it's never necessary to rely on an algorithm to get such a result.

What I'm saying is that when people use Mathematica or other closed source systems, they do not usually commit these mistakes.

Also note that Mathematica programs can be open source and documented (many are). Several built-in packages have accessible and documented source code (e.g. Combinatorica). There's nothing wrong with using these to obtain such a result, and citing the (public and documented) program used to create it.

I think we're in complete agreement: most of the time you don't need a citation for a program you use because the result can be easily verified. And I agree that what you call a fallacy is a fallacy; I was just pointing out that open source can be citable in a way that closed source isn't, and that that is an advantage.

People are apt to discount Mathematica completely. I don't see any problem using Mathematica to generate some results because the author prefers Mathematica over other offerings. However, it would give the results much more credence if the Mathematica-obtained result was then replicated using open software.

I would expect that the difficulty of the port could vary widely between different use cases.

> actively pointing out to people (as Mr. Hermoso did to me) that no you can't link Octave to Mathematica because Octave is GPL (which is just a hindrance for my research, as well as to others)

What exactly did you want to do?

GPLv3 (which is the license Octave uses) does not always prohibit linking GPLv3 code with proprietary code. In particular, if you want to hack up a private copy of Octave for your own use, and do not distribute that to others, that's fine.

The key grant of rights is this, from section 2: "You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force".

A covered work is "either the unmodified Program or a work based on the Program".

"Convey" means "any kind of propagation that enables other parties to make or receive copies".

If you are just doing stuff for your own private use, you are not conveying, and so that grant of rights to "make, run, and propagate covered works...without conditions" applies to you.

We were considering making http://matlink.org/ compatible with Octave. The feedback I got on this was part of why this wasn't done. To make MATLink user friendly, it needs to come with compiled binaries, which would be linked against Mathematica's closed source MathLink library.

If it is the case that GPL doesn't forbid this, I'd love to hear about it.

Yes, you can do that for internal (ie private) use.

The only caveat would be if your job is at a university, and you plan to give copies to students. Distribution would be legally impossible.

I should add that every time I asked WRI support about implementation details, I did receive an answer with references to the method used.

Did they ever share any of their algorithms created in-house? (They claim many are.) If they share the method, why can't they share the code?

They do have some notes on internal implementation, that do not seem up-to-date, here [0]. Were they any more detailed than this?

[0] http://reference.wolfram.com/language/tutorial/SomeNotesOnIn...

While well-meaning, sentiments like these are part of the problem.

The proposed frame is "mathematica is somewhat better in polish and functionality, but we should stick to OSS on principle."

The problem is, by this argument, no one will actually ever know what Mathematica is, what it does, and what cool ideas it had that could potentially inspire further work.

All the examples of alternative software have lots to learn from Mathematica; but thinking of Mathematica as a collection of algorithms for math is both wrong and misses the point. Mathematica is increasingly about knowledge computing; it has moved on from where other systems are only now thinking about getting to.

Instead of competing with and replacing mathematica, everyone would be better off first learning from it, and then trying to apply and extend the fundamental principles in their own work.

I think you put your finger on it... people talking about Mathematica/Wolfram Language as a CAS are really chasing an idea of what it was 10 or 15 years ago.

Maybe I'm naive but I think that we really can just "all get along" -- notebook-based programming is just one example of how ideas can incubate in closed-source projects, and end up benefiting open-source projects. I'm hoping knowledge-based computation ends up being another.

And the reverse happens too -- Light Table is tremendously exciting, and I hope WRI can learn from that as it develops.

Mathematica is decades ahead of LightTable.

Can you list the things that Mathematica does that you think LightTable "should" do?

I am not proposing that Mathematica is "somewhat better". Mathematica is vastly better for many things, especially visualization.

Other systems should "learn" from it, sure. It is hard to change Maxima, unfortunately, since it's entrenched in its roots from the 1960s. That's no excuse for making it difficult to use.

I think Sage's Python interface is doing better and better, learning what it needs to learn from Mathematica, including its documentation and interface.

I definitely don't agree with you that everyone should use or learn Mathematica in order to learn from it, though. As a student of computer algebra, I'd have to contest that Mathematica actually does computer algebra remotely correctly. A much more beautiful system for doing computer algebra was Axiom [0].

[0] http://www.axiom-developer.org/

I am worried that Sage will get stuck in the mud as its Python code base grows, without a powerful expressive functional core like Mathematica's Lisp-ish core language. For example, see the discussion in this page about simple association lists vs what Mathematica 10 just launched.

Have you noticed that in Mathematica you're not even able to make your own opaque data types? Only Wolfram has that ability.

Python is usable for actually writing software systems and algorithms. When a Mathematica code base grows over 10 lines, it becomes virtually unmaintainable in my experience.

You can make your own opaque data objects, quite easily, thanks to HoldAll, UpValues, and Internal`SetNoEntry (the nuclear option).

What precisely becomes unmaintainable about Mathematica/Wolfram Language code after 10 lines? You could just be bad at programming in WL.

Name[Field1, Field2, ...] is not an opaque data type. Maybe you can give me an idiomatic example of how to build a binary tree, for example? Or maybe something more complicated like a doubly linked list?
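To pin down what "opaque" means in this exchange, here is a small Python sketch (not Mathematica code, and not an answer to the Mathematica question itself). The class names are hypothetical. The binary tree nodes are an internal detail, and callers only ever see the set-like interface; in Python that hiding is by convention (leading underscores) rather than enforced.

```python
class _Node:
    """Internal tree node; not part of the public interface."""

    __slots__ = ("key", "left", "right")

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


class SortedSet:
    """Set of comparable keys backed by an unbalanced binary search tree."""

    def __init__(self):
        self._root = None
        self._size = 0

    def add(self, key):
        if self._root is None:
            self._root = _Node(key)
            self._size += 1
            return
        node = self._root
        while True:
            if key == node.key:
                return  # already present
            side = "left" if key < node.key else "right"
            child = getattr(node, side)
            if child is None:
                setattr(node, side, _Node(key))
                self._size += 1
                return
            node = child

    def __contains__(self, key):
        node = self._root
        while node is not None:
            if key == node.key:
                return True
            node = node.left if key < node.key else node.right
        return False

    def __len__(self):
        return self._size

    def __iter__(self):
        # In-order traversal with an explicit stack yields keys in sorted order.
        stack, node = [], self._root
        while stack or node is not None:
            while node is not None:
                stack.append(node)
                node = node.left
            node = stack.pop()
            yield node.key
            node = node.right
```

The point of the design is that the tree could be swapped for a balanced one without any caller noticing, which a bare `Name[Field1, Field2, ...]` expression does not by itself guarantee.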

I do not write Mathematica code, but most code I've seen usually ends up being this mess of functions. I'll give you that maybe the code I've seen has just been bad, so we can ignore my point there.

People into traditional programming languages look at Mathematica and see abstractions for which they see no purpose.

People into PL research look at Mathematica and scoff at its low-brow, for-the-masses term rewriting, lacking whatever theoretical property deemed "essential" that week.

Yet somehow the language has formalized and made more computable and consistent more domains of math, science, and increasingly data than any other.

There is no honor or glory in purposeful ignorance.

I disagree that it has made more domains of math computable and consistent. Its syntax is consistent, but its evaluation semantics are wildly inconsistent. This is noticeable when you use some of their internal simplification algorithms on non-trivial problems.

I think Axiom covered more surface area in terms of mathematics than Mathematica. Mathematica is mostly good at performing over reals and complexes and doing term-rewriting algebra. Axiom supported arbitrary algebraic structures.
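As a loose illustration of what "arbitrary algebraic structures" buys you (this is a Python sketch of the general idea, not Axiom code, and `Mod` and `evaluate` are hypothetical names for this example): an algorithm written only against a ring's `+` and `*` works unchanged over the integers, the rationals, or Z/nZ.

```python
from fractions import Fraction


class Mod:
    """Element of the ring Z/nZ (hypothetical helper for this sketch)."""

    def __init__(self, value, modulus):
        self.modulus = modulus
        self.value = value % modulus

    def __add__(self, other):
        return Mod(self.value + other.value, self.modulus)

    def __mul__(self, other):
        return Mod(self.value * other.value, self.modulus)

    def __eq__(self, other):
        return (
            isinstance(other, Mod)
            and (self.value, self.modulus) == (other.value, other.modulus)
        )

    def __repr__(self):
        return f"Mod({self.value}, {self.modulus})"


def evaluate(coeffs, x):
    """Horner evaluation using only the ring operations + and *.

    coeffs runs from the highest degree to the constant term and must be
    non-empty; the coefficients and x must belong to the same ring.
    """
    result = coeffs[0]
    for c in coeffs[1:]:
        result = result * x + c
    return result
```

The same `evaluate` call handles `x^2 + 1` at 3 over the integers, at 1/2 over the rationals with `Fraction`, and at 3 in Z/7Z with `Mod`, without any structure-specific code in the algorithm.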

That doesn't preclude Wolfram releasing the source code for the calculation engine.

Sometimes it's just a matter of signing an NDA to see the source code. I use a pretty expensive ($100k) EE simulation package and have questioned model implementations on a few occasions. The developers had no issues sending me the code in question for a simple 1 page NDA.

I think you are a bit confused. I would compare Mathematica to an old digital desktop calculator. Nobody who used those wanted to see their insides. The point of Mathematica is not that they have some closed source pixie dust that no one else can understand or implement. The value of the software comes from the fact that you need a lot of grunt work in a large software package to maintain the user experience (fixing all the unfun bugs, etc.), and very few open source projects manage to attract enough interest to have people do the grunt work as well, and not just the interesting new development or 'cool refactorings'.