Hacker News
Getting a scientific prize for open-source software (gael-varoquaux.info)
142 points by gorpovitch 2387 days ago
3 comments

OP is right-on as to why this is a big deal in academia.

I want to see it spread beyond computer programming libraries into areas where sharing is harder, like open source scientific equipment and fully reproducible methods in chemistry experiments.

> OP is right-on as to why this is a big deal in academia.

The ACM (academic computer science group) has awarded prizes to these open source systems projects amongst others recently:

- Wireshark

- Jupyter

- GCC

- Mach

- Coq

- LLVM

- Eclipse

- make

- Java

- Tcl/Tk

The linked article is written as if academia never works on or recognises open source software or implementation work, or as if using open licences were unusual. That's not true.

OP was talking about academia in general, and not just about CS-academia, which is of course a lot more sensitive to open source software.

In traditional (= non-CS) academia, proprietary software is still very much the norm, and as long as institutes get a free license for academic usage they also don't seem to care about open source too much. I don't know how much precedent there is, but such recognition from traditional academia still seems to be pretty rare and worthy of highlighting.

Yes. The self-congratulation was very off-putting.

> fully reproducible methods in chemistry experiments

Top-tier open source libraries for cheminformatics (or other natural science -informatics flavours) would already be a welcome start.

What do you think is missing in the current offering (OpenBabel, RDKit, maybe some others I am missing)?

Context: I do research in computational chemistry, and write an open source library for this, that could be used for cheminformatics too. I don't really know what is needed for this though, since I never touched cheminformatics.

I've dabbled a bit with OpenBabel and RDKit, but I found their interfaces especially for simple things (traversing atoms/bonds in a molecule) quite unwieldy. I suspect that a big part of this could "just" be missing documentation / tutorials to get into it.

Maybe I'm just not deep enough into it, but my impression so far is that, especially for application-level software (in contrast to specialized research), OEChem and similar closed source libraries are the most widely used ones, with nothing quite comparable available as open source.

Context: Software Engineer that is also currently a biochemistry undergraduate.

> I found their interfaces especially for simple things (traversing atoms/bonds in a molecule) quite unwieldy.

Somewhat the same for me; this is part of why I started my own project (http://chemfiles.org). I have the impression that for cheminformatics you want to see molecules as graphs. Is this true, or is a list of bonds enough for usual purposes?
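To make the graph-versus-bond-list question concrete, here is a minimal sketch in plain Python. It is not the API of chemfiles, OpenBabel, or RDKit; the function name `bonds_to_adjacency` and the three-atom example are made up for illustration. It assumes a bond is just a pair of atom indices and shows how such a list can be turned into an adjacency map, which is the simplest graph view of a molecule.

```python
from collections import defaultdict


def bonds_to_adjacency(bonds):
    """Build an undirected adjacency map from (i, j) atom-index pairs.

    Hypothetical helper for illustration only.
    """
    adjacency = defaultdict(set)
    for i, j in bonds:
        # Each bond is stored in both directions so lookups are symmetric.
        adjacency[i].add(j)
        adjacency[j].add(i)
    return dict(adjacency)


# Illustrative input: heavy atoms of ethanol, C0-C1-O2 (hydrogens omitted).
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]

graph = bonds_to_adjacency(bonds)

# With the adjacency map, the neighbours of an atom are a direct lookup;
# with the bare bond list, the same query means scanning every bond.
neighbors_of_c1 = sorted(graph[1])
neighbor_symbols = [atoms[k] for k in neighbors_of_c1]
```

The bond list is enough to store connectivity, but anything that walks the molecule (ring detection, substructure matching) tends to want the neighbour lookup, which is probably why graph representations keep coming up for cheminformatics.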

I have heard of OEChem but never used it. I'll try to find some documentation to have a look.

> I have the impression that for cheminformatics you want to see molecules as graphs

Yeah, that was my thinking.

I've also seen your work on lumol, so you seem to be one of the few people working in the field with Rust! I just recently started writing a SMILES parser in Rust[0], as a first step towards an in-memory graph representation of molecules. I have a first rough draft of that locally, though it's very rough and changing a lot, as I have to adjust it weekly as I'm basically learning the required theory at the same time :D

[0]: https://github.com/hobofan/smiles-parser

Published chemical syntheses are described in great detail. Physical chemistry/spectroscopy papers likewise describe apparatus, collection, and analysis often down to the nuts and bolts. I don't see how to open source work requiring a femtosecond mid-infrared laser or a prep requiring a synthesis lab with all the reagents, labware, and safety equipment. Buried in the open source PR is the unshakeable underlying belief that science begins when the data are in the can and ready for analysis.

> I don't see how to open source work requiring a femtosecond mid-infrared laser or a prep requiring a synthesis lab with all the reagents, labware, and safety equipment.

You can put text documents on GitHub describing process, in the same way as you can code and data. If you have some setup with a femtosecond mid-infrared laser or prep requiring a synthesis lab with all the reagents, labware, and safety equipment you can open source the bill of parts, the build instructions and the lab book. It'd probably be very valuable to do that so please do!

Here are the freely available supplemental data to a paper in the Journal of the American Chemical Society blending organic synthesis, computation, and spectral characterization. 122 pages of exquisite details from a multi-lab collaboration. Lots more like it out there.

Note: I am not in any way affiliated with this research or the labs involved. This came out of a quick search.

https://pubs.acs.org/doi/suppl/10.1021/jacs.6b13031/suppl_fi...

Doesn’t that prove my point? I know people post their artefacts. I often review them. Not sure what you’re trying to say?

If you think that all software written on public money should be public, consider this petition: https://publiccode.eu

I think the budget used to make this software possible should also be public... yet we have a black budget