| Background upfront: I'm the guy behind the C++ interpreter and ROOT's new interfaces. I'm the co-author of the only surviving C++ reflection proposal and the author of the std::variant proposal. I have contributed to the C++ Core Guidelines (http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines https://youtu.be/1OEu9C51K2A). * HEP stores about 0.5 exabytes of data in ROOT format; that's almost exclusively serialized objects that do not know anything about TObject. * XRootD is not really specific to ROOT files. A better example would maybe be our JavaScript de-serialization library, https://root.cern.ch/js/ * No way will the Python binding be dropped. I wonder where you got that rumor from. About one third of our users are using it. * HEP is limited by CPU resources, which is part of the reason why HEP decided to use a close-to-bare-metal language for the number-crunching part. * We just made the use of Python and R multivariate analysis tools with ROOT data more straightforward. * We have people from genomics etc. coming to ask for help, because they cannot find a system that scales as well as ROOT does. And then we have a different perception of the direction out there. I see that Hadoop was nice but slow, Spark is nice but slow, so now things are moving to C++, see e.g. ScyllaDB. There is no reason for us to move away from it, but every reason to make it more usable. And yes, I agree that this is an issue. But many physicists do not. |
* Physicists still don't like the PyROOT interfaces; otherwise rootpy wouldn't exist.
* astropy is proof that you can be performant and user-friendly. Julia is proof that you don't even need a C++ library underneath.
* Saying ROOT scales well is weird. It is true that ROOT and the ROOT IO/ROOT files are efficient, but additional services have been needed to help it scale (dCache, XRootD, batch farm/grid/DIRAC, etc.).
* Not sure what the ScyllaDB tangent has to do with anything. There are scalable open source RDBMS options out there too, like CitusDB and Greenplum, which support UDFs. Hadoop and Spark with HDFS are still great for certain applications and as general data analysis tools, but it's tricky to get them to perform well without HDFS, and the grid model of computing doesn't lend itself well to that paradigm.
* I've heard the C++ interpreter is much better with Cling (if that's you, I applaud your effort!). CINT was a gun that fired in both directions for every grad student I ever had to help.
* XRootD has little to do with ROOT anymore other than it also implements the original root protocol.
* ROOT is not modular. It is both an application and a collection of libraries and somewhat of a VM. That does make some things convenient, but it also makes some things extremely hard.
There are many reasons to move away from ROOT, and the astrophysics community is a prime example of that!