by _coveredInBees
2465 days ago
To be honest, I've always been puzzled by how extensively people use Jupyter notebooks, despite how deficient they can be in many situations. Don't get me wrong, I have used them a lot and I still use them on occasion, but pretty much the only times I would really reach for them are when I want code + documentation to reside together. So if I am prototyping things that would also benefit from Markdown + LaTeX comments/documentation, a notebook is far superior to a script with text comments. The only other situation where I've found them to be pretty useful is when utilizing their interactive widgets to let the end user explore the data in interesting ways. But most people seem to use them as an IDE, when they are quite deficient compared to something like PyCharm. Even for quick prototyping, I find PyCharm to be far more useful because I can a) run things directly in the IPython console, b) examine variables in the variable display widget, c) attach a debugger to the console at any time and start debugging with an unmatched debugging experience in Python land, d) easily have the correct venv used by the IPython console, e) get outstanding linting + code introspection + autocompletion, f) get sane git diffs that make it easy to use version control appropriately and frequently, unlike with Jupyter notebooks, and probably a bunch of other benefits that come with having access to a powerful IDE. All of these make rapid prototyping much faster than anything I can achieve in a Jupyter notebook. I've also been pretty unimpressed with the quality of the average Jupyter notebook that I find in GitHub repos. Notebooks encourage dumping everything into global state, rely on state across code cells in non-obvious ways, and in general result in ugly scripts that need a lot more work if they have to be refactored into modular packages/modules.
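To make the global-state point concrete, here is a minimal sketch. The "cells" are simulated as plain functions, and every name in it (`cell_load`, `cell_add_bonus`, `total_with_bonus`) is made up for illustration, not taken from any real notebook.

```python
# Notebook style: each "cell" reads and mutates a shared global.
# Cells are simulated here as functions so the example runs as a script.
total = 0

def cell_load():
    """Cell 1: initialise the global."""
    global total
    total = 100

def cell_add_bonus():
    """Cell 2: silently depends on cell 1 and mutates the global in place."""
    global total
    total = total + 5

cell_load()
cell_add_bonus()
cell_add_bonus()  # re-running the cell changes the answer again
notebook_total = total

# Module style: the same logic as a pure function with explicit inputs.
def total_with_bonus(base, bonus=5):
    """Return base plus bonus without touching any shared state."""
    return base + bonus

module_total = total_with_bonus(100)
repeat_total = total_with_bonus(100)  # calling it again gives the same value
```

The function version gives the same result no matter how many times or in what order it is called, which is most of the work when moving notebook code into a module.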
Running automated jobs from Jupyter seems a bit crazy and I hope people stop to think whether that is the appropriate path to take when trying to write automated jobs.
|
Your comments about what Jupyter notebooks incentivize are definitely true (I'd argue it's part of the appeal: people new to programming often don't immediately grok types of state, and Jupyter kind of just says "eh, work around it"). I certainly fall victim to it and often wish for a week of "the expectation is that you will go through your notebooks and create modules or packages for them".
I also agree that running automated jobs from Jupyter isn't an optimal solution, but for many companies, if it means a data scientist (whose coding skills are primarily statistical) can get a report into production without a single engineering hour, it's often worth it.