by aldanor 2333 days ago
'Thinking about the data' most often requires looking at the data from hundreds of different angles, quickly investigating its properties and statistics, maybe plotting or fitting a few things, checking some hypotheses, etc. (you will most likely throw out all of that code after the initial stage).

Same with the results - once you've coded something (perhaps outside of a notebook environment) and obtained results, verifying that they are what you expect is much more efficient to do in a notebook.

1 comments

Maybe you use a notebook I'm completely unfamiliar with, but my experience is that they allow you to write code, run it, and save the results in cells. My IDE does all of that except saving partial results, and that is easy to do by dumping your precomputed data to disk if you can't recompute it easily. In either case, an IDE gives you an actual debugger, plus IntelliJ has great data visualization plugins, a database viewer, great autocompletion, VCS integration, etc. What do you do when you need an actual debugger, or need to profile your code? What about documentation for the function you are calling? In my IDE this is a popup; in every notebook I've used, this is a Google search.
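
For what "dumping your precomputed data to disk" could look like in practice, here is a minimal sketch using the standard library's pickle. The helper `cached`, the workload `expensive`, and the file name are all hypothetical, made up for illustration; for large datasets a columnar format would probably be a better fit than pickle.

```python
import pickle
import tempfile
from pathlib import Path


def cached(path, compute):
    """Return the pickled result at `path`, computing and saving it on a miss."""
    path = Path(path)
    if path.exists():
        # Cache hit: load the previously saved result instead of recomputing.
        with path.open("rb") as f:
            return pickle.load(f)
    # Cache miss: compute once, then persist for later runs.
    result = compute()
    with path.open("wb") as f:
        pickle.dump(result, f)
    return result


calls = []  # records how many times the expensive step actually ran


def expensive():
    """Hypothetical stand-in for a slow precomputation."""
    calls.append(1)
    return {"total": sum(range(1_000_000))}


cache_file = Path(tempfile.mkdtemp()) / "expensive.pkl"
first = cached(cache_file, expensive)   # computes and writes the file
second = cached(cache_file, expensive)  # reads the file, no recomputation
```

The second call never touches `expensive`, which is the whole point: a fresh process can pick up the partial result without a live kernel holding it in memory.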

I use both PyCharm and JupyterLab on a daily basis, typically dealing with multi-GB datasets.

If I'm writing a library or adding new features to one, or writing tests, I'll use PyCharm, sure thing; otherwise the notebook is a quicker way to sketch prototypes and to always have a kernel with preloaded datasets and pre-imported stuff ready at hand. I don't want to wait 10 minutes just to load the data every time I want to check whether my new function works well on it at big scale. That's one of the most important bits.

PyCharm is a clear winner at actually writing code that you won't throw in the bin 10 minutes later, once you know what to write.

Debugging? I don't remember ever using PyCharm's debugger, despite the fact it exists... it's either pudb or python-devtools or something else. I'd just write tests, and things start working in the process. And btw, you have the pdb debugger (some weak version of it) in Jupyter if you really need it. Docstrings? Press Shift+Tab twice in the notebook. Or keep PyCharm open on the side so you can cmd-b. Profiling? Never a PyCharm builtin, maybe something like flamegraph, but an external tool anyway.
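
As a rough illustration of profiling without any IDE builtin, here is a sketch that uses the standard library's cProfile and pstats (swapped in for the flamegraph tool mentioned above, which is a separate external tool). The function `slow_sum` is a hypothetical workload, not anything from this thread; the same code runs equally well in a notebook cell or a plain script.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical workload: sum of squares in a plain Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


# Profile only the call we care about.
profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(200_000)
profiler.disable()

# Render the five most expensive entries by cumulative time into a string.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

In a notebook, the `%prun` magic wraps roughly the same machinery, so either route gets you a per-function breakdown without leaving the kernel.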