Hacker News new | ask | show | jobs
by janalsncm 1065 days ago
I was extremely stubborn when I started out in python. Built a script for everything. Jupyter is messy. But once I started using it I never went back for data analysis tasks.

Say you have a large file you want to read into memory. That’s step 1, it takes a long time to parse that big json file they sent you. Then you want to check something about that data, maybe sum one of the columns. That’s step 2. Then you realize you want to average another column. Step 3.

If you write a basic python script, you have to run step 1 and 2 sequentially, then once you realize you want step 3 as well, you need to run 1, 2 and 3 sequentially. It quickly becomes much more convenient to have the file in memory.

1 comments

I like to imagine it's like a very advanced REPL that's somewhat reproducible (if you run everything from the beginning). If you don't find the appeal of being able to mutate state live for experimentation then it isn't for you.