by chalence 3117 days ago
I'm a fan and frequent user of Pandas, but does the increase in Stack Overflow questions indicate a surge in popularity, or difficulty in use? I for one run into Pandas issues frequently and often find myself searching for the succinct solution (though, to be fair, this may also be an indication of the impressive scope Pandas strives for).
4 comments

I think you're right. I have come across many, many tasks where I end up searching Stack Overflow excessively because I'm surprised there isn't a better way to do them. Try doing a cross join in pandas; it's deeply dissatisfying.
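For anyone wondering what the cross join complaint looks like in practice, here is a minimal sketch of the usual workaround: give both frames a constant key and merge on it. The helper name `cross_join`, the temporary column `_cross_key`, and the sample frames are made up for illustration and are not part of pandas.

```python
import pandas as pd


def cross_join(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Pair every row of `left` with every row of `right`."""
    # Hypothetical temporary column name; it must not clash with a real column.
    key = "_cross_key"
    # Every row gets the same key value, so an ordinary merge matches all pairs.
    merged = left.assign(**{key: 1}).merge(right.assign(**{key: 1}), on=key)
    # Drop the helper column so only the original columns remain.
    return merged.drop(columns=key)


colors = pd.DataFrame({"color": ["red", "blue"]})
sizes = pd.DataFrame({"size": ["S", "M", "L"]})

pairs = cross_join(colors, sizes)
print(pairs)
```

Newer pandas releases (1.2 and later) also accept `how="cross"` in `merge`, which makes the dummy key unnecessary there.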
Pandas is useful, and I don't want to bad-mouth it, as people obviously get a lot out of it. However, it has a complicated API and contains about 200k lines of code, so it is no surprise that documentation is a challenge and that there are a lot of Stack Overflow questions. For example, figuring out which methods return copies of the data and which return views is hard.
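One way to poke at the copy-versus-view question empirically is to ask NumPy whether two results are backed by the same memory. This is only a rough sketch: the helper `shares_memory` and the sample frame are made up for illustration, it only looks at one numeric column, and the answers for some operations can differ between pandas versions and settings.

```python
import numpy as np
import pandas as pd


def shares_memory(a: pd.DataFrame, b: pd.DataFrame, column: str) -> bool:
    """Return True if `column` in both frames is backed by overlapping memory."""
    return bool(np.shares_memory(a[column].to_numpy(), b[column].to_numpy()))


df = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": [4.0, 5.0, 6.0]})

# A few common operations to compare against the original frame.
candidates = {
    "df.copy()": df.copy(),
    "df.iloc[:2]": df.iloc[:2],
    "df[df['x'] > 1]": df[df["x"] > 1],
}

for label, result in candidates.items():
    print(label, "shares memory with df:", shares_memory(df, result, "x"))
```

An explicit `copy()` and a boolean-mask selection should both report no shared memory; the slice is the case worth checking on your own installation.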

Compare with dlply. It solves a similar problem to pandas but has a vastly simpler API. To be fair, Pandas does more, but dlply is also more flexible. I looked at implementing something like dlply in Python, but you really need lazy evaluation in the language, and dlply makes extensive use of this feature of R. On the downside, it can be very confusing to new users, as this lazy-evaluation code is hard to debug.

Rather than adopting Pandas to build our product, I built a very minimal version of it (on top of numpy) that only does what we need. That was some extra work, but I'm happy I did it, as we avoid this huge dependency. I understand quite well what my little minimal version does; it is only about 1000 lines of Python code and some tiny C extensions.

I've used pandas frequently and find its docs to be entirely unsatisfying. Stack Overflow provides examples and fairly good insight into using pandas as a whole, whereas the docs basically just say that a function exists.
Pandas has much more breadth than depth. The first few months I used it, I felt the same. At some point it all just clicked and I more or less knew where to look for stuff in the docs.
Anything mildly complex is difficult. And I guess an unpopular tool wouldn't see an increase in questions for a long time: people would either leave it or learn it, so the number of questions should decline or at least stay around the same.