
Aside from the performance differences, data.table makes it very easy to do interactive manipulation, at the cost of making it hard to program. Pandas currently goes in the opposite direction.

I'd rather have R/data.table at the prompt and Python/pandas in my script, but if you have to err on one side, the Python/pandas "low magic" is the side to err on. Pandas does have its own strange corners, though. For example, it seems to try hard to pack similar-typed columns into contiguous matrices, which leads to some unexpected casting, and I have no idea what the supposed benefit is over just keeping the columns separate.
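To make the casting complaint concrete, here's a minimal sketch of one place where it shows up. This may not be the exact case I ran into, and the column names are made up for illustration. Pulling a single row out of a frame with an int column and a float column gives you one Series with one dtype, so the int gets upcast:

```python
import pandas as pd

# Hypothetical frame: one integer column, one float column.
df = pd.DataFrame({"a": [1, 2], "b": [1.5, 2.5]})

# Each column keeps its own dtype while it stays in the frame.
col_kinds = {name: dtype.kind for name, dtype in df.dtypes.items()}

# A row spans both columns, so pandas picks a common dtype for it.
row = df.iloc[0]

# Converting the whole frame to a NumPy array does the same thing.
arr = df.to_numpy()

print(col_kinds)   # 'i' for the int column, 'f' for the float column
print(row.dtype)   # common dtype chosen for the row
print(arr.dtype)   # common dtype chosen for the array
```

The columns themselves are untouched; it's the row-wise and whole-frame views that get coerced, which is easy to miss if you expected `row["a"]` to still be an integer.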