Hacker News
by CoolGuySteve 2360 days ago
Both R and pandas force you to wrap your problem around dataframes and vectorized operations. But sometimes you really do just want to write a loop that iterates over the data.

Right now the only way to do that without significant performance costs is to drop down into C or avoid the problem completely by using Julia.

Having worked with both R and Python on large datasets, I think both languages are really easy until they aren’t. Eventually you hit a performance wall.

2 comments

You can speed up loops in Python with Numba. It's a great performance booster that often needs nothing more than a few decorators.
When iterating over a pandas object is slow, you can drop down to its underlying NumPy values array and loop over that instead.
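A rough sketch of that approach, assuming pandas and NumPy are installed. The column names, the smoothing factor `alpha`, and the data are hypothetical. `Series.to_numpy()` is used here; the older `.values` attribute gives the same array for a plain float column:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not from the thread.
df = pd.DataFrame({"price": [100.0, 120.0, 90.0, 110.0, 80.0]})

# Pull the column out as a plain NumPy array so the loop indexes the array
# directly instead of going through pandas on every element.
values = df["price"].to_numpy()

# Exponential smoothing: each step depends on the previous result.
alpha = 0.5
smoothed = np.empty_like(values)
smoothed[0] = values[0]
for i in range(1, len(values)):
    smoothed[i] = alpha * values[i] + (1 - alpha) * smoothed[i - 1]

# Write the result back as a new column.
df["smoothed"] = smoothed
```

The loop is still interpreted Python, so this mainly removes the per-element pandas overhead; it can be combined with a compiled loop if more speed is needed.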