Hacker News new | ask | show | jobs
by ececconi 3912 days ago
When you're working with large companies that process enormous volumes of transactions you can't just download things to excel.
4 comments

You would be shocked and amazed by the kinds of things that Excel is routinely (mis)-used for in business.
That doesn't stop people from trying. I'm sure some manager in some company is angry at his IT staff because he is trying to load 10GB of statistics into Excel and his staff is telling him that he shouldn't do that.
Surely if you're that large then you could invest in some in-house domain-specific solution. There are whole programming languages dedicated to doing statistics on datasets.
Isn't that the whole point of AWS and their competitors? The savings of passing the problem off to domain experts to work out all the details often outweighs the benefits of having a solution that is 100% customized for your business.
For many companies (even large companies) it is cheaper and faster to use a service like this than build an in-house solution.
Not to mention that if you want this built for your company you call up the BI provider and tell them to send some consultants over to build it for you. You might train one person or possibly two at the most to be able to make tech calls with the BI provider. Much easier for a company to say "I want a dashboard that reports X, Y, and Z" and then the BI company scurries off, builds it, brings it back to you, and you hand over some cash for their effort.
Very true. Even with smaller companies, data sets like events inside an app are too much for Excel. Ever since I picked up R, I haven't touched Excel.
Hey, what's the difference between R and a library for Haskell (or even Python)? It's interesting that a language that is limited to a particular use (statistical analysis) has taken off so much instead of a library for a general purpose language.
You helped answer the questions yourself. The fact that it was built for statistical analysis means that the syntax and overall flow is REALLY easy to understand and manipulate data with. I imagine someone who is familiar with Python would gravitate towards Pandas or another library, but for the BI that I do every day, R is perfect.

People love saying that BI/Data Science is 80% cleaning data (which may or may not be true), but I've found R to be the best for cleaning up 100k+ rows at a time.