by andybak 1165 days ago
What the hell is BI?
2 comments

I know right? Hilariously bad title, especially considering the I/l ambiguity - a Boys Love novel? Bi-sexual computing? Bachine Learning?

Obviously jargon is a thing, and we all deal with it, but when you put something like this up for a more general audience, you’ve got to give some thought to presentation. Otherwise you end up with apparent word salad.

Yep. I actually guessed it was Business Intelligence before I posted but I play dumb on things like this as a matter of principle.

We shouldn't have to guess.

Business Intelligence
What does that mean or imply?
People inside a company prefer to have financial and operational information to make informed operational decisions or projections.

That data is often scattered across a variety of places (Excel files, a smattering of disconnected databases, sometimes enriched with more data pulled from an API...).

Extracting that data to a common location (historically called a data warehouse) tends to be the work of a data pipeline. The tool used to join, display, filter, dynamically aggregate, and visualize the data has historically been categorized as a "Business Intelligence" tool. These BI tools normally provide data caching and allow temporary integration of multiple data sources.
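To make the pipeline-then-query idea concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the CSV text stands in for a spreadsheet export, the `REGIONS` list stands in for rows pulled from another database or an API, and an in-memory SQLite database plays the role of the warehouse. Real pipelines and BI tools do far more than this.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a spreadsheet exported to CSV (one row per order).
SALES_CSV = """order_id,region_id,amount
1,10,120.0
2,10,80.0
3,20,200.0
"""

# Hypothetical source 2: rows that might come from another database or an API.
REGIONS = [
    {"region_id": 10, "name": "North"},
    {"region_id": 20, "name": "South"},
]


def build_warehouse():
    """Extract both sources and load them into one queryable store."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (order_id INTEGER, region_id INTEGER, amount REAL)"
    )
    conn.execute("CREATE TABLE regions (region_id INTEGER, name TEXT)")

    # Parse the CSV text and convert each field to its proper type.
    reader = csv.DictReader(io.StringIO(SALES_CSV))
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            (int(r["order_id"]), int(r["region_id"]), float(r["amount"]))
            for r in reader
        ],
    )
    conn.executemany("INSERT INTO regions VALUES (:region_id, :name)", REGIONS)
    conn.commit()
    return conn


def revenue_by_region(conn):
    """A join-and-aggregate query of the kind a BI tool might run for a chart."""
    cursor = conn.execute(
        "SELECT r.name, SUM(s.amount) "
        "FROM sales s JOIN regions r ON r.region_id = s.region_id "
        "GROUP BY r.name ORDER BY r.name"
    )
    return cursor.fetchall()
```

Calling `revenue_by_region(build_warehouse())` returns one total per region. The point is only that once both sources sit in one place, the join and the aggregation become a single query.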

The intent is to make it easy for business users to explore, analyze, visualize, share, and present datasets or results.

The biggest examples off the top of my head would be Power BI, Tableau, and Apache Superset, but "data reporting tool" is a competitive market with many entrants.

Sounds like essentially consolidating data in a database and visualizing it. Is there a difference between a database and a "data warehouse"?
A data warehouse is a specific use-case for a database.

> Sounds like essentially consolidating data in a database and visualizing it

Yes, and the database in which you consolidate the data is called a "data warehouse". In many cases the sources of data are ... not optimal for querying. (By "not optimal" I mean something like "spread across a dozen Excel spreadsheets that people e-mail out new versions of when they make a change".)

Consolidating data makes sense and taking it from sources you can't query makes sense, but once you put it into a source you can query, that sounds like a database. I'm not sure why there is a different term here.
Yes, they have fundamentally different architectures to serve their respective use cases.
What is the different architecture and what is the different use case? Databases are already general tools. You put data in and query it. At what point does it become a "data warehouse"?
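One distinction that often comes up, and it is an assumption that this is what was meant above since nobody in the thread spells it out, is storage layout. Transactional databases tend to keep each record together (row-oriented), while many analytical warehouses keep each column together (column-oriented). Not every warehouse is columnar. The toy Python sketch below uses made-up data and hypothetical helper names to show why the two layouts favor different queries.

```python
# Row-oriented layout: all fields of one record sit together.
# Good for "fetch or update this one order".
ROWS = [
    {"order_id": 1, "region": "North", "amount": 120.0},
    {"order_id": 2, "region": "North", "amount": 80.0},
    {"order_id": 3, "region": "South", "amount": 200.0},
]


def to_columns(rows):
    """Pivot records into a column-oriented layout: one list per field."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns


def fetch_order(rows, order_id):
    """Transactional-style lookup: returns one complete record, or None."""
    for row in rows:
        if row["order_id"] == order_id:
            return row
    return None


def total_amount(columns):
    """Analytical-style scan: reads only the 'amount' column."""
    return sum(columns["amount"])


COLUMNS = to_columns(ROWS)
```

With the row layout, `fetch_order(ROWS, 2)` finds everything about one order in one place. With the column layout, `total_amount(COLUMNS)` never has to look at `order_id` or `region` at all, which is the kind of saving that matters when a report aggregates millions of rows.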