| Power BI embeds a quite good columnstore database engine. It is exactly the same database engine as in the SQL Server Analysis Services Tabular model. For the types of queries typical in BI (primarily aggregations that involve lots of table scans), this yields great performance. Non-technically savvy users can import millions of rows of data and explore this with interactive visuals. The performance characteristics of the database engine allow moderately complex logic to remain high-performance into 10s or 100s of millions of records. This is on laptop-spec hardware. 100Ms of records will need enough supporting RAM, but the data is highly compressed, so this fits. The performance scaling allows simple aggregates of single-digit-billions-of-records to remain fast: usually just a couple of seconds. Performance tuning for more complex logic or larger datasets becomes complex. When I talk about the performance characteristics above, I mean for the in-memory database engine. This performance comes out of the box. There are no indices to create or typical database administration. Users set up an import of data and define relationships among tables, and define aggregation logic. The query language is DAX, a functional, relational language built with the goal of making business logic and aggregations simple to write. For simple use cases, users do not even need to write their logic in DAX. The viz layer automatically generated basic aggregates and provides a sophisticated filtering and interactive viz canvas experience. There is also an ETL layer, called Power Query. This has very different performance characteristics, and in general you should try to minimize your work in this tool and its M language, as it has very different semantics. The sweet spot is intra-row logic and basic filtering of data that will be imported into the columnstore database I mentioned above. The same underpinning technology, VertiPaq, supports columnar compression and storage/low level query in other Microsoft products. Columnstore indices in SQL Server also use VertiPaq for compression and storage engine. The data storage layer in Microsoft Fabric utilizes parquet files in delta tables (same tech as Databricks). Part of the VertiPaq engine is a set of compression optimization heuristics that are quite good. They apply this optimization to get 30%-50% compression improvements in the parquet files they generate. They call this feature V-order (v for VertiPaq). The standard compression algorithms in Parquet include dictionary and run-length encoding. The same are used (though obviously with different implementations) in VertiPaq's native storage format. The compression optimization heuristics include some for identifying superior sort orders to help maximize the effect of these styles of compression. V-order is the application of these heuristics to the data before applying compression for the parquet files. To be 100% clear, V-order is a pre-processing step that yields vanilla parquet files, not any sort of extension to the parquet format. V-order is applied automatically in the storage layer for Microsoft Fabric, called OneLake. You may come across documentation for xVelocity; this is an abandoned name for the same technology, VertiPaq. I can talk about this literally all day. If you've got specific followups, feel free to ask here or email me at (any name) at (my username) dot com. |
Power BI is a desktop application for Windows and also a SaaS solution for hosting the solutions you build in the desktop tool.
The SaaS product offering is confusing for the time being due to the introduction of Fabric, which includes Power BI and a whole data stack besides.
Your Power BI power users will need a Windows platform available, though I have seen plenty of use of Parallels or a remote desktop infrastructure for users not using a Windows device.
There are authoring experiences in-browser that are improving and in some cases surpassing what is in Power BI Desktop, but if you're doing a lot of work authoring in Power BI, then the desktop application is where you want to be for now.
For pure querying or just data model development, there are third party tools that are superior to Power BI Desktop (disclosure: I do work part time with one of these third party vendors).