by rpowers 1470 days ago
The thing I hate about all of these GoG approaches is how wasteful the translation of data + style into visual representations is. For example, if you have a dashboard with 8+ charts visible, the scaffolding of the charting library starts to weigh down the system in both performance and memory usage. Vega-Lite, especially, seems to make a copy of the data being passed in. Looking at the examples for Observable Plot, I can see more wasteful processing in the form of dataset.map(d => d.property) sprinkled in several places.
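To make the dataset.map(d => d.property) complaint concrete, here is a minimal sketch, not taken from either library's source. It contrasts extracting a column into a new array with reading values through an accessor in a single pass. The sample rows and the function names are hypothetical.

```javascript
// Hypothetical sample rows; the field name `value` is an assumption.
const dataset = [
  { label: "a", value: 3 },
  { label: "b", value: 9 },
  { label: "c", value: 1 },
];

// Copying approach: allocates an intermediate array as long as the dataset
// every time a channel (x, y, color, ...) is extracted.
function extentWithCopy(rows, accessor) {
  const column = rows.map(accessor);
  return [Math.min(...column), Math.max(...column)];
}

// Single-pass approach: reads each value through the accessor and keeps
// only the running min and max, so no per-channel array is created.
function extentInPlace(rows, accessor) {
  let min = Infinity;
  let max = -Infinity;
  for (const row of rows) {
    const v = accessor(row);
    if (v < min) min = v;
    if (v > max) max = v;
  }
  return [min, max];
}
```

Both functions return the same [min, max] pair. With 8+ charts on one page, the difference is how many short-lived arrays are allocated along the way.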

This has nothing to do with the GoG.

This applies to any charting library that forces you to provide both the spec and the unaggregated data to memory- and CPU-constrained clients (e.g. JavaScript in the browser). This is done for implementation simplicity (Vega, for example), but obviously doesn't scale to larger datasets.

I've implemented a system where the data part of the spec is munged in-database, and aggregated data is provided to the browser, along with hints for axes, scales, legends, etc. It requires a part of the GoG interpreter to be resident on the server-side.
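A rough sketch of what such a server-side step could look like, under stated assumptions: the function name, the payload shape, and the hint fields are all hypothetical, and plain JavaScript stands in for the in-database aggregation (in practice the binning would more likely be a GROUP BY in the query).

```javascript
// Hypothetical server-side step: bin the raw rows and return only the bins
// plus hints for the client's axes and scales. Assumes `rows` is non-empty
// and that row[field] is numeric.
function aggregateForChart(rows, field, binCount) {
  let min = Infinity;
  let max = -Infinity;
  for (const row of rows) {
    const v = row[field];
    if (v < min) min = v;
    if (v > max) max = v;
  }
  // Fall back to a width of 1 when every value is identical.
  const width = (max - min) / binCount || 1;
  const counts = new Array(binCount).fill(0);
  for (const row of rows) {
    // Clamp so the maximum value lands in the last bin.
    const i = Math.min(binCount - 1, Math.floor((row[field] - min) / width));
    counts[i] += 1;
  }
  return {
    // Aggregated data: binCount entries, regardless of how many rows came in.
    data: counts.map((count, i) => ({
      x0: min + i * width,
      x1: min + (i + 1) * width,
      count,
    })),
    // Hints the browser can use without ever seeing the raw rows.
    hints: {
      x: { type: "linear", domain: [min, max] },
      y: { type: "linear", domain: [0, Math.max(...counts)] },
    },
  };
}
```

The browser then receives a payload whose size depends on the number of bins, not on the number of rows in the table.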

That sounds very similar to VizQL (the visualization/data processing library underlying Tableau). That has been my big complaint about most visualization libraries - there is no sharing of the underlying data set for multiple projections across the same large data set. Grid/table libraries have the same issues.
Yes. Tableau would have to separate rendering from data select/filter/aggregation, especially because integrating with customer databases live is a key use case. Hence the built-in buffet of connectors/drivers.

It looks like with later versions they switched to kind of a hybrid approach (part-remote, part-local) with Hyper to reduce latency for interactivity.

> there is no sharing of the underlying data set for multiple projections across the same large data set

But that would require some kind of open standard for portability, no?

> But that would require some kind of open standard for portability, no?

I like the approach AG Grid uses - they provide a viewport-based interface that the grid uses to display data, and you can implement that interface on top of your data model - https://www.ag-grid.com/javascript-data-grid/viewport/. Unfortunately it's only available in their enterprise version, but this approach scales to both grid- and chart-based UIs. D3 has a bit of that flavor as well, since you can map visual attributes onto your underlying data any way you'd like.
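The general idea can be sketched roughly as follows. This is not AG Grid's actual viewport API: the class, its method names, and renderVisible are hypothetical and only illustrate a view asking the data model for the row range it currently shows.

```javascript
// Hypothetical viewport contract (not AG Grid's real interface): the view
// asks for a row range and the data model decides how to satisfy it.
class ArrayViewportSource {
  constructor(rows) {
    this.rows = rows;
  }

  rowCount() {
    return this.rows.length;
  }

  // Returns only the rows in [first, last]. A real model could fetch this
  // range from a server or a database cursor instead of a local array.
  getRows(first, last) {
    return this.rows.slice(first, last + 1);
  }
}

// A grid and a chart can share one source; each requests only what is
// visible and applies its own formatting or visual mapping.
function renderVisible(source, first, last, format) {
  const end = Math.min(last, source.rowCount() - 1);
  return source.getRows(first, end).map(format);
}
```

For example, a grid could call renderVisible(source, 10, 12, r => r.id) while a chart calls it with a different range and mapper, both reading from the same underlying rows.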

I was assuming you were referring to a declarative approach in your previous message.

The AG Grid approach makes sense if data and vis need to be wired together programmatically.

I didn't know GoG existed when I wrote up a couple of tutorials[1][2] on how to go about building a (very, very simple) charting tool[3] on top of my canvas library. I'm going to have to re-assess those lessons, and add some links to other guides, now that I know about it.

Luckily for me, the main purpose of the lessons was not so much about how to build a charting tool, but rather concentrated on how to break the code into modules in the hope that some of the modules could be reused in other, similar projects.

If I'm making obvious mistakes in the approach, or code, that I set out in the lessons then feedback is always welcome so corrections/improvements can be made to them!

[1] - Building the chart frame, code management, etc - https://scrawl-v8.rikweb.org.uk/learn/eighth-lesson/

[2] - Generate bar charts and line charts from crime data - https://scrawl-v8.rikweb.org.uk/learn/ninth-lesson/

[3] - demo of the final code - https://scrawl-v8.rikweb.org.uk/demo/modules-001.html

Here's a ~4K line implementation of a GoG subset that you might find useful:

https://github.com/h2oai/lightning/blob/master/src/lightning...

It's used in H2O: https://github.com/h2oai/h2o-3

As the other guy mentioned, this has nothing to do with GoG. A good data language or library should provide the user (and plotting libraries) with copyless, cheap and immutable slices of the data being handled. JavaScript just doesn't really have one. It shouldn't be the concern of the plotting library, however.
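For reference, the closest built-in that comes to mind is a typed-array view, which covers "copyless" and "cheap" but not "immutable". A minimal sketch with made-up values:

```javascript
// Made-up numeric column stored in a typed array.
const values = new Float64Array([1, 2, 3, 4, 5, 6]);

// subarray() creates a new view over the same ArrayBuffer, so no elements
// are copied. This view covers indices 2, 3 and 4 of `values`.
const view = values.subarray(2, 5);

// Both views are backed by the same memory.
const sharesBuffer = view.buffer === values.buffer;

// The view is not immutable: a write through the original array is visible
// through the slice (and vice versa).
values[2] = 30;
const seenThroughView = view[0];
```

Here sharesBuffer is true and seenThroughView is 30. It also only works for flat numeric data, not for arrays of row objects, which is what most plotting libraries are handed.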