by elisbce 1070 days ago
The biggest issue I've had with plotting libraries is that they don't work out of the box for millions of data points. Last time I was doing a data science project, I tried all of the major plotting libraries and none of them worked well beyond a few million data points. I wanted a graph that I could view and zoom in and out of in real time, and that became the hard part of the project. Only one product claims to handle it, using GPUs in the cloud, and it needs a paid subscription and requires uploading your data to the cloud. I don't want yet another library, just one that works really well and can use a local GPU for plotting.
10 comments

This one does! https://github.com/wwwtyro/candygraph

Scroll down into the examples for some plots with lots of points: https://wwwtyro.github.io/candygraph/examples/dist/

Or that interactivity, 3D plots and plot styling are kind of half-baked, if supported at all.

Or that they try to emulate the non-intuitive Matlab plotting interface.

Surprisingly, immediate mode graphing libraries work pretty well at this!

https://github.com/epezent/implot

Java: https://github.com/SpaiR/imgui-java

Also for rust: https://www.egui.rs/#Demo (Open Plot demo)

For the web you'd want to compile to WASM. I imagine you could just compile the graphs to WASM and embed them in the existing DOM.

If you use Julia, Makie crushes this use case and comes with great Python interop.

https://github.com/holoviz/datashader is a good one in the Python ecosystem.

That doesn’t sound like it should be that bad. Are you only looking at Python libraries?
Plotly works pretty well (which I suspect you're alluding to) and it works completely offline, no need to upload any data to their cloud.
Not my experience at all. After a couple thousand data points it becomes useless, to the point that it completely freezes a Jupyter notebook on my 64-core Threadripper. Just trying to zoom can take minutes. It's a total joke.
That's odd. Are you sure this is not related to Jupyter? I use plotly.js via a Rust wrapper (https://github.com/igiagkiozis/plotly) and the performance seems ok when generating a static, interactive html. The wrapper language itself should be irrelevant here. Is it the same if you generate a static html-file? (EDIT: I only view the html in a browser as is, no notebooks)

While I can't speak for millions of data points, generating a gyroscope plot with x, y, z, where each gyro axis is 400k+ samples, is fine performance-wise. This is generating a static, interactive HTML file. Zooming etc. is fine on my 13" M1 MacBook Pro: the delay when zooming in this specific case is maybe 0.5 s. The HTML file is 60 MB+.

I might have got a million samples into KST before. It was always extremely fast and has great ergonomics for panning and zooming plots.
At a certain point it would probably help to meaningfully downsample/summarize the data at the larger scales ("semantic zooming"). Then you just aren't plotting as many points.
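To make the downsampling idea concrete, here is a minimal sketch (my own illustration, not taken from any library mentioned in this thread). It splits the visible index range into roughly one bucket per horizontal pixel and keeps only each bucket's minimum and maximum, so spikes survive while the number of plotted points stays bounded. The function name `minmax_downsample` and its parameters are hypothetical.

```python
from typing import List, Sequence, Tuple


def minmax_downsample(
    ys: Sequence[float], start: int, stop: int, max_buckets: int
) -> List[Tuple[int, float]]:
    """Return at most 2 * max_buckets (index, value) pairs for ys[start:stop].

    Hypothetical helper: `start`/`stop` are the visible index range and
    `max_buckets` is roughly the plot width in pixels.
    """
    if max_buckets < 1:
        raise ValueError("max_buckets must be at least 1")
    start = max(0, start)
    stop = min(len(ys), stop)
    n = stop - start
    if n <= 0:
        return []
    # Zoomed in far enough: the raw samples already fit, so return them as-is.
    if n <= 2 * max_buckets:
        return [(i, ys[i]) for i in range(start, stop)]

    points: List[Tuple[int, float]] = []
    for b in range(max_buckets):
        # Integer bucket boundaries; every bucket holds at least 2 samples here.
        lo = start + b * n // max_buckets
        hi = start + (b + 1) * n // max_buckets
        i_min = min(range(lo, hi), key=ys.__getitem__)
        i_max = max(range(lo, hi), key=ys.__getitem__)
        # Keep the extremes in their original order; drop the duplicate
        # when the bucket is flat and both indices coincide.
        for i in sorted({i_min, i_max}):
            points.append((i, ys[i]))
    return points


# Example: a flat signal with a single spike, viewed at two zoom levels.
signal = [0.0] * 1000
signal[500] = 10.0
overview = minmax_downsample(signal, 0, 1000, max_buckets=10)
closeup = minmax_downsample(signal, 490, 510, max_buckets=10)
```

Re-running this on every zoom or pan with the new visible range gives the "semantic zooming" effect: the overview stays cheap to draw, and once the window is small enough the raw samples come back unchanged.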
I too want the moon on a stick, for free
So no criticising open source code then?