Hacker News
by rparrish 3981 days ago
I’ll say upfront that I’m a Product Manager at Treasure Data, and we market to Data Scientists, specifically to enable you to perform analysis on large datasets directly from your local machine. More generally, Treasure Data enables the collection, storage & analytics of large-scale event data.

For preliminary analytics, I’ll agree with what the previous respondents have said: IPython Notebook is a GREAT tool. It’s certainly my go-to. The libraries I reach for in this context are as follows:

The go-to packages:
> http://ggplot2.org/ (R)
> http://matplotlib.org/ (Python)

Graph visualizations:
> http://gephi.github.io/
> http://neo4j.com/

Online dashboards:
> https://github.com/stitchfix/pyxley (<- I’m particularly excited to try this out)
> http://bokeh.pydata.org/

Of course, the challenge is that you don’t have a static dataset! New data is continuously coming in, so your dataset is growing larger all the time. It may be too large to fit on your local machine.
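To make that concrete, here is a minimal sketch of the usual workaround: rather than loading everything, fold incoming events into a running aggregate one batch at a time. This is only an illustration using the Python standard library. The `batches` and `update_counts` helpers, the `event_type` field and the sample events are all made up for the example and are not part of any Treasure Data API.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List


def batches(events: Iterable[Dict[str, str]], size: int) -> Iterator[List[Dict[str, str]]]:
    """Yield fixed-size batches so only `size` events are held in memory at once."""
    batch: List[Dict[str, str]] = []
    for event in events:
        batch.append(event)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly shorter batch
        yield batch


def update_counts(totals: Counter, batch: List[Dict[str, str]], key: str = "event_type") -> Counter:
    """Fold one batch into the running per-key counts, in place."""
    totals.update(event[key] for event in batch)
    return totals


# Hypothetical event stream; in practice this would be read lazily
# from a log file or a query result instead of a list in memory.
sample_events = [
    {"event_type": "click"},
    {"event_type": "view"},
    {"event_type": "click"},
    {"event_type": "purchase"},
    {"event_type": "view"},
]

totals: Counter = Counter()
for batch in batches(sample_events, size=2):
    update_counts(totals, batch)

print(dict(totals))
```

The running `totals` is small enough to hand to whichever plotting library you prefer, and it can keep being updated as new batches arrive.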

That’s why Treasure Data was founded: to enable the easy collection of, and analytics on, this type of data stream. Treasure Data removes the engineering & devops work from these collection & storage steps.

For example:
> Want a continuously updated dashboard of your incoming data? = Treasure Data + Jupyter Notebooks + Pyxley
> Want to perform graph visualizations on event data? = Treasure Data + Jupyter Notebooks + Neo4j
> Want to create visualizations in R? = Treasure Data + R + ggplot2

The above is enabled through Treasure Data’s integration with Pandas & R (http://docs.treasuredata.com/articles/jupyter-pandas).

Good luck in your work!

2 comments

Disclaimer: I'm a co-founder of Linkurious. Our products are used by data scientists and less-experienced users.

We have an open-source graph visualization JS toolkit called linkurious.js: https://github.com/Linkurious/linkurious.js

We also offer a commercial product to visualize graphs stored in Neo4j: https://linkurio.us/product/

Linkurious CTO here :)

Our product connects to Neo4j databases and enables full-text search, visualization design (select nodes/edges, apply size/color to nodes/edges according to properties) with a point-and-click interface.

Try it out: http://linkurio.us/demo/ (the "create account" button generates a temporary login/password for the demo)