by joppy 1913 days ago
The first things I would look for in a data science language are multidimensional arrays, linear algebra packages, data frame and time series libraries ... none of which feature on this page.
5 comments

Yeah, I'm confused. The only "data science" I can see here is in the title.

How is list comprehension a data science primitive? How did this get over 4,000 stars on GitHub with a glaring lack of basic data science functionality? Is this used by actual practitioners?

GitHub stars are bookmarks for me, not an indicator of usefulness.

It does say it’s under heavy development.

Maybe 4.3k+ GitHub users just want to make sure they get updates?

I wish HN had a way to save a story without upvoting it or showing it publicly on your profile (the "favorite" feature implemented right now), like Reddit's "Save". Many times I'm interested in something to check out later but it's not something worth upvoting (like this story, based on other comments) and I want my interests to stay private.
This issue is better solved externally - using a bookmark manager. It would allow you to have all your "read later" links in one place rather than being scattered over different websites. Personally I use Safari's reading list feature for that.
Quite a few people were unhappy when Twitter renamed "favorite" to "like" because they had used it as a bookmark and did not want to imply advocacy. Seems like both intents could be supported fairly easily.
I recommend Instapaper or Pocket. They’re cheap, but worth it
Right next to Star is Watch, which would be much better suited to that, no?
No, Watch emails you a bunch. Stars just show up in a list so you can find it later. That being said, public bookmarks always seemed weird to me. Why not just actually bookmark it with your browser? Not that it matters.
If you are using the GitHub app, it's less friction to star it than to open the page in a browser window and bookmark it.
What's the benefit of using the app?
GitHub added "custom events" for Watch. You can, for example, watch only new releases. Maybe check it out!
That's still not the same as stars, it still emails you.
If you've ever tried to use Watch as a bookmark, I feel like it's obvious why that is not a good solution
You may want to revise your judgement: GitHub added support for watching "custom events" such as new issues, new PRs, and new releases. You might want to try again.
It's a China-based GitHub project; the stars are mostly hype.
Thank you! I've seen this language/extension/library pop up a few times and I don't see, even remotely, how it could displace the Python data science stack. The biggest competitor to Python in this space, IMO, is Julia. Go+ seems light-years behind, and heading in the wrong direction entirely.
R is the competitor. Actually, many things in the Python data stack are directly copied from R: seaborn's ~ operator, dataframe, ...
My point is that I enjoy the Python stack, and I'm seriously considering Julia on future projects; I'm not giving R the same consideration. Python vs. R is almost a matter of taste IMO. I vastly prefer Python to R for data science. That's not to throw shade at R. Like you suggested, the Python stack owes R everything.
Apart from the Gonum[1] numerical libraries, I haven't found Go libraries specific to data science while searching for some hobby projects, at least nothing comparable to the Python ecosystem.

Interestingly, Prose[2], a Go library for text processing, yielded better results for named-entity extraction than NLTK in my tests, both in accuracy and, unsurprisingly, in performance.

Perhaps Go is not being applied enough in data science/ML, and for the fields where it is applied (networking), the math in the standard library seems to be sufficient.

[1] https://github.com/gonum/gonum

[2] https://github.com/jdkato/prose

Yea, my list would also be:

- ndim arrays with broadcasting

- time series

- plotting

- linalg: blas/mkl

- storage - hdf5, zarr, arrow, parquet, netcdf

I don't see any of those in Go+ either.
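
For anyone unsure what the first item means: broadcasting lets a smaller array combine with a larger one without writing the loops by hand, e.g. adding one row to every row of a matrix. Below is a minimal sketch of that single case in plain Go, using only the standard library. `addRow` is a hypothetical helper made up for illustration; it is not part of Go, Go+, or Gonum.

```go
package main

import "fmt"

// addRow adds a 1-D row to every row of a 2-D matrix. This is the
// effect of broadcasting a shape (cols) array against a shape
// (rows, cols) array. Hypothetical helper, for illustration only.
func addRow(m [][]float64, row []float64) ([][]float64, error) {
	out := make([][]float64, len(m))
	for i, r := range m {
		// Shapes must line up along the shared (column) axis.
		if len(r) != len(row) {
			return nil, fmt.Errorf("row %d has %d columns, want %d", i, len(r), len(row))
		}
		out[i] = make([]float64, len(r))
		for j, v := range r {
			out[i][j] = v + row[j]
		}
	}
	return out, nil
}

func main() {
	m := [][]float64{{1, 2, 3}, {4, 5, 6}}
	res, err := addRow(m, []float64{10, 20, 30})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(res) // prints [[11 22 33] [14 25 36]]
}
```

An array library with real broadcasting generalizes this to any number of dimensions and any elementwise operation, which is why it matters as a primitive.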

Seems like Julia could do all of those things.
I might contribute a feature towards this, specifically a time series lib.