by amrrs 2708 days ago
Absolutely right, and something we see in data science too. Coding is a domain where we work cross-architecture. I can't just be a useful Python programmer - I can only be a useful Python programmer in marketing automation or some business associated with it.

The fat package makes programmers not realize this.

3 comments

As a data scientist who considers himself moderately good with the pydata library set, I realized the other day that if you took those 5 libraries away from me, I'm not sure I'd have that much to offer in terms of programming ability. I don't necessarily feel bad about that, although it did give me pause.
I’ve known people who built lucrative careers on expertise in specialised technologies, and others whose entire careers disappeared because they were too specialised and the technology got deprecated. You might be fine, but I’d definitely recommend diversifying your skill set a bit. Being a generalist has served me well.
I would say every capable programmer should be a "generalist" in a sense and be able to transfer their skills with relatively little effort. If they can't, they are not really that good at the abstract concepts. Still, this doesn't prevent anyone from focusing and specializing on the one field or language they love and becoming as familiar with it as possible.
the libraries have grown nicely over the years, though I am surprised they still lack some basic stuff.

why in 2019 is there no good way to visualize a decision tree? having to install, configure, and get graphviz working feels very hokey.

I prefer python, but R still does some things really well where the python libraries are just not up to par:

1) anything geospatial, including drawing maps. here is a list of projects my students did to give examples: https://pennmusa.github.io/MUSA_801.io/

2) time series

3) linear models, is it so hard to give me a good summary?

If anyone knows of any packages that do these >= R, I'd love to see them :)

> why in 2019 is there no good way to visualize a decision tree?

You should try https://github.com/parrt/dtreeviz. From Terence Parr and Prince Grover, released in Sept 2018.

There is a good background article on the problem space and their design iterations here: https://explained.ai/decision-tree-viz/index.html

thanks for the info, but as I said above, anything that relies on installing graphviz is extremely hokey, and as you can see from many posts on SO, it often doesn't work.
I came to the pydata libraries after an extended time in Matlab/Octave. Learning there first was terrible for programming basics (e.g. polymorphism) but was excellent for ensuring I knew what a given algorithm does/should do.
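For anyone who just wants to inspect a tree without graphviz, plain-text rendering is one dependency-free option. Below is a minimal sketch, not anyone's library API: the nested-tuple tree format, the `render_tree` helper, and the toy iris-style thresholds are all made up for illustration. (scikit-learn 0.21+ also ships `sklearn.tree.export_text` and `sklearn.tree.plot_tree`, neither of which needs graphviz.)

```python
# Hypothetical tree format, invented for this sketch:
#   leaf          -> a class label (str)
#   internal node -> (feature_name, threshold, left_subtree, right_subtree)
def render_tree(node, indent=""):
    """Return the tree as a list of indented text lines."""
    if isinstance(node, str):
        return [f"{indent}class: {node}"]
    feature, threshold, left, right = node
    lines = [f"{indent}{feature} <= {threshold}"]
    lines += render_tree(left, indent + "|   ")
    lines.append(f"{indent}{feature} > {threshold}")
    lines += render_tree(right, indent + "|   ")
    return lines


# Toy tree with illustrative (not fitted) thresholds.
toy_tree = (
    "petal_length", 2.45,
    "setosa",
    ("petal_width", 1.75, "versicolor", "virginica"),
)

print("\n".join(render_tree(toy_tree)))
```

It is obviously no substitute for a real plot, but it works anywhere Python runs and is easy to paste into a notebook or a log.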

I highly recommend spending some time in C/C++, Go, or Julia to pydata-first folks that ask.

As a data scientist myself, I learned to write code that used arrays and matrices from the most basic library, for everything from cleaning to analysis to machine learning (I suppose numpy was used for ML). Is this the most correct way to write code? I don't know, but learning to do DS without specific Python libraries has improved my coding ability. Aside from reading in different data formats, I believe I could do a fair portion of my work in C.
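To give a flavour of what "without specific Python libraries" can look like, here is a minimal sketch of a one-variable least-squares fit using only built-in lists and arithmetic. The `fit_line` name and the sample data are made up for illustration; this is not a replacement for a proper stats package.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x, plain Python only."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squares of x and the x-y cross term, both centred on the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - residual sum of squares / total sum of squares.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Made-up sample data, roughly y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept, r_squared = fit_line(xs, ys)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r_squared:.4f}")
```

Every step is a loop over a list, so the same logic ports to C almost line for line.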
What is the pydata library set? numpy, pandas, matplotlib, sklearn and the like? My google fu is failing me...
Yeah pretty much.
Data scientist is a term similar to Web Master; I don’t think it will be a thing in 10 years.
Someone has to make the tools that are used to automate the marketing, analyze the business data, or whatever the task at hand is.