by trevz
3045 days ago
A couple of thoughts, off the top of my head:
Programming languages:
- Python (for general-purpose programming)
- R (for statistics)
- Bash (for cleaning up files)
- SQL (for querying databases)
Tools:
- pandas (for Python)
- RStudio (for R)
- Postgres (for SQL)
- Excel (the format your customers will want ;-) )
Libraries:
- SciPy (ecosystem for scientific computing)
- NLTK (for natural language)
- D3.js (for rendering results online)
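To give a rough feel for how the Python and SQL items above fit together, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for Postgres so that it runs without a database server; the `orders` table, its columns, and the sample rows are all made up for illustration.

```python
import sqlite3

# Hypothetical sample data; the table and column names are invented for this sketch.
ROWS = [
    ("alice", "books", 12.50),
    ("bob", "books", 7.25),
    ("alice", "games", 30.00),
]


def total_by_customer(rows):
    """Load rows into an in-memory SQLite table and aggregate them with SQL."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE orders (customer TEXT, category TEXT, amount REAL)"
        )
        # Parameterized inserts keep the data separate from the SQL text.
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT customer, SUM(amount) FROM orders "
            "GROUP BY customer ORDER BY customer"
        )
        return cursor.fetchall()
    finally:
        conn.close()


totals = total_by_customer(ROWS)
print(totals)
```

The same query should carry over to Postgres more or less unchanged, though the connection code and the parameter placeholder style would differ depending on the driver you pick.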
|
|
It is worth understanding the concepts behind NumPy and pandas. Also try out IPython/Jupyter, especially for rapid publishing (some people run their blogs on Jupyter notebooks).
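As a rough illustration of what "the concepts" might mean here, the sketch below shows two of them: NumPy's whole-array (vectorized) arithmetic and pandas' labeled tables with group-wise aggregation. The city names and temperatures are invented sample values, and picking these two ideas as the core ones is my own assumption.

```python
import numpy as np
import pandas as pd

# NumPy: arithmetic applies to the whole array at once, with no explicit loop.
temps_c = np.array([20.0, 25.0, 30.0])
temps_f = temps_c * 9 / 5 + 32

# pandas: a DataFrame is a table with labeled columns. groupby splits the rows
# by a key column and aggregates each group separately.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Oslo", "Rome"],  # made-up sample data
        "temp_c": [10.0, 14.0, 24.0],
    }
)
mean_by_city = df.groupby("city")["temp_c"].mean()

print(temps_f)
print(mean_by_city)
```

Pasting this into a Jupyter cell is a quick way to poke at the intermediate objects interactively.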
I think which libraries you need depends very much on where you focus. Machine learning? Natural language processing? Visualization? Something in economics? Fundamental sciences? For instance, I never need NLTK in theoretical astrophysics ;-) Instead, I need powerful GPU-based visualization, which is still very old school, with VTK and VisIt/Amira/ParaView (also very Pythonic).