by elymspears 4333 days ago
Location: Boston, MA

Remote: OK (I even have probably-suitable computers already if Linux is OK for the job).

Willing to relocate: Not before mid-to-late 2015. After that it is possible, depending on the destination.

Technologies: I separate them into computation and statistics/math below.

Computational: Python and many associated scientific and data technologies, Haskell, some C/C++, MySQL, Postgres, MSSQL, Vertica, Riak, Redis, relational database theory and algorithms, git, mercurial, and various flavors of Linux. I have experience with (but a strong dislike of) the following: MATLAB, Stata, SAS, Excel, and development on Windows. I care a lot about software best practices and sound-but-not-overkill software architecture choices.

Statistics/Math: Advanced degrees in applied math (Brown) and engineering (Harvard), with heavy focus on machine learning, Bayesian statistics, Monte Carlo and sampling-based strategies in scientific computing (MCMC, hybrid Monte Carlo, simulated annealing), GPU programming, distributed computing with 0MQ and MPI, and limited experience with the MapReduce paradigm. I've also done three years of Ph.D. coursework in probability theory, econometrics, Bayesian statistics, real analysis, and statistical modeling. I left a Ph.D. program early to pursue more immediate applied work instead of academia. I have experience with, but a strong dislike of, classical frequentist / hypothesis-testing methods, especially in econometrics (things like panel data analysis, Fama-MacBeth regression, seemingly unrelated regression, etc.). I feel strongly that setting up automated frameworks to crank out datamined hypotheses that satisfy arbitrary statistical significance levels is shoddy work that I cannot feel proud of, and so I can't be happy in any job that requires this kind of statistical work. I am happy, however, to work extra hard in those kinds of jobs to solve the same business problems with more robust Bayesian and machine learning methods, which is work I can feel proud of when I do it well. I am also happy to share lots of academic and pragmatic resources outlining the overwhelming argument for why those sorts of frameworks are unacceptably bad for statistical practice.

Resume: please ask by email

Email: spearsem a t g m a i l

About: I am looking for work doing machine learning to solve business problems. I have experience as a research analyst at a quantitative equity asset management firm, a radar data engineer at a defense lab, and also developing open source analytics tools for clients at a technology consulting start-up. I have done graduate work in machine learning, computer vision, and artificial intelligence. I enjoy working with data infrastructure technologies and I really love software craftsmanship. My preferred languages are Python and Haskell, but as long as high quality supporting infrastructure is available (Linux, distributed version control) I enjoy working in any language and learning new technologies.

I am looking to avoid jobs that are centrally focused on the stewardship and devops aspects of big data, such as database curation and administration, unstructured software tool making for servicing ad hoc analytics requests, and similar activities. I struggle with the current job market because many positions that purport to be focused on "data science" are really solely data stewardship -- and sometimes even worse, when they also require focusing on devops technologies. In those cases it's more like being a "data secretary" than a data scientist, and these jobs don't allow for much autonomy, creativity, or genuine use of math and statistics.

I am looking for roles where I will be primarily responsible for modeling, forecasting, and rendering computation of business solutions efficient and parsimonious. To the extent that database admin or ad hoc software dev are side tasks that I do occasionally in the service of major modeling tasks, I am happy to do them and enjoy it. But when a job becomes centrally focused on only the "stewardship" and "operations" aspects of data, lacking any need for creative mathematical and statistical modeling choices, it becomes a job that is not acceptable for me.

I am very skeptical about joining start-ups after a very bad working experience at one, and several unpleasant late-stage interviews/offer negotiations with start-ups. I won't categorically rule out the possibility, but I find most investors and founders expect recruits to just intrinsically feel motivated to take the job, rather than actually attempting to put numbers on the risks and suss out whether a start-up offer is truly quantitatively compelling. I'm a skeptical person: that's just my temperament. It's nothing personal, but I won't feel invested in any business idea unless it is quantitatively justified according to features of the job itself and the job offer.

Note: I don't have much available on GitHub due to pretty draconian IP rules at my last two jobs, but here is a link to my careers page at Stack Overflow: < http://careers.stackoverflow.com/ely > and from there you can see Q&A from my SO account and my accounts at the math and statistics sites too.