by secant 3154 days ago
What system could governments use to replace SAS if they were so minded?
2 comments

Used to work for a federal healthcare contractor, primarily in SAS. File conversion was a pain, but the same jobs would generally run roughly an order of magnitude faster in Stata.
It helps a lot to think of SAS as a set of tools, and then recognise what provides equivalent capabilities.

The DATA step follows conventions very similar to awk, though what SAS offers by way of data conversion (especially from mainframe formats) is hard to provide. The fundamental concept of iterating over the input stream is useful to keep in mind.
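To make the "iterate over the input stream" idea concrete, here is a minimal sketch in Python rather than awk or SAS. The sample data, column names, and the `data_step` function are all made up for illustration; the point is only the shape: read one record, optionally drop it, derive a variable, emit it.

```python
import csv
import io

# Hypothetical sample input standing in for a real delimited file.
RAW = """id,weight_kg,height_m
1,70,1.75
2,82,1.80
3,,1.60
"""

def data_step(lines):
    """Process one record at a time, roughly like a DATA step or an awk rule."""
    for row in csv.DictReader(lines):
        if not row["weight_kg"]:          # skip incomplete records (like a subsetting IF)
            continue
        weight = float(row["weight_kg"])
        height = float(row["height_m"])
        row["bmi"] = round(weight / height ** 2, 1)  # derived variable
        yield row                          # emit the record (like an implicit OUTPUT)

records = list(data_step(io.StringIO(RAW)))
for rec in records:
    print(rec["id"], rec["bmi"])
```

Because the function is a generator, nothing is held in memory beyond the current record until the caller decides to collect the results, which is the property that makes this style workable on large inputs.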

I've also found that awk is useful for writing SAS programs themselves. I bumped into this dealing with large data dictionaries and trying to make sense of them. Parsing those and generating the corresponding SAS statements, then seeing if the results made sense was far easier than coding by hand. (The dictionary, of course, failed to correspond entirely to the actual datasets, requiring mods, but the dev/test/modify cycle was far faster, and far more repeatable.)
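A rough sketch of that dictionary-to-SAS approach, written in Python instead of awk. The dictionary layout here (name, type, start column, end column, whitespace-separated) is an assumption for the example; real data dictionaries vary and usually need some cleanup first. The output is a column-style INPUT statement.

```python
# Hypothetical data dictionary: variable name, type, start column, end column.
DICTIONARY = """\
patient_id  char  1   8
sex         char  9   9
age         num   10  12
"""

def sas_input_statement(dictionary_text):
    """Build a SAS column-input INPUT statement from dictionary lines."""
    specs = []
    for line in dictionary_text.splitlines():
        if not line.strip():               # ignore blank lines
            continue
        name, kind, start, end = line.split()
        dollar = "$ " if kind == "char" else ""   # $ marks a character variable
        specs.append(f"    {name} {dollar}{int(start)}-{int(end)}")
    return "INPUT\n" + "\n".join(specs) + "\n;"

statement = sas_input_statement(DICTIONARY)
print(statement)
```

When the dictionary turns out to be wrong about the actual data, you fix the dictionary (or the parsing rule) and regenerate, rather than hand-editing a long INPUT statement.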

For data storage, an RDBMS backend or SQLite is probably good, though you can also use various structured files (CSV, other delimited, column-formatted, etc.). Columnar + compression buys you many of the advantages of a SAS data set in terms of size.
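As a small illustration of the SQLite option, here is a sketch using Python's standard `sqlite3` module. The `claims` table, its columns, and the rows are invented for the example, and an in-memory database stands in for a real file.

```python
import sqlite3

# Hypothetical records: (provider, year, amount).
rows = [("A", 2019, 120.0), ("A", 2020, 135.5), ("B", 2019, 80.0)]

conn = sqlite3.connect(":memory:")   # use a file path instead for persistent storage
conn.execute("CREATE TABLE claims (provider TEXT, year INTEGER, amount REAL)")
conn.executemany("INSERT INTO claims VALUES (?, ?, ?)", rows)

# A grouped summary of the kind you would otherwise write in SQL inside SAS.
totals = conn.execute(
    "SELECT provider, SUM(amount) FROM claims GROUP BY provider ORDER BY provider"
).fetchall()
conn.close()

print(totals)
```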

For the various statistical and graphics capabilities: R, gnuplot, the JS plotting library, and some related bits. I'd really like to see what tools for generating dynamic SVG there are these days, as that's a graphics format that seems exquisitely suited to data-driven rendering.
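On the SVG point: since SVG is just text, data-driven rendering can be as simple as string formatting. This is a toy sketch, not a recommendation of any particular tool; the `bar_chart_svg` function, its parameters, and the data are all hypothetical, and it assumes at least one positive value.

```python
from xml.sax.saxutils import escape

def bar_chart_svg(data, bar_width=40, gap=10, height=100):
    """Render (label, value) pairs as a bare-bones SVG bar chart string."""
    peak = max(value for _, value in data)      # tallest bar fills the full height
    parts = []
    for i, (label, value) in enumerate(data):
        bar_h = height * value / peak
        x = i * (bar_width + gap)
        # SVG's y axis points down, so a bar starts at (height - bar_h).
        parts.append(
            f'<rect x="{x}" y="{height - bar_h:.1f}" '
            f'width="{bar_width}" height="{bar_h:.1f}">'
            f"<title>{escape(label)}</title></rect>"
        )
    total_width = len(data) * (bar_width + gap)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{total_width}" height="{height}">' + "".join(parts) + "</svg>"
    )

svg = bar_chart_svg([("2019", 120), ("2020", 60)])
print(svg)
```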

For advanced quantitative programming: Python or related languages and libraries.

For report generation: these days I'd probably head to a lightweight markup language and Pandoc to create whatever format(s) I wanted. Or you could wire up dynamic Web output with the application engine of your choice.
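A sketch of the markup-plus-Pandoc route: generate Markdown from the data, then let Pandoc handle the output format. The `markdown_table` helper, the report title, and the figures are made up for the example.

```python
def markdown_table(headers, rows):
    """Format headers and rows as a Markdown pipe table."""
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)

# Hypothetical summary data for the report body.
summary = [("A", 255.5), ("B", 80.0)]
report = "# Claims summary\n\n" + markdown_table(["Provider", "Total"], summary) + "\n"

print(report)
```

Saved to a file such as report.md, this could then be converted with something like `pandoc report.md -o report.html`, swapping the output extension for whatever format is wanted.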

For application design or creating commandline / back-end tools: whatever tools you prefer, ranging from scripting languages to compiled languages. It's been a long time since I've worked with SAS, but its macro language and its app development language (which I can't even remember the name of now) are both quite crufty.

The key advantage to SAS, as I noted in an earlier comment, is that many of the tool choices are made for you: you use what SAS offers. You can go outside that set, though back in the day, few shops really seemed to be much interested in that. The problem is that the tool choices are limited, and any additional tools have significant costs.

Going with free/open options liberates you, but also means you've got to litigate the tool-choice battle. That seems to be a problem mostly at shops that continue to use SAS in part -- I don't know if it's sunk-cost fallacy or other dynamics, but there's quite often resistance at both management and developer/analyst levels to going to other tools (or had been in my experience). I found that and other dynamics sufficiently frustrating that I largely stopped using the tool decades ago, with occasional (and regrettable) relapses.