by bkj123 5681 days ago
Regarding R's processing power, I haven't found it to be an issue. When building and testing a model, I work from a sample of the data, usually fewer than 100,000 observations. I sample even when using a tool like SAS Enterprise Miner.
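
For anyone curious what sampling down to a fixed size looks like when the full data doesn't fit in memory, here is a rough sketch in Python using reservoir sampling. The function name `sample_rows`, the seed, and the `range(1_000_000)` stand-in for a real table are all made up for illustration; in R you would more likely just call `sample()` on row indices.

```python
import random

def sample_rows(rows, max_n=100_000, seed=0):
    """Return a uniform random sample of at most max_n rows in one pass."""
    rng = random.Random(seed)
    reservoir = []
    for i, row in enumerate(rows):
        if i < max_n:
            # Fill the reservoir with the first max_n rows.
            reservoir.append(row)
        else:
            # Keep the new row with probability max_n / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < max_n:
                reservoir[j] = row
    return reservoir

# Hypothetical stand-in for a large table: one million row ids.
sample = sample_rows(range(1_000_000))
print(len(sample))
```

Because it only needs one pass and holds at most `max_n` rows, the same idea works when streaming rows out of a database cursor or a large file.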

As for scoring, I usually export the model as PMML and run it natively in the database, which makes for fast execution. PMML export is available in R, RapidMiner, and other packages.
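
To give a feel for why scoring inside the database is fast, here is a small hand-rolled sketch in Python. It does not read or write PMML; it only mimics what a PMML-aware scoring engine would do for a simple linear model, which is to turn the model into a SQL expression so the rows never leave the database. The coefficients, the `customers` table, and the column names are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical linear model: an intercept plus one coefficient per column.
intercept = 1.5
coefs = {"age": 0.25, "income": -0.5}

# Build the scoring expression as SQL text:
# "1.5 + (0.25 * age) + (-0.5 * income)".
# Column names come from the trusted model definition, not from user input.
terms = [f"({weight!r} * {column})" for column, weight in coefs.items()]
score_sql = " + ".join([repr(intercept)] + terms)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, age REAL, income REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 40.0, 10.0), (2, 20.0, 4.0)],
)

# The database evaluates the model for every row; only scores come back.
scores = conn.execute(
    f"SELECT id, {score_sql} AS score FROM customers ORDER BY id"
).fetchall()
print(scores)
```

A real setup would generate the expression from the exported PMML file rather than from a hard-coded dictionary, but the shape of the query would be similar.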