by logophobia
2897 days ago
He's probably referring to a combination of features that "NoSQL" databases have.

* Schema-on-read: Makes it easier to ingest large amounts of data and then do ad-hoc exploration. The schema is only determined when reading the data, so you decide how to interpret the data when you actually use it. That is a bit easier for one-off data exploration, but not appropriate for production systems. For example, a customer gives you a few TBs of data, you dump it on Hadoop, and query it with Spark. It would slow you down if you first had to convert it to a relational schema. Again, only good for one-off stuff.

* Column limits: Most SQL databases cap the number of columns per table, so if you have a very large number of features, I'd imagine you'd run into these limits.

* Scalability: Feature engineering is very parallelizable, and most normal SQL databases aren't trivial to scale out (excluding distributed stores with SQL-like query languages, such as Cassandra).
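To make the schema-on-read point concrete, here is a minimal sketch in plain Python rather than Spark. The records, the field names, and the `read_with_schema` helper are all made up for illustration; the idea is only that the raw data is stored as-is and the types are chosen by whoever reads it.

```python
import json

# Hypothetical raw dump: records are stored as-is, nothing is validated on write.
RAW_LINES = [
    '{"user": "a", "age": "34", "clicks": 10}',
    '{"user": "b", "clicks": "7", "country": "NL"}',
    '{"user": "c", "age": 51}',
]


def read_with_schema(lines, schema):
    """Apply a schema at read time: select fields and cast them on the fly."""
    for line in lines:
        record = json.loads(line)
        row = {}
        for field, cast in schema.items():
            value = record.get(field)
            # Missing fields become None instead of failing the whole load.
            row[field] = cast(value) if value is not None else None
        yield row


# The interpretation is decided by the query, not by the storage layer.
schema = {"user": str, "age": int, "clicks": int}
rows = list(read_with_schema(RAW_LINES, schema))

# Ad-hoc aggregate over whatever happens to be present.
total_clicks = sum(row["clicks"] or 0 for row in rows)
```

A different exploration could read the same lines with a different `schema` dict (say, only `country`) without any migration step, which is the convenience for one-off work and also the reason it is risky in production: nothing stops inconsistent records from being written.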