by agentgt 3816 days ago
I think you understand this, but it's about more than just handling large datasets in a memory-friendly way. True streaming would allow you to keep your connection pool happy. If you stream a large dataset you have to keep the connection open (let's say for a web request, which is the common case), and you can't reuse that connection until you have retrieved the entire dataset (or, I guess, read a subset and stopped). For batch processing this is OK, but for web requests it generally is not. Asynchronous drivers are push-based and are analogous to NIO HTTP like Netty (I think some non-JDBC async drivers even use Netty). But I gather you understand that and/or are using a very sophisticated pooling technique or drivers.

So if I block too long while reading an iterator-like object because the client is taking too long to read... I think you can imagine what happens. This is why so many of the JDBC wrappers (such as Spring JDBC and JDBI) do not return iterators, or at least do not advertise them as an awesome feature.
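
To make the "you can imagine what happens" part concrete, here is a rough sketch in plain Java. Nothing in it is real JDBC or a real pool: `TinyPool`, `StreamingResult`, `queryStreaming` and `queryEager` are hypothetical names invented for illustration, and a `Semaphore` stands in for the pool. It only shows that an iterator-style result pins a connection until the caller closes it, while an eager read hands the connection back before the caller sees any rows.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.Semaphore;

public class PoolStarvationSketch {

    /** Hypothetical stand-in for a connection pool: one permit per connection. */
    static final class TinyPool {
        private final Semaphore permits;

        TinyPool(int size) { permits = new Semaphore(size); }

        boolean tryBorrow() { return permits.tryAcquire(); }
        void giveBack() { permits.release(); }
        int idle() { return permits.availablePermits(); }
    }

    /** Iterator-style result: the borrowed connection is held until close(). */
    static final class StreamingResult implements Iterator<Integer>, AutoCloseable {
        private final TinyPool pool;
        private final int rows;
        private int next = 0;
        private boolean closed = false;

        StreamingResult(TinyPool pool, int rows) {
            this.pool = pool;
            this.rows = rows;
        }

        @Override public boolean hasNext() { return !closed && next < rows; }

        @Override public Integer next() {
            if (!hasNext()) throw new NoSuchElementException();
            return next++;
        }

        /** Returns the connection to the pool; safe to call more than once. */
        @Override public void close() {
            if (!closed) {
                closed = true;
                pool.giveBack();
            }
        }
    }

    /** Borrows a connection and keeps it for as long as the caller iterates. */
    static StreamingResult queryStreaming(TinyPool pool, int rows) {
        if (!pool.tryBorrow()) throw new IllegalStateException("pool exhausted");
        return new StreamingResult(pool, rows);
    }

    /** Reads every row into memory, then returns the connection before returning. */
    static List<Integer> queryEager(TinyPool pool, int rows) {
        if (!pool.tryBorrow()) throw new IllegalStateException("pool exhausted");
        try {
            List<Integer> out = new ArrayList<>();
            for (int i = 0; i < rows; i++) out.add(i);
            return out;
        } finally {
            pool.giveBack();
        }
    }

    public static void main(String[] args) {
        TinyPool pool = new TinyPool(2);

        // Two slow web clients each read one row and then stall.
        StreamingResult a = queryStreaming(pool, 1000);
        StreamingResult b = queryStreaming(pool, 1000);
        a.next();
        b.next();
        System.out.println("idle while streaming: " + pool.idle());                 // 0
        System.out.println("third request gets a connection: " + pool.tryBorrow()); // false

        a.close();
        b.close();

        // The eager read costs memory, but the connection is back before we return.
        List<Integer> rows = queryEager(pool, 1000);
        System.out.println("rows read eagerly: " + rows.size());                    // 1000
        System.out.println("idle after eager read: " + pool.idle());                // 2
    }
}
```

The trade-off is the one discussed above: the eager version pays in memory, the iterator version pays in connections held for as long as the slowest client takes to read.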

1 comment

Our pooling was pretty shitty actually, but it didn't need to be fancy as 95% of our code was some sort of batch processing (as you guessed), after which the JVM terminated. But yes, highly tweaked JDBC drivers overall.

Hibernate has iterator methods, but I recall (in 2010) it still loaded the entire result set into memory, with a //TODO comment. I remember thinking "W...T...F..." I can't tell you how many -Xmx16G (or 32/64) flags I deleted...