by saltysugar
4476 days ago
Kafka is written mostly in Scala, which runs on the JVM. Scala provides many attractive features for building scalable, type-safe code. As for performance, Kafka is used at LinkedIn, where it processes hundreds of gigabytes of data, close to a billion messages, per day (as reported in their 2011 paper), and the engineers say they're processing terabytes of data a day now.

I'm not sure on what basis you call the choice of language "questionable", but keep in mind that Scala's type safety and many of its other features are much more difficult to achieve in C/C++. Cleaner code is sometimes more important than a tiny gain in performance.

As for scaling, Kafka's design cleverly takes advantage of several things to ensure low latency and high throughput:
* Little random I/O
* Relying heavily on the OS pagecache for data storage

Performance-wise, Kafka can outshine some message queues that store their messages in memory.

Source:
http://kafka.apache.org/documentation.html#design
http://research.microsoft.com/en-us/um/people/srikanth/netdb...
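
To make the two design points above concrete, here is a minimal sketch of an append-only log in Python. It is not Kafka's actual on-disk format or API: the `AppendOnlyLog` class, the 4-byte length prefix, and the `topic-0.log` file name are all assumptions made up for illustration. It only shows the general idea of writing strictly at the end of a file and reading forward from an offset, leaving caching to the OS rather than to an in-process buffer pool.

```python
import os
import struct
import tempfile


class AppendOnlyLog:
    """Hypothetical, simplified log: length-prefixed records in a single file."""

    def __init__(self, path):
        self.path = path
        # Append mode: every write lands at the end of the file (sequential I/O).
        self._writer = open(path, "ab")
        # Track the end-of-file position ourselves so offsets are explicit.
        self._end = os.path.getsize(path)

    def append(self, payload):
        """Append one record and return the byte offset where it starts."""
        offset = self._end
        record = struct.pack(">I", len(payload)) + payload
        self._writer.write(record)
        self._end += len(record)
        return offset

    def flush(self):
        # Hand buffered bytes to the OS. They now sit in the page cache;
        # this sketch deliberately does not call fsync.
        self._writer.flush()

    def read_from(self, offset):
        """Yield payloads from `offset` to the end, reading strictly forward."""
        with open(self.path, "rb") as reader:
            reader.seek(offset)
            while True:
                header = reader.read(4)
                if len(header) < 4:
                    return
                (length,) = struct.unpack(">I", header)
                payload = reader.read(length)
                if len(payload) < length:
                    return
                yield payload

    def close(self):
        self._writer.close()


def demo():
    """Write three messages, then replay from the second one."""
    with tempfile.TemporaryDirectory() as tmp:
        log = AppendOnlyLog(os.path.join(tmp, "topic-0.log"))
        offsets = [log.append(m) for m in (b"first", b"second", b"third")]
        log.flush()
        replay = list(log.read_from(offsets[1]))
        log.close()
        return offsets, replay
```

A consumer in this sketch only remembers a byte offset and reads forward from it, so recently written data is typically served from the OS page cache rather than from disk, and the process itself holds almost nothing on its heap.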
|
> Java garbage collection becomes increasingly fiddly and slow as the in-heap data increases.
So they had to work around that. In my view, that's not a tiny issue. When building high-performance systems, I'd say it's better not to have such inherent limitations to begin with than to work around them. That was my main point above. Time spent dancing around such problems defeats the purpose of the supposed ease of development.