


	by llbeansandrice 1989 days ago
	> I agree with a few other commentators here that Hadoop/Spark isn't being used a lot in their production environments.

	I guess I'm the odd man out, because that's all I've used for this kind of work: Spark, Hive, Hadoop, Scala, Kafka, etc.

2 comments

josephmosby 1989 days ago

I should have specified more thoroughly.

I am not seeing Spark being chosen for new data eng roll-outs. It is still very prevalent in existing environments because it still works well. (used at $lastjob myself)

However - I am still seeing a lot of Spark for machine-learning work by data scientists. Distributed ML feels like it is getting split into a different toolkit than distributed DE.

llbeansandrice 1989 days ago

I guess it depends on what jobs you're looking for. There are a lot of existing companies/teams (like mine) looking to hire people, but we're on the "old stack" using Kafka, Scala, Spark, etc. We don't do any ML stuff; I'm on the pipeline side of it. The data scientists down the line tend to use Hive/SparkSQL/Athena for a lot of their work, but I'm much less involved with that.

Not all jobs are new pasture and I think that's forgotten very frequently.

markus_zhang 1989 days ago

I'd love to do some Kafka, Scala and Spark. What kind of exp are you looking for?

teddyuk 1989 days ago

I'm also the odd one out; I see so many enterprises moving to Spark on Databricks.
