by sbrother
483 days ago
Apache Beam in Python does this, with code like:

    counts = (
        lines
        | 'Split' >> (
            beam.FlatMap(
                lambda x: re.findall(r'[A-Za-z\']+', x)).with_output_types(str))
        | 'PairWithOne' >> beam.Map(lambda x: (x, 1))
        | 'GroupAndSum' >> beam.CombinePerKey(sum))

I'm not sure how I feel about it, other than the fact that I'd 100x rather write Beam pipelines in basically any other language. But that's about more than syntax.
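
For anyone wondering how that syntax can even parse: a minimal sketch of the operator overloading involved, in plain Python with no Beam dependency. The `Transform` class and the `flat_map`, `map_each` and `combine_per_key` helpers are hypothetical stand-ins I made up for illustration, not Beam's real classes, and Beam's actual implementation is more involved. The only thing this relies on is Python's reflected operators (`__rrshift__`, `__ror__`) and the fact that `>>` binds tighter than `|`.

```python
import re


class Transform:
    """Hypothetical stand-in for a pipeline step (not Beam's PTransform)."""

    def __init__(self, fn, label=None):
        self.fn = fn
        self.label = label

    def __rrshift__(self, label):
        # Handles 'Label' >> transform. str does not implement >>,
        # so Python falls back to the right operand's __rrshift__.
        return Transform(self.fn, label)

    def __ror__(self, data):
        # Handles data | transform. Plain lists do not implement |,
        # so Python falls back to the right operand's __ror__.
        return self.fn(data)


def flat_map(fn):
    # Apply fn to each item and flatten the resulting lists.
    return Transform(lambda items: [out for item in items for out in fn(item)])


def map_each(fn):
    # Apply fn to each item.
    return Transform(lambda items: [fn(item) for item in items])


def combine_per_key(combine):
    # Group (key, value) pairs by key, then combine each group's values.
    def run(pairs):
        groups = {}
        for key, value in pairs:
            groups.setdefault(key, []).append(value)
        return {key: combine(values) for key, values in groups.items()}

    return Transform(run)


lines = ["the cat", "the dog's bone"]

# >> has higher precedence than |, so each step parses as
# data | ('Label' >> transform).
counts = (
    lines
    | 'Split' >> flat_map(lambda x: re.findall(r"[A-Za-z']+", x))
    | 'PairWithOne' >> map_each(lambda x: (x, 1))
    | 'GroupAndSum' >> combine_per_key(sum)
)
```

Unlike Beam, this runs eagerly on in-memory lists instead of building a deferred pipeline graph, so it only illustrates the syntax trick, not the execution model.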