by cedws 238 days ago
Did I understand correctly that you're using it for file processing? If so, does it yield reliability benefits? We have an assortment of jobs written in Go that process files of various types (CSV, Parquet, TXT) in S3 too. The issue we have is that our Kubernetes jobs crash all the time when they encounter something unexpected. Obviously we should invest in making them more robust, but what we really want is some way for the jobs to continue processing whatever they can instead of crashing and starting over.
1 comment

In our case, the files being processed are datasets that have already been normalized through another ETL tool. Since we're doing the preprocessing ourselves elsewhere, our Gleam parsers are set up to expect a pretty rigid set of inputs. We do all of the file IO / streaming in Elixir and pass the raw data into Gleam as Elixir maps. So Gleam just takes maps and parses them strictly into types, and our entire Gleam module ecosystem assumes "perfect enough" data.

If we encounter row-level errors in a batch, we log those alongside the outputs. There's nothing particularly intrinsic about our usage of Gleam that prevents the workers from crashing during processing; it's all about having error handling set up within the job itself to avoid killing the process or pod running it.
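
Since your jobs are in Go, here is a rough sketch of that same idea there: parse each row on its own, collect the failures, and keep going. This is not our actual code (ours is Elixir and Gleam). `Row`, `Record`, `RowError`, `parseRow`, `safeParse`, and `ProcessBatch` are all made-up names, and the `id` and `amount` fields are just placeholders for whatever your files contain.

```go
package main

import (
	"fmt"
	"strconv"
)

// Row is a hypothetical pre-normalized input row (field name -> raw value).
type Row map[string]string

// Record is the hypothetical typed result of parsing one Row.
type Record struct {
	ID     int
	Amount float64
}

// RowError records which row failed and why, so it can be logged
// alongside the successful outputs.
type RowError struct {
	Index int
	Err   error
}

// parseRow converts one row strictly; a missing or malformed field is an error.
func parseRow(r Row) (Record, error) {
	id, err := strconv.Atoi(r["id"])
	if err != nil {
		return Record{}, fmt.Errorf("id: %w", err)
	}
	amount, err := strconv.ParseFloat(r["amount"], 64)
	if err != nil {
		return Record{}, fmt.Errorf("amount: %w", err)
	}
	return Record{ID: id, Amount: amount}, nil
}

// safeParse wraps parseRow so an unexpected panic becomes an ordinary error
// instead of taking down the whole process.
func safeParse(r Row) (rec Record, err error) {
	defer func() {
		if p := recover(); p != nil {
			err = fmt.Errorf("panic while parsing: %v", p)
		}
	}()
	return parseRow(r)
}

// ProcessBatch parses every row and collects failures rather than
// stopping at the first bad one.
func ProcessBatch(rows []Row) ([]Record, []RowError) {
	var records []Record
	var rowErrs []RowError
	for i, r := range rows {
		rec, err := safeParse(r)
		if err != nil {
			rowErrs = append(rowErrs, RowError{Index: i, Err: err})
			continue
		}
		records = append(records, rec)
	}
	return records, rowErrs
}

func main() {
	rows := []Row{
		{"id": "1", "amount": "9.50"},
		{"id": "oops", "amount": "3.00"},
		{"id": "3", "amount": "12.25"},
	}
	records, rowErrs := ProcessBatch(rows)
	fmt.Printf("parsed %d rows, %d failed\n", len(records), len(rowErrs))
	for _, e := range rowErrs {
		fmt.Printf("row %d: %v\n", e.Index, e.Err)
	}
}
```

The `recover` in `safeParse` only catches panics raised on the goroutine doing the parsing, and it does nothing for things like the pod being OOM-killed, so treat it as a backstop for the row-level error collection rather than a replacement for it.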