You can probably create as many FIFOs as you have parallel jobs and just be careful to emit only whole records, like https://github.com/c-blake/bu/blob/main/doc/funnel.md does. That one's Nim, but the meat is only about 50 lines and easily ported to C like your progress tool. (EDIT: it will also probably have drastically lower overhead than `parallel`, which has over 70X the time overhead and 10X the RAM overhead of tools written in fast, native-compiled languages: https://github.com/c-blake/bu/blob/main/tests/strench.sh )
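To make the "whole records only" idea concrete, here is a rough Python sketch. It is not the code from funnel.md, just my guess at the general shape. The function name `funnel`, the newline delimiter, and the 64 KiB read size are all assumptions for illustration. It multiplexes several pipe/FIFO read ends, keeps a per-input buffer, and only writes bytes up to the last complete record so lines from different jobs never get interleaved mid-record.

```python
import os
import selectors

def funnel(fds, out, delim=b"\n"):
    """Merge several pipe/FIFO read ends into `out`, writing only whole records.

    Hypothetical helper for illustration; `fds` are already-open read
    descriptors and `out` is any binary file-like object.
    """
    sel = selectors.DefaultSelector()
    pending = {}                      # fd -> bytes of a not-yet-complete record
    for fd in fds:
        os.set_blocking(fd, False)    # never stall on one slow producer
        sel.register(fd, selectors.EVENT_READ)
        pending[fd] = b""
    while pending:
        for key, _events in sel.select():
            fd = key.fd
            try:
                chunk = os.read(fd, 65536)
            except BlockingIOError:
                continue              # spurious wakeup; try again later
            if chunk:
                buf = pending[fd] + chunk
                cut = buf.rfind(delim)
                if cut >= 0:          # flush everything up to the last delimiter
                    end = cut + len(delim)
                    out.write(buf[:end])
                    buf = buf[end:]
                pending[fd] = buf     # keep the partial tail for the next read
            else:                     # EOF: every writer closed its end
                if pending[fd]:       # assumption: terminate a dangling tail
                    out.write(pending[fd] + delim)
                sel.unregister(fd)
                del pending[fd]
    sel.close()
```

With real FIFOs you would `os.mkfifo` one path per job, have each job write to its own path, and pass the opened read ends plus `sys.stdout.buffer` to `funnel`. A plain blocking open of each FIFO waits until a writer attaches, which avoids seeing a premature EOF.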
Also, the last time I tried to do something similar with FIFOs (no /tmp or whatever storage like other examples here: https://news.ycombinator.com/item?id=37211687), GNU parallel needed a Perl interpreter with threads enabled (for me, that meant a specially compiled one) to use its `parcat` program, which is also probably slow. Besides the nagware insanity, `parallel` just doesn't seem like a very compelling tool in either machine or human overhead unless, maybe, you already know Perl (which I always found a supremely forgettable language).