by jsmeaton 4124 days ago
I have a workflow that I'd really like to automate/rewrite. A wav file is generated on a remote server. That server will rsync/scp it to a processing node. The processing node will query a database, and write out a text file with parts of that file to remove. It'll then convert it to mp3 (using sox and lame) with those parts removed. Another job will then pick up the mp3 file, query another database, and if it gets a hit it will sync that file to s3.
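To make the "parts of that file to remove" step concrete, here is a minimal sketch in plain Python of turning the removal ranges into the ranges to keep before conversion. It uses no Pinball, sox, or lame API. The name `keep_segments` and the (start, end)-in-seconds representation are assumptions for illustration, not something the database actually returns in this form.

```python
from typing import List, Tuple

# Hypothetical representation: a segment is (start_seconds, end_seconds).
Segment = Tuple[float, float]


def keep_segments(duration: float, remove: List[Segment]) -> List[Segment]:
    """Return the ranges that survive once the `remove` ranges are cut out."""
    kept: List[Segment] = []
    cursor = 0.0  # everything before the cursor has already been handled
    for start, end in sorted(remove):
        # Clamp each removal range to the bounds of the recording.
        start = min(max(start, 0.0), duration)
        end = min(end, duration)
        if end <= cursor:
            continue  # fully covered by an earlier removal range
        if start > cursor:
            kept.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < duration:
        kept.append((cursor, duration))
    return kept


if __name__ == "__main__":
    # Overlapping removal ranges are merged implicitly.
    print(keep_segments(60, [(10, 20), (15, 30), (50, 60)]))
```

The kept ranges are what the conversion step would need to stitch together; how they are passed to the audio tool is left out here.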

Is this a kind of workflow that would run with pinball? Can you move files around with it, or do you use the file system and pass filepaths around? Ideally, the workflow job would hold onto the wav/mp3 and the associated database fields that are returned so I don't have to juggle weird directories around (and have to sync access to them).

I'm not familiar with any other workflow engines, so I'm unsure if this is the kind of thing that would traditionally run on one. I looked at the user guide but it's currently barren.

1 comments

Pinball is good for this use case. You can build a workflow that includes a few jobs:

job1: generate a wav file and put it somewhere, say s3://wav.file

job2 (runs after job1): pick up the wav file from s3://wav.file

The contract between the parent and child jobs comes from your business logic. In this example, when you implement job1 and job2, you need a protocol that tells them how to produce, store, and consume the wav file.
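One way to picture such a contract is the rough sketch below, in plain Python with no Pinball API. Both jobs derive the file location from a shared key using the same function, so nothing has to be passed between them at runtime. The names `wav_location`, `job1_produce`, `job2_consume`, and the recording id are hypothetical, and a local temp directory stands in for the S3 location.

```python
import os
import tempfile

# Assumption: a local directory stands in for a shared store such as S3.
BASE_DIR = os.path.join(tempfile.gettempdir(), "wav_handoff")


def wav_location(recording_id: str) -> str:
    """The contract: both jobs compute the location from the same key."""
    return os.path.join(BASE_DIR, recording_id + ".wav")


def job1_produce(recording_id: str, data: bytes) -> str:
    """Parent job: write the wav file to the agreed location."""
    path = wav_location(recording_id)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(data)
    # Rename into place so the child job never sees a half-written file.
    os.replace(tmp_path, path)
    return path


def job2_consume(recording_id: str) -> bytes:
    """Child job: read the wav file from the agreed location."""
    with open(wav_location(recording_id), "rb") as f:
        return f.read()


if __name__ == "__main__":
    job1_produce("rec-001", b"RIFF")
    print(job2_consume("rec-001"))
```

The point of the design is that the only thing the two jobs share is the key and the naming rule; the workflow engine only has to guarantee that job2 runs after job1.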

Thanks for the reply. I'm wondering how you would share the location of the file between jobs though. Can job 1 output a file location that job 2 accepts as an input?

I see there are plans to write up some documentation, but is there a timeline for when you're aiming to have it written?

Also, the README calls out MySQL as being required. I assume that this, being a Django project, will work with other backends too. Is there anything, to your knowledge, that would prevent a different backend from being used (like Postgres or Oracle)?