by gbuk2013 3216 days ago
Thank you for athenapdf and for rescuing me from the pains of wkhtmltopdf - I am a happy user. :)

My only small problem with it was the somewhat complex setup for using athenapdf-service with a new project (especially since I use docker-machine) but I have now mostly automated the whole thing.

Just out of interest - do you consider asynchronous an advantage (being a Node developer I generally love async very much)? Not that it matters to me - my needs are trivial for the service to handle.

Edit: actually I can see how async would make my life much more complicated for my simple use case - I would have to write something to track requests and responses rather than just looping through a bunch of URLs that need converting.

1 comment

That's interesting feedback! Thank you :)

We actually went with Docker for the setup because it simplified dependency management tremendously, and it allowed us to deploy on platforms like Kubernetes, Swarm, and ECS. As a plus, it gave us some confidence that if it works for us, it should work for others (though we have, of course, come across cases where Docker behaves differently across platforms).

I consider asynchronous processing (in this context) advantageous in some cases. Indeed, when we were refactoring `athenapdf`, we considered introducing a message queue for workers to pull work from, and to put results back on when the work completes. The problem with this, however, is that we could no longer scale horizontally as easily (i.e. introduce node replicas behind a load balancer): if we tried to get / update a job, the request might not reach the node that originally accepted it. I mean, the solution can be as easy as introducing a centralised message queue of sorts (or even sticky sessions), but that complicates the setup process, so we decided against it.
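To make the replica problem concrete, here is a toy simulation rather than anything taken from `athenapdf` itself. It assumes each node keeps its jobs in its own memory and that the load balancer is plain round-robin; `Node`, `RoundRobinBalancer` and the job ID are made-up names for illustration only.

```python
from itertools import cycle


class Node:
    """One replica with a private, in-memory job store (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.jobs = {}

    def submit(self, job_id, url):
        # The job is recorded only on this replica.
        self.jobs[job_id] = {"url": url, "status": "queued"}

    def status(self, job_id):
        # Returns None when this replica has never seen the job.
        job = self.jobs.get(job_id)
        return job["status"] if job else None


class RoundRobinBalancer:
    """Hands out replicas in turn, with no session stickiness."""

    def __init__(self, nodes):
        self._nodes = cycle(nodes)

    def route(self):
        return next(self._nodes)


balancer = RoundRobinBalancer([Node("a"), Node("b")])

first = balancer.route()            # the submit request lands on node "a"
first.submit("job-1", "https://example.com")

second = balancer.route()           # the follow-up request lands on node "b"
lookup = second.status("job-1")     # node "b" has no record of the job
print(first.name, second.name, lookup)
```

The follow-up lookup comes back empty because the job lives only on the first replica. A shared queue or sticky sessions would fix that, at the cost of the extra setup mentioned above.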

Taken together, for our specific use cases, we believe it is a lot simpler to consume a synchronous API. No webhooks / callbacks. No polling. No concerns over acknowledgement. If an HTTP call fails, we will know about it immediately. If a complex retry mechanism is needed, we think this should be accomplished in the client application.
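As a rough sketch of what client-side retries around a synchronous call might look like (an illustration only, not part of `athenapdf`): `convert_with_retry` and its parameters are hypothetical names, and `convert` stands in for whatever function actually makes the blocking HTTP request and returns the PDF bytes.

```python
import time
from typing import Callable


def convert_with_retry(
    convert: Callable[[str], bytes],   # hypothetical: performs the blocking request
    url: str,
    attempts: int = 3,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call convert(url) synchronously, retrying with a linearly growing delay."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            # A failure surfaces here immediately, as an exception.
            return convert(url)
        except Exception as exc:
            last_error = exc
            if attempt < attempts:
                sleep(delay * attempt)  # wait a little longer after each failure
    raise RuntimeError(
        f"conversion failed after {attempts} attempts: {url}"
    ) from last_error


def convert_all(convert: Callable[[str], bytes], urls: list) -> dict:
    """Loop through URLs one at a time; no job tracking or polling needed."""
    return {url: convert_with_retry(convert, url) for url in urls}
```

Because each call blocks until it succeeds or raises, the calling loop stays a plain loop, and the retry policy lives entirely in the client.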

In the long term, I believe we should have a toolkit that can easily be plugged into a wider orchestration engine like Conductor (https://netflix.github.io/conductor/). That way, anyone can develop their own conversion process pipeline with ease.