| HN Mirror


by shay_ker 2638 days ago

> At one point we attempted to migrate to Heroku Shield to address some of these issues, but we found that it wasn’t a good fit for our application.

This part seems very hand-wavy, given that Heroku Shield would've solved many (all?) of their problems.

> We were also running into limitations with Heroku on the compute side: some of our newer automation-based features involve running a large number of short-lived batch jobs, which doesn’t work well on Heroku (due to the relatively high cost of computing resources).

How much memory did their batch jobs actually need? If they're using Rails, then I'm assuming they're just running a bunch of Sidekiq jobs that are querying PG. I'm surprised that they'd need that much in terms of compute resources. They should be able to get very, very far by making PG do a lot of the work, or by streaming data from PG and not holding a lot of data in memory.

Even if they did need all this, the following two options seem WAY easier to manage:

1) Use Dokku to run your super-intense Sidekiq batch jobs on beefy EC2 instances. You can still schedule them in your Rails app in Heroku, no big deal. Many engineering teams have to do this type of split-up anyway when it comes to Application Engineers and Data Engineers; this is just a simpler way to do it.

2) Similar to 1), use a different language runtime for the batch jobs. If you really need to run CPU-intensive jobs, why are you using Ruby? If the jobs aren't intense enough to justify maintaining two languages (fwiw, not that hard), why will moving to k8s solve the issue?

Personally, I'm not sold on their decision to move to Kubernetes, and I use Kubernetes for my job.
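
To make the "let PG do the work, or stream from PG" point above concrete, here is a minimal sketch of both approaches. It is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` as a stand-in for Postgres so it runs anywhere, and the `events` table, its `amount` column, and all function names are hypothetical. With a real Postgres driver you would likely need a server-side cursor to get the same memory behavior.

```python
import sqlite3
from typing import Iterator, List, Tuple


def make_demo_db(n: int) -> sqlite3.Connection:
    """Build an in-memory table (hypothetical schema) with n rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, amount INTEGER)")
    conn.executemany(
        "INSERT INTO events (amount) VALUES (?)", ((i,) for i in range(n))
    )
    conn.commit()
    return conn


def stream_rows(
    conn: sqlite3.Connection, batch_size: int = 1000
) -> Iterator[List[Tuple[int, int]]]:
    """Yield rows in fixed-size batches instead of loading them all at once."""
    cur = conn.cursor()
    cur.execute("SELECT id, amount FROM events ORDER BY id")
    while True:
        rows = cur.fetchmany(batch_size)
        if not rows:
            break
        yield rows


def total_amount(conn: sqlite3.Connection, batch_size: int = 1000) -> int:
    """Aggregate in the application, holding at most one batch in memory."""
    total = 0
    for batch in stream_rows(conn, batch_size):
        total += sum(amount for _, amount in batch)
    return total


def db_side_total(conn: sqlite3.Connection) -> int:
    """Let the database do the aggregation; only one row comes back."""
    return conn.execute("SELECT COALESCE(SUM(amount), 0) FROM events").fetchone()[0]


if __name__ == "__main__":
    conn = make_demo_db(2500)
    print(total_amount(conn, batch_size=1000))
    print(db_side_total(conn))
```

Both functions return the same total; the difference is that the first keeps at most `batch_size` rows in the worker's memory, while the second ships no rows to the worker at all.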


shosti 2638 days ago

> This part seems very hand-wavy, given that Heroku Shield would've solved many (all?) of their problems.

Author here; I don’t want to go into too much detail, but we tried Shield early on and had a negative experience that made us wary about using the platform (it seems to use a different tech stack under the hood from “normal” Heroku and lacks a lot of the things that make Heroku great). Also it’s very expensive compared to VPC-based solutions on AWS and GCP.

W.R.T. the batch jobs, I think I didn’t explain this very well: we are using a different language and runtime from our “normal” background processing jobs (which use worker queues in Rails); it’s just that Heroku isn’t very well suited for the use case (which is basically FaaS-like but with long-lived jobs).

The “split” workflow you described is basically what we were doing (but with AWS Batch instead of Dokku); it’s just that it’s more cost-efficient to consolidate everything into one cluster (especially with preemptible GKE nodes) and also better to have a common set of tooling for the Ops team.

To be fair, we haven’t yet completed the move from Batch to k8s so it’s possible that part of the plan won’t pan out as expected.


holografix 2638 days ago

Disclaimer: I work for Salesforce, Heroku’s parent company.

Heroku Shield is a service added on top of Heroku Private spaces.

You usually don’t need Shield unless you want to be compliant with things like HIPAA, which of course could be your case here.


ukd1 2638 days ago

It is and we needed HIPAA. For me, it's priced aggressively (~600%, compared to zero for GCP) and wasn't ready when we looked - i.e. caused a few SEVs.


shay_ker 2637 days ago

> ~600%, compared to zero for GCP

I've always been curious. What do you need to do to be HIPAA compliant, from a technology standpoint? I figured it's similar to PCI compliance, but I'm not sure.

From what I've heard, though, the cost isn't quite zero, it's just that you have to own & implement all the work to be HIPAA compliant. But perhaps it's not that bad?


holografix 2637 days ago

I’m not in product or legal so take this with a grain of salt:

I know that for a customer I spoke to, keystroke logging on running dynos was something they were really interested in, from a compliance point of view.

I think being able to spin up Postgres DBs with rollbacks, fork and follow, HA, etc. (don’t want to sound like a sales rep) in this highly compliant environment also involves some serious infra wrangling.


oskari 2634 days ago

FWIW, Aiven PostgreSQL (http://aiven.io/postgresql) runs the latest PG versions and is available in HIPAA-compliant configurations on AWS and GCP. We don't charge extra for it, but have a minimum monthly commitment to justify the small setup overhead.


shay_ker 2638 days ago

Makes sense. It's hard to tell without understanding what the batch jobs are actually doing... it sounds like you're running something similar to EMR jobs?

We were about to use Heroku Shield at a previous gig. It's definitely expensive, but at our requirements it was still less than an engineer. I wouldn't run a ton of "big data" processing on Heroku nodes though. I'm sure/hope it exists, but I haven't seen a Heroku-ized version of data processing.
