by dopidopHN 1283 days ago
Hey, I see “service config” referenced a lot in that thread, but your answers have the most occurrences.

I’m not sure I follow what it is.

A technical construct, like a code template or an API that services implement?

Or a process construct, like an SOP to follow with checkboxes?

Thanks


Succinctly: a service config is the authoritative source of truth for what a service is, in a format that can be (and is) consumed by tooling.

A lot of software development is about generating abstractions.

"Service" is a possible abstraction someone might want to generate and develop.

I think a service abstraction can be defined by:

  A blob of code
  A set of machines to run it on
  A way to stop and start it
  A method to load balance to it
So it would make sense to create a yaml config file committed to a repo containing something like:

  services:
    [
    { 
      name: "CoolAppServerName.prod",
      build_script: "./bin/buildCoolAppServerName.py",
      start_script: "./bin/startCoolAppServerName.py",
      stop_script:  "./bin/stopCoolAppServerName.py",
      hosts: [
        "host_1",
        "host_2",
      ],
      slb_name: "CoolAppServerName.prod",
    },
    {...},
    ]
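As a rough sketch of what "consumed by tooling" could look like, here is a small Python example that indexes the service list by name and derives start commands from it. The `ServiceConfig` class and the `index_services` and `start_commands` helpers are hypothetical names made up for illustration, and the `RAW` list is a stand-in for the parsed YAML (a real tool would read the committed file instead).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceConfig:
    """One entry from the services list; field names mirror the config above."""
    name: str
    build_script: str
    start_script: str
    stop_script: str
    hosts: tuple
    slb_name: str


def index_services(raw_services):
    """Build a name -> ServiceConfig lookup, rejecting duplicate names."""
    index = {}
    for raw in raw_services:
        svc = ServiceConfig(
            name=raw["name"],
            build_script=raw["build_script"],
            start_script=raw["start_script"],
            stop_script=raw["stop_script"],
            hosts=tuple(raw["hosts"]),
            slb_name=raw["slb_name"],
        )
        if svc.name in index:
            raise ValueError(f"duplicate service name: {svc.name}")
        index[svc.name] = svc
    return index


def start_commands(svc):
    """Return (host, script) pairs a deploy tool could act on; nothing is executed."""
    return [(host, svc.start_script) for host in svc.hosts]


# Stand-in for the parsed YAML file (assumption: parsing happens elsewhere).
RAW = [
    {
        "name": "CoolAppServerName.prod",
        "build_script": "./bin/buildCoolAppServerName.py",
        "start_script": "./bin/startCoolAppServerName.py",
        "stop_script": "./bin/stopCoolAppServerName.py",
        "hosts": ["host_1", "host_2"],
        "slb_name": "CoolAppServerName.prod",
    },
]

services = index_services(RAW)
for host, script in start_commands(services["CoolAppServerName.prod"]):
    print(host, script)
```

Rejecting duplicate names is one way to keep the name usable as the single identifier every tool keys on.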
Once you have a definition, it can be extended to meet growing needs. You might choose to do something like:

    { 
      name: "CoolAppServerName.prod",
      key_metrics: [
        "CoolAppServerName.prod.5xx",
        "CoolAppServerName.prod.latency_percentiles",
      ],
      owner: "CoolTeam",
      ...,
    }
And then you could generate a webpage with a dropdown where "CoolAppServerName.prod" is an option, and selecting it automatically shows a dashboard with graphs for the time series metrics "CoolAppServerName.prod.5xx" and "CoolAppServerName.prod.latency_percentiles". Maybe instead of having service names in the dropdown you have owner names in the dropdown.
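To make the dropdown idea concrete, here is a minimal Python sketch that derives both variants (keyed by service name or by owner) from the same config entries. The function names are hypothetical, and the second service, "OtherService.prod", is invented purely so the owner grouping has something to group.

```python
from collections import defaultdict

# Config entries as they might look after parsing; the second one is made up.
SERVICES = [
    {
        "name": "CoolAppServerName.prod",
        "owner": "CoolTeam",
        "key_metrics": [
            "CoolAppServerName.prod.5xx",
            "CoolAppServerName.prod.latency_percentiles",
        ],
    },
    {
        "name": "OtherService.prod",  # hypothetical second service
        "owner": "CoolTeam",
        "key_metrics": ["OtherService.prod.5xx"],
    },
]


def dropdown_by_service(services):
    """Map each service name to the metrics its dashboard should graph."""
    return {svc["name"]: list(svc["key_metrics"]) for svc in services}


def dropdown_by_owner(services):
    """Map each owner to the metrics of every service that owner runs."""
    by_owner = defaultdict(list)
    for svc in services:
        by_owner[svc["owner"]].extend(svc["key_metrics"])
    return dict(by_owner)


print(dropdown_by_service(SERVICES)["CoolAppServerName.prod"])
print(dropdown_by_owner(SERVICES)["CoolTeam"])
```

Neither view needs any dashboard-specific data: both are derived from fields already attached to the service identifier.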

You could potentially write some code that checks that those metrics haven't changed significantly, and use it to automatically verify that newly pushed code didn't take down the website.
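A very rough sketch of such a check in Python, assuming you can fetch samples of each key metric from before and after the push. Everything here is hypothetical: the function names, the 50% threshold, and the in-memory sample data standing in for a real metrics store. It also only flags increases, which suits error counts and latency but not every metric.

```python
from statistics import mean


def significant_change(before, after, max_relative_increase=0.5):
    """True if the mean of `after` exceeds the mean of `before` by more
    than the allowed fraction. The threshold is an arbitrary assumption."""
    baseline = max(mean(before), 1e-9)  # avoid dividing by zero
    return (mean(after) - baseline) / baseline > max_relative_increase


def regressed_metrics(key_metrics, samples_before, samples_after):
    """Return the key metrics that look significantly worse after the push."""
    return [
        metric
        for metric in key_metrics
        if significant_change(samples_before[metric], samples_after[metric])
    ]


# Made-up samples standing in for queries against a metrics store.
KEY_METRICS = [
    "CoolAppServerName.prod.5xx",
    "CoolAppServerName.prod.latency_percentiles",
]
BEFORE = {
    "CoolAppServerName.prod.5xx": [1, 2, 1],
    "CoolAppServerName.prod.latency_percentiles": [100, 110, 105],
}
AFTER = {
    "CoolAppServerName.prod.5xx": [10, 12, 11],
    "CoolAppServerName.prod.latency_percentiles": [102, 108, 107],
}

bad = regressed_metrics(KEY_METRICS, BEFORE, AFTER)
print("regressed:", bad)
```

A real version would have to deal with noisy metrics and traffic patterns; the point is only that the list of metrics to check comes straight from the service config.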

Service config means creating an authoritative service identifier (authoritative because it's the only identifier used in tooling) and then attaching a configuration to it.

Facebook and Google have (or at least at some point had) Tupperware and Borg respectively, which are basically custom versions of the above, extended for their infrastructures.

I see, thanks for the detailed answer.

That strongly reminds me of solutions à la Kubernetes.

Where you define an entry point, a healthcheck, etc.

A tad more abstract, and larger (AFAIK, k8s doesn't care how your code is built, for instance).

Never heard of Tupperware. Loosely aware of Borg.

Again, I appreciate the time.

When Kubernetes was released, it was thought it would be a successor to borg if not the key components of borg itself, IIRC. https://en.wikipedia.org/wiki/Kubernetes:

  The design and development of Kubernetes was influenced by 
  Google's Borg cluster manager. Many of its top contributors
  had previously worked on Borg;[15][16] they codenamed Kubernetes
  "Project 7" after the Star Trek ex-Borg character Seven of Nine[17]
  and gave its logo a seven-spoked wheel.
There was a lot of early skepticism about it because it was not borg. I guess my understanding is that borg is so integrated into google tooling that it would have been impossible to generalize.

I haven't used it myself yet because a few of the senior engineers (from google/fb) I respect said "absolutely not in our infra."

What are you using instead and what are the main criticisms of kubernetes from your seniors?

A completely bespoke solution. I was both too busy and too inexperienced with kubernetes to get into a conversation about it.

IIRC the main criticisms were that it wouldn't scale to our needs and there were some use cases that wouldn't be handled by kubernetes easily. The end result would be two different solutions for the same problem, a slow migration to kubernetes that may or may not stall out, and then a half finished/perpetual migration that would double support costs.