Hacker News
by AndrewHampton 4124 days ago
So here's a question we've been talking about at my office. When developing a micro-service on your development machine, do you need to run the whole stack or just the service you're working on?

For example, let's say I am working on service A, which depends on services B and C. Do I need to run all 3 apps and their data stores locally?

We currently point A at the staging instances of B and C. However, A initiates some long-running jobs on B, and B needs to post back to A when each job finishes. This doesn't work when pointing to staging B.

6 comments

You should check out a tool we built to solve this problem at Clever: https://github.com/clever/aviator

It lets us spin up a service + all dependent services locally with a single command.

It's my understanding microservices shouldn't have many dependencies on one another. The link to your blog post that explains your rationale doesn't appear to work... do you mind explaining what need this fills?
One option is a framework[1] which tries to start dependent services locally.

A stopgap solution is to use something like the Actor[2] model: schedule actors on an ActorSystem and clone the context/scheduler/system for each client ID. As long as your actors are fairly sane, this should be fairly lightweight. Then just shut down the actors for a given client (actorSystem.shutdown() under Akka), either after a time or by having a client send a Shutdown message (or both).

[1]: http://wym.io/ [2]: http://akka.io/

just wanted to ask, in the footer on wym.io it mentions "wym-core is open source software".. but I wasn't able to find the source or a repo anywhere?
Depends on the use case. For automated testing, if I'm developing service A, then locally I'd usually want to stub B & C with something like Mountebank (http://www.mbtest.org/) (though I've never used it to do a post-back as you describe...not sure if it supports that out of the box).

If I just wanted to poke around a running system to see how things interact manually, yeah I'd run everything locally. Probably using Docker images for each dependency service and Vagrant to manage the suite of those images, to preserve my sanity.

Yeah, this is pretty much what we do. We only use test suites for the backend apps which only expose an API. They just stub out calls to other services.

However, our front-end apps are where we're running into trouble. The people on the front end want to run through everything in their browser to make sure everything works and looks good.

We use docker and fig (now docker-compose) to run all our apps. So we've been talking about using compose's new external_links feature to link each app together.

I hadn't heard of Mountebank though, so thanks for the link!
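
For readers wondering what "stub out calls to other services" can look like in a backend test suite, here is a minimal sketch. It assumes a hypothetical `ServiceA` that reaches service B through an injected client object; `ServiceA`, `get_order`, and the response fields are made up for illustration and are not from the thread.

```python
from unittest import mock


class ServiceA:
    """Hypothetical service A; `b_client` is whatever object talks to service B."""

    def __init__(self, b_client):
        self.b_client = b_client

    def order_summary(self, order_id):
        # In production this would be a network call to service B.
        order = self.b_client.get_order(order_id)
        return f"{order['id']}: {order['status']}"


def test_order_summary_with_stubbed_b():
    # Replace the real client with a stub that returns a canned response.
    b_stub = mock.Mock()
    b_stub.get_order.return_value = {"id": 7, "status": "shipped"}

    service = ServiceA(b_stub)

    assert service.order_summary(7) == "7: shipped"
    # The stub also records how it was called, so the contract can be checked.
    b_stub.get_order.assert_called_once_with(7)
```

Because the client is injected, the test never needs B running; the trade-off is that it only verifies A's assumptions about B, not B's actual behaviour.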

Mountebank/mocking is a great solution when your components are too big to spin up new ones for 'free'.

It's exactly because Docker-based components end up so big that it seems hard to believe in the viability of that approach to microservices though. :/

It's a great stopgap/migratory step, though.

I think some level of integration testing is always a good thing, but spinning up all dependent services just so you can test yours leads to all kinds of problems, even in microservices. You end up with cascading dependencies.

At scale, some level of stubbing is required to test.

Agreed.

We _partially_ solve this problem by letting you run a dev cluster that you can use for services which are awkward to run locally, but stubbing/mocking definitely has a place in automated testing.

mountebank could support that through stub resolver injection (http://www.mbtest.org/docs/api/injection). Basically, in addition to returning a response, you can inject some javascript that would call back after some time. Tricky, but possible.
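
To make the post-back idea concrete without guessing at mountebank's injection API, here is a rough sketch of a hand-rolled stub for service B, using only the Python standard library. It is not mountebank. The `callback_url` and `job_id` fields, the 202 response, and the delay are all assumptions for illustration.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def post_json(url, payload):
    """POST a JSON payload to `url` and return the HTTP status code."""
    data = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status


class StubB(BaseHTTPRequestHandler):
    """Stand-in for service B: accepts a job, then posts back when 'done'."""

    callback_delay = 0.1  # seconds; a real job would take much longer

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        job = json.loads(self.rfile.read(length) or b"{}")
        # Acknowledge immediately, as a long-running job endpoint might.
        self.send_response(202)
        self.send_header("Content-Length", "0")
        self.end_headers()
        # Schedule the post-back to the caller-supplied URL (hypothetical field).
        callback_url = job.get("callback_url")
        if callback_url:
            result = {"job_id": job.get("job_id"), "status": "finished"}
            timer = threading.Timer(
                self.callback_delay, post_json, args=(callback_url, result)
            )
            timer.daemon = True
            timer.start()

    def log_message(self, *args):
        pass  # keep console output quiet


def start_stub(port=0):
    """Start the stub on localhost (port 0 picks a free port); return the server."""
    server = HTTPServer(("127.0.0.1", port), StubB)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Service A would be pointed at `http://127.0.0.1:<stub.server_port>` instead of staging B, and since the stub runs on the same machine it can reach A's callback endpoint.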
Can't speak for the author, but I can tell you what we do in our company, which is also completely microservice-based.

Backstory: We used to have a helper tool that allowed a developer to run any app locally. It tried to set up the same stack as we were using in production: HAproxy, Nginx, Ruby, Node, PostgreSQL.

It was problematic, because people had machines that differed slightly: Different versions of OS X (or Linux), Homebrew (some used MacPorts), Ruby, Postgres, etc.

We could have spent a lot of time on a script that normalized everything and verified that the correct versions of everything were installed, but the problem with developing such a tool is that you won't know what's going to break until you hire a new developer who needs to bootstrap his box. Or until the next OS X release, or something like that.

Syncing everything with the production environment was also difficult. The way we configure our apps, a lot of the environment (list of database servers, Memcached instances, RabbitMQ hosts, logging etc.) is injected. So with this system we'd have to duplicate the injection: Once in Puppet (for production), a second time on the developer boxes.

So we decided pretty early on to migrate to Vagrant.

---

We now run the whole stack on a Linux VM using Vagrant, configured with the same Puppet configuration we use for our production servers.

The Vagrant box is configured from the exact same Puppet configuration that we use for our production and staging clusters. The Puppet config has a minimal set of variables/declarations that need to be tweaked for each environment. From the point of view of the Puppet config, it's just another cluster.

We periodically produce a new Vagrant box with updates whenever there are new apps or new system services. Updating the box is a matter of booting a new clean box and packaging it; Puppet takes care of all the setup. We plan on automating the box builds at some point.

To make the workflow as painless as possible, we have an internal "all-round monkey wrench" tool for everything a developer needs to interact with both the VM and our clusters, such as fetching and installing a new box (we don't use Vagrant Cloud). One big benefit of using Vagrant is that this internal tool can treat it as just another cluster. The same commands we use to interact with prod/staging (to deploy a new app version, for example) are used to interact with the VM.

One notable configuration change we need for Vagrant is a special DNS server. Our little tool modifies the local machine (this is super easy on OS X) and tells it to use the VM's DNS server to resolve ".dev". The VM then runs dnsmasq, which resolves "*.dev" into its own IP. We also have an external .com domain that resolves to the internal IP, for things like Google's OAuth that requires a public endpoint. All the apps that run on the VM then respond to various hosts ending with .dev.

Another important configuration change is support for hot code reloading. This bit of magic has two parts:

- First, we use Vagrant shared folders to allow the developer to selectively "mount" a local application on the VM; when you deploy an app this way, instead of deploying from a Git repo, it simply uses the mounted folder, allowing you to run the app with your local code that you're editing.

- Secondly, when apps run on the VM, they have some extra code automatically injected by the deployment runtime that enables hot code reloading. For Node.js backends, we use a watch system that simply crashes the app on file changes; for the front end stuff, we simply swap out statically-built assets with dynamic endpoints for Browserify and SASS to build the assets every time the app asks for them (with incremental rebuilding, of course). For Ruby backends, we use Sinatra's reloader.
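
The "crash the app on file changes" approach can be sketched roughly as follows. This is a Python illustration of the idea, not the Node.js watcher described above; it assumes simple mtime polling and a supervisor (the deployment runtime, in this case) that restarts the process when it exits.

```python
import os
import sys
import time


def snapshot(root):
    """Map every file under `root` to its last-modified time in nanoseconds."""
    mtimes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtimes[path] = os.stat(path).st_mtime_ns
            except FileNotFoundError:
                pass  # file vanished between walk and stat
    return mtimes


def changed(before, after):
    """Return the sorted paths that were added, removed, or modified."""
    paths = before.keys() | after.keys()
    return sorted(p for p in paths if before.get(p) != after.get(p))


def watch_and_exit(root, interval=1.0):
    """Poll `root` and kill the process on the first change.

    os._exit ends the whole process even when this runs in a background
    thread; the supervisor is then expected to start a fresh copy.
    """
    baseline = snapshot(root)
    while True:
        time.sleep(interval)
        diff = changed(baseline, snapshot(root))
        if diff:
            print(f"change detected in {diff[0]}; exiting for restart", file=sys.stderr)
            os._exit(1)
```

An app would typically start `watch_and_exit` in a daemon thread at boot, pointed at the mounted source folder. Crashing instead of reloading in place keeps the watcher trivial, at the cost of a full restart per edit.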

---

Overall, we are very happy with the Vagrant solution. The only major pain point we have faced is not really technical: It's been hard for developers to understand exactly how the box works. Every aspect of the stack needs to be documented so that developers can know where to look and what levers to pull when an app won't deploy properly or a queue isn't being processed correctly. Without this information, the box seems like black magic to some developers, especially those with limited experience with administering Linux.

We also sometimes struggle with bugs in Vagrant or VirtualBox. For example, sometimes networking stops working, and DNS lookups fail. Or the VM dies when your machine resumes from sleep [1]. Or the VirtualBox file system corrupts files that it reads [2]. Or VirtualBox suddenly sits there consuming 100% CPU for no particular reason. One of these problems crops up about once a week, so we're considering migrating to VMware.

Another option is to actually give people the option of running their VM in the cloud, such as on Digital Ocean. I haven't investigated how much work this would be. The downside would obviously be that it requires an Internet connection. The benefit would be that you could run much larger, faster VMs, and since they'd have public IPs you could easily share your VM and your current work with other people. Another benefit: They could automatically update from Puppet. The boxes we build today are configured once from Puppet, and then Puppet is excised entirely from the VM. Migrating to a new box version can be a little painful since you lose any test data you had in your old box.

As for your question about what services to run: It's a good question. Right now we only build a single box running everything, even though we have a few different front-end apps that people work on that all use the same stack. We'll probably split this into multiple boxes at some point, as memory usage is starting to get quite heavy. But since all the apps share 90% of the same backend microservices, the differences between the boxes are mostly going to be which front-end apps they run.

[1] https://www.virtualbox.org/ticket/13874

[2] https://www.virtualbox.org/ticket/819

We do roughly the same thing, but with Docker and Fig. We're already using Docker to package our services for deployment, so using Fig to spin up a full environment is pretty painless. And Fig also makes it simple to spin up environments for running integration tests as part of CI to ensure that the Docker containers being deployed and used by developers are always in a working state.
We are indeed planning to migrate to a Docker-based deployment system. I have looked briefly at Docker Compose (new name of Fig), and also at somewhat bigger orchestration systems like Kubernetes.

One goal is to get rid of Puppet (which is, frankly, a buggy, badly-designed mess) and move to a more dynamic, fluid orchestration system based on discovery and autoscaling.

I am always amazed at these comments. This in itself contains as much info as the OP! I feel like they are getting lost in HN. Copy-paste to Medium? :)

Thanks!

Bookmarked this, thanks for taking the time to write it up.
It really depends on the size of your micro services. If the services are small, then running each one (via some master shell script) shouldn't be an issue. If the services are large, then something like Vagrant would enable you to get a near-live environment fired up quickly and easily.
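
For the small-services case, the "master script" can be very little code. Here is a rough sketch in Python rather than shell; the service names and start commands in `main` are placeholders, not anything from the thread.

```python
import subprocess
import sys
import time


def start_all(services):
    """Start every service as a child process and return the processes by name."""
    return {name: subprocess.Popen(cmd) for name, cmd in services.items()}


def stop_all(procs, timeout=5):
    """Ask every child to terminate, then kill any that do not exit in time."""
    for proc in procs.values():
        if proc.poll() is None:
            proc.terminate()
    for proc in procs.values():
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()


def main():
    # Hypothetical services: replace these commands with your real ones.
    services = {
        "service-a": [sys.executable, "-m", "http.server", "8001"],
        "service-b": [sys.executable, "-m", "http.server", "8002"],
    }
    procs = start_all(services)
    try:
        # Keep running until any child exits or the developer hits Ctrl-C.
        while all(proc.poll() is None for proc in procs.values()):
            time.sleep(0.5)
    except KeyboardInterrupt:
        pass
    finally:
        stop_all(procs)
```

Calling `main()` brings everything up together and tears it all down together, which is about as far as this approach scales before something like Vagrant or Docker becomes worth the overhead.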
Wouldn't blue-green deployment and immutable servers make this a non-issue? Or to put it differently, you probably shouldn't have a staging environment.