Running Java on Docker images on a Mac (vanwilgenburg.wordpress.com)
45 points by jvwilge 3328 days ago
4 comments

VM, inside a virtualization layer, inside an operating system, inside a VM, inside an operating system.
What are the reasons for wanting to run Java inside Docker? The JVM is supported, available, and portable on every environment Docker is, so what does it get you?
Also, many common problems like targeting the correct JDK, customizing start parameters and classpath, shipping agents, selecting native libraries, etc. are nowadays better solved by using Capsule for packaging.

It gets you a single jar without classloader interference or onejar-zipping madness

It even supports automatic download of dependencies like the Clojure runtime and sharing those between apps.

http://www.capsule.io

Developing/testing for multiple versions at the same time, having repeatable tests, versioning, using the same environment for development as your final deliverable (e.g. a Linux environment not OS X), being able to take your container and run it on production as is, and more things besides?
Wouldn't a build server be better for that sort of release testing? Like Travis CI or a GitLab CI runner on DO, Lightsail, etc. at $5/month.
Not all developers have access to a build server or want to bother setting one up.
Yes; you use the build server to build the docker image, and run tests on the docker image in a test environment.
A significant number of Java apps (think finance/high-frequency trading) rely heavily on the Java Native Interface.

A ton of microservices are built with the JVM. Turns out containers are pretty good at that as well.

I feel like I just read a bunch of "facts" that don't answer the question and have no real reasoning behind them.
To clarify a bit: "platform-dependent/hard-to-run code" exists in Java land. Docker is good at managing native dependencies.

I'm also assuming folks reading this have had some context around microservices and k8s. Microservices no matter what language are well suited to containers and container orchestration. It's a widely used pattern to have self contained services in containers with some sort of orchestration managing the containers those services are deployed in.

Those "facts" were targeting an audience that has at least some familiarity with containers and the basics of the problems they solve, but maybe less familiarity with the JVM (which, from my experience, is the typical reader on HN).

One I've seen is less technical, and more organizational. Once it's in docker, the development team now owns things like the jvm version, command line parameters (thread pool settings, gc settings, etc). Lacking that, in some large companies, some other team has control of your environment. Silly, yes, but that is one of the bigger motivators I've seen for docker use at big companies...clawing back control.
This is both good and bad. Good for developers who can control more, bad for sysadmins who have to force JVM updates to address security vulnerabilities and inefficient settings (hello world pre-allocating 10 gigs of memory).
Most likely, organisations that need this kind of workaround do not have sysadmins who monitor or track JVM updates. They just have an inflexible process guardian with a 2-week lead time and at least 1 man-week's worth of effort on a specific budget code. Of course, the only source of change requests would be the developers themselves.

I have seen organisations devolve like that on the database side. At one time you have a DBA team monitoring servers, planning for capacity, upgrading, optimising scripts and owning the data model. A few years of cost cutting later, 3 guys own the data model of tens of applications over hundreds of servers, basically becoming a huge bottleneck for the 300+ developers. When adding a column eats half of your project budget and adds uncertainty to the delivery date, developers become creative. The best of them sneak a way to get DBA access; the others multiplex values into existing columns (hello, "PROPERTIES" XML column!).

It'll come full circle soon enough. Some PHB will centralize the developers that deal with docker, and we'll start over.

In the early days of Unix, there weren't sysadmins really...just one or two devs on each team that dealt with it. They were later centralized, and...seen this cycle before.

My primary reason is orchestration. By putting it inside Docker it runs like every other application in our Kubernetes cluster.

If everything you run is Java, it might not make sense, but in a polyglot environment, the uniformity can be really beneficial.

I have faced issues with each piece of software requiring a specific version of the JDK. I don't work with Java, so using Docker would have been a simpler solution; sadly, this was before Docker became mainstream.
In the common and unfortunate situation where devops does not allow app developers access to prod environments, containers are a dream. If the prod env is running Docker, your container is almost certain to work.
SOX 404 mandates segregation of duties. You literally cannot access prod if you also have access to dev.

Granted, this is only for US public companies, and only for systems that are in scope for your 404 audits.

We run java + clojure in Docker and we do it so that we can connect all of the dependencies together with Docker-compose. We don't "develop" inside docker, but we do integration tests in docker. We also deploy (CI and prod) on docker for convenience.

This setup also allows us to test resiliency under bad network conditions using https://github.com/IG-Group/Havoc/

This is how I like to use Docker. Especially if it's a project that I'm working on every day. Docker-compose makes running a single app and all of its dependencies locally really simple.

The other use case for Docker, for me at least, is running specific versions of libraries. The other day we had a problem with npm and it was working for everyone locally. Turns out we were a few versions behind and it was much simpler to do `docker run -it node:0.12 bash` than to install the node version manager alongside my homebrew install of node.

Running multiple specific versions of the JDK is kind of a pain. So if you are working on projects targeting specific versions of the JDK, then isolating each one in its own container isn't the dumbest thing to do.
But initializing the JAVA_HOME and related env variables before launching your java process should help, right?
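
For what the JAVA_HOME approach might look like in practice, here is a minimal sketch in Java that builds a launch command for a chosen JDK. The class name `JdkLauncher`, the method `forJdk`, and the path `/opt/jdk8` are hypothetical, picked only for illustration; nothing here comes from the linked article.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class JdkLauncher {
    // Builds (but does not start) a process that runs the `java` binary
    // from the given JDK home, with JAVA_HOME pointing at the same place.
    static ProcessBuilder forJdk(Path jdkHome, String... args) {
        List<String> command = new ArrayList<>();
        command.add(jdkHome.resolve("bin").resolve("java").toString());
        for (String arg : args) {
            command.add(arg);
        }
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.environment().put("JAVA_HOME", jdkHome.toString());
        pb.inheritIO(); // show the child's output in this console
        return pb;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical install location; pass your own as the first argument.
        Path jdkHome = Paths.get(args.length > 0 ? args[0] : "/opt/jdk8");
        if (!Files.isExecutable(jdkHome.resolve("bin").resolve("java"))) {
            System.out.println("No java binary under " + jdkHome);
            return;
        }
        int exit = forJdk(jdkHome, "-version").start().waitFor();
        System.out.println("exit code: " + exit);
    }
}
```

This works per process, but every machine still needs each JDK installed at a known path, which is the part a container image would capture for you.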
Or, just use Docker to capture all of that :)
if, say, zookeeper was a dependency for local development and you didn't want to go through the installation on every laptop... i guess?
That's right, and another possibility is to create a small cluster to do integration testing (on Jenkins, for example). It's easier to share containers with colleagues, so you don't have to worry about dependencies. It's also easier to limit CPU/memory in a container to check the CPU/memory requirements of your application.
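
As a rough way to see what the JVM thinks it has been given under such limits, a small sketch like the following could be run inside the container. The class name `ContainerLimits` is made up for illustration. Note that, depending on the JVM version, the reported values may reflect the host rather than the container's limits.

```java
public class ContainerLimits {
    // Summarises the CPU count and maximum heap size the JVM reports.
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        long maxHeapMiB = rt.maxMemory() / (1024 * 1024);
        return "cpus=" + rt.availableProcessors() + " maxHeapMiB=" + maxHeapMiB;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once with no limits and once with `docker run` options such as `--memory` or `--cpus` gives a quick comparison of what the application actually sees.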
I have been working on a Clojure (so JVM) program that I run in Docker for Mac.

It requires some data on disk, which is about 100GB. Since my laptop has a pathetic 250GB SSD I got an external drive for this. All well.

But I get the problem that the Mac shuts down after doing IO from inside the VM for about 10-20 minutes. Just a black screen and a second later reboot.

Has anyone on this forum had the same issue? I have the same when running in VirtualBox, and after wiping the Mac totally and reinstalling the OS. Happens using both USB and Thunderbolt.

Docker for Mac has absolutely dreadful filesystem performance for mounted volumes (Docker Toolbox with the extra VirtualBox indirection actually ends up being faster if you're doing filesystem reads even remotely frequently). If you need to work with 100GB on disk you're going to have a lot of problems.

https://github.com/docker/for-mac/issues/77

For this reason I went out and bought an Intel NUC and am now running NixOS on it. I develop on that machine, with a Clojure nREPL connected via SSH. Works great, especially since my service can now keep running and fetching data from the internet when my laptop is closed, so that next time I work I don't have to wait for my service to catch up with the world.
Out of curiosity, what are some reasons to run the Clojure program inside of docker? Could you bypass the issue by just running it locally?
I do so because the Clojure app I'm writing also depends on other programs from 3rd-party vendors (open source stuff). I'm too old to spend time setting shit up on every development machine, server, etc. If I can just write one declarative file (docker-compose.yml) and then be done, that's nice, so that is what I do.

Deploying on Linux anyways, so Docker isn't a performance overhead.

> It requires some data on disk, which is about 100GB

As someone who develops on a Macbook with a 128 GB drive, this is completely crazy to me. How is it taking up 100 GB? I use Docker for Mac for most of my day-to-day development, mostly for running docker-compose environments so I have separated Redis/Postgres environments for each app. They take up nowhere near 100 GB. Something sounds very wrong, but I can't imagine how it would be caused by the JVM running Clojure.

> Something sounds very wrong

How can it be wrong to simply have 100 GB of data that you need to operate on? If that's how much data you have then that's how much you have.

Sometimes applications have to read and process data from the filesystem. I have several ~100 GB datasets of large amounts of sensor data.
> How is it taking up 100 GB?

A couple of blockchains. As said, it's the data of my app, not Clojure or Docker.

Oh, I read it as "Yeah it takes up some space on my drive, like 100 GB". Having it as actual data the program is processing makes way more sense, sorry about that.
it sounds like OP's program reads some dataset, and it is 100GB in size.
So much for "write once, run everywhere". "Write once, run on Linux" doesn't sound as good :(
I believe it's more like "write once, run everywhere except for Mac". Microsoft is working on running Linux containers natively on Windows.
Both the xhyve and Hyper-V drivers for Docker do the same thing: boot a tiny Linux VM that can run Docker for you.
I'm very excited about how the Linux syscall ABI has become the new standard - Windows, Linux, FreeBSD, and Solaris (well, SmartOS, and Solaris 10) all support it now. If only macOS would catch up!