
This seems like a really exciting, albeit natural, evolution of the Docker "platform". It's really true that no one cares about containers, everyone just wants their apps to work.

That being said, the one giant omission from Docker still seems to be management of volumes/data. Great, we can run 100 nodes on AWS in a matter of minutes, but if your system has data storage requirements (e.g. almost every database ever), you're kinda left on your own still. How does Docker Orchestration migrate the data volumes?

They really tried to sell this as "You don't need anything but the AWS beta, the DAB and a few commands", which would be wonderful. However, with the need for reliable data storage, you're still stuck doing everything "the old-fashioned way".

(Edit: No, I don't mean store data IN the container; I presume no one is that silly. I meant the attached volumes. No volume mgmt = greatly less helpful.)



Docker relies on the already existing storage providers (EBS, GCE, Azure, NetApp, EMC, etc.) to manage multi-host data.

Ultimately how do you manage data when it's not in a container? This is how you should be managing data inside the container as well, and docker provides an abstraction for doing this.

P.S. if you are at DockerCon, come to my talk on storage in Docker tomorrow @ 2:25


I guess the question is what do you do if you don't have one of those "existing storage providers" at your disposal. For example, I have a "traditional" collection of VPS instances hosting a variety of services. I want to containerize those services and run them on some kind of a cluster. My VPS provider doesn't offer anything besides local volumes. How do I manage persistent volumes such that they are created/destroyed as needed (say for scaling up/down database read-only slaves) and follow their containers around?

The answer to, "Ultimately how do you manage data when it's not in a container?" is, "Well, today, I spin up a new VPS and clone the database from the master (or a designated read-only slave) on first boot." Obviously that is not the pattern we want to pursue here.

It seems like the answer to the GP's question is that you have to build out the storage container cluster infrastructure in addition to your compute container cluster infrastructure, or move to a provider that offers both as a service. I don't know if that's GlusterFS or something else. This is where people like me need guidance, including on how to orchestrate storage containers with compute containers.


Yes, Docker does not provide storage infrastructure services. If all you are looking for is advice on how to handle storage, this is not a problem of the platform but rather one of education.

All someone can do is educate on the pros and cons of different solutions.


Docker containers are not meant to be persistent. Things like storage and state should be kept elsewhere and linked into your containers. You should be able to dynamically tear your containers down or spin up 20 of them and load balance requests between different ones.

If you are trying to store a MySQL database inside of a Docker container, you are missing the whole point.
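To make the "tear down or spin up and load balance" idea concrete, here is a minimal sketch in plain Python, not real Docker tooling. The `Pool` class and the `web-N` container names are hypothetical and only model the concept: because the containers hold no state, any one of them can serve a request, and removing one loses nothing.

```python
class Pool:
    """Hypothetical pool of interchangeable, stateless containers."""

    def __init__(self, names):
        self.names = list(names)
        self._next = 0  # counter used for round-robin selection

    def pick(self):
        """Return the next container name in round-robin order."""
        if not self.names:
            raise RuntimeError("no containers available")
        name = self.names[self._next % len(self.names)]
        self._next += 1
        return name

    def add(self, name):
        """Spin up: a new container simply joins the rotation."""
        self.names.append(name)

    def remove(self, name):
        """Tear down: nothing to migrate, since no state lives here."""
        self.names.remove(name)


pool = Pool(["web-0", "web-1", "web-2"])
picks = [pool.pick() for _ in range(4)]
pool.remove("web-1")
after_removal = pool.pick()
```

The sketch only works because nothing is stored in the pool members, which is exactly the property that volumes and databases do not have.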


> docker containers are not meant to be persistent. Things like storage and state should be kept elsewhere and linked into your containers.

This just means you need two systems to handle job creation, migration, replication, and load balancing. Everything you listed for docker containers is also needed for databases.

You need to remember: at scale, nothing is persistent. You need to notice a failure in your database systems, tear down the broken one, spin up a new instance, migrate the data from a replica, and keep on going.
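The recovery loop described above can be sketched roughly as follows. This is an in-memory model only, under stated assumptions: `Node`, `recover`, and the `db-N` names are hypothetical, "migrating data" is reduced to copying a dict, and a real system would also need failure detection, consistency checks, and actual volume handling.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical database instance; `data` stands in for its volume."""
    name: str
    healthy: bool = True
    data: dict = field(default_factory=dict)


def recover(cluster, make_name):
    """Replace each unhealthy node with a fresh one seeded from a healthy replica."""
    replicas = [n for n in cluster if n.healthy]
    if not replicas:
        raise RuntimeError("no healthy replica left to restore from")
    source = replicas[0]

    repaired = []
    for node in cluster:
        if node.healthy:
            repaired.append(node)
            continue
        # Tear down the broken node (it is simply dropped here) and
        # spin up a replacement with a copy of the replica's data.
        repaired.append(Node(name=make_name(node.name), data=dict(source.data)))
    return repaired


cluster = [Node("db-1", data={"k": "v"}), Node("db-2", healthy=False)]
cluster = recover(cluster, lambda old: old + "-new")
```

Note that the step that copies `source.data` is the part the thread is arguing about: it is the one that an orchestrator without volume management leaves to you.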


It seems to me they are more focused on NoSQL databases with multiple nodes and shared data, like Cassandra. That way you don't have to worry too much about volume/data mgmt, because you will have multiple copies of the data on other containers when you lose a Docker host or a container.

You should also consider that Docker might not be the best option for your much-needed databases, as opposed to just some VMs on OpenStack.

I'm stuck trying to convince my co-workers that Docker is not the answer to everything.


In no way did I mean to convey "store data IN the container" as I had simply assumed no one was that silly.

What I actually meant was "there is no way to deal with volumes". Spinning up 100 instances with hot failover doesn't help if the orchestration doesn't deal with the attached volumes. Without that functionality, the orchestration is less magic and more "eh".


Don't use OpenStack. It's an appropriate solution never. I've been at too many shops where OpenStack is just a bottomless cesspool of the company hemorrhaging money to keep a shitty broken platform barely stable.



