It’s been days since my last blog, and I have been thinking about topics to cover in later posts. I considered exploring the other distributed systems available around us, but decided to stick with Docker for a while. In this blog we will dig a little deeper, because we often rush into building our applications and miss many details that we should have considered. We will begin with two basic but important terms: Docker images and containers.
For now, let us say that images are nothing but the compiled form of the instructions written in a Dockerfile. Images, even base images, can be built from scratch using a Dockerfile. Following best practices for writing Dockerfiles keeps images well optimised. There are some other things as well that should be kept in mind while working with Docker images and containers. I would like to mention some of them:
Build It Properly:-
- Images are built from multiple layers. Each layer corresponds to an instruction in the Dockerfile that changes the filesystem, such as RUN, COPY, or ADD. Docker images should be as small as possible, so do not run any unnecessary commands while building them. To keep the number of layers to a minimum, file manipulation such as moving, extracting, and removing should ideally be done in a single RUN instruction.
- Image layers are read-only and can be shared by other images and containers. Whenever you pull an image with the docker pull command, layers that already exist locally are reused instead of being downloaded again. When a container is launched, an additional writable, non-shareable layer is created on top of the read-only image layers. Because the contents of this writable layer are lost when the container is removed, we should not store data inside Docker containers. For storage purposes, we should use Docker volumes.
- Avoid building images from Docker containers with the docker commit command. To understand why, we should know that a container filesystem uses the copy-on-write (COW) technique, which allows it to share resources with images and other containers.
- Writing to or modifying an existing file or package is handled by the storage driver. The details may vary from driver to driver, but the general process is as follows:
When a file that comes from the image is modified in a container, a copy-on-write event takes place.
The file is searched for in the read-only image layers from top to bottom and copied to the container’s writable layer.
The file is then modified there, and the container now reads it from the top layer. Every such modification adds data to the writable layer and consumes disk space, and when the container is committed all of it becomes part of the new image, so we get a giant image.
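The layering advice above can be sketched in a short Dockerfile. This is only a minimal illustration under stated assumptions, not a definitive recipe: the base image debian:bookworm-slim, the installed packages, and the path /var/lib/app-data are all picked for the example and do not come from any particular project.

```dockerfile
# Assumption: any small base image would do; debian:bookworm-slim is just an example.
FROM debian:bookworm-slim

# Install and clean up in ONE RUN instruction. If the "rm" lived in a
# separate RUN, the package lists would already be stored in the previous
# read-only layer and the image would not get any smaller.
RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates curl \
    && rm -rf /var/lib/apt/lists/*

# Hypothetical data directory: declaring it as a volume keeps persistent
# data out of the container's writable layer.
VOLUME /var/lib/app-data
```

Rebuilding from a file like this with docker build, instead of running docker commit on a modified container, keeps the image reproducible, and docker history on the resulting image lists the layer each instruction produced.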
Use It Efficiently:-
- Docker lets you attach a tag to your image name, and tags should be used very carefully. Never rely on the latest tag for your images. If you provide no tag, the image gets the latest tag by default, so successive builds end up under the same name and your images might get mixed up, which can lead to disasters that are hard to recover from. Use tags as a version-control tool that is easy to remember and manage.
- The Docker community keeps saying to run one process per container. Sometimes, however, an application’s requirements do not allow its processes to be split across different containers. In that case multiple processes can be run inside one container using Supervisor, a proper entrypoint script, and so on. Also keep in mind that Docker is not built to log multiple processes properly, so we will need to set up a proper logging mechanism for these processes.
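- It is good practice to pass credentials as arguments at build time instead of hard-coding them. Credentials need to be changed from time to time, so hard-coding them isn’t the best way to go. Keep in mind that build arguments remain visible in the image history (docker history), so they should not carry secrets that must stay private.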
- By default, Docker gives root access to any user who enters the running container. That is fine in development and testing environments, but on production servers it might not be as good an idea as one may think. Docker’s security policies keep improving, but we should still try to use a non-root user to access files and data inside the container.
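The points on build arguments and non-root users can be sketched as follows. Treat this as a hedged example rather than the one right way to do it: the argument name API_TOKEN, the user and group name app, and the image name myapp are hypothetical and chosen only for illustration.

```dockerfile
# Assumption: same example base image as any Debian-based build.
FROM debian:bookworm-slim

# Hypothetical build argument: the value is supplied at build time
# instead of being hard-coded in this file. Its value can still be
# recovered from the image history, so do not pass real secrets this way.
ARG API_TOKEN

# Fail the build early if the argument was not provided.
RUN test -n "$API_TOKEN" || (echo "API_TOKEN build arg is required" >&2; exit 1)

# Create an unprivileged system user and group (names are illustrative).
RUN groupadd --system app \
    && useradd --system --gid app --no-create-home app

# Everything from here on, including the running container, uses this
# user instead of root.
USER app

# Print the name of the user the container runs as.
CMD ["id", "-un"]
```

A build could then look like docker build --build-arg API_TOKEN="$API_TOKEN" -t myapp:1.0.0 . where the explicit 1.0.0 tag takes the place of the default latest tag.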
There are other points as well that one might consider while dockerizing an application. The points above, however, are good enough to kickstart your Docker application in an optimised manner.