While working with Docker, we often use a Dockerfile to build an image for our applications. A Dockerfile is a text file containing the instructions needed to build a Docker image. It has a specific format and uses a specific set of instructions, so it is important to know the best practices for writing a Dockerfile that is effective and easy to use.
General Guidelines :
- A Dockerfile should produce an image from which a container can be easily launched, stopped, destroyed, and rebuilt with minimal setup and configuration.
- Put each Dockerfile in an otherwise empty directory, then add only the files needed to build the image. To improve build performance, exclude unnecessary files and directories by adding a .dockerignore file to that directory.
- Avoid installing unnecessary packages to reduce the complexity, dependencies, file sizes, and build time.
- Aim for “one process per container”. For example, a web application stack might use three separate containers for the web application, the database, and an in-memory cache. This is not a hard and fast rule, so use your best judgment to keep containers as clean and modular as possible.
- Keep the number of layers in an image low, while balancing that against the readability of the Dockerfile.
- Sort multi-line arguments alphanumerically. This helps avoid duplicate packages and makes the list easier to update.
- While executing the instructions in a Dockerfile, Docker looks in its build cache for an existing image layer it can reuse, which speeds up the build. It helps to know the basic rules Docker follows when using the cache. If you do not want to use the cache at all, pass the “--no-cache=true” option to the “docker build” command.
Docker Instructions:
FROM :
Use the current Official Repositories as your base image whenever possible. For example, use Debian images as they are very tightly controlled and kept minimal, while still being a full distribution.
RUN :
To make your Dockerfile more readable and maintainable, always split long and complex RUN statements into multiple lines separated with backslashes.
You should avoid “RUN apt-get upgrade” or “dist-upgrade”, as many of the “essential” packages from the parent images won’t upgrade inside an unprivileged container. If a package contained in the parent image is out-of-date, you should contact its maintainers.
Always combine “apt-get update” with “apt-get install” in the same RUN statement. If “apt-get update” sits in its own RUN statement, Docker caches that layer, and a later change to the list of packages to install reuses the stale package index instead of refreshing it. Combining the two ensures the index is refreshed whenever the package list changes.
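The RUN guidelines above can be sketched as follows. This is only an illustration: the base image tag and the package names are assumptions, not part of the original advice. It also shows the sorted, one-per-line argument style from the general guidelines.

```dockerfile
# Assumed base image for illustration; any Debian-based image works the same way.
FROM debian:bookworm-slim

# update and install share one RUN instruction, so a change to the package
# list re-runs apt-get update too and never reuses a stale cached index.
# Packages are listed one per line, in alphabetical order.
RUN apt-get update && apt-get install -y --no-install-recommends \
        ca-certificates \
        curl \
        git \
    && rm -rf /var/lib/apt/lists/*
```

Removing /var/lib/apt/lists in the same instruction keeps the package index out of the resulting layer.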
WORKDIR :
When using WORKDIR, always use absolute paths for clarity and reliability. Also use WORKDIR instead of chaining “cd” commands in RUN statements, which are harder to read, maintain, and troubleshoot.
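A small sketch of the idea, assuming a hypothetical application directory /opt/app and an illustrative base image:

```dockerfile
FROM debian:bookworm-slim

# Absolute path; Docker creates the directory if it does not exist.
WORKDIR /opt/app

# Later instructions run relative to /opt/app, so no "cd" is needed.
COPY . .
RUN ls -la
```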
ADD or COPY :
In most cases, COPY is preferred over ADD, although the two have similar functionality. The reason is that COPY is more transparent: it supports only basic copying of local files into the container, while ADD has extra features, such as tar extraction for local files and remote URL support, that are not always obvious. The best use for ADD is auto-extracting a local tar file into the image. For example:
    ADD something.tar.gz /
If several Dockerfile steps use different files from your context, COPY those files individually rather than all at once. Each step’s build cache is then invalidated, forcing the step to be re-run, only when the files that step actually needs have changed.
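As a rough illustration, assume a Python project whose dependencies are listed in a requirements.txt file (both the project type and the file name are assumptions made for this sketch):

```dockerfile
FROM python:3

# Copy only the dependency list first, so the install step below stays
# cached until requirements.txt itself changes.
COPY requirements.txt /tmp/
RUN pip install --requirement /tmp/requirements.txt

# Copy the rest of the context afterwards; edits to other files invalidate
# the cache only from this instruction onward.
COPY . /tmp/
```

Had everything been copied with a single COPY before the install step, any change to any file would force the dependencies to be reinstalled.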
Also, use curl or wget in a RUN statement to fetch packages from remote URLs instead of using ADD. This avoids an extra layer in your image and helps reduce the image size, because you can delete the files you no longer need after they’ve been extracted in the same step.
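A hedged sketch of fetching with curl rather than ADD. The URL and target directory are placeholders, and the base image is assumed to be one that already ships curl:

```dockerfile
# Assumed base image that already includes curl and CA certificates.
FROM buildpack-deps:curl

# https://example.com/big.tar.gz is a placeholder URL.
# Download and unpack in a single RUN, so the archive itself is streamed
# straight into tar and never stored in an image layer.
RUN mkdir -p /usr/src/things \
    && curl -SL https://example.com/big.tar.gz \
    | tar -xzC /usr/src/things
```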
EXPOSE :
The EXPOSE instruction indicates the ports on which a container will listen for connections. To reduce complexity, try to use the default, traditional ports for your application. For example, for an image containing the Apache web server, use EXPOSE 80 while for an image containing MySQL, use EXPOSE 3306 and similarly for others.
You can map the container port to any of the host ports for external use by using a flag like “-p” along with the “docker run” command.
ENV :
To make new software easier to run, you can use ENV to update the PATH environment variable for the software that the container installs. For example,
    ENV PATH=/usr/local/bin/nginx/bin:$PATH
will ensure that CMD ["nginx"] works when you run the container. ENV is also useful for providing the environment variables required by the service you wish to containerize.
It can also be used to set commonly used version numbers in one place, so that version bumps are easier to maintain.
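For instance, the version-number idea might look like this. The tool name, version numbers, and install path are all hypothetical:

```dockerfile
FROM debian:bookworm-slim

# Hypothetical tool name and version numbers, defined once.
ENV MYTOOL_MAJOR=1.4
ENV MYTOOL_VERSION=1.4.2

# Later instructions reuse the variables, so an upgrade only touches
# the two ENV lines above.
ENV PATH=/usr/local/mytool-$MYTOOL_MAJOR/bin:$PATH
```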
VOLUME :
The VOLUME instruction should be used to expose any database storage area, configuration storage, or files/folders created by your docker container. It is recommended to be used for any service data that can be modified by the user.
USER :
If a service can run without root privileges, USER can be used to change from default root user to a non-root user. For this, you first need to create a user and group in the Dockerfile. For example,
    RUN groupadd -r postgres && useradd --no-log-init -r -g postgres postgres
Passing the --no-log-init flag to useradd works around a disk-exhaustion problem: creating a user with a very large UID inside a Docker container can otherwise fill /var/log/faillog in the container layer with null characters.
Lastly, to reduce layers and complexity, avoid switching USER back and forth frequently.
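Putting the USER advice together, a minimal sketch could look like the following, where “app” is a hypothetical user and group name and the base image is an assumption:

```dockerfile
FROM debian:bookworm-slim

# "app" is a hypothetical user and group name.
RUN groupadd -r app && useradd --no-log-init -r -g app app

# Do everything that needs root before this line, then switch once.
USER app

# Prints the identity the container runs as.
CMD ["id"]
```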
CMD :
The CMD instruction should be used to run the software contained in the image, along with any arguments. CMD should almost always be used in the form ["executable", "param1", "param2"]. For example, for an Apache image, you would use something like CMD ["apache2", "-DFOREGROUND"]. This form of the instruction is recommended for any service-based image.
In most other cases, CMD should be given an interactive shell or interpreter, such as bash, python, or perl. For example, CMD ["python"] or CMD ["php", "-a"]. Using this form means that when you execute something like “docker run -it python”, you’ll get dropped into a usable shell, ready to go.
ENTRYPOINT :
ENTRYPOINT is best used to set the image’s main command, which runs whenever a container is launched from the image, followed by the default flags or arguments set by CMD. When additional arguments are supplied after the image name in a “docker run” command, they override the defaults set by CMD.
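The interplay between the two instructions can be sketched with a deliberately simple command. Using ls here is only an illustrative choice:

```dockerfile
FROM debian:bookworm-slim

# The executable that always runs when a container starts.
ENTRYPOINT ["ls"]

# Default arguments, replaced by anything given after the image name.
CMD ["-l", "/"]
```

With an image built from this file, “docker run IMAGE” executes “ls -l /”, while “docker run IMAGE -a /tmp” executes “ls -a /tmp”, because the supplied arguments replace those from CMD.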
LABEL :
Try to add labels to your image to organize images by project, record licensing information, aid in automation, or for other reasons. To add a label, add a line in your Dockerfile beginning with LABEL followed by one or more key-value pairs.
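A short sketch of the syntax follows. The keys and values are placeholders chosen for illustration:

```dockerfile
FROM debian:bookworm-slim

# One key-value pair per instruction.
LABEL com.example.version="0.0.1-beta"
LABEL vendor="ACME Incorporated"

# Several pairs in one instruction, split across lines with a backslash.
LABEL com.example.release-date="2015-02-12" \
      com.example.license="MIT"
```

Values containing spaces must be quoted, as in the vendor label.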
So these were some of the Dockerfile best practices that should be followed in order to write an effective Dockerfile. Hope it was helpful. For more updates, Stay tuned!
If you need help or have a question, please contact us.
One question about copying multiple files at once, why is it different from copying them individually?
If they are copied all at once and one file changes, build cache will be invalidated. If not, why is that?