FID: A Faster Image Distribution System for Docker Platform
2nd IEEE International Workshops on Foundations and Applications of Self* Systems (FAS*W)
Abstract: Docker has been widely adopted in enterprise-level container environments. As an important part of the Docker-based container ecosystem, Docker Registry provides the service of storing, distributing, and managing Docker images, which is crucial for running Docker containers. In large-scale container platforms, deploying applications is prone to overburdening Docker Registry with a flood of network traffic, which may even cause image services to fail. In this paper, we present a new P2P-based large-scale image distribution system called FID (Faster Image Distribution), which accelerates the distribution of Docker images by taking full advantage of the bandwidth of not only Docker Registry but also the other nodes in the cluster. We implemented and validated FID on an enterprise-level cluster. The experimental results show that, compared to the native image distribution method, FID reduces network traffic for Docker Registry by at least 97%. Furthermore, it reduces distribution time by 83.50% on average when distributing images among 200 nodes, and by up to 91.35% for the 500M Hadoop image.

Keywords: Docker; Docker Registry; Peer-to-Peer Network; Distributed Container System

I. INTRODUCTION

As a kernel-level virtualization technology, Docker [1] provides an isolated environment for every container to run apps on the same machine. Developers can easily customize and package the runtime environment of their apps in a set of files, which is called a Docker Image. Once a Docker Image is created, users can run the image on any host machine with Docker Engine. This "Build once and run anywhere" feature greatly simplifies the development, delivery, and deployment of applications. More and more companies have built their own container-based platforms with Docker. The container management platform developed by Google, called Kubernetes [2], is an important part of the Docker ecosystem. Tencent, the biggest internet company in China, built a container management platform called Gaia [17], which manages over 8000 physical machines and runs over 150 million containers every day. IaaS providers such as Tencent Cloud, Ali Cloud, and Baidu Cloud also provide their own container-based services. In the future, Docker and Docker-based platforms will play an important role in the IT industry.

Docker Registry [3], which provides the service of storing, distributing, and managing Docker images, is an important component of the Docker system. A new deployment of a Docker container comprises two steps: (1) pulling the published image from the Docker Registry, and (2) starting a container based on the image. Pulling images from Docker Registry is a common and important operation: 76% of the time spent on a new deployment is spent on pulling [16]. To speed it up, administrators usually deploy a private Docker Registry in local clusters. Although this works in small cases, the Docker Registry host becomes a bottleneck for a large-scale container deployment. In our practice with Gaia, when we distribute one container onto 3,000 hosts simultaneously with the native Docker Registry, the deployment time is unacceptable. Once bandwidth is depleted, the image services provided by the Docker Registry become unavailable.

Deploying two or more instances of Docker Registry may alleviate this problem to some degree, but it introduces new problems such as image synchronization and load balancing. Therefore, a faster and more elegant method of distributing Docker images for large-scale container deployment is greatly needed.

Considering the similarity between downloading a file from a server and pulling an image from a Docker Registry, we propose a new way to accelerate image distribution for large-scale container deployment by integrating a P2P network into the Docker image distribution system. A P2P network can make full use of the bandwidth of all the nodes to share the burden of Docker Registry and increase the speed of deployment. We implemented the P2P-based large-scale image distribution system called FID on top of the open-source project Docker Distribution [7] and conducted experiments on a 200-node cluster to validate our solution. The experimental results show that FID reduces distribution time by up to 91.35% when distributing a 500M Docker image to 200 nodes in parallel, and that it reduces network traffic for Docker Registry by at least 97%. To the best of our knowledge, FID is the first enterprise-level P2P Registry system.
*corresponding author.
“balance point”, P2P solution will show its significant performance improvement compared to native Docker pulling. In their experiments, using DID to distribute images achieves an 80% speed-up over the original Docker pulling.

In DID, the Controller has to download the whole image from Docker Registry with the native Docker pulling method first, which increases the image distribution time. P2P distribution can only start after the Controller has fetched the whole image and become the first seeder, providing the original image resource in the P2P network.

B. Quay

Quay [21] is a commercial product that provides Docker Registry service with continuous integration, private image repositories, user authentication/authorization, team management, and other features for enterprises. Quay also supports the image format of rkt [14]. With the open-source Quay client [9], users are able to pull Docker images via BitTorrent.

Similar to DID, Quay does not provide P2P service in its Docker Registry. Therefore, one node has to download the whole image first, while other nodes wait or pull the image from Docker Registry, which reduces the speed and lightweight advantages of P2P. Once the first download is done and this node becomes the first seeder, all other nodes in the P2P network are able to download Docker images to local disk via BitTorrent. Finally, images are loaded into Docker Engine by calling Docker “load”.

C. Docket

Docket is a Docker Registry that adopts BitTorrent to distribute Docker images to a large number of machines. Docket generates a torrent file for the whole Docker image. The BT client then uses the torrent file to download a tarball of the Docker image, and Docket calls the “docker load” interface to load the tarball into Docker Engine. Downloading the whole image with one torrent file makes Docket unable to reuse layers that already exist on the local host, which wastes network resources.

DID, Quay, and Docket all make a torrent for the whole Docker image. As a result, the BT client needs to download the entire image tarball even when some of the layers already exist locally, although downloading the existing layers is unnecessary. Our solution is optimized to address this problem.

Previous works have introduced P2P to accelerate the dispatching of virtual machine images. VDN [26] shares VM image chunks during distribution and has excellent image distribution performance under heavy traffic. However, a Docker image is made up of multiple layers, and distributing Docker images at chunk level may break the layer structure. VDN is therefore not suitable for Docker image distribution. VMTorrent [27] also introduces P2P in VM distribution and gains a performance improvement.

IV. METHOD

A Docker image consists of a series of layers. Each layer is a read-only file system that contains a set of files or folders. A Docker image in Docker Registry is stored as two main types of static files: Manifest and Blobs. The Manifest describes the meta information about a Docker image. It indicates the layers that the image has and the Blob each layer refers to. A Blob is a compressed file of a layer, and every layer has one corresponding Blob. The relationship and difference between a Docker image stored in Docker Engine and one stored in Docker Registry are shown in Figure 1.

[Figure 1: in Docker Engine, the image is a stack of read-only layers (FROM centos, ADD file1.., yum install **) beneath a writable layer; in Docker Registry, it is stored as a Manifest plus Blob1, Blob2, and Blob3.]

Fig. 1. A Docker image and its files in Docker Registry.

Figure 2 shows the process of pulling a Docker image. It can be summarized as follows.

1) Get the Manifest file of the image
2) Find layers that do not exist locally
3) Download Blobs if the corresponding layers do not exist in the local Docker Engine
4) Decompress Blobs and import them into Docker Engine

[Figure 2: sequence diagram between User, Docker, and Docker Registry: pull cmd; (1) Get Manifest; (2) Detect layers; (3) Get Blobs (P2P Download); (4) Extract & Load.]

Fig. 2. The sequence diagram of pulling a Docker image, which can be divided into four steps: getting the Manifest, finding the layers that do not exist locally, downloading the Blobs of those layers, and extracting and importing them into Docker Engine.
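Steps 1) to 3) of the pull process can be sketched as follows. This is a minimal illustration, not FID's code: the manifest is a simplified dictionary with a `layers` list of `digest` entries, and the function name `blobs_to_download` and the shortened digests are hypothetical.

```python
from typing import Dict, List, Set


def blobs_to_download(manifest: Dict, local_layers: Set[str]) -> List[str]:
    """Return Blob digests referenced by the manifest that are absent locally.

    Order follows the manifest; duplicate digests are dropped.
    """
    missing: List[str] = []
    seen: Set[str] = set()
    for layer in manifest["layers"]:
        digest = layer["digest"]
        if digest not in local_layers and digest not in seen:
            missing.append(digest)
            seen.add(digest)
    return missing


# Hypothetical manifest with three layers; digests are shortened placeholders.
manifest = {
    "layers": [
        {"digest": "sha256:aaa"},  # FROM centos
        {"digest": "sha256:bbb"},  # ADD file1
        {"digest": "sha256:ccc"},  # yum install
    ]
}
local_layers = {"sha256:aaa"}  # the base layer is already on this host

print(blobs_to_download(manifest, local_layers))
```

Only the two Blobs missing from the host are selected, which is the per-layer reuse that a whole-image torrent cannot offer.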
Usually, the Manifest is a small text file, so only the way Blobs are downloaded needs to be optimized. To use P2P image distribution, we can change step (3) to P2P downloading. In the usage scenario (3) mentioned earlier, the internal network topology is more stable than the Internet, where peers may join or leave a P2P network at any time [24]. In addition, the objects we distribute are static files. We can therefore use BitTorrent for P2P image distribution. To download Blobs with BitTorrent, every Blob needs a corresponding torrent file, and Docker Registry should be the torrent maker and initial seeder. The torrent file is made when the Blob upload finishes. Blob data and torrent files are stored on the backend storage. Docker Registry therefore needs to provide a torrent fetching interface through which every BitTorrent client can fetch a torrent file.

Another way to get a Docker image is docker load, which loads an image from a compressed tar archive or STDIN. The tar archive contains Blobs and layer configurations. To import an image with docker load, a tar archive of this image is needed. Registry provides interfaces for getting the Manifest and Blobs via HTTP, so we can obtain the materials that the tar archive requires by calling Registry's interfaces. BitTorrent is integrated into this process by using it for Blob downloading (Step 3).

V. IMPLEMENTATION

To achieve the goals listed earlier, we carefully designed the system architecture. Figure 3 shows the architecture of FID. Several instances of P2P Registry are deployed in the system. To keep data consistent between P2P Registries, a shared distributed storage is adopted. FID Agent is a new component that is responsible for downloading Docker images. BT trackers keep peers' information, and peers find each other via the trackers. The details of every component are described below.

Fig. 3. Architecture of FID. Boxes with dotted lines represent Docker Hosts. The black lines indicate the data flow of Docker images, and the blue lines indicate the communication between BitTorrent peers and trackers.

A. P2P Docker Registry

Based on Docker Registry [7], we developed P2P Docker Registry, which supports P2P image downloading. There are three main changes in Registry.

1) Integrated a BitTorrent client in Registry

We added a BitTorrent client to the Registry runtime. It is the core modification for P2P Registry. Unlike the BT client in FID Agent, the BT client in P2P Registry only sends data to other peers and never receives data, because P2P Registry already has the data of the Blobs.

2) Generate torrent file at the end of Blob uploading

Before Docker Engine pushes an image to Docker Registry, it exports every layer of the image as a compressed file called a Blob. It then uploads the Blobs via the uploading interface provided by Registry. The best time to generate the torrent file is at the end of the Blob upload, so we modified the Blob uploading interface and added a torrent generation function at its end. After the torrent file is generated, it is stored on the backend storage.

3) Added an interface for getting torrent files

Through this interface, other clients can get the torrent files of layers via HTTP requests. Clients need to provide the Blob ID of a layer in the HTTP request. The response of the torrent getting interface is the content of the torrent file. P2P Registry also needs to be the initial seeding peer in the BitTorrent distribution. Otherwise no one can get the data from the P2P network even though they hold the torrent file.

To be the initial seeding peer, P2P Registry gets the Blob from the backend storage and stores it in the BT client's work directory. The BitTorrent client then gets the torrent corresponding to this Blob and starts to download the file specified in the torrent file. Because the file already exists in the BT client's work directory, the BT client only needs to announce to the trackers that it has all pieces of this resource. Other peers can then find P2P Docker Registry through the peer-listing interface provided by the trackers.

B. FID Agent

To avoid modifying the existing source code of Docker Engine, we developed a component called FID Agent to handle BT downloading. FID Agent has two working modes. The first is called load mode. In this mode, FID Agent downloads Blobs and packages them according to the organizational structure of a Docker image's tar archive. The agent then calls Docker's load interface to load this tar archive [23]. DID and Docket have similar implementations. The other mode is called proxy mode. In this mode, FID Agent runs as a lightweight HTTP proxy for the Docker Engine. FID Agent intercepts Docker's Blob downloading requests, downloads the Blobs via BitTorrent, and returns the Blob data to Docker Engine. FID Agent adopts a random peer selection policy so that the P2P Registry is not favored as a peer, which reduces the network load on the P2P Registry.

The differences between load mode and proxy mode are elaborated below.

1) Load mode

While studying the interfaces of Docker Engine, we found that there is no interface that can import a layer of an image
separately. The only way to import a Docker image is to call the “docker load” interface. So in this mode, FID Agent downloads the layers that do not exist in the Docker daemon and packages them into a tar file. The pseudo-code for FID running in load mode is shown in Algorithm 1.

In load mode, FID Agent provides a RESTful API to users. Users call this API with the image name to pull the image instead of typing the “docker pull” command. After receiving the pulling request, FID Agent calls the P2P Registry's RESTful API to get the Manifest for the specified image. Next, FID Agent analyzes the Manifest and gets the Blob IDs that belong to the image. In this step, FID Agent checks whether the layers already exist on the Docker host. Existing layers are not downloaded. FID Agent then calls the “Get Torrent” interface of P2P Registry for every Blob and hands the torrent files it receives to the BT client to start the downloading process. The BT client runs inside FID Agent and creates a download task for every torrent file. When all the download tasks are finished, FID Agent packages the Blobs and configuration files according to the organization of the image tar archive. Finally, the “docker load” interface provided by the Docker client library loads this tar archive into Docker Engine.

2) Proxy mode

Docker Engine has a configurable option called “http_proxy”, which specifies a proxy server for Docker Engine. Every HTTP request sent from Docker Engine is forwarded to the proxy server first. Based on this feature, we developed proxy mode for FID Agent. The pseudo-code for FID running in proxy mode is shown in Algorithm 2.

In proxy mode, FID Agent intercepts all the HTTP requests from Docker Engine, but it only processes the Blob requests and simply forwards the others. FID Agent extracts the Blob ID from a request and then calls the “Get Torrent” interface of P2P Docker Registry to get the torrent that belongs to this Blob. The P2P downloading process is the same as in load mode. At the beginning of the downloading task, FID Agent gets a reader of the file being downloaded. It can then read data from this reader and write it to the Blob response continually.

Proxy mode does not need to handle many of the problems that load mode faces, such as Docker version adaptation, layer detection, and image loading. In proxy mode, FID Agent focuses only on Blob downloading.

Proxy mode is clearly more lightweight than load mode. We conduct experiments to compare the performance of the two modes in the next section.

C. Docker Engine

In our architecture, FID Agents are responsible for downloading images. P2P is integrated in FID Agent, so the P2P logic is decoupled from Docker. In our design, the source code of Docker Engine is not modified.

In load mode, FID Agent uses the “docker load” command to import a tar archive into Docker Engine. Nothing is needed in this situation except keeping the Docker Engine running.

In proxy mode, the proxy address must be set in the Docker Engine configuration, and the docker daemon must be restarted for the configuration to take effect. Starting with Docker 1.12, Docker added a new feature named “live restore”. Users can configure the Docker Engine so that containers remain running even if the Docker Engine is terminated. Before Docker 1.12, restarting Docker Engine killed all the running containers. The version of Docker we used supports “live restore”, so modifying the Docker configuration file and restarting Docker Engine do not affect the running containers.

D. BT Tracker

The BT Tracker is a core component of a BitTorrent network. The tracker assists communication between peers using the BitTorrent protocol. If the tracker stops, the BitTorrent network stops working. For high availability of the service, we added multiple trackers to our system design. The addresses of these trackers are written to the torrent files when the torrent files are generated. During a BitTorrent download, every BitTorrent client announces the resources it has to all trackers. Even if some of the trackers stop, the BT client can still get other peers' information from the running trackers.

E. Storage

Several types of storage are supported, such as the file system, swift [15], and gcs (Google Cloud Storage) [25]. In the cross-IDC scenario, the distributed file system Ceph [11] is recommended. Its multi-cluster synchronization feature helps synchronize data between several IDCs, which is important for large-scale container deployment across multiple IDCs. Bandwidth between IDCs is known to be far lower than bandwidth within an IDC. Multiple Ceph clusters can be deployed in multiple IDCs, with several Registries deployed in different IDCs. Registry
gets data from the nearest Ceph, and Docker Engine pulls images from the nearest Registry. This reduces the traffic between IDCs and the time of distributing images. It is worth mentioning that Docker Registry can be easily configured and connected to Ceph via the Swift interface [19].

VI. EXPERIMENT

To verify that our design and implementation meet the design goals, we conducted experiments to evaluate FID. First, we compared the performance of load mode, proxy mode, and the original docker pull on a single node. Then, we measured the performance of FID in large-scale deployment. Furthermore, we analyzed the network traffic during P2P image distribution.

A. Distribution time of load mode and proxy mode

In our implementation, we provide a FID Agent that can run in two modes. A P2P protocol brings extra communication to a network, so the performance difference between P2P distribution and Docker native distribution needs to be compared. The performance of load mode and proxy mode also needs to be known in order to choose between them. The experiment is executed on a single node. Figure 4 shows the performance of the three ways of distributing images. The x axis represents the size of the Docker image in the experiment, and the y axis represents the time of distributing this image. In all three ways, the distribution time of an image grows as the image size grows. Native docker pull (named Docker in Figure 4) has the best performance. Load mode has the highest slope. Proxy mode is in the middle.

In BitTorrent, a peer often communicates with trackers and with other peers. During BitTorrent downloading, every peer checks whether each piece it downloaded is correct. Running the BitTorrent protocol thus brings extra overhead, which makes BT-pulling cost more time than docker pull. Proxy mode performs better than load mode because it does not need to wait for all layers to be downloaded and then package them into a tar archive. And in

[Equation (2), partially recovered: T_d = Max(D, D) T_t(A, B) T_d(tar); the subscripts on D and the operators between the terms did not survive extraction.]

[Figure 4: distribution time in seconds (y axis, 0 to 250) against image size (x axis: 50MB, 100MB, 500MB, 1GB, 2GB, 5GB) for Load-mode, Docker, and Proxy-mode.]

Fig. 4. Comparison of the performance of the three ways of distributing Docker images.

[Figure panel (1): distribution time in seconds (y axis, 0 to 600) against cluster size (x axis: 40, 80, 120, 160, 200 nodes) for 1G-docker, 1G-FID, 500M-docker, 500M-FID, 500M-Docket, and 1G-Docket. A second panel with a Time(s) axis follows; its remaining content is not recoverable.]
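The per-piece check that contributes to BitTorrent's overhead can be illustrated with a short sketch. BitTorrent metainfo records a SHA-1 digest for each fixed-size piece, and a downloader re-hashes every piece it receives. The piece size and the function names below are assumptions for illustration; the paper does not state the values FID uses.

```python
import hashlib
from typing import List

PIECE_LENGTH = 256 * 1024  # assumed piece size; the paper does not state one


def piece_hashes(blob: bytes, piece_length: int = PIECE_LENGTH) -> List[bytes]:
    """Split a Blob into fixed-size pieces and return the SHA-1 digest of each."""
    return [
        hashlib.sha1(blob[i:i + piece_length]).digest()
        for i in range(0, len(blob), piece_length)
    ]


def verify_piece(index: int, data: bytes, expected: List[bytes]) -> bool:
    """Check one downloaded piece against the digest recorded at torrent creation."""
    return hashlib.sha1(data).digest() == expected[index]


blob = bytes(range(256)) * 4096  # 1 MiB of stand-in Blob data
hashes = piece_hashes(blob)      # computed once, when the torrent is generated

good_piece = blob[:PIECE_LENGTH]
bad_piece = b"x" * PIECE_LENGTH  # simulates a corrupted download
print(len(hashes), verify_piece(0, good_piece, hashes), verify_piece(0, bad_piece, hashes))
```

Every peer pays this hashing cost for every piece, which native docker pull over HTTP does not incur in the same way; this is consistent with the single-node results in which docker pull is fastest.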
We empirically selected four images for testing: one small image (“docker registry”, 30MB) and three larger images that are often used in large-scale deployments, i.e., “centos” (200MB), “hadoop” (500MB), and “tensorflow” (1GB). In every distribution test, we tested 3 times and calculated the

distribution time did not increase significantly. The lines of FID are almost horizontal. Compared to Docket, FID has better performance. When distributing a 500MB image (hadoop) to 200 nodes, Docker native pulling costs 500.77s and Docket costs 112.45s on average, whereas FID costs only 43.30s on average, and 96.5% of the pulling tasks are completed within 50s (shown in Figure 6). Although VMware DID [8] has not released the source code of its P2P Registry, its experimental data shows that Harbor takes about 180s to distribute a 500MB Docker image to 100 nodes. At the same scale, the longest pulling time in a distribution job with FID is less than 70s. The corresponding table shows the reduction of distribution time using FID at different scales of distribution task. FID reduces distribution time by up to 91.35% compared with Docker native distribution, and it saves 83.50% of distribution time in a 200-node distribution.

Fig. 7. Comparison of the traffic read from P2P Registry and from FID Agent.
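The 91.35% figure can be checked directly from the average times reported above for the 500MB hadoop image on 200 nodes. The sketch below is only that arithmetic; the comparison against Docket is derived from the same reported averages and is not a number stated in the paper.

```python
def reduction(baseline_s: float, improved_s: float) -> float:
    """Percentage of distribution time saved relative to the baseline."""
    return (baseline_s - improved_s) / baseline_s * 100.0


# Average distribution times in seconds, as reported in the text.
docker_s, docket_s, fid_s = 500.77, 112.45, 43.30

print(f"FID vs native Docker pull: {reduction(docker_s, fid_s):.2f}%")
print(f"FID vs Docket: {reduction(docket_s, fid_s):.2f}%")
```

The first line reproduces the reported 91.35% reduction; the second shows that FID's time is also well below Docket's under the same setting.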
200 nodes, and save 93% network traffic of Docker Registry when distributing 200M-size image in all experiments.

Future work is to come up with a P2P traffic scheduling method between different IDCs. In the current architecture, nodes in different IDCs share the same BitTorrent Trackers, and the peer selection algorithm is a random policy. We will study and develop peer selection and traffic scheduling solutions for the P2P algorithm. Moreover, in practice we found that extracting and importing Docker images also influence the image distribution time. We will explore the acceleration of image extracting and importing to further reduce the time of distributing images in large-scale deployment.

ACKNOWLEDGMENT

This work is supported by Tencent and PKU-Tencent Joint Innovative Lab. Tencent Cloud provides the compute nodes for our experiments. PKU Finelab also has made a lot of contributions.

REFERENCES

[1] Merkel D. Docker: lightweight linux containers for consistent development and deployment[J]. Linux Journal, 2014, 2014(239): 2.
[2] Burns B, Grant B, Oppenheimer D, et al. Borg, omega, and kubernetes[J]. Communications of the ACM, 2016, 59(5): 50-57.
[3] “Docker Registry” [Link]
[4] Schollmeier R. A definition of peer-to-peer networking for the classification of peer-to-peer architectures and applications[C]//Peer-to-Peer Computing, 2001. Proceedings. First International Conference on. IEEE, 2001: 101-102.
[5] Liu Y, Guo Y, Liang C. A survey on peer-to-peer video streaming systems[J]. Peer-to-peer Networking and Applications, 2008, 1(1): 18-28.
[6] Cohen B. Incentives build robustness in BitTorrent[C]//Workshop on Economics of Peer-to-Peer systems. 2003, 6: 68-72.
[7] “Docker Distribution” [Link]
[8] “Harbor” [Link]
[9] “quayctl” [Link]
[10] “Spark” [Link]
[11] Weil S A, Brandt S A, Miller E L, et al. Ceph: A scalable, high-performance distributed file system[C]//Proceedings of the 7th symposium on Operating systems design and implementation. USENIX Association, 2006: 307-320.
[12] Maltzahn C, Molina-Estolano E, Khurana A, et al. Ceph as a scalable alternative to the hadoop distributed file system[J]. login: The USENIX Magazine, 2010, 35: 38-49.
[13] Shvachko K, Kuang H, Radia S, et al. The hadoop distributed file system[C]//Mass storage systems and technologies (MSST), 2010 IEEE 26th symposium on. IEEE, 2010: 1-10.
[14] “rkt: A security-minded, standards-based container engine” [Link]
[15] “OpenStack Swift” [Link]
[16] Harter T, Salmon B, et al. Slacker: Fast Distribution with Lazy Docker Containers. 14th USENIX Conference on File and Storage Technologies (FAST’16), 2016.
[17] “Practice of Docker in Tencent Gaia” [Link]
[18] Hausenblas, M.: Docker Registries: The Good, the Bad & the Ugly, [Link]
[19] Day, S.: Docker Registry V2 - A New Model for Image Distribution. In: Docker Con 2015. San Francisco (2015)
[20] “Fast Distribution of Docker Images using P2P method” [Link]
[21] “Quay Documentation” [Link]
[22] “Xiao, D.: Custom docker registry that allows for lightning fast deploys through” [Link]
[23] “Docker load | Docker Documentation” [Link]
[24] Cohen, B.: Incentives build robustness in bittorrent. In: Workshop on Economics of Peer-to-Peer Systems, vol. 6, pp. 68–72 (2003)
[25] “Google Cloud Storage” [Link]
[26] Peng C, Kim M, Zhang Z, et al. VDN: Virtual machine image distribution network for cloud data centers[C]//INFOCOM, 2012 Proceedings IEEE. IEEE, 2012: 181-189.
[27] Reich J, Laadan O, Brosh E, et al. VMTorrent: scalable P2P virtual machine streaming[C]//CoNEXT. 2012, 12: 289-300.
[28] “Device Mapper” [Link]