Conversation

@azr (Contributor) commented May 6, 2024

TLDR: this makes pulls of big images ~2x faster (edit: and a bit more in the latest iteration), and closes #9922.

cc: #8160, #4989


Hello containerd people, I have a draft PR I would like to get your eyes on.

It makes pulls faster while also trying to keep the memory impact low, by fetching consecutive chunks of each layer and immediately pushing them into the pipe (the one that writes to a file and feeds the digest checksum).
With the right settings, I saw pulls get ~2x faster.
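
For readers unfamiliar with that pipe: containerd streams each layer through a checksummer and into a file in a single pass. A minimal sketch of the idea (plain SHA-256 plus a temp file here; containerd's actual content-store writer is more involved):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"strings"
)

func main() {
	src := strings.NewReader("layer bytes...") // stands in for the HTTP body
	dst, err := os.CreateTemp("", "layer-*")
	if err != nil {
		panic(err)
	}
	defer os.Remove(dst.Name())
	defer dst.Close()

	h := sha256.New()
	// Every byte is written to the file and hashed in one pass; the digest
	// can then be compared against the descriptor's expected digest.
	if _, err := io.Copy(io.MultiWriter(dst, h), src); err != nil {
		panic(err)
	}
	fmt.Println("sha256:" + hex.EncodeToString(h.Sum(nil)))
}
```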

The settings have a big impact, so I ran a bunch of perf tests with different values. Here are some results for a ~8GB image on an r6id.4xlarge instance, pulling from S3.
Gains are similar on a ~27GB and a ~100GB image (with a tiny bit of slowdown).
I also tried NVMe and EBS drives; they are of course slower, but the gains are the same.


Metrics on an r6id.4xlarge, timing crictl pull of an 8.6GB image.

The first table, with 13 runs, uses 0 parallelism (the current code).
The rest are runs with different settings:

  • c_para (max number of chunks pulled per layer at once)
  • chunk_size_mb (chunk size, in MB)
  • ctd_max_con (max number of layers pulled at once)
tmpfs tests:
dst    avg_time          count(*)
-----  ----------------  --------
tmpfs  44.0761538461539  13      

dst    c_para  chunk_size_mb  ctd_max_con  avg_time  count(*)
-----  ------  ------------  -----------  --------  --------
tmpfs  110     32            3            22.625    4       
tmpfs  100     32            3            22.64     5       
tmpfs  130     32            2            22.76     1       
tmpfs  120     32            4            22.824    5       
tmpfs  110     32            2            22.85     1       
tmpfs  80      32            4            22.99     1       
tmpfs  110     32            4            23.018    5       
tmpfs  90      64            4            23.09     1       
tmpfs  90      32            3            23.18     1       
tmpfs  110     64            3            23.2125   4       
tmpfs  80      64            3            23.29     1       
tmpfs  90      64            3            23.32     1       
tmpfs  100     32            4            23.352    5       
tmpfs  70      15            4            23.4      1       
tmpfs  100     64            3            23.65     5       
tmpfs  120     15            3            23.68     1       
tmpfs  110     64            2            23.74     1       
tmpfs  100     64            4            23.77     5       
tmpfs  70      32            4            23.81     5       
tmpfs  120     32            3            23.83     5
[...]
nvme (885GB) tests:
dst         avg_time          count(*)
----------  ----------------  --------
added-nvme  47.4008333333333  12      

dst         c_para  chunk_size_mb  ctd_max_con  avg_time  count(*)
----------  ------  ------------  -----------  --------  --------
added-nvme  130     32            3            25.24     1       
added-nvme  70      32            4            26.1      1       
added-nvme  80      32            3            26.31     1       
added-nvme  100     32            3            26.38     1       
added-nvme  120     32            4            26.58     1       
added-nvme  130     32            2            26.71     1       
added-nvme  80      32            4            26.73     1       
added-nvme  120     10            3            26.82     1       
added-nvme  80      64            3            26.93     1       

Observations: I wrote a little Go program to multipart-download big files directly into a file, at different positions, with multiple requests, and that was much faster than piping single-threadedly into a file. containerd pipes through a checksummer and then into a file. I think that this can, under some conditions, create some sort of thrashing, which is why the parameters matter so much here.

That simple Go program had pretty bad perf with one connection, but with multiple connections I was able to saturate the network, with perf better than or on par with aws-crt.

I think that for maximum perf we could try to re-architect things a bit: concurrently write directly into the temp file, then tell the checksummer our progress so it can hash in parallel, and then carry on as usual.
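
To make that concrete, here is a minimal sketch of the "parallel ranged GETs writing at offsets" idea (my own illustration under assumptions, not this PR's code; fetchRanged and its parameters are made up, and digest verification is left out):

```go
package ranged

import (
	"fmt"
	"io"
	"net/http"
	"os"

	"golang.org/x/sync/errgroup"
)

// fetchRanged downloads [0, size) from url with parallel Range requests,
// writing each chunk at its own offset so no reassembly buffer is needed.
func fetchRanged(url string, size, chunkSize int64, parallelism int, dst *os.File) error {
	g := new(errgroup.Group)
	g.SetLimit(parallelism) // at most `parallelism` chunks in flight per layer

	for off := int64(0); off < size; off += chunkSize {
		off := off
		end := min(off+chunkSize, size) - 1
		g.Go(func() error {
			req, err := http.NewRequest(http.MethodGet, url, nil)
			if err != nil {
				return err
			}
			req.Header.Set("Range", fmt.Sprintf("bytes=%d-%d", off, end))
			resp, err := http.DefaultClient.Do(req)
			if err != nil {
				return err
			}
			defer resp.Body.Close()
			if resp.StatusCode != http.StatusPartialContent {
				return fmt.Errorf("unexpected status %s", resp.Status)
			}
			// io.Copy allocates a 32 KiB buffer per call; the OffsetWriter
			// lands the bytes at `off` with no coordination between workers.
			_, err = io.Copy(io.NewOffsetWriter(dst, off), resp.Body)
			return err
		})
	}
	return g.Wait()
}
```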

@k8s-ci-robot commented

Hi @azr. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@azr force-pushed the azr/parallel-layer-fetch branch 4 times, most recently from 2531c18 to 0748b0f on May 7, 2024 09:57
@azr marked this pull request as ready for review May 7, 2024 09:59
@azr changed the title from Parallel layer fetch to Multipart layer fetch May 7, 2024
@swagatbora90 (Contributor) commented May 8, 2024

@azr Thanks for the PR, this looks promising. I wonder if you were able to get any memory usage data from your tests? A previous effort to use the ECR containerd resolver, which has a similar multipart layer download, showed that it can take a disproportionate amount of memory, especially when we increase the number of parallel chunks (without providing a significant latency benefit). The high memory utilization was mainly from the htcat library that the ECR resolver uses to perform parallel ranged GETs. I think we should understand these tradeoffs.

Also can you share some information about your test image? Number of layers? Size of individual layers?

@akhilerm (Member) commented May 9, 2024

/ok-to-test

@azr (Contributor, Author) commented May 14, 2024

Hey @swagatbora90, of course!

My theory is that, worst case, this should use max_concurrent_downloads * (max_parallelism * goroutine footprint) memory, where the goroutine footprint is: the goroutine stack, 32 * 1024 bytes of buffer, and a request clone. io.Copy creates the 32 * 1024-byte buffers here; I have not tried playing with buffer sizes, that could be an option too.
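
As a back-of-envelope example under those assumptions: with max_concurrent_downloads = 2 and parallelism = 110, the copy buffers alone come to 2 × 110 × 32 KiB ≈ 6.9 MiB, plus per-goroutine stacks (a few KiB each) and the cloned requests, which is consistent with the modest live-heap numbers in the GC traces below.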

I think memory usage would be even better if we were to write in parallel directly into a file at different positions, with 'holes', and sort of report our progress to the checksummer with no-op writers that just track where we are, etc. (Downloading actually was much faster this way in a test program I wrote, but it was not doing any unpacking, etc.)
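
A minimal sketch of that "no-op progress writer" idea (my illustration, not code from this PR): a writer that stores nothing and only advances a counter that the checksummer, or a progress UI, could watch.

```go
package progress

import "sync/atomic"

// Counter is an io.Writer that discards its input and only records how
// many bytes have flowed through it.
type Counter struct {
	n atomic.Int64
}

func (c *Counter) Write(b []byte) (int, error) {
	c.n.Add(int64(len(b)))
	return len(b), nil
}

// Offset reports how many bytes have been written so far.
func (c *Counter) Offset() int64 { return c.n.Load() }
```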

I also think it would be nice to have a per-registry parallelism setting, because not all registries are S3-backed, and docker.io seems to throttle things at 60MB/s.


Topology of images:

~8GB image

From crictl images, size is 3.97GB

dive infos:
Screenshot 2024-05-14 at 15 14 03

Total Image size: 8.6 GB
Potential wasted space: 34 MB
Image efficiency score: 99 %
~27GB image

From crictl images, size is 17.7GB

dive infos:
Screenshot 2024-05-14 at 15 44 37

Total Image size: 27 GB
Potential wasted space: 147 MB
Image efficiency score: 99 %

Here are the memory usages, where I periodically record ps -p $pid -o rss= for containerd (running in debug mode under vscode, with gctraces enabled).

~27GB image pull, max_concurrent_downloads: 2, 0 parallelism (before)

memory_usage_17g_pull_before (graph; axis typo: read KB as MB)

~27GB image pull, max_concurrent_downloads: 2, 110 parallelism, 32mb chunks

memory_usage_17g_pull_110p_32mbc (graph; axis typo: read KB as MB)


GC traces:
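
(To read these traces: in each `gc N @T` line, the `A->B->C MB` triple is the heap size at GC start, at GC end, and the live heap afterwards, per the Go runtime's gctrace format; live heap here stays roughly in the 5–25 MB range.)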

8GB image with `GODEBUG=gctrace=1`, parallelism set to 110 and chunk size set to 32
INFO[2024-05-13T14:35:20.998417039Z] PullImage "..." 
gc 6 @7.661s 0%: 0.050+1.4+0.049 ms clock, 0.80+0.24/4.2/9.7+0.79 ms cpu, 7->7->5 MB, 7 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 7 @8.021s 0%: 0.053+1.5+0.053 ms clock, 0.86+0.057/4.5/9.6+0.85 ms cpu, 11->12->6 MB, 12 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 81 @2400.006s 0%: 0.039+0.20+0.002 ms clock, 0.15+0/0.18/0.43+0.010 ms cpu, 0->0->0 MB, 1 MB goal, 0 MB stacks, 0 MB globals, 4 P (forced)
gc 8 @8.246s 0%: 0.14+1.5+0.069 ms clock, 2.3+0.091/4.9/11+1.1 ms cpu, 13->13->9 MB, 14 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 9 @10.168s 0%: 0.70+5.3+0.056 ms clock, 11+12/15/0.12+0.91 ms cpu, 18->20->11 MB, 19 MB goal, 1 MB stacks, 0 MB globals, 16 P
gc 10 @10.181s 0%: 0.32+4.0+0.10 ms clock, 5.2+13/10/0+1.6 ms cpu, 21->22->11 MB, 25 MB goal, 1 MB stacks, 0 MB globals, 16 P
gc 11 @10.868s 0%: 0.16+2.0+0.008 ms clock, 2.5+5.2/6.9/8.7+0.14 ms cpu, 26->26->19 MB, 26 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 12 @11.141s 0%: 0.10+2.3+0.055 ms clock, 1.6+0.23/7.3/14+0.88 ms cpu, 37->37->21 MB, 41 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 13 @11.366s 0%: 0.94+3.0+0.051 ms clock, 15+0.19/8.1/13+0.82 ms cpu, 40->40->22 MB, 43 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 14 @11.940s 0%: 0.81+2.0+0.047 ms clock, 12+0.29/7.1/13+0.76 ms cpu, 41->41->22 MB, 45 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 15 @12.879s 0%: 0.45+2.8+0.084 ms clock, 7.3+0.18/6.8/14+1.3 ms cpu, 43->43->22 MB, 45 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 16 @13.172s 0%: 0.052+2.5+0.089 ms clock, 0.83+0.21/8.1/13+1.4 ms cpu, 45->45->23 MB, 46 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 17 @13.453s 0%: 0.22+3.7+0.069 ms clock, 3.5+0.21/7.6/14+1.1 ms cpu, 46->47->23 MB, 47 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 18 @13.867s 0%: 0.14+2.4+0.080 ms clock, 2.2+0.17/7.2/14+1.2 ms cpu, 47->47->23 MB, 48 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 19 @14.120s 0%: 0.051+3.1+0.047 ms clock, 0.82+0.63/7.6/14+0.75 ms cpu, 48->49->25 MB, 49 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 20 @14.352s 0%: 0.055+2.8+0.007 ms clock, 0.88+0.14/6.4/12+0.11 ms cpu, 50->51->20 MB, 52 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 21 @14.490s 0%: 0.12+2.7+0.052 ms clock, 1.9+0.14/6.7/12+0.84 ms cpu, 41->42->13 MB, 42 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 22 @14.528s 0%: 0.12+1.8+0.051 ms clock, 2.0+0.22/6.5/12+0.81 ms cpu, 27->27->19 MB, 28 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 23 @15.572s 0%: 0.14+2.1+0.078 ms clock, 2.2+0.083/6.6/12+1.2 ms cpu, 39->39->20 MB, 40 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 24 @15.737s 0%: 0.053+1.5+0.076 ms clock, 0.85+0.092/5.3/12+1.2 ms cpu, 40->41->20 MB, 42 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 25 @15.963s 0%: 0.60+2.4+0.082 ms clock, 9.6+0.067/6.4/11+1.3 ms cpu, 39->40->12 MB, 41 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 26 @28.410s 0%: 0.18+1.4+0.004 ms clock, 2.9+0.064/4.8/9.9+0.072 ms cpu, 24->25->13 MB, 26 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 27 @28.530s 0%: 0.054+2.2+0.051 ms clock, 0.86+0/5.8/8.6+0.82 ms cpu, 27->27->13 MB, 28 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 28 @28.638s 0%: 0.044+1.7+0.052 ms clock, 0.71+0.063/5.2/9.0+0.84 ms cpu, 27->27->13 MB, 27 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 29 @28.771s 0%: 0.043+1.7+0.047 ms clock, 0.69+0.087/5.3/10+0.75 ms cpu, 27->27->14 MB, 28 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 30 @29.766s 0%: 0.11+2.1+0.087 ms clock, 1.8+0/6.7/9.7+1.3 ms cpu, 28->28->14 MB, 29 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 31 @34.550s 0%: 0.054+1.9+0.004 ms clock, 0.87+0.062/5.3/9.8+0.072 ms cpu, 28->28->14 MB, 29 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 32 @34.665s 0%: 0.046+1.5+0.051 ms clock, 0.75+0.057/4.9/10+0.83 ms cpu, 29->29->15 MB, 30 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 33 @34.779s 0%: 0.043+1.4+0.008 ms clock, 0.70+0.076/4.7/10+0.13 ms cpu, 30->30->15 MB, 31 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 34 @34.915s 0%: 0.12+1.7+0.010 ms clock, 1.9+0.10/5.2/10+0.16 ms cpu, 30->30->15 MB, 31 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 35 @35.284s 0%: 0.052+1.4+0.005 ms clock, 0.84+0.072/4.7/10+0.081 ms cpu, 31->31->15 MB, 31 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 36 @35.414s 0%: 0.11+1.7+0.047 ms clock, 1.9+0.095/6.0/10+0.75 ms cpu, 31->32->16 MB, 32 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 37 @35.544s 0%: 0.049+2.2+0.055 ms clock, 0.79+0.081/6.7/11+0.89 ms cpu, 32->32->17 MB, 33 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 38 @35.695s 0%: 0.10+2.3+0.004 ms clock, 1.6+0.058/5.6/9.6+0.077 ms cpu, 34->34->17 MB, 35 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 39 @35.876s 0%: 0.14+2.4+0.047 ms clock, 2.2+0.073/6.0/9.6+0.75 ms cpu, 34->34->17 MB, 35 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 40 @35.997s 0%: 0.046+2.2+0.006 ms clock, 0.74+0.064/6.1/10+0.11 ms cpu, 34->35->17 MB, 35 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 41 @36.117s 0%: 0.10+2.3+0.046 ms clock, 1.7+0.058/5.8/10+0.74 ms cpu, 35->35->17 MB, 36 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 42 @36.237s 0%: 0.039+2.4+0.048 ms clock, 0.63+0.069/4.9/11+0.77 ms cpu, 35->35->18 MB, 36 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 43 @36.362s 0%: 0.038+1.9+0.020 ms clock, 0.61+0.077/5.8/10+0.33 ms cpu, 36->36->18 MB, 37 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 44 @36.488s 0%: 0.031+2.5+0.008 ms clock, 0.49+0.079/5.7/9.7+0.13 ms cpu, 36->37->18 MB, 37 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 82 @2430.001s 0%: 0.029+0.17+0.003 ms clock, 0.11+0/0.16/0.41+0.012 ms cpu, 0->0->0 MB, 1 MB goal, 0 MB stacks, 0 MB globals, 4 P (forced)
gc 45 @39.116s 0%: 0.50+2.7+0.089 ms clock, 8.0+0.24/6.5/10+1.4 ms cpu, 37->37->19 MB, 38 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 46 @39.838s 0%: 0.053+2.5+0.051 ms clock, 0.85+0.078/6.8/11+0.81 ms cpu, 38->38->20 MB, 39 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 47 @39.957s 0%: 0.059+1.4+0.055 ms clock, 0.94+0.11/4.5/10+0.88 ms cpu, 40->40->13 MB, 41 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 48 @40.036s 0%: 0.041+1.7+0.056 ms clock, 0.66+0.066/5.5/8.8+0.91 ms cpu, 27->28->7 MB, 28 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 49 @40.061s 0%: 0.030+2.1+0.059 ms clock, 0.49+0.060/6.3/9.7+0.94 ms cpu, 13->14->7 MB, 15 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 50 @40.083s 0%: 0.075+1.7+0.048 ms clock, 1.2+0.17/5.0/8.8+0.78 ms cpu, 13->14->7 MB, 15 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 51 @40.104s 0%: 0.12+1.5+0.081 ms clock, 1.9+0.049/4.9/9.6+1.3 ms cpu, 14->15->9 MB, 16 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 52 @40.115s 0%: 0.18+5.4+0.15 ms clock, 2.9+0.51/11/16+2.4 ms cpu, 17->18->13 MB, 19 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 53 @40.157s 0%: 0.083+1.8+0.092 ms clock, 1.3+0.078/5.2/8.5+1.4 ms cpu, 26->26->13 MB, 28 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 54 @40.847s 0%: 0.052+1.5+0.050 ms clock, 0.83+0.060/4.6/10+0.80 ms cpu, 26->26->13 MB, 28 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 55 @41.094s 0%: 0.14+2.5+0.046 ms clock, 2.2+0.14/6.5/10+0.74 ms cpu, 25->26->13 MB, 27 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 56 @41.191s 0%: 0.80+1.4+0.050 ms clock, 12+0.062/4.3/10+0.80 ms cpu, 26->27->14 MB, 28 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 57 @41.464s 0%: 0.053+1.5+0.004 ms clock, 0.85+0.069/4.7/10+0.078 ms cpu, 27->27->14 MB, 29 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 58 @41.838s 0%: 0.053+1.4+0.005 ms clock, 0.85+0.067/4.8/11+0.084 ms cpu, 28->28->14 MB, 29 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 59 @41.993s 0%: 0.049+2.4+0.065 ms clock, 0.79+0.071/5.2/9.7+1.0 ms cpu, 29->29->15 MB, 30 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 60 @42.457s 0%: 0.052+1.7+0.050 ms clock, 0.83+0.080/5.5/10+0.81 ms cpu, 30->30->15 MB, 31 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 61 @42.654s 0%: 0.053+1.7+0.095 ms clock, 0.85+0.064/4.9/10+1.5 ms cpu, 30->30->15 MB, 31 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 62 @42.936s 0%: 0.051+1.7+0.049 ms clock, 0.83+0.079/5.5/10+0.79 ms cpu, 30->30->15 MB, 32 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 63 @43.177s 0%: 0.050+1.6+0.057 ms clock, 0.81+0.068/5.5/11+0.92 ms cpu, 31->31->16 MB, 32 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 64 @43.333s 0%: 0.12+2.3+0.005 ms clock, 2.0+0.061/5.8/9.8+0.094 ms cpu, 32->32->17 MB, 33 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 65 @43.477s 0%: 0.051+2.0+0.004 ms clock, 0.82+0.067/6.6/11+0.076 ms cpu, 34->34->17 MB, 35 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 66 @43.619s 0%: 0.14+2.0+0.10 ms clock, 2.3+0.058/5.3/10+1.7 ms cpu, 34->34->17 MB, 35 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 67 @44.696s 0%: 0.053+1.4+0.006 ms clock, 0.85+0.073/4.6/10+0.099 ms cpu, 34->35->14 MB, 35 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 68 @44.768s 0%: 0.034+1.4+0.004 ms clock, 0.55+0.051/4.6/10+0.075 ms cpu, 28->28->6 MB, 29 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 69 @44.814s 0%: 0.034+1.7+0.048 ms clock, 0.55+0.071/4.9/10+0.77 ms cpu, 13->13->6 MB, 13 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 70 @44.845s 0%: 0.12+3.1+0.12 ms clock, 1.9+0/5.4/11+2.0 ms cpu, 17->17->12 MB, 17 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 71 @44.966s 0%: 0.086+1.5+0.005 ms clock, 1.3+0.10/4.7/10+0.089 ms cpu, 24->24->13 MB, 25 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 72 @45.108s 0%: 0.16+2.1+0.082 ms clock, 2.5+0.16/5.8/9.7+1.3 ms cpu, 26->26->13 MB, 27 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 73 @45.260s 0%: 0.10+1.3+0.005 ms clock, 1.6+0.058/4.3/10+0.094 ms cpu, 27->27->13 MB, 28 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 74 @45.463s 0%: 0.045+1.5+0.004 ms clock, 0.73+0.11/4.9/9.4+0.074 ms cpu, 27->27->14 MB, 28 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 75 @45.641s 0%: 0.14+1.7+0.005 ms clock, 2.2+0.063/5.0/10+0.088 ms cpu, 28->28->14 MB, 29 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 76 @45.797s 0%: 0.039+1.3+0.006 ms clock, 0.63+0.067/4.6/10+0.097 ms cpu, 29->29->13 MB, 30 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 77 @45.860s 0%: 0.030+1.1+0.004 ms clock, 0.48+0.85/3.8/9.2+0.075 ms cpu, 29->30->11 MB, 29 MB goal, 0 MB stacks, 0 MB globals, 16 P
8GB image with `GODEBUG=gctrace=1`, parallelism set to 0 (existing code)
gc 6 @6.173s 0%: 0.044+1.4+0.051 ms clock, 0.70+0.15/4.5/9.4+0.82 ms cpu, 8->8->5 MB, 9 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 7 @6.557s 0%: 0.12+1.6+0.004 ms clock, 1.9+0.42/5.7/7.0+0.070 ms cpu, 11->13->8 MB, 12 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 8 @12.802s 0%: 0.16+1.8+0.096 ms clock, 2.5+0.90/5.1/7.6+1.5 ms cpu, 18->19->15 MB, 18 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 9 @13.092s 0%: 0.11+1.2+0.041 ms clock, 1.8+0.095/4.3/9.7+0.67 ms cpu, 29->29->15 MB, 32 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 10 @13.231s 0%: 0.047+1.2+0.054 ms clock, 0.76+0.11/4.0/9.6+0.87 ms cpu, 27->27->15 MB, 30 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 11 @13.707s 0%: 0.047+1.3+0.056 ms clock, 0.76+0.21/4.3/9.9+0.90 ms cpu, 28->28->15 MB, 31 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 111 @3300.010s 0%: 0.051+0.19+0.002 ms clock, 0.20+0/0.17/0.44+0.011 ms cpu, 0->0->0 MB, 1 MB goal, 0 MB stacks, 0 MB globals, 4 P (forced)
gc 12 @13.868s 0%: 0.15+1.9+0.004 ms clock, 2.5+0.088/5.2/9.2+0.077 ms cpu, 28->28->16 MB, 32 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 13 @14.677s 0%: 0.22+1.8+0.095 ms clock, 3.6+0.13/5.4/10+1.5 ms cpu, 31->31->16 MB, 32 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 14 @14.908s 0%: 0.21+1.9+1.0 ms clock, 3.4+0.14/6.6/10+16 ms cpu, 32->33->16 MB, 33 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 15 @15.091s 0%: 0.18+2.5+0.058 ms clock, 2.9+0.085/6.7/11+0.94 ms cpu, 33->34->17 MB, 34 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 16 @15.312s 0%: 0.053+2.4+0.050 ms clock, 0.86+0.080/6.0/9.8+0.80 ms cpu, 34->35->18 MB, 35 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 17 @15.650s 0%: 0.049+2.1+0.005 ms clock, 0.79+0.063/5.7/10+0.088 ms cpu, 36->36->18 MB, 37 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 18 @15.829s 0%: 0.11+3.2+0.058 ms clock, 1.9+0.084/6.7/9.4+0.93 ms cpu, 36->37->18 MB, 37 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 19 @16.008s 0%: 0.050+2.9+0.005 ms clock, 0.80+0.070/7.4/10+0.080 ms cpu, 37->38->20 MB, 38 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 20 @16.184s 0%: 0.049+1.8+0.047 ms clock, 0.79+0.11/4.9/9.7+0.76 ms cpu, 40->41->15 MB, 42 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 21 @16.289s 0%: 0.052+2.5+0.005 ms clock, 0.84+0.087/5.3/9.6+0.095 ms cpu, 31->31->8 MB, 32 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 22 @16.338s 0%: 0.28+1.3+0.11 ms clock, 4.5+2.0/4.6/8.1+1.8 ms cpu, 16->22->13 MB, 21 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 23 @17.335s 0%: 0.073+1.2+0.026 ms clock, 1.1+0.20/4.2/9.2+0.42 ms cpu, 27->27->14 MB, 28 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 24 @17.461s 0%: 0.051+1.5+0.049 ms clock, 0.82+0.094/4.6/5.8+0.78 ms cpu, 29->29->15 MB, 30 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 25 @17.592s 0%: 0.11+1.5+0.004 ms clock, 1.8+0.047/4.6/9.6+0.069 ms cpu, 30->30->15 MB, 31 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 26 @17.779s 0%: 0.048+1.3+0.047 ms clock, 0.77+0.019/4.2/9.2+0.75 ms cpu, 31->31->8 MB, 32 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 112 @3330.001s 0%: 0.033+0.17+0.003 ms clock, 0.13+0/0.16/0.41+0.015 ms cpu, 0->0->0 MB, 1 MB goal, 0 MB stacks, 0 MB globals, 4 P (forced)
gc 27 @58.347s 0%: 0.14+1.9+0.056 ms clock, 2.2+0/5.3/8.8+0.90 ms cpu, 17->17->14 MB, 17 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 28 @58.451s 0%: 0.13+1.5+0.049 ms clock, 2.0+0.079/4.9/9.6+0.78 ms cpu, 29->30->12 MB, 30 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 29 @58.573s 0%: 0.073+1.4+0.046 ms clock, 1.1+0.11/4.5/9.1+0.74 ms cpu, 25->25->13 MB, 25 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 30 @58.673s 0%: 0.050+1.4+0.046 ms clock, 0.80+0.10/4.5/9.1+0.74 ms cpu, 26->26->13 MB, 26 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 31 @58.818s 0%: 0.050+1.2+0.048 ms clock, 0.80+0.071/4.1/9.3+0.77 ms cpu, 26->26->13 MB, 27 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 32 @62.929s 0%: 0.13+1.7+0.071 ms clock, 2.1+0.083/5.1/9.1+1.1 ms cpu, 27->27->13 MB, 28 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 33 @64.699s 0%: 0.11+1.8+0.053 ms clock, 1.8+0.086/4.7/9.0+0.85 ms cpu, 27->27->14 MB, 28 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 34 @64.801s 0%: 0.082+2.2+0.099 ms clock, 1.3+0.048/5.3/9.0+1.5 ms cpu, 27->28->14 MB, 28 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 35 @64.910s 0%: 0.050+1.8+0.051 ms clock, 0.81+0.11/5.2/9.2+0.83 ms cpu, 28->28->14 MB, 29 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 36 @65.038s 0%: 0.079+1.7+0.050 ms clock, 1.2+0.056/5.2/9.2+0.81 ms cpu, 29->29->14 MB, 30 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 37 @65.383s 0%: 0.14+1.9+0.004 ms clock, 2.3+0.069/6.1/9.2+0.076 ms cpu, 29->30->15 MB, 30 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 38 @65.507s 0%: 0.050+1.7+0.060 ms clock, 0.81+0.15/5.2/9.4+0.96 ms cpu, 30->30->15 MB, 31 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 39 @65.636s 0%: 0.050+2.4+0.054 ms clock, 0.81+0.15/6.3/9.4+0.86 ms cpu, 30->31->16 MB, 31 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 40 @65.770s 0%: 0.050+2.1+0.052 ms clock, 0.80+0/5.3/10+0.83 ms cpu, 32->32->16 MB, 33 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 41 @65.942s 0%: 0.052+2.2+0.054 ms clock, 0.83+0.083/5.8/9.5+0.87 ms cpu, 32->32->16 MB, 33 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 42 @66.057s 0%: 0.047+1.9+0.052 ms clock, 0.75+0.054/5.3/9.5+0.84 ms cpu, 32->33->16 MB, 33 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 43 @66.171s 0%: 0.040+2.0+0.004 ms clock, 0.65+0.068/5.6/10+0.077 ms cpu, 33->34->17 MB, 34 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 44 @66.290s 0%: 0.037+1.8+0.050 ms clock, 0.60+0.079/4.9/9.2+0.80 ms cpu, 34->34->17 MB, 35 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 45 @66.407s 0%: 0.063+2.2+0.046 ms clock, 1.0+0.20/5.5/9.6+0.74 ms cpu, 34->34->17 MB, 35 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 46 @66.527s 0%: 0.048+2.4+0.046 ms clock, 0.78+0.078/6.3/8.2+0.73 ms cpu, 35->35->17 MB, 36 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 47 @69.148s 0%: 0.058+2.5+0.075 ms clock, 0.93+0.17/5.9/10+1.2 ms cpu, 35->35->18 MB, 36 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 48 @69.296s 0%: 0.31+3.0+0.055 ms clock, 5.0+0.057/7.4/8.7+0.88 ms cpu, 36->36->19 MB, 37 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 49 @70.016s 0%: 0.052+1.4+0.050 ms clock, 0.84+0.16/4.4/9.9+0.81 ms cpu, 39->39->13 MB, 40 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 50 @70.060s 0%: 0.092+1.5+0.054 ms clock, 1.4+0.15/4.7/8.2+0.87 ms cpu, 26->27->13 MB, 27 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 51 @70.111s 0%: 0.099+1.5+0.054 ms clock, 1.5+0.15/4.8/8.5+0.87 ms cpu, 26->27->6 MB, 27 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 52 @70.134s 0%: 0.032+1.5+0.008 ms clock, 0.51+0.17/4.2/9.2+0.13 ms cpu, 13->13->7 MB, 14 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 53 @70.159s 0%: 0.083+1.4+0.084 ms clock, 1.3+0.12/4.2/8.5+1.3 ms cpu, 13->14->7 MB, 14 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 54 @70.179s 0%: 0.029+1.4+0.053 ms clock, 0.46+0.053/4.5/8.7+0.85 ms cpu, 13->13->6 MB, 14 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 55 @70.211s 0%: 0.091+1.2+0.004 ms clock, 1.4+0.89/4.4/8.6+0.068 ms cpu, 12->13->6 MB, 14 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 56 @70.217s 0%: 0.070+1.2+0.080 ms clock, 1.1+0.25/4.0/8.7+1.2 ms cpu, 12->13->11 MB, 14 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 57 @70.296s 0%: 0.14+2.2+0.10 ms clock, 2.3+0/5.2/9.3+1.7 ms cpu, 23->23->12 MB, 24 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 58 @71.177s 0%: 0.051+1.3+0.007 ms clock, 0.82+0.049/4.1/9.3+0.11 ms cpu, 24->24->12 MB, 26 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 59 @71.259s 0%: 0.048+1.5+0.048 ms clock, 0.77+0.043/4.5/8.9+0.77 ms cpu, 24->25->13 MB, 26 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 60 @71.383s 0%: 0.10+2.2+0.11 ms clock, 1.7+0.048/5.6/9.5+1.8 ms cpu, 25->26->13 MB, 27 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 61 @71.904s 0%: 0.052+2.0+0.005 ms clock, 0.83+0.11/5.4/9.4+0.081 ms cpu, 27->27->13 MB, 28 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 62 @72.029s 0%: 0.048+1.5+0.046 ms clock, 0.78+0.18/4.2/9.3+0.74 ms cpu, 27->27->14 MB, 28 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 63 @72.482s 0%: 0.15+1.8+0.008 ms clock, 2.4+0.060/4.9/9.5+0.14 ms cpu, 27->28->14 MB, 29 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 64 @72.640s 0%: 0.052+1.4+0.057 ms clock, 0.83+0.065/4.4/10+0.91 ms cpu, 28->28->14 MB, 30 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 65 @72.849s 0%: 0.051+1.5+0.049 ms clock, 0.82+0.055/4.8/9.8+0.78 ms cpu, 29->29->14 MB, 30 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 66 @73.122s 0%: 0.050+2.2+0.085 ms clock, 0.81+0.11/5.1/9.3+1.3 ms cpu, 29->30->15 MB, 30 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 67 @73.342s 0%: 0.14+1.6+0.055 ms clock, 2.3+0.087/4.8/10+0.88 ms cpu, 30->30->15 MB, 31 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 68 @73.502s 0%: 0.14+1.8+0.004 ms clock, 2.2+0.16/6.0/9.7+0.074 ms cpu, 31->31->16 MB, 32 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 69 @73.641s 0%: 0.050+2.3+0.005 ms clock, 0.81+0.053/5.9/9.2+0.081 ms cpu, 32->33->16 MB, 33 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 113 @3360.001s 0%: 0.048+0.46+0.002 ms clock, 0.19+0/0.43/0.17+0.011 ms cpu, 0->0->0 MB, 1 MB goal, 0 MB stacks, 0 MB globals, 4 P (forced)
gc 70 @73.786s 0%: 0.052+3.2+0.058 ms clock, 0.83+0.063/7.2/10+0.93 ms cpu, 32->33->16 MB, 33 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 71 @74.857s 0%: 0.055+1.4+0.048 ms clock, 0.88+0.091/4.3/9.3+0.78 ms cpu, 33->33->13 MB, 34 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 72 @74.925s 0%: 0.042+1.3+0.005 ms clock, 0.67+0.071/4.1/9.2+0.080 ms cpu, 26->26->5 MB, 27 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 73 @74.966s 0%: 0.036+1.2+0.049 ms clock, 0.59+0.067/3.8/9.1+0.78 ms cpu, 11->11->5 MB, 12 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 74 @75.002s 0%: 0.10+1.3+0.005 ms clock, 1.6+1.6/4.6/7.5+0.081 ms cpu, 11->12->6 MB, 12 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 75 @75.006s 0%: 0.020+1.3+0.10 ms clock, 0.32+0.053/3.8/9.0+1.6 ms cpu, 11->12->11 MB, 13 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 76 @75.119s 0%: 0.049+1.4+0.067 ms clock, 0.78+0.10/4.2/9.3+1.0 ms cpu, 22->22->12 MB, 24 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 77 @75.222s 0%: 0.11+1.7+0.077 ms clock, 1.9+0.086/5.0/10+1.2 ms cpu, 24->24->12 MB, 25 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 78 @75.342s 0%: 0.049+1.6+0.046 ms clock, 0.78+0.19/5.4/10+0.73 ms cpu, 24->25->13 MB, 26 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 79 @75.552s 0%: 0.12+1.3+0.056 ms clock, 2.0+0.052/4.2/9.6+0.90 ms cpu, 25->25->13 MB, 27 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 80 @75.734s 0%: 0.048+1.5+0.047 ms clock, 0.77+0.052/4.5/9.0+0.76 ms cpu, 27->27->13 MB, 27 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 81 @75.860s 0%: 0.31+1.4+0.093 ms clock, 5.0+0.055/4.5/9.9+1.4 ms cpu, 27->27->14 MB, 28 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 82 @75.985s 0%: 0.044+1.3+0.051 ms clock, 0.70+0.045/3.9/8.8+0.82 ms cpu, 28->28->5 MB, 28 MB goal, 0 MB stacks, 0 MB globals, 16 P
gc 83 @76.008s 0%: 0.031+1.3+0.046 ms clock, 0.50+0.70/4.5/9.9+0.75 ms cpu, 15->15->11 MB, 15 MB goal, 0 MB stacks, 0 MB globals, 16 P

gRPC tracing screenshots from the same run (8GB image with `GODEBUG=gctrace=1`, parallelism set to 110 and chunk size set to 32): Screenshot 2024-05-13 at 16 39 45

Screenshot from another run, for a ~27GB image: after a while, all chunks seem to take the same amount of time, ~22s; we've probably exhausted the disk's write-speed burst limit and are slowly taking more time to do things:

Screenshot 2024-05-13 at 16 00 21

(A comment by @azr was marked as outdated.)

@azr force-pushed the azr/parallel-layer-fetch branch 10 times, most recently from c13969f to 8fc47db on May 21, 2024 15:07
@swagatbora90 (Contributor) commented

@azr Thanks for adding the performance numbers. I ran some tests as well using your patch, and the memory usage looks better than what I saw in the htcat implementation, especially with a high parallelism count.

However, I do observe that increasing parallelism does not yield better latency and may lead to higher memory usage (there are a number of other factors to consider here, mainly the type of instance used for testing and network bandwidth). I limited the test to a single image with a single layer and fixed the chunk size at 20 MB. A lower parallelism count (3 or 4) may be preferable to setting parallelism upwards of 10.

Using a c7.12xlarge instance to pull a 3GB single-layer image from a private ECR repo.

| Parallelism Count | Chunk Size (MB) | Total Download Time (s) | Network Pull Time (s) | Download Speed (MB/s) | Max Memory (MB, from cgroups memory.peak) |
| --- | --- | --- | --- | --- | --- |
| 1 | 20 | 65.9 | 51.78 | 53 | 15.9 |
| 2 | 20 | 39.3 | 32.08 | 88.8 | 17.5 |
| 3 | 20 | 36.6 | 22.57 | 95.4 | 18 |
| 4 | 20 | 36.8 | 16.82 | 94.8 | 17 |
| 5 | 20 | 36.8 | 14.78 | 94.8 | 17 |
| 10 | 20 | 36.9 | 13.92 | 94.6 | 20 |
| 20 | 20 | 36.9 | 14.31 | 94.6 | 22 |
| 30 | 20 | 36.7 | 14.97 | 95.1 | 26 |
| 40 | 20 | 36.8 | 14.21 | 94.8 | 31 |
| 50 | 20 | 36.7 | 14.3 | 95.1 | 36 |
| 100 | 20 | 36.8 | 14.91 | 94.8 | 52 |

multipart1

Also, the network download itself was much faster (see Network Pull Time, ~15s), while containerd took an additional ~20s to complete the pull (before it started unpacking). I calculated the network download time by periodically calling /containerd.services.content.v1.Content/ListStatuses, filtering on the layer digest and checking when the status Offset == Total. I am still not sure why containerd takes so much time after it has already committed to the content store; pprof does not show any significant CPU usage by containerd during this time either. Are we blocked on GC or on some underlying syscall (fp.Sync) completing?
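
A sketch of that measurement technique via the Go client (my reconstruction under assumptions, not the actual script used; the socket path and namespace are guesses):

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	containerd "github.com/containerd/containerd"
	"github.com/containerd/containerd/namespaces"
)

func main() {
	client, err := containerd.New("/run/containerd/containerd.sock")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	ctx := namespaces.WithNamespace(context.Background(), "k8s.io")
	cs := client.ContentStore()

	for range time.Tick(500 * time.Millisecond) {
		statuses, err := cs.ListStatuses(ctx) // active (in-flight) ingests
		if err != nil {
			log.Fatal(err)
		}
		for _, st := range statuses {
			// The network download of a layer is done once Offset reaches
			// Total; commit and unpack happen after that point.
			fmt.Printf("%s %d/%d done=%v\n",
				st.Ref, st.Offset, st.Total, st.Total > 0 && st.Offset == st.Total)
		}
	}
}
```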

@swagatbora90 (Contributor) commented

@dmcgowan @kzys

@dmcgowan added this to the 2.1 milestone May 22, 2024
@azr force-pushed the azr/parallel-layer-fetch branch 3 times, most recently from 4e43047 to 4bcf933 on April 24, 2025 11:42
Signed-off-by: Adrien Delorme <[email protected]>
@azr force-pushed the azr/parallel-layer-fetch branch from 4bcf933 to a196ee6 on April 24, 2025 12:29
@estesp (Member) commented Apr 24, 2025

/test pull-containerd-k8s-e2e-ec2

@dmcgowan (Member) left a comment

We should just get this in; there are still maybe a few interface tweaks we can make before the final release that won't affect the functionality.

```go
r.Release(1)
for range parallelism {
	go func() {
		for i := range queue { // first in first out
```

A member commented on this snippet:

Not sure if there is a race condition in the http2 stdlib on window updating.
Will check this part later; it's not a blocker.
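
For context, the shape under review is a bounded worker pool draining a FIFO channel of chunk jobs; a self-contained sketch of the pattern (names such as chunk and fetch are hypothetical, not this PR's identifiers):

```go
package pool

import "sync"

type chunk struct{ offset, length int64 }

// runWorkers starts `parallelism` goroutines that drain `queue` until it
// is closed; chunks are consumed first in, first out.
func runWorkers(parallelism int, queue <-chan chunk, fetch func(chunk) error) {
	var wg sync.WaitGroup
	for range parallelism { // Go 1.22+ integer range loop
		wg.Add(1)
		go func() {
			defer wg.Done()
			for c := range queue {
				if err := fetch(c); err != nil {
					return // the real code propagates errors; simplified here
				}
			}
		}()
	}
	wg.Wait()
}
```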

@github-project-automation bot moved this from Needs Update to Review In Progress in Pull Request Review Apr 24, 2025
Signed-off-by: Adrien Delorme <[email protected]>
@fuweid added this pull request to the merge queue Apr 24, 2025
@dmcgowan changed the title from perf(pull): Multipart layer fetch to Multipart layer fetch Apr 24, 2025
Merged via the queue into containerd:main with commit ef7bdf1 Apr 24, 2025
102 of 103 checks passed
@github-project-automation bot moved this from Review In Progress to Done in Pull Request Review Apr 24, 2025
@dmcgowan added the area/distribution (Image Distribution) label Apr 25, 2025
@cartermckinnon commented

@azr thank you for all the hard work on this, it's an awesome improvement! 🙏

@azr (Contributor, Author) commented May 14, 2025

Aw thanks ! ❤️ You're welcome !! 🚀

@azr deleted the azr/parallel-layer-fetch branch May 14, 2025 06:47
mansikulkarni96 added a commit to mansikulkarni96/containerd that referenced this pull request Dec 4, 2025
containerd 2.1.0

Welcome to the v2.1.0 release of containerd!

The first minor release of containerd 2.x focuses on continued stability alongside
new features and improvements. This is the first time-based release for containerd.
Most of the feature set and core functionality has long been stable and hardened in production
environments, so we now transition to a balance of timely delivery of new functionality
with the same high confidence in stability and performance.

* Add no_sync option to boost boltDB performance on ephemeral environments ([containerd#10745](containerd#10745))
* Add content create event ([containerd#11006](containerd#11006))
* Erofs snapshotter and differ ([containerd#10705](containerd#10705))

* Update CRI to use transfer service for image pull by default ([containerd#8515](containerd#8515))
* Support multiple cni plugin bin dirs ([containerd#11311](containerd#11311))
* Support container restore through CRI/Kubernetes ([containerd#10365](containerd#10365))
* Add OCI/Image Volume Source support ([containerd#10579](containerd#10579))
* Enable Writable cgroups for unprivileged containers ([containerd#11131](containerd#11131))
* Fix recursive RLock() mutex acquisition ([containerd/go-cni#126](containerd/go-cni#126))
* Support CNI STATUS Verb ([containerd/go-cni#123](containerd/go-cni#123))

* Retry last registry host on 50x responses ([containerd#11484](containerd#11484))
* Multipart layer fetch ([containerd#10177](containerd#10177))
* Enable HTTP debug and trace for transfer based puller ([containerd#10762](containerd#10762))
* Add support for unpacking custom media types  ([containerd#11744](containerd#11744))
* Add dial timeout field to hosts toml configuration ([containerd#11106](containerd#11106))

* Expose Pod assigned IPs to NRI plugins ([containerd#10921](containerd#10921))

* Support multiple uid/gid mappings ([containerd#10722](containerd#10722))
* Fix race between serve and immediate shutdown on the server ([containerd/ttrpc#175](containerd/ttrpc#175))

* Update FreeBSD defaults and re-organize platform defaults ([containerd#11017](containerd#11017))

* Postpone cri config deprecations to v2.2 ([containerd#11684](containerd#11684))
* Remove deprecated dynamic library plugins ([containerd#11683](containerd#11683))
* Remove the support for Schema 1 images ([containerd#11681](containerd#11681))

Please try out the release binaries and report any issues at
https://github.com/containerd/containerd/issues.

* Derek McGowan
* Phil Estes
* Akihiro Suda
* Maksym Pavlenko
* Jin Dong
* Wei Fu
* Sebastiaan van Stijn
* Samuel Karp
* Mike Brown
* Adrien Delorme
* Austin Vazquez
* Akhil Mohan
* Kazuyoshi Kato
* Henry Wang
* Gao Xiang
* ningmingxiao
* Krisztian Litkey
* Yang Yang
* Archit Kulkarni
* Chris Henzie
* Iceber Gu
* Alexey Lunev
* Antonio Ojea
* Davanum Srinivas
* Marat Radchenko
* Michael Zappa
* Paweł Gronowski
* Rodrigo Campos
* Alberto Garcia Hierro
* Amit Barve
* Andrey Smirnov
* Divya
* Etienne Champetier
* Kirtana Ashok
* Philip Laine
* QiPing Wan
* fengwei0328
* zounengren
* Adrian Reber
* Alfred Wingate
* Amal Thundiyil
* Athos Ribeiro
* Brian Goff
* Cesar Talledo
* ChengyuZhu6
* Chongyi Zheng
* Craig Ingram
* Danny Canter
* David Son
* Fupan Li
* HirazawaUi
* Jing Xu
* Jonathan A. Sternberg
* Jose Fernandez
* Kaita Nakamura
* Kohei Tokunaga
* Lei Liu
* Marco Visin
* Mike Baynton
* Qiyuan Liang
* Sameer
* Shiming Zhang
* Swagat Bora
* Teresaliu
* Tony Fang
* Tõnis Tiigi
* Vered Rosen
* Vinayak Goyal
* bo.jiang
* chriskery
* luchenhan
* mahmut
* zhaixiaojuan

* **github.com/Microsoft/hcsshim**                                                 v0.12.9 -> v0.13.0-rc.3
* **github.com/cilium/ebpf**                                                       v0.11.0 -> v0.16.0
* **github.com/containerd/cgroups/v3**                                             v3.0.3 -> v3.0.5
* **github.com/containerd/containerd/api**                                         v1.8.0 -> v1.9.0
* **github.com/containerd/continuity**                                             v0.4.4 -> v0.4.5
* **github.com/containerd/go-cni**                                                 v1.1.10 -> v1.1.12
* **github.com/containerd/imgcrypt/v2**                                            v2.0.0-rc.1 -> v2.0.1
* **github.com/containerd/otelttrpc**                                              ea5083fda723 -> v0.1.0
* **github.com/containerd/platforms**                                              v1.0.0-rc.0 -> v1.0.0-rc.1
* **github.com/containerd/ttrpc**                                                  v1.2.6 -> v1.2.7
* **github.com/containerd/typeurl/v2**                                             v2.2.2 -> v2.2.3
* **github.com/containernetworking/cni**                                           v1.2.3 -> v1.3.0
* **github.com/containernetworking/plugins**                                       v1.5.1 -> v1.7.1
* **github.com/containers/ocicrypt**                                               v1.2.0 -> v1.2.1
* **github.com/davecgh/go-spew**                                                   d8f796af33cc -> v1.1.1
* **github.com/fsnotify/fsnotify**                                                 v1.7.0 -> v1.9.0
* **github.com/go-jose/go-jose/v4**                                                v4.0.4 -> v4.0.5
* **github.com/google/go-cmp**                                                     v0.6.0 -> v0.7.0
* **github.com/grpc-ecosystem/grpc-gateway/v2**                                    v2.22.0 -> v2.26.1
* **github.com/klauspost/compress**                                                v1.17.11 -> v1.18.0
* **github.com/mdlayher/socket**                                                   v0.4.1 -> v0.5.1
* **github.com/moby/spdystream**                                                   v0.4.0 -> v0.5.0
* **github.com/moby/sys/user**                                                     v0.3.0 -> v0.4.0
* **github.com/opencontainers/image-spec**                                         v1.1.0 -> v1.1.1
* **github.com/opencontainers/runtime-spec**                                       v1.2.0 -> v1.2.1
* **github.com/opencontainers/selinux**                                            v1.11.1 -> v1.12.0
* **github.com/pelletier/go-toml/v2**                                              v2.2.3 -> v2.2.4
* **github.com/petermattis/goid**                                                  4fcff4a6cae7 **_new_**
* **github.com/pmezard/go-difflib**                                                5d4384ee4fb2 -> v1.0.0
* **github.com/prometheus/client_golang**                                          v1.20.5 -> v1.22.0
* **github.com/prometheus/common**                                                 v0.55.0 -> v0.62.0
* **github.com/sasha-s/go-deadlock**                                               v0.3.5 **_new_**
* **github.com/smallstep/pkcs7**                                                   v0.1.1 **_new_**
* **github.com/stretchr/testify**                                                  v1.9.0 -> v1.10.0
* **github.com/tchap/go-patricia/v2**                                              v2.3.1 -> v2.3.2
* **github.com/urfave/cli/v2**                                                     v2.27.5 -> v2.27.6
* **github.com/vishvananda/netlink**                                               v1.3.0 -> 0e7078ed04c8
* **github.com/vishvananda/netns**                                                 v0.0.4 -> v0.0.5
* **go.etcd.io/bbolt**                                                             v1.3.11 -> v1.4.0
* **go.opentelemetry.io/auto/sdk**                                                 v1.1.0 **_new_**
* **go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc**  v0.56.0 -> v0.60.0
* **go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp**                v0.56.0 -> v0.60.0
* **go.opentelemetry.io/otel**                                                     v1.31.0 -> v1.35.0
* **go.opentelemetry.io/otel/exporters/otlp/otlptrace**                            v1.31.0 -> v1.35.0
* **go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc**              v1.31.0 -> v1.35.0
* **go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp**              v1.31.0 -> v1.35.0
* **go.opentelemetry.io/otel/metric**                                              v1.31.0 -> v1.35.0
* **go.opentelemetry.io/otel/sdk**                                                 v1.31.0 -> v1.35.0
* **go.opentelemetry.io/otel/trace**                                               v1.31.0 -> v1.35.0
* **go.opentelemetry.io/proto/otlp**                                               v1.3.1 -> v1.5.0
* **golang.org/x/crypto**                                                          v0.28.0 -> v0.36.0
* **golang.org/x/exp**                                                             aacd6d4b4611 -> 2d47ceb2692f
* **golang.org/x/mod**                                                             v0.21.0 -> v0.24.0
* **golang.org/x/net**                                                             v0.30.0 -> v0.38.0
* **golang.org/x/oauth2**                                                          v0.22.0 -> v0.27.0
* **golang.org/x/sync**                                                            v0.8.0 -> v0.14.0
* **golang.org/x/sys**                                                             v0.26.0 -> v0.33.0
* **golang.org/x/term**                                                            v0.25.0 -> v0.30.0
* **golang.org/x/text**                                                            v0.19.0 -> v0.23.0
* **golang.org/x/time**                                                            v0.3.0 -> v0.7.0
* **google.golang.org/genproto/googleapis/api**                                    5fefd90f89a9 -> 56aae31c358a
* **google.golang.org/genproto/googleapis/rpc**                                    324edc3d5d38 -> 56aae31c358a
* **google.golang.org/grpc**                                                       v1.67.1 -> v1.72.0
* **google.golang.org/protobuf**                                                   v1.35.1 -> v1.36.6
* **k8s.io/api**                                                                   v0.31.2 -> v0.32.3
* **k8s.io/apimachinery**                                                          v0.31.2 -> v0.32.3
* **k8s.io/apiserver**                                                             v0.31.2 -> v0.32.3
* **k8s.io/client-go**                                                             v0.31.2 -> v0.32.3
* **k8s.io/cri-api**                                                               v0.31.2 -> v0.32.3
* **k8s.io/kubelet**                                                               v0.31.2 -> v0.32.3
* **k8s.io/utils**                                                                 18e509b52bc8 -> 3ea5e8cea738
* **sigs.k8s.io/json**                                                             bc3834ca7abd -> 9aa6b5e7a4b3
* **sigs.k8s.io/structured-merge-diff/v4**                                         v4.4.1 -> v4.4.2
* **tags.cncf.io/container-device-interface**                                      v0.8.0 -> v1.0.1
* **tags.cncf.io/container-device-interface/specs-go**                             v0.8.0 -> v1.0.0

Previous release can be found at [v2.0.0](https://github.com/containerd/containerd/releases/tag/v2.0.0)
* `containerd-<VERSION>-<OS>-<ARCH>.tar.gz`:         ✅Recommended. Dynamically linked with glibc 2.35 (Ubuntu 22.04).
* `containerd-static-<VERSION>-<OS>-<ARCH>.tar.gz`:  Statically linked. Expected to be used on Linux distributions that do not use glibc >= 2.35. Not position-independent.

In addition to containerd, typically you will have to install [runc](https://github.com/opencontainers/runc/releases)
and [CNI plugins](https://github.com/containernetworking/plugins/releases) from their official sites too.

See also the [Getting Started](https://github.com/containerd/containerd/blob/main/docs/getting-started.md) documentation.

Development

Successfully merging this pull request may close these issues:

Parallelise layer downloads