compose app images dumping and preloading#71
Conversation
doanac
left a comment
There was a problem hiding this comment.
pretty clever way of doing this. nothing major i see.
| archs.append(platform.split('/')[1]) | ||
|
|
||
| with tempfile.TemporaryDirectory() as tmp_dst_dir: | ||
| dump_app_images(os.path.abspath(os.getcwd()), archs, tmp_dst_dir) |
There was a problem hiding this comment.
i think you can replace os.path.abspath(os.getcwd()) with the cur_work_dir you've added on line 146
There was a problem hiding this comment.
line #148 can change a cur dir, so we have to get it once more here. cur_work_dir is needed here to make sure that execution environment comes back to the original cur work dir. But, all that stuff can be removed, I basically need it for unit & manual testing, it allows me running this script from ci-scripts directory and points it to my factory's containers dir.
There was a problem hiding this comment.
this all kind of makes sense. the code might be a little less complex if this test-env specific logic was moved into the the main section, but I'm not sure. this is tested, so i'd say leave it.
| target_name = list(target_json.keys())[0] | ||
| logging.info('Preloading app images for ' + target_name) | ||
| sha = target_json[target_name]['custom']['containers-sha'] | ||
| image_data_dir = os.path.join('/var/cache/bitbake/app-images/', sha) |
There was a problem hiding this comment.
I guess this directory will grow over time. I've added a note in Jira about this so we remember to attempt to prune old data eventually: https://foundriesio.atlassian.net/jira/software/projects/FF/boards/13?selectedIssue=FF-11
There was a problem hiding this comment.
we might be able to reason about what to delete by looking at targets.json and understanding that no new LMP builds would use an existing tarball, so it could be safe to delete. then we'd basically always keep the minimal amout of tarballs around. but that can be added further down the road
There was a problem hiding this comment.
I consider it as a temporal solution, putting image data is simple and straightforward and good for a start, but it has many drawbacks.
- First of all, it requires the container and lmp builds to have access to the same NFS volume;
- secondary, this storage has size limits and is not redundant and so reliable as GCS or S3;
- Once we move towards using ostree as a storage for container images we will have to store images somewhere on the cloud (most likely Treehub backed by S3 for the beginning) so devices have access to it.
| shutil.rmtree(self._container_dir) | ||
|
|
||
|
|
||
| class DockerDaemon: |
There was a problem hiding this comment.
we already have dockerd running when this code gets hit. Do you need your own instance to make sure you have a pristine environment?
There was a problem hiding this comment.
Yes, there are a few reasons for that (the one you mentioned is valid too):
-
a host environment executing the given may have or may not have dockerd running. The given approach makes the implementation less vulnerable/sensitive to a host env, so it can work on a local host, VM or any other container as long as dockerd is installed;
-
The host dockerd's data root (/var/lib/docker/) can be polluted by data of some other images while we need data/files that belongs solely to the target app images;
-
Using
--platformoption requires running dockerd in the experimental mode."--platform" is only supported on a Docker daemon with experimental features enabled. This option is required because we want to dump images of other than a host architecture (dumping arm64's images on amd64 host). -
It allows dumpling images of all target architectures concurrently, multiple instances of DockerDaemon can be running concurrently without interfering each other what can speed up an overall process. Not implemented yet though :)
| def pull(self, platform, url): | ||
| cmd = self._get_cmd(platform, url) | ||
| logger.info('Fetching image: {}'.format(cmd)) | ||
| subprocess.run(cmd.split(), check=True, stdout=None, stderr=None, env=self._env) |
There was a problem hiding this comment.
think you need to check the return value of subprocess.run or do a subprocess.check_call which will raise an error if it fails
There was a problem hiding this comment.
I think check=True param makes it raise an exception if the command/process fails. Anyway, I see that I need to improve the implementation in terms of exception handling, I think I should catch exception within the given module. Thanks.
There was a problem hiding this comment.
Added (forced push) exception handling in the main.
| logger.info('Copying image {} to {}'.format(service_cfg['image'], dst_dir)) | ||
| downloader.pull(arch, service_cfg['image']) | ||
|
|
||
| yield download_app |
There was a problem hiding this comment.
slick use of chaining together some context managers.
There was a problem hiding this comment.
@doanac foundriesio/meta-lmp#195 has been merged, so I suppose we should merge this PR too.
ec252fd to
bbbd3dd
Compare
|
Please add a license/copyright to the new files (not every file got one, something we should fix, but we can get the new ones right). e.g.: |
| def __init__(self, dst_dir: str, containerd_addr: str): | ||
| self.data_root = dst_dir | ||
| self._containerd_addr = containerd_addr | ||
| #self.cmd = '/home/mike/bin/dockerd-rootless.sh' |
There was a problem hiding this comment.
Please remove this line if not really required.
There was a problem hiding this comment.
Just some minor comments, but LGTM, will let @doanac do the final +1 here.
Updated (forced push) according to the comments.
|
Just some minor comments, but LGTM, will let @doanac do the final +1 here. |
7423eb4 to
28b4f3d
Compare
| image_data_dir = os.path.join('/var/cache/bitbake/app-images/', sha) | ||
|
|
||
| arch_map = {'aarch64': 'arm64', 'x86_64': 'amd64', 'arm': 'arm'} | ||
| image_data_tar = os.path.join(image_data_dir, '{}-{}.tar'.format(sha[:7], arch_map[arch])) |
There was a problem hiding this comment.
just noticed this - sha[:7] isn't actually always the case:
8edbdad
There was a problem hiding this comment.
Are you implying to use a full hash/sha value? BTW, we use it in other places, e.g.
(tag is 7 symbol hash).There was a problem hiding this comment.
not the full hash but tag needs to be computed correctly. In the example you shared its actually getting that from the TAG environment variable set in docker-app-publish.sh which uses git log -1 --format=%h. Basically what I'm saying is 99% of the time the short-hash (TAG) is 7 characters. However, we occasionally have exceptions, so you have to use git log -1 --format=%h to ensure you get the correct tag.
There was a problem hiding this comment.
understood. In this case, I should think of something else as the given script is called during lmp build while 'sha' is a sha of corresponding containers repo commit so running git log -1 --format=%h will yield an abbreviated hash of the current lmp commit.
- dump all container images of each app into a tar file, format dockerd:overlay2; - load the dumped images into var/lib/docker during LmP build. Signed-off-by: Mike Sul <[email protected]>
28b4f3d to
0c1a8cd
Compare
dockerd:overlay2;
Signed-off-by: Mike Sul [email protected]