ci: use fuse-overlayfs to reduce runtime storage used by builder#28602
ci: use fuse-overlayfs to reduce runtime storage used by builder#28602thunder-coding wants to merge 19 commits intomasterfrom
Conversation
a7ddb84 to
4235b7c
Compare
| else | ||
| REPOROOT="$(dirname $(readlink -f $0))/../" | ||
| SEC_OPT=" --security-opt seccomp=$REPOROOT/scripts/profile.json" | ||
| SEC_OPT=" --security-opt seccomp=$REPOROOT/scripts/profile.json --security-opt apparmor=_custom-termux-package-builder --cap-add CAP_SYS_ADMIN --device /dev/fuse" |
There was a problem hiding this comment.
@fornwall as far as I know you are the only person using Darwin for building packages locally. Does the fuse work on Darwin? If I am not mistaken, on MacOS systems, docker runs as on a lightweight VM. I'm not sure if apparmor profiles work on it. Can you confirm?
|
I love the idea but I have a few questions.
|
|
Just so we're clear, this change will only require AppArmor to be installed within the container, right? |
as long as
No, NDK is only used when building outside Termux
Could be done, I thought of that earlier before trying to think of fuse. But our buildscripts assume that we do not modify upstream NDK in ~/lib or that in
AppArmor works at the kernel level, it won't be able to work in monitoring the docker container if it is not running on the host.
It's already packaged by Alpine, Gentoo, Guix, openSUSE, Void, etc. So I think it's fine to depend on it? Could go and add a config to not use apparmor for backward compatibility, but I believe that some of the apparmor hardening that is possible is worth it (like setuid/setgid restrictions).
Should be as simple as installing the apparmor package and enabling it, with their bootmanager, and doing a simple reboot. Does not need to have a distro that is making use of it, just needs to package it. Also, I don't think that disabling setuid and setgid calls is possible without apparmor just after changing the builder uid/gid. This should fully forbid sudo inside containers Also this is the beginning of fuse stuff. If this merge doesn't break anything, I'll be further using fuse for making builds much more reproducible by mounting |
Will there be an option to disable this so that I am a very heavy user of |
I'm not saying it's not available, I'm saying asking contributors to enable a security subsystem their distro may not ship and that they may not understand well is a significant burden to contributing. |
I use Gentoo, and Gentoo does have an |
You can go and load the relaxed profile using
Thanks, makes sense. Will be making apparmor optional then! |
Being open about our motivations is something that can definitely help avoid misunderstandings. |
apparmor wasn't needed earlier, but will be required (or let's say highly recommended) if we continue to go with fuse-based approach for Android NDK and my future plans for making builds more reproducible. mount syscalls on Linux do require And the setuid, setgid changes are being made just because let's just do this while we are at it instead of waiting for another occassion |
In case you wonder, both MrAdityaAlok (in an earlier version of buildorder.py changes PR) and fornwall (in Google Play Termux, visible currently) have implemented their own versions of that, but both of them unfortunately have a somewhat inconvenient performance overhead. Yours is the first version of the idea I have heard that sounds to me like it might solve the performance overhead, since it does sound to me like nested fuse-overlayfs is probably a lot faster than actually copying files around, but if you wonder how yours compares to previous attempts, you can look at Google Play Termux termux-packages and consider benchmarking in comparison to it. If your method ends up having extremely low performance overhead to the point that it's clearly about the same speed as the current performance, then benchmarking likely isn't necessary. |
In the long term, if the follow-up change to however, that problem can be resolved (and improved to catch more errors that are currently not caught) if this PR is approved, because then a workflow that tests on-device builds could eventually be added to the main repository. |
|
When MrAdityaAlok made their space optimization for the CI, they posted a detailed result of how much space would be saved, here is what they showed, I found that very helpful, especially for thinking about how much additional space I could safely use during build while designing contributions, if it's not too much trouble, could you run a similar comparison in GitHub Actions for this space optimization? |
029595e to
890f91a
Compare
Didn't do comparision for this PR, but I have seen locally a decrease of ~2.4G in container sizes. So combined with #28627, we should be seeing 36G+2.4G (for the free-space.sh core part + 2.4G is for removal of compressed docker image) + additional 2.4G for the uncompressed copy of the entire NDK. For building Would appreciate some reviews on this as I am planning to merge this PR #28627 as well as #27739 together |
e42bf19 to
114cd78
Compare
Earlier 54G -> 25G Now 53G -> 16G (10G excluded for docker image)
Does this work? Yes Is this cursed? Not as cursed as using bash for build system
This should allow for more flexible configuration of docker containers by allowing devs to pass on their own flags to docker Also ~/clean.sh will now not remove /home/builder/.termux-build to account for cases where /home/builder/.termux-build is a volume mounted to the docker image, where it is not possible to remove the directory
We now are always using docker for builds, so can't do this
Do not clutter what we do in "Gather build summary" step
Also make sure we use this optimization in package_updates.yml
~/.termux-build on host Should be helpful for local builds for using IDEs and host tools for development
In commits, %ci:free-space will force freeing space in commits In workflow dispatch, a new checkbox should be available
Instead of only supporting one of the flags, we now support passing multiple flags at the same time for more convenience. The command line argument parser will exit as soon as it detects an argument/flag it doesn't handle to preserve maximum compatibility with existing commands
rid of long length variable name
Logic error found by GitHub Copilot
**BREAKING CHANGES** This now requires AppArmor to be installed and running with docker for limiting the capabilities that `CAP_SYS_ADMIN` provides to containers. Host kernel must support fuse. The host /dev/fuse device is passed onto the container. **DETAILED DESCRIPTION** ./scripts/run-docker.sh first starts with relaxed profile, and then after changing the uid and gid of the builder user and group, drops to restricted profile. Each container get's it's own profile so that if ./scripts/run-docker.sh is run parallel with multiple containers, there is no race condition for the when we are changing the builder uid/gid, where the other container will run with higher privileges than needed. For ensuring least privileges, only mount and umount2 syscalls have been permitted in seccomp profile. Additionally rules for allowing clone, clone3 and similar syscalls when certain contain conditions are met and only when CAP_SYS_ADMIN is not set have been removed as we aren't allowing these syscalls when CAP_SYS_ADMIN is set. The AppArmor profile is based on Docker's default AppArmor profile. The profile was extracted using nerdctl (which is an alternate CLI interface to Docker CLI). The profile can be extracted using `nerdctl apparmor inspect`. There are two AppArmor profiles we have setup, one restricted and relaxed. Currently there is little difference between relaxed and restricted profile. The only difference is that relaxed profile allows any kind of mount syscall, while the restricted profile only allows mount syscalls only for fuse.fuse-overlayfs Regarding security of passing /dev/fuse to containers, the Linux kernel documentation specifies that it should be fine to pass this to namespaces. The CAP_SYSTEM_ADMIN is needed only for the mount syscall to work. Due to some historic reasons this needs this dangerous capability. Although the syscall is needed we are only allowing mount and umount syscalls to happen inside the container, so seccomp profile and apparmor profile should be doing the damage control. Linux kernel documentation for fuse-passthrough: https://docs.kernel.org/6.16/filesystems/fuse-passthrough.html Upstream Linux kernel commit for fuse documentation: torvalds/linux@18ee43c Even without apparmor, things should be fine as we aren't fiddling around much with apparmor for security reasons, just currently limiting where the fuse filesystem can be used. In future, AppArmor profiles can also be used for further hardening of docker image
@licy183 and me for Seccomp profile Me for apparmor profile Feel free to add yourself if you believe that you can deal with this nicely. Mostly for others looking for maintenance of the apparmor profile. Just grab the docker's default config using nerdctl apparmor inspect, diff with the current config and figure it out For the seccomp profile, just diff with the exact commit of moby/moby's seccomp profile and store the updated JSON
AppArmor isn't configured by default on distributions other than Ubuntu, so don't mandate it. AppArmor proper configuration and setup is a huge pain especially if you aren't familiar with it and containers in general. Even a lot of the maintainers aren't already familiar with it and using it already so let's just keep it optional and do not use it if not detected on host.
only respect TERMUX_DOCKER_USE_SUDO for running docker commands Even if docker is setup in rootless mode, apparmor needs to be run with sudo. This was the cause of CI failure and me trying to bang my head on why things weren't working on CI. Thanks, a "sudo" is all it takes. I wish it worked in real life as well on people
We assume in a lot of places that all files in termux-packages are trusted. So make sure that the container isn't able to modify anything in this environment. Also do not allow any changes to ~/lib/ where we are storing Android NDK and SDK.
114cd78 to
758b230
Compare
|
Manually merged directly in master. I don't know why it didn't got detected by GitHub, probably I messed up somewhere |
For list of major changes, give a look at the first commit description of this PR, along with the titles of other commits
Will be merging this together with Python 3.13 update on
1st March4th March to reduce the number off docker image builds (Python 3.13 is also modifying the docker image).