Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal] New DMZ solution to defeat CVE-2019-5736 #4450

Open
lifubang opened this issue Oct 16, 2024 · 2 comments
Open

[Proposal] New DMZ solution to defeat CVE-2019-5736 #4450

lifubang opened this issue Oct 16, 2024 · 2 comments

Comments

@lifubang
Copy link
Member

lifubang commented Oct 16, 2024

Because of CVE-2019-5736, we have to take some actions to prevent runc binary in the host
could be modified from the container, we had many solutions for these years, for example:
bindfd, memfd, otmpfile, os tmpfile, Embedded dmz and overlay. Most of then can work
for most of times, but in practice, there were many issues for these solutions. So we should find
some new possible solutions.

For Embedded dmz solution, we embed a small binary file to start the real container process,
but it may cause some issues(#4518), especially the capability behavior change issue(#4125). So
we make this solution as a opt-in solution. Consider we have moved the binary clone code
from runc init to runc parent process(#3987), so the memory usage of binary clone will not
be included in container's memory cgroup accounting. We can embed a big binary file in runc,
and copy it to memfd, then use it to start runc init process.

The whole steps should like this:

  1. Move runc init to contrib/cmd and it could be compiled as a separate binary file, for example
    name it as runc-dmz;
  2. Embed runc-dmz to runc binary;
  3. Remove the runc-dmz file;
  4. Change the old dmz solution to copy runc-dmz binary data to memfd;
  5. Use this memfd to start runc init, like runc-dmz init.

I think the size of runc-dmz should be smaller than runc, and we can read it to memory paralleled,
so it would help to reduce the start time of the container, and will fix the issue #4449.

@cyphar
Copy link
Member

cyphar commented Oct 16, 2024

I'm not sure that a separated runc init would be much smaller (most of runc's logic lives in runc init after all) and doing this split would make the whole runc binary a fair bit larger. So the overhead of copying the binary would still be fairly large (given that the current ~15MB copy adds ~60% overhead using the figures from #4448.)

The overlayfs approach (which it seems I suggested doing in the commit message when I wrote this protection in 0a8e411 😅) has basically no performance or memory overhead (based on my performance testing in #4448, it has <1% overhead over doing no copying at all and that might just be noise) and solves the most common case. It also has the benefit of giving us page cache sharing, which none of the copy-based solutions can.

Conceptually, having runc init be separate sounds nice but I don't think the extra binary size overhead is worth it. 15MB for a program as small as runc is already insane, let alone the ~25MB we would get if we had runc init be a separate embedded binary.

(Also O_TMPFILE and tempfiles are the same solution. It's just that O_TMPFILE is the nicer way of doing it on newer kernels.)

@kolyshkin
Copy link
Contributor

I played with the separate runc init a long time ago; I don't remember any details now except it was not worth it.

Overlay approach indeed looks very promising since most of the systems where runc is used have overlayfs. The only catch is it requires a relatively new kernel v5.1, released in May 2019, meaning some very old distros might not have the syscalls needed (fsopen/fsmount); but even Ubuntu 18.04 has kernel 5.15 (in HWE), and RHEL8 kernel has needed syscalls backported.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants