-
Notifications
You must be signed in to change notification settings - Fork 129
Integrate Capabilities in Open-Lambda #301
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Great progress! Let's do one case at a time, each in different PRs: How about this first? "right before forking a Leaf to run the user code". But is it before, or after forking? For you four questions:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great start!
|
||
capList[0] = cap; | ||
if (cap_set_flag(caps, CAP_EFFECTIVE, 1, capList, setting) == -1) { | ||
cap_free(caps); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the nice things about a goto is it gives you a single place to clean things up, like this:
if (caps != NULL)
cap_free(caps);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tylerharter I actually didn't get what you mean here. I tried to use the goto statement already. Is there an even better way to do it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have cap_free in multiples places. What I mean is that cap_free can be after "out:". You'll need to check whether it is NULL before freeing there, of course.
@@ -8,6 +9,71 @@ | |||
#include <sys/types.h> | |||
#include <sys/wait.h> | |||
#include <seccomp.h> | |||
#include <sys/capability.h> | |||
|
|||
static int modify_cap_impl(int cap, int setting) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To what extent can we just create Python wrappers around the cap API (can_get_proc, cap_set_flag, etc), and then have the logic using it in Python code only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tylerharter I couldn't actually find a native way to do this in Python (atleast at a granular level) and then came across the seccomp code in this file. It feels like we should be able to do a lot with the wrappers - like we have good amount of control over how we want to expose these functions to Python. Is there a concern around doing it in this way?
We may have to do it in a different way (or use an external module?) for dropping privileges in sock.go since these would only work in Python.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only plan to support Python in the forseeable future, and the more code we have in Python (rather than C), the more people will be able to understand and contribute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tylerharter I see. Based on my research, I couldn't find any official Python libraries that would give us control over capabilities. There are some community wrappers available but I'm assuming they use similar C code underneath. I'm also concerned that if we rely on them, we could open a channel for untrustworthy libraries/ packages that aren't well maintained or supported. What are your thoughts on that?
min-image/runtimes/python/server.py
Outdated
@@ -208,6 +208,7 @@ def main(): | |||
file.write(pid) | |||
print(f'server.py: joined cgroup, close FD {fd_id}') | |||
|
|||
# drop privileges here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make capability dropping optional, via config, like it is for seccomp? https://github.com/open-lambda/open-lambda/blob/main/min-image/runtimes/python/server.py#L189
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tylerharter I added the command line argument for enabling capabilities as well. However, we'll need to pass the flag on to everywhere we drop capabilities - to check it and drop only if it's true. Wondering what would be the best way to do that.
@tylerharter My bad, I meant to say Okay sure, that makes sense. I'll start with |
@tylerharter "How about this first? "right before forking a Leaf to run the user code". But is it before, or after forking?" Taking this scenario first, I'm not sure exactly which one would be better or what the difference in the impact would be - dropping capabilities before vs after? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! Looking forward to seeing the next iteration.
@@ -1,9 +1,10 @@ | |||
FROM ubuntu:22.04 | |||
|
|||
RUN apt-get autoremove |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's leave this out of this PR. Maybe you can drop me a note about why you want this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tylerharter upon setting up OpenLambda for the first time, the DockerFile errored out due to some dependency or cleanup issue and suggested to run this to remove any conflicting stale package in the instance. I don't remember the exact error now but I adding this should make sure none of that happens
@@ -1,4 +1,5 @@ | |||
#include <Python.h> | |||
#include <python3.13/Python.h> | |||
// #include <Python.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to delete old code than comment out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tylerharter This is only here for testing reasons. I think due to how Python is set up on my system, I have to make that particular change (mentioning the exact version in the include statement) or my C linter complains. I'll remove this in the final iteration
@@ -8,6 +9,71 @@ | |||
#include <sys/types.h> | |||
#include <sys/wait.h> | |||
#include <seccomp.h> | |||
#include <sys/capability.h> | |||
|
|||
static int modify_cap_impl(int cap, int setting) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only plan to support Python in the forseeable future, and the more code we have in Python (rather than C), the more people will be able to understand and contribute.
|
||
capList[0] = cap; | ||
if (cap_set_flag(caps, CAP_EFFECTIVE, 1, capList, setting) == -1) { | ||
cap_free(caps); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have cap_free in multiples places. What I mean is that cap_free can be after "out:". You'll need to check whether it is NULL before freeing there, of course.
@@ -177,15 +185,16 @@ def main(): | |||
|
|||
global bootstrap_path | |||
|
|||
if len(sys.argv) < 2: | |||
print("Expected execution: chroot <path_to_root_fs> python3 server.py <path_to_bootstrap.py> [cgroup-count] [enable-seccomp]") | |||
if len(sys.argv) < 3: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a TODO to move to named args (in a future PR) now that we're getting more than a couple.
assert return_val == 0 | ||
|
||
# set CAP_SYS_ADMIN | ||
return_val = ol.modify_caps(21, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid hardcoding this? Anyway we can expose some consts (like CAP_SYS_ADMIN) from our ol module to Python?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, aren't we always doing this, regardless of cmd line args?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tylerharter Right, for that, I was concerned that I would have to pass that boolean flag down the function calls. enable-seccomp
bool is only needed in the beginning where it is "enabled" and then that's it. But since we would have to drop privileges at certain points in the code, we would need the drop-capabilities
bool value to propagate down.
I left it as it is for now for testing while I thought of a good solution for it or if you have any suggestions.
print(" cgroup-count: number of FDs (starting at 3) that refer to /sys/fs/cgroup/..../cgroup.procs files") | ||
print(" enable-seccomp: true/false to enable or disables seccomp filtering") | ||
print(" enable-seccomp: true/false to enable or disable seccomp filtering") | ||
print(" enable-capabilities: true/false to enable or disable capability dropping") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should call it drop-capabilities or something like that? I think it is intuitive that "enable-seccomp" is going to restrict you. But it's less obvious whether "enable-capabilities" is going to give you more capabilities, or turn on logic that limits your capabilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense, I'll change the name
Upon looking at the code (and the workshop slides), I found 3 possible places where we might drop privileges:
freshProc()
[src/worker/sandbox/sock.go
] right before exec-ing theserver.py
.fork_server()
[min-image/runtimes/server.py
] right before the while loop that forks a new Zygotestart_container()
[min-image/runtimes/server.py
] right before forking a Leaf to run the user code.Some general questions I have:
web_server.py
orfork_server.py
invoked? (EDIT: I meantweb_server()
andfork_server()
functions inserver.py
my bad)su
before I can navigate to the directory and create a lambda function with my code. Is it possible that none of this is run as root right from the beginning itself so this issue doesn't occur either? Or maybe the base directory can be initialized with more relaxed permissions?strace
the system calls (or if we know ones that are needed already) to see which capabilities they need and then iteratively add and test them to see if all things work as expected. Which also brings me to another question