SEANK.H.LIAO

k8s non root capabilities

using capabilities in kubernetes when you're not root

non root

By default, containers are root in their own little sandbox, which is nice from a don't-need-to-think-about-permissions perspective, but a bit less nice if you're concerned about privilege escalation.

So you can run as non-root, either by setting USER xxx in a Dockerfile or by overriding the user at runtime. But sometimes you need to do special things, like binding to port 80, which normally requires privilege because it is below 1024. Linux gives us capabilities, which grant little slices of elevated permissions.
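To make "little slices" a bit more concrete: each capability is one bit in a set of masks that Linux exposes per process in /proc/self/status. Below is a minimal sketch that decodes those masks and checks the bit for CAP_NET_BIND_SERVICE (bit 10 in linux/capability.h). The helper names hasCap and parseCapLine are made up for this example, and it only prints something useful on Linux.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// capNetBindService is the bit index of CAP_NET_BIND_SERVICE.
const capNetBindService = 10

// hasCap reports whether the given capability bit is set in mask.
func hasCap(mask uint64, bit uint) bool {
	return mask&(1<<bit) != 0
}

// parseCapLine parses a /proc status line such as "CapEff:\t0000000000000400".
// It returns ok=false for lines that are not capability masks.
func parseCapLine(line string) (name string, mask uint64, ok bool) {
	fields := strings.Fields(line)
	if len(fields) != 2 || !strings.HasPrefix(fields[0], "Cap") {
		return "", 0, false
	}
	mask, err := strconv.ParseUint(fields[1], 16, 64)
	if err != nil {
		return "", 0, false
	}
	return strings.TrimSuffix(fields[0], ":"), mask, true
}

func main() {
	f, err := os.Open("/proc/self/status")
	if err != nil {
		fmt.Println("cannot read capabilities (not Linux?):", err)
		return
	}
	defer f.Close()

	sc := bufio.NewScanner(f)
	for sc.Scan() {
		if name, mask, ok := parseCapLine(sc.Text()); ok {
			fmt.Printf("%s %016x net_bind_service=%v\n",
				name, mask, hasCap(mask, capNetBindService))
		}
	}
}
```

The line to watch is CapEff (the effective set): that is the one the kernel consults when the process calls bind.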

docker / kubernetes

Unfortunately for us, granting the capability to the container isn't enough on its own when the process isn't root: a non-root process doesn't automatically end up with the capability in its effective set when the binary is executed. So you actually need to use setcap to set the capability on the binary itself (sometimes necessitating a custom image build) before it will work.

Dockerfile for a simple Go http server

FROM golang:1.16-alpine AS build
WORKDIR /workspace

# static binary for scratch container
ENV CGO_ENABLED=0

# get the setcap command
RUN apk add --update --no-cache libcap

# main.go:
#     package main
#
#     import (
#             "log"
#             "net/http"
#     )
#
#     func main() {
#             log.Fatal(http.ListenAndServe(":80", http.HandlerFunc(func(rw http.ResponseWriter, r *http.Request) {
#                     rw.Write([]byte("ok"))
#             })))
#     }
COPY go.mod main.go ./
RUN go build -o /bin/app .

# set net_bind both [e]ffective and [p]ermitted
RUN setcap cap_net_bind_service+ep /bin/app


FROM scratch
COPY --from=build /bin/app /bin/app
ENTRYPOINT ["/bin/app"]

k8s Pod (can also be combined with PodSecurityPolicy)

apiVersion: v1
kind: Pod
metadata:
  name: http
spec:
  # run as not root
  securityContext:
    runAsGroup: 65535
    runAsNonRoot: true
    runAsUser: 65535
  containers:
    - name: http
      image: http:latest
      imagePullPolicy: IfNotPresent
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          # set permitted privileges
          add:
            - NET_BIND_SERVICE
          # default drop all
          drop:
            - all
        privileged: false
        readOnlyRootFilesystem: true

bonus

test with KinD

docker build -t http .
kind create cluster
kind load docker-image http
kubectl apply -f pod.yaml
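Once the pod is running, one way to confirm the server really did bind to port 80 is to forward a local port to it (kubectl port-forward pod/http 8080:80) and fetch the page. A minimal sketch of such a check follows. The local port 8080 and the helper name verify are assumptions for this example. The expected body "ok" comes from the handler in main.go.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

// verify checks that a response matches what the handler in main.go writes.
func verify(status int, body []byte) error {
	if status != http.StatusOK || string(body) != "ok" {
		return fmt.Errorf("unexpected response: %d %q", status, body)
	}
	return nil
}

func main() {
	// Assumes `kubectl port-forward pod/http 8080:80` is running in another terminal.
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get("http://localhost:8080/")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	if err := verify(resp.StatusCode, body); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("server answered as expected")
}
```

If the request fails, kubectl logs http should show the listen error from the server.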