k8s clustered apps

starting clustered applications in k8s

SEAN K.H. LIAO

clustered

So you want to run a clustered thing in k8s? Likely a database, using raft or similar.

Use a StatefulSet: each pod gets its own PersistentVolume and its own stable, addressable hostname. The hostname can be read from the HOSTNAME env var (possibly unstable?), or set in a custom env var with a fieldRef.
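A minimal sketch of those two StatefulSet features. Every name here (app, app-headless, the image, POD_FQDN, the 1Gi size) is a placeholder chosen for illustration, and the DNS name assumes the default namespace and the default cluster.local cluster domain:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: app # hypothetical name
spec:
  serviceName: app-headless # headless Service that gives the pods their DNS names
  replicas: 3
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
        - name: app
          image: example.com/app:latest # placeholder image
          env:
            # pod name (app-0, app-1, ...) injected through the downward API
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            # stable per-pod DNS name: <pod>.<serviceName>.<namespace>.svc.<cluster domain>
            - name: POD_FQDN
              value: "$(POD_NAME).app-headless.default.svc.cluster.local"
          volumeMounts:
            - name: data
              mountPath: /data
  # one PersistentVolumeClaim per pod: data-app-0, data-app-1, ...
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```

The $(POD_NAME) reference only expands because POD_NAME is declared earlier in the same env list.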

Use cert-manager: who wants to futz around with CSRs?
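cert-manager needs an issuer before it can sign anything, and the manifests in this post assume a ClusterIssuer called internal-ca. A minimal sketch of a CA-backed one, assuming the CA certificate and key already sit in a TLS secret (the name internal-ca-key-pair is made up here) in the namespace cert-manager uses for cluster-scoped resources, usually cert-manager:

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: internal-ca
spec:
  ca:
    # hypothetical secret holding the CA's tls.crt and tls.key
    secretName: internal-ca-key-pair
```

Certificates then point at it through issuerRef with kind: ClusterIssuer.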

Use publishNotReadyAddresses: true on a headless service to get name resolution before pods are ready. The pods need to see each other before they can become ready.

etcd

People on the internet only talk about running etcd outside the cluster for k8s...

Notes: the data dir is an emptyDir here and should be changed to something persistent. Assumes a CA is available as a ClusterIssuer called internal-ca.

Just works (I think)

# serving cert shared by all members, covers both the regular and the headless service names
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: etcd-certs
spec:
  secretName: etcd-certs
  duration: 2160h
  renewBefore: 360h
  dnsNames:
    - "localhost"
    - "etcd"
    - "etcd.default"
    - "etcd.default.svc"
    - "etcd.default.svc.cluster.local"
    - "*.etcd-headless"
    - "*.etcd-headless.default"
    - "*.etcd-headless.default.svc"
    - "*.etcd-headless.default.svc.cluster.local"
  ipAddresses:
    - "127.0.0.1"
    - "::1"
  issuerRef:
    name: internal-ca
    kind: ClusterIssuer
---
apiVersion: v1
kind: Secret
metadata:
  name: etcd
  labels:
    app.kubernetes.io/name: etcd
type: Opaque
data:
  etcd-root-password: "eDgzelB1aVlsUQ=="
---
# headless service: per-pod DNS names, published before the pods are ready
apiVersion: v1
kind: Service
metadata:
  name: etcd-headless
  labels:
    app.kubernetes.io/name: etcd
spec:
  type: ClusterIP
  clusterIP: None
  publishNotReadyAddresses: true
  ports:
    - name: client
      port: 2379
      targetPort: client
    - name: peer
      port: 2380
      targetPort: peer
  selector:
    app.kubernetes.io/name: etcd
---
apiVersion: v1
kind: Service
metadata:
  name: etcd
  labels:
    app.kubernetes.io/name: etcd
spec:
  type: ClusterIP
  ports:
    - name: client
      port: 2379
      targetPort: client
    - name: peer
      port: 2380
      targetPort: peer
  selector:
    app.kubernetes.io/name: etcd
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: etcd
  labels:
    app.kubernetes.io/name: etcd
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: etcd
  serviceName: etcd-headless
  # start all members at once, they need each other to form a quorum
  podManagementPolicy: Parallel
  replicas: 3
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app.kubernetes.io/name: etcd
    spec:
      securityContext:
        fsGroup: 1001
        runAsUser: 1001
      containers:
        - name: etcd
          image: docker.io/bitnami/etcd:3.4.13-debian-10-r22
          imagePullPolicy: "IfNotPresent"
          command:
            - etcd
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: ETCDCTL_API
              value: "3"
            - name: ETCD_NAME
              value: "$(POD_NAME)"
            - name: ETCD_DATA_DIR
              value: /bitnami/etcd/data
            - name: ETCD_ADVERTISE_CLIENT_URLS
              value: "https://$(POD_NAME).etcd-headless.default.svc.cluster.local:2379"
            - name: ETCD_LISTEN_CLIENT_URLS
              value: "https://0.0.0.0:2379"
            - name: ETCD_INITIAL_ADVERTISE_PEER_URLS
              value: "https://$(POD_NAME).etcd-headless.default.svc.cluster.local:2380"
            - name: ETCD_LISTEN_PEER_URLS
              value: "https://0.0.0.0:2380"
            # serve /health and /metrics over plain HTTP for the probes below,
            # without this nothing listens on 2381
            - name: ETCD_LISTEN_METRICS_URLS
              value: "http://0.0.0.0:2381"
            - name: ALLOW_NONE_AUTHENTICATION
              value: "yes"
            - name: ETCD_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: etcd
                  key: etcd-root-password
            # member names must match the pod names
            - name: ETCD_INITIAL_CLUSTER
              value: "etcd-0=https://etcd-0.etcd-headless.default.svc.cluster.local:2380,etcd-1=https://etcd-1.etcd-headless.default.svc.cluster.local:2380,etcd-2=https://etcd-2.etcd-headless.default.svc.cluster.local:2380"
            - name: ETCD_INITIAL_CLUSTER_STATE
              value: new
            # client TLS
            - name: ETCD_CLIENT_CERT_AUTH
              value: "true"
            - name: ETCD_TRUSTED_CA_FILE
              value: /var/secret/tls/ca.crt
            - name: ETCD_CERT_FILE
              value: /var/secret/tls/tls.crt
            - name: ETCD_KEY_FILE
              value: /var/secret/tls/tls.key
            # peer TLS, reusing the same cert
            - name: ETCD_PEER_CLIENT_CERT_AUTH
              value: "true"
            - name: ETCD_PEER_TRUSTED_CA_FILE
              value: /var/secret/tls/ca.crt
            - name: ETCD_PEER_CERT_FILE
              value: /var/secret/tls/tls.crt
            - name: ETCD_PEER_KEY_FILE
              value: /var/secret/tls/tls.key
          ports:
            - name: client
              containerPort: 2379
            - name: peer
              containerPort: 2380
            - name: metrics
              containerPort: 2381
          livenessProbe:
            httpGet:
              path: /health
              port: 2381
          readinessProbe:
            httpGet:
              path: /health
              port: 2381
          volumeMounts:
            - name: certs
              mountPath: /var/secret/tls
            - name: data
              mountPath: /bitnami/etcd
      volumes:
        - name: certs
          secret:
            secretName: etcd-certs
        # not persistent, data is lost when the pod goes away
        - name: data
          emptyDir: {}

cockroachdb

Notes: cockroach will complain if the certs have wider perms than rwx------. That causes issues when running as non-root in k8s, which keeps root as the volume owner and grants access through fsGroup, i.e. through group permissions. The manifests are slightly modified from the official ones (changed service names, certs, and data dir, for kind). Assumes a CA is available and called internal-ca.

Nodes need a manual action to join. This could be a Job, but it needs to be timed right (after the certs are signed and the nodes have started):

kubectl exec -it cockroachdb-0 -- /cockroach/cockroach init --certs-dir=/cockroach/cockroach-certs
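The Job variant of that init step might look roughly like the sketch below. It is untested and makes a few assumptions: the Job name cockroachdb-init is made up, the root client cert comes from the cockroachdb-client-root secret, and restartPolicy: OnFailure is used to paper over the timing problem by retrying until a node answers. Running init against an already initialized cluster is expected to fail, so treat it as one-shot.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: cockroachdb-init # hypothetical name
spec:
  template:
    spec:
      # retry until the nodes are up and accepting connections
      restartPolicy: OnFailure
      containers:
        - name: init
          image: cockroachdb/cockroach:v20.1.5
          command:
            - /cockroach/cockroach
            - init
            - --certs-dir=/cockroach/cockroach-certs
            - --host=cockroachdb-0.cockroachdb-headless.default
          volumeMounts:
            # cockroach expects ca.crt, client.root.crt and client.root.key in the certs dir
            - name: client
              mountPath: /cockroach/cockroach-certs/ca.crt
              subPath: ca.crt
            - name: client
              mountPath: /cockroach/cockroach-certs/client.root.crt
              subPath: tls.crt
            - name: client
              mountPath: /cockroach/cockroach-certs/client.root.key
              subPath: tls.key
      volumes:
        - name: client
          secret:
            secretName: cockroachdb-client-root
            defaultMode: 256 # 0400, cockroach rejects keys with wider perms
```

The pod will not start until the client cert secret exists, which covers the "after cert signing" half of the timing.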

manifest:

# node cert, shared by all nodes
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: cockroachdb-node
spec:
  secretName: cockroachdb-node
  duration: 2160h
  renewBefore: 360h
  dnsNames:
    - "node"
    - "localhost"
    - "cockroachdb"
    - "cockroachdb.default"
    - "cockroachdb.default.svc"
    - "cockroachdb.default.svc.cluster.local"
    - "*.cockroachdb-headless"
    - "*.cockroachdb-headless.default"
    - "*.cockroachdb-headless.default.svc"
    - "*.cockroachdb-headless.default.svc.cluster.local"
  ipAddresses:
    - "127.0.0.1"
    - "::1"
  issuerRef:
    name: internal-ca
    kind: ClusterIssuer
---
# client cert for the root user
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: cockroachdb-client-root
spec:
  secretName: cockroachdb-client-root
  duration: 2160h
  renewBefore: 360h
  commonName: root
  usages:
    - client auth
  issuerRef:
    name: internal-ca
    kind: ClusterIssuer
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cockroachdb
  labels:
    app: cockroachdb
---
# rbac v1beta1 was removed in k8s 1.22, use v1
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cockroachdb
  labels:
    app: cockroachdb
rules:
  - apiGroups:
      - ""
    resources:
      - secrets
    verbs:
      - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cockroachdb
  labels:
    app: cockroachdb
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cockroachdb
subjects:
  - kind: ServiceAccount
    name: cockroachdb
    namespace: default
---
apiVersion: v1
kind: Service
metadata:
  name: cockroachdb
  labels:
    app: cockroachdb
spec:
  ports:
    - port: 26257
      targetPort: 26257
      name: grpc
    - port: 8080
      targetPort: 8080
      name: http
  selector:
    app: cockroachdb
---
# headless service: per-pod DNS names, published before the pods are ready
apiVersion: v1
kind: Service
metadata:
  name: cockroachdb-headless
  labels:
    app: cockroachdb
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: "_status/vars"
    prometheus.io/port: "8080"
spec:
  ports:
    - port: 26257
      targetPort: 26257
      name: grpc
    - port: 8080
      targetPort: 8080
      name: http
  publishNotReadyAddresses: true
  clusterIP: None
  selector:
    app: cockroachdb
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cockroachdb
spec:
  serviceName: "cockroachdb-headless"
  replicas: 3
  selector:
    matchLabels:
      app: cockroachdb
  template:
    metadata:
      labels:
        app: cockroachdb
    spec:
      serviceAccountName: cockroachdb
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - cockroachdb
                topologyKey: kubernetes.io/hostname
      containers:
        - name: cockroachdb
          image: cockroachdb/cockroach:v20.1.5
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 26257
              name: grpc
            - containerPort: 8080
              name: http
          livenessProbe:
            httpGet:
              path: "/health"
              port: http
              scheme: HTTPS
            initialDelaySeconds: 30
            periodSeconds: 5
          readinessProbe:
            httpGet:
              path: "/health?ready=1"
              port: http
              scheme: HTTPS
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 2
          volumeMounts:
            - name: datadir
              mountPath: /cockroach/cockroach-data
            # map cert-manager's key names onto the file names cockroach expects
            - name: certs
              mountPath: /cockroach/cockroach-certs/ca.crt
              subPath: ca.crt
            - name: certs
              mountPath: /cockroach/cockroach-certs/node.crt
              subPath: tls.crt
            - name: certs
              mountPath: /cockroach/cockroach-certs/node.key
              subPath: tls.key
            - name: client
              mountPath: /cockroach/cockroach-certs/client.root.crt
              subPath: tls.crt
            - name: client
              mountPath: /cockroach/cockroach-certs/client.root.key
              subPath: tls.key
          env:
            - name: COCKROACH_CHANNEL
              value: kubernetes-secure
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: GOMAXPROCS
              valueFrom:
                resourceFieldRef:
                  resource: limits.cpu
                  divisor: "1"
            - name: MEMORY_LIMIT_MIB
              valueFrom:
                resourceFieldRef:
                  resource: limits.memory
                  divisor: "1Mi"
          command:
            - /bin/sh
            - -exc
            - >
              /cockroach/cockroach
              start
              --logtostderr=WARNING
              --certs-dir=/cockroach/cockroach-certs
              --advertise-host=$(POD_NAME).cockroachdb-headless.default
              --http-addr=0.0.0.0
              --join=cockroachdb-0.cockroachdb-headless.default,cockroachdb-1.cockroachdb-headless.default,cockroachdb-2.cockroachdb-headless.default
      terminationGracePeriodSeconds: 60
      volumes:
        # not persistent, data is lost when the pod goes away
        - name: datadir
          emptyDir: {}
        - name: certs
          secret:
            secretName: cockroachdb-node
            defaultMode: 256 # 0400 in octal
        - name: client
          secret:
            secretName: cockroachdb-client-root
            defaultMode: 256 # 0400 in octal
  podManagementPolicy: Parallel
  updateStrategy:
    type: RollingUpdate