
Deploying Elasticsearch, fluentd and Kibana on Kubernetes with Digital Ocean

As part of the Digital Ocean Kubernetes Challenge, I deployed the Elasticsearch, Fluentd and Kibana stack for log analytics. It was my first time deploying a StatefulSet and a DaemonSet, and I ran into several challenges along the way, which gave me the opportunity to practice debugging Kubernetes issues.

I started off by developing in a local cluster, but in this guide I will demonstrate how to deploy on Digital Ocean's managed Kubernetes cluster.

Disclaimer#

This is not a production-ready deployment. Rather, it is a simplified guide for learning how to set up Elasticsearch, fluentd and Kibana.

Requirements#

Clone the repo#

The completed repo can be found here. Feel free to clone it to follow along.

$ git clone https://github.com/joshchoo/digital-ocean-kubernetes-challenge.git

Provision a managed Kubernetes cluster#

Let's begin by provisioning three nodes with at least 4GB of RAM each. I found that using Digital Ocean's smaller nodes caused the pods to crash continuously 🥲.

Provisioning three nodes allows the Elasticsearch service to tolerate one node becoming unavailable.

```shell
$ doctl kubernetes cluster create k8s-challenge \
    --size=s-2vcpu-4gb \
    --count=3 \
    --region=sgp1 \
    --surge-upgrade=false \
    --wait=false
```

It takes a while for Digital Ocean to provision the nodes. We can check the progress on the Digital Ocean site:

Digital Ocean provisioning nodes in progress

Once the nodes have been provisioned, we should see the following details:

Digital Ocean nodes provisioned

We can click on the "Kubernetes Dashboard" button to open the dashboard!

On the dashboard, take note that Digital Ocean has provided the do-block-storage Storage Class. We will use this to conveniently provision Elasticsearch with storage backed by Digital Ocean's Block Storage.

Kubernetes dashboard Digital Ocean Block Storage class

Context#

After creating the cluster, doctl should have switched the kubectl context to the newly provisioned cluster. This means that subsequent kubectl commands will run against the Kubernetes cluster on Digital Ocean rather than a local cluster, such as one created by minikube.

```shell
$ kubectl config get-contexts
CURRENT   NAME                    CLUSTER                 AUTHINFO                      NAMESPACE
*         do-sgp1-k8s-challenge   do-sgp1-k8s-challenge   do-sgp1-k8s-challenge-admin
          minikube                minikube                minikube                      default
```
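If you need to flip between clusters while following along, `kubectl config use-context` switches the active context. A quick sketch, assuming the context names shown by `kubectl config get-contexts` above:

```shell
# Point kubectl at the local minikube cluster...
kubectl config use-context minikube

# ...and back at the Digital Ocean cluster.
kubectl config use-context do-sgp1-k8s-challenge
```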

Create the logging namespace#

By default, Kubernetes deploys resources to the default namespace. A namespace is like a logical sub-cluster within the Kubernetes cluster. We shall deploy our log analytics stack to a new logging namespace.

```yaml
# infra/logging-ns.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: logging
```
$ kubectl apply -f infra/logging-ns.yaml

Create the Elasticsearch cluster#

Elasticsearch is a search engine that is commonly used to search for logs across many backend applications.

We shall deploy the Elasticsearch cluster in the newly created logging namespace. The Elasticsearch resource will be a StatefulSet rather than a Deployment, because a StatefulSet gives each pod a stable, persistent identifier. These identifiers ensure that each Elasticsearch pod reattaches to its own storage rather than to another pod's.

As mentioned previously, we create three replicas (each deployed on a different node) so that Elasticsearch can tolerate one node becoming unavailable: if one node fails, the remaining two can still form a quorum and elect a new leader.
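The quorum arithmetic is worth spelling out: electing a leader requires a strict majority of the primary-eligible nodes, so an N-node cluster tolerates floor((N-1)/2) failures. A quick sketch:

```shell
# A quorum needs a strict majority of the N primary-eligible nodes,
# so the cluster survives the loss of floor((N-1)/2) of them.
nodes=3
majority=$(( nodes / 2 + 1 ))    # votes needed to elect a leader
tolerated=$(( (nodes - 1) / 2 )) # node failures the cluster survives
echo "majority=$majority tolerated=$tolerated"  # → majority=2 tolerated=1
```

Note that with only two nodes the majority is still 2, so no failures can be tolerated, which is why three is the useful minimum.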

As previously mentioned, we will use the do-block-storage Storage Class to automatically provision storage on Digital Ocean.

The tedious alternative would have been to manually provision Digital Ocean Block Storage, and deploy Persistent Volume and Persistent Volume Claim resources. Thankfully we don't have to do this.

```yaml
# infra/elasticsearch.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch-cluster
  namespace: logging
spec:
  selector:
    matchLabels:
      app: elasticsearch
  serviceName: elasticsearch
  # Create at least 3 primary-eligible nodes so that if one fails, the others can still safely form a quorum.
  # Source: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery-voting.html
  replicas: 3
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch:7.15.2
          resources:
            limits:
              cpu: 1000m
            requests:
              cpu: 100m
          ports:
            - containerPort: 9200
              name: http-api
              protocol: TCP
            - containerPort: 9300
              name: inter-node
              protocol: TCP
          volumeMounts:
            - name: elasticsearch-data # should match volumeClaimTemplates.metadata.name
              mountPath: /usr/share/elasticsearch/data
          # See required envvars: https://www.elastic.co/guide/en/elasticsearch/reference/7.15/docker.html#docker-compose-file
          env:
            - name: node.name
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name # resolves to elasticsearch-cluster-{ordinal}
            - name: cluster.name
              value: elasticsearch-cluster
            - name: discovery.seed_hosts
              # DNS name for each Pod: <StatefulSet metadata.name>-{ordinal}.<serviceName>.<namespace>.svc.cluster.local
              # Truncated DNS: <StatefulSet metadata.name>-{ordinal}.<serviceName>
              value: "elasticsearch-cluster-0.elasticsearch,elasticsearch-cluster-1.elasticsearch,elasticsearch-cluster-2.elasticsearch"
            # The initial master nodes should be identified by their node.name, which we defined above.
            # See: https://www.elastic.co/guide/en/elasticsearch/reference/7.9/discovery-settings.html#CO15-1
            - name: cluster.initial_master_nodes
              value: "elasticsearch-cluster-0,elasticsearch-cluster-1,elasticsearch-cluster-2"
            # Disabled bootstrap.memory_lock because of error: "memory locking requested for elasticsearch process but memory is not locked"
            # - name: bootstrap.memory_lock
            #   value: "true"
            - name: ES_JAVA_OPTS
              value: "-Xms512m -Xmx512m"
            - name: "node.max_local_storage_nodes"
              value: "3"
      initContainers:
        # https://www.digitalocean.com/community/tutorials/how-to-set-up-an-elasticsearch-fluentd-and-kibana-efk-logging-stack-on-kubernetes
        - name: fix-permissions
          image: busybox
          command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
          securityContext:
            privileged: true
          volumeMounts:
            - name: elasticsearch-data
              mountPath: /usr/share/elasticsearch/data
        # Elasticsearch demands at least vm.max_map_count 262144 compared to the default 65530
        # Related: https://stackoverflow.com/questions/51445846/elasticsearch-max-virtual-memory-areas-vm-max-map-count-65530-is-too-low-inc
        - name: increase-vm-max-map-count
          image: busybox
          command: ["sysctl", "-w", "vm.max_map_count=262144"]
          securityContext:
            privileged: true
  volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        storageClassName: do-block-storage
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
---
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  namespace: logging
spec:
  selector:
    app: elasticsearch
  # Set up a headless service
  clusterIP: None
  ports:
    # The HTTP API interface for client requests
    - name: http-api
      port: 9200
      targetPort: 9200
    # The transport interface for inter-node communication
    - name: inter-node
      port: 9300
      targetPort: 9300
```
$ kubectl apply -f infra/elasticsearch.yaml

We might see some "pod has unbound immediate PersistentVolumeClaims" warnings when viewing logs for the Pods (see image), but that just means there isn't yet an underlying PersistentVolume backing the PersistentVolumeClaims. Don't worry about it! Digital Ocean is provisioning the Block Storage for the PersistentVolume behind the scenes. Just wait a bit and the warning should disappear.
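To watch the claims go from Pending to Bound as the Block Storage is attached, we can list the claims and the dynamically provisioned volumes backing them:

```shell
# Each StatefulSet replica gets its own claim, named
# <volumeClaimTemplate name>-<pod name>, e.g. elasticsearch-data-elasticsearch-cluster-0.
kubectl get pvc -n logging

# The PersistentVolumes that Digital Ocean provisions to satisfy those claims:
kubectl get pv
```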

Kubernetes dashboard unbounded immediate PersistentVolumeClaims warning

We should see the following once the Elasticsearch pods are ready. Notice in the top menu bar that we have narrowed the view to the logging namespace:

Kubernetes dashboard Elasticsearch pods

```shell
$ kubectl get pods -n logging
NAME                      READY   STATUS    RESTARTS   AGE
elasticsearch-cluster-0   1/1     Running   0          11m
elasticsearch-cluster-1   1/1     Running   0          10m
elasticsearch-cluster-2   1/1     Running   0          10m
```

We can test that the service is running by sending a request to it. First, we need to establish a port-forward so that we can send requests via localhost (127.0.0.1) to the Elasticsearch port at :9200.

```shell
$ kubectl port-forward elasticsearch-cluster-0 9200:9200 -n logging
Forwarding from 127.0.0.1:9200 -> 9200
Forwarding from [::1]:9200 -> 9200
```

We should see the following response if Elasticsearch is running successfully.

```shell
$ curl 127.0.0.1:9200
{
  "name" : "elasticsearch-cluster-0",
  "cluster_name" : "elasticsearch-cluster",
  "cluster_uuid" : "84XJWldhT3qX0WNG8_JqEQ",
  "version" : {
    "number" : "7.15.2",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "93d5a7f6192e8a1a12e154a2b81bf6fa7309da0c",
    "build_date" : "2021-11-04T14:04:42.515624022Z",
    "build_snapshot" : false,
    "lucene_version" : "8.9.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
```

Deploy fluentd#

We've just deployed Elasticsearch, but we need a way to collect logs from the other applications running in our Kubernetes cluster. Applications typically write their logs to stdout or stderr, and Kubernetes stores each pod's container logs under /var/log on the node.
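Concretely, the kubelet symlinks each container's log file under /var/log/containers using a `<pod>_<namespace>_<container>-<container-id>.log` naming convention, and these are the files fluentd will tail. A sketch of how such a name is assembled, using a hypothetical pod (the container ID here is made up):

```shell
# Build the log path the kubelet would symlink for a given container.
# All four values below are hypothetical placeholders.
pod=my-app
namespace=default
container=app
id=0123abc
logfile="/var/log/containers/${pod}_${namespace}_${container}-${id}.log"
echo "$logfile"  # → /var/log/containers/my-app_default_app-0123abc.log
```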

fluentd is a data collector that allows us to push these logs to Elasticsearch so that we can search the logs there. We shall deploy it as a DaemonSet because we want fluentd to run on every Kubernetes node.

We will also create a ServiceAccount resource and bind it with a ClusterRole that grants fluentd permissions to get/list/watch pods and namespaces.

Important: We need to ensure that fluentd does not read its own logs from /var/log. Otherwise this can cause an error that prevents fluentd from sending logs to Elasticsearch. See the FLUENT_CONTAINER_TAIL_EXCLUDE_PATH environment variable below for the fix.

```yaml
# infra/fluentd.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: logging
  labels:
    app: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      # Use the fluentd ServiceAccount to run this pod
      serviceAccountName: fluentd
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      containers:
        - name: fluentd
          image: fluent/fluentd-kubernetes-daemonset:v1.14.3-debian-elasticsearch7-1.0
          env:
            - name: FLUENT_ELASTICSEARCH_HOST
              value: "elasticsearch.logging.svc.cluster.local"
            - name: FLUENT_ELASTICSEARCH_PORT
              value: "9200"
            - name: FLUENT_ELASTICSEARCH_SCHEME
              value: "http"
            - name: FLUENTD_SYSTEMD_CONF
              value: disable
            # Fixes logs not being sent to Elasticsearch/Kibana: https://github.com/fluent/fluentd/issues/2545#issuecomment-747488212
            # Prevent Fluentd from reading its own logs recursively
            - name: FLUENT_CONTAINER_TAIL_EXCLUDE_PATH
              value: /var/log/containers/fluent*
            - name: FLUENT_CONTAINER_TAIL_PARSER_TYPE
              value: /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
          resources:
            limits:
              memory: 512Mi
            requests:
              cpu: 100m
              memory: 200Mi
          # Map the Node's folders onto the Pod's
          volumeMounts:
            - name: varlog
              # Kubernetes captures stdout/stderr logs from each pod to the Node's /var/log path.
              mountPath: /var/log
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: logging
  labels:
    app: fluentd
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
  labels:
    app: fluentd
rules:
  - apiGroups:
      - ""
    resources:
      - pods
      - namespaces
    verbs:
      - get
      - list
      - watch
---
# Bind the above ClusterRole permissions to the ServiceAccount
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
  - kind: ServiceAccount
    name: fluentd
    namespace: logging
```

Let's deploy the fluentd resources.

$ kubectl apply -f infra/fluentd.yaml

We should see three fluentd pods, one for each node.

```shell
$ kubectl get pods -n logging
NAME                          READY   STATUS    RESTARTS   AGE
pod/elasticsearch-cluster-0   1/1     Running   0          19m
pod/elasticsearch-cluster-1   1/1     Running   0          18m
pod/elasticsearch-cluster-2   1/1     Running   0          18m
pod/fluentd-5c79q             1/1     Running   0          3m13s
pod/fluentd-j8szn             1/1     Running   0          3m13s
pod/fluentd-wrkkd             1/1     Running   0          3m13s
```

Deploy Kibana#

We've deployed Elasticsearch and fluentd, which means that we can now capture logs from pods running in our Kubernetes cluster and search them by querying Elasticsearch's REST API.
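For example, with the port-forward from earlier still running, we can peek at what fluentd has indexed. The fluentd Elasticsearch output plugin writes to daily logstash-* indices by default (the exact index names will vary by date):

```shell
# List the indices fluentd has created (expect logstash-YYYY.MM.DD entries):
curl '127.0.0.1:9200/_cat/indices?v'

# Search across them; the query field assumes fluentd's default
# Kubernetes metadata enrichment:
curl '127.0.0.1:9200/logstash-*/_search?q=kubernetes.namespace_name:logging&size=3&pretty'
```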

Searching the logs via the REST API is not a great user experience, though. Instead, we can use Kibana, which lets us search the logs from a web browser!

Because Kibana is a stateless application, we will deploy it as a Deployment rather than a StatefulSet. Additionally, we will set Kibana's server base path to /kibana.

```yaml
# infra/kibana.yaml
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: logging
  labels:
    app: kibana
spec:
  ports:
    - port: 5601
  selector:
    app: kibana
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: logging
  labels:
    app: kibana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
        - name: kibana
          image: docker.elastic.co/kibana/kibana:7.15.2
          resources:
            limits:
              cpu: 1000m
            requests:
              cpu: 100m
          env:
            # This should match the path specified in the Ingress file
            - name: SERVER_BASEPATH
              value: "/kibana"
            - name: SERVER_REWRITEBASEPATH
              value: "true"
            - name: ELASTICSEARCH_HOSTS
              value: http://elasticsearch:9200
            - name: ELASTICSEARCH_URL
              value: http://elasticsearch:9200
          ports:
            - containerPort: 5601
```
$ kubectl apply -f infra/kibana.yaml

We should see one Kibana pod running.

```shell
$ kubectl get pods -n logging
NAME                      READY   STATUS    RESTARTS   AGE
elasticsearch-cluster-0   1/1     Running   0          24m
elasticsearch-cluster-1   1/1     Running   0          23m
elasticsearch-cluster-2   1/1     Running   0          23m
fluentd-5c79q             1/1     Running   0          8m19s
fluentd-j8szn             1/1     Running   0          8m19s
fluentd-wrkkd             1/1     Running   0          8m19s
kibana-64dd44844b-cnls2   1/1     Running   0          4m24s
```

Let's set up a port-forward to the Kibana pod so that we can access it from our web browser locally.

```shell
$ kubectl port-forward kibana-64dd44844b-cnls2 5601:5601 -n logging
Forwarding from 127.0.0.1:5601 -> 5601
Forwarding from [::1]:5601 -> 5601
```

In a browser, visit http://localhost:5601/kibana. We should be greeted with the Kibana home web page.

Kibana home page

Before we can start seeing logs, we need to create an "index pattern". Open the sidebar -> Analytics -> Discover -> "Create index pattern".

Kibana create index pattern page

Now enter logstash-* in the "Name" field, select @timestamp in the "Timestamp field" dropdown, and click on "Create index pattern" to continue.

Kibana create index pattern details page

Head back to Discover in the sidebar, and we should see all the logs now! Sweet!

Kibana discover page

Deploying a counter#

Let's check that logging works end-to-end by deploying an application in a namespace other than logging (here, the default namespace). Our application is a counter that logs the date and time to stdout every second.

```yaml
# infra/counter.yaml
apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
    - name: count
      image: busybox
      args: [/bin/sh, -c, 'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep 1; done']
```

```shell
$ kubectl apply -f infra/counter.yaml
```

We can inspect the logs from the counter pod with the following kubectl command:

```shell
$ kubectl logs counter
0: Mon Dec  6 01:20:21 UTC 2021
1: Mon Dec  6 01:20:22 UTC 2021
2: Mon Dec  6 01:20:23 UTC 2021
3: Mon Dec  6 01:20:24 UTC 2021
4: Mon Dec  6 01:20:25 UTC 2021
5: Mon Dec  6 01:20:26 UTC 2021
6: Mon Dec  6 01:20:27 UTC 2021
7: Mon Dec  6 01:20:28 UTC 2021
8: Mon Dec  6 01:20:29 UTC 2021
9: Mon Dec  6 01:20:30 UTC 2021
10: Mon Dec  6 01:20:31 UTC 2021
```

Now, let's search Kibana for kubernetes.pod_name: counter, and we should see the same counter pod's logs!

Kibana counter logs

References#