Kafka and Kubernetes - How to "dump" a topic?

Some time ago, I wanted to debug messages on a specific topic in Kafka, but I did not want to do that on production (we run Kafka on Kubernetes using the Bitnami chart). So the question was: how can I do that without touching production? Debugging on production is almost always a bad idea, so I wanted to "dump" the topic to my local machine.

A "dump of a topic" does not exist as such, of course, because what would it even mean in the Kafka world? 🤔 But you can create a consumer that reads the topic from the beginning and pipes the messages into a file; then, using a producer, you can put these messages onto a topic on your local machine.

Remember that my Kafka cluster is deployed on a Kubernetes cluster. I could "ssh" into a Kafka pod, run the command there, and then copy the file to my local machine. Or I can just exec the command on a Kafka pod and redirect its output (the messages) straight into a file on my local machine.

Here is the command:

kubectl exec kafka-0 --namespace kafka -- /opt/bitnami/kafka/bin/kafka-console-consumer.sh --bootstrap-server kafka:9092 --topic name-of-my-topic --from-beginning --timeout-ms 5000 > /tmp/name-of-my-topic

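This runs the command on the kafka-0 pod. Because the redirect (>) is evaluated by your local shell, the output of the command lands in the /tmp/name-of-my-topic file on your local machine. The --timeout-ms 5000 option makes the consumer exit once no message has been available for five seconds, so the command terminates instead of waiting forever.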
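Before replaying anything, it can be worth checking that the dump is not empty. Here is a minimal sketch of such a check. The function name dump_summary is my own invention, and it assumes the dump holds one message per line, which should be the case as long as the message values themselves contain no newlines.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report how many messages a dump file holds.
# Assumes one message per line (no newlines inside message values).
dump_summary() {
  file="$1"
  if [ ! -s "$file" ]; then
    echo "empty or missing: $file" >&2
    return 1
  fi
  # wc -l counts newline-terminated lines, i.e. messages
  count=$(wc -l < "$file" | tr -d ' ')
  echo "$file: $count messages"
}
```

For example, dump_summary /tmp/name-of-my-topic prints the file name followed by the number of messages, or returns a non-zero status if the dump is empty.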

The file now contains the messages, one per line. You can "produce" them to your local Kafka with:

cat [FILE] | kafka-console-producer --bootstrap-server [HOST1:PORT1] --topic [TOPIC]
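One caveat: the console consumer prints only message values by default, so message keys do not survive the round trip. If keys matter to you (for partitioning or compacted topics, say), the sketch below shows one way to keep them, using the print.key property on the consumer and the parse.key property on the producer. Both tools use a tab as the default key separator, so this assumes neither keys nor values contain tabs. The function names and the KUBECTL and PRODUCER variables are hypothetical and exist only to make the commands easy to swap out. The pod, namespace and paths are the ones from the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the command names are overridable for dry runs.
KUBECTL="${KUBECTL:-kubectl}"
PRODUCER="${PRODUCER:-kafka-console-producer}"

# dump_with_keys TOPIC OUTFILE
# Writes "key<TAB>value" lines to OUTFILE on the local machine.
dump_with_keys() {
  "$KUBECTL" exec kafka-0 --namespace kafka -- \
    /opt/bitnami/kafka/bin/kafka-console-consumer.sh \
    --bootstrap-server kafka:9092 --topic "$1" \
    --from-beginning --timeout-ms 5000 \
    --property print.key=true > "$2"
}

# replay_with_keys INFILE BOOTSTRAP TOPIC
# Splits each line at the first tab back into key and value.
replay_with_keys() {
  "$PRODUCER" --bootstrap-server "$2" --topic "$3" \
    --property parse.key=true < "$1"
}
```

Usage would look like dump_with_keys name-of-my-topic /tmp/name-of-my-topic followed by replay_with_keys /tmp/name-of-my-topic localhost:9092 name-of-my-topic. Note that messages without a key are printed with the literal text null as the key, so after replaying them they would carry that string as a real key.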

Here is a script that dumps all topics (excluding internal topics) to your local machine:

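#!/usr/bin/env bash
# List all non-internal topics, one name per line.
# Note: --zookeeper was removed from kafka-topics.sh in Kafka 3.0; on newer versions use --bootstrap-server kafka:9092 instead.
TOPICS=$(kubectl exec kafka-0 --namespace kafka -- /opt/bitnami/kafka/bin/kafka-topics.sh --list --zookeeper kafka-zookeeper:2181 --exclude-internal)
while IFS= read -r topic; do
  [ -n "$topic" ] || continue  # skip blank lines
  # Dump each topic into a local file named after the topic.
  kubectl exec kafka-0 --namespace kafka -- /opt/bitnami/kafka/bin/kafka-console-consumer.sh --bootstrap-server kafka:9092 --topic "$topic" --from-beginning --timeout-ms 5000 > "/tmp/$topic"
done <<< "$TOPICS"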
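The reverse direction can be scripted in the same way. The sketch below replays every dump file in a directory into a local broker, using the file name as the topic name. It rests on a few assumptions: the dumps live in a dedicated directory (replaying everything in /tmp would pick up unrelated files), the local broker listens on localhost:9092, and the topics either exist locally or topic auto-creation is enabled. The replay_all function and the PRODUCER variable are hypothetical names.

```shell
#!/usr/bin/env bash
# Hypothetical counterpart to the dump script.
PRODUCER="${PRODUCER:-kafka-console-producer}"

# replay_all DIR [BOOTSTRAP]
# Replays each non-empty file in DIR to the topic named after the file.
replay_all() {
  dir="$1"
  bootstrap="${2:-localhost:9092}"   # assumed local broker address
  for file in "$dir"/*; do
    [ -f "$file" ] || continue       # skip directories and unmatched globs
    [ -s "$file" ] || continue       # skip empty dumps
    topic=$(basename "$file")
    echo "replaying $topic" >&2
    "$PRODUCER" --bootstrap-server "$bootstrap" --topic "$topic" < "$file"
  done
}
```

With the dumps stored in, say, /tmp/kafka-dump, running replay_all /tmp/kafka-dump would feed each file to the console producer in turn.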

I hope it was useful! 👋
