Tutorial

Get Kafka up and running

You can start a single Kafka broker as follows.

To install Apache Kafka:

Step 1: Download
Download a release from the Apache Kafka downloads page and un-tar it. For example, for Kafka 2.0.0 (the 2.11 in the file name is the Scala version it was built with):

tar -xzf kafka_2.11-2.0.0.tgz
cd kafka_2.11-2.0.0
Step 2: Start ZooKeeper
Kafka depends on ZooKeeper to store cluster metadata. If you already have a ZooKeeper instance up and running, you can skip this step; just make sure the zookeeper.connect setting in config/server.properties points to your running ZooKeeper quorum.

./bin/zookeeper-server-start.sh config/zookeeper.properties
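
For reference, the config/zookeeper.properties that ships with Kafka is a minimal single-node configuration; its defaults look like this (check your own copy, as values can vary across versions):

```properties
# Directory where ZooKeeper stores its snapshot data
dataDir=/tmp/zookeeper
# Port that clients (the Kafka broker) connect to
clientPort=2181
# Disable the per-IP connection limit (not suitable for production)
maxClientCnxns=0
```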
Step 3: Start Kafka

./bin/kafka-server-start.sh config/server.properties
To make running the commands easier, add Kafka's bin directory to your $PATH (assuming $KAFKA_HOME points to the directory you extracted above).

export PATH=$PATH:$KAFKA_HOME/bin

Kafka Command Line (basics)

To start working with the Kafka command-line tools, the following tutorials cover the basics:

Kafka Topics Tutorial

Kafka consumer and producer Tutorial

Source code on GitHub

Multi-broker setup and ISR

Let's start a Kafka cluster with multiple brokers. Use the following docker-compose file and bring the cluster up with docker-compose up (assuming you have Docker and docker-compose installed).

Note that because we are running three containers on the same host, and because each broker's listeners and advertised listeners must be distinguishable, we use a different port for each broker (9092, 9093, and 9094). Keep in mind that in real deployments, where each broker runs on its own host, the port is typically the same on every broker.

Multi-broker Kafka setup in docker-compose
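
The compose file itself is not reproduced here, but a sketch of what such a file can look like is below, using Confluent's cp-kafka images (the image tag, service names, and exact environment variables are assumptions; adapt them to your setup). Only one of the three brokers is shown:

```yaml
version: '2'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:5.0.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181

  # broker2 and broker3 are defined analogously, with broker ids 2 and 3,
  # host ports 9093 and 9094, and matching advertised listeners
  broker1:
    image: confluentinc/cp-kafka:5.0.0
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      # Brokers talk to each other over the Docker network (INTERNAL);
      # clients on the host connect via localhost:9092 (EXTERNAL)
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_LISTENERS: INTERNAL://0.0.0.0:29092,EXTERNAL://0.0.0.0:9092
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://broker1:29092,EXTERNAL://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3
```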

To continue the tutorial, first create a topic. This time we use a replication factor of 3, as we have 3 brokers available.


    kafka-topics --bootstrap-server localhost:9092 --create --replication-factor 3 --partitions 3 --topic test2
    kafka-topics --bootstrap-server localhost:9092 --describe --topic test2
Distribution of brokers and partitions

Let's produce some messages


kafka-console-producer --broker-list localhost:9092 --topic test2
>first message
>second message
>third message

And then consume


kafka-console-consumer --bootstrap-server localhost:9092 --topic test2 --from-beginning
second message
third message
first message
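
The interleaved order above is a hint that the messages landed on different partitions. A minimal Python sketch of the keyless round-robin assignment described below (illustrative only, not Kafka's actual partitioner code; in practice the starting partition is arbitrary):

```python
from itertools import cycle

def assign_round_robin(messages, num_partitions):
    """Assign keyless messages to partitions in round-robin order,
    mimicking what the console producer does for messages without keys."""
    partitions = cycle(range(num_partitions))
    return [(msg, next(partitions)) for msg in messages]

msgs = ["first message", "second message", "third message"]
assignment = assign_round_robin(msgs, 3)
# with 3 messages and 3 partitions, each partition gets exactly one message
```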

There are 3 messages in this topic. Because the console producer sends messages without a key, the producer by design distributes them round-robin across partitions, so each partition holds exactly one message. Let's look at the message in partition 2:


kafka-console-consumer --bootstrap-server localhost:9092 --topic test2 --from-beginning --partition 2
first message

It is first message. As we saw earlier, broker 1 is the leader for this partition. Let's stop that broker with docker stop broker1 and then describe the topic again:


kafka-topics --bootstrap-server localhost:9092 --describe --topic test2
Topic:test2	PartitionCount:3	ReplicationFactor:3	Configs:
	Topic: test2	Partition: 0	Leader: 2	Replicas: 2,3,1	Isr: 2,3
	Topic: test2	Partition: 1	Leader: 3	Replicas: 3,1,2	Isr: 3,2
	Topic: test2	Partition: 2	Leader: 3	Replicas: 1,2,3	Isr: 3,2

As you can see,

  • The replica lists still include broker 1, in the expectation that it will recover.
  • For partition 2, a leader election took place between the surviving replicas (brokers 2 and 3), and broker 3 became the new leader.
  • Broker 1 has been removed from the Isr (in-sync replica) list of every partition.

If you consume the topic again, you will see that all the messages, including first message, are still available. This is the power of replication and failure recovery:


kafka-console-consumer --bootstrap-server localhost:9092 --topic test2 --from-beginning
first message
second message
third message

Tutorial on Avro, Schema Registry, and Kafka

The code in the tutorial can be found on GitHub

Register Avro schema from command line

The Avro schema in JSON format