Apache Kafka is a free, open-source distributed data-streaming platform used by thousands of companies for high-performance real-time data pipelines, stream processing, and data integration at scale. Kafka can handle a high volume of data and lets you pass messages from one endpoint to another.

In this post, we will show you how to install Apache Kafka on Arch Linux.

Step 1 – Configure Repository

The default repository mirror list in Arch Linux is often outdated, so you will need to replace it. You can do so by editing the mirrorlist configuration file:

nano /etc/pacman.d/mirrorlist

Remove the existing entries and add the following lines:

## Score: 0.7, United States
Server = http://mirror.us.leaseweb.net/archlinux/$repo/os/$arch
## Score: 0.8, United States
Server = http://lug.mtu.edu/archlinux/$repo/os/$arch
Server = http://mirror.nl.leaseweb.net/archlinux/$repo/os/$arch
## Score: 0.9, United Kingdom
Server = http://mirror.bytemark.co.uk/archlinux/$repo/os/$arch
## Score: 1.5, United Kingdom
Server = http://mirrors.manchester.m247.com/arch-linux/$repo/os/$arch
Server = http://archlinux.dcc.fc.up.pt/$repo/os/$arch
## Score: 6.6, United States
Server = http://mirror.cs.pitt.edu/archlinux/$repo/os/$arch
## Score: 6.7, United States
Server = http://mirrors.acm.wpi.edu/archlinux/$repo/os/$arch
## Score: 6.8, United States
Server = http://ftp.osuosl.org/pub/archlinux/$repo/os/$arch
## Score: 7.1, India
Server = http://mirror.cse.iitk.ac.in/archlinux/$repo/os/$arch
## Score: 10.1, United States
Server = http://mirrors.xmission.com/archlinux/$repo/os/$arch

Save and close the file, then synchronize the package databases and update the system with the following command:

pacman -Syu
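
Alternatively, if you prefer not to maintain the mirror list by hand, the reflector utility can generate a fresh, speed-ranked mirror list for you. The commands below are an optional sketch; adjust the country and mirror count to your location.

pacman -S reflector
reflector --country 'United States' --latest 10 --sort rate --save /etc/pacman.d/mirrorlist
pacman -Syu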

Step 2 – Install Java

Apache Kafka runs on the Java Virtual Machine, so a Java runtime must be installed on your system. Install the OpenJDK 17 runtime, along with curl (used later to download Kafka), by running the following command.

pacman -S jre17-openjdk curl

After the installation completes, you can verify the Java version using the following command.

java --version

You will get the following output.

openjdk 17.0.6 2023-01-17
OpenJDK Runtime Environment (build 17.0.6+10)
OpenJDK 64-Bit Server VM (build 17.0.6+10, mixed mode)
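
If more than one Java environment ends up installed on the system, Arch's archlinux-java helper can show and switch the default. This step is optional; the environment name below assumes the jre17-openjdk package installed above.

archlinux-java status
archlinux-java set java-17-openjdk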

Step 3 – Install Apache Kafka

First, create a dedicated user to run Apache Kafka using the following command.

useradd -r -d /opt/kafka -s /usr/sbin/nologin kafka

Next, download Apache Kafka (version 3.3.2 at the time of writing) from the official Apache downloads server.

curl -fsSLo kafka.tgz https://downloads.apache.org/kafka/3.3.2/kafka_2.13-3.3.2.tgz
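
Optionally, you can check the integrity of the download against the SHA-512 checksum that Apache publishes alongside each release (the URL below assumes the usual layout of the downloads site). Compare the two values manually before proceeding.

curl -fsSL https://downloads.apache.org/kafka/3.3.2/kafka_2.13-3.3.2.tgz.sha512
sha512sum kafka.tgz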

Once the download is complete, extract the downloaded file with the following command.

tar -xzf kafka.tgz

Next, move the extracted directory to /opt.

mv kafka_2.13-3.3.2 /opt/kafka

Next, change the ownership of the Kafka directory.

chown -R kafka:kafka /opt/kafka

Next, create a Kafka log directory using the following command.

sudo -u kafka mkdir -p /opt/kafka/logs

Next, edit the Kafka configuration file and define your log directory.

sudo -u kafka nano /opt/kafka/config/server.properties

Find the log.dirs directive and change it as follows:

log.dirs=/opt/kafka/logs

Save and close the file when you are done.
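
If clients on other machines will connect to this broker, you may also want to set the listener directives in the same file. The snippet below is an optional example; replace your-server-ip with the address clients should use, and keep the defaults if everything runs on localhost.

listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://your-server-ip:9092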

Step 4 – Create a Systemd Service File for Kafka

It is recommended to create systemd service files to manage the ZooKeeper and Kafka services. First, create a ZooKeeper service file using the following command.

nano /etc/systemd/system/zookeeper.service

Add the following configuration.

[Unit]
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
User=kafka
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Next, create a Kafka service file:

nano /etc/systemd/system/kafka.service

Add the following configuration.

[Unit]
Requires=zookeeper.service
After=zookeeper.service

[Service]
Type=simple
User=kafka
ExecStart=/bin/sh -c '/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties > /opt/kafka/logs/start-kafka.log 2>&1'
ExecStop=/opt/kafka/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target

Save and close the file, then reload the systemd daemon to apply the changes.

systemctl daemon-reload

Next, enable and start both services using the following commands.

systemctl enable zookeeper kafka
systemctl start zookeeper kafka

You can check the status of both services using the following command.

systemctl status zookeeper kafka

You will get the service status in the following output.

● zookeeper.service
     Loaded: loaded (/etc/systemd/system/zookeeper.service; enabled; preset: disabled)
     Active: active (running) since Mon 2023-02-06 05:05:11 UTC; 17s ago
   Main PID: 54782 (java)
      Tasks: 34 (limit: 4700)
     Memory: 68.5M
     CGroup: /system.slice/zookeeper.service
             └─54782 java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitG>

Feb 06 05:05:13 archlinux zookeeper-server-start.sh[54782]: [2023-02-06 05:05:13,594] INFO zookeeper.snapshot.compression.method = CHECKED (o>
Feb 06 05:05:13 archlinux zookeeper-server-start.sh[54782]: [2023-02-06 05:05:13,594] INFO Snapshotting: 0x0 to /tmp/zookeeper/version-2/snap>
Feb 06 05:05:13 archlinux zookeeper-server-start.sh[54782]: [2023-02-06 05:05:13,598] INFO Snapshot loaded in 11 ms, highest zxid is 0x0, dig>
Feb 06 05:05:13 archlinux zookeeper-server-start.sh[54782]: [2023-02-06 05:05:13,598] INFO Snapshotting: 0x0 to /tmp/zookeeper/version-2/snap>
Feb 06 05:05:13 archlinux zookeeper-server-start.sh[54782]: [2023-02-06 05:05:13,598] INFO Snapshot taken in 0 ms (org.apache.zookeeper.serve>
Feb 06 05:05:13 archlinux zookeeper-server-start.sh[54782]: [2023-02-06 05:05:13,625] INFO zookeeper.request_throttler.shutdownTimeout = 1000>
Feb 06 05:05:13 archlinux zookeeper-server-start.sh[54782]: [2023-02-06 05:05:13,626] INFO PrepRequestProcessor (sid:0) started, reconfigEnab>
Feb 06 05:05:13 archlinux zookeeper-server-start.sh[54782]: [2023-02-06 05:05:13,660] INFO Using checkIntervalMs=60000 maxPerMinute=10000 max>
Feb 06 05:05:13 archlinux zookeeper-server-start.sh[54782]: [2023-02-06 05:05:13,662] INFO ZooKeeper audit is disabled. (org.apache.zookeeper>
Feb 06 05:05:24 archlinux zookeeper-server-start.sh[54782]: [2023-02-06 05:05:24,979] INFO Creating new log file: log.1 (org.apache.zookeeper>
● kafka.service
     Loaded: loaded (/etc/systemd/system/kafka.service; enabled; preset: disabled)
     Active: active (running) since Mon 2023-02-06 05:05:21 UTC; 15s ago
   Main PID: 55164 (sh)
      Tasks: 73 (limit: 4700)
     Memory: 316.6M
     CGroup: /system.slice/kafka.service
             ├─55164 /bin/sh -c "/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties > /opt/kafka/logs/start-kafka.log 2>
             └─55165 java -Xmx1G -Xms1G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInv>

Feb 06 05:05:21 archlinux systemd[1]: Started kafka.service.
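
As an optional extra check, you can confirm that ZooKeeper and Kafka are listening on their default ports (2181 and 9092, respectively).

ss -ltn | grep -E '2181|9092'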

Step 5 – Create a Topic in Kafka

At this point, Kafka is installed and running. Now, it’s time to verify Kafka.

To verify Kafka, create a topic named MyTopic using the following command.

sudo -u kafka /opt/kafka/bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic MyTopic

You can verify that the topic was created by listing all topics with the following command.

sudo -u kafka /opt/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9092

Sample output.

MyTopic
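
You can also optionally inspect the partition and replication details of the topic.

sudo -u kafka /opt/kafka/bin/kafka-topics.sh --describe --bootstrap-server localhost:9092 --topic MyTopic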

Kafka provides a command-line producer client that takes input from a file or from standard input and sends it as messages to the Kafka cluster.

Run the following command, then type a few messages into the console to send them to the server.

sudo -u kafka /opt/kafka/bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic MyTopic

Type some messages as shown below:

>Hi How are you?
>I am fine

Now, open another terminal and run the consumer command-line tool to read the data from the Kafka cluster and display it on standard output.

sudo -u kafka /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic MyTopic --from-beginning

You will get the messages in the following output.

Hi How are you?
I am fine
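
When you are done testing, you can optionally delete the test topic.

sudo -u kafka /opt/kafka/bin/kafka-topics.sh --delete --bootstrap-server localhost:9092 --topic MyTopic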

Conclusion

In this tutorial, we explained how to install Apache Kafka on Arch Linux. We also verified the installation using the console producer and consumer clients. Try installing a Kafka server on dedicated server hosting from Atlantic.Net!