Apache Kafka is a free, open-source distributed event-streaming platform used by thousands of companies for high-performance data pipelines, stream processing, and data integration at scale. Kafka can handle high volumes of data and lets you pass messages from one endpoint to another in real time.
In this post, we will show you how to install Apache Kafka on Arch Linux.
Step 1 – Configure Repository
The default mirror list in Arch Linux is often outdated or slow, so it is worth pointing pacman at faster mirrors first. You can do this by editing the mirrorlist configuration file:
nano /etc/pacman.d/mirrorlist
Remove all lines and add the following lines:
## Score: 0.7, United States
Server = http://mirror.us.leaseweb.net/archlinux/$repo/os/$arch
## Score: 0.8, United States
Server = http://lug.mtu.edu/archlinux/$repo/os/$arch
Server = http://mirror.nl.leaseweb.net/archlinux/$repo/os/$arch
## Score: 0.9, United Kingdom
Server = http://mirror.bytemark.co.uk/archlinux/$repo/os/$arch
## Score: 1.5, United Kingdom
Server = http://mirrors.manchester.m247.com/arch-linux/$repo/os/$arch
Server = http://archlinux.dcc.fc.up.pt/$repo/os/$arch
## Score: 6.6, United States
Server = http://mirror.cs.pitt.edu/archlinux/$repo/os/$arch
## Score: 6.7, United States
Server = http://mirrors.acm.wpi.edu/archlinux/$repo/os/$arch
## Score: 6.8, United States
Server = http://ftp.osuosl.org/pub/archlinux/$repo/os/$arch
## Score: 7.1, India
Server = http://mirror.cse.iitk.ac.in/archlinux/$repo/os/$arch
## Score: 10.1, United States
Server = http://mirrors.xmission.com/archlinux/$repo/os/$arch
Save and close the file, then update the package indexes with the following command:
pacman -Syu
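Alternatively, if you prefer not to maintain the mirror list by hand, the reflector package can generate a ranked mirrorlist for you. The commands below are a sketch, assuming you want the ten most recently synchronized HTTPS mirrors sorted by download rate:

pacman -S reflector
reflector --protocol https --latest 10 --sort rate --save /etc/pacman.d/mirrorlist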
Step 2 – Install Java
Apache Kafka is a Java-based application, so a Java runtime must be installed on your system. Install the OpenJDK 17 runtime, along with curl (used later to download Kafka), by running the following command.
pacman -S jre17-openjdk curl
After the successful installation, you can verify the Java version using the following command.
java --version
You will get the following output.
openjdk 17.0.6 2023-01-17
OpenJDK Runtime Environment (build 17.0.6+10)
OpenJDK 64-Bit Server VM (build 17.0.6+10, mixed mode)
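If more than one Java environment ends up installed, Arch's archlinux-java helper (assumed here to be present via the java-runtime-common package that the OpenJDK packages pull in) shows which one is currently the default:

archlinux-java status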
Step 3 – Install Apache Kafka
First, create a dedicated user to run Apache Kafka using the following command.
useradd -r -d /opt/kafka -s /usr/sbin/nologin kafka
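As a quick sanity check, you can confirm that the account was created with the expected home directory and a non-login shell:

getent passwd kafka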
Next, download Apache Kafka (version 3.3.2 is used in this tutorial) from the official download page.
curl -fsSLo kafka.tgz https://downloads.apache.org/kafka/3.3.2/kafka_2.13-3.3.2.tgz
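Optionally, you can verify the integrity of the download before extracting it. Apache publishes a SHA-512 checksum next to each release archive; fetch it and compare it against the digest of the local file (the checksum URL below assumes the same mirror and version as the download above):

curl -fsSL https://downloads.apache.org/kafka/3.3.2/kafka_2.13-3.3.2.tgz.sha512
sha512sum kafka.tgz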
Once the download is completed, extract the downloaded file with the following command.
tar -xzf kafka.tgz
Next, move the extracted directory to /opt.
mv kafka_2.13-3.3.2 /opt/kafka
Next, change the ownership of the Kafka directory.
chown -R kafka:kafka /opt/kafka
Next, create a Kafka log directory using the following command.
sudo -u kafka mkdir -p /opt/kafka/logs
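You can confirm that both directories are owned by the kafka user before moving on:

ls -ld /opt/kafka /opt/kafka/logs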
Next, edit the Kafka configuration file and define your log directory.
sudo -u kafka nano /opt/kafka/config/server.properties
Change the following line:
log.dirs=/opt/kafka/logs
Save and close the file when you are done.
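If you prefer to make the change non-interactively, a sed one-liner like the following (a sketch that assumes the stock server.properties still contains a single log.dirs entry) applies the same edit:

sudo -u kafka sed -i 's|^log.dirs=.*|log.dirs=/opt/kafka/logs|' /opt/kafka/config/server.properties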
Step 4 – Create a Systemd Service File for Kafka
It is recommended to create systemd service files to manage the ZooKeeper and Kafka services. First, create a ZooKeeper service file using the following command.
nano /etc/systemd/system/zookeeper.service
Add the following configuration.
[Unit]
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
User=kafka
ExecStart=/opt/kafka/bin/zookeeper-server-start.sh /opt/kafka/config/zookeeper.properties
ExecStop=/opt/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
Next, create a Kafka service file:
nano /etc/systemd/system/kafka.service
Add the following configuration.
[Unit]
Requires=zookeeper.service
After=zookeeper.service

[Service]
Type=simple
User=kafka
ExecStart=/bin/sh -c '/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties > /opt/kafka/logs/start-kafka.log 2>&1'
ExecStop=/opt/kafka/bin/kafka-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
Save and close the file, then reload the systemd daemon to apply the changes.
systemctl daemon-reload
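Optionally, you can ask systemd to sanity-check the unit files for obvious mistakes before starting anything:

systemd-analyze verify /etc/systemd/system/zookeeper.service /etc/systemd/system/kafka.service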
Next, start and enable both services using the following command.
systemctl enable zookeeper kafka
systemctl start zookeeper kafka
You can check the status of both services using the following command.
systemctl status zookeeper kafka
You will get the service status in the following output.
● zookeeper.service
     Loaded: loaded (/etc/systemd/system/zookeeper.service; enabled; preset: disabled)
     Active: active (running) since Mon 2023-02-06 05:05:11 UTC; 17s ago
   Main PID: 54782 (java)
      Tasks: 34 (limit: 4700)
     Memory: 68.5M
     CGroup: /system.slice/zookeeper.service
             └─54782 java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitG>

Feb 06 05:05:13 archlinux zookeeper-server-start.sh[54782]: [2023-02-06 05:05:13,594] INFO zookeeper.snapshot.compression.method = CHECKED (o>
Feb 06 05:05:13 archlinux zookeeper-server-start.sh[54782]: [2023-02-06 05:05:13,594] INFO Snapshotting: 0x0 to /tmp/zookeeper/version-2/snap>
Feb 06 05:05:13 archlinux zookeeper-server-start.sh[54782]: [2023-02-06 05:05:13,598] INFO Snapshot loaded in 11 ms, highest zxid is 0x0, dig>
Feb 06 05:05:13 archlinux zookeeper-server-start.sh[54782]: [2023-02-06 05:05:13,598] INFO Snapshotting: 0x0 to /tmp/zookeeper/version-2/snap>
Feb 06 05:05:13 archlinux zookeeper-server-start.sh[54782]: [2023-02-06 05:05:13,598] INFO Snapshot taken in 0 ms (org.apache.zookeeper.serve>
Feb 06 05:05:13 archlinux zookeeper-server-start.sh[54782]: [2023-02-06 05:05:13,625] INFO zookeeper.request_throttler.shutdownTimeout = 1000>
Feb 06 05:05:13 archlinux zookeeper-server-start.sh[54782]: [2023-02-06 05:05:13,626] INFO PrepRequestProcessor (sid:0) started, reconfigEnab>
Feb 06 05:05:13 archlinux zookeeper-server-start.sh[54782]: [2023-02-06 05:05:13,660] INFO Using checkIntervalMs=60000 maxPerMinute=10000 max>
Feb 06 05:05:13 archlinux zookeeper-server-start.sh[54782]: [2023-02-06 05:05:13,662] INFO ZooKeeper audit is disabled. (org.apache.zookeeper>
Feb 06 05:05:24 archlinux zookeeper-server-start.sh[54782]: [2023-02-06 05:05:24,979] INFO Creating new log file: log.1 (org.apache.zookeeper>

● kafka.service
     Loaded: loaded (/etc/systemd/system/kafka.service; enabled; preset: disabled)
     Active: active (running) since Mon 2023-02-06 05:05:21 UTC; 15s ago
   Main PID: 55164 (sh)
      Tasks: 73 (limit: 4700)
     Memory: 316.6M
     CGroup: /system.slice/kafka.service
             ├─55164 /bin/sh -c "/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties > /opt/kafka/logs/start-kafka.log 2>
             └─55165 java -Xmx1G -Xms1G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInv>

Feb 06 05:05:21 archlinux systemd[1]: Started kafka.service.
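If either service fails to start, the journal and the Kafka start-up log defined in the service file are the first places to look:

journalctl -u zookeeper -u kafka -n 50 --no-pager
tail -n 20 /opt/kafka/logs/start-kafka.log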
Step 5 – Create a Topic in Kafka
At this point, Kafka is installed and running. Now, it’s time to verify Kafka.
To verify Kafka, create a topic named MyTopic using the following command.
sudo -u kafka /opt/kafka/bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic MyTopic
You can now verify your created topic using the following command.
sudo -u kafka /opt/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9092
Sample output.
MyTopic
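You can also inspect the partition and replication details of the topic with the --describe flag:

sudo -u kafka /opt/kafka/bin/kafka-topics.sh --describe --bootstrap-server localhost:9092 --topic MyTopic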
Kafka ships with a command-line client called the console producer, which takes input from a file or from standard input and sends it out as messages to the Kafka cluster.
Run the following command, then type a few messages into the console to send them to the server.
sudo -u kafka /opt/kafka/bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic MyTopic
Type some messages as shown below:
>Hi How are you?
>I am fine
Now, open another terminal and run the console consumer tool to read data from the Kafka cluster and display it on standard output.
sudo -u kafka /opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic MyTopic --from-beginning
You will get the messages in the following output.
Hi How are you?
I am fine
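When you are done testing, you can remove the test topic (assuming topic deletion has not been disabled in server.properties; it is enabled by default):

sudo -u kafka /opt/kafka/bin/kafka-topics.sh --delete --bootstrap-server localhost:9092 --topic MyTopic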
Conclusion
In this tutorial, we explained how to install Apache Kafka on Arch Linux. We also verified the installation using the console producer and consumer. Try installing a Kafka server on dedicated server hosting from Atlantic.Net!