Tools: Apache Kafka Quickstart - Install Kafka 4.2 with CLI and Local Examples
Contents:
What Apache Kafka is and what it is used for
Install Apache Kafka
Prerequisites you should not skip
Install from the official binary release
Install with Docker
Platform note for Windows users
Start Kafka locally with KRaft
Start a single-node local broker from the extracted tarball
What you are running in this Quickstart
Kafka CLI essentials and main command-line parameters
Kafka connection flags you will use constantly
Topic management with kafka-topics.sh
Producing and consuming with console clients
Inspecting consumer lag with kafka-consumer-groups.sh
One config caveat for local Docker and remote clients
Quickstart examples you can run now
Example: run a topic and stream messages end to end
Example: run a simple Kafka Connect pipeline from file to topic to file
Troubleshooting and next steps
Broker fails to start
CLI commands fail even though the broker is running
Docker networking surprises
Where to go after the Quickstart

Apache Kafka 4.2.0 is the current supported release line, and it is the best baseline for a modern Quickstart because Kafka 4.x is fully ZooKeeper-free and built around KRaft by default. This guide is a practical, command-line-first Quickstart: installing Kafka, starting a local broker, learning the essential Kafka CLI tools, and finishing with two end-to-end examples you can paste into your terminal.

What Apache Kafka is and what it is used for

Apache Kafka is an event streaming platform. In practical terms, event streaming means capturing event data in real time from sources (databases, sensors, apps), storing the resulting streams durably, and processing or routing them in real time or later. Kafka brings three core capabilities together in one platform: publish and subscribe to streams of events, store streams durably for as long as needed, and process streams as they occur or retrospectively. That mix is why Kafka is used for real-time data pipelines, integration, messaging, and streaming analytics.

For context on where Kafka fits within a broader data infrastructure, see the Data Infrastructure for AI Systems: Object Storage, Databases, Search & AI Data Architecture pillar, which covers S3-compatible object storage, PostgreSQL architecture, Elasticsearch optimization, and AI-native data layers. If you're building on AWS and want a managed alternative, Building Event-Driven Microservices with AWS Kinesis covers implementing event-driven microservices with Kinesis Data Streams.

Operationally, Kafka is a distributed system of servers and clients communicating over a high-performance TCP protocol: brokers store and serve data, while clients (producers and consumers) write and read events, often at large scale and with fault tolerance. A few concepts you'll see repeatedly in the CLI: topics (named, durable streams of events), partitions (the ordered shards a topic is split into), offsets (a consumer's position within a partition), and consumer groups (sets of consumers that divide a topic's partitions among themselves).

Install Apache Kafka

Kafka's official Quickstart uses the binary release (tarball) or the official Docker image. Both are valid for local development.
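Both routes can be sketched as follows; the archive name kafka_2.13-4.2.0.tgz and the image tag apache/kafka:4.2.0 are assumptions here, so verify them against the Kafka downloads page and Docker Hub:

```shell
# Route 1: official binary release (tarball).
# The 2.13 suffix is the Scala build version; check the exact file name
# on the Apache Kafka downloads page before running this.
tar -xzf kafka_2.13-4.2.0.tgz
cd kafka_2.13-4.2.0

# Route 2: official Docker image from Docker Hub.
docker pull apache/kafka:4.2.0
docker run -d --name broker -p 9092:9092 apache/kafka:4.2.0
```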
Prerequisites you should not skip

Kafka 4.x requires modern Java: for the server and tools, Java 17+ is the baseline for running locally, and Kafka 4.0 removed Java 8 support. If you're installing Kafka specifically to learn it, pick a supported JDK such as Java 17 or 21. Kafka's Java support page lists Java 17, 21, and 25 as fully supported, while Java 11 is supported only for a subset of modules (clients and streams).

Install from the official binary release

The official Quickstart for Kafka 4.2.0 starts by downloading the binary distribution from the Apache mirrors and extracting it.

Install with Docker

Kafka also provides official Docker images on Docker Hub, and the Quickstart shows how to pull and run Kafka 4.2.0 directly from that image. There is also a "native" image line (based on a GraalVM native image). Kafka documentation and the Kafka Improvement Proposal for this image line describe it as experimental and intended for local development and testing, not production.

Platform note for Windows users

Kafka distributions include Windows scripts (batch files). Kafka docs historically note that on Windows you use bin\windows\ and .bat scripts rather than the Unix bin/ .sh scripts.

Start Kafka locally with KRaft

If you're asking "Do I need ZooKeeper to run Apache Kafka?", the modern answer is no. Kafka 4.0 is the first major release designed to operate entirely without ZooKeeper, running in KRaft mode by default, which reduces operational overhead for local and production use.

Start a single-node local broker from the extracted tarball

Kafka's 4.2 Quickstart uses three commands:

1) Generate a cluster UUID
2) Format the log directories
3) Start the server

What you are running in this Quickstart

Why the "format" step matters in KRaft: Kafka's KRaft operations documentation explains that kafka-storage.sh random-uuid generates the cluster ID and that each server must be formatted with kafka-storage.sh format. One rationale given is that auto-formatting can hide errors, especially around the metadata log, so explicit formatting is preferred.

For local development, Kafka can run in a simplified "combined" setup, with controllers and brokers in the same process. Kafka's KRaft documentation calls out combined servers as simpler for development but not recommended for critical deployment environments, where you want controllers isolated and scalable independently. For "real" clusters, KRaft controllers and brokers are separate roles (process.roles), and controllers are typically deployed as a quorum of 3 or 5 nodes (availability depends on a majority being alive).

Kafka CLI essentials and main command-line parameters

Kafka ships with a lot of CLI tooling under bin/. Also important for Kafka 4.x: AdminClient commands no longer accept --zookeeper. Kafka's compatibility documentation notes that, starting with Kafka 4.0, you must use --bootstrap-server to interact with the cluster.

Kafka connection flags you will use constantly

Most tools need a cluster entry point, which you pass as --bootstrap-server host:port. KRaft also introduces broker versus controller endpoints for some tools: kafka-features.sh and parts of the metadata tooling can use controller endpoints, while many admin operations use broker endpoints. The KRaft operations page shows both styles in its examples.

Topic management with kafka-topics.sh

You will use kafka-topics.sh for the core topic lifecycle: create, list, describe, alter, and delete. A "production-minded" create command sets the partition count and replication factor explicitly; this exact shape is used in the official operations docs.

Producing and consuming with console clients

The Quickstart uses the console producer and consumer because they're fast for validation and smoke tests. Kafka 4.2 also includes CLI consistency improvements, documented in the upgrade notes. If you maintain internal runbooks, those notes are worth updating now, before Kafka 5.0 removes the deprecated flags.
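The three-step KRaft startup described earlier maps to the commands below. They are run from the extracted Kafka directory; the --standalone flag and config path follow the 4.x Quickstart, but double-check them against your version's docs:

```shell
# 1) Generate a cluster UUID
KAFKA_CLUSTER_ID="$(bin/kafka-storage.sh random-uuid)"

# 2) Format the log directories for a single combined controller+broker node
#    (explicit formatting is required in KRaft; auto-formatting could hide errors)
bin/kafka-storage.sh format --standalone -t "$KAFKA_CLUSTER_ID" -c config/server.properties

# 3) Start the server (runs in the foreground; leave this terminal open)
bin/kafka-server-start.sh config/server.properties
```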
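As a sketch of the production-minded topic creation discussed above; the topic name, partition count, and replication factor here are illustrative placeholders:

```shell
# Explicit partitions and replication factor; a replication factor of 3
# requires a cluster with at least 3 brokers (a single local broker
# only supports --replication-factor 1).
bin/kafka-topics.sh --create \
  --bootstrap-server localhost:9092 \
  --replication-factor 3 \
  --partitions 20 \
  --topic my-topic
```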
Inspecting consumer lag with kafka-consumer-groups.sh

For real systems, "Is my consumer keeping up?" is a daily question. The operations guide demonstrates how to answer it with kafka-consumer-groups.sh, which reports each group's current offset, log-end offset, and lag per partition.

One config caveat for local Docker and remote clients

If you run Kafka in containers or behind load balancers, you will eventually need to set listeners correctly. Kafka's broker configuration docs explain advertised.listeners as the addresses brokers advertise to clients and other brokers, used particularly when the bind address is not the address clients should connect to.

Quickstart examples you can run now

The examples below are deliberately CLI-based so you can validate a local Kafka setup before you write any application code.

Example: run a topic and stream messages end to end

This is the canonical "create, produce, consume" flow from the Kafka 4.2 Quickstart. Open terminal A and create a topic, then describe it (optional, but useful when you're learning about partitions and replication factors). Open terminal B and start a producer, type a couple of lines (each line becomes an event), and leave the producer running. Open terminal C and start a consumer reading from the beginning; you should see the same lines printed.

Why this validates more than "it works": Kafka's Quickstart explains that brokers store events durably and that events can be read multiple times and by multiple consumers. That durability is why this Quickstart pattern is the first thing you should do after any install or upgrade.

Example: run a simple Kafka Connect pipeline from file to topic to file

Kafka Connect answers the recurring question "How do I move data into and out of Kafka without writing custom producers and consumers for everything?" The Kafka Connect overview describes it as a tool for scalable, reliable streaming between Kafka and other systems, via connectors. The Kafka 4.2 Quickstart includes a minimal, local Connect demo using the file source and sink connectors.
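The end-to-end flow described above, as concrete commands; the topic name quickstart-events matches the official Quickstart, while the consumer group name my-group is a placeholder:

```shell
# Terminal A: create the topic, then describe it
bin/kafka-topics.sh --create --topic quickstart-events --bootstrap-server localhost:9092
bin/kafka-topics.sh --describe --topic quickstart-events --bootstrap-server localhost:9092

# Terminal B: start a producer; each line you type becomes one event (Ctrl-C to stop)
bin/kafka-console-producer.sh --topic quickstart-events --bootstrap-server localhost:9092

# Terminal C: consume everything from the beginning; add --group my-group
# if you want the group to show up in lag reports
bin/kafka-console-consumer.sh --topic quickstart-events --from-beginning --bootstrap-server localhost:9092

# Optional: inspect the group's per-partition current offset, log-end offset, and lag
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group my-group
```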
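A minimal sketch of the advertised.listeners caveat, assuming a broker that binds on all interfaces but must be reached by clients at host.docker.internal (a placeholder; substitute the address your clients can actually resolve):

```shell
# Append listener settings to the broker config.
# mkdir -p is a no-op inside a real Kafka directory where config/ already exists.
mkdir -p config
cat >> config/server.properties <<'EOF'
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://host.docker.internal:9092
EOF
```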
From your Kafka directory, first set the worker plugin path so it includes the provided file connector jar, then create a tiny input file. Start the Connect worker in standalone mode with both a source and a sink connector configuration: the source connector streams lines from the input file into a topic, and the sink connector writes that topic back out to a sink file. You can verify the pipeline by reading the sink file or by consuming the topic directly. This second example is a great muscle-memory builder because it teaches you where Connect configuration sits (a worker config plus connector configs) and shows a minimal "ingest, store, export" loop.

Troubleshooting and next steps

Broker fails to start

Most "Kafka Quickstart won't start" issues fall into a small set of root causes, so start with the official requirements (above all, a supported JDK and correctly formatted storage directories). If you experimented and now want a clean slate, Kafka's Quickstart shows how to delete the local data directories used in the demo and start again from the format step.

CLI commands fail even though the broker is running

In Kafka 4.x, validate that you are using --bootstrap-server (not --zookeeper). Kafka's compatibility documentation explicitly calls out the removal of --zookeeper from AdminClient commands starting in Kafka 4.0.

Docker networking surprises

If Kafka is in Docker and your client tool is outside Docker (or on another machine), you may need correct listener advertisement. The broker config docs explain that advertised.listeners is used when the addresses clients should connect to differ from the bind addresses (listeners).

Where to go after the Quickstart

If you've completed the examples in this post, you've already installed Kafka, streamed events end to end, and run a minimal Connect pipeline. From here, the most valuable next steps are usually replacing the console tools with real producer and consumer clients in your language of choice, and digging deeper into Kafka Connect, Kafka Streams, and KRaft cluster operations.
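The Connect walkthrough above, as concrete commands; the jar name connect-file-4.2.0.jar and the connector property files follow the 4.2.0 Quickstart, but verify them against your extracted distribution:

```shell
# Make the bundled file connectors visible to the Connect worker
echo "plugin.path=libs/connect-file-4.2.0.jar" >> config/connect-standalone.properties

# Create a tiny input file for the source connector to stream
echo -e "foo\nbar" > test.txt

# Start a standalone worker with a file source and a file sink connector
# (source: test.txt -> topic connect-test; sink: connect-test -> test.sink.txt)
bin/connect-standalone.sh config/connect-standalone.properties \
  config/connect-file-source.properties config/connect-file-sink.properties

# Verify the sink file mirrors the input...
more test.sink.txt

# ...and inspect the intermediate topic directly
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic connect-test --from-beginning
```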