What is Apache Kafka?
Apache Kafka is an open-source, distributed, scalable publish-subscribe messaging system maintained by the Apache Software Foundation. The code is written in Scala and was originally developed at LinkedIn. Kafka was open-sourced in 2011 and later became a top-level Apache project.
The project aims to provide a unified, low-latency platform for handling real-time data feeds. It is becoming increasingly valuable in enterprise infrastructures that require integration between systems: any system wishing to integrate can publish to or subscribe to the Kafka topics relevant to its needs.
Kafka was heavily influenced by the concept of transaction logs. Transaction logs are an often overlooked but essential backbone of numerous enterprise systems, including databases, fault-tolerant replication, web servers, and e-commerce platforms. Apache Kafka is a massively scalable message queue constructed like a distributed transaction log.
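To make the transaction-log analogy concrete, here is a minimal sketch of an append-only log addressed by offset. This is a toy illustration only, not Kafka's actual implementation; the names (`CommitLog`, `append`, `read_from`) are hypothetical.

```python
class CommitLog:
    """A toy append-only sequence of records, addressed by offset."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Append a record and return its offset (its position in the log)."""
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, offset):
        """Read all records at or after the given offset.

        Readers track their own offsets, so many independent readers
        can replay the same log at their own pace.
        """
        return self._records[offset:]


log = CommitLog()
log.append("order created")
log.append("order paid")
print(log.read_from(0))  # both records
print(log.read_from(1))  # only "order paid"
```

The key property this sketch shows is that the log itself never changes once written; consumers simply remember how far they have read, which is the idea Kafka generalizes into a distributed, replicated log.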
What Does Apache Kafka do?
Kafka is a platform designed for working with real-time data streams.
Kafka organizes data under topics. Data producers write to topics as “publishers,” while consumers, or “subscribers,” are configured and programmed to read from topic queues.
Topic messages are persisted on disk and replicated within the cluster to prevent data loss. Kafka has a cluster-centric design which offers strong durability and fault-tolerance guarantees.
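The publish-subscribe flow above can be sketched as a toy in-memory broker: producers append messages to a named topic, and each consumer group reads at its own offset. The names here (`Broker`, `publish`, `consume`) are hypothetical and do not reflect the real Kafka client API, which talks to a running cluster.

```python
from collections import defaultdict


class Broker:
    """Toy illustration of Kafka-style topics (in memory, no persistence)."""

    def __init__(self):
        self._topics = defaultdict(list)  # topic name -> list of messages
        self._offsets = {}                # (group, topic) -> next unread offset

    def publish(self, topic, message):
        """Producer side: append a message to a topic."""
        self._topics[topic].append(message)

    def consume(self, group, topic):
        """Consumer side: return unread messages for this group and advance its offset."""
        offset = self._offsets.get((group, topic), 0)
        messages = self._topics[topic][offset:]
        self._offsets[(group, topic)] = len(self._topics[topic])
        return messages


broker = Broker()
broker.publish("clicks", "page=/home")
broker.publish("clicks", "page=/pricing")
print(broker.consume("analytics", "clicks"))  # both messages
broker.publish("clicks", "page=/docs")
print(broker.consume("analytics", "clicks"))  # only the new message
```

Because each consumer group keeps its own offset, two independent subscribers to the same topic each see the full message stream, which is the behavior real Kafka consumer groups provide.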
Apache Kafka Tutorials for Beginners
If you are new to Kafka, start with the following. Each tutorial is roughly a 3-5 minute read that presents Kafka from a high-level perspective. Understanding the Kafka principles presented in this section will put you in the best position to proceed if you choose to do so.
Apache Kafka Use Cases
More coming soon, but to start us off:
Apache Kafka Examples
Stay tuned. Bookmark this page.
Featured image adapted from https://flic.kr/p/bGR8bZ