What is Apache Kafka?
Apache Kafka is an open-source, distributed, scalable publish-subscribe messaging system maintained by the Apache Software Foundation. It is written in Scala and Java and was originally developed at LinkedIn. It was open-sourced in 2011 and became a top-level Apache project in 2012.
The project aims to provide a unified, low-latency platform for handling real-time data feeds. It is increasingly valuable for enterprise infrastructures that require integration between systems: each system can publish or subscribe to particular Kafka topics based on its needs.
Kafka was heavily influenced by the concept of the transaction log. Transaction logs are an often overlooked but essential backbone of numerous enterprise systems, such as databases, fault-tolerant replication, web servers, and e-commerce platforms. Apache Kafka is a massively scalable message queue built like a distributed transaction log.
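To make the transaction log idea concrete, here is a minimal sketch of an append-only log in Python. It is a toy model, not Kafka code: the class name `AppendOnlyLog` and its methods are hypothetical, and it keeps everything in memory on a single machine, whereas Kafka distributes its log across a cluster.

```python
class AppendOnlyLog:
    """Toy append-only log: records are added at the end and never changed."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Store a record and return its offset (its position in the log)."""
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, offset):
        """Return every record at or after the given offset, in write order."""
        return self._records[offset:]


log = AppendOnlyLog()
first = log.append("user-1 signed up")      # hypothetical event
second = log.append("user-1 placed order")  # hypothetical event
replayed = log.read_from(first)             # readers can replay from any offset
```

Because records are never modified once written, any reader that remembers an offset can come back later and pick up exactly where it left off.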
What Does Apache Kafka do?
Kafka is a platform designed to handle real-time data streams.
Kafka organizes data into topics. Data producers write to topics as “publishers”. Consumers, or “subscribers”, are configured and programmed to read from those topics.
Topic messages are persisted on disk and replicated within the cluster to prevent data loss. Kafka has a cluster-centric design which offers strong durability and fault-tolerance guarantees.
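The publish and subscribe flow described above can be sketched with a small in-memory model. This is an illustration under simplifying assumptions, not the Kafka client API: `ToyBroker`, the topic name, and the subscriber names are all hypothetical, and the model leaves out disk persistence, replication, and partitioning.

```python
from collections import defaultdict


class ToyBroker:
    """Toy in-memory broker: topics are ordered lists of messages."""

    def __init__(self):
        self._topics = defaultdict(list)    # topic name -> ordered messages
        self._positions = defaultdict(int)  # (subscriber, topic) -> next offset

    def publish(self, topic, message):
        """Append a message to the end of a topic."""
        self._topics[topic].append(message)

    def poll(self, subscriber, topic):
        """Return the messages this subscriber has not yet read from the topic."""
        start = self._positions[(subscriber, topic)]
        messages = self._topics[topic][start:]
        self._positions[(subscriber, topic)] = start + len(messages)
        return messages


broker = ToyBroker()
broker.publish("orders", "order-1")
broker.publish("orders", "order-2")

billing_first = broker.poll("billing", "orders")   # sees order-1 and order-2
broker.publish("orders", "order-3")
billing_second = broker.poll("billing", "orders")  # sees only order-3
shipping_all = broker.poll("shipping", "orders")   # a new subscriber sees all three
```

Each subscriber keeps its own read position, so reading a message does not remove it from the topic and subscribers do not interfere with each other.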
Apache Kafka Tutorials for Beginners
If you are new to Kafka, start with the following. Each tutorial is approximately a 3-5 minute read and presents Kafka from a high-level perspective. Understanding the Kafka principles presented in this section will put you in the best position to go further if you choose to do so.
Apache Kafka Use Cases
More coming soon, but to start us off
Apache Kafka Examples
Stay tuned. Bookmark this page.
Apache Kafka Components
What is Kafka Connect?
Kafka Connect is a framework for connecting Kafka with external systems such as files, databases, Hadoop clusters, and their cloud-based equivalents. It’s an open source component of Apache Kafka.
Kafka Connect is one option for moving data in and out of Kafka without writing your own Kafka producer and consumer code.
Kafka Connect key concepts include source and sink connectors as well as standalone or distributed execution modes.
A source connector ingests data into Kafka topics, while a sink connector delivers data from Kafka topics to the desired destination.
Kafka Connect can be run either as a standalone, isolated process or distributed across multiple workers.
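As a rough illustration of the source and sink concepts, the sketch below builds the kind of JSON payload that a distributed Kafka Connect worker accepts when a connector is created through its REST interface. It uses the FileStream example connectors that ship with Apache Kafka. The helper `connector_request`, the connector names, the file paths, and the topic name are assumptions made up for this example, and nothing is actually sent to a worker.

```python
import json


def connector_request(name, connector_class, settings):
    """Build a connector definition: a name plus a flat config map."""
    config = {"connector.class": connector_class, "tasks.max": "1"}
    config.update(settings)
    return {"name": name, "config": config}


# Source connector: reads lines from a file and writes them to a topic.
source = connector_request(
    "example-file-source",  # hypothetical connector name
    "org.apache.kafka.connect.file.FileStreamSourceConnector",
    {"file": "/tmp/example-input.txt", "topic": "example-topic"},
)

# Sink connector: reads from the topic and writes to a destination file.
sink = connector_request(
    "example-file-sink",  # hypothetical connector name
    "org.apache.kafka.connect.file.FileStreamSinkConnector",
    {"file": "/tmp/example-output.txt", "topics": "example-topic"},
)

print(json.dumps(source, indent=2))
print(json.dumps(sink, indent=2))
```

Note that the source side names a single `topic` to write to, while the sink side uses `topics` because a sink can read from several. In standalone mode the same keys would typically live in properties files passed to the worker at startup rather than in a JSON request.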
Kafka Connect Examples
What is Kafka Streams?
Kafka Streams is a client library for building Kafka applications such as stream processors. It can be used from Java or Scala.
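To give a feel for what a stream processor does, here is a conceptual sketch of a running word count. It is written in plain Python and is not the Kafka Streams API, which is a Java and Scala library; the function name and the sample records are hypothetical. The point is only the shape of the work: read records from an input topic, update some state, and emit results to an output topic.

```python
from collections import Counter


def word_count_processor(input_records):
    """Yield a (word, running_count) update for each word in the input stream."""
    counts = Counter()
    for line in input_records:
        for word in line.lower().split():
            counts[word] += 1
            yield word, counts[word]


# Plain lists stand in for an input topic and an output topic.
input_topic = ["Kafka streams", "kafka connect"]
output_topic = list(word_count_processor(input_topic))
```

The processor emits an updated count every time a word arrives instead of waiting for the input to end, which is what distinguishes stream processing from a batch job over the same data.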
Kafka Streams Examples
Apache Kafka Operations
Monitoring coming soon
Featured image adapted from https://flic.kr/p/bGR8bZ