What is Apache Kafka®? (A Confluent Lightboard by Tim Berglund) + ksqlDB

https://www.youtube.com/watch?v=06iRM1Ghr1k&ab_channel=Confluent

Data

Traditional way of thinking with databases: the data represents things, e.g. a person, a train, a thermostat.

New way with Kafka: think in events. Events are kept in a log, an ordered sequence of things that happened.

  • Kafka manages logs as topics. A topic is an ordered sequence of events that is stored durably.
  • Topics can be large, can be small, can remember data for a while, or forever.
  • Each event represents a thing happening in the business
  • Kafka encourages you to think of events first, things second
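
To make the idea of a topic as a durable, ordered log more concrete, here is a minimal in-memory sketch. It is not Kafka's API: the `Event` and `Topic` classes and the topic name `thermostat-readings` are made up for illustration, nothing is persisted to disk, and the sketch ignores partitions (in real Kafka, ordering is guaranteed per partition).

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass(frozen=True)
class Event:
    key: str    # which "thing" the event is about, e.g. a thermostat id
    value: Any  # what happened


@dataclass
class Topic:
    """Hypothetical stand-in for a topic: an append-only, ordered list."""
    name: str
    _log: List[Event] = field(default_factory=list)

    def append(self, event: Event) -> int:
        """Append to the end of the log and return the event's offset."""
        self._log.append(event)
        return len(self._log) - 1

    def read(self, offset: int = 0) -> List[Event]:
        """Return events from the given offset onward, in write order."""
        return self._log[offset:]


readings = Topic("thermostat-readings")
readings.append(Event("thermostat-1", {"temp_c": 20.5}))
last_offset = readings.append(Event("thermostat-1", {"temp_c": 21.0}))
```

Events are only ever appended, never updated in place, and a reader picks a position (offset) in the log and reads forward from there.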

Application

Old way of applications: giant monolithic application that talks to one database.

New way: a lot of smaller applications that are easier to maintain and change

New way with Kafka: small applications read from a Kafka topic, do some processing, and maybe write the output to another Kafka topic.
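
The read, process, write pattern can be sketched without any Kafka client at all. In this illustration plain Python lists stand in for the input and output topics, and `run_service`, `flag_large_order`, and the order fields are hypothetical names chosen for the example. A real service would use a Kafka consumer and producer and would run continuously instead of looping over a finite list.

```python
from typing import Any, Callable, Dict, List, Optional

Event = Dict[str, Any]


def run_service(
    source: List[Event],
    sink: List[Event],
    process: Callable[[Event], Optional[Event]],
) -> None:
    """Read each event from the source, process it, maybe write a result."""
    for event in source:
        result = process(event)
        if result is not None:  # "maybe output": not every event produces one
            sink.append(result)


def flag_large_order(event: Event) -> Optional[Event]:
    """Emit a new event only for orders of 100 or more."""
    if event["amount"] >= 100:
        return {"order_id": event["order_id"], "flag": "large"}
    return None


orders = [
    {"order_id": 1, "amount": 30},
    {"order_id": 2, "amount": 250},
]
large_orders: List[Event] = []
run_service(orders, large_orders, flag_large_order)
```

The service knows nothing about who wrote `orders` or who will read `large_orders`, which is what lets small services be changed independently.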

Once data flows through a set of Kafka topics/streams, we can build new services that analyze those streams

Four key concepts for a basic understanding of Kafka

  • events
  • topics
  • small services communicating by consuming and producing events through topics
  • real-time analytics

This is a typical architecture for a system built on Kafka

Kafka Connect

Kafka Connect is the tool that integrates Kafka with non-Kafka data sources and systems

  • stream data from external systems into Kafka topics
  • stream data out of Kafka to external systems and storage

Connectors: pluggable modules that provide a declarative way of integrating with data systems
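
"Declarative" here means you describe the integration as configuration instead of writing code to move the data. As a rough sketch, the snippet below builds the kind of JSON document a connector is configured with, using the simple file source connector that ships with Apache Kafka. The connector name, file path, and topic name are placeholder values chosen for this example.

```python
import json

# Configuration for a file source connector: read lines from a file and
# publish each line as an event on a topic. All values are placeholders.
connector = {
    "name": "local-file-source",
    "config": {
        "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
        "tasks.max": "1",
        "file": "/tmp/test.txt",   # assumed input file
        "topic": "connect-test",   # assumed destination topic
    },
}

# Serialized form, as it would be handed to Kafka Connect.
payload = json.dumps(connector, indent=2)
print(payload)
```

Nothing in the configuration says how to read the file or how to produce to Kafka. That logic lives in the connector module, and the configuration only says what to connect.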

Kafka Streams

Kafka Streams: a Java API for data operations on topics, such as grouping, aggregating, filtering, and enrichment
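
The four operations named above can be illustrated on a small list of events. This is plain Python, not the Kafka Streams API (which is Java), and it processes a finite list once, whereas a streams application keeps its results up to date as new events arrive. The event fields and the `regions` lookup table are invented for the example.

```python
from collections import defaultdict

events = [
    {"user": "u1", "amount": 30},
    {"user": "u2", "amount": 5},
    {"user": "u1", "amount": 70},
    {"user": "u3", "amount": 12},
    {"user": "u2", "amount": 40},
]

# Reference data used to enrich events (hypothetical user -> region table).
regions = {"u1": "EU", "u2": "US", "u3": "EU"}

# Filtering: drop events below a threshold.
filtered = [e for e in events if e["amount"] >= 10]

# Enrichment: add a field looked up from another data set.
enriched = [{**e, "region": regions[e["user"]]} for e in filtered]

# Grouping and aggregating: total amount per region.
totals = defaultdict(int)
for e in enriched:
    totals[e["region"]] += e["amount"]

print(dict(totals))
```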

KSQL (now ksqlDB) by Confluent

A SQL-like query language for querying topics