Apache Kafka, originally developed by LinkedIn and open sourced in 2011, is the de-facto industry standard for real-time data feeds that can reliably handle large volumes of data with extremely high throughput and low latency. Companies like Uber, Netflix and Slack use Kafka to process trillions of messages per day, and, unlike a traditional queue or message broker, Kafka functions as a unified, durable log of append-only, ordered events that can be replayed or archived.
You can download the code for the event stream processing micro-framework and sample application here: https://github.com/event-streams-dotnet/event-stream-processing
Kafka and .NET
Kafka is written in Java, and most of the libraries and tools are only available in Java. This makes the experience of developing for Kafka in C# somewhat limiting. Kafka Streams and Kafka Connect API’s, for example, are only available in Java, and Confluent’s ksqlDB product is only accessible via a REST interface.
All is not lost, however, for C# developers wishing to use Kafka. Confluent, a company founded by the creators of Kafka, offers confluent-kafka-dotnet, a .NET Kafka client that provides high-level
AdminClient API’s. This is useful for building single event stream processing applications, which can be used for scenarios like event sourcing and real-time ETL (extract-transform-load) data pipelines. (There is also an open source kafka-streams-dotnet project that aims to provide the same functionality as Kafka Streams on .NET for multiple event stream processing applications.)
Single Event Stream Processing
As the name implies, single event stream processing entails consuming and processing one event at a time, rather than capturing and processing multiple events at the same time (for example, to aggregate results for a specific timeframe). This is a very powerful paradigm for both event-driven microservice architectures and transforming data as it flows from one data source to another.
A good example is sending an event through a chain of message handlers which apply validation, enrichment and filtering, before writing processed events back to Kafka as a new event stream.
Event Stream Processing Micro-Framework
Because this is the kind of thing you might want to do all the time, it makes sense to create a reusable framework for processing event streams. The framework abstractions should provide a standard approach that is generic, type-safe and extensible, without being coupled to Kafka or any other streaming platform. This is the purpose of the EventStreamProcessing.Abstractions package. There you will find an abstract
EventProcessor class that implements the
IEventProcessor interface. Notice the generic
TSinkEvent type arguments, which allow you to specify any message type.
Next there is the abstract MessageHandler class that implements IMessageHandler, which is used to build a chain of message handlers.
Lastly, the Message class encapsulates an event as a key-value pair.
In addition to a platform-agnostic set of abstractions, there is an EventStreamProcessing.Kafka package that references Confluent.Kafka and has Kafka-specific implementations of the
IEventProcessor interfaces. The
KafkaEventProcessor class overrides the
Process method with code that consumes a raw event, builds the chain of handlers, and produces a processed event.
Build Your Own Event Stream Processing Service
To build your own event stream processing service it’s best to start by creating a new .NET Core Worker Service.
IEventProcessor into the
Worker class constructor, then call
eventProcessor.Process inside the
while loop in the
To set up the event processor you will need some helper methods for creating Kafka consumers and producers.
Next create some classes that extend
MessageHandler in which you override the
HandleMessage method to process the message. Here is an example of a handler that transforms the message.
Then add code to the
CreateHostBuilder method in the
Program class where you set up dependency injection for
Run Kafka Locally with Docker
Note: To run Kafka you will need to allocate 8 GB of memory to Docker Desktop.
Open the Kafka control center:
http://localhost:9021/. Then select the main cluster, go to Topics and create the “raw-events” and “processed-events” topics.
Use the console consumer to show the processed events.
Use the console producer to create raw events.
Check Out the Sample
The event-stream-processing repository has a samples folder that contains a working example of an event processing service based on the Event Stream Processing Micro-Framework. Here is a diagram showing the data pipeline used by the Sample Worker.
- The Sample Producer console app lets the user write a stream of events to the Kafka broker using the “raw-events” topic. The numeral represents the event key, and the text “Hello World” presents the event value.
- The Sample Worker service injects an
KafkaWorkerclass constructor. Then
whileloop until the operation is cancelled.
Program.CreateHostBuildermethod registers an
IEventProcessorfor dependency injection with a
KafkaEventProducerand an array of
KafkaEventConsumerin Sample Worker subscribes to the “raw-events” topic of the Kafka broker running on
localhost:9092. The message handlers validate, enrich and filter the events one at a time. If there are validation errors, those are written back to Kafka with a “validation-errors” topic. This takes place if the message key does not correlate to a key in the language store. The
EnrichmentHandlerlooks up a translation for “Hello” in the language store and transforms the message with the selected translation. The
FilterHandleraccepts a lambda expression for filtering messages. In this case the English phrase “Hello” is filtered out. Lastly, the
KafkaEventProducerwrites processed events back to Kafka using the “final-events” topic.
- The Sample Consumer console app reads the “validation-errors” and “final-events” topics, displaying them in the console.
Follow instructions in the project ReadMe file to run the sample. If you wish to run the Sample Worker in a Docker container, you will need to place it in the same network as the Kafka broker, which can be accomplished using a separate docker-compose.yml file for the Sample Worker.